Kevin Miller, AWS | AWS Storage Day 2021
(bright music)

>> Welcome to this next session of AWS Storage Day. I'm your host, Dave Vellante of theCUBE. And right now we're going to explore how to simplify and evolve your data lake, backup, disaster recovery, and analytics in the cloud. And we're joined by Kevin Miller, who's the general manager of Amazon S3. Kevin, welcome.

>> Thanks Dave. Great to see you again.

>> Good to see you too. So listen, S3 started as like a small ripple in the pond, and over the last 15 years, I mean, it's fundamentally changed the storage market. We used to think about storage as, you know, a box of disk drives that either stored data in blocks or file formats, and then object storage at the time was kind of used in archival storage; it needed specialized application interfaces. S3 changed all that. Why do you think that happened?

>> Well, I think first and foremost, it's really just that customers appreciated the value of S3 being fully managed, where, you know, we manage capacity. Capacity is always available for our customers to bring new data into S3, and really therefore to remove a lot of the constraints around building their applications and deploying new workloads and testing new workloads, where they know that if something works great, it can scale up by 100x or 1,000x. And if it doesn't work, they can remove the data and move on to the next application or next experiment they want to try. And so, you know, it's really exciting to me when I see businesses across essentially every industry, every geography, you know, innovate and really use data in new and interesting ways within their business to drive actual business results. So it's not just about having data to build a report and have a human look at a report, but to really drive the day-to-day operations of their business. That can include things like personalization, or doing deeper analytics in industrial and manufacturing.
A customer like Georgia-Pacific, for example, I think is one of the great examples, where they use a big data lake and collect a lot of IoT sensor data off of their paper manufacturing machines. So they can run them at just the right speed to avoid tearing the paper as it's going through, which really just keeps their machines running more and therefore, you know, reduces their downtime and the costs associated with it. So you know, it's just that transformation, again, across almost every industry that I can think of. That's really what's been exciting to see, and I think we're still in the really early days of what we're going to see as far as that innovation goes.

>> Yeah, I've got to agree. I mean, it's been pretty remarkable. Maybe you could talk about the pace of innovation for S3. I mean, if anything, it seems to be accelerating. Kevin, how has AWS thought about innovation over the past decade plus, and where do you see it headed?

>> Yeah, that's a great question, Dave. Innovation is part of our core DNA. S3 launched more than 15 years ago; it's almost 16 years old. We're going to get a learner's permit for it next year. But, you know, as it's grown to exabytes of storage and trillions of objects, we've seen almost every use case you can imagine. I'm sure there's a new one coming that we haven't seen yet, but we've learned a lot from those use cases, and every year we just think about what we can do next to further simplify. And so you've seen that as we've launched, over the last few years, things like S3 Intelligent-Tiering, which was really the cloud's first storage class to automatically optimize and reduce customers' storage costs for long-lived data with variable access patterns. We launched S3 Access Points to provide a simpler way to have different applications operating on shared data sets.
And earlier this year we launched S3 Object Lambda, which really is, I think, cool technology. We're just starting to see how it can be applied to simplify serverless application development. It's really, I think, the next wave of application development, where not only is the storage fully managed, but the compute is fully managed as well, which really simplifies the whole end-to-end application development.

>> Okay, so we heard some exciting news this morning in the keynote. What can you tell us, Kevin?

>> Yeah, so this morning we launched S3 Multi-Region Access Points, and these are access points that give you a single global endpoint to access data sets that can span multiple S3 buckets in different AWS regions around the world. And so this allows you to build multi-region applications and multi-region architectures with, you know, the same approach that you use in a single region, and then run these applications anywhere around the world.

>> Okay. So if I interpret this correctly, it's a good fit for organizations with clients or operations around the globe. So for instance, gaming, news outlets, think of content delivery types of customers. Should we think about this as multi-region storage, and why is that so important in your view?

>> Absolutely. Yeah, that is multi-region storage. And what we're seeing is that as customers grow, and we have multinational customers who have operations all around the world, as their data needs grow, they need to be using multiple AWS regions to store and access that data. Sometimes it's for low latency, so that data can be closer to their end users or their customers; other times they just have a particular need to have data in a particular geography. But this is really a simple way of having one endpoint in front of data across multiple buckets.
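[Editor's note: a minimal sketch of the single-endpoint idea Kevin describes. The account ID, access point alias, and object key below are hypothetical placeholders, not values from the interview.]

```python
# Sketch: addressing data through an S3 Multi-Region Access Point (MRAP).
# The account ID and MRAP alias here are hypothetical placeholders.
def mrap_arn(account_id: str, alias: str) -> str:
    """Build the ARN for a Multi-Region Access Point.

    MRAP ARNs have no region component, because the endpoint is global:
    arn:aws:s3::<account-id>:accesspoint/<alias>
    """
    return f"arn:aws:s3::{account_id}:accesspoint/{alias}"

arn = mrap_arn("123456789012", "mfzwi23gnjvgw.mrap")
print(arn)

# With boto3 (plus the AWS CRT for SigV4A signing), the ARN can be passed
# wherever a bucket name is expected; S3 then routes the request to the
# nearest region holding a copy of the data:
#
#   import boto3
#   s3 = boto3.client("s3")
#   obj = s3.get_object(Bucket=arn, Key="reports/latest.json")
```

The application code stays the same as in a single-region setup; only the bucket identifier changes to the global access point ARN.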
So for applications it's quite easy: they just have that one endpoint, and then the requests are automatically routed to the nearest region.

>> Now, earlier this year, S3 turned 15. What makes S3 different, Kevin, in your view?

>> Yeah, it turned 15; it'll be 16 soon. You know, I think part of the difference is that S3 just operates at a really unprecedented scale, with, you know, more than a hundred trillion objects and regularly peaking at tens of millions of requests per second. But it's really about the resiliency, availability, and durability that are our responsibility, and we focus every single day on protecting those characteristics for customers so that they don't have to. So that they can focus on building the businesses and applications that they need to run their business, and not worry about the details of running highly available storage. And so I think that's really one of the key differences with S3.

>> You know, I first heard the term data lake early last decade, I think around 2011, 2012, and obviously the phrase has stuck. How are S3 and data lakes simpatico, and how have data lakes on S3 changed or evolved over the years?

>> Yeah. You know, the idea of data lakes, obviously, as you say, came around nine or 10 years ago, but I actually still think it's really early days for data lakes. Originally, nine or 10 years ago, when we talked about data lakes, we were looking at maybe tens of terabytes, hundreds of terabytes, or a low number of petabytes, and for a lot of data lakes, that's still the kind of scale they're operating at. But I'm also seeing a class of data lakes where you're talking about tens or hundreds of petabytes or even more, really being used to drive critical aspects of customers' businesses. And so I really think S3 has been a great place to run data lakes, and continues to be.
We've added a lot of capability over the last several years, you know, specifically for that data lake use case, and we're going to continue to grow the feature set for data lakes, you know, over the next many years as well. But really, it goes back to the fundamentals of S3: providing that 11 9s of durability and the resiliency of having three independent data centers within regions, so that customers can use that storage knowing their data is protected, and again, just focus on the applications on top of that data lake. And also run multiple applications, right? The idea of a data lake is that you're not limited to one access pattern or one set of applications. If you want to try out a new machine learning application, or do some advanced analytics, that's all possible while still running the in-flight operational tools that you also have against that data. So it allows for that experimentation, and for transforming businesses through new ideas.

>> Yeah. I mean, to your point, if you go back to the early days of cloud, we were talking about storing, you know, gigabytes, maybe tens of terabytes, and that was big. Today, we're talking about hundreds and hundreds of terabytes, petabytes. And so customers that are at that size and scale have to optimize costs; that's really top of mind. How are you helping customers save on storage costs?

>> Absolutely, Dave. I mean, cost optimization is one of the key things we look at every single year to help customers reduce their costs for storage. And so that led to things like the introduction of S3 Intelligent-Tiering a few years ago. And that's really the only cloud storage class that delivers automatic storage cost savings as data access patterns change. And, you know, we deliver this without performance impact or any kind of operational overhead. It's really intended to be, you know, intelligent: customers put the data in, and then we optimize the storage cost.
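[Editor's note: a minimal sketch of the "customers put the data in" side of Intelligent-Tiering. The bucket and key names are hypothetical placeholders; the storage-class value is the standard one the S3 API accepts.]

```python
# Sketch: opting an object into S3 Intelligent-Tiering at upload time.
# The bucket and key names below are hypothetical placeholders.
def intelligent_tiering_put(bucket: str, key: str, body: bytes) -> dict:
    """Build put_object parameters that place new data directly in the
    INTELLIGENT_TIERING storage class, so S3 moves it between access
    tiers automatically as access patterns change."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "StorageClass": "INTELLIGENT_TIERING",
    }

params = intelligent_tiering_put(
    "example-data-lake", "sensors/2021/08/readings.json", b"{}"
)
print(params["StorageClass"])

# With boto3, these parameters pass straight through:
#
#   import boto3
#   boto3.client("s3").put_object(**params)
```

After the upload, no further action is needed from the application; tiering decisions happen inside S3.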
Or, for example, last year we launched S3 Storage Lens, which is really the first and only service in the cloud that provides organization-wide visibility into where customers are storing their data, what the request rates are against that storage, and so forth. So when you talk about these data lakes of hundreds of petabytes, or even smaller ones, these tools are just really invaluable in helping customers reduce their storage costs year after year. And actually, Dave, I'm pleased that today we're also announcing some improvements to S3 Intelligent-Tiering that further automate the cost savings. First, we're removing the minimum storage duration; previously, Intelligent-Tiering had a 30-day minimum storage duration. And second, we're eliminating the monitoring and automation charge for small objects. Previously, that charge applied to all objects regardless of size; now any object smaller than 128 kilobytes is not subject to it. So I think those are some pretty critical innovations in Intelligent-Tiering that will help customers use it for an even wider set of data lake and other applications.

>> That's S3; it's ubiquitous, and the innovation continues. You can learn more by attending the Storage Day S3 deep dive right after this interview. Thank you, Kevin Miller. Great to have you on the program.

>> Yeah, Dave, thanks for having me. Great to see you.

>> You're welcome. This is Dave Vellante, and you're watching theCUBE's coverage of AWS Storage Day. Keep it right there. (bright music)