Mai-Lan Tomsen Bukovec, Vice President, Block and Object Storage, AWS
>> We continue with theCUBE on Cloud. We're here with Mai-Lan Tomsen Bukovec, who's the vice president of block and object storage at AWS, which comprises Elastic Block Store, AWS S3, and Amazon Glacier. Mai-Lan, great to see you again. Thanks so much for coming on the program.
>> Nice to be here. Thanks for having me, Dave.
>> You're very welcome. So here we're unpacking the future of cloud, and we'd love to get your perspectives on how customers should think about the future of infrastructure, things like applying machine intelligence to their data. But just to set the stage, when we look back at the history of storage in the cloud, it obviously started with S3, and then a couple of years later AWS introduced EBS for block storage, and those are the most well-known services in the portfolio. But there's more: there's cold storage and new capabilities that you announced recently at re:Invent around, you know, super-duper block storage, and Intelligent-Tiering is another example. It looks like AWS is really starting to accelerate and pick up the pace of customer options in storage. So my first question is, how should we think about this expanding portfolio?
>> Well, I think you have to go all the way back to what customers are trying to do with their data, Dave. The path to innovation is paved by data. If you don't have data, you don't have machine learning. You don't have the next generation of analytics applications that help you chart a path forward into a world that seems to be changing every week. And so in order to have that insight, in order to have that predictive forecasting that every company needs, regardless of what industry you're in today, it all starts from data. And I think the key shift that I've seen is how customers are thinking about that data as being instantly usable. Whereas in the past it might've been a backup, now it's part of a data lake.
And if you can bring that data into a data lake, you can have not just analytics or machine learning or auditing applications; it's really, what does your application do for your business, and how can it take advantage of that vast shared data set in your business?
>> Awesome, so thank you. So I want to make sure we're hitting on the big trends that you're seeing in the market, the ones that are informing your strategy around the portfolio, and what you're seeing with customers. Instant usability, you know, you bring machine learning into the equation. I think people have really started to understand the benefits of cloud storage as a service and the pay-by-the-drink pricing, that whole model. Obviously COVID has accelerated that; you know, cloud migration has accelerated. Anything else we're missing there? What are the other big trends that you see, if any?
>> Well, Dave, you did a good job of capturing a lot of the drivers. The one thing I would say that just sits underneath all of it is the massive growth of digital data year over year. IDC says digital data is growing at a rate of 40% year over year. And that has been true for a while, and it's not going to stop. It's going to keep on growing because the sources of that data acquisition keep on expanding, and whether it's IoT devices or whether it is content created by users, that data is going to grow, and everything you're talking about depends on the ability to not just capture it and store it but, as you say, use it.
>> Well, you know, we talk about data growth a lot, and sometimes it becomes a bromide. But I think the interesting thing that I've observed over the last couple of decades really is that the growth is non-linear, and the curve is really starting to shape exponentially. You guys always talk about that flywheel effect. It's really hard to believe; you know, people say trees don't grow to the moon. It seems like data does.
>> It does, and what's interesting about working in the world of AWS storage, Dave, is that it's counter-intuitive, but our goal with data growth is to make it cost effective. And so year over year, how can we make it cheaper and cheaper? It is to have customers store more and more data so they can use it. But it's also to think about the definition of usage and what kind of data is being tapped by businesses for their insights, and make that easier than it's ever been before.
>> Let me ask you a follow-up question on that, Mai-Lan, 'cause I get asked this a lot, or I hear comments a lot, that yes, AWS continuously and rigorously reduces pricing, but it's just kind of following the natural curve of Moore's law or whatever. How do you respond to that? Are there other factors involved? Obviously labor is another, you know, cost-reducing factor, but what does the trend line say?
>> Well, cost efficiency is in our DNA, Dave. We come to work every day in AWS, across all of our services, and we ask ourselves, how can we lower our costs and be able to pass that along to customers? As you say, there are many different aspects to costs. There's a cost to the storage itself. There's a cost to the data center. And that's really what we've seen impact a lot of customers that were slower or just getting started with a move to the cloud: they entered 2020, and then they found out exactly how expensive that data center was to maintain, because they had to put in safety equipment and they had to do all the things that you have to do in a pandemic, in a data center. And so sometimes that cost is a little bit hidden, or it won't show up until you really don't need to have it land. But the cost of managing that explosive growth of data is very real. And when we're thinking about costs, we're thinking about costs in terms of how can I lower it on a per-gigabyte, per-month basis, but we're also building into the product itself adaptive discounts.
Like, we have a storage class in S3 that's called Intelligent-Tiering. And in Intelligent-Tiering we have built-in monitoring where, if particular objects aren't frequently accessed in a given month, a customer will automatically get a discounted price for that storage. Or a customer can, you know, as of late last year, say that they want to automatically move storage in the storage class that has been stored for, for example, longer than 180 days and save 95% by moving it into deep archive storage. And so it's not just, you know, relentlessly going after and lowering the cost of storage. It's also building into the products these new ways where we can adaptively discount storage based on what a customer's storage is actually doing.
>> Right, and I would add that the other thing AWS has done is it's really forced transparency, almost the same way that Amazon has done in retail. And now, Mai-Lan, when we talked last I mentioned that S3 was an object store. And of course that's technically correct, but your comment to me was, "Dave, it's more than that." And you started to talk about SageMaker and AI and bringing in machine learning. And I wonder if you could talk a little bit about the future of how storage is going to be leveraged in the cloud, which is maybe different than what we've been used to in the early days of S3, and how your customers should be thinking about infrastructure, not as bespoke services but as a suite of capabilities, and maybe some of those adjacent services that you see as most leverageable for customers, and why?
>> Well, to tell this story, Dave, we're going to have to go a little bit back in time, all the way back to the 1990s or before then, when all you had was a set of hardware appliance vendors that sold you appliances that you put in your data center, and that inherently created a data silo because those hardware appliances were hardwired to your application.
And so an individual application that was dealing with auditing, as an example, wouldn't really be able to access the storage for another application because, you know, the architecture of that legacy world is tied to a data silo. And S3 launched in 2006 and introduced very low-cost storage that is an object store. And I'll tell you, Dave, you know, over the last 10-plus years we have seen all kinds of data coming to S3. Whereas before it might've been backups, or it might've been images and videos, now a pretty substantial data set is Parquet files and ORC files. These files are there for business analytics, for more real-time types of processing. And that has really been the trend of the future: taking these different files and putting them in a shared file layer, so any application today or in the future can tap into that data. And so this idea of the shared file layer is a major trend that has been taking off for the last, I would say, five or six years. And I expect that to not only keep on going but to really open up the type of services that you can then do on that shared file layer. And whether that's SageMaker or some of the machine learning introduced by our Connect service, it's bringing together the data as a starting point, and then the applications can evolve very rapidly on top of that.
>> I want to ask your opinion about big data architectures. One of our guests is Zhamak Dehghani; she's an amazing data architect. And she's put forth this notion of a distributed global mesh. And picking up on some of the comments Andy Jassy made at re:Invent, essentially, "Hey, we're bringing AWS to the edge. We see the data center as just another edge node." So you're seeing this massive distributed system evolving. You guys have talked about that for a while, and data by its very nature is distributed, but we've had this tendency to put it into a monolithic data lake or a data warehouse, and it's sort of antithetical to that distributed nature.
So how do you see that playing out? What do you see customers in the future doing in terms of their big data architectures, and what does that mean for storage?
>> It comes down to the nature of the data and, again, the usage. And Dave, that's where I see the biggest difference in these modern data architectures from the legacy of 20 years ago: the idea that the data need drives the data storage. So let's take an example of the type of data that you always want to have on the edge. We have customers today that need to have storage in the field, and whether the field is scientific research, or oftentimes it's content creation in the film industry, or it's military operations, there's a lot of data that needs to be captured and analyzed in the field. And for us, what that means is that, you know, we have a suite of products called Snowball, and whether it's Snowball or Snowcone, take your pick. That whole portfolio of AWS services is targeted at customers that need to do work with storage at the edge. And so, you know, if you think about the need for multiple applications acting on the same data set, that's when you keep it in an AWS region. And what we've done in AWS storage is we've recognized that, depending on the need of usage, where you put your data and how you interact with it may vary. But we've built a whole set of services like data transfer to help make sure that we can connect data from, for example, that new Snowcone into a region automatically. And so our goal, Dave, is to make sure that when customers are operating at the edge or they're operating in the region, they have the same quality of storage service and they have easy ways to go between them. You shouldn't have to pick; you should be able to do it all.
>> So in the spirit of "do it all," there's this sort of age-old dynamic in the tech business where you've got the friction between the best of breed and the integrated suite. And my question is around what you're optimizing for customers.
And can you have your cake and eat it too? In other words, why AWS storage? What makes it compelling? Is it because it's kind of a best-of-breed storage service, or is it because it's integrated with AWS? Would you ever sub-optimize one in order to get an advantage in the other? Or can you actually, you know, have your cake and eat it too?
>> The way that we build storage is to focus on both the breadth of capabilities and the depth of capabilities. And so where we identify a particular need where we think that it takes a whole new service to deliver, we'll go build that service. And an example of that is SFTP, our AWS SFTP service. You know, there's a lot of SFTP usage out there, and there will be for a while because of the, you know, the legacy B2B type of architectures that still live in the business world today. And so we looked at that problem. We said, how are we going to build that in the best depth way, in the best focus? And we launched a separate service for that. And so our goal is to take the individual building blocks of EBS and Glacier and S3 and make them best of class and the most comprehensive in the capabilities of what we can do, and where we identify a very specific need, we'll go build a service for it. But Dave, you know, as an example of that idea of both depth and breadth, S3 Storage Lens is a great example. S3 Storage Lens is a new capability that we launched late last year. And what it does is it lets you look across all your regions and all your accounts and get a summary view of all your S3 storage, whether that's buckets or the most active prefixes that you have, and be able to drill down from that. And that is built into the S3 service and available for any customer that wants to turn it on in the AWS Management Console.
>> Right, and as we saw, you just recently made, I called it super-duper block storage, but you made some improvements in really addressing the highest performance.
I want to ask you, so we've all learned about and experienced the benefits of cloud over the last several years, and especially in the last 10 months during the pandemic, but one of the challenges, and it's particularly acute with I/O, is of course latency and moving data around and accessing data remotely. It's a challenge for customers, you know, due to the speed of light, et cetera. So my question is, how is AWS thinking about all that data that still resides on premises? I think we heard at re:Invent that still something like 90% of the opportunity, or the workloads, are still on-prem, living inside customers' data centers. So how do you tap into those and help customers innovate with on-prem data, particularly from a storage angle?
>> Well, we always want to provide the best-of-class solution for those low-latency workloads. And that's why we launched Block Express just late last year at re:Invent. Block Express is a new capability in preview on top of our io2 Provisioned IOPS volume type. And what's really interesting about Block Express, Dave, is that the way that we're able to deliver the performance of Block Express, which is SAN performance with cloud elasticity, is that we went all the way down to the network layer and we customized the hardware and software. And at the network layer we built Block Express on something called SRD, which stands for Scalable Reliable Datagram. And basically what it's letting us do is offload all of our EBS operations for Block Express onto the Nitro card, onto hardware. And so that type of innovation, where we're able to, you know, take advantage of modern commodity, multi-tenant data center networks, where we're sending this new network protocol across a large number of network paths, that type of innovation all the way down to the protocol level helps us innovate in a way that's hard, in fact, I would say impossible, for other SAN providers to really catch up and keep up with.
And so we feel that the amount of innovation that we have for delivering those low-latency workloads in our AWS cloud storage is unlimited, really, because of that ability to customize software, hardware, and network protocols as we go along. Without requiring upgrades from a customer, it just gets better, and the customer benefits. Now, if you want to stay in your data center, that's why we built Outposts. And for Outposts, we have EBS and we have S3 for Outposts, and our goal there is that some customers will have workloads where they want to keep them resident in the data center. And for those customers we want to give them those AWS storage opportunities as well.
>> So thank you for coming back to Block Express. So you call it, you know, SAN in the cloud. So essentially it comprises a custom-built storage network. Is that right? What you just described, SRD, I think you called it.
>> Yeah, SRD is used by other AWS services as well, but it is a custom network protocol that we designed to deliver the lowest-latency experience, and we're taking advantage of it with Block Express.
>> So sticking with traditional data centers for a moment, I'm interested in your thoughts on the importance of the cloud pricing approach, i.e., the consumption model, the pay-by-the-drink. Obviously it's one of the most attractive features, and I ask that because we're seeing what Andy Jassy refers to as the old guard institute flexible pricing models. Two of the biggest storage companies, HPE with GreenLake and Dell with this thing called APEX, have announced such models for on-prem and presumably cross-cloud. How do you think this is going to impact your customers' leverage of AWS cloud storage? Is it something that you have an opinion on?
>> Yeah, I think it all comes down to, again, that usage of the storage, and this is where I think there's an inherent advantage for our cloud storage.
So there might be an attempt by the old guard to lower prices or add flexibility, but at the end of the day it comes down to what the customer actually needs to tune. And if you think about gp3, which is the new EBS volume, the idea with gp3 is we're going to pass along savings to the customer by making the storage 20% cheaper than gp2. And we're going to make the product better by giving a great, reliable baseline performance. But we're also going to let customers who want to run workloads like Cassandra on EBS tune their throughput separately, for example, from their capacity. So if you're running Cassandra, sometimes you don't need to change your capacity. Your storage capacity works just fine. But what happens with, for example, a Cassandra workload is that you may need more throughput. And if you're buying a hardware appliance, you just have to buy for your peak. You have to buy for the max of what you think your throughput will be and the max of what your storage is. And this inherent flexibility that we have for AWS storage, being able to tune throughput separately from capacity like you do for gp3, that is really where the future is for customers having control over costs and control over customer experience without compromising or trading off either one.
>> Awesome, thank you for that. So in the time we have remaining, Mai-Lan, I want to talk about the topic of diversity and social impact, and with you as a woman leader, a woman executive, I really want to get your perspectives on this. And I've shared with the audience previously, in one of my Breaking Analysis segments, your boxing video, which is awesome. And so you've got a lot of unique, non-traditional aspects to your life, and I love it, but I want to ask you this. So it's obviously, you know, certainly politically and socially correct to talk about diversity and the importance of diversity. There's data that suggests that diversity is good economically, not just socially, and of course it's the right thing to do.
But there are those, you know, Peter Thiel is probably the most prominent, but there are others that say, "You know what? Forget that, just hire people just like you. You'll be able to go faster, ramp up more quickly, hit escape velocity. It's natural." And that's what you should do. Why is that not the right approach? Why is diversity both, of course, socially, you know, responsible, but also, you know, good for business?
>> For Amazon, we think about diversity as something that is essential to how we think about innovation. And so, Dave, as you know from listening to some of the announcements at re:Invent, we launch a lot of new ideas, like new concepts and new services, in AWS. And just bringing that lens down to storage, S3 has been reinventing itself every year since we launched in 2006. EBS introduced the first SAN in the cloud late last year and continues to reinvent how customers think about block storage. We would not be able to look at a product in a different way and think to ourselves, not just what does the legacy system do in a data center today, but how do we want to build this new distributed system in a way that helps customers achieve not just what they're doing today but what they want to do in five and 10 years. You can't get that innovative mindset without bringing different perspectives to the table. And so we strongly believe in hiring people who are from underrepresented groups, whether that's gender, or it's related to racial equality, or it's geographic diversity, and bringing them in to have the conversation, because those diverse viewpoints inform how we can innovate at all levels in AWS.
>> Right, and so I really appreciate your perspectives on that. And as you probably know, theCUBE has been, you know, a very big advocate of diversity generally, but women in tech specifically; we've participated a lot.
And I often ask this question, you know, as a smaller company: I, and some of my other colleagues in small business, sometimes we struggle. And so my question is, what's your advice for going beyond, you know, the good old boys network? I think large companies like AWS and, you know, the big players, you've got a responsibility too, but you can put somebody in charge and make it their full-time job. How should smaller companies that are largely white-male dominated become more diverse? What should they do to increase that diversity?
>> I think the place to start is voice. A lot of what we try to do is make sure that the underrepresented voice is heard. And so, Dave, any small business owner in any industry can encourage voice for your underrepresented or your unheard populations. And honestly, it is as simple as being in a meeting and looking around that table, or on your screen as it were, and asking yourself, who hasn't talked? Who hasn't weighed in? Particularly if the debate is contentious or even animated. And you will see, particularly if you note this over time, that there may be somebody, whether it's a member of an underrepresented group, or it's a woman who's early in her career, or it's just a member of your team who happens to be a white male too, who's not being heard. And you can ask that person for their perspective. And that is a step that every one of us can and should take, which is to ask to have everyone's voice at the table, to listen and to weigh in on it. So I think that is something everyone should do. I think if you are a member of an underrepresented group, as, for example, I'm Vietnamese American and I'm a female in tech, I think it's something to think about how you can make sure that you're always taking that bold step forward. And it's one of the topics that we covered at re:Invent.
We had a great discussion with a group of women CEOs, and a lot of what we talked about is being bold, taking the challenge of being bold in tough situations. And that is an important thing, I think, for anybody to keep in mind, but especially for members of underrepresented groups, because sometimes, Dave, that bold step that you kind of think of as like, "Oh, I don't know if I should ask for that promotion," or "I don't know if I should volunteer for that project," it's not a big ask, but it's big in your head. And so if you can internalize, as a member of, you know, a group that maybe isn't heard or seen as much, how you can take those bold challenges and step forward and learn, maybe fail also, 'cause that's how you learn, then that is a way to also have people learn and develop and become leaders in whatever industry it is.
>> That's great advice. I think most of us can relate to that, Mai-Lan, because when we started in the industry, we may have been timid. You didn't want to necessarily speak up. And I think it's incumbent upon those in a position of power, and by the way, power might just be running a meeting agenda, to maybe call on those folks. Maybe it's not diversity of gender or, you know, race. Maybe it's just the underrepresented. Maybe that's a good way to start building muscle memory. So that's unique advice that I hadn't heard before. So thank you very much for that. I appreciate it. And hey, listen, thanks so much for coming on theCUBE on Cloud. We're out of time, and we really always appreciate your perspectives, and you're doing a great job. Thank you.
>> Great, thank you, Dave. Thanks for having me, and have a great day.
>> All right, and keep it right there, everybody. You're watching theCUBE on Cloud. We'll be right back. (gentle upbeat music)