Matt Burr, Scott Sinclair, Garrett Belschner | The Convergence of File and Object


 

>>From around the globe, presenting The Convergence of File and Object, brought to you by Pure Storage.
>>We're back with The Convergence of File and Object and a power panel. This is a special content program made possible by Pure Storage and co-created with theCUBE. In this series, we're exploring the coming together of file and object storage, trying to understand the trends that are driving this convergence, the architectural considerations that users should be aware of, and which use cases make the most sense for so-called unified fast file and object storage. With me are three great guests to unpack these issues. Garrett Belschner is a data center solutions architect with CDW. Scott Sinclair is a senior analyst at Enterprise Strategy Group. He's got deep experience in enterprise storage and brings that independent analyst perspective. And Matt Burr is back with us. Gentlemen, welcome to the program.
>>Thank you.
>>Hey Scott, let me start with you and get your perspective on what's going on in the market with object, the cloud, and the huge amount of unstructured data out there that lives in files. Give us your independent view of the trends that you're seeing.
>>Well, Dave, where to start? I mean, surprise, surprise, data's growing. We've been talking about data growth for decades now, but what's really fascinating, what has changed, is that because of the digital economy, digital business, digital transformation, whatever you call it, people are not just storing data. They actually have to use it. We see this in trends like analytics and artificial intelligence. And what that does is increase the demand not only for consolidation of massive amounts of storage, which we've seen for a while, but also for incredibly low-latency access to that storage.
And I think that's one of the things driving this need for convergence, as you put it: having multiple protocols consolidated onto one platform, but also the need for high-performance access to that data.
>>Thank you for that, a great setup. I wrote down three topics that we're going to unpack as a result. So Garrett, let me go to you. Maybe you can give us the perspective of what you see with customers. Is this a push where customers are saying, "Hey, listen, I need to converge my file and object"? Or is it more a story where they're saying, "Garrett, I have this problem," and then you see unified file and object as a solution?
>>Yeah, I think for us it's taking that consultative approach with our customers and really hearing the pain around some of the pipelines, the way that they're going to market with data today, and what problems they're seeing. We're also seeing a lot of the change driven by the software vendors. Being able to support a disaggregated design, where you're not having to upgrade and maintain everything as a single block, has been a place where we've seen a lot of customers pivot, so they have more flexibility as they need to maintain larger volumes of data and higher-performance data. Having the ability to do that separately from compute and cache and some of those other layers is really critical.
>>So, Matt, I wonder if you could follow up on that. Garrett was talking about this disaggregated design. I like it: distributed cloud, et cetera. But then we're talking about bringing things together in one place, right? So square that circle. How does this fit in with this hyper-distributed cloud and edge that's getting built out?
>>Yeah.
You know, I could give you the easy answer on that, but I can also pass it back to Garrett. Garrett, maybe it's important to talk about Elastic and Splunk and some of the things that you're seeing in that world. I think you can give a pretty qualified answer to the question relative to what your customers are seeing.
>>Oh, that'd be great, please.
>>Yeah, absolutely, no problem at all. I think with Splunk moving from its traditional design, or classic design, whatever you want to call it, up into SmartStore, that was one of the first that we saw make that move towards separating object out. And I think a lot of that comes from their own move to the cloud and updating their code to take advantage of object in the cloud. But we're starting to see, with Vertica Eon for example, Elastic, and other folks, the same type of approach. In the past we were building out many 2U servers and jamming them full of SSDs and NVMe drives. That was great, but it doesn't really scale. And it gets into that same problem that we see with hyperconvergence a little bit, where you're always adding something that maybe you didn't want to add. So again, being driven by software is really where we're seeing the world open up. But that whole idea of having that as a hub, a central place that you can then leverage out to other applications, whether that's out to the edge for machine learning or AI applications to take advantage of it, I think that's where that convergence really comes back in.
But like Scott mentioned earlier, folks are now really doing things with the data, where before I think they were storing it and trying to figure out, "What are we going to actually do with it when we need to do something with it?" So this is making it possible.
>>Yeah. And Dave, if I could just tack onto the end of Garrett's answer there: in particular with Vertica Eon Mode, the ability to leverage sharded subclusters gives you an advantage in terms of being able to isolate performance hotspots, and there's an advantage in being able to do that on a FlashBlade, for example. Sharded subclusters allow you to say, "I am going to give prioritization to this particular element of my application and my dataset," but I can still share that data across those subclusters. So as you see Vertica with Eon Mode, and you see Splunk advance with SmartStore, these are all advancements where it's a chicken-and-egg thing. They need faster storage, they need a consolidated dataset, and that's what allows these things to drive forward.
>>Yes, Vertica Eon Mode is the ability to separate compute and storage and scale them independently. I think Vertica, if they're not the only one, they're one of the only ones that does that both in the cloud and on-prem, and that plays into the distributed nature of this hyper-distributed cloud, as I sometimes call it. I'm interested in the data pipeline, and I wonder, Scott, if we can talk a little bit about where unified object and file fit. I mean, I'm envisioning this distributed mesh, and then UFO is sort of a node on that mesh that I can tap when I need it.
But Scott, what are you seeing as the state of infrastructure as it relates to the data pipeline, and the trends there?
>>Yeah, absolutely, Dave. When I think data pipeline, I immediately gravitate to analytics or machine learning initiatives. And one of the big things we see, and it's an interesting trend, is that we continue to see increased investment and increased interest in AI. As companies get started, they think, "Okay, what does that mean? Well, I've got to go hire a data scientist. Okay, well, that data scientist probably needs some infrastructure." What often happens in these environments is that it ends up being a bespoke or one-off environment. Then over time organizations run into challenges, and one of the big challenges is that the data science team, or people whose jobs are outside of IT, spend way too much time trying to get the infrastructure to keep up with their demands, predominantly around data performance. One of the ways organizations have started mitigating that, especially those with artificial intelligence workloads in production, and we found this in our research, is by deploying flash all across the data pipeline.
>>We have data on this. Sorry to interrupt, but Pat, if you could bring up that chart, that would be great. So take us through this, Scott, and share with us what we're looking at here.
>>Yeah, absolutely. Dave, I'm glad you brought this up. We did this study, I want to say late last year, and one of the things we looked at was artificial intelligence environments. One thing that you're not seeing on this slide is that we went through and asked all around the data pipeline, and we saw flash everywhere. But I thought this was really telling, because this is around data lakes.
When many people think about the idea of a data lake, they think of it as a repository, a place where you keep maybe cold data. What we see here, especially within production environments, is a pervasive use of flash storage. I think 69% of organizations are saying their data lake is mostly flash or all flash, and I think we had 0% that don't have any flash in that environment. So organizations are out there saying that flash is an essential technology to allow them to harness the value of their data.
>>So Garrett, and then Matt, I wonder if you could chime in as well. We talk about digital transformation, and I sometimes call it the COVID forced march to digital transformation. I'm curious as to your perspective on things like machine learning and its adoption, and Scott, you may have a perspective on this as well. We had to pivot. We had to get laptops. We had to secure the endpoints. VDI, those became super high priorities. What happened to injecting AI and machine learning into my applications? Did that go on the back burner? Was it accelerated along with the need to digitally transform? Garrett, I wonder if you could share with us what you saw with customers last year.
>>Yeah. I mean, I think we definitely saw an acceleration. I think folks in my market are still figuring out how they inject that into more of a widely distributed business use case. But again, there's this data hub, allowing folks to now take advantage of the data that they've had in these data lakes for a long time. I agree with Scott.
I mean, many of the data lakes that we have were somewhat flash accelerated, but they were typically made up of large-capacity, slower-spinning nearline drives, accelerated with some flash. But I'm really starting to see folks now look at some of those older Hadoop implementations and leverage new ways to look at how they consume data. Many of the customers doing those redesigns are coming to us wanting to look at all-flash solutions. So we're definitely seeing it, and we're seeing an acceleration towards folks trying to figure out how to actually use it in more of a business sense now, where before I feel it was a little bit more skunkworks, people dealing with it in a much smaller situation, maybe in the executive offices, trying to do some testing and things.
>>Scott, you're nodding away. Anything you can add in here?
>>Yeah. Well, first off, it's great to get that confirmation that the stuff we're seeing in our research, Garrett's seeing out in the field and in the real world. But as it relates to the past year, it's been really fascinating. One of the things we study at ESG is IT buying intentions: what are the initiatives that companies plan to invest in? At the beginning of 2020, we saw heavy interest in machine learning initiatives. Then you transition to the middle of 2020, in the midst of COVID. Some organizations continued on that path, but a lot of them had to pivot, right? How do we get laptops to everyone? How do we continue business in this new world? Now, as we enter 2021, and hopefully we're coming out of the pandemic era, we're getting into a world where organizations are pivoting back towards these strategic investments around "how do I maximize the usage of data," and actually accelerating those, because they've seen the importance of digital business initiatives over the past year.
>>Yeah, Matt, I mean, when we exited 2019, we saw a narrowing of experimentation, and our premise was that organizations were going to start operationalizing all their digital transformation experiments. And then we had a 10-month petri dish on digital. So what are you seeing in this regard?
>>A 10-month petri dish is an interesting way to describe it. There's another candidate for pivot in there around ransomware as well, right? Security entered into the mix, which took people's attention away from some of this as well. Look, I'd like to bring this up just a level or two, because what we're actually talking about here is progress, right? And progress is an inevitability. Whether you believe that it's by 2025, or you think it's 2035 or 2050, it doesn't matter. We're on a forced march to the eradication of disk. And that is happening, in many ways, due to some of the things that Garrett and Scott were referring to in terms of our customers' demands for how they're going to actually leverage the data that they have. And that brings me to my final point on this, which is that we see customers in three phases. There's the first phase, where they say, "Hey, I have this large data store, and I know there's value in there. I don't know how to get to it." Or, "I have this large data store, and I've started a project to get value out of it, and we failed." Those could be customers that marched down the Hadoop path early on. They got some value out of it, but they realized that HDFS wasn't going to be a modern protocol going forward, for any number of reasons.
The first being, "Hey, if I have gold dot master, how do I know that my gold dot four is consistent with my gold dot master?" So data consistency matters. And then you have the third group that says, "I have these large datasets. I know how to extract value from them, and I'm already on to the Verticas, the Elastics, the Splunks, et cetera." I think that latter group are the folks that kept their projects going, because they were already extracting value from them. The first two groups we were seeing saying the second half of this year is when they're going to begin really picking up on these types of initiatives again.
>>Well, thank you, Matt, by the way, for hitting the escape key, because I think value from data really is what this is all about. And there are some real blockers there that I want to talk about. You mentioned HDFS. I mean, we were very excited, of course, in the early days of Hadoop. Many of the concepts were profound, but at the end of the day, it was too complicated. We've got these hyper-specialized roles that are serving the business, but it still takes too long. It's too hard to get value from data. One of the blockers is infrastructure: the complexity of that infrastructure really needs to be abstracted, taken up a level. We're starting to see this in cloud, where some of those abstraction layers are being built by some of the cloud vendors, but more importantly, a lot of the vendors like Pure are saying, "Hey, we can do that heavy lifting for you, and we have the engineering expertise to do cloud native." So I'm wondering what you guys see. Maybe Garrett, you could start us off and the others can chime in, on some of the blockers to getting value from data and how we're going to address those in the coming decade.
>>Yeah.
I mean, I think part of it we're solving here, obviously, with Pure bringing flash to a market that traditionally was utilizing a much slower media. The other thing that I see that's very nice with FlashBlade, for example, is the ability to do things a blade at a time once you get it set up. A lot of these teams don't have big budgets, and being able to break projects down into almost a blade-sized chunk has really allowed folks to get more projects off the ground, because they don't have to buy a full, expensive system to run them. So that's helped a lot. I think the wider use cases have helped a lot too. Matt mentioned ransomware, and using SafeMode as a place to help with ransomware has been a really big growth spot for us. We've got a lot of customers very interested and excited about that. The other thing I would say is that bringing DevOps into data is another thing we're seeing: that push towards DataOps, using automation and infrastructure as code as a way to drive things through the system, the way that we've seen with automation through DevOps. That's an area where we're seeing a ton of growth from a services perspective.
>>Guys, any other thoughts on that? I'll tee it up there. We are seeing some bleeding-edge organizational changes at some companies, which is somewhat counterintuitive, especially from a cost standpoint. Think of some of the internet companies that do music, for instance, and are adding podcasts, et cetera. Those are different data products.
We're seeing them actually reorganize their data architectures to make them more distributed, and actually put the domain heads, the business heads, in charge of the data and the data pipeline. That is maybe less efficient, but again, it's some of the bleeding edge. What else are you guys seeing out there that might be a harbinger of the next decade?
>>I'll go first. I think, specific to the construct that you threw out, Dave, one of the things that we're seeing is that the application owner, maybe it's the DevOps person, or maybe it's the application owner through the DevOps person, is becoming more technical in their understanding of how infrastructure interfaces with their application. What we're seeing on the FlashBlade side is that we're having a lot more conversations with application people than just IT people. It doesn't mean that the IT people aren't there. The IT people are still there for sure; they have to deliver the service, et cetera. But the days of IT building up a catalog of services, and a business owner subscribing to one of those services, picking whatever fits their need, I think that's the construct that changes going forward. The application owner is becoming much more prescriptive about how they want the infrastructure to fit into their application, and that's a big change. Certainly folks like Garrett and CDW do a good job with this, being able to get to the application owner and bring those two sides together. There's a tremendous amount of value there. For us it's been a little bit of a retooling. We've traditionally sold to the IT side of the house.
And we've had to teach ourselves how to go talk the language of applications. So I think you pointed out a good construct there, and that application owner playing a much bigger role in what they're expecting from the performance of IT infrastructure is, I think, a key change.
>>Interesting. I mean, that definitely is a trend. That puts you guys closer to the business, where the infrastructure team is serving the business. Sometimes I talk to data experts, especially data owners or data product builders, who are frustrated that they feel like they have to beg the data pipeline team to get new data sources or get data out. How about the edge? Maybe Scott, you can kick us off. We're seeing the emergence of edge use cases, AI inferencing at the edge, a lot of data at the edge. What are you seeing there, and, to bring us back to it, how does this unified object and file fit?
>>Wow, Dave, how much time do we have?
>>First of all, Scott, why don't you just tell everybody what the edge is? You've got it all figured out.
>>How much time do you have? At the end of the day, that's a great question, right? If you take a step back, I think it comes back to something you mentioned, Dave: it's about extracting value from data. When you extract value from data, as Matt pointed out, the influencers or the users of data, the application owners, have more power, because they're driving revenue now. So from an IT standpoint, it's not just, "Hey, here are the services you get. Use them or lose them, and don't throw a fit." It is, "No, I have to adapt. I have to follow what my application owners need." Now, when you bring that back to the edge, what it means is that data is not localized to the data center. I mean, we just went through a nearly 12-month period where
Now, when you bring that back to the edge, what it means is, is that data is not localized to the data center. I mean, we just went through a nearly 12 month period where >>The entire workforce for most of the companies in this country had went distributed and business continued. So if business is distributed, data is distributed. And that means, that means in the data center, that means at the edge, that means that the cloud, and that means in all other places and tons of places. And what it also means is you have to be able to extract and utilize data anywhere it may be. And I think that's something that we're going to continue to and continue to see. And I think it comes back to, you know, if you think about key characteristics, we've talked about, um, things like performance and scale for years, but we need to start rethinking it because on one hand, we need to get performance everywhere. But also in terms of scale, and this ties back to some of the other initiatives and getting value from data, it's something I call the, the massive success problem. One of the things we see, especially with, with workloads like machine learning is businesses find success with them. And as soon as they do they say, well, I need about 20 of these projects now will all of a sudden that overburdens it organizations, especially across, across core and edge and cloud environments. And so when you look at environments ability to meet performance and scale demands, wherever it needs to be is something that's really important. You know, >>Dave, I'd like to, um, just sort of tie together sort of two things that, um, I think that I heard from Scott and Garrett that I think are important and it's around this concept of scale. Um, you know, some of us are old enough to remember the day when kind of a 10 terabyte blast radius was too big of a blast radius for people to take on, or a terabyte of storage was considered to be, um, you know, uh, uh, an exemplary budget environment. Right. 
Now we think of terabytes kind of like we used to think of gigabytes in some ways. A petabyte, you don't have to explain to anybody what a petabyte is anymore. And what's on the horizon, and it's not far off, are exabyte-type dataset workloads. You start to think about what could be in that exabyte of data. We've talked about how you extract that value, and we've talked about how you start. But if the scale is big, not everybody's going to start at a petabyte or an exabyte. To Garrett's point, the ability to start small and grow into these projects is a really fundamental concept here, because you're not going to just say, "I'm going to go kick off a five-petabyte project." Whether you do that on disk or flash, it's going to be expensive, right? But if you could start at a couple of hundred terabytes, not just as a proof of concept but as something that you could get predictable value out of, then you could say, "Hey, this scales either linearly or non-linearly, in a way that I can then map my investments to how I go dig deeper into this." That's how these successful projects are going to start, because for the people that are starting with these very large, expansive greenfield projects at multi-petabyte scale, it's going to be hard to realize near-term value.
>>Excellent. We've got to wrap, but Garrett, I wonder if you could close it out. When you look forward and you talk to customers, do you see this unification of file and object? Is this an evolutionary trend? Is it something that is going to be a lever that customers use? How do you see it evolving over the next two, three years and beyond?
>>Yeah, I mean, from our perspective, just from what we're seeing in the numbers within the market, the growth that's happening with unstructured data is really just starting to finally hit this data deluge, or whatever you want to call it, that we've been talking about for so many years. It really does seem to be becoming true now, as we start to see things scale out and folks settle into, "Okay, I'm going to use the cloud to start and maybe train my models, but now I'm going to bring it back on-prem because of latency or security or whatever the decision points are there." This is something that is not going to slow down. And I think folks like Pure, with the tools that they give us to use and bring to market with our customers, are really key and critical for us. So I see it as a huge growth area and a big focus for us moving forward.
>>Guys, great job unpacking a topic that's been covered a little bit, but I think we covered some ground that is new. So thank you so much for those insights and that data. Really appreciate your time.
>>Thanks, Dave.
>>Thanks.
>>Yeah, thanks, Dave.
>>Okay, and thank you for watching The Convergence of File and Object. Keep it right there. We'll be right back after this short break.
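
Editor's sketch: Matt Burr's consistency question in the panel (how do you know a fourth copy of a dataset still matches the gold master?) can be illustrated with a checksum comparison. This is only a minimal illustration under stated assumptions, not how HDFS, FlashBlade, or any product mentioned by the panel works. The copy names (`gold.1`, `gold.4`), the sample records, and the helper functions `digest` and `inconsistent_copies` are all hypothetical.

```python
import hashlib


def digest(chunks):
    """Return the SHA-256 hex digest of an iterable of byte chunks."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()


def inconsistent_copies(master, copies):
    """Return the names of copies whose digest differs from the master's."""
    expected = digest(master)
    return [name for name, chunks in copies.items() if digest(chunks) != expected]


# Hypothetical in-memory stand-ins for a gold master and two of its copies.
gold_master = [b"order-1,42\n", b"order-2,17\n"]
copies = {
    "gold.1": [b"order-1,42\n", b"order-2,17\n"],
    "gold.4": [b"order-1,42\n", b"order-2,71\n"],  # this copy has drifted
}

drifted = inconsistent_copies(gold_master, copies)
print(drifted)
```

With many independent copies, a check like this has to run against every copy on every change, which is one way to read the panel's argument for a single shared dataset that several applications access in place.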

Published Date : Jan 28 2021

