Pure Storage Convergence of File and Object FULL SHOW V1
we're running what i would call a little mini series and we're exploring the convergence of file and object storage what are the key trends why would you want to converge file and object what are the use cases and architectural considerations and importantly what are the business drivers of uffo so-called unified fast file and object in this program you'll hear from matt burr who is the gm of pure's flashblade business and then we'll bring in the perspectives of a solutions architect garrett belsner who's from cdw and then the analyst angle with scott sinclair of the enterprise strategy group esg he'll share some cool data on our power panel and then we'll wrap with a really interesting technical conversation with chris bond cb bond who is a lead data architect at microfocus and he's got a really cool use case to share with us so sit back and enjoy the program from around the globe it's thecube presenting the convergence of file and object brought to you by pure storage we're back with the convergence of file and object a special program made possible by pure storage and co-created with the cube so in this series we're exploring that convergence between file and object storage we're digging into the trends the architectures and some of the use cases for unified fast file and object storage uffo with me is matt burr who's the vice president and general manager of flashblade at pure storage hello matt how you doing i'm doing great morning dave how are you good thank you hey let's start with a little 101 you know kind of the basics what is unified fast file and object yeah so look i mean i think you got to start with first principles talking about the rise of unstructured data so um when we think about unstructured data you sort of think about the projections 80 percent of data by 2025 is going to be unstructured data whether that's machine generated data or um you know ai and ml type workloads uh you start to sort of see this um i don't want to say it's a boom uh but it's
sort of a renaissance for unstructured data if you will we move away from you know what we've traditionally thought of as general purpose nas and and file shares to you know really things that focus on uh fast object taking advantage of s3 cloud native applications that need to integrate with applications on site um you know ai workloads ml workloads tend to look to share data across you know multiple data sets and you really need to have a platform that can deliver both highly performant and scalable fast file and object from one system so talk a little bit more about some of the drivers that you know bring forth that need to unify file and object yeah i mean look you know there's a there's there's a real challenge um in managing you know bespoke uh bespoke infrastructure or architectures around general purpose nas and das etc so um if you think about how a an architect sort of looks at an application they might say well okay i need to have um you know fast das storage proximal to the application um but that's going to require a tremendous amount of das which is a tremendous amount of drives right hard drives are you know historically pretty pretty pretty unwieldy to manage because you're replacing them relatively consistently at multi-petabyte scale um so you start to look at things like the complexity of das you start to look at the complexity of general purpose nas and you start to just look at quite frankly something that a lot of people don't really want to talk about anymore but actual data center space right like consolidation matters the ability to take you know something that's the size of a microwave like a modern flashblade or a modern um you know uffo device uh replaces something that might be you know the size of three or four or five refrigerators so matt what why is is now the right time for this i mean for years nobody really paid much attention to object s3 already obviously changed you know that course most of the world's data is still stored in
file formats and you get there with nfs or smb why is now the time to think about unifying object and file well because we're moving to things like a contactless society um you know the the things that we're going to do are going to just require a tremendous amount more compute power network um and quite frankly storage throughput and you know i can give you two sort of real primary examples here right you know warehouses are being you know taken over by robots if you will um it's not a war it's a it's a it's sort of a friendly advancement in you know how do i how do i store a box in a warehouse and you know we have we have a customer who focuses on large sort of big box distribution warehousing and you know a box that carried a an object two weeks ago might have a different box size two weeks later well that robot needs to know where the space is in the data center in order to put it but also needs to be able to process hey i don't want to put the thing that i'm going to access the most in the back of the warehouse i'm going to put that thing in the front of the warehouse all of those types of data you know sort of real time you can think of the robot as almost an edge device is processing in real time unstructured data in its object right so it's sort of the emergence of these new types of workloads and i give you the opposite example the other end of the spectrum is ransomware right you know today you know we'll talk to customers and they'll say quite commonly hey if you know anybody can sell me a backup device i need something that can restore quickly um if you had the ability to restore something in 270 terabytes an hour or 250 terabytes an hour uh that's much faster when you're dealing with a ransomware attack you want to get your data back quickly you know so i want to add i was going to ask you about that later but since you brought it up what is the right i guess call it architecture for for for ransomware i mean how and explain like how unified object and 
file which appointment i get the fast recovery but how how would you recommend a customer uh go about architecting a ransomware proof you know system yeah well you know with with flashblade and and with flasharray there's an actual feature called called safe mode and that safe mode actually protects uh the snapshots and and the data from uh sort of being a part of the of the ransomware event and so if you're in a type of ransomware situation like this you're able to leverage safe mode and you say okay what happens in a ransomware attack is you can't get access to your data and so you know the bad guy the perpetrator is basically saying hey i'm not going to give you access to your data until you pay me you know x in bitcoin or whatever it might be right um with with safe mode those snapshots are actually protected outside of the ransomware blast zone and you can bring back those snapshots because what's your alternative if you're not doing something like that your alternative is either to pay and unlock your data or you have to start retouring restoring excuse me from tape or slow disk that could take you days or weeks to get your data back so leveraging safe mode um you know in either the flasharray or the flashblade product uh is a great way to go about architecting against ransomware i got to put my my i'm thinking like a customer now so safe mode so that's an immutable mode right can't change the data um is it can can an administrator go in and change that mode can you turn it off do i still need an air gap for example what would you recommend there yeah so there there are still um uh you know sort of rbac or role-based access control policies uh around who can access that safe mode and who can right okay so uh anyway subject for a different day i want to i want to actually bring up uh if you don't object a topic that i think used to be really front and center and it now is becoming front and center again i mean wikibon just produced a research
note forecasting the future of flash and hard drives and those of you who follow us know we've done this for quite some time and you can if you could bring up the chart here you you could and we see this happening again it was originally we forecast the the the death of of quote-unquote high spin speed disc drives which is kind of an oxymoron but you can see on here on this chart this hard disk had a magnificent journey but they peaked in volume in manufacturing volume in 2010 and the reason why that is is so important is that volumes now are steadily dropping you can see that and we use wright's law to explain why this is a problem and wright's law essentially says that as you your cumulative manufacturing volume doubles your cost to manufacture decline by a constant percentage now i won't go too much detail on that but suffice it to say that flash volumes are growing very rapidly hdd volumes aren't and so flash because of consumer volumes can take advantage of wright's law and that constant reduction and that's what's really important for the next generation which is always more expensive to build uh and so this kind of marks the beginning of the end matt what do you think what what's the future hold for spinning disc in your view uh well i can give you the answer on two levels on a personal level uh it's why i come to work every day uh you know the the eradication or or extinction of an inefficient thing um you know i like to say that uh inefficiency is the bane of my existence uh and i think hard drives are largely inefficient and i'm willing to accept the sort of long-standing argument that um you know we've seen this transition in block right and we're starting to see it repeat itself in in unstructured data and i'm going to accept the argument that cost is a vector here and it most certainly is right hdds have been considerably cheaper uh than than than flash storage um you know even to this day uh you know up up to this point right but we're starting to 
approach the point where you sort of reach a a 3x sort of um you know differentiator between the cost of an hdd and an ssd and you know that really is that point in time when uh you begin to pick up a lot of volume and velocity and so you know that tends to map directly to you know what you're seeing here which is you know a a slow decline uh which i think is going to become even more rapid kind of probably starting around next year um where you start to see sds excuse me ssds uh you know really replacing hdds uh at a much more rapid clip particularly on the unstructured data side and it's largely around cost the the workloads that we talked about robots and warehouses or you know other types of advanced machine learning and artificial intelligence type applications and workflows you know they require a degree of performance that a hard drive just can't deliver we are we are seeing sort of the um creative innovative uh disruption of an entire industry right before our eyes it's a fun thing to live through yeah and and we would agree i mean it doesn't the premise there is that it doesn't have to be less expensive we think it will be by you know the second half or early second half of this decade but even if it's a we think around a 3x delta the value of of ssd relative to spinning disk is going to overwhelm just like with your laptop you know it got to the point where you said why would i ever have a spinning disc in my laptop we see the same thing happening here um and and so and we're talking about you know raw capacity you know put in compression and dedupe and everything else that you really can't do with spinning discs because of the performance issues you can do with flash okay let's come back to uffo can we dig into the challenges specifically that that this solves for customers give me give us some examples yeah so you know i mean if we if we think about the examples um you know the the robotic one um i think is is is the one that i think is the marker for
you know kind of of of the the modern side of of of what we see here um but what we're you know what we're what we're seeing from a trend perspective which you know not everybody's deploying robots right um you know there's there's many companies that are you know that aren't going to be in either the robotic business uh or or even thinking about you know sort of future type oriented type things but what they are doing is green field applications are being built on object um generally not on not on file and and not on block and so you know the rise of of object as sort of the the sort of let's call it the the next great protocol for um you know for uh for for modern workloads right this is this is that that modern application coming to the forefront and that could be anything from you know financial institutions you know right down through um you we've even see it and seen it in oil and gas uh we're also seeing it across across healthcare uh so you know as as as companies take the opportunity as industries to take this opportunity to modernize you know they're modernizing not on things that are are leveraging you know um you know sort of archaic disk technology they're they're they're really focusing on on object but they still have file workflows that they need to that they need to be able to support and so having the ability to be able to deliver those things from one device in a capacity orientation or a performance orientation uh while at the same time dramatically simplifying uh the overall administration of your environment both physically and non-physically is a key driver so the great thing about object is it's simple it's a kind of a get put metaphor um it's it scales out you know because it's got metadata associated with the data uh and and it's cheap uh the drawback is you don't necessarily associate it with high performance and and and as well most applications don't you know speak in that language they speak in the language of file you know or as you 
mentioned block so i i see real opportunities here if i have some some data that's not necessarily frequently accessed you know every day but yet i want to then whether end of quarter or whatever it is i want to i want to or machine learning i want to apply some ai to that data i want to bring it in and then apply a file format uh because for performance reasons is that right maybe you could unpack that a little bit yeah so um you know we see i mean i think you described it well right um but i don't think object necessarily has to be slow um and nor does it have to be um you know because when you think about you brought up a good point with metadata right being able to scale to a billions of objects being able to scale to billions of objects excuse me is of value right um and i think people do traditionally associate object with slow but it's not necessarily slow anymore right we we did a sort of unofficial survey of of of our of our customers and our employee base and when people described object they thought of it as like law firms and storing a word doc if you will um and that that's just you know i think that there's a lack of understanding or a misnomer around what modern what modern object has become and performant object particularly at scale when we're talking about billions of objects you know that's the next frontier right um is it at pace performance wise with you know the other protocols no uh but it's making leaps and bounds so you talked a little bit more about some of the verticals that you see i mean i think when i think of financial services i think transaction processing but of course they have a lot of tons of unstructured data are there any patterns you're seeing by by vertical market um we're you know we're not that's the interesting thing um and you know um as a as a as a as a company with a with a block heritage or a block dna those patterns were pretty easy to spot right there were a certain number of databases that you really needed to
support oracle sql some postgres work et cetera then kind of the modern databases around cassandra and things like that you knew that there were going to be vmware environments you know you could you could sort of see the trends and where things were going unstructured data is such a broader horizontal thing right so you know inside of oil and gas for example you have you know um you have specific applications and bespoke infrastructures for those applications um you know inside of media entertainment you know the same thing the the trend that we're seeing the commonality that we're seeing is the modernization of you know object as a starting point for all the all the net new workloads within within those industry verticals right that's the most common request we see is what's your object roadmap what's your you know what's your what's your object strategy you know where do you think where do you think object is going so um there isn't any um you know sort of uh there's no there's no path uh it's really just kind of a wide open field in front of us with common requests across all industries so the amazing thing about pure just as a kind of a little you know quasi you know armchair historian the industry is pure was really the only company in many many years to be able to achieve escape velocity break through a billion dollars i mean 3par couldn't do it isilon couldn't do it compellent couldn't do it i could go on but pure was able to achieve that as an independent company and so you become a leader you look at the gartner magic quadrant you're a leader in there i mean if you've made it this far you've got to have some chops and so of course it's very competitive there are a number of other storage suppliers that have announced products that unify object and file so i'm interested in how pure differentiates why pure um it's a great question um and it's one that uh you know having been a long time puritan uh you know i take pride in answering um and it's
actually a really simple answer um it's it's business model innovation and technology right the the technology that goes behind how we do what we do right and i don't mean the product right innovation is product but having a better support model for example um or having on the business model side you know evergreen storage right where we sort of look at your relationship to us as a subscription right um you know we're going to sort of take the thing that that you've had and we're going to modernize that thing in place over time such that you're not rebuying that same you know terabyte or you know petabyte of storage that you've that you that you've paid for over time so um you know sort of three legs of the stool uh that that have made you know pure clearly differentiated i think the market has has recognized that um you're right it's it's hard to break through to a billion dollars um but i look forward to the day that you know we we have two billion dollar products and i think with uh you know that rise in in unstructured data growing to 80 percent by 2025 and you know the massive transition that you know you guys have noted in in in your hdd slide i think it's a huge opportunity for us on you know the other unstructured data side of the house you know the other thing i'd add matt i've talked to coz about this is is it's simplicity first i've asked them why don't you do this why don't you do it and the answer is always the same is that adds complexity and we we put simplicity for the customer ahead of everything else and i think that served you very very well what about the economics of of unified file and object i mean if you bring in additional value presumably there's a there there's a cost to that but there's got to be also a business case behind it what kind of impact have you seen uh with customers yeah i mean look i'll i'll i'll go back to something i mentioned earlier which is just the reclamation of floor space and power and cooling right um you know there's a
you know there's people people people want to search for kind of the the sexier element if you will when it comes to looking at how we how you derive value from something but the reality is if you're reducing your power consumption by you know by by a material percentage power bills matter in big in big data centers um you know customers typically are are facing you know a paradigm of well i i want to go to the cloud but you know the cloud's turning out to be more expensive than i thought it was going to be or you know i figured out what i can use in the cloud i thought it was going to be everything but it's not going to be everything so hybrid's where we're landing but i want to be out of the data center business and i don't want to have a team of 20 storage people to match you know to administer my storage um you know so there's sort of this this very tangible value around you know hey if i could manage um you know multiple petabytes with one full-time engineer uh because the system uh to your and coz's point was radically simpler to administer didn't require someone to be running around swapping drives all the time would that be a value the answer is yes 100 percent of the time right and then you start to look at okay all right well on the uffo side from a product perspective hey if i have to manage a you know bespoke environment for this application if i have to manage a bespoke environment for this application and a bespoke environment for this application and a bespoke environment for this application i'm managing four different things and can i actually share data across those four different things there's ways to share data but most customers it just gets too complex how do you even know what your what your gold master copy is of data if you have it in four different places or you try to have it in four different places and it's four different siloed infrastructures so when you get to the sort of the side of you know how do we how do you measure value in uffo it's actually
being able to have all of that data concentrated in one place so that you can share it from application to application got it i'm interested we have a couple minutes left i'm interested in the the update on flashblade you know generally but also i have a specific question i mean look getting file right is hard enough uh you just announced smb support for flashblade i'm interested in you know how that fits in i think it's kind of obvious with file and object converging but give us the update on on flashblade and maybe you could address that specific question yeah so um look i mean we're we're um you know tremendously excited about the growth of flashblade uh you know we we we found workloads we never expected to find um you know the rapid restore workload was one that was actually brought to us from from from a customer actually and has become you know one of our one of our top two three four you know workloads so um you know we're really happy with the trend we've seen in it um and you know mapping back to you know thinking about hdds and ssds you know we're well on a path to building a billion dollar business here so you know we're very excited about that um but to your point you know you don't just snap your fingers and get there right um you know we've learned that doing file and object uh is is harder than block um because there's more things that you have to go do for one you're basically focused on three protocols smb nfs and s3 not necessarily in that order um but to your point about smb uh you know we we are uh on the path through to releasing um you know smb uh full full native smb support in in the system that will allow us to uh service customers we have a limitation with some customers today where they'll have an smb portion of their nfs workflow um and we do great on the nfs side um but you know we didn't we didn't have the ability to plug into the smb component of their workflow so that's going to open up a lot of opportunity for us um on on that front
um and you know we continue to you know invest significantly across the board in in areas like security which is you know become more than just a hot button you know today security's always been there but it feels like it's blazing hot today um and so you know going through the next couple years we'll be looking at uh you know developing some some um you know pretty material security elements of the product as well so uh well on a path to a billion dollars is the net on that and uh you know we're we're fortunate to have have smb here and we're looking forward to introducing that to to those customers that have you know nfs workloads today with an smb component yeah nice tailwind good tam expansion strategy matt thanks so much really appreciate you coming on the program we appreciate you having us and uh thanks much dave good to see you [Music] okay we're back with the convergence of file and object in a power panel this is a special content program made possible by pure storage and co-created with the cube now in this series what we're doing is we're exploring the coming together of file and object storage trying to understand the trends that are driving this convergence the architectural considerations that users should be aware of and which use cases make the most sense for so-called unified fast file and object storage and with me are three great guests to unpack these issues garrett belsner is the data center solutions architect he's with cdw scott sinclair is a senior analyst at enterprise strategy group he's got deep experience on enterprise storage and brings that independent analyst perspective and matt burr is back with us gentlemen welcome to the program thank you hey scott let me let me start with you uh and get your perspective on what's going on the market with with object the cloud a huge amount of unstructured data out there that lives in files give us your independent view of the trends that you're seeing out there well dave you know where to start i
mean surprise surprise data is growing um but one of the big things that we've seen is we've been talking about data growth for what decades now but what's really fascinating is or changed is because of the digital economy digital business digital transformation whatever you call it now people are not just storing data they actually have to use it and so we see this in trends like analytics and artificial intelligence and what that does is it's just increasing the demand for not only consolidation of massive amounts of storage that we've seen for a while but also the demand for incredibly low latency access to that storage and i think that's one of the things that we're seeing that's driving this need for convergence as you put it of having multiple protocols consolidated onto one platform but also the need for high performance access to that data thank you for that a great setup i got like i wrote down three topics that we're going to unpack as a result of that so garrett let me let me go to you maybe you can give us the perspective of what you see with customers is is this is this like a push where customers are saying hey listen i need to converge my file and object or is it more a story where they're saying garrett i have this problem and then you see unified file and object as a solution yeah i think i think for us it's you know taking that consultative approach with our customers and really kind of hearing pain around some of the pipelines the way that they're going to market with data today and kind of what are the problems that they're seeing we're also seeing a lot of the change driven by the software vendors as well so really being able to support a disaggregated design where you're not having to upgrade and maintain everything as a single block has really been a place where we've seen a lot of customers pivot to where they have more flexibility as they need to maintain larger volumes of data and higher performance data having the ability to do that
separate from compute and cache and those other layers is really critical so matt i wonder if if you could you know follow up on that so so garrett was talking about this disaggregated design so i like it you know distributed cloud etc but then we're talking about bringing things together in in one place right so square that circle how does this fit in with this hyper-distributed cloud edge that's getting built out yeah you know i mean i i could give you the easy answer on that but i could also pass it back to garrett in the sense that you know garrett maybe it's important to talk about um elastic and splunk and some of the things that you're seeing in in that world and and how that i think the answer to dave's question i think you can give you can give a pretty qualified answer relative to what your customers are seeing oh that'd be great please yeah absolutely no no problem at all so you know i think with um splunk kind of moving from its traditional design and classic design whatever you want you want to call it up into smart store um that was kind of one of the first that we saw kind of make that move towards kind of separating object out and i think you know a lot of that comes from their own move to the cloud and updating their code to basically take advantage of object object in the cloud uh but we're starting to see you know with like vertica eon for example um elastic other folks taking that same type of approach where in the past we were building out many 2u servers we were jamming them full of uh you know ssds and nvme drives that was great but it doesn't really scale and it kind of gets into that same problem that we see with you know hyper convergence a little bit where it's you know you're all you're always adding something maybe that you didn't want to add um so i think it you know again being driven by software is really kind of where we're seeing the world open up there but that whole idea of just having that as a hub and a central place where you
can then leverage that out to other applications whether that's out to the edge for machine learning or ai applications to take advantage of it i think that's where that convergence really comes back in but i think like scott mentioned earlier it's really folks are now doing things with the data where before i think they were really storing it trying to figure out what are we going to actually do with it when we need to do something with it so this is making it possible yeah and dave if i could just sort of tack on to the end of garrett's answer there you know in particular vertica with eon mode the ability to leverage sharded subclusters give you um you know sort of an advantage in terms of being able to isolate performance hot spots you an advantage to that is being able to do that on a flashblade for example so um sharded subclusters allow you to sort of say i'm you know i'm going to give prioritization to you know this particular element of my application and my data set but i can still share those share that data across those across those subclusters so um you know as you see you know vertica advance with eon mode or you see splunk advance with with smart store you know these are all sort of advancements that are you know it's a chicken and the egg thing um they need faster storage they need you know sort of a consolidated data storage data set um and and that's what sort of allows these things to drive forward yeah so vertica eon mode for those who don't know it's the ability to separate compute and storage and scale independently i think i think vertica if they're if they're not the only one they're one of the only ones i think they might even be the only one that does that in the cloud and on-prem and that sort of plays into this distributed you know nature of this hyper-distributed cloud i sometimes call it and and i'm interested in the in the data pipeline and i wonder scott if we could talk a little bit about that maybe where unified object and file i
mean i'm envisioning this this distributed mesh and then you know uffo is sort of a node on that that i i can tap when i need it but but scott what are you seeing as the state of infrastructure as it relates to the data pipeline and the trends there yeah absolutely dave so when i think data pipeline i immediately gravitate to analytics or or machine learning initiatives right and so one of the big things we see and this is it's an interesting trend it seems you know we continue to see increased investment in ai increased interest and people think and as companies get started they think okay well what does that mean well i got to go hire a data scientist okay well that data scientist probably needs some infrastructure and what they end what often happens in these environments is where it ends up being a bespoke environment or a one-off environment and then over time organizations run into challenges and one of the big challenges is the data science team or people whose jobs are outside of it spend way too much time trying to get the infrastructure to to keep up with their demands and predominantly around data performance so one of the one of the ways organizations that especially have artificial intelligence workloads in production and we found this in our research have started mitigating that is by deploying flash all across the data pipeline we have we have data on this sorry interrupt but yeah if you could bring up that that chart that would be great um so take us through this uh uh scott and share with us what we're looking at here yeah absolutely so so dave i'm glad you brought this up so we did this study um i want to say late last year uh one of the things we looked at was across artificial intelligence environments now one thing that you're not seeing on this slide is we went through and we asked all around the data pipeline and we saw flash everywhere but i thought this was really telling because this is around data lakes and when when or many people think 
about the idea of a data lake they think about it as a repository it's a place where you keep maybe cold data and what we see here is especially within production environments a pervasive use of flash storage so i think that 69 percent of organizations are saying their data lake is mostly flash or all flash and i think we have zero percent that don't have any flash in that environment so organizations are finding out that flash is an essential technology to allow them to harness the value of their data so garrett and then matt i wonder if you could chime in as well we talk about digital transformation and i sometimes call it you know the covid forced march to digital transformation and i'm curious as to your perspective on things like machine learning and the adoption and scott you may have a perspective on this as well you know we had to pivot we had to get laptops we had to secure the end points you know and vdi those became super high priorities what happened to you know injecting ai into my applications and machine learning did that go on the back burner was that accelerated along with the need to digitally transform garrett i wonder if you could share with us what you saw with customers last year yeah i mean i think we definitely saw an acceleration um i think folks in my market are still kind of figuring out how they inject that into more of a widely distributed business use case but again this data hub and allowing folks to now take advantage of this data that they've had in these data lakes for a long time i agree with scott i mean many of the data lakes that we have were somewhat flash accelerated but they were typically really made up of you know large capacity slower spinning near-line drives accelerated with some flash but i'm really starting to see folks now look at some of those older hadoop implementations and really leveraging new ways to look at how they consume data and many of those redesigned customers are coming to us
wanting to look at all flash solutions so we're definitely seeing it we're seeing an acceleration towards folks trying to figure out how to actually use it in more of a business sense now where before i feel it was a little bit more skunk works kind of people dealing with uh you know in a much smaller situation maybe in the executive offices trying to do some testing and things scott you're nodding away anything you can add in here yeah so first off it's great to get that confirmation that the stuff we're seeing in our research garrett's seeing you know out in the field and in the real world um but you know as it relates to really the past year it's been really fascinating so one of the things we study at esg is i.t buying intentions what are things what are initiatives that companies plan to invest in and at the beginning of 2020 we saw a heavy interest in machine learning initiatives then you transition to the middle of 2020 in the midst of covid some organizations continued on that path but a lot of them had to pivot right how do we get laptops to everyone how do we continue business in this new world well now as we enter into 2021 and hopefully we're coming out of this uh you know the pandemic era um we're getting into a world where organizations are pivoting back towards these strategic investments around how do i maximize the usage of data and actually accelerating those because they've seen the importance of digital business initiatives over the past year yeah matt i mean when we exited 2019 we saw a narrowing of experimentation and our premise was you know that organizations are going to start now operationalizing all their digital transformation experiments and then we had a you know 10 month petri dish on digital so what are you seeing in this regard a 10 month petri dish is an interesting way to describe it um you know we saw another there's another candidate for pivot in there around
ransomware as well right um you know security entered into the mix which took people's attention away from some of this as well i mean look i'd like to bring this up just a level or two um because what we're actually talking about here is progress right and progress is an inevitability um you know whether you believe that it's by 2025 or you think it's 2035 or 2050 it doesn't matter we're on a forced march to the eradication of disk and that is happening in many ways um due to some of the things that garrett was referring to and what scott was referring to in terms of what are customers' demands for how they're going to actually leverage the data that they have and that brings me to kind of my final point on this which is we see customers in three phases there's the first phase where they say hey i have this large data store and i know there's value in there i don't know how to get to it or i have this large data store and i've started a project to get value out of it and we failed those could be customers that um you know marched down the hadoop path early on and they got some value out of it um but they realized that you know hdfs wasn't going to be a modern protocol going forward for any number of reasons you know the first being hey if i have gold.master how do i know that gold.4 is consistent with my gold.master so data consistency matters and then you have the sort of third group that says i have these large data sets i know how to extract value from them and i'm already on to the verticas the elastics you know the splunks etc um i think those folks that latter group are the folks that kept their projects going because they were already extracting value from them the first two groups we're seeing sort of saying the second half of this year is when we're going to begin really picking up on these types of initiatives again well thank you
matt by the way for hitting the escape key because i think value from data really is what this is all about and there are some real blockers there that i kind of want to talk about you mentioned hdfs i mean we were very excited of course in the early days of hadoop many of the concepts were profound but at the end of the day it was too complicated we've got these hyper-specialized roles that are you know serving the business but it still takes too long it's too hard to get value from data and one of the blockers is infrastructure the complexity of that infrastructure really needs to be abstracted taking it up a level we're starting to see this in cloud where you're seeing some of those abstraction layers being built from some of the cloud vendors but more importantly a lot of the vendors like pure are saying hey we can do that heavy lifting for you uh and we you know we have expertise in engineering to do cloud native so i'm wondering what you guys see uh maybe garrett you could start us off and others chime in as some of the blockers uh to getting value from data and how we're going to address those in the coming decade yeah i mean i think part of it we're solving here obviously with pure bringing uh you know flash to a market that traditionally was utilizing uh much slower media um you know the other thing that i see that's very nice with flashblade for example is the ability to kind of do things you know once you get it set up a blade at a time i mean a lot of the things that we see from just kind of more of a you know simplistic approach to this like a lot of these teams don't have big budgets and being able to kind of break them down into almost a blade type chunk i think has really kind of allowed folks to get more projects and things off the ground because they don't have to buy a full expensive system to run these projects so that's helped a lot i think the wider use cases have helped a lot so matt mentioned
ransomware you know using safe mode as a place to help with ransomware has been a really big growth spot for us we've got a lot of customers very interested and excited about that and the other thing that i would say is bringing devops into data is another thing that we're seeing so kind of that push towards data ops and really kind of using automation and infrastructure as code as a way to now kind of drive things through the system the way that we've seen with automation through devops is really an area we're seeing a ton of growth with from a services perspective guys any other thoughts on that i mean we're i'll tee it up there we are seeing some bleeding edge which is somewhat counterintuitive especially from a cost standpoint organizational changes at some some companies uh think of some of the the the internet companies that do uh music uh for instance and adding podcasts etc and those are different data products we're seeing them actually reorganize their data architectures to make them more distributed uh and actually put the domain heads the business heads in charge of the the data and the data pipeline and that is maybe less efficient but but it's again some of these bleeding edge what else are you guys seeing out there that might be yes some harbingers of the next decade uh i'll go first um you know i think specific to um the the construct that you threw out dave one of the things that we're seeing is um you know the the application owner maybe it's the devops person but it's you know maybe it's it's it's the application owner through the devops person they're they're becoming more technical in their understanding of how infrastructure um interfaces with their with their application i think um you know what what we're seeing on the flashblade side is we're having a lot more conversations with application people than um just i.t people it doesn't mean that the it people aren't there the it people are still there for sure they have to deliver the service 
etc um but you know the days of of i.t you know building up a catalog of services and a business owner subscribing to one of those services you know picking you know whatever sort of fits their need um i don't think that constru i think that's the construct that changes going forward the application owner is becoming much more prescriptive about what they want the infrastructure to fit how they want the infrastructure to fit into their application and that's a big change and and for for um you know certainly folks like like garrett and cdw um you know they do a good job with this being able to sort of get to the application owner and bring those two sides together there's a tremendous amount of value there for us it's been a little bit of a retooling we've traditionally sold to the i.t side of the house and um you know we've had to teach ourselves how to go talk the language of of applications so um you know i think you pointed out a good a good a good construct there and and you know that that application owner taking playing a much bigger role in what they're expecting uh from the performance of it infrastructure i think is is is a key is a key change interesting i mean that definitely is a trend that's put you guys closer to the business where the the infrastructure team is is serving the business as opposed to sometimes i talk to data experts and they're frustrated uh especially data owners or or data product builders who are frustrated that they feel like they have to beg beg the the data pipeline team to get you know new data sources or get data out how about the edge um you know maybe scott you can kick us off i mean we're seeing you know the emergence of edge use cases ai inferencing at the edge a lot of data at the edge what are you seeing there and and how does this unified object i'll bring us back to that and file fit wow dave how much time do we have um two minutes first of all scott why don't you why don't you just tell everybody what the edge is yeah 
you got it figured out all right how much time do you have matt at the end of the day and that's a great question right is if you take a step back and i think it comes back to dave something you mentioned it's about extracting value from data and what that means is when you extract value from data what it does is as matt pointed out the influencers or the users of data the application owners they have more power because they're driving revenue now and so what that means is from an i.t standpoint it's not just hey here are the services you get use them or lose them or you know don't throw a fit it is no i have to adapt i have to follow what my application owners mean now when you bring that back to the edge what it means is that data is not localized to the data center i mean we just went through a nearly 12-month period where the entire workforce for most of the companies in this country went distributed and business continued so if business is distributed data is distributed and that means in the data center that means at the edge that means in the cloud that means in all other places in tons of places and what it also means is you have to be able to extract and utilize data anywhere it may be and i think that's something that we're going to continue to see and i think it comes back to you know if you think about key characteristics we've talked about things like performance and scale for years but we need to start rethinking it because on one hand we need to get performance everywhere but also in terms of scale and this ties back to some of the other initiatives and getting value from data it's something i call the massive success problem one of the things we see especially with workloads like machine learning is businesses find success with them and as soon as they do they say well i need about 20 of these projects now all of a sudden that overburdens i.t organizations especially across
across core and edge and cloud environments and so when you look at environments ability to meet performance and scale demands wherever it needs to be is something that's really important you know so dave i'd like to um just sort of tie together sort of two things that um i think that i heard from scott and garrett that i think are important and it's around this concept of scale um you know some of us are old enough to remember the day when kind of a 10 terabyte blast radius was too big of a blast radius for people to take on or a terabyte of storage was considered to be um you know an exemplary budget environment right um now we sort of think of terabytes kind of like we used to think of gigabytes in some ways um petabyte like you don't have to explain to anybody what a petabyte is anymore um and you know what's on the horizon and it's not far are exabyte type data set workloads um and you start to think about what could be in that exabyte of data we've talked about how you extract that value we've talked about sort of um how you start but if the scale is big not everybody's going to start at a petabyte or an exabyte to garrett's point the ability to start small and grow into these products or excuse me these projects i think is a really um fundamental concept here because you're not going to just go by i'm going to kick off a five petabyte project whether you do that on disk or flash it's going to be expensive right but if you could start at a couple hundred terabytes not just as a proof of concept but as something that you know you could get predictable value out of that then you could say hey this either scales linearly or non-linearly in a way that i can then go map my investments to how i can go dig deeper into this that's how all of these things are gonna that's how these successful projects are going to start because the people that are starting with these very large you know sort of um expansive you know greenfield projects at multi-petabyte scale
it's gonna be hard to realize near-term value excellent we gotta wrap but garrett i wonder if you could close when you look forward you talk to customers do you see this unification of file and object is this an evolutionary trend is it something that is going to be a lever that customers use how do you see it evolving over the next two three years and beyond yeah i mean i think from our perspective i mean just from what we're seeing from the numbers within the market the amount of growth that's happening with unstructured data is really just starting to finally really kind of hit this data deluge or whatever you want to call it that we've been talking about for so many years it really does seem to now be becoming true as we start to see things scale out and really folks settle into okay i'm going to use the cloud to start and maybe train my models but now i'm going to get it back on prem because of latency or security or whatever the um decision points are there this is something that is not going to slow down and i think you know folks like pure having the ability to have the tools that they give us um to use and bring to market with our customers are really key and critical for us so i see it as a huge growth area and a big focus for us moving forward guys great job unpacking a topic that you know it's covered a little bit but i think we covered some ground that is uh that is new and so thank you so much for those insights and that data really appreciate your time thanks dave thanks yeah thanks dave okay and thank you for watching the convergence of file and object keep it right there right back after this short break innovation impact influence welcome to the cube disruptors developers and practitioners learn from the voices of leaders who share their personal insights from the hottest digital events around the globe enjoy the best this community has to offer on the cube your global leader in high-tech
digital coverage [Music] okay now we're going to get the customer perspective on object and we'll talk about the convergence of file and object but really focusing on the object piece this is a content program that's being made possible by pure storage and it's co-created with the cube christopher cb bohn is here he's a lead architect for microfocus the enterprise data warehouse and principal data engineer at microfocus cb welcome good to see you thanks dave good to be here so tell us more about your role at microfocus it's a pan microfocus role of course we know the company is a multinational software firm and acquired the software assets of hp of course including vertica tell us where you fit yeah so microfocus is uh you know like i said a worldwide uh company that uh sells a lot of software products all over the place to governments and so forth and um it also grows often by acquiring other companies so there is the problem of integrating new companies and their data and so what's happened over the years is that they've had a number of different discrete data systems so you've got this data spread all over the place and they've never been able to get a full complete introspection on the entire business because of that so my role was come in design a central data repository an enterprise data warehouse that all reporting could be generated against and so that's what we're doing and we selected vertica as the edw system and pure storage flashblade as the communal repository okay so you obviously had experience with vertica in your previous role so it's not like you were starting from scratch but paint a picture of what life was like before you embarked on this sort of consolidated approach to your data warehouse what was it just disparate data all over the place a lot of m&a going on where did the data live right so again the data was all over the place including under people's desks in just dedicated you know their own
private uh sql servers a lot of data in um microfocus is run on sql server which has pros and cons because that's a great uh transactional database but it's not really good for analytics in my opinion so uh but a lot of stuff was running on that they had one vertica instance that was doing some select uh reporting wasn't a very uh powerful system and it was what they call vertica enterprise mode where it had dedicated nodes which um had the compute and storage um in the same locus on each uh server okay so vertica eon mode is a whole new world because it separates compute from storage you mentioned eon mode uh and the ability to scale storage and compute independently we wanted to have the uh analytics olap stuff close to the oltp stuff right so that's why they're co-located very close to each other and so uh what's nice about this situation is that these s3 objects it's an s3 object store on the pure flashblade we could copy those over if we needed to uh to aws and we could spin up um a version of vertica there and keep going it's like a tertiary dr strategy because we actually have a we're setting up a second flashblade vertica system geo-located elsewhere for backup and we can get into it if you want to talk about how the latest version of the pure software for the flashblade allows synchronization across network boundaries of those flashblades which is really nice because if uh you know a giant sinkhole opens up under our colo facility and we lose that thing then we just have to switch the dns and we're back in business off the dr and then if that one was to go we could copy those objects over to aws and be up and running there so we're feeling pretty confident about being able to weather whatever comes along so you're using the pure flashblade as an object store um most people think oh object simple but slow uh not the case for you is that right not the case at all it's ripping um well you have to understand about vertica
and the way it stores data it stores data in what they call storage containers and those are immutable okay on disk whether it's on aws or if you had a enterprise mode vertica if you do an update or delete it actually has to go and retrieve that object container from disk and it destroys it and rebuilds it okay which is why you don't you want to avoid updates and deletes with vertica because the way it gets its speed is by sorting and ordering and encoding the data on disk so it can read it really fast but if you do an operation where you're deleting or updating a record in the middle of that then you've got to rebuild that entire thing so that actually matches up really well with s3 object storage because it's kind of the same way uh it gets destroyed and rebuilt too okay so that matches up very well with vertica and we were able to design this system so that it's append only now we had some reports that were running in sql server okay uh which were taking seven days so we moved that to uh to vertica from sql server and uh we rewrote the queries which were which had been written in t sql with a bunch of loops and so forth and we were to get this is amazing it went from seven days to two seconds to generate this report which has tremendous value uh to the company because it would have to have this long cycle of seven days to get a new introspection in what they call their knowledge base and now all of a sudden it's almost on demand two seconds to generate it that's great and that's because of the way the data is stored and uh the s3 you asked about oh you know is it slow well not in that context because what happens really with vertica eon mode is that it can they have um when you set up your compute nodes they have local storage also which is called the depot it's kind of a cache okay so the data will be drawn from the flash and cached locally uh and that was it was thought when they designed that oh you know it's that'll cut down on the latency okay but it turns 
out that if you have your compute nodes close meaning minimal hops to the flashblade that you can actually uh tell vertica you know don't even bother caching that stuff just read it directly on the fly from the from the flashblade and the performance is still really good it depends on your situation but i know for example a major telecom company that uh uses the same topology as we're talking about here they did the same thing they just they just dropped the cache because the flash player was able to to deliver the the data fast enough so that's you're talking about that that's speed of light issues and just the overhead of of of switching infrastructure is that that gets eliminated and so as a result you can go directly to the storage array that's correct yeah it's it's like it's fast enough that it's it's almost as if it's local to the compute node uh but every situation is different depending on your uh your knees if you've got like a few tables that are heavily used uh then yeah put them um put them in the cash because that'll be probably a little bit faster but if you have a lot of ad hoc queries that are going on you know you may exceed the storage of the local cache and then you're better off having it uh just read directly from the uh from the flash blade got it look it pure's a fit i mean i sound like a fanboy but pure is all about simplicity so is object so that means you don't have to you know worry about wrangling storage and worrying about luns and all that other you know nonsense and and file i've been burned by hardware in the past you know where oh okay they're building to a price and so they cheap out on stuff like fans or other things and these these components fail and the whole thing goes down but this hardware is super super good quality and uh so i'm i'm happy with the quality that we're getting so cb last question what's next for you where do you want to take this uh this this initiative well we are in the process now of we um when so i i 
designed this system to combine the best of the kimball approach to data warehousing and the inland approach okay and what we do is we bring over all the data we've got and we put it into a pristine staging layer okay like i said it's uh because it's append only it's essentially a log of all the transactions that are happening in this company just they appear okay and then from the the kimball side of things we're designing the data marts now so that that's what the end users actually interact with and so we're we're taking uh the we're examining the transactional systems to say how are these business objects created what's what's the logic there and we're recreating those logical models in uh in vertica so we've done a handful of them so far and it's working out really well so going forward we've got a lot of work to do to uh create just about every object that that the company needs cb you're an awesome guest to really always a pleasure talking to you and uh thank you congratulations and and good luck going forward stay safe thank you [Music] okay let's summarize the convergence of file and object first i want to thank our guests matt burr scott sinclair garrett belsener and c.b bohn i'm your host dave vellante and please allow me to briefly share some of the key takeaways from today's program so first as scott sinclair of esg stated surprise surprise data's growing and matt burr he helped us understand the growth of unstructured data i mean estimates indicate that the vast majority of data will be considered unstructured by mid-decade 80 or so and obviously unstructured data is growing very very rapidly now of course your definition of unstructured data and that may vary across across a wide spectrum i mean there's video there's audio there's documents there's spreadsheets there's chat i mean these are generally considered unstructured data but of course they all have some type of structure to them you know perhaps it's not as strict as a relational database but 
there's certainly metadata and certain structure to these types of use cases that i just mentioned now the key to what pure is promoting is this idea of unified fast file and object uffo look object is great it's inexpensive it's simple but historically it's been less performant so good for archiving or cheap and deep types of examples organizations often use file for higher performance workloads and let's face it most of the world's data lives in file formats what pure is doing is bringing together file and object by for example supporting multiple protocols ie nfs smb and s3 s3 of course has really given new life to object over the past decade now the key here is to essentially enable customers to have the best of both worlds not having to trade off performance for object simplicity and a key discussion point that we've had on the program has been the impact of flash on the long slow death of spinning disk look hard disk drives they had a great run but hdd volumes they peaked in 2010 and flash as you well know has seen tremendous volume growth thanks to the consumption of flash in mobile devices and then of course its application into the enterprise and that's volume is just going to keep growing and growing and growing the price declines of flash are coming down faster than those of hdd so it's the writing's on the wall it's just a matter of time so flash is riding down that cost curve very very aggressively and hdd has essentially become you know a managed decline business now by bringing flash to object as part of the flashblade portfolio and allowing for multiple protocols pure hopes to eliminate the dissonance between file and object and simplify the choice in other words let the workload decide if you have data in a file format no problem pure can still bring the benefits of simplicity of object at scale to the table so again let the workload inform what the right strategy is not the technical infrastructure now pure course is not alone there are others 
supporting this multi-protocol strategy and so we asked matt burr why pure or what's so special about you and not surprisingly in addition to the product innovation he went right to pure's business model advantages i mean for example with its evergreen support model which was very disruptive in the marketplace you know frankly pure's entire business disrupted the traditional disk array model which was fundamentally was flawed pure forced the industry to respond and when it achieved escape velocity velocity and pure went public the entire industry had to react and a big part of the pure value prop in addition to this business model innovation that we just discussed is simplicity pure's keep its simple approach coincided perfectly with the ascendancy of cloud where technology organizations needed cloud-like simplicity for certain workloads that were never going to move into the cloud they're going to stay on-prem now i'm going to come back to this but allow me to bring in another concept that garrett and cb really highlighted and that is the complexity of the data pipeline and what do you mean what do i mean by that and why is this important so scott sinclair articulated he implied that the big challenge is organizations their data full but insights are scarce scarce a lot of data not as much insights it takes time too much time to get to those insights so we heard from our guests that the complexity of the data pipeline was a barrier to getting to faster insights now cb bonds shared how he streamlined his data architecture using vertica's eon mode which allowed him to scale compute independently of storage so that brought critical flexibility and improved economics at scale and flashblade of course was the back-end storage for his data warehouse efforts now the reason i think this is so important is that organizations are struggling to get insights from data and the complexity associated with the data pipeline and data life cycles let's face it it's overwhelming 
organizations. And the answer to this problem is a much longer and different discussion than unifying object and file. That's, you know, I can spend all day talking about that, but let's focus narrowly on the part of the issue that is related to file and object. So the situation here is that technology has not been serving the business the way it should. Rather, the formula is twisted in the world of data and big data and data architectures. The data team is mired in complex technical issues that impact the time to insights. Now, part of the answer is to abstract the underlying infrastructure complexity and create a layer with which the business can interact that accelerates instead of impedes innovation. And unifying file and object is a simple example of this, where the business team is not blocked by infrastructure nuance, like: does this data reside in a file or object format? Can I get to it quickly and inexpensively in a logical way, or is the infrastructure in a stovepipe and blocking me? So if you think about the prevailing sentiment of how the cloud is evolving to incorporate on-premises workloads that are hybrid, and configurations that are working across clouds and now out to the edge, this idea of an abstraction layer that essentially hides the underlying infrastructure is a trend we're going to see evolve this decade. Now, is UFFO the be-all, end-all answer to solving all of our data pipeline challenges? No, no, of course not. But by bringing the simplicity and economics of object together with the ubiquity and performance of file, UFFO makes it a lot easier. It simplifies life for organizations that are evolving into digital businesses, which, by the way, is every business. So we see this as an evolutionary trend that further simplifies the underlying technology infrastructure and does a better job supporting the data flows for organizations, so they don't have to spend so much time worrying about the technology details that add little value to the business. Okay, so thanks for
watching the convergence of file and object, and thanks to Pure Storage for making this program possible. This is Dave Vellante for theCUBE. We'll see you next time.
ENTITIES
Entity | Category | Confidence |
---|---|---|
garrett belsner | PERSON | 0.99+ |
matt burr | PERSON | 0.99+ |
2010 | DATE | 0.99+ |
2050 | DATE | 0.99+ |
270 terabytes | QUANTITY | 0.99+ |
seven days | QUANTITY | 0.99+ |
2021 | DATE | 0.99+ |
scott sinclair | PERSON | 0.99+ |
2035 | DATE | 0.99+ |
2019 | DATE | 0.99+ |
four | QUANTITY | 0.99+ |
three | QUANTITY | 0.99+ |
two seconds | QUANTITY | 0.99+ |
2025 | DATE | 0.99+ |
first phase | QUANTITY | 0.99+ |
dave | PERSON | 0.99+ |
dave vellante | PERSON | 0.99+ |
five | QUANTITY | 0.99+ |
250 terabytes | QUANTITY | 0.99+ |
10 terabyte | QUANTITY | 0.99+ |
zero percent | QUANTITY | 0.99+ |
100 | QUANTITY | 0.99+ |
steve | PERSON | 0.99+ |
gary | PERSON | 0.99+ |
two billion dollar | QUANTITY | 0.99+ |
garrett | PERSON | 0.99+ |
two minutes | QUANTITY | 0.99+ |
two weeks later | DATE | 0.99+ |
three topics | QUANTITY | 0.99+ |
two sides | QUANTITY | 0.99+ |
two weeks ago | DATE | 0.99+ |
billion dollars | QUANTITY | 0.99+ |
mid-decade 80 | DATE | 0.99+ |
today | DATE | 0.99+ |
cdw | PERSON | 0.98+ |
three phases | QUANTITY | 0.98+ |
80 | QUANTITY | 0.98+ |
billions of objects | QUANTITY | 0.98+ |
10 month | QUANTITY | 0.98+ |
one device | QUANTITY | 0.98+ |
an hour | QUANTITY | 0.98+ |
one platform | QUANTITY | 0.98+ |
scott | ORGANIZATION | 0.97+ |
last year | DATE | 0.97+ |
five petabyte | QUANTITY | 0.97+ |
scott | PERSON | 0.97+ |
cassandra | PERSON | 0.97+ |
one | QUANTITY | 0.97+ |
single block | QUANTITY | 0.97+ |
one system | QUANTITY | 0.97+ |
next decade | DATE | 0.96+ |
tons of places | QUANTITY | 0.96+ |
both worlds | QUANTITY | 0.96+ |
vertica | TITLE | 0.96+ |
matt | PERSON | 0.96+ |
both | QUANTITY | 0.96+ |
69 of organizations | QUANTITY | 0.96+ |
billion dollars | QUANTITY | 0.95+ |
pandemic | EVENT | 0.95+ |
first | QUANTITY | 0.95+ |
three great guests | QUANTITY | 0.95+ |
next year | DATE | 0.95+ |
DV Pure Storage 208
>> Thank you, sir. All right, you ready to roll? >> Ready. >> All right, we'll go ahead and go in five, four, three, two. >> Okay, let's summarize the convergence of file and object. First, I want to thank our guests, Matt Burr, Scott Sinclair, Garrett Belsner, and CB Bonne. I'm your host, Dave Vellante, and please allow me to briefly share some of the key takeaways from today's program. So first, as Scott Sinclair of ESG stated, surprise, surprise, data's growing. And Matt Burr, he helped us understand the growth of unstructured data. I mean, estimates indicate that the vast majority of data will be considered unstructured by mid-decade, 80% or so. And obviously, unstructured data is growing very, very rapidly. Now, of course, your definition of unstructured data, now that may vary across a wide spectrum. I mean, there's video, there's audio, there's documents, there's spreadsheets, there's chat. I mean, these are generally considered unstructured data, but of course they all have some type of structure to them. You know, perhaps it's not as strict as a relational database, but there's certainly metadata and certain structure to these types of use cases that I just mentioned. Now, the key to what Pure is promoting is this idea of unified fast file and object, UFFO. Look, object is great, it's inexpensive, it's simple, but historically, it's been less performant, so good for archiving, or cheap-and-deep types of examples. Organizations often use file for higher-performance workloads and, let's face it, most of the world's data lives in file formats. What Pure is doing is bringing together file and object by, for example, supporting multiple protocols, i.e., NFS, SMB, and S3. S3, of course, has really given a new life to object over the past decade. Now, the key here is to essentially enable customers to have the best of both worlds, not having to trade off performance for object simplicity. 
And a key discussion point that we've had in the program has been the impact of Flash on the long, slow death of spinning disk. Look, hard disk drives, they had a great run, but HDD volumes, they peaked in 2010, and Flash, as you well know, has seen tremendous volume growth thanks to the consumption of Flash in mobile devices and then, of course, its application into the enterprise. And that volume is just going to keep growing and growing and growing. The prices of Flash are coming down faster than those of HDD. So the writing's on the wall. It's just a matter of time. So Flash is riding down that cost curve very, very aggressively, and HDD has essentially become a managed-decline business. Now, by bringing Flash to object as part of the FlashBlade portfolio and allowing for multiple protocols, Pure hopes to eliminate the dissonance between file and object and simplify the choice. In other words, let the workload decide. If you have data in a file format, no problem. Pure can still bring the benefits of simplicity of object at scale to the table. So again, let the workload inform what the right strategy is, not the technical infrastructure. Now Pure, of course, is not alone. There are others supporting this multi-protocol strategy. And so we asked Matt Burr: why Pure, what's so special about you? And not surprisingly, in addition to the product innovation, he went right to Pure's business model advantages. I mean, for example, with its Evergreen support model, which was very disruptive in the marketplace. You know, frankly, Pure's entire business disrupted the traditional disk array model, which, fundamentally, was flawed. Pure forced the industry to respond. And when it achieved escape velocity and Pure went public, the entire industry had to react. And a big part of the Pure value prop, in addition to this business model innovation that we just discussed, is simplicity. 
Pure's keep-it-simple approach coincided perfectly with the ascendancy of cloud, where technology organizations needed cloud-like simplicity for certain workloads that were never going to move into the cloud. They were going to stay on-prem. Now, I'm going to come back to this, but allow me to bring in another concept that Garrett and CB really highlighted, and that is the complexity of the data pipeline. And what do I mean by that, and why is this important? So Scott Sinclair articulated, or he implied, that the big challenge is organizations, they're data full, but insights are scarce; a lot of data, not as much insights, and it takes time, too much time, to get to those insights. So we heard from our guests that the complexity of the data pipeline was a barrier to getting to faster insights. Now, CB Bonne shared how he streamlined his data architecture using Vertica's Eon Mode, which allowed him to scale compute independently of storage, so that brought critical flexibility and improved economics at scale. And FlashBlade, of course, was the backend storage for his data warehouse efforts. Now, the reason I think this is so important is that organizations are struggling to get insights from data, and the complexity associated with the data pipeline and data lifecycles, let's face it, it's overwhelming organizations. And the answer to this problem is a much longer and different discussion than unifying object and file. That's, you know, I could spend all day talking about that, but let's focus narrowly on the part of the issue that is related to file and object. So the situation here is that technology has not been serving the business the way it should. Rather, the formula is twisted in the world of data and big data and data architectures. The data team is mired in complex technical issues that impact the time to insights. 
Now, part of the answer is to abstract the underlying infrastructure complexity and create a layer with which the business can interact that accelerates instead of impedes innovation. And unifying file and object is a simple example of this, where the business team is not blocked by infrastructure nuance, like: does this data reside in a file or object format? Can I get to it quickly and inexpensively in a logical way, or is the infrastructure in a stovepipe and blocking me? So if you think about the prevailing sentiment of how the cloud is evolving to incorporate on-premises workloads that are hybrid, and configurations that are working across clouds, and now out to the edge, this idea of an abstraction layer that essentially hides the underlying infrastructure is a trend we're going to see evolve this decade. Now, is UFFO the be-all, end-all answer to solving all of our data pipeline challenges? No, no, of course not. But by bringing the simplicity and economics of object together with the ubiquity and performance of file, UFFO makes it a lot easier. It simplifies life for organizations that are evolving into digital businesses, which, by the way, is every business. So, we see this as an evolutionary trend that further simplifies the underlying technology infrastructure and does a better job supporting the data flows for organizations, so they don't have to spend so much time worrying about the technology details that add little value to the business. Okay, so thanks for watching the convergence of file and object, and thanks to Pure Storage for making this program possible. This is Dave Vellante for theCUBE. We'll see you next time.
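Editor's illustration: the abstraction-layer idea from the wrap-up, where business code does not need to know whether data sits in a file or an object format, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not how FlashBlade or any Pure product is implemented. The names `DataStore`, `FileStore`, `InMemoryObjectStore`, and `word_count` are all hypothetical, and the object store is an in-memory stand-in rather than a real S3 client.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict


class DataStore(ABC):
    """Hypothetical interface that business-facing code talks to."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None:
        """Store data under the given key."""

    @abstractmethod
    def get(self, key: str) -> bytes:
        """Return the data stored under the given key."""


class FileStore(DataStore):
    """File-style backend: keys map to paths under a root directory.

    Assumption: in practice the root could be an NFS or SMB mount.
    """

    def __init__(self, root: Path) -> None:
        self.root = root

    def put(self, key: str, data: bytes) -> None:
        path = self.root / key
        # Create intermediate directories so keys like "logs/a.txt" work.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


class InMemoryObjectStore(DataStore):
    """Object-style backend: a flat key-to-bytes mapping.

    Assumption: this stands in for an S3-style bucket; no network calls.
    """

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = bytes(data)

    def get(self, key: str) -> bytes:
        return self._objects[key]


def word_count(store: DataStore, key: str) -> int:
    """Example workload: it sees only the DataStore interface."""
    return len(store.get(key).split())
```

The point of the sketch is that `word_count` runs unchanged against either backend, so the choice between file and object can follow the workload rather than the calling code.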
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Vellante | PERSON | 0.99+ |
Matt Burr | PERSON | 0.99+ |
Scott Sinclair | PERSON | 0.99+ |
Garrett Belsner | PERSON | 0.99+ |
ESG | ORGANIZATION | 0.99+ |
80% | QUANTITY | 0.99+ |
five | QUANTITY | 0.99+ |
CB Bonne | PERSON | 0.99+ |
two | QUANTITY | 0.99+ |
2010 | DATE | 0.99+ |
First | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
first | QUANTITY | 0.98+ |
four | QUANTITY | 0.98+ |
three | QUANTITY | 0.98+ |
both worlds | QUANTITY | 0.98+ |
Flash | TITLE | 0.97+ |
CB | PERSON | 0.97+ |
Vertica | ORGANIZATION | 0.97+ |
Pure Storage | ORGANIZATION | 0.96+ |
Pure | ORGANIZATION | 0.96+ |
Garrett | PERSON | 0.96+ |
Evergreen | ORGANIZATION | 0.86+ |
past decade | DATE | 0.59+ |
UFFO | ORGANIZATION | 0.59+ |
Pure Storage 208 | COMMERCIAL_ITEM | 0.59+ |
Pure | PERSON | 0.58+ |
this decade | DATE | 0.5+ |
FlashBlade | ORGANIZATION | 0.43+ |
FlashBlade | TITLE | 0.37+ |