Analyst Predictions 2022: The Future of Data Management
In the 2010s, organizations became keenly aware that data would become the key ingredient in driving competitive advantage, differentiation, and growth. But to this day, putting data to work remains a difficult challenge for many, if not most, organizations. Now, as the cloud matures, it has become a game changer for data practitioners by making cheap storage and massive processing power readily accessible. We've also seen better tooling in the form of data workflows, streaming, machine intelligence, AI developer tools, security, observability, automation, new databases, and the like. These innovations accelerate data proficiency, but at the same time they add complexity for practitioners. Data lakes, data hubs, data warehouses, data marts, data fabrics, data meshes, data catalogs, data oceans are forming, evolving, and exploding onto the scene. So in an effort to bring perspective to the sea of optionality, we've brought together the brightest minds in the data analyst community to discuss how data management is morphing and what practitioners should expect in 2022 and beyond.

Hello everyone, my name is Dave Vellante with theCUBE, and I'd like to welcome you to a special CUBE presentation: Analyst Predictions 2022, The Future of Data Management. We've gathered six of the best analysts in data and data management, who are going to present and discuss their top predictions and trends for 2022 and the first half of this decade. Let me introduce our six power panelists. Sanjeev Mohan is a former Gartner analyst and principal at SanjMo. Tony Baer is principal at dbInsight. Carl Olofson is a well-known research vice president with IDC. Dave Menninger is senior vice president and research director at Ventana Research. Bradley Shimmin is chief analyst for AI platforms, analytics, and data management at Omdia. And Doug Henschen is vice president and principal analyst at Constellation Research. Gentlemen, welcome to the program, and thanks for coming on theCUBE today.

Great to be here. Thank you.

All right, here's the
format we're going to use: as moderator, I'm going to call on each analyst separately, who will then deliver their prediction or megatrend, and then, in the interest of time management and pace, two analysts will have the opportunity to comment. If we have more time, we'll elongate it. But let's get started right away. Sanjeev Mohan, please kick it off. You want to talk about governance, go ahead, sir.

Thank you, Dave. I believe that data governance, which we've been talking about for many years, is now not only going to be mainstream, it's going to be table stakes. And with all the things that you mentioned, you know, data oceans, data lakes, lakehouses, data fabrics, meshes, the common glue is metadata. If we don't understand what data we have and we aren't governing it, there is no way we can manage it. So we saw Informatica go public last year after a hiatus of six years. I'm predicting that this year we see some more companies go public. My bet is on Collibra, most likely, and maybe Alation; we'll see if they go public this year.

I'm also predicting that the scope of data governance is going to expand beyond just data. It's not just data and reports. We're going to see more transformations, like Spark jobs, Python, even Airflow. We're going to see more streaming data, so from Kafka, Schema Registry, for example. We will see AI models become part of this whole governance suite. So the governance suite is going to be very comprehensive: very detailed lineage, impact analysis, and then it even expands into data quality. We've already seen that happen with some of the tools, where they are buying these smaller companies and bringing in data quality monitoring and integrating it with metadata management and data catalogs, and also data access governance. So what we're going to see is that, once the data governance platforms become the key entry point into these modern architectures, I'm predicting that the number of users of a data catalog is going to exceed that of a BI tool. That will take time, but we've already seen that trajectory. Right now, if you look at BI tools, I would say there are a hundred users of a BI tool for every one user of a data catalog, and I see that evening out over a period of time. At some point, data catalogs will really become the main way for us to access data. The data catalog will help us visualize data, but if we want to do more in-depth analysis, it'll be the jumping-off point into the BI tool, the data science tool. And that is the journey I see for the data governance products.

Excellent, thank you. Some comments? Maybe Doug, there's a lot of things to weigh in on there, maybe you could comment.

Yeah, Sanjeev, I think you're spot on on a lot of the trends. The one disagreement: I think it's really still far from mainstream. As you say, we've been talking about this for years; it's like God, motherhood, apple pie. Everyone agrees it's important, but too few organizations are really practicing good governance, because it's hard and because the incentives have been lacking. One thing that deserves mention in this context is ESG mandates and guidelines; these are environmental, social, and governance regs and guidelines. We've seen the environmental regs and guidelines imposed in industries, particularly the carbon-intensive industries. We've seen the social mandates, particularly diversity, imposed on suppliers by companies that are leading on this topic. We've seen governance guidelines now being imposed by banks and investors. So these ESGs are presenting new carrots and sticks, and it's going to demand more solid data, more detailed and solid reporting, and tighter governance. But we're still far from mainstream adoption. We have a lot of best-of-breed niche players in the space. I think the signs that it's going to be more mainstream are starting with things like Azure Purview and Google Dataplex; the big cloud platform players seem to be upping the ante and starting to address governance.
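To make the "metadata is the common glue" idea concrete, here is a minimal sketch of what a catalog entry with lineage, quality, and access-governance metadata might look like. This is purely illustrative: the field names and the tiny impact-analysis function are invented for the example, not drawn from Collibra, Alation, Purview, or any other product mentioned here.

```python
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    """One asset in a hypothetical data catalog: not just tables and
    reports, but also pipelines, streams, and AI models."""
    name: str
    asset_type: str   # e.g. "table", "spark_job", "kafka_topic", "ml_model"
    owner: str
    upstream: list = field(default_factory=list)        # lineage: sources
    quality_checks: dict = field(default_factory=dict)  # e.g. {"null_rate": 0.01}
    access_policy: str = "restricted"                   # data access governance

def impact_of(entry_name, catalog):
    """Naive impact analysis: which assets sit directly downstream?"""
    return [e.name for e in catalog if entry_name in e.upstream]

# A tiny lineage: a Kafka topic feeds a Spark job, which feeds a BI table.
catalog = [
    CatalogEntry("clicks_raw", "kafka_topic", "data-eng"),
    CatalogEntry("clicks_clean", "spark_job", "data-eng", upstream=["clicks_raw"]),
    CatalogEntry("clicks_daily", "table", "analytics", upstream=["clicks_clean"]),
]
print(impact_of("clicks_raw", catalog))   # prints ['clicks_clean']
```

The point of the sketch is only that one metadata record can carry lineage, quality, and access policy together, which is what lets a governance platform do impact analysis across pipelines, streams, and models rather than just tables.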
Excellent, thank you, Doug. Brad, I wonder if you could chime in as well.

Yeah, I would love to be a believer in data catalogs, but to Doug's point, I think it's going to take some more pressure for that to happen. I recall metadata being something every enterprise thought they were going to get under control when we were working on service-oriented architecture back in the '90s, and that didn't happen quite the way we anticipated. And to Sanjeev's point, that's because it is really complex and really difficult to do. My hope is that we won't, how do we put this, fade out into this nebulous nebula of domain catalogs that are specific to individual use cases, like Purview for getting data quality right, or data governance in cybersecurity, and that instead we'll have some tooling that can actually be adaptive, gathering metadata to create something I know is important to you, Sanjeev, and that is this idea of observability. If you can get enough metadata, without moving your data around, to understand the entirety of a system that's running on this data, you can do a lot to help with the governance that Doug is talking about.

So I just want to add that data governance, like many other initiatives, did not succeed; even AI went into an AI winter, but that's a different topic. A lot of these things did not succeed because, to your point, the incentives were not there. I remember when Sarbanes-Oxley had come onto the scene: if a bank did not comply, they were very happy to pay a million-dollar fine. That was pocket change for them, instead of doing the right thing. But I think the stakes are much higher now. With GDPR, the floodgates opened. Now, you know, California has CCPA, but even CCPA is being outdated by CPRA, which is much more GDPR-like. So we are very rapidly entering a space where pretty much every major country in the world is coming up with its own compliance and regulatory requirements.
Data residency is becoming really important, and I think we're going to reach a stage where it won't be optional anymore, whether we like it or not. And I think the reason data catalogs were not successful in the past is because we did not have the right focus on adoption. We were focused on features, and these features were disconnected, very hard for the business to adopt. These were built by IT people, for IT departments, to look at technical metadata, not business metadata. Today the tables have turned: CDOs are driving this initiative, and regulatory compliance is bearing down hard, so I think the time might be right.

Yeah, so guys, we have to move on here, but there's some real meat on the bone here. Sanjeev, I like the fact that you called out Collibra and Alation, so we can look back a year from now and say, okay, he made the call, he stuck it. And then the ratio of BI tools to data catalogs, that's another sort of measurement we can take, even though there's some skepticism there; that's something we can watch. And I wonder if someday we'll have more metadata than data. But I want to move to Tony Baer. You want to talk about data mesh, and coming off of governance, I mean, wow, the whole concept of data mesh is decentralized data, and then governance becomes, you know, a nightmare there. But take it away, Tony.

We'll put it this way. Data mesh, the idea at least as proposed by Thoughtworks, was basically unleashed a couple of years ago, and the press has been almost uniformly uncritical. A good reason for that is all the problems that Sanjeev and Doug and Brad were just speaking about: we have all this data out there and we don't know what to do about it. Now, that's not a new problem. It was a problem when we had enterprise data warehouses, it was a problem when we had our Hadoop clusters. It's even more of a problem now that the data is
out in the cloud, where your data is not only in S3, it's all over the place, and it also includes streaming, which I know we'll be talking about later. So the data mesh was a response to that, the idea being that the folks who really know best about the data and its governance are the domain experts. So data mesh was basically an architectural pattern and a process. My prediction for this year is that data mesh is going to hit cold, hard reality. Because if you do a Google search, the published work, the articles, have so far been largely uncritical, basically hailing it as a very revolutionary new idea. I don't think it's that revolutionary, because we've talked about ideas like this before. Brad, you and I met years ago when we were talking about SOA and decentralizing, and all of that was at the application level. Now we're talking about it at the data level, and now we have microservices, so there's this thought of: if we manage apps cloud-natively through microservices, why don't we think of data in the same way?

My sense this year, and this has been a very active search term if you look at Google search trends, is that enterprises are going to look at this seriously, and as they do, it's going to attract its first real hard scrutiny, its first backlash. That's not necessarily a bad thing; it means it's being taken seriously. The reason I think you'll start to see the cold, hard light of day shine on data mesh is that it's still a work in progress. This idea is basically a couple of years old, and there are still some pretty major gaps. The biggest gap is in the area of federated governance. Now, federated governance itself is not a new issue. With federated governance, we're trying to figure out
how we can strike the balance between consistent enterprise policy and consistent enterprise governance on the one hand, and the groups that actually understand the data on the other; how do we balance the two? There's a huge gap there in practice and knowledge. Also, to a lesser extent, there's a technology gap, which is in the self-service technologies that will help teams essentially govern data through the full life cycle: from selecting the data, to building the pipelines, to determining your access control, to looking at quality, looking at whether the data is fresh or whether or not it's trending off course.

So my prediction is that data mesh will really receive its first harsh scrutiny this year. You're going to see some enterprises declare premature victory when they've built some federated query implementations. You're going to see vendors start to data-mesh-wash their products. Anybody in the data management space, whether it's a pipelining tool, whether it's ELT, whether it's a catalog or a federated query tool, they're all going to be promoting the fact of how they support this. Hopefully nobody is going to call themselves a data mesh tool, because data mesh is not a technology. We're going to see one other thing come out of this, and this harks back to the metadata and the catalogs that Sanjeev was talking about, which is that there's going to be a renewed focus on metadata. And I think that's going to spur interest in data fabrics. Now, data fabrics are pretty vaguely defined, but if we just take the most elemental definition, which is a common metadata backplane, I think that if anybody is going to get serious about data mesh, they need to look at a data fabric,
because at the end of the day, we all need to read from the same sheet of music.

So, thank you, Tony. Dave Menninger, one of the things people like about data mesh is that it pretty crisply articulates some of the flaws in today's organizational approaches to data. What are your thoughts on this?

Well, I think we have to start by defining data mesh, right? The term is already getting corrupted. Tony said it's going to see the cold, hard light of day, and there's a problem right now in that there are a number of overlapping terms that are similar but not identical. So we've got data virtualization, data fabric, excuse me for a second, sorry about that, data virtualization, data fabric, data federation, right? So I think it's not really clear what each vendor means by these terms. I see data mesh and data fabric becoming quite popular. I've interpreted data mesh as referring primarily to the governance aspects, as originally intended and specified, but that's not the way I see vendors using it; I see vendors using it much more to mean data fabric and data virtualization. So I'm going to comment on the group of those things. I think the group of those things is going to happen; they're going to become more robust. Our research suggests that a quarter of organizations are already using virtualized access to their data lakes, and another half, so a total of three quarters, will eventually be accessing their data lakes using some sort of virtualized access. Again, whether you define it as mesh or fabric or virtualization isn't really the point here; the point is this notion that there are different elements of data, metadata, and governance within an organization that all need to be managed collectively. The interesting thing is when you look at the satisfaction rates of organizations using virtualization versus those that are not: it's almost double. 79% of organizations that were using
virtualized access expressed satisfaction with their access to the data lake; only 39% expressed satisfaction if they weren't using virtualized access.

Thank you, Dave. Sanjeev, we've just got about a couple of minutes on this topic, but I know you're speaking, or maybe you've already spoken, on a panel with Zhamak Dehghani, who sort of invented the concept. Governance obviously is a big sticking point, but what are your thoughts on this? You are muted.

So my message to Zhamak and to the community is, as opposed to what Dave said, let's not define it. We spent the whole year defining it. There are four principles: domain ownership, data as a product, self-serve data infrastructure, and federated governance. Let's take it to the next level. I get a lot of questions on what the difference is between data fabric and data mesh, and I'm like, how can I compare the two? Data mesh is a business concept; data fabric is a data integration pattern. How do you compare the two? You have to bring data mesh a level down. So to Tony's point, I'm on a warpath in 2022 to take it down to: what does a data product look like? How do we handle shared data across domains and govern it? And I think what we're going to see more of in 2022 is the operationalization of data mesh.

I think we could have a whole hour on this topic, couldn't we? Maybe we should do that. But let's move to Carl. Carl, you're our database guy; you've been around that block for a while now. You want to talk about graph databases? Bring it on.

Oh yeah, okay, thanks. So I regard graph databases as basically the next truly revolutionary database management technology. I'm forecasting that the graph database market, which of course we haven't defined yet, so obviously I have a little wiggle room in what I'm about to say, will grow by about 600 percent over the next ten years. Now, ten years is a long time, but over the next five years we expect to see gradual growth as people start to learn how to use it. The
problem is not that it's not useful; it's that people don't know how to use it. So let me explain, before I go any further, what a graph database is, because some of the folks on the call may not know. A graph database organizes data according to a mathematical structure called a graph. A graph has elements called nodes and edges. A data element drops into a node; the nodes are connected by edges, and the edges connect one node to another. Combinations of edges create structures that you can analyze to determine how things are related. In some cases the nodes and edges can have properties attached to them, which add additional informative material that makes it richer; that's called a property graph.

Okay, there are two principal kinds of graph databases. There are semantic graphs, which are used to break down human-language text into semantic structures; then you can search it, organize it, and answer complicated questions. A lot of AI is aimed at semantic graphs. The other kind is the property graph that I just mentioned, which has a dazzling number of use cases. As I talk about this, people are probably wondering, well, we have relational databases, isn't that good enough? Okay, so a relational database supports what I call definitional relationships. That means you define the relationships in a fixed structure, and the data drops into that structure. There's a foreign key value that relates one table to another, and that value is fixed; you don't change it. If you change it, the database becomes unstable, and it's not clear what you're looking at. In a graph database, the system is designed to handle change, so that it can reflect the true state of the things it's being used to track.

So let me give you some examples of use cases for this. They include entity resolution, data lineage, social media analysis, customer 360, fraud prevention. There's cybersecurity. Supply chain is a big one, actually. There's explainable AI, and this is going to become important, too, because a lot of people are adopting AI, but they want the system, after the fact, to say how the AI came to that conclusion, how it made that recommendation, and right now we don't have really good ways of tracking that. Machine learning in general. Social network analysis, I already mentioned that. And then we've got, oh gosh, data governance, data compliance, risk management, recommendation, personalization, anti-money laundering, that's another big one, identity and access management. Network and IT operations is already becoming a key one, where you've actually mapped out your operation, whatever it is, your data center, and you can track what's going on as things happen there. Root cause analysis. Fraud detection is a huge one; a number of major credit card companies use graph databases for fraud detection. Risk analysis, tracking and tracing, churn analysis, next best action, what-if analysis, impact analysis, entity resolution.

And I would add a few other things to this list. Metadata management. So, Sanjeev, here you go, this is your engine, okay? Because I was in metadata management for quite a while in my past life, and one of the things I found was that none of the data management technologies available to us could efficiently handle metadata, because of the kinds of structures that result from it. But graphs can. Graphs can do things like say: this term in this context means this, but in that context it means that. Things like that. And in fact, logistics management and supply chain as well, because it handles recursive relationships. By recursive relationships, I mean objects that own other objects of the same type. You can do things like bills of materials, you know, parts explosion. You can do HR analysis: who reports to whom, how many levels up the chain, and that kind of thing.
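Carl's bill-of-materials example is easy to sketch. Here is a toy property graph in plain Python, with invented part names, showing the recursive "parts explosion" traversal he describes; a real graph database would store, index, and query this natively rather than walking a Python list.

```python
# Minimal property-graph sketch: nodes with properties, plus directed
# "contains" edges that carry a quantity. A parts explosion is a
# recursive walk over edges, following the same relationship type.
nodes = {
    "bike":  {"type": "assembly"},
    "wheel": {"type": "assembly"},
    "frame": {"type": "part"},
    "spoke": {"type": "part"},
    "hub":   {"type": "part"},
}
edges = [  # (parent, child, edge properties)
    ("bike", "frame", {"qty": 1}),
    ("bike", "wheel", {"qty": 2}),
    ("wheel", "spoke", {"qty": 32}),
    ("wheel", "hub", {"qty": 1}),
]

def explode(part, multiplier=1):
    """Recursively flatten the bill of materials under `part`,
    multiplying quantities down the chain."""
    totals = {}
    for parent, child, props in edges:
        if parent == part:
            qty = props["qty"] * multiplier
            totals[child] = totals.get(child, 0) + qty
            for sub, n in explode(child, qty).items():
                totals[sub] = totals.get(sub, 0) + n
    return totals

print(explode("bike"))  # {'frame': 1, 'wheel': 2, 'spoke': 64, 'hub': 2}
```

In a graph database this traversal is typically a one-line path query; the same "who reports to whom, N levels up" HR question is the identical walk over a different edge type.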
You can do that with relational databases, but yes, it takes a lot of programming. In fact, you can do almost any of these things with relational databases, but the problem is you have to program it; it's not supported in the database. And whenever you have to program something, that means you can't trace it, you can't define it, you can't publish it in terms of its functionality, and it's really, really hard to maintain over time.

So, Carl, thank you. I wonder if we could bring Brad in. I mean, Brad, I'm sitting here wondering, okay, is this incremental to the market, or is it disruptive and a replacement? What are your thoughts on this space?

It's already disrupted the market. I mean, like Carl said, go to any bank and ask them, are you using graph databases to get fraud detection under control, and they'll say, absolutely, that's the only way to solve this problem, and it is, frankly. And it's the only way to solve a lot of the problems that Carl mentioned. And that, I think, is its Achilles heel in some ways, because it's like finding the best way to cross the seven bridges of Königsberg: it's always going to kind of be tied to those use cases, because it's really special and it's really unique. And because it's special and unique, it still, unfortunately, kind of stands apart from the rest of the community that's building, let's say, AI outcomes, as the great example here. Graph databases and AI, as Carl mentioned, are like chocolate and peanut butter, but technologically they don't know how to talk to one another; they're completely different. And you can't just stand up SQL and query them; you've got to learn, what is it, Carl, Cypher or SPARQL or whatever, to actually get to the data in there. And if you're going to scale that graph database, especially a property graph, if you're going to do something really complex, like try to understand all of the metadata in your organization, you might just end
up with, you know, a graph database winter, like we had the AI winter, simply because you run out of performance to make the thing happen. So I think it's already disrupted the market, but we need to treat it like a first-class citizen in the data analytics and AI community. We need to bring it into the fold, we need to equip it with the tools it needs to do the magic it does, and to do that not just for specialized use cases but for everything, because I'm with Carl: I think it's absolutely revolutionary.

I had also identified the principal Achilles heel of the technology, which is scaling. When these things get large and complex enough that they spill over what a single server can handle, you start to have difficulties, because the relationships span things that have to be resolved over a network, and then you get network latency, and that slows the system down. So that's still a problem to be solved.

Sanjeev, any quick thoughts on this? I mean, I think metadata on the word cloud is going to be the largest font, but what are your thoughts here?

I want to step away from that, so people don't associate me with only metadata, so I want to talk about something slightly different. DB-Engines.com has done an amazing job; I think almost everyone knows that they chronicle all the major databases in use today. As of January 2022, there are 381 databases on its ranked list. The largest category is RDBMS. The second largest category is actually divided into two: property graphs and RDF graphs. These two together make up the second largest number of databases. So, talking about Achilles heels, here is a problem. The problem is that there are so many graph databases to choose from, and they come in different shapes and forms. To Brad's point, there are so many query languages. In RDBMS, it's SQL, end of story. Here we've got Cypher, we've got Gremlin, we've got GQL, and then there are proprietary languages. So I think there's a
lot of disparity in this space.

Excellent, all excellent points, Sanjeev, I must say. And that is a problem. The languages need to be sorted out and standardized, and people need a road map as to what they can do with it. Because, as you say, you can do so many things, and so many of those things are unrelated, that you sort of say, well, what do we use this for? I'm reminded of a saying I learned a bunch of years ago, when somebody said that the digital computer is the only tool man has ever devised that has no particular purpose.

All right, guys, we've got to move on to Dave Menninger. We've heard about streaming; your prediction is in that realm, so please take it away.

Sure. So I like to say that historical databases are going to become a thing of the past, but I don't mean that they're going to go away; that's not my point. I mean, we need historical databases, but streaming data is going to become the default way in which we operate with data. So in the next, say, three to five years, I would expect that data platforms, and we're using the term data platforms to represent the evolution of databases and data lakes, will incorporate these streaming capabilities. We're going to process data as it streams into an organization, and then it's going to roll off into historical databases. So historical databases don't go away, but they become a thing of the past: they store the data that occurred previously, and as data is occurring, we're going to be processing it, analyzing it, acting on it. I mean, we only ever ended up with historical databases because we were limited by the technology that was available to us. Data doesn't occur in batches, but we processed it in batches because that was the best we could do, and it wasn't bad, and we've continued to improve and improve. But streaming data today is still the exception, not the rule.
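The pattern Dave is describing, act on each event as it arrives while older data rolls off into a historical store, can be sketched in a few lines. This is a toy illustration only; the event shape and window size are invented for the example and aren't drawn from any product discussed here.

```python
from collections import deque

class StreamProcessor:
    """Toy streaming pattern: act on each event as it arrives, keep a
    short in-memory window as the 'present', and roll older events off
    into a 'historical' store for later analysis."""
    def __init__(self, window_size=3):
        self.window = deque(maxlen=window_size)  # the streaming present
        self.historical = []                     # the thing of the past

    def ingest(self, event):
        if len(self.window) == self.window.maxlen:
            self.historical.append(self.window[0])  # roll off the oldest
        self.window.append(event)
        return self.moving_average()  # act on the data as it occurs

    def moving_average(self):
        return sum(e["value"] for e in self.window) / len(self.window)

p = StreamProcessor(window_size=3)
for v in [10, 20, 30, 40]:
    avg = p.ingest({"value": v})
print(avg)                                  # 30.0, the current-window average
print([e["value"] for e in p.historical])   # [10], rolled off so far
```

The split mirrors the prediction: the window is queried in real time, while the historical store keeps everything that "occurred previously" for batch-style analysis later.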
There are projects within organizations that deal with streaming data, but it's not the default way in which we deal with data yet. And so that's my prediction: this is going to change, and streaming data will become the default way in which we deal with data. How you label it, what you call it, you know, maybe these databases and data platforms just evolve to be able to handle it, but we're going to deal with data in a different way. And our research shows that already about half of the participants in our analytics and data benchmark research are using streaming data, and another third are planning to use streaming technologies. So that gets us to about eight out of ten organizations needing to use this technology. That doesn't mean they have to use it throughout the whole organization, but it's pretty widespread in its use today, and it has continued to grow. If you think about the consumerization of IT, we've all been conditioned to expect immediate access to information, immediate responsiveness. We want to know if an item is on the shelf at our local retail store so we can go in and pick it up right now. That's the world we live in, and that's spilling over into the enterprise IT world, where we have to provide those same types of capabilities. So that's my prediction: the historical database becomes a thing of the past, and streaming data becomes the default way in which we operate with data.

All right, thank you, David. Well, what say you, Carl, a guy who's followed historical databases for a long time?

Well, one thing, actually: every database is historical, because as soon as you put data in it, it's now history; it no longer reflects the present state of things. Even if that history is only a millisecond old, it's still history. But I would say, I mean, I know you're trying to be a little bit provocative in saying this, Dave, because you know as well as I do that people still need to do their taxes, they still need to do accounting.
General ledger programs and things like that involve historical data as well; that's not going away, unless you want to go to jail. So you're going to have to deal with that. But as far as the leading-edge functionality goes, I'm totally with you on that. And I'm just kind of wondering if this requires a change in the way we perceive applications in order to be truly manifested, rethinking the way applications work, saying that an application should respond instantly as soon as the state of things changes. What do you say about that?

I think that's true. I think we do have to think about things differently. That's not the way we designed systems in the past. We're seeing more and more systems designed that way, but again, it's not the default. And I agree 100% with you that we do need historical databases; that's clear. And some of those historical databases will even be used in conjunction with the streaming data, right? Absolutely. I mean, let's take the data warehouse example, where you're using the data warehouse as context and the streaming data as the present: you're saying, here's a sequence of things that's happening right now, have we seen that sequence before, what does that pattern look like in past situations, and can we learn from that?

So, Tony Baer, I wonder if you could comment. When you think about, say, real-time inferencing at the edge, which is something a lot of people talk about, a lot of what we're discussing here in this segment looks like it's got great potential. What are your thoughts?

Yeah, well, I think you nailed it, you hit it right on the head there. A key thing I'm seeing, and basically I'm going to split this one down the middle, is that I don't see streaming becoming the default. What I see is streaming and transaction databases and
analytic data stores, data warehouses, data lakes, whatever you call them, are converging, and what allows us to converge technically is cloud-native architecture, where you can distribute things. You could have a node here that's doing the real-time processing, and, this is where your lead-in comes in, maybe doing some of that real-time predictive analytics: we're looking at this customer journey, at what the customer is doing right now, and correlating it with what other customers are doing. In the cloud you can partition this, and because of the speed of the infrastructure you can bring these pieces together and orchestrate them in a loosely coupled manner. The other part is that the use cases are demanding it, and this goes back to what Dave is saying: when you look at customer 360, when you look at, say, smart utility grids, when you look at any type of operational problem, it has a real-time component, it has a historical component, and it has predictives. So my sense is that technically we can bring this together through the cloud, and the use case is that we can apply some real-time predictive analytics on these streams and feed that into the transactions, so that when we decide what to do as a result of a transaction, we have this real-time input.

Sanjeev, did you have a comment?

Yeah, I was just going to say, to this point, that we have to think of streaming very differently, because with historical databases we used to bring the data in, store the data, and then run rules on top, aggregations and all that. In the case of streaming, the mindset changes: the rules, the inference, all of that is fixed, but the data is constantly changing. So it's a completely reverse way of thinking and of building applications on top of that.

So, Dave Menninger, there seemed to be some disagreement about the default there. What kind of time frame are you thinking about? Is it end of decade that it becomes the default? What would you pin it at?

I think somewhere between five and ten years this becomes the reality. It'll be more and more common between now and then, but then it becomes the default. And Sanjeev, at some point, maybe in one of our subsequent conversations, we need to talk about governing streaming data, because that's a whole other set of challenges. We've also talked about this in just two dimensions, historical and streaming, and there's a lot of low-latency, micro-batch, sub-second processing that's not quite streaming, but in many cases it's fast enough. We're seeing a lot of adoption of near real time, not quite real time, as good enough for many applications. Nobody's really talking about the hardware dimension of this, how we get there; that'll just happen, Carl. So near real time, maybe, means before you lose the customer, however you define that, right? Okay, let's move on to Brad. Brad, you want to talk about automation and AI, the pipeline. People feel like, hey, we can just automate everything. What's your prediction?

Yeah, I'm an AI aficionado, so apologies in advance for that. We've been seeing automation at play within AI for some time now, and it's helped us do a lot of things, especially for practitioners that are building AI outcomes in the enterprise. It's helped them fill skills gaps, it's helped them speed development, and it's helped them actually make AI better, because it provides some swim lanes, and, for example with technologies like AutoML, it can auto-document and create that sort of transparency that
we talked about a little bit earlier. But I think there's an interesting convergence happening with this idea of automation, and that is that the automation that started happening for practitioners is trying to move outside the traditional bounds of things like: I'm just trying to get my features, I'm just trying to pick the right algorithm, I'm just trying to build the right model. It's expanding across the full life cycle of building an AI outcome, starting at the very beginning with the data and continuing on to the end, the continuous delivery and continuous automation of that outcome, to make sure it's right and hasn't drifted. And because it's become quite powerful, we're starting to see this odd thing happen where the practitioners are starting to converge with the users. That is to say, if I'm in Tableau right now, I can stand up Salesforce Einstein Discovery and it will automatically create a nice predictive algorithm for me, given the data that I pull in. What's starting to happen, and we're seeing this from the companies that create business software, Salesforce, Oracle, SAP and others, is that they're starting to use these same ideas, and a lot of deep learning, to stand up these out-of-the-box, flip-a-switch AI outcomes at the ready for business users. I think that's the way it's going to go, and what it means is that AI is slowly disappearing. I don't think that's a bad thing. If anything, what we're going to see in 2022, and maybe into 2023, is a rush to put this idea of disappearing AI into practice and have as many of these solutions in the enterprise as possible. For example, SAP is going to roll out this quarter something called adaptive recommendation services, which is basically a cold-start AI outcome that can work across a whole bunch of different vertical markets and use cases; it's a recommendation engine for whatever you need it to do in the line of business. So you're an SAP user, you're a sales professional, let's say, you turn on your software one day, and suddenly you have a recommendation for customer churn. That's great. Well, I don't know; I think that's terrifying in some ways. I do think this is the future, that AI is going to disappear like that, but I am absolutely terrified of it, because what it really does is call attention to a lot of the issues we already see around AI, specific to what we at Omdia like to call responsible AI: how do you build an AI outcome that is free of bias, that is inclusive, that is fair, that is safe, that is secure, that is auditable, and so on? That takes a lot of work to do. So if you imagine a customer that's just a Salesforce customer, let's say, and they're turning on Einstein Discovery within their sales software, you need some guidance to make sure that when you flip that switch, the outcome you're going to get is correct. That's going to take some work, and so I think we're going to see this rush to roll it out, and suddenly there are going to be a lot of problems and a lot of pushback. Some of that's going to come from GDPR and the other regulations Sanjeev was mentioning earlier, and a lot of it's going to come from internal CSR requirements within companies that are saying, hey, whoa, hold up, we can't do this all at once; let's take the slow route, let's make AI automated in a smart way. And that's going to take time.

Yeah, so a couple of predictions there that I heard: AI essentially disappears, it becomes invisible, if I can restate that, and then, if I understand it correctly, Brad, you're saying there's a backlash in
the near term, where people say, oh, slow down, let's automate what we can. Those attributes you talked about are non-trivial to achieve. Is that why you're a bit of a skeptic?

Yeah. We don't have any sort of standards that companies can look to and understand, and within these companies, especially those that haven't already stood up an internal data science team, they don't have the knowledge to understand that when they flip that switch for an automated AI outcome, it's going to do what they think it's going to do. So we need some sort of standard methodology and best practices that every company consuming this invisible AI can make use of. One of the things that Google kicked off a few years back, which is picking up momentum, and the companies I just mentioned are starting to use it, is this idea of model cards, where at least you have some transparency about what these things are doing. For the SAP example, we know that it's using a convolutional neural network with a long short-term memory model, and we know that it only works on Roman-alphabet English, and therefore I as a consumer can say, oh, well, I know that I need to do this internationally, so I should not just turn this on today.

Great, thank you. Carl, can you add anything, any context here?

Yeah, we've talked about some of the things Brad mentioned here at IDC in our Future of Intelligence group, regarding in particular the moral and legal implications of having a fully automated, AI-driven system. We already know, and we've seen, that AI systems are biased by the data that they get, so if they get data that pushes them in a certain direction, they go in that direction. I think there was a story last week about an HR system that was recommending promotions for white people over black people, because in the past white people were promoted and rated more productive than black people, but it had no context as to why, which is that black people were being historically discriminated against. The system doesn't know that. So you have to be aware of that, and I think that, at the very least, there should be controls when a decision has either a moral or a legal implication, when you really need a human judgment. The system can lay out the options for you, but a person actually needs to authorize the action. I also think we will always have to be vigilant regarding the kind of data we use to train our systems, to make sure it doesn't introduce unintended biases; to some extent it always will, so we'll always be chasing after them.

That's absolutely right, Carl. I think what you have to bear in mind as a consumer of AI is that it is a reflection of us, and we are a very flawed species. If you look at all the really fantastic, magical-looking supermodels we see, like GPT-3, and the 4 that's coming out, they're xenophobic and hateful, because the data they're built upon, and the algorithms, and the people that build them, are us. AI is a reflection of us. We need to keep that in mind.

Yeah, the AI is biased because humans are biased. All right, great. Okay, let's move on. Doug Henschen, a lot of people said that the term data lake wasn't going to live on, but it appears to have some legs here. You want to talk about lakehouse? Bring it on.

Yes, I do. My prediction is that lakehouse, this idea of a combined data warehouse and data lake platform, is going to emerge as the dominant data management offering. I say offering; that doesn't mean it's going to be the dominant thing that organizations actually have out there, but it's going to be the predominant vendor offering in 2022.
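Before going on, the model-card idea Brad raises can be made concrete with a small sketch. This is a hypothetical illustration of the concept only, not Google's, SAP's, or any vendor's actual schema; every field name and the gating check below are invented:

```python
# Minimal sketch of a "model card": structured metadata that tells a
# consumer what an automated AI feature can and cannot do, so they can
# decide whether to flip the switch. All fields are hypothetical.

def supports(card, language):
    """Return True if the carded model claims support for this language."""
    return language in card["supported_languages"]

churn_card = {
    "name": "churn-recommender",                 # invented model name
    "architecture": "CNN + LSTM",                # the architecture cited in the panel example
    "supported_languages": ["en"],               # e.g. Roman-alphabet English only
    "intended_use": "customer churn recommendations",
    "known_limitations": ["not evaluated on non-English text"],
}

# A buyer planning an international rollout can check before enabling it:
print(supports(churn_card, "en"))  # True
print(supports(churn_card, "ja"))  # False: don't just turn this on today
```

The value is exactly the transparency Brad describes: the consumer learns the limits of the automated outcome before it goes into production, rather than after the pushback starts.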
Now, heading into 2021 we already had Cloudera, Databricks, Microsoft, and Snowflake as proponents, and in 2021 SAP, Oracle, and several of these fabric, virtualization, and mesh vendors joined the bandwagon. The promise is that you have one platform that manages your structured, unstructured, and semi-structured information, and that it addresses both the BI and analytics needs and the data science needs. The real promise there is simplicity and lower cost, but I think end users have to answer a few questions. The first is: does your organization really have a center of data gravity, or is the data highly distributed, multiple data warehouses, multiple data lakes, on premises and in the cloud? If it's very distributed, and you have difficulty consolidating, and that's not really a goal for you, then maybe that single platform is unrealistic and not likely to add value for you. The fabric and virtualization vendors, the mesh idea: if you have that highly distributed situation, those might be a better path forward. The second question, if you are looking at one of these lakehouse offerings, consolidating and simplifying onto a single platform, is that you have to make sure it meets both the warehouse need and the data lake need. You have vendors like Databricks, and Microsoft with Azure Synapse, really new to the data warehouse space, and they have to prove that the data warehouse capabilities on their platforms can meet the scaling requirements, the user and query concurrency requirements, those tight SLAs. On the other hand, you have Oracle, SAP, and Snowflake, the data warehouse folks, coming into the data science world, and they have to prove that they can manage the unstructured information and meet the needs of data scientists. I'm seeing a lot of the lakehouse offerings from the warehouse crowd managing that unstructured information in columns and rows, and some of these vendors, Snowflake in particular, are really relying on partners for the data science needs. So you really have to look at a lakehouse offering and make sure it meets both the warehouse and the data lake requirements.

Well, thank you, Doug. Tony, if those two worlds are going to come together, as Doug was saying, the analytics world and the data science world, does there need to be some kind of semantic layer in between? Weigh in on this topic, if you would.

Didn't we talk about data fabrics before, a common metadata layer? Actually, I'm almost tempted to say let's declare victory and go home, in that this has been going on for a while. I agree with much of what Doug is saying. I remember as far back as, I think, 2014, I was doing a study, still at Ovum, the predecessor of Omdia, looking at all these specialized databases that were coming up, and seeing that there was overlap at the edges, but there was still going to be a reason, at the time, that you would have, say, a document database for JSON, a relational database for transactions and for the data warehouse, and something, at that time Hadoop, that resembled what we now consider a data lake for fast file scanning. What I was saying then is that you were seeing blending at the edges, and that was about five or six years ago. The lakehouse is essentially the current manifestation of that idea. There is a dichotomy, the old argument: do we centralize this all in a single place, or do we virtualize? I think it's always going to be a yin and yang; there's never going to be a single silver bullet. I do see that there are also going to be questions, and these are points that Doug raised, about
what you need for your performance characteristics. Do you need, for instance, high concurrency? Do you need the ability to do some very sophisticated joins? Or is your requirement more to distribute your processing as far as possible, to essentially take a kind of brute-force approach? All of these approaches are valid, based on the use case. I just see the lakehouse as the culmination of, well, it's a relatively new term, introduced by Databricks a couple of years ago, but it's the culmination of what's been a long-time trend. And what we see in the cloud is that this is becoming a checkbox item for data warehouses: hey, we can source data in cloud storage, S3, Azure Blob storage, whatever, as long as it's in certain formats, like Parquet or CSV. I see that becoming a checkbox item. So to that extent, I think the lakehouse, depending on how you define it, is already reality; in some cases maybe new terminology, but not a whole heck of a lot new under the sun.

Yeah. Dave Menninger, thank you, Tony, but a lot of this is going to come down to vendor marketing, right? Some people try to co-opt the term; we talked about data mesh washing. What are your thoughts on this?

Yeah, so I used the term data platform earlier, and part of the reason I use that term is that it's more vendor neutral. We've tried to stay out of the vendor terminology-patenting world, and whether or not the term lakehouse is what sticks, the concept is certainly going to stick, and we have some data to back it up. About a quarter of organizations that are using data lakes today already incorporate data warehouse functionality into them, so they consider their data lakehouse and data warehouse one and the same. About a quarter of organizations, a little less, feed the data lake from the data warehouse, and about a quarter of organizations feed the data warehouse from the data lake. So it's pretty obvious that three quarters of organizations need to bring this stuff together. The need is there, the need is apparent, and the technology is going to continue to converge. I like to describe it this way: you've got the data lake people over here at one end, and I'm not going to talk about why people thought data lakes were a bad idea, because they thought you just throw stuff in a server and ignore it, and that's not what a data lake is. So you've got data lake people over here and database and data warehouse people over there. Database vendors are adding data lake capabilities, and data lake vendors are adding data warehouse capabilities, so it's obvious that they're going to meet in the middle. Like Tony says, I think we should declare victory and go home.

So, just to follow up on that: are you saying the specialized lake and the specialized warehouse go away? I mean, Tony, data mesh practitioners, or advocates, would say they could all live as just nodes on the mesh. But based on what Dave just said, are we going to see those all morph together?

Well, number one, as I was saying before, there's always going to be this sort of centrifugal force, this tug of war, between centralizing the data and virtualizing it, and I don't think there's ever going to be any single answer. In terms of data mesh: data mesh has nothing to do with how you physically implement the data. You could have a data mesh on a data warehouse. The difference is that even if we use the same physical data
store, everybody is logically governing it differently. A data mesh is not a technology; it's a process, a governance process. So essentially, as I was saying before, this is the culmination of a long-time trend. We're seeing a lot of blurring, but there are going to be cases where, for instance, if I need high concurrency, there are certain things I'm not going to be able to get efficiently out of a data lake, out of a system that's really brute-forcing very fast file scanning. So I think there will always be some delineations, but I would agree with Dave and with Doug that we are seeing a confluence of requirements: we need the abilities of both a data lake and a data warehouse, and these need to come together.

So I think what we're likely to see is organizations looking for a converged platform that can handle both sides for their center of data gravity, and the mesh and fabric vendors, the fabric and virtualization vendors, are all on board with the idea of this converged platform. They're saying, hey, we'll handle all the edge cases, the stuff that isn't in that center of data gravity, that's distributed off in a cloud or at a remote location. So you can have that single platform for the center of your data, and then bring in virtualization, mesh, what have you, for reaching out to the distributed data.

Bingo. As they basically said, people are happy when they virtualize data.

I think yes, at this point, but to Dave Menninger's point, they are converging. Snowflake has introduced support for unstructured data, so now we are literally splitting hairs here. What Databricks is saying is, aha, but it's easier to go from data lake to data warehouse than it is from data warehouse to data lake. So I think we're getting into semantics, but we've already seen these two converge.

So it takes something like AWS, which has what, 15 data stores? Are they going to have 15 converged data stores? That's going to be interesting to watch. All right, guys, I'm going to go down the list and do one word each, and each of you analysts, if you would, add a very brief course correction for me. So, Sanjeev: governance is going to be, maybe, the dog that wags the tail now. It's coming to the fore, all this ransomware stuff, and we really didn't talk much about security. What's the one word in your prediction that you would leave us with on governance?

It's going to be mainstream.

Mainstream, okay. Tony Baer: mesh washing is what I wrote down, that's what we're going to see in 2022. A little reality check. You want to add to that?

The reality check is that I hope no vendor jumps the shark and calls their offering a data mesh product.

Yeah, let's hope that doesn't happen. If they do, we're going to call them out. Carl, graph databases, thank you for sharing some high-growth metrics. I know it's early days, but magic is what I took away from that. It's the magic database.

Yeah, I've said this to people too: I kind of look at it as a Swiss Army knife of data, because you can pretty much do anything you want with it. That doesn't mean you should. If you're managing things that are in a fixed schematic relationship, a relational database is probably a better choice, and there are times when a document database is a better choice; a graph can handle those things, but it may not be the best choice for that use case. But for a great many, especially the new, emerging use cases I listed, it's the best choice. Thank you.
And Dave Menninger, thank you, by the way, for bringing the data in; I liked how you supported all your comments with data points. Streaming data becomes the default paradigm, if you will. What would you add?

Yeah, I would say think fast, right? That's the world we live in. You've got to think fast.

Think fast, love it. And Brad Shimmin? I love it; on the one hand, I was saying, okay, great, I'm afraid I might get disrupted by one of these internet giants who are AI experts, so I'm going to be able to buy instead of build AI. But then again, I've got some real issues; there's a potential backlash there. So give us your bumper sticker.

Yeah, I would say, going along with Dave: think fast, and also think slow, to nod to the book everyone talks about. Really, this is all about trust: trust in the idea of automation and of a transparent, invisible AI across the enterprise, but verify. Verify before you do anything.

And then Doug Henschen. Look, I think the trend is your friend here on this prediction of lakehouse really becoming dominant. I liked the way you set up that notion of the data warehouse folks coming at it from the analytics perspective while the data science world comes together with it. I still feel as though there's a piece in the middle that we're missing, but we'll give you the last word.

Well, I think the idea of consolidation and simplification always prevails. That's why the appeal of a single platform is going to be there. We've already seen that with Hadoop platforms moving toward the cloud, moving toward object storage, and object storage becoming the common storage point, whether it's a lake or a warehouse. And a second point: I think ESG mandates are going to come in alongside GDPR and the like, to up the ante for good governance.

Yeah, thank you for calling that out. Okay, folks, hey, that's all the time that we have here. Your experience and depth of understanding on these key issues in data and data management really are on point, and they were on display today. I want to thank you for your contributions; really appreciate your time.

Enjoyed it. Thank you.

Now, in addition to this video, we're going to be making available transcripts of the discussion, and we're going to do clips of this as well and put them out on social media. I'll write this up and publish the discussion on wikibon.com and siliconangle.com, and no doubt several of the analysts on the panel will take the opportunity to publish written content, social commentary, or both. I want to thank the power panelists, and thanks for watching this special Cube presentation. This is Dave Vellante. Be well, and we'll see you next time. [Music]
Patrick Osborne, HPE | VMworld 2018
>> (narrator) Live from Las Vegas, it's the Cube, covering VMworld 2018. Brought to you by VMware and its ecosystem partners. >> Welcome back to Las Vegas everybody. You're watching the Cube, the leader in live tech coverage. My name is Dave Vellante and I'm here with my co-host, David Floyer. Good to see you again David. VMworld day three, wall-to-wall coverage. We got sets going on. 94 guests. Patrick Osborne is here, he's the Vice President of Big Data and Secondary Storage at Hewlett Packard Enterprise. Patrick, it's great to see you again. >> Always a pleasure to be on the Cube. >> Big quarter, Antonio Neri early into his tenure. >> Yes. The earnings, raised guidance, great to see that. Got to feel good. Give us the update, VMworld 2018, what's happening with you guys? >> So Q3 was a bang-up quarter, for all segments of the business. It was great. Obviously it's the kind of earnings you want to have from a CEO in a second quarter steering the ship here. I think everyone's jazzed up. He's brought a lot of new life to the company, in terms of technology leadership. He's someone who's certainly grown up, from the ground up, starting off his career at HPE. So for those of us who have started off as a Product Manager, an individual contributor, making your way up to CEO is definitely possible. So that's been great, and I think there are favorable macroeconomics and we're taking advantage of that. VMworld's been awesome. I think this whole story around Multicloud, and obviously we talk about hybrid IT at HPE, so it fits very well. VMware Technology partner of the year, again. Four years running, so it's been a really good show for us. >> As last year, data protection is the single hottest topic. Data protection, obviously Cloud, The Edge, but The Edge is kind of new and it's hot, it's sexy. But in terms of actual business that's getting done, companies that are getting funded, companies getting huge raises, throwing big parties.
We saw you back-to-back nights at Omnia; there's a lot happening in data protection. HPE has got a whole new strategy around data protection. Maybe talk about that a little bit and how it's going. >> So it's going really well. Like you said, that part of the market is pretty hot right now. I think there's a couple of things playing into that, certainly this new style of IT applied to secondary storage. We saw that with primary storage the last few years: Multicloud, the move to all-flash, low-latency workloads. And certainly a lot of the things in that area are disrupting secondary storage. People want to do it in different ways, they want to be able to simplify this area. It's a growing area for data, in general. They want to make that data work for them. Test/Dev, workload placement, intelligent placement of data, for secondary and even tertiary storage in the cloud. So a lot of good things happening, from an HPE perspective. >> So not just backup? >> No, not just backup. >> I want more out of my insurance policy. >> Exactly. Something in the past that was purely a TCO type of conversation. My example is always: who likes to pay their life insurance premium, right? Because at the end of the day, I'm not going to derive any utility from that payment. So now, it's moving into more ROI. We have things like the Hybrid Flash Array, from Nimble, for example. It allows you to put your workloads to work. We have a great cloud service, called HPE Cloud Volumes, that our customers use to do intelligent DR as a service, and to apply Cloud compute to your data. So there's a lot going on in the space that's just outside of your traditional "move data from point A to point B." Now you want to make it work for you. >> And what about the big data portfolio? You hear a lot about data. You don't hear a ton about the Big Data, Hadoop piece of the world.
I know Hadoop, nobody seems to be talking about that anymore. But everybody's talking about AI, Machine-Learning, Deep-Learning. Certainly The Edge is all about data. What's the Big Data story? >> So at HPE, we're definitely focused on the whole Edge to Core analytics story. So we have a great story and you can see in the numbers from Q3, The Edge business, the Edgeline servers, Aruba, driving a lot of growth in the company, where a lot of that data is being created. And then back into the Core, so for Big Data, we see a number of customers who are using these tools to effect digital transformation. They're doing it, we're doing it to ourselves. So they're moving from batch oriented, to now fast data, so streaming analytics. And then, incorporating concepts of AI and ML to provide better service or better experience for their customers. And we're doing that with, for example, InfoSight. So we have a great product, Nimble, 3PAR. And then we provide a service, on top of that, which is a SaaS-based service. It has predictive analytics and Machine Learning. And we're able to do that, by using Big Data analytics. >> You're offering that as a service, as a SaaS service to your customers? >> Absolutely. And the way we're able to provide those predictive analytics and be able to provide those recommendations and that Machine-Learning across an entire portfolio and be able to scale that service, because it's a service, we've got tens of thousands of users using the service on a daily basis, is moving from an ERP system, data warehouse, to batch analytics, to now we're doing Elasticsearch and Kafka and all these really cool techniques, so it's really helped us unlock a lot of value for our customers. >> So, the Nimble acquisition is interesting, it's bringing that sort of Machine-Learning and AI to infrastructure. You got a lot of automation in the portfolio and you can't really talk about Cloud without talking about automation. So talk a little about automation.
>> In particular, even at the show here this week, we are a premier technology partner with VMware and I think more of what you see in the VMware ecosystem is all around Cloud and automation. That's really where they're going. And we've been day-zero partners on a lot of different fronts. So VMware Cloud Foundation integration, we do things on the storage level with vVols and SRM and all these things that allow customers to essentially program that infrastructure and get out of the mundane tasks of having to do this manually. So for us, automation is a key part of our story here. Especially with VMware. >> So going a little bit further with that, what sort of examples, what benefit is this to your customers? How are they justifying putting all this in? >> It's a hybrid world, so our customers are going to expect, from us, as a portfolio vendor, the ability to provide an automated solution, on premises, as automated as what you'd get in the cloud. So for us, the ability to have a sourcing experience, that we call GreenLake, so you can buy everything from us, from a solution perspective, in a pay-as-you-go elastic model where you can flex-up, flex-down. And then being able to essentially provide a different view, depending on what persona you're coming from. Obviously we've been focused on the infrastructure persona; more often, we're getting into the DevOps persona, the Cloud engineer persona, providing all of our infrastructure, whether it's compute, networking or storage, that plugs into all these frameworks. Whether it's Ansible, Chef and all these things that we do around our automation ecosystem, it's pretty ubiquitous. >> You're touching on all the Cloud bases and you're seeing a lot of discussion around that. What are you hearing from customers? Sometimes we have to squint through this, a lot of the guys here, we always like to say, move at the speed of the CIO, which sometimes is slow. At the same time, they're all afraid they're going to get disrupted.
HPE, over the last two or three years, has really brought in and partnered with some of the guys you're talking about. Whether it's containers and companies that do those types of offerings. How fast are the customers actually adopting, where are they adopting them, how are they handling, you talked about a hybrid world; How are they bridging the old and the new? >> That's a great question. For a lot of our customers, it's always a brownfield conversation. You do have these mission critical workloads that have to run, so there's no Edge to Core without your core ERP system, right? Your core Oracle system, or for smaller customers that are running their businesses on SQL and other things. But what we're seeing is that, by shoring up that Core, and we provide a set of services and products that we feel are the best in the industry for that, we then allow them to provide adjacent services on top of that. It's exactly like the same example we had with InfoSight, where those systems used to call home; right now we're taking that data, we're providing a whole ancillary set of services and functions around it and our customers are doing that. Enormous customers, like British Telecom, folks like Wayfair, for example, they're doing this on premises and they're disrupting their competitors in the meantime. >> What do you make of some of the announcements we've heard this week? Obviously VMware making a big deal with what's going on with AWS. We're seeing AWS capitulate, David Floyer you made the call. Got to have an on-prem strategy. Many said no, that'll never happen. They just want to sweep the floor. So that's a tip to the hybrid cap. What are your thoughts on what's going on there? How does HPE sort of participate in those trends? >> I'd say it's, instead of battle and capitulate, we've been very laser-focused on the customers and helping them, along their way, on the journey. So you see a lot of acquisitions we've done around services, advisory services.
CTP is a perfect example. So CTP has a whole cadre of experts who understand Azure, who understand ECS and all the services and functions that go along with them. And we're able to help people, right size, right place, whatever you want to call it, within their infrastructure. Because we know, we've been in business for 75+ years and have a very loyal customer base, and we're going to help them along their maturity curve, and certainly everyone's not on the same path, in the same race. It's been pretty successful so far. >> You guys tend to connect the dots between your HPE Discover in the U.S., in Las Vegas, and HPE Discover in December. So June to December, you're on these six-month cycles, U.S. focus and Europe focus, December's in Madrid, again. Second year of Madrid. U.S. is always Vegas, like most of these conferences. What's the cadence that you're on? What was the vibe like at Discover? What should we expect leading up to Q4, calendar Q4 in Madrid? >> I'd say that Discover was a big success in Vegas, always fun to spend time here. In Madrid, you'll see a focus around the value part of our business. So we've been growing in automation, we talked about hybrid IT, certainly the Core around storage. We're really focusing and very heavily invested in, not just storage, but intelligent data management. So we really feel that our offerings, especially doubling down and offering more services around InfoSight and some of those predictive and Cloud-ready user stories for our customers, is something that definitely differentiates us in the market. So we'll be very focused on the data plane, the data layer, and helping customers transform in that area. >> So let's talk some tenor sax. >> (David laughs) >> This is not New Orleans. When we were down in New Orleans, we were at VeeamON, I think you had your sax with you, you jumped in. >> That's right, I played with the Soul Rebels. >> Playing with the Soul Rebels, you were awesome. Leonard, a big jazz man. Love it.
I'm a huge TOP fan. What's new in that world? Are you still active? Are you still playing? >> Yeah, the band's still playing. Shout out to my buddies in Jolpe, sitting in with some friends at a Dead cover band coming up, in a couple weeks. So, should be fun. We're going to reenact The Grateful Dead and Branford Marsalis. >> That's wonderful. >> It should be fun. >> We've been getting a big dose of hip-hop this week. >> Yeah. But the new thing is that, in hip-hop, it's getting back to its original roots, so a lot of folks in the jazz world, collaborating with the folks in the hip-hop world, so not very commercial, definitely underground, but pretty cool. >> I love it. That's right Leonard, you pointing out Miles Davis was one of the first to make that transformation. >> Yeah >> Good call. >> I'm going to get the numbers wrong, but it's about five percent technique and 95 percent attitude. (multiple laughs) >> Jazz, like hip-hop, there's a lot of guys just doing their own thing. And somehow it all comes together. >> Absolutely. >> Okay Patrick, great to see you. >> Great to see you guys. Thank you Dave. Yeah, good to see you guys. >> Always a pleasure, go Sox. >> We got some time to talk stocks? >> Alright. >> What do you think? It's getting a little nerve-wracking. >> #Bucky Dent is trending on my Twitter. That's my problem, so hopefully we can..., I definitely don't want to be limping into the playoffs, and still not a fan of this one-team wild card playoff, but I think we'll be alright. >> If we go deep... It's a great time to be a Boston fan. >> Celtics. >> Football starting, Celtics are coming in November, so awesome. Great to see you man. >> Thanks for having me. >> Keep it right there everybody, we'll be right back with our next guest. You're watching the Cube, live. Day three at VMworld 2018, we'll be right back. (techno music)
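The InfoSight service Osborne describes feeds call-home telemetry through Kafka and Elasticsearch into predictive analytics. At its simplest, that idea reduces to flagging anomalous readings in a telemetry stream. A minimal, hypothetical sketch of that core idea — the thresholding model, field values, and function name are illustrative, not HPE's actual pipeline:

```python
from statistics import mean, stdev

def flag_anomalies(samples, window=5, threshold=3.0):
    """Flag readings that sit more than `threshold` standard deviations
    from the mean of the preceding `window` readings.

    A toy stand-in for the kind of predictive analytics a call-home
    telemetry service performs at scale; real pipelines use streaming
    systems (e.g. Kafka) and learned models rather than a z-score."""
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(samples[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies

# Steady latency readings (ms) with one spike injected at index 8.
readings = [5.0, 5.1, 4.9, 5.2, 5.0, 5.1, 4.8, 5.0, 50.0, 5.1]
print(flag_anomalies(readings))  # → [8]
```

The point of the sketch is only the shape of the problem: per-device streams, a rolling baseline, and an alert when a reading breaks it.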
Chris O'Brien, Cisco & Stefan Renner, Veeam | VMworld 2018
>> Live, from Las Vegas. It's theCUBE! Covering VMworld 2018. Brought to you by VMware and its ecosystem partners. >> Hello, everyone, welcome back to theCUBE's live coverage, here in Las Vegas, for VMworld 2018, with Day Three of three days of wall-to-wall coverage, two sets. Our ninth year of covering VMworld, we're going to have like 96 interviews, a lot of content happening, a lot of updates from the entrepreneurs, from the executives, and also the partnerships. In this segment we're going to be talking Cisco and Veeam. We've got Stefan Renner, who's the technical director of Global Alliances for Veeam, and Chris O'Brien, Technical Marketing Director at Cisco. Programmable networks, easy-to-use backup and restore, disaster recovery, all that great stuff. >> You guys just get here from Omnia? (laughing) >> Welcome to theCUBE. >> It's a good party. >> Thank you. >> Thanks for havin' us. >> Do we look like that? (laughing) >> I feel like that. (laughing) >> You know, you guys have been very successful on the Veeam side. We had Peter McKay, the co-CEO, on yesterday. Cisco has been very active and relevant in programmable DevOps, or DevNetOps, as it's been called. So the need to make things programmable and easy is a nice combination. You guys have a partnership. How is the Cisco/Veeam partnership going, how did it start? Take a minute to explain how it all came together, and what's the current situation of the partnership? >> Well, I think from a Cisco perspective, the partnership is going great, fantastic. They were Partner of the Year. What we're hearing from our customers is they want us to solve some of their problems around how do they scale and manage their data, right? I'm from the UCS Business Unit. We see an opportunity for us there. UCS was built on programmability, right? We have the APIs, we have those capabilities.
We started out with Veeam a few, I guess 18 months ago, maybe two years ago, really focusing on some solutions around our HyperFlex platform, and we released a number of validated designs. When we do these validated designs, it's not just Cisco doing the work. We're in the labs together, we're developing the solutions. >> With Veeam. >> With Veeam. All the engineering efforts, and then obviously, as you go through and you grow that solution, you really see an opportunity where you can enhance the solution. So things like automation, we want to bring that to the table, certainly, with our partner. >> And what's your contribution on this? Obviously, Veeam's role in the solution. Are you guys doing joint validations, or joint engineering? Talk about the integration piece with Cisco, why it's important. >> If you look back, maybe it's two years, right? I took on Veeam actually three years ago, three-and-a-half years ago, and that's when we really started to kick off the thing with Cisco. So it's a bit more than two years, I would say it's three years, right? But in those days, a couple of years back, it was more about finding the right data protection platform that we could host Veeam on. Meaning a backup server, right? And in those days, it was more about backup and recovery. Well, today we talk about hyper-availability. It's not only about backing up stuff or recovering stuff, it's about providing the whole platform, the whole orchestration layer, for data availability. Back in those times, three years ago, it was about finding an S3260 or a C240 server from Cisco which fit exactly the needs we had for Veeam to run on, right? But over the last 24 months, since Cisco really started HyperFlex and going into hyperconvergence, we've partnered with them to make sure we have the right data protection for this kind of solution. That's what you just talked about, talking about integrations.
We really invested a lot of time and effort on both sides; it's not only Veeam development, it's also Cisco development, integrating into HyperFlex, to make sure we can provide the right data protection for what the customer's needs are. >> So talk about the high availability, I just want to talk about that for a second, 'cause I think this really highlights one, the relationship, and the desire in the market for realtime data, whether it's for developers, or for applications, to integrate. High availability is about having data available and integrating into whatever that would be, whether it's a mishmash of application development, and routing across networks. This is a huge deal, this is not like a punchline. High availability used to be, oh, we have a data center where it's fault tolerant. There's a whole new level that that's going to. Can you just talk about what that means, because backing it up and making it available means something different now. >> Yeah. >> Talk about that. >> I do agree, because again, looking back, it was really about backing up and recovering stuff. If I look back a couple of years, customers were looking for a solution that was able to pull the VM out of the vSphere data center, make sure it's stored somewhere, and that they can get it back once it's deleted, right? >> Check. >> But now, if you look at VMworld, right, here at VMworld, it's all about automation, it's about APIs being there, so I can integrate this data protection platform into my centralized management interfaces, making sure I have an orchestration layer on top of it. So it's not only about backup and recovery anymore, it's about the whole stack from end to end, right? Getting data from A to Z, maybe getting it offsite to S3 storage for long-term retention. So, we really went from an on-premises, very small kind of solution stack to a big solution stack, going from a VM into the cloud, and overlaying that stuff.
>> Stefan, I want you to comment on this, and of course I want to get your take as well. Talk about the time aspect of it, because you mentioned, okay, I can get it back, okay, got to get the data back. When you talk about making data available, the time series or the timeframe is critical, in some cases, latency, nanoseconds, milliseconds. This is the new normal; you guys got to make that happen. Talk about that dynamic, are customers really doing that, obviously they want it, but what are some of the examples? >> No, they are, they are. In terms of speed, like in data protection and availability, if I talk about speed I really talk about SLAs, and the RTOs, and the RPOs. So how often do I back up, how often do I have a recovery point, that's what you just talked about, and how fast can I get a data application back once it's gone, or once it's deleted, or once an issue is discovered in the data center. Again, over the last couple of years, that really evolved, because in the early days customers said, you know, I want to have that, but it's a luxury, right, I don't want to pay for it, it's too expensive, I can't afford that. But looking at these days, and today, even at the conference, you talk to customers that say, I need it, it's critical, I cannot live a second without my data. So these kinds of RTO requirements, they really went down from maybe a day, which was usual ten years back, to like five minutes, ten minutes, fifteen minutes, right now. That's maybe the maximum you can really afford as a customer, and that's where the integration part comes in, and all the stuff we do with Cisco, because with integration we can actually make sure that we can cover that, and get data back in ten minutes. >> So we're really talking about a whole new way of delivering infrastructure. If I go back to the early days of UCS and converged infrastructure, yeah, we can support a thousand VMs, and they're like, how are you going to back a thousand VMs up?
And they're like, uhhhhh, well, let's see, we're workin' on that. Today, you've taken this platform approach; it's a fundamental part of cloud, developer, DevOps, and so I wonder if you can talk about, you know, when we were at Cisco Live, the DevNet area was one of the most exciting parts of the show. And if you think about traditional enterprise companies, really, not many, I think maybe one, has really done a good job with developers, and it's Cisco. So where do developers play, is this a platform play, really, for cloud and hybrid infrastructure? I wonder if you can talk about that, the role of developers, and how you're approaching this mindset. >> Yeah, I think from our perspective, there's no downtime window, there's no scheduled windows of downtime, right? >> It's not allowed. >> We don't have that anymore. The way that we look at our infrastructure, we certainly want it to be robust, to address latencies, issues and concerns, and what we're doing with Veeam is really tweaking that infrastructure to make that data available when it's called on, so you can consume it as a developer, as a part of the DevOps team. All of our infrastructure, as you guys probably know, are all open systems, all policy-based models. So with these APIs being available, it allows developers to consume more; if they need to scale out these infrastructures quickly, we can do it. We're certainly playing in the DevNet space, it's growing, we have our own separate conferences. >> The network becomes more and more important, every day, I mean, at a whole 'nother level. Talk about programmability, you got to be ready for anything Veeam wants to do with you, or whatever the customer wants with respect to high availability. >> Yeah. >> And as the definition changes, you got to be enabling that. >> Totally available if you can get to it through the network. (John laughing) And we certainly carry that all the way through the UCS fabric.
Talk about Veeam strategy, because I think there's a general perception that, oh, Veeam does backup for small- and medium-sized business, that's Veeam. And we had Peter McKay on yesterday, he said, "A third of our business is SMB, a third is commercial, a third is enterprise," number one. Number two is, you guys are getting into the orchestration and management for data availability. Can you talk about the extension of Veeam, in that regard? >> I want to actually grab onto your number, because we talked about, oh, we've got a thousand VMs that need to be backed up and recovered. That was a couple of years back. Today, we talk more about ten thousand VMs. Customers actually here at the booth, I talked to a customer that talked about ten thousand to twenty thousand VMs that need to be available. Now I would call a customer that hosts ten thousand VMs no longer an SMB customer, right? That's more of the enterprise, and you're right, and I guess Peter McKay said the same. I didn't actually watch the video, so hopefully I'm in line with him, but he's really, for sure, going into the enterprise, making sure the products actually fit the enterprise's needs. Talking about the orchestration piece, I mentioned before, Veeam Availability Orchestrator, which we recently announced and released, that's certainly a step into the enterprise market, because an SMB customer, even a mid-range customer, they will not invest in an orchestration layer that provides the full capabilities of failover to secondary data centers, and all that stuff. That's certainly an enterprise play, and that's also where the company's heading, making sure we still have the right fit for the SMB customers and mid-range customers, because I think they are still important to the business, right? I'm not saying they're unimportant. But also having the right products, and the scale. And I think scale is actually something we're going to talk about anyway, in this conversation.
The right scale, to even cover that customer, ten thousand VMs, twenty thousand VMs, they are approaching us. >> I think the other big trend that we see, and I wonder if you guys could comment, is, again, data protection, backup, used to be an afterthought, and it also used to be kind of a one-size-fits-all. So that'd mean, almost by definition, you're either under-protected or over-protected, spending too much, or too little. Today you're offering much more granularity, and the like; it's a fundamental component of the platform that you're developing, and it's extending beyond just backup. Call it data protection, there's a security component, there's a DevOps and cloud piece, there's a management piece. Maybe you guys could give us your perspectives on those trends. >> Yeah, so a short comment on that one. Actually, in each and every one of my sessions I speak at here, I always say, once you're considering replacing your storage system, or your vSphere environment, or you're considering using HCI, make sure you include data protection immediately, on Day One of your project. Because, you're completely right, the last year or so, even still now, a lot of customers I go to, they tell me, oh, I replaced all my infrastructure in the last 6 months, 8 months, and now I want the data protection. Then I get in and I say, yeah, unfortunately, what you did on your infrastructure is completely wrong for the expectations and the requirements you have in data protection. So that's exactly what we talk about: you need to bring together those projects and make sure you bring them under one hood, and talk about this from Day One. Otherwise, you might go in the wrong direction. >> Yeah, that whole-house view of the world. >> I think, from a Cisco perspective, we really look at, we're unifying the data; we know what your intentions are, your intentions are production apps, your intentions are data protection.
I think through ACI we can certainly create the application profiles to make that happen. We carry that through our fabric with the UCS system, so for us, we see ourselves as flexible enough to deliver all these options; obviously there's some improvements that we can bring, you know, we were talkin' earlier. But that's part of the roadmap, and part of the way we want to go with Veeam. >> I think one of the things I'm impressed with Cisco about, looking at the analysis, is that the network guys have always had the keys to the kingdom. You go back to IT, you go back twenty years, if you were a network guy, you ran the show. And you had storage guys come in, they became that same kind of tier, but the network was running everything, everything was sacred. Couldn't let the network go down. It ran offices, it ran branches. And then, when the cloud came, the network now with Cloud Native, and some of the stuff going on up the stack, makes networking skills, people who think like a networking guy, really valuable, because the data needs to be networked. So, the data's now at the application, that's where the security is, so as you guys at Veeam have needs, you're moving data around, you need more from Cisco, and you're going to be better for them, so this is a nice dynamic. >> We're trying to instrument it so we understand what their needs are. If you look at AppDynamics, if you look at Tetration, all these things give us more and more visibility to make the right decisions, and hopefully those will all be automated down the road so we can move as fast as the business wants to. >> Well, and I think of things, you know, people talk about air gaps for ransomware, but you need more than air gaps, you need analytics that identify anomalous behavior, and the corpus of backup data has all the data there, and if you can figure out how to analyze it, you're going to have a leg up.
>> As you said, that's actually a good point, because ransomware, and all that stuff, like Tetration, your project to analyze the network traffic and make sure I actually get informed, or I take an action, once I identify ransomware attacks, that's something that we can partner up on. Because it would literally mean that if Cisco identifies an attack, right, they can automatically trigger a backup or a snapshot backup of the data, to make sure we actually have a backup right before the attack happens. So you can see a chain of activities and potential new products, or go-to-markets, in the next couple of months and years. >> A lot of opportunities. >> Because there is a lot of stuff, and a lot of potential, behind those technologies. >> And there's clear visibility from a customer standpoint, that we would report here on theCUBE, that's lookin' at nanosecs and things of that nature, right at the application, whether it's a V-map, or other things. Security and data have to be centric around the app; it decouples from the network so that you're not bumping into each other, you're helping each other, you're more effective. You help them, you guys help each other. This is the new stack model, this is the way it's going. >> I would say that's what alliances are all about, right? (laughing) It's why we have an alliance business, right, because no one, neither Cisco nor us, could do it on our own; we always need a partner to do that. >> Guys, thanks for comin' on and sharing the partnership news. I really think, and Alan Cohen, our CUBE guest this week, said, partnerships used to be a tennis match, now it's like soccer, a lot of things going on, multiple players, certainly you know that, Cisco's been doin' a lot of that for a while. Great stuff, thanks for coming on. Final question for you guys, big takeaways from VMworld 2018 this year. Comment, what's your thoughts, third day now, lookin' back, what's the theme here, what's the big story that people need to know about?
Just from my experience, I've had a lot of conversations around security, and bringing it to our solution, more embedded within. I'm part of the Validated Design Program, and, at least in the conversations that I've had on the floor here, what they're asking for has really been about showcasing some of the other aspects of Cisco, what we can bring from a security perspective to protect the data. I'm certainly bringing that home. >> Awesome. >> And what are you seeing? >> I can just continue what he said, because most of the conversations I had are around scalability and still the data growth. We've been talking about that the last couple of years, but the more data you have, and the more VMs you have, the more challenging it is to protect it. It's all about scalability and making sure you can really cover and fulfill your needs. >> Well, congratulations on your success at Veeam, the numbers don't lie. You guys are doing very well. >> Thank you. >> Congratulations on Cisco, you guys have a clear line of sight on what you guys want to do with the network. >> Thanks. >> It's great to see, thanks for comin' on. Appreciate it. >> Thank you. CUBE coverage here, live, in Las Vegas. From VMworld 2018, it's theCUBE. I'm John Furrier with Dave Vellante. Stay with us, more Day Three coverage after this short break. (techno music)
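Renner's RPO/RTO framing lends itself to a simple check: a recovery point objective is met only if no gap between successive restore points exceeds it. A hypothetical sketch of that check — the timestamps and the ten-minute figure echo the numbers discussed in the interview, and none of this is a Veeam API:

```python
from datetime import datetime, timedelta

def worst_rpo_gap(restore_points):
    """Return the largest gap between consecutive restore points.

    If that gap exceeds the RPO (e.g. the ten-minute figure discussed
    above), some window of data could be lost on failover."""
    points = sorted(restore_points)
    return max((b - a for a, b in zip(points, points[1:])),
               default=timedelta(0))

backups = [
    datetime(2018, 8, 29, 9, 0),
    datetime(2018, 8, 29, 9, 10),
    datetime(2018, 8, 29, 9, 35),  # a missed cycle widens the gap
]
gap = worst_rpo_gap(backups)
print(gap, gap <= timedelta(minutes=10))  # → 0:25:00 False
```

The same shape works for RTO: instead of gaps between restore points, you would measure elapsed time from failure detection to restored service and compare it against the objective.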