IBM DataOps in Action Panel | IBM DataOps 2020
from the CUBE Studios in Palo Alto and in Boston, connecting with thought leaders all around the world, this is a CUBE Conversation.

Hi everybody, welcome to this special CUBE digital event where we're focusing in on DataOps, DataOps in Action, with generous support from our friends at IBM. Let me set up the situation here. There's a real problem going on in the industry, and that's that people are not getting the most out of their data. Data is plentiful, but insights perhaps aren't. What's the reason for that? Well, it's really a pretty complicated situation for a lot of organizations. There are data silos; there are challenges with skill sets and a lack of skills; there are tons of tools out there, a sort of tool sprawl; the data pipeline is not automated; and the business lines oftentimes don't feel as though they own the data, so that creates some real concerns around data quality and a lot of finger-pointing. The opportunity here is to really operationalize the data pipeline, infuse AI into that equation, and really attack the cost-cutting and revenue-generation opportunities that are in front of you. Think about this: virtually every application this decade is going to be infused with AI, and if it's not, it's not going to be competitive. And so we have organized a panel of great practitioners to really dig into these issues.

First I want to introduce Victoria Stassi, who's an industry expert and data leader at Northwestern Mutual. Victoria, great to see you again, thanks for coming on. (Excellent, nice to see you as well.) And Caitlin Alfre is the director of IBM's AI Accelerator and also part of the Chief Data Officer organization at IBM, which has actually eaten its own cooking here; dogfooding, let me say it that way. Caitlin, great to see you again. And Steve Lueck, good to see you again, senior vice president and director of data management at Associated Bank. Steve, thanks for coming on. (Thanks, Dave, glad to be here.)

All right guys, so you heard my setup in terms of operationalizing: data is wonderful, insights aren't, and getting insight in real time is critical in this decade. Each of you has a sense as to where you are on that journey. Victoria, let's start with you, because you're brand new to Northwestern Mutual, but you have a lot of deep expertise in health care, manufacturing, and financial services. Where do you see the general industry climate? And we'll talk about the journeys that you are on, both personally and professionally.

Sure. I think right now the biggest thing is you need to have speed to insight. As I've experienced going through many organizations, they're all facing the same challenges today, and a lot of those challenges are: where does my data live? Is my data trusted? Meaning, has it been curated, has it been cleansed, is it qualified, is it ready to use? What we see often happen is that businesses know their KPIs, they know their business metrics, but they can't find where that data lives. Or there's abundant, disparate data all over the place, and it's replicated because it's not well managed. A lot of what governance, and the platforms and tools that governance brings to bear, offers organizations is just that piece of it: I can tell you where data is, I can tell you what's trusted. When you can quickly access information and bring back answers to business questions, that is one answer, not many answers leaving the business to question what's the right path, which is the correct answer, which way do I go. At the executive level, that's the biggest challenge.
Where we want the industry to go moving forward is, one, breaking that down and allowing that information to be published quickly, and two, enabling data virtualization. A lot of what you see today in most businesses is that it takes time to build out large warehouses at an enterprise level. We need to pivot quicker, so a lot of what we're doing is leaning businesses toward taking advantage of data virtualization: allowing them to connect to these data sources to bring that information back quickly, so they don't have to replicate it across different systems or different applications, and then being able to provide those answers back quickly, also allowing seamless access for the analysts that are running at full speed, trying to find answers as quickly as they can.
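Victoria's data-virtualization point is easy to see in a small sketch. The idea is that a query federates across live source systems at request time instead of copying their contents into a warehouse first. This is a minimal illustration in Python, not any particular virtualization product: the two in-memory SQLite databases stand in for real source systems, and the table and column names are invented for the example.

```python
# A minimal sketch of data virtualization: query two live source systems
# in place and join the results on demand, instead of replicating either
# data set into a warehouse first. The in-memory SQLite databases stand
# in for real source systems; all names here are invented.
import sqlite3
import pandas as pd

core_banking = sqlite3.connect(":memory:")  # stand-in for source system A
crm = sqlite3.connect(":memory:")           # stand-in for source system B

core_banking.execute("CREATE TABLE accounts (client_id INTEGER, balance REAL)")
core_banking.executemany("INSERT INTO accounts VALUES (?, ?)",
                         [(1, 2500.0), (2, 980.5)])
crm.execute("CREATE TABLE clients (client_id INTEGER, segment TEXT)")
crm.executemany("INSERT INTO clients VALUES (?, ?)",
                [(1, "retail"), (2, "business")])
core_banking.commit()
crm.commit()

def virtual_view() -> pd.DataFrame:
    """Federate both sources at query time; nothing is copied to a warehouse."""
    accounts = pd.read_sql_query("SELECT client_id, balance FROM accounts", core_banking)
    clients = pd.read_sql_query("SELECT client_id, segment FROM clients", crm)
    return accounts.merge(clients, on="client_id")

print(virtual_view())
```

A real virtualization layer would push the joins and filters down to the sources; the point of the sketch is only that the consumer sees one view while the data stays where it lives.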
Great, okay, and I want to get into that. Steve, let me go to you. One of the things that we talked about earlier was just infusing this mindset of a data culture, and thinking about data as a service. So talk a little bit about how you got started; what was the starting point on that journey?

Sure. I think the biggest thing for us was to change that mindset from data being just for reporting, for insights on things that have happened in the past, on data that already existed. What we've tried to do is shift the mentality to start to use data in our actual applications, so that we're providing those insights in real time through the applications as they're consumed: helping with customer experience, helping with personalization and optimization of our applications. The way we've started down that path, or the journey that we're still on, was to get the foundation laid first. Part of that has been making sure we have access to all that data, whether it's through virtualization, like Vic talked about, or whether it's through having more of the data landed in a data lake, so that we have all of that foundational data available as opposed to waiting for people to ask for it. That's been the biggest culture shift for us: having that availability of data, ready to provide those insights, as opposed to having to make the business or the application ask for that data.

Okay. Caitlin, when I first met Inderpal Bhandari, IBM's global chief data officer, I was asking him, what's the role of the CDO? And he mentioned a number of things, but two of the things that stood out were: you've got to understand how data affects the monetization of your company, and that doesn't mean selling the data; what role does it play in helping cut cost, or increase revenue, or drive productivity, or improve customer service, and so on. The other thing he said was you've got to align with the lines of business. It all sounded good, and this was several years ago, and IBM took it upon itself to drink its own champagne, or, I was gonna say, dogfooding, whatever you want to call it. But it's not easy to just flip a switch, infuse AI, and automate the data pipeline. You guys had to go through some real pain to get there, and you did; you were early on, you took some arrows, and now you're helping your customers benefit from that. So talk about some of the use cases where you've applied this; obviously you're one of the biggest organizations in the world, so what were the real challenges there?

Sure, I'm happy to. You know, we've been on this journey for about four years now. We stood up our first chief data office in 2016, and you're right: it was all about getting our data strategy authored and executed internally, and we wanted to be very transparent, because, as you've mentioned, there were a lot of challenges; we had to think differently about the value. As we wrote that data strategy at that time for the enterprise, we quickly pivoted to see the real opportunity and value of infusing AI across all of our workflows. To your question on a couple of specific use cases: I'd say we invested that time getting that platform built and implemented, and then we were able to take advantage of it. One particular example that I've been really excited about: I have a practitioner on my team who's a supply chain expert, and a couple of years ago he started building out a supply chain solution so that we could better mitigate our risk in the event of a natural disaster, like an earthquake or hurricane, anywhere around the world. And because we had invested the time in getting the data pipelines right, getting all of that data curated and cleansed and the quality of it right, we were able, in recent weeks, to add the really critical COVID-19 data, deliver that out to our employees internally for their preparation purposes, make it available to our nonprofit partners, and now we're starting to see our first customers take advantage of it too, with the health and well-being of their employees in mind. So that's an example, and I'm seeing it with a lot of the clients I work with: they invest in the data and AI readiness, and then they're able to take advantage of all of that work very quickly, in an agile fashion, and spin those solutions up.

Well, I think one of the keys there, Caitlin, is that we can talk about that in a COVID-19 context, but it's going to carry through: that notion of business resiliency is going to live on in this post-pandemic world, isn't it?

Absolutely. I think for all of us, the importance of investing in business continuity and resiliency type work, so that we know what to do in the event of either a natural disaster or something beyond, will be grounded in that, and I think it will only become more important for us to be able to act quickly. The investment in those platforms and the approach that we're taking, and that I see many of us taking, will really be grounded in that resiliency.

So Vic and Steve, I want to dig into this a little bit, because we use this concept of DataOps; we're stealing from DevOps, and there are similarities, but there are also differences. Let's talk about the data pipeline. If you think about the data pipeline as a sort of quasi-linear process where you're ingesting data, and you might be using tools, whether it's Kafka or whatever your favorite tool is, and then you're transforming that data, and then you've got discovery, you've got to do some exploration, you've got to figure out your metadata catalog, and then you're trying to analyze that data to get some insights, and then ultimately you want to operationalize it. You could come up with your own data pipeline, but generally that concept is, I think, well accepted. Now, there are different roles, and unlike DevOps, where it might be the same developer who's implementing security policies and handling the operations, in DataOps there might be different roles, and in fact very often there are: there's data science, there's maybe an IT role, there's data engineering, there's analysts, and so on. So Vic, I wonder if you could talk about the challenges in managing and automating that data pipeline, applying DataOps, and how practitioners can overcome them?
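Before the answer, it's worth pinning down the pipeline Dave just sketched. A minimal way to see the stages and hand-offs is to write each one as a plain function: ingest, transform, catalog, analyze, operationalize. Everything here is illustrative; the stage bodies are trivial stand-ins for what would be Kafka consumers, quality engines, catalogs, and model scoring in a real shop.

```python
# A sketch of the quasi-linear pipeline described above. Each stage is a
# plain function so the hand-offs between roles (engineer, steward,
# scientist, analyst) are explicit. All names and logic are invented.
import pandas as pd

def ingest() -> pd.DataFrame:
    # Stand-in for the ingestion stage (Kafka topic, file drop, CDC feed, ...)
    return pd.DataFrame({"client_id": [1, 2, 2, 3],
                         "monthly_spend": [100.0, None, 75.0, 42.0]})

def transform(df: pd.DataFrame) -> pd.DataFrame:
    # Data engineering: cleanse and deduplicate
    return df.dropna().drop_duplicates(subset="client_id")

catalog = {}  # stand-in for the metadata catalog a steward would maintain

def register(df: pd.DataFrame, name: str) -> pd.DataFrame:
    # Discovery/governance: record what exists so analysts can find it
    catalog[name] = {"columns": list(df.columns), "rows": len(df)}
    return df

def analyze(df: pd.DataFrame) -> float:
    # Data science / analyst stage
    return float(df["monthly_spend"].mean())

def operationalize(insight: float) -> None:
    # Feed the insight back into a live application
    print(f"avg monthly spend pushed to app: {insight:.2f}")

clean = register(transform(ingest()), "clean_client_spend")
operationalize(analyze(clean))
```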
Yeah. A perfect example would be a client I was just recently working for, where we built up a team using agile methodologies as the framework, rapidly ingesting data and then proving out that the data is fit for purpose. We talk a lot about big data, and that's really where a lot of industries are going: they're trying to add enrichment to their own data sources, so they're purchasing third-party data sets. In doing so, you make that initial purchase, but many companies today have no real way to vet that data. They'll purchase the information, they won't vet it up front, they'll bring it into an environment, and it's going to take them time to understand whether the data is of quality or not, and by the time they do, typically the sale is done, and they're not going to get anything back. In the most recent case, we were able to take an unstructured data source, bring it in and ingest it with modelers using this agile team, and within two weeks we were able to bring the data in from the third-party vendor and do what we considered rapid prototyping: profile the data, understand whether the data was of quality or not, and quickly figure out that, you know what, it wasn't. In doing that, we were able to go back to the vendor and tell them, sorry, the data set isn't up to snuff, we'd like our money back, we're not going forward with it. That's enabling businesses to be smarter with the third-party purchases they're making today, because as much as businesses want to rely on their own data, they also want to rely on and cross-reference data from third-party sources, and that's really what DataOps is allowing us to do. It's allowing us to think at a broader, higher level: how do we bring the information in, and what structures can we store it in so that it doesn't necessarily have to be modeled first? A modeler is great, but if we have to take time to model all the information before we even know we want to use it, that's going to slow the process down, and that's slowing the business down. The business is looking for us to speed up all of our processes. A lot of what we heard in the past is that IT tends to slow us down, and that's where we're trying to change the perception in the industry: no, we're actually here to speed you up; we have all the tools and technologies to do so, and they're only getting better.

I would also point to data scientists; that's another piece of the pie for us. If we can bring the information in, quickly catalog it, with the metadata and the back-end data assets inventoried, and then supply that information back to the scientists, gone are the days when scientists are asking for connections to all these different data sources, waiting days for access requests to be approved, just to find out, once they've figured out what the relationship diagram and the design look like in that back-end database, how to get to it, and written the code to get to it, that this is not the information they need, that Sally next to them pointed them to the wrong information. That's where the catalog comes in; that's where DataOps and data governance come in, having that catalog, that metadata management platform, available to you.
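The rapid fit-for-purpose vetting Victoria describes boils down to profiling a candidate data set against acceptance thresholds before committing to it. Here's a hedged sketch of that idea in Python with pandas; the column names, thresholds, and the tiny sample file are invented for illustration, not the client's actual criteria.

```python
# A sketch of rapid fit-for-purpose vetting: profile a newly purchased
# third-party file against acceptance thresholds before committing to it.
import pandas as pd

def profile(df: pd.DataFrame) -> pd.DataFrame:
    # Per-column completeness and uniqueness, the basics of a profiling run
    return pd.DataFrame({
        "pct_null": df.isna().mean(),
        "pct_unique": df.nunique() / max(len(df), 1),
    })

def fit_for_purpose(df: pd.DataFrame, key: str,
                    max_null: float = 0.05, min_key_unique: float = 0.99) -> bool:
    p = profile(df)
    return p["pct_null"].max() <= max_null and p.loc[key, "pct_unique"] >= min_key_unique

vendor_file = pd.DataFrame({
    "client_key": ["a1", "a2", "a2", None],      # duplicated and missing keys
    "income": [50_000.0, None, None, 72_000.0],  # heavy nulls
})

print(profile(vendor_file))
print("accept purchase" if fit_for_purpose(vendor_file, key="client_key")
      else "go back to the vendor")
```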
They can go into a catalog without having to request access to anything, and within five minutes they can see the structures, what the tables look like, what the fields look like; are these the metrics I need to bring back answers to the business? That's DataOps: it's allowing us to speed all of that up, taking stuff that took months down to weeks, down to days, down to hours.

So Steve, I wonder if you could pick up on that and just help us understand what DataOps means to you. We talked about it earlier in our previous conversation, and I mentioned it up front: this notion that the demand for data access was through the roof, and you've gone from that to more of a self-service environment, where it's not IT owning the data, it's really the businesses owning the data. What does all this DataOps stuff mean in your world?

Sure, I think it's very similar. It's how do we enable and get access to that data quicker, with the right controls and the right processes, and building that scalability and agility into all of it so that we're doing this at scale. It's much more rapidly available; we can discover new data sets and determine if they're right, or, more importantly, if they're wrong, similar to what Vic described. It's how do we enable the business to make those right decisions on whether or not they're going down the right path. The catalog is a big part of that. We've also introduced a lot of frameworks around scale, so just the ability to rapidly ingest data and make it available has been key for us. We've also focused on a prototyping environment, that sandbox mentality: how do we rapidly stand those up for users, still provide some controls, but give people the ability to do that exploration? What we're finding is that by providing the platform and the foundational layers, we're getting the use cases to evolve and come out of that, as opposed to having the use cases defined first and then going to build things from them. We're shifting the mentality within the organization to say, we don't know what we need yet; let's start to explore. That's kind of that data-scientist mentality and culture; it's more of a way of thinking, as opposed to an actual project or implementation.

Well, I think that cultural aspect is important. Caitlin, you guys are an AI company, or at least that's part of what you do, but for decades, maybe a century, you've been organized around different things: by manufacturing plant, or sales channel, or whatever it is. How has the chief data officer organization within IBM been able to transform itself and really infuse a data culture across the entire company?

One of the approaches we've taken, and we talk about a blueprint to drive AI transformation so that we can achieve and deliver these really high-value use cases: we've talked about the data and the technology, which we've just touched on, but the organizational piece of it, the culture, is so important, the change management, enabling and equipping our data stewards. I'll give one specific example that I've been really excited about. When we were building our platform and starting to pull in data, structured and unstructured, our data stewards were spending a lot of time manually tagging and creating business metadata about that data, and we identified that as a real pain point, costing us a lot of money and valuable resources.
So we started to automate the metadata generation, in partnership with our deep-learning practitioners and some of the models they were able to build, and we pushed that capability out into our product last year. One of the really exciting things for me to see is that our data stewards, who bring so much value and expertise, have reported that it's really changed the way they're able to work; it's really sped up their process and enabled them to move on to higher-value activities and business benefits, so they're very happy from an organizational point of view. I think there are ways to identify those use cases: in this case we drove some significant productivity savings, and we also really empowered our data stewards, whom we really value, making their jobs easier and more efficient and helping them move on to things that they're more excited about doing. So that's another example of the approach we've taken.

Yes, so the cultural piece, the people piece, is key, and we talked a little bit about the process. I want to get a little bit into the tech. Steve, I wonder if you could tell us, what's the tech? We have this bevy of tools; I mentioned a number of them up front. You've got different data stores, you've got open-source tooling, you've got IBM tooling. What are the critical components of the technology that people should be thinking about in their architectures?

From an ingestion perspective, we're trying to do a lot in a Python framework, with scalable ingestion pipeline frameworks. On the catalog side, what we've done is gone with IBM Cloud Pak for Data, which provides a platform for a lot of these tools to stay integrated together: everything from the discovery of data sources, the cataloging, the documentation of those data sources, all the way through the actual advanced analytics, the Python models and R models, the open-source IDEs, combined with the ability to do some data prep and refinery work. Having that all in an integrated platform was key for us in rolling out more of these tools in bulk, as opposed to having point solutions; that's been a big focus area for us. And then on the analytics side, versus the IDE, there are a lot of different components you can go into, whether it's MuleSoft, whether it's AWS and some of the native functionality out there. You mentioned Kafka before, and Kinesis streams and different streaming technologies; those are all in the toolbox that we're starting to look at. One of the keys here is that we're trying to make decisions in as close to real time as possible, as opposed to the business having to wait weeks or months; by the time they get insights, it's late, and it's really rearview mirror.
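Steve's point about deciding in near real time, rather than in the rearview mirror, usually comes down to scoring events as they arrive on a stream. Below is a minimal sketch using the kafka-python client; the broker address, topic name, and the toy scoring rule are all assumptions for illustration, not Associated Bank's actual setup.

```python
# A sketch of near-real-time decisioning: score events as they arrive on
# a Kafka topic instead of waiting for a batch report. Broker, topic, and
# the scoring rule are hypothetical.
import json
from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "card-transactions",                   # hypothetical topic
    bootstrap_servers=["localhost:9092"],  # hypothetical broker
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

def score(txn: dict) -> str:
    # Stand-in for a real model: flag unusually large transactions
    return "review" if txn.get("amount", 0) > 10_000 else "ok"

for msg in consumer:
    txn = msg.value
    decision = score(txn)
    # In practice this would publish to a decision topic or call an app API
    print(f"txn {txn.get('id')}: {decision}")
```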
So Vic, your focus in your career has been a lot on data quality, governance, and master data management. From a data-quality standpoint, what are some of the key tools that you're familiar with, that you've used, that have really enabled you to operationalize that data pipeline?

I would say I definitely have the most experience with the IBM tools, but also Informatica; those to me are the two top players. IBM has definitely come to the table with a suite; like Steve said, Cloud Pak for Data is really a one-stop shop, so that's allowing quick, seamless access for the business user, versus some of the previous versions IBM had rolled out, where you were going into different user interfaces to find your information. That can become clunky; it can add to the process, and it can also leave almost a bad taste in most people's mouths, because they don't want to navigate from system to system to system just to get their information. So Cloud Pak, to me, definitely brings everything to the table in a one-stop-shop type of environment. Informatica is working on the same thing, and I would tell you that they haven't come up with a solution that really comes close to what IBM has done with Cloud Pak for Data; I'd be interested to see if they can bring that onto the horizon. But really, IBM's suite of tools allows for profiling, for analytics, for metadata management, access to Db2 Warehouse on Cloud; those are the tools that I've worked to implement in the past, as well as Cloud Object Storage, bringing all of that together to provide that one stop. At Northwestern, we're working right now with Collibra. I think Collibra is a great tool; it's a great governance catalog, but that's really what it's truly made for: it's a governance catalog. You have to bring some other pieces to the table in order for it to serve up all that Cloud Pak does today, which is the advanced profiling, the data virtualization that Cloud Pak enables, the machine learning at the level where you can actually work with R and Python code and put your notebooks inside of the Pak. Those are some of the pieces that are missing in some of the other vendors' tools today.

So one of the things that you're hearing here is the theme of openness. We've talked about a lot of tools, and not all IBM tools; there are many, but people want to use what they want to use. So Caitlin, from an IBM perspective, what's your commitment to openness, number one, but also to simplifying the experience for your clients?

Well, I thank Steve and Victoria for speaking to their experience; I really appreciate the feedback. Part of our approach has been to really take on the challenges that we've had ourselves. I mentioned some of the capabilities that we brought forward in our Cloud Pak for Data product, one being automating metadata generation, and that was something we had to solve for our own data challenges and needs. So we will continue to source our use cases from, and ground them in, a practitioner perspective of what we're trying to do and solve and build. And the approach we've really been taking is co-creation: we roll these capabilities out in the product and work with our customers, like Steve and like Victoria, to really solicit feedback, have our dev teams push that out, and be very open and transparent. We want to deliver a seamless experience, we want to do it in partnership, and we'll continue to solicit feedback, improve, and roll out. So I think that has been our approach and will continue to be, and I really appreciate the partnerships that we've been able to foster.

So we don't have a ton of time, but I want to go to the practitioners on the panel and ask you about key performance indicators. When I think about DevOps, one of the things we measure is the elapsed time to deploy applications, start to finish; we measure the amount of rework that has to be done, and the quality of the deliverable. What are the KPIs, Victoria, that are indicators of success in operationalizing the data pipeline?
Well, I would definitely say your ability to deliver quickly. How fast can you deliver; is that quicker than what you've been able to do in the past? What is the user experience like? Have you been able to measure the amount of time users were spending to bring information to the table in the past, versus whether you've been able to reduce that time to delivery of information, of business answers to business questions? Those are the key performance indicators to me that tell you that the suite we've put in place today is providing information quickly; I can get my business answers quicker than I could before, and the information is accurate. So it's also being able to measure: is what I've given back quality, or is it the wrong information, so that I've got to go back to the table and find where I need to gather it from somewhere else? That, to me, tells us: okay, with the tools we've put in place today, my teams are working quicker, they're answering the questions they need to accurately, and that is when we know we're on the right path.

Steve, anything you'd add to that?

I think she covered a lot of the key components. The one I'd add is around data-quality scoring: for all the different data attributes, coming up with a metric around how to measure quality, and then showing that trend over time to show that it's getting better. The other one that we're tracking is just around overall data availability: how much data are we providing to our users, and showing that trend. When I first started, we had somewhere in the neighborhood of 500 files that had been brought into the warehouse, published and available, in the neighborhood of a couple thousand fields; we've grown that to where we have thousands of tables now available. It's been hundreds of percent of scale as far as just the availability of that data: how much is out there, how much is ready and available for people to just dig in, put into their analytics and their models, and get those back into the applications. So that's another key metric that we're starting to track as well.
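The KPIs Victoria and Steve describe (time to answer, a quality score trended over time, and data availability) are simple to compute once you log the underlying events. A small sketch, with invented sample numbers:

```python
# DataOps KPIs from the discussion above: time-to-answer, quality trend,
# and catalog availability. All figures are invented for illustration.
import statistics
from datetime import date

# Elapsed days from business question asked to answer delivered
delivery_days = [14, 9, 6, 4, 3]
print("median time-to-answer (days):", statistics.median(delivery_days))

# Quality score = share of records passing their quality rules, per month
quality_trend = {date(2020, m, 1): s
                 for m, s in [(1, 0.72), (2, 0.83), (3, 0.91), (4, 0.97)]}
first, last = min(quality_trend), max(quality_trend)
print(f"quality moved {quality_trend[first]:.0%} to {quality_trend[last]:.0%}")

# Availability = datasets published and documented in the catalog
catalogued = {"2019": 500, "2020": 4200}
growth = catalogued["2020"] / catalogued["2019"] - 1
print(f"catalogued datasets up {growth:.0%}")
```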
So, last question. I said at the top that every application is going to need to be infused with AI this decade; otherwise that application is not going to be as competitive as it could be. For those that are maybe stuck in their journey and don't really know where to get started, I'll start with Caitlin, go to Victoria, and then, Steve, you can bring us home. What advice would you give to the people that need to get going on this?

My advice is to poll the folks that are either producing or accessing your data and figure out where the pain is. I mentioned some of the data-management challenges we were seeing; those processes were taking weeks, were prone to error, and were highly manual, so that was ripe for an AI project. Identify those use cases that are causing the most rework and manual effort; you can move really quickly, and as you build this platform out, you're able to spin those up in an accelerated fashion. By identifying that and figuring out the business impact you're able to drive very early on, you can get going and start really seeing the value.

Great. Yeah, I would say Caitlin hit it on the head, but I would add to that: first and foremost, in my opinion, the most important thing here is data governance. You need to implement data governance at an enterprise level. Many organizations will do it, but they'll have silos of governance. You really need an enterprise data-governance platform that consists of a true framework of an operating model: charters, data-domain owners, data-domain stewards, data custodians; all of that needs to be defined. And while that may take some work in the beginning, the payoff down the line is that much more. It's allowing your business to truly own the data, and once they own the data and take part in classifying the data assets for technologists and for analysts, you can start to eliminate some of the technical debt that most organizations have acquired today. They can start to look at which systems we can turn off, where we have redundancy, and truly build out a capability matrix: we can start mapping systems to capabilities and start to say, what can we get rid of? That's the first piece of it. The second piece is really leveraging the tools that are out there today, the IBM tools and some of the other tools as well, that enable some of the newer, next-generation capabilities like AI, allowing for automation, which for all of us means that a lot of the analysts in place today can access the information quicker and deliver the information accurately, like we've been talking about, because it's been classified; that pre-work has been done. It's never too late to start, but once you start, it really acts as a domino effect: you start to see everything else fall into place.

All right, thank you. And Steve, bring us home; your advice for your peers that want to get started.

Sure, and I think everything those two have talked about is valid and accurate. The thing I would add, from a starting perspective, is: if you haven't started, start. Don't try to overthink it, don't over-plan it; just do something and start to show that progress and value. The use cases will come, even if you think you're not there yet; it's amazing, once you have the foundational components there, how some of these things start to come out of the woodwork. So get it going, take an iterative approach, and keep an open mindset; encourage exploration and enablement. Look your organization in the eye and ask: why are there silos, why do things look like this, what are our problems, what are the things getting in our way? Focus on and tackle those areas, as opposed to putting up more rails and more boundaries that encourage that silo mentality; really look at how you focus on enablement. And then the last comment would just be on scale: everything should be focused on scale. What you think is a one-time process today, you're going to do again; we've all been there. You're going to do it a thousand times, so prepare for that, prepare to do everything a thousand times, and start to instill that culture within your organization.

Great advice, guys: bringing machine intelligence and AI to data to really drive insights, and scaling with a cloud operating model no matter where that data lives. It's really great to have three such knowledgeable practitioners. Caitlin, Victoria, and Steve, thanks so much for coming on theCUBE and helping support this panel.
All right, and thank you for watching, everybody. Remember, this panel was part of the raw material that went into a crowd chat that we hosted on May 27th at crowdchat.net/dataops, so go check that out. This is Dave Volante for theCUBE; thanks for watching.
Itumeleng Monale, Standard Bank | IBM DataOps 2020
from the CUBE Studios in Palo Alto and in Boston, connecting with thought leaders all around the world, this is a CUBE Conversation.

Hi everybody, welcome back to theCUBE. This is Dave Volante, and you're watching a special presentation, DataOps in Action, made possible by IBM. You know what's happening: the innovation engine in the IT economy has really shifted. It used to be Moore's Law; today it's applying machine intelligence and AI to data, really scaling that, and operationalizing that new knowledge. The challenge is that it's not so easy to operationalize AI and infuse it into the data pipeline, but what we're doing in this program is bringing in practitioners who have actually had a great deal of success in doing just that, and I'm really excited to have Itumeleng Monale here. She's the executive head of data management for personal and business banking at Standard Bank of South Africa. Itumeleng, thanks so much for coming on theCUBE. (Thank you for having me, Dave.) You're very welcome. First of all, how are you holding up with this COVID situation? How are things in Johannesburg?

Things in Johannesburg are fine. We've been on lockdown now, I think it's day 33, if I'm not mistaken; I've lost count. But we're really grateful for the swift action of government. We have less than 4,000 cases in the country, and the infection rate is really slow, so we've really, I think, been able to flatten the curve, and we're grateful for being able to be protected in this way. We're all working from home, learning the new normal.

And we're all in this together. That's great to hear. Why don't you tell us a little bit about your role? You're a data person; we're really going to get into it, but tell us how you spend your time.

Okay, well, I head up a data operations function and a data management function, which really is the foundational part of the data value chain that then allows other parts of the organization to monetize data and liberate it as the use cases apply. We monetize it ourselves as well, but really we're an enterprise-wide organization that ensures that data quality is managed, that data is governed, that we have effective practices applied to the entire lineage of the data, that ownership and curation are in place, and that everything else, from a regulatory as well as an opportunity perspective, is then able to be leveraged.

So historically, data has been viewed as sort of an expense: it's big, it's growing, it needs to be managed, deleted after a certain amount of time. And then, you know, ten years ago, with the big data movement, data became an asset, and you had a lot of shadow-IT people going off and doing things that maybe didn't comply with the corporate edicts, probably driving your part of the organization crazy. But talk about that: what has changed in the last five years or so in terms of how people approach data?

I mean, the story I tell my colleagues, who are all bankers, obviously, is that the banker in 1989 had to mainly just know debits and credits and be able to look someone in the eye and know whether or not they'd be a credit risk: if we lend you money, will you pay it back? The banker of the late 90s had to then contend with the emergence of technologies that made their lives easier and allowed for automation, and for processes to run much more smoothly. In the early two-thousands, I would say that digitization was a big focus, and in fact my previous role was head of digital banking.
At the time we thought digital was the panacea, the be-all and end-all, the thing that's going to make organizations great. And lo and behold, we realized that once you've got all your digital platforms ready, they're just the plate, or the pipe, and nothing is flowing through them; there's no food on the plate if data is not brought to it. Data has really always been an asset; I think organizations just never consciously knew that.

Okay, so it sounds like once you'd made that initial digital transformation, you really had to work at it. And what we're hearing from a lot of practitioners like yourself is that the challenges related to that involve different parts of the organization, different skill sets, and getting everybody to work together on the same page. Maybe you could take us back to when you started on this initiative around DataOps: what was that like, what were some of the challenges that you faced, and how did you get through them?

Okay. First and foremost, Dave, organizations used to believe that data was IT's problem, and that's probably why you then saw the emergence of things like shadow IT. But when you really acknowledge that data is an asset, just like money is an asset, then you have to take accountability for it the same way you would any other asset in the organization, and you will not abdicate its management to a separate function that's not core to the business; oftentimes IT is seen as a support or enabling function, but not quite the main show, in most organizations. So what we then did was first emphasize that data is a business capability, a business function. It resides in business, next to product management, next to marketing, next to everything else that the business needs. Data management also has to be fitted to every role, in every function, to different degrees and in varying depths. And when you take accountability as an owner of a business unit, you also take accountability for the data in the systems that support that business unit. For us, that was the first piece, and convincing my colleagues that data was their problem, and not something they could just leave to a different part of the organization, was also a journey, but that was kind of the first step in getting the data-operations journey going. You had to first acknowledge that it's something you must take accountability for as a banker, not cede to a different part of the organization.

That's a real cultural mindset shift. In the game of rock-paper-scissors, culture kind of beats everything, doesn't it? It's almost like a trump card. So the businesses embraced that, but what did you do to support it? There has to be trust in the data; there has to be timeliness. Maybe you could take us through how you achieved those objectives, and maybe some of the other objectives that the business demanded.

So the one thing I didn't mention, Dave, is that obviously they didn't embrace it in the beginning; it wasn't an "oh yeah, that makes sense" type of conversation. What we had were a few very strategic people with the right mindset that I could partner with, who understood the case for data management, and while we had that as an in, we developed a framework for a fully matured data-operations capability in the organization, and what that would look like in a target-state scenario. And then what you do is you wait for a good crisis.
We had a little bit of a challenge in that our local regulator found us a little bit wanting in terms of our data quality, and from that perspective, it brought the case for data-quality management. So now there's a burning platform: you have an appetite for people to partner with you and say, okay, we need this to comply; help us out. And when they start seeing DataOps in action, they then buy into the concept. So sometimes you need to just wait for a good crisis and leverage it, and only do that which the organization will appreciate at that time; you don't have to go big bang. Data-quality management was the use case at the time, five years ago, so we focused all our energy on that, and after that it gave us the leeway and license to really bring to maturity all the other capabilities that the business might not understand as well.

So when that crisis hit, thinking about people, process, and technology, you probably had to turn some knobs in each of those areas. Can you talk about that?

From a technology perspective, that's when we partnered with IBM to implement Information Analyzer, in terms of making sure that we could profile the data effectively. What was important for us was to make strides in terms of showing the organization progress, but also being able to give them access to self-service tools that would give them insight into their data. From a technology perspective, that was, I think, the genesis of us implementing the IBM suite in earnest from a data-management perspective. People-wise, we also began a data-stewardship journey, in which we implemented business-unit stewards of data. I don't like using the word "steward," because in my organization it's taken lightly, almost like a part-time occupation, so we converted them: we call them data managers. And the analogy I would give is that every department with a P&L, any department worth its salt, has an FD, a financial director: if money is important to you, you have somebody helping you take accountability and execute on your responsibilities in managing that money. So if data is equally important as an asset, you will have a leader, a manager, helping you execute on your data-ownership accountabilities. That was the people journey. Firstly, I had soldiers planted in each department, data managers who would then continue building the culture, maturing the data practices as applicable to each business unit's use cases. What was important is that every data manager in every business unit focused their energy on making that business unit happy, by ensuring that their data was at the right compliance level and of the right quality, with the right best practices from a process and management perspective, and was governed. And then in terms of process, really it's about spreading data management as a practice through the entire ecosystem. It can be quite lonely, in the sense that unless the whole business of an organization is managing data, people are worried about doing what they do to make money, and most data professionals in most business units will be the only unicorn relative to everybody else who does what they do. So for us it was important to have a community of practice, a process where all the data managers across the business, as well as the technology partners and the specialists who are data-management professionals, come together and make sure that we work together on specific use cases.
So I wonder if I can ask you: the industry likes to market this notion of DevOps applied to data, DataOps. Have you applied that type of mindset, that agile, continuous-improvement approach? I'm trying to understand how much is marketing and how much is actually applicable in the real world. Can you share?

Well, you know, when I was reflecting on this before this interview, I realized that our very first use case of DataOps was probably when we implemented Information Analyzer in our business unit, simply because it was the first time that IT and business, as well as data professionals, came together to spec the use case, and then we would literally, in an agile fashion, with a multidisciplinary team, come together to make sure that we got the outcomes we required. I mean, consider getting to a data-quality management paradigm where we moved from 6% quality at some point on our client data to now sitting at 99%, where that 1% is literally just a timing issue. To get from 6 to 99, you have to make sure that the entire value chain is engaged. Our business partners were the fundamental determinants of the business rules applied in terms of what quality means: what are the criteria for quality? Then what we do is translate that into what we put in the catalog, and ensure that the profiling rules we run are against those business rules that were defined up front. So you'd have upfront determination of the outcome with business, and then the team would go into an agile cycle of maybe two-week sprints, where we'd develop certain things, have stand-ups, come together, and the output would be dashboarded in a prototype fashion, where business then gets to double-check the results. That was the first iteration, and I would say we've become much more mature at it, and we've got many more use cases now.
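The pattern Itumeleng describes, business-defined quality rules translated into executable profiling checks, is worth making concrete. A minimal sketch follows: the business states the criteria, the team encodes each one as a predicate, and the run reports the share of client records passing all rules, which is the kind of number that moved from 6% to 99%. The fields and rules here are invented for illustration, not Standard Bank's actual rules.

```python
# Business-defined quality rules encoded as executable checks, with the
# pass rate as the quality score. Fields and rules are hypothetical.
import re
import pandas as pd

clients = pd.DataFrame({
    "id_number": ["8001015009087", None, "12345"],
    "email": ["thabo@example.com", "no-at-sign", "lindi@example.com"],
})

RULES = {  # business criterion -> executable predicate per value
    "id_number": lambda v: isinstance(v, str) and len(v) == 13 and v.isdigit(),
    "email": lambda v: isinstance(v, str)
             and re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
}

def quality_score(df: pd.DataFrame) -> float:
    # A record counts as "quality" only if it passes every business rule
    passed = df.apply(lambda row: all(rule(row[col]) for col, rule in RULES.items()),
                      axis=1)
    return float(passed.mean())

print(f"records passing all business rules: {quality_score(clients):.0%}")
```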
There's actually one use case that's quite exciting, which we achieved over the end of 2019 into the beginning of this year. (I'm worried about the sunlight coming in through the window.) You look great to me; it's like sunset in South Africa. We've been on theCUBE set where sometimes it's so bright we have to put on sunglasses. So, the most recent one, late 2019 coming into early this year: we had long since dealt with the compliance and regulatory burning-platform issues, and now we're in a place, I think, of opportunity and luxury, where we can find use cases that are pertinent to business execution and business productivity. The one that comes to mind is this: we're a hundred and fifty-eight years old as an organization, so this bank was born before technology. It was also born in the days of, well, no integration, because every branch was a standalone entity. You'd have these big ledgers that transactions were documented in, and I think once every six months or so those ledgers would be taken by horse-drawn carriage to a central place to get reconciled between branches, on paper. The point is, if that is your legacy, the initial ERP implementations would have been focused on process efficiency based on old ways of accounting for transactions and allocating information, so it was not optimized for the 21st century. Our architecture has had a huge legacy burden on it, so getting to a place where you can be agile with data is something we're constantly working toward. We have hundreds of branches across the country, all of them obviously servicing clients as usual, and sales teams or executional teams were not able, in a short space of time, to see the impact of their tactics from a data perspective, from a reporting perspective. We were in a place where, in some cases, based on how our ledgers roll up and how the reconciliation between various systems and accounts works, it would take six weeks to verify whether your tactics were effective or not, because actually seeing the revenue hit our general ledger and our balance sheet might take that long. That is an ineffective way to operate in such a competitive environment. So what you had were frontline sales agents literally manually documenting the sales they had made, but not being able to verify whether that brought in revenue until six weeks later. So what we did was sit down and define all the requirements from a reporting perspective, and the objective was to move from six weeks' latency to 24 hours. And even 24 hours is not perfect; our ideal would be that by close of day you're able to see what you've done for that day, but that's the next epoch we'll go through. We literally had the frontline teams defining what they'd want to see in a dashboard, the business teams defining the business rules behind the quality and the definitions, and then an entire analytics team and the data-management team working on sourcing the data, optimizing and curating it, and making sure that the latency came down. That's, I think, our latest use case for DataOps, and now we're in a place where people can look at a dashboard, it's self-service, they can log in at any time and see the sales they've made, which is very important right now, at the time of COVID-19, from a productivity and executional-competitiveness perspective.

Those are two great use cases, Itumeleng. So the first one, going from data quality of 6% to 99%: at 6%, all you do is spend time arguing about whose data is right, and at 99% you're there, and you said that remaining 1% is basically a latency, a timing issue. And then the second one: instead of paving the cow path with an outdated, ledger-based data process, you've compressed that down to 24 hours, and you want to get to end of day, so you've built agility into your data pipeline. I'm going to ask you then: when GDPR hit, were you able to very quickly leverage this capability, and apply it to other compliance edicts as well?

Well, actually, what I've just described was post-GDPR for us. We got GDPR right about three years ago, but literally all we got right at first was reporting for risk and compliance purposes; the use cases we have now are really around business opportunity, less so the risk. So we prioritized compliance reporting a long time back, and we're able to do real-time reporting from a single-transaction perspective, suspicious transactions and the like, in the bank, with our governance. From that perspective, that was what was prioritized in the beginning, which was the initial crisis. So what you found was an entire engine geared toward making sure that data quality was correct for reporting and regulatory purposes. But really, that is not the be-all and end-all of it, and if that's all we did, I believe we really would not have succeeded; it would have stayed dead. We succeeded because data monetization is actually the real prize: the leveraging of data for business opportunity is what tells you whether you've got the right culture or not. If you're just doing it to comply,
then it means the hearts and minds of the rest of the business still aren't in the data game.

I love this story, because it's nirvana. For so many years we've been pouring money into mitigating risk, and you have no choice; you do it. The general counsel signs off on it, the CFO grudgingly signs off on it, but it's got to be done. And for years, decades, we've been waiting to use these risk initiatives to actually drive business value. It kind of happened with the enterprise data warehouse, but it was too slow and complicated, and it certainly didn't happen with email archiving; that was just a tick box. It sounds like we're at that point today, and I want to ask you: we were talking earlier about how a crisis precipitated this cultural shift, and you took advantage of that. Well, now Mother Nature has dealt us a crisis like we've never seen before. How do you see your data infrastructure, your data pipeline, your DataOps? What kinds of opportunities do you see in front of you today as a result of COVID-19?

Well, because of the quality of client data that we had, we were able to respond very quickly to COVID-19. In our context, the government put us on lockdown relatively early in the curve, in the cycle of infection, and what it meant is that it brought a bit of a shock to the economy, because small businesses all of a sudden didn't have a source of revenue for potentially three to six weeks. Based on the data-quality work we did before, it was actually relatively easy to be agile enough to do the things that we did. So within the first weekend of lockdown in South Africa, we were the first bank to proactively and automatically offer small businesses, and students with loans on our books, an instant three-month payment holiday, assuming they were in good standing. And we did that up front; it was actually an opt-out process, rather than clients having to phone in and arrange for it to happen. I don't believe we would have been able to do that if our data quality was not right. We have since launched many more initiatives to try to keep the economy going and to keep our clients in a state of liquidity, and data quality at that point is critical to knowing who you're talking to, who needs what, and which solutions would best be fitted to various segments. I think the second component is that working from home now brings an entirely different normal. If we had not been able to provide productivity dashboards and sales dashboards to management and all the users that require them, we would not be able to validate what our productivity levels are now that people are working from home. We still have essential-services workers who physically go into work, but a lot of our relationship bankers are operating from home, and that baseline and foundation we laid, productivity tracking for various metrics, able to be reported on in a short space of time, has been really beneficial. The next opportunity for us: we've been really good at doing this for the normal operational and frontline types of workers, but knowledge workers have not necessarily been big productivity reporters historically; they produce an output, and the output might be six weeks down the line. In a place where teams are not co-located, and work needs to flow in an agile fashion,
we need to start using the same foundation and data pipeline that we've laid down for the reporting of knowledge work and agile-team types of metrics. So in terms of developing new functionality and solutions, there's a flow in a multidisciplinary team, and the question is how those solutions get architected in a way where data assists in the flow of information, so solutions can be optimally developed.

Well, it sounds like you're able to map the metrics that the business lines care about into these dashboards; a sort of data-mapping approach, if you will, which makes it much more relevant for the business, because, as you said before, they own the data. That's got to be a huge business benefit; again, we talked about culture, we talked about speed, but the business impact of being able to do that has to be pretty substantial.

It really, really is, and the use cases really are endless, because every department finds its own opportunity to utilize it. I think the accountability factor has also significantly increased, because as the owner of a specific domain of data, you know that you're not only accountable to yourself and your own operation; people downstream of you depend on you, as a product and an outcome, to ensure that the quality of the data you produce is of a high nature. So curation of data is a very important thing, and the business is really starting to understand that. The cards department knows that they are the owners of card data, and the vehicle-asset department knows that they are the owners of vehicle data; those are linked to a client profile, and all of that creates an ecosystem around the client. I mean, when you come to a bank, you don't want to be known as a number, and you don't want to be known for just one product; you want to be known across everything that you do with that organization. But most banks are not structured that way; they are still product houses, with product systems on which your data resides, and if those don't act in concert, then we come across as extremely schizophrenic, as if we don't know our clients. And so that's very, very important.

I could go on for an hour talking about this topic, but unfortunately we're out of time. Thank you so much for sharing your deep knowledge and your story; it's really an inspiring one, and congratulations on all your success. I guess I'll leave it with: what's next? You gave us a glimpse of some of the things you want to do, compressing some of the elapsed times and the time cycles, but where do you see this going in the midterm and longer term?

Currently, I mean, obviously AI is a big opportunity for all organizations, and you don't get automation of anything right if the foundations are not in place, so we believe that this is a great foundation for anything AI, in terms of the use cases that we can find. The second one is really providing an API economy, where certain data products can be shared with third parties; I think that's probably where we want to take things as well. We already utilize external third-party data sources in our data-quality management suite, to ensure validity of client identity and residence and things of that nature, but going forward, because fintechs and banks and other organizations are probably going to partner to be more competitive, we need to be able to provide data products that can then be leveraged by external parties, and vice versa.
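That closing idea, exposing governed data products to partners through an API rather than shipping extracts, can be sketched in a few lines. This uses Flask; the endpoint path, product name, and payload are illustrative assumptions about what such a data product might look like, not Standard Bank's actual interface.

```python
# A minimal data-product API: serve a curated, governed data set to
# partners over HTTP instead of shipping files. All names hypothetical.
from flask import Flask, jsonify

app = Flask(__name__)

# Stand-in for a governed, quality-checked data product
SEGMENT_COUNTS = {"retail": 1_204_331, "business": 88_417}

@app.route("/data-products/client-segments/v1")
def client_segments():
    return jsonify(product="client-segments", version=1, counts=SEGMENT_COUNTS)

if __name__ == "__main__":
    # Partners would reach this through an authenticated API gateway
    app.run(port=8080)
```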
Itumeleng, thanks again; it was great having you. Thank you very much, Dave, I appreciate the opportunity. And thank you for watching, everybody. We are digging into data ops: we've got practitioners, we've got influencers, we've got experts, and we're going into the crowd chat at crowdchat.net/dataops. Keep it right there; we'll be back with more coverage. This is Dave Volante for theCUBE. [Music]
ENTITIES
Entity | Category | Confidence |
---|---|---|
Johannesburg | LOCATION | 0.99+ |
1989 | DATE | 0.99+ |
six weeks | QUANTITY | 0.99+ |
Dave Volante | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Dave | PERSON | 0.99+ |
three | QUANTITY | 0.99+ |
24 hours | QUANTITY | 0.99+ |
two-week | QUANTITY | 0.99+ |
6% | QUANTITY | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
two hours | QUANTITY | 0.99+ |
South Africa | LOCATION | 0.99+ |
less than 4,000 places | QUANTITY | 0.99+ |
99 percent | QUANTITY | 0.99+ |
Standard Bank | ORGANIZATION | 0.99+ |
99% | QUANTITY | 0.99+ |
21st century | DATE | 0.99+ |
6 | QUANTITY | 0.99+ |
second component | QUANTITY | 0.99+ |
hundreds of branches | QUANTITY | 0.99+ |
2019 | DATE | 0.99+ |
first step | QUANTITY | 0.99+ |
five years | QUANTITY | 0.99+ |
first bank | QUANTITY | 0.99+ |
1% | QUANTITY | 0.98+ |
five years ago | DATE | 0.98+ |
first time | QUANTITY | 0.98+ |
Boston | LOCATION | 0.98+ |
99 | QUANTITY | 0.98+ |
each department | QUANTITY | 0.98+ |
first | QUANTITY | 0.98+ |
late 90s | DATE | 0.97+ |
six weeks later | DATE | 0.97+ |
today | DATE | 0.97+ |
three month | QUANTITY | 0.97+ |
ten years ago | DATE | 0.96+ |
an hour | QUANTITY | 0.96+ |
a hundred and fifty eight years old | QUANTITY | 0.96+ |
firstly | QUANTITY | 0.95+ |
second one | QUANTITY | 0.95+ |
first weekend | QUANTITY | 0.94+ |
one product | QUANTITY | 0.94+ |
nineteen | QUANTITY | 0.94+ |
first picture | QUANTITY | 0.93+ |
each business unit | QUANTITY | 0.91+ |
each | QUANTITY | 0.91+ |
Kumal | PERSON | 0.89+ |
single transaction | QUANTITY | 0.89+ |
Big Bang | EVENT | 0.88+ |
first one | QUANTITY | 0.88+ |
once every six months | QUANTITY | 0.87+ |
2020 | DATE | 0.86+ |
Ledger | ORGANIZATION | 0.85+ |
first use case | QUANTITY | 0.84+ |
every branch | QUANTITY | 0.83+ |
about three years ago | DATE | 0.82+ |
Christ | PERSON | 0.81+ |
one | QUANTITY | 0.8+ |
Itumeleng Monale | PERSON | 0.79+ |
DevOps | TITLE | 0.78+ |
two great use cases | QUANTITY | 0.78+ |
years | QUANTITY | 0.77+ |
Standard Bank of South | ORGANIZATION | 0.76+ |
Dharma | ORGANIZATION | 0.76+ |
early this year | DATE | 0.74+ |
l council | ORGANIZATION | 0.71+ |
FDA | ORGANIZATION | 0.7+ |
end | DATE | 0.69+ |
this year | DATE | 0.68+ |
Moore's Law | TITLE | 0.67+ |
IBM DataOps | ORGANIZATION | 0.65+ |
Dana | PERSON | 0.63+ |
every business | QUANTITY | 0.62+ |
Steven Lueck, Associated Bank | IBM DataOps in Action
From theCUBE studios in Palo Alto and in Boston, connecting with thought leaders all around the world, this is a CUBE conversation.

Hi everybody, welcome back. This is Dave Volante, and welcome to this special presentation made possible by IBM. We're talking about data ops, data ops in action. Steve Lueck is here; he's the senior vice president and director of data management at Associated Bank. Steve, great to see you. How are things going in Wisconsin? All safe? We're doing well, staying safe and staying healthy. Thanks for having me, Dave. You're very welcome. So, Associated Bank: a regional bank in the Midwest, covering a lot of territory, not just Wisconsin but a number of other states around there; retail, commercial lending, real estate, and I think the largest bank in Wisconsin. Tell us a little bit about your business and your specific role. Sure, that's a good intro. We're the largest bank headquartered in Wisconsin, and then we have branches in the Upper Midwest area, so Minnesota, Illinois, and Wisconsin are our primary locations. My role at Associated: I'm director of data management. I've been with the bank a couple of years now, really focused on defining our data strategy overall, everything from data ingestion through consumption of data and analytics, and then also the data governance components, keeping the controls and the rails in place around all of our data and its usage.

Financial services is obviously one of the more cutting-edge industries in terms of its use of technology. Not only are you good negotiators, you often are early adopters; you were on the big data bandwagon early, and a lot of financial services firms were early on in Hadoop. I wonder if you could tell us a little bit about the business drivers, and the pressure points that are informing your digital strategy and your data and data ops strategy. Sure. I think one of the key areas for us is that we're trying to shift from more of a reactive mode into more of a predictive, prescriptive mode from a data and analytics perspective: using our data to infuse and drive more business decisions, but also to infuse it into actual applications, customer experience, et cetera. We have a wealth of data at our fingertips. We're really focused on building out a data-lake-style strategy, making sure we're ahead of the curve in trying to predict what our end users are going to need, and some of the advanced use cases we're going to have, before we even know they actually exist. So it's really trying to prepare us for the future and what's next, and then enabling and empowering the business to pivot when we need to, without having everything perfectly prescribed and ready beforehand.

I wonder if we could talk a little bit about the data journey. I know it's kind of a buzzword, but in my career as an independent observer and analyst I've watched the promise of decision support systems and the enterprise data warehouse: the 360-degree view of the business, the real-time nature, the customer intimacy. Up until the recent digital meme, I feel as though the industry hasn't lived up to that promise. So take us through the journey: where you came from, where you are today, and some of the successes you've had. Sure, that's a great point. I feel like as an industry we're at a point now where the people, process, and technology have all caught up to each other. Real-time streaming analytics, the data-as-a-service mentality, leveraging web services and APIs throughout our organization and our industry as a whole: that's really starting to take shape right now, and all the pieces of the puzzle have come together. Where we started, from a journey perspective, was very much the legacy reporting and data-warehouse mindset: tell me the data elements you think you're going to need, we'll figure out how to map those in and transform them, we'll figure out how to get those prepared for you, and that whole waterfall lifecycle of getting it through the funnel and out to users. Quality was usually there, and the enablement was there, but it was missing the rapid turnaround. It was also missing the what's-next, the things you haven't thought of, almost to the point of discouraging people from asking for too many things, because it got too expensive and too hard to maintain. So some of the things we're trying to do now are to build an enablement mentality, encouraging people to ask for everything. When we bring new systems into the bank, it is no longer optional how much data they send to us: we're getting all of the data, we're bringing it all together for people, and then really starting to figure out how that data can be used. We almost have to push it out and infuse it within our organization, as opposed to waiting for it to be asked for. So bringing the people and process, and now the tools and capabilities, together has really started to make a move for us and in the industry.

It's really not an uncommon story, right? You had a traditional data warehouse system, you had some experts you had to go through to get the data, the business felt like it didn't own the data, it felt like it was imposing every time it made a request, or it was frustrated because it took so long, and by the time they got the data, the market had shifted. That created a lot of frustration, and then, to your point, it became very useful as a reporting tool, and that was the sweet spot. So how did you overcome that, and where are you today? I was going to say, I think we're still overcoming it; we'll see how it all goes. There are a couple of things we've started to enable. First off is having that concept of scale and an enablement mentality in everything we do. When we bring systems on, we bring on everything. We're starting to have those components and pieces in place, and we're building more framework-based, reusable processes and procedures, so that every ask is not brand new, not reinventing the wheel and re-solving all that work. I think that's helped expedite our time to market and get buy-in and support from around the organization. And it's really about finding the right use cases and the right business partners to work with, so that you help them through their journey as well; they're on a similar roadmap for their own life cycles, in their product development or whatever business line they're in.

From a process standpoint, you had to jettison, as you mentioned, waterfall, and move to a more agile approach. Did it require different skill sets? Talk about the process and the people side of it. Yeah, it's been a shift. We've tried to move towards, I wouldn't call it formal agile, I would say we're a little more lean: an iterative, backlog type of approach. Putting the work together in queues, having the queue reprioritized, and working with the business owners through those things has been a key success criterion for us in how we manage that work, as opposed to opening formal project requests and having all that work funnel through the old channels, which, as you mentioned earlier, distracted from the way things should get done and added layers that people felt wouldn't be necessary for what was a small ask in their eyes. I think that also led to some of the data silos and scattered components of data that we have in place today in the industry, and I don't think our company is alone in that. But those silos are there for a reason: they were filling a need that was going unmet, a gap in the solution. So what we're trying to do is take that to heart and evaluate what we can do to enable those mindsets, and find out what the gap was and why people had to go build a siloed solution or work around operations, technology, and the channels that were in place.

What would you say were your biggest challenges in getting from point A to point B, point B being where you are today? There were challenges on each of the pillars: people, process, and technology. People are hard to change; behavioral change has been difficult. Same with the process side: shifting into that backlog-style mentality, working with the users, and having more of the work be maintenance-type support is a different culture for our organization than traditional project management. And then the toolsets: we had to evaluate what tools we need to enable this behavior and this mentality. How do we enable more self-service and exploration? How do we get people the data they need, when they need it, and empower them to use it?

Maybe you could share some of the outcomes. I know we're never done in this business, but thinking about the investments you've made in tech, people, and process, and the time it takes to get leadership involved, what has the business outcome been so far? Can you share any metrics, or is it more subjective guidance? From a subjective perspective, one of the biggest things for us has been the ability to truly start to have that 360-degree view of the customer, which we're probably never going to fully get; everyone's striving for that. But the ability to have all of that data available at our fingertips, consolidated into one location, one platform, and to start to be the hub that redistributes that data to our applications and infuses it out, has been a key component for us. Some of the other big differentiators, and value we can show from an organizational perspective: we're in an M&A mode, always looking from a merger-and-acquisition perspective, and the model we've built out from a data strategy perspective has proven itself useful over and over in that M&A context: how do you rapidly ingest new data sets, get them understood, and get them distributed to the right consumers? It has fit our model exactly, and it hasn't been an exception; it's just part of our overall framework for how we get data, nothing new we had to do differently because it was M&A; the timelines were just a little more expedited. The other thing that's been interesting, in the world we're in now from a COVID perspective, having to pivot and change the way we do business, with the PPP loans, is that our business model had to change overnight, and our ability to work with our different lines of business and get them the data they needed to drive those decisions was another scenario where, had we not had the foundational components in the platform, we would have spun a lot longer.

So your data ops approach, I'm going to use that term, helped you in this COVID situation. With the PPP you had a slew of businesses looking to get access to that money, you had uncertainty about what the rules of the game were; as the bank you had guidance, but it was really kind of opaque in terms of what you had to do, the volume of loans went through the roof, and the time frame in which you had to provide them was days or weeks. Talk about how your approach to data helped you be prepared for that. Yeah, it was a race. The bottom line is it felt like a race, from an industry perspective, to get this out soon enough and fast enough and provide the most value to our customers. Our applications teams did a phenomenal job of enabling the applications to streamline the application process for the loans themselves, but from a data and reporting perspective, behind the scenes, we were there, and we had the tools, capabilities, and readiness to say: we have the data in our lake, and we can make business-driven decisions around all the components of what's being processed on a daily basis from an application perspective versus what's been funded, and how those funnel all the way through, doing data quality checks and operational reporting checks to make sure the data moved properly and got booked in the proper ways, because of the rapid nature of how it was all being done. There were other COVID-type use cases as well: we had different scenarios around Fed reporting and other capabilities that the business wasn't necessarily prepared for. We wouldn't have planned to have some of those types of reporting in place, but we were able to, because we had access to all the data through the frameworks we had put into place, so we could rapidly turn around those data points and analytics for us to make better decisions.
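As a minimal illustration of the kind of daily reconciliation and quality check described here, consider the sketch below. The record layout and tolerance rule are assumptions for the example, not Associated Bank's actual controls.

```python
# Hypothetical daily PPP pipeline check: applications vs. funded loans.
applications = {"P-001": 25000.0, "P-002": 80000.0, "P-003": 15000.0}
funded       = {"P-001": 25000.0, "P-003": 15000.5}

def reconcile(apps: dict, booked: dict, tolerance: float = 0.01) -> list:
    """Flag loans funded with amounts that drifted from the application,
    or that appear funded without a matching application."""
    issues = []
    for loan_id, amount in booked.items():
        if loan_id not in apps:
            issues.append((loan_id, "funded without application"))
        elif abs(apps[loan_id] - amount) > tolerance:
            issues.append((loan_id, "amount mismatch"))
    return issues

print(reconcile(applications, funded))
# -> [('P-003', 'amount mismatch')]
```

Run daily against the lake, a check like this is what lets a reporting team assert that loans "moved properly and got booked in the proper ways" even when volumes spike overnight.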
Given the propensity and pace of M&A, there has to be a fundamental challenge in terms of data quality, consistency, and governance. Give us the before and after: before being before the data ops mindset, and after being where you are today. I think that's still a journey; we're always trying to get better on that as well. But the data ops mindset really has shifted us to start thinking about automation: pipelines, enablement, constant improvement, and how we deploy faster, deploy more consistently, and have the right capabilities in place when we need them. Where some of that has come into play from an M&A perspective is in building scale into everything we do. The real-time nature, the scalability, and the rapid deployment models we have in place are where those join forces and really become powerful: having the ability to rapidly ingest new data sources, whether we know about them in advance or not, and then exposing that through the tools and platforms to our users and enabling our business lines. Whether it's COVID or M&A, the use cases keep coming up, and we keep running into the same concept: how to rapidly get people the data they need, when they need it, while still providing the rails and controls and making sure it's governed along the way.

Let's talk about the tech, though; I wonder if we could spend some time on that. Can you paint a picture of what we're looking at here? You've got some traditional EDWs involved, I'm sure you've got lots of data sources, you may be one of the zookeepers from the Hadoop days with a lot of experimentation, and there may be some machine intelligence. Paint a picture for us. Sure. We're evolving some of the toolsets and capabilities as well. We have some generic, custom, in-house-built ingestion frameworks that we built out for rapidly ingesting, and scripting out the nature of, how we bring data sources into play. What we've now started as well is a journey down the IBM Cloud Pak product, which is providing us the ability to govern and control all of our data sources, and then to enable real-time, ad hoc analytics and data preparation and shaping. Some of the components we're using there are around data discovery: pointing at data sources, rapidly running data profiles, and exposing that data to our users, which is obviously very handy in the M&A space and any time you get new data sources in. Then there's the concept of publishing that, and leveraging some of the AI capabilities for assigning business terms in the data glossary, which is another key component for us. On the consumption side of the house, we have a couple of tools in place: we're a Cognos shop, and we also use Tableau from a data visualization perspective. But that's where Cloud Pak is starting to come into play as well, from a data refinement perspective, giving users the ability to shape and prep their own data sets, all within that governed context. And we've now started down the enablement path from an AI perspective with Python and R, using Cloud Pak as our orchestration tool to keep all of that governed and controlled, to
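To ground the idea of a config-driven ingestion framework being ported to Python, here is a minimal sketch. The source definitions, landing layout, and metadata fields are invented for illustration; this is not Associated Bank's actual framework, only a plausible shape for one.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path

@dataclass
class SourceConfig:
    name: str   # logical source name, e.g. "core_banking"
    fmt: str    # expected file format, e.g. "csv"
    owner: str  # book-of-record owner for this source

def ingest(cfg: SourceConfig, payload: bytes, landing_root: Path) -> Path:
    """Land raw data untouched (no transforms on load) and record who
    owns it and when it arrived, so lineage starts at ingestion."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = landing_root / cfg.name / f"{stamp}.{cfg.fmt}"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)
    # Sidecar metadata that a catalog could later harvest.
    meta = {"source": cfg.name, "owner": cfg.owner, "landed_at": stamp}
    target.with_suffix(".meta.json").write_text(json.dumps(meta))
    return target

# Usage: every new source is just another config entry, not a new project,
# which is what makes M&A-speed onboarding routine rather than exceptional.
cfg = SourceConfig(name="core_banking", fmt="csv", owner="deposits-team")
print(ingest(cfg, b"acct,balance\nA-1,100\n", Path("/tmp/landing")))
```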
enable new AI models and new technologies in that space. We're actually starting to convert all of our custom-built frameworks into Python as well, so we can have them embedded within Cloud Pak and use the rails of those frameworks within it.

Okay, so you've got the ingestion side, where you've done a lot of automation; it sounds like data profiling, maybe classification, and automating that piece; then you've got the data quality piece and the governance; and you've got visualization with Tableau. And this all fits together in a, quote-unquote, open framework. Is that right? Yeah, exactly. With the framework itself, we're trying to keep the tools as consistent as we can; we really want to enable our users to have the tools they need in the toolbox, and keep all of that open. What we're focused on is making sure they get the same data and the same experience through whatever tool and mechanism they're consuming from. That's where the platform mentality comes into play: having Cloud Pak in the middle to help govern all of that, and re-provision some of those data sources out for us, has been a key component.

Steve, it sounds like you're making a lot of progress, from the days of the data temple, the high priests of data, the keepers of the data, to more of a data culture where the business feels ownership of its own data. You believe in self-service, and I think you've got much more confidence in the compliance and governance piece. Bring us home on that notion of data culture: where you are and where you're headed. Definitely, I think that's been a key for us. Part of our strategy is putting in place structures that define and dictate ownership and make it clear. One of the failures of the past, if you will, the old monster data warehouse, was that nobody ever owned it; you always ran the risk that either the loudest consumer owned it or no one did. What we've started to do with the lake mentality, having all of that data ingested into our frameworks, is make the data owners clear-cut: it's whoever sends the data in, whatever the book-of-record system is for that source data. We don't touch it, we don't transform it as we load it; it sits there, available, and you own it. We're applying the same mentality on the consumer side. We have a series of consumption structures where all of our users consume data represented exactly how they want to consume it. So again, with that ownership, we're taking out a lot of the gray area and enabling people to say: yes, I own this, I understand what I'm going after, and I can put the ownership, rules, and stewardship around it, as opposed to having a gray model in the middle that nobody ever owns. But to close it out, the concept for us is really enabling people and end users: giving them the data they need, when they need it. It's about providing the framework, and then the rails, around doing that. It's not about building out a formal warehouse model, or the ivory-tower type concepts you mentioned before; it's about purpose-built data sets, empowering our users with the data they need when they need it, all the way through, and infusing that into our applications so the applications provide the best user experiences and use the data to our advantage. It's all about enabling the business.

Before I let you go, how's IBM doing as a partner? What do you like, and what could they be doing better to make your life easier? Sure, I think they've been a great partner for us, with that same enablement mentality. The Cloud Pak platform has been key; we wouldn't be where we are without that toolset. Our journey originally, when we started looking at tools and modernizing our stack, was around data quality and data governance components and tools. Now, because of the platform, we have released our first Python AI models into the environment, and we have studio capabilities natively, because of the way it's all containerized within Cloud Pak. So we've been able to enable new use cases and really advance, where otherwise it would have taken a lot more time and a lot more technologies and capabilities that we would have had to integrate ourselves. The ability to have that all done, and to be able to leverage that platform, has been key to helping us get these rolled out as quickly as we have. From a partnership perspective, they've been great at listening to what the next steps are for us, where we're headed, what we need more of, and what they can do to help us get there, so it's really been an encouraging environment. As far as what they could do better: just keep delivering. Delivery is king, so keep releasing new functionality and features, and keep the quality of the product intact.

Steve, it was great having you on theCUBE; we always love to get the practitioner angle. It sounds like you've made a lot of progress, and as I said, we're never finished in this industry, so best of luck to you. Stay safe, and thanks so much for sharing. Appreciate it, thank you. All right, and thank you for watching, everybody. This is Dave Volante for theCUBE, data ops in action. We've got the crowd chat a little bit later, so keep it right there; we'll be right back after this short break. [Music]
ENTITIES
Entity | Category | Confidence |
---|---|---|
Wisconsin | LOCATION | 0.99+ |
Dave Volante | PERSON | 0.99+ |
Associated Bank | ORGANIZATION | 0.99+ |
Dave | PERSON | 0.99+ |
Steve Lucas | PERSON | 0.99+ |
Steven Lueck | PERSON | 0.99+ |
python | TITLE | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
360 degree | QUANTITY | 0.99+ |
Minnesota | LOCATION | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
60 degree | QUANTITY | 0.99+ |
Python | TITLE | 0.99+ |
today | DATE | 0.98+ |
Boston | LOCATION | 0.98+ |
first | QUANTITY | 0.97+ |
each | QUANTITY | 0.95+ |
Acton | ORGANIZATION | 0.94+ |
Cognos | ORGANIZATION | 0.94+ |
M&A | TITLE | 0.92+ |
one platform | QUANTITY | 0.91+ |
one | QUANTITY | 0.9+ |
Corvis concen | ORGANIZATION | 0.87+ |
Midwest | LOCATION | 0.87+ |
R | TITLE | 0.86+ |
Upper Midwest | LOCATION | 0.83+ |
IBM DataOps in Action | ORGANIZATION | 0.81+ |
one location | QUANTITY | 0.79+ |
agile | TITLE | 0.78+ |
a couple of years | QUANTITY | 0.75+ |
M&A | ORGANIZATION | 0.7+ |
point B | OTHER | 0.69+ |
Illinois Wisconsin | LOCATION | 0.68+ |
couple of tools | QUANTITY | 0.67+ |
point | OTHER | 0.52+ |
couple of things | QUANTITY | 0.5+ |
Judah | PERSON | 0.31+ |
Hadoop | LOCATION | 0.28+ |
Inderpal Bhandari, IBM | IBM DataOps 2020
From theCUBE studios in Palo Alto and in Boston, connecting with thought leaders all around the world, this is a CUBE conversation.

Hi everybody, welcome to this special digital presentation, where we're covering the topic of data ops, and specifically how IBM is operationalizing and automating the data pipeline with data ops. With me is Inderpal Bhandari, the global chief data officer at IBM. Inderpal, always great to see you; thanks for coming on. My pleasure. You know, the standard throwaway question from guys like me is: what keeps the chief data officer up at night? Well, I know what's keeping you up at night: it's COVID-19. How are you doing? It's keeping all of us up, for sure. So how are you making out? As a leader, I'm interested in how you have responded, whether it's communications, and obviously you're doing much more remotely, you're not on airplanes like you used to be. What was your first move when you realized this was going to require a shift? Well, one of the first things I did was to test the ability of my organization to work remotely. This was well before the recommendations came in from the government, just so we could be sure this was something we could pull off if there were extreme circumstances where everybody was remote. That was one of the first things we did. Along with that, another major activity we embarked on: given that we had created this central data and AI platform for IBM, using our hybrid multicloud approach, how could that be adapted very, very quickly to help with the COVID situation? Those were the two big items that my team took on very quickly, and again, this was well before there were any recommendations from the government, or even internally within IBM. We decided we wanted to run ahead and make sure we were ready to operate in that fashion, and I believe a lot of my colleagues did the same.

There's a conversation going on right now around the productivity hits people may be taking because they really weren't prepared. It sounds like you're pretty comfortable with the productivity impact you're achieving. Oh, I'm totally comfortable with the productivity. In fact, I will tell you that as we've gone down this path, we've realized that in some cases productivity is actually going to be better when people are working from home, because they're able to focus a lot more on the work. This runs the gamut. For somebody whose job is in front of a computer, remotely taking care of operations, if they don't have to come in, their productivity is going to go up. Somebody like myself had a long drive into work that I would use for phone calls, but now that entire time can be used in a more productive manner. So there are going to be some aspects of productivity that are actually helped by the situation, provided you're able to deliver your services with the same level of quality and satisfaction that you always have. Now, there are other aspects where productivity is going to be affected. With my team, there's a lot of whiteboarding that gets done, and lots of informal conversations that spark creativity, and those things are much harder to replicate in a remote setting. So we've got a sense of where we have to do some work versus where we're actually going to be more productive, but all in all we're very comfortable that we can pull this off.

That's great. I want to stay on COVID for a moment, in the context of data and data ops, and ask why now. Obviously a crisis like this increases the imperative to really have your data act together. But I want to ask you, specifically as it relates to COVID, why data ops is so important, and then, more generally, why at this point in time. So, the journey we've been on: when I joined, our data strategy centered around cloud, data, and AI, mainly because IBM's business strategy was around that, and because there wasn't yet a notion of AI in the enterprise. Everybody understood what AI means for the consumer, but for the enterprise, people didn't really understand what it meant. So our data strategy became one of actually making IBM itself into an AI enterprise, and then using that as a showcase for our clients and customers, who look a lot like us, to make them into AI enterprises. In a nutshell, that translated into infusing AI into the workflows of the key business processes of the enterprise. If you think about it, that workflow is very demanding: you have to deliver data and insights on time, just when they're needed, otherwise you can slow down the whole workflow of a major process. And to pull that off, you need your data to be very, very streamlined, so that a lot of it is automated and you can deliver those insights as the people in the workflow need them. So while we were making IBM into an AI enterprise and infusing AI into our key business processes, we spent a lot of time on what is essentially a data ops pipeline that was very streamlined, which then allowed us to adapt very quickly to the COVID-19 situation. I'll give you one specific example of how one could leverage that capability. One of the key business processes we had taken aim at was our supply chain. We're a global company, and our supply chain is critical: we have lots of suppliers, all over the globe, and different types of products, so it has a multiplicative factor, because from each of those you have additional suppliers, and then you have events: political events, calamities. We have to be able to very quickly understand the risk associated with any of those events with regard to our supply chain, and make appropriate adjustments on the fly. So that was one of the key applications we built on our central data and AI platform. As part of the data ops pipeline, the ingestion of the several hundred sources of data had to be blazingly fast and refreshed very, very quickly. We also had to aggregate data from outside, from external sources, having to do with weather-related events, political events, social media feeds, et cetera, and overlay that on top of our map of interest with regard to our supply chain sites and delivery destinations. We also weaved in capabilities to track shipments as they flowed, with that data flowing back as well, so we would know exactly where things were. This was only possible because we had a streamlined data ops capability, and because we had built this central data and AI platform for IBM. Now, flip over to the COVID-19 situation. When COVID-19 emerged, and we began to realize this was going to be a very significant pandemic, what we were able to do very quickly was overlay the COVID-19 incidents on top of our sites of interest, as well as pick up what was being reported about those sites, and provide that to our business continuity team. So this became an immediate exercise we embarked on, but it wouldn't have been possible if we didn't have the foundation of the data ops pipeline, and that central data and AI platform, in place to help us do it quickly and adapt.
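A minimal sketch of that overlay idea: flag supplier sites that sit near reported incident locations. The coordinates, radius, and record shapes below are assumptions for illustration; the real application ingested several hundred curated sources rather than two literals.

```python
from math import asin, cos, radians, sin, sqrt

def km_between(a: tuple, b: tuple) -> float:
    """Great-circle (haversine) distance between two (lat, lon) points, km."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 \
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))

# Hypothetical supplier sites and an external incident feed.
sites = {"assembly-sgp": (1.35, 103.82), "parts-mex": (19.43, -99.13)}
incidents = [{"where": (1.29, 103.85), "what": "regional lockdown"}]

def at_risk(sites: dict, incidents: list, radius_km: float = 50) -> list:
    """Overlay incidents on sites of interest within a radius."""
    return [
        (name, inc["what"])
        for name, loc in sites.items()
        for inc in incidents
        if km_between(loc, inc["where"]) <= radius_km
    ]

print(at_risk(sites, incidents))  # -> [('assembly-sgp', 'regional lockdown')]
```

The same join works for any event feed, weather, political unrest, or disease incidence, which is why swapping COVID-19 data into an existing risk application was a weekend exercise rather than a new build.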
What I really like about this story, and something I want to drill into, is that a lot of organizations have a really tough time operationalizing AI and infusing it, to use your word, and the fact that you're doing it is a good proof point that I want to explore a little bit. There were a number of aspects to what you just described. There was the data quality piece: your data quality, in theory anyway, is going to go up with more data, if you can handle it. And the other was speed, time to insight, so you can respond more quickly. Think about this COVID situation: if you're days behind, or weeks behind, which is not uncommon, and sometimes even worse, you just can't respond; these things change daily, sometimes within the day. So is that right, that that's the business outcome and objective you were after? Yes. When you infuse AI into your business processes, the overarching outcome metric you focus on is end-to-end cycle time. You take the end-to-end process and try to reduce the end-to-end cycle time by several factors, several orders of magnitude. We did that, for instance, in my organization with the process that has to do with the generation of metadata, which is data about data, and which is usually a very time-consuming process. We've reduced that by over 95 percent by using AI to actually help in the metadata generation itself, and that's now applied across the board for many different business processes that IBM has. That foundation essentially enables you to go after the cycle time reduction right off the bat, so when you get to a situation like COVID-19, which demands urgent action, your foundation is already geared to deliver.

I think we actually have a graphic on this; guys, if you could bring up the second one. This is what you're talking about, Inderpal, that 95 percent reduction. Here it is: the 95 percent reduction in cycle time, improving data quality, and there are actually some productivity metrics. This is what you're describing in this metadata example, correct? Yes. Metadata is so central to everything one does with data. It's basically data about data, and this is really the business metadata we're talking about. Once you have data in your data lake, if you don't have business metadata describing what that data is, it's very hard for the people trying to do things to determine whether they even have access to the right data. Typically this process has been done manually: somebody looks at the data and the fields and describes them, and it could easily take months. What we did was use a deep learning and natural language processing approach, looking at all the data we've had historically at IBM, and we automated the metadata generation. Whether it was data relevant for the COVID team, for supply chain, or for a receivables process, any one of our business processes, this is one of the fundamental steps you must go through to get your data ready for action, and if you can take the cycle time for that step and reduce it by 95 percent, you can imagine the acceleration.
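As a toy stand-in for the deep learning and NLP approach described above, here is a sketch that tags columns with candidate business terms from name patterns. Real systems train models over years of human-curated metadata; the patterns and glossary terms below are invented for illustration.

```python
import re

# Hypothetical mapping from column-name patterns to business terms.
TERM_PATTERNS = [
    (re.compile(r"(cust|client).*(id|num)", re.I), "Client Identifier"),
    (re.compile(r"(amt|amount|bal)", re.I),        "Monetary Amount"),
    (re.compile(r"(dt|date|_ts)$", re.I),          "Business Date"),
]

def suggest_terms(columns: list) -> dict:
    """Propose a business-glossary term for each column where a pattern
    matches; a human curator confirms or corrects the suggestions."""
    out = {}
    for col in columns:
        hits = [term for pat, term in TERM_PATTERNS if pat.search(col)]
        out[col] = hits[0] if hits else None
    return out

print(suggest_terms(["client_id", "loan_amt", "booking_date", "notes"]))
# -> {'client_id': 'Client Identifier', 'loan_amt': 'Monetary Amount',
#     'booking_date': 'Business Date', 'notes': None}
```

Even in this trivial form, the shape of the win is visible: the machine drafts descriptions for thousands of fields in seconds, and human effort moves from authoring metadata to reviewing it.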
And I like that, because as we were saying before, when you talk about the end-to-end concept, you're applying systems thinking here, which is very important. A lot of the people I talk to are so focused on one metric, maybe optimizing one component of that end-to-end process, but it's really the overall outcome you're trying to achieve; you may be optimizing one piece but not the whole. That systems thinking is very important, isn't it? Systems thinking is extremely important overall, no matter where you're involved in designing the system. But if you're the data person, it's incredibly important, because not only does it give you insight into the cycle time reduction, it also clues you in to what standardization is necessary in the data to support the eventual outcome. A lot of people go down the path of data governance and the creation of data standards, and you can easily boil the ocean trying to do that. But if you start with an end-to-end view of your key processes, and by extension the outcomes associated with those processes, as well as the user experience at the end of them, and then work backwards to the standards you need for the data that feeds into all of that, that's how you arrive at a viable, practical data standards effort that you can actually push forward with. So there are multiple aspects, when you take that end-to-end systems view, that help the chief data officer.

One of the other tenets of data ops is the ability for everybody across the organization to have visibility; communication is very key. We've got another graphic I want to show around the organizational regime. This is a complicated situation for a lot of people, but it's imperative that organizations bring in the right stakeholders and actually identify the individuals who are going to participate, so there's full visibility, everybody understands what their roles are, and they're not in silos. Guys, if you could show us that first graphic, that would be great. Talk about the organization and the right regime, Inderpal. Yes, I believe what you're going to show is actually my organization, but I think it's very illustrative of what one has to set up to pull off this kind of impact. So, we talked about the central data and AI platform that's driving the entire enterprise, and infusing AI into key business processes like the supply chain; you then create applications like the operational risk insights we talked about, and then extend them to fast-emerging and changing situations like COVID-19. You need an organization that reflects the technical aspects of that plan. You have to have the data engineering arm, and in my case there's a lot of emphasis around AI, because that's one of those skill-set areas that's really quite rare, but also very powerful; those are the major technology arms. There's also the governance arm I talked about, where you have to produce a set of standards, implement them, and enforce them, so you can make the end-to-end impact. But then there's also an adoption arm: a group that reports in to me, very empowered, which essentially has to convince the rest of the organization to adopt. The key to their success has been that empowerment, in the sense that they are empowered to find like-minded individuals in our key business processes who are also empowered, and if they agree, they just move forward and do it, because we've already provided the central capabilities. By central I don't mean one location; we're completely global, and it's a hybrid multicloud setup, but it's central in the sense that it's one source to come to for trusted data, and for the expertise you need from an AI standpoint to move forward and deliver the business outcome. When those business teams come together with the adoption group, that's where the magic happens. And then we've also got a data officer council, which I chair, made up of the chief data officers of our individual business units. They're my extended team into the rest of the organization, and we leverage that both for adoption of the platform and for defining and enforcing standards; it helps us do both.

I want to come back to COVID and talk a little bit about business resiliency. I think you've probably seen the news that IBM is providing supercomputer resources to the government to fight the coronavirus; you've also just announced that some RTP folks are helping first responders and nonprofits, providing capabilities at no charge, which is awesome. Look, I'm sensitive that companies like IBM don't want to appear to be ambulance-chasing in these times; however, IBM and other big tech companies are in a position to help, and that's what you're doing here. So maybe you could talk about what you're doing in this regard, and then we'll tie it up with business resiliency and the importance of data. Right. So, I explained the operational risk insights application we had, which we were using internally, even before COVID-19, primarily to assess the risk to our supply chain from various events and then react very quickly to those events to manage the situation. We realized this is something that several NGOs could use, because they have to manage many such situations, like natural disasters. So we've given that same capability to the NGOs, to help them streamline their planning and their thinking. By the same token, with COVID-19, that same capability, with the COVID-19 data overlaid on top, essentially becomes business continuity planning and resilience. Let's say I'm a supply chain person: I can now look at the incidence of COVID-19, I know where my suppliers are, and if I see the incidence going up around a supplier, that supplier is likely to be affected, so let me move ahead and start making backup plans, just in case it reaches a crisis level. On the other hand, if you're somebody in revenue planning, on the finance side, and you know where your key clients and customers are located, then by having that information overlaid with those sites you can make your own judgments and assessments. So that's how it translates into business continuity and resilience planning. We are now doing that internally for every department; it's something we're actually providing them, because we could build rapidly on what we had already done. And as we gain insight into what each of those departments does with that data, and this is anybody and everybody in IBM, because no matter what department they're in, there are going to be sites of interest that are affected, and they understand what those sites mean in the context of their own planning, we will automate those capabilities more and more for each specific area. Now you're talking about a comprehensive approach, an AI approach, to business continuity and resilience planning in the context of a large, complicated organization like IBM, which will obviously be of great interest to enterprise clients and customers.

One of the things we're researching now is trying to understand what about this crisis is going to be permanent. Some things won't be, but we think many things will be; there are a lot of learnings. Do you think organizations will rethink business resiliency in this context, that they might sub-optimize profitability, for example, to be more prepared for crises like this, with better business resiliency? And what role would data play in that? It's a very good and timely question, Dave. Clearly, people have understood that with a pandemic like this, the first line of defense is not going to be so much on the medicine side, because a vaccine won't be available for a period of time; it has to go through development. The first line of defense is actually quarantine, as we've seen play out across the world, and that in effect results in an impact on businesses and the economic climate. I think people have realized this now, and they will factor it into how they do business. To your point about what becomes permanent: I think it's going to become one of those things that, if you're a responsible enterprise, you are going to plan for, and you're going to know how to implement on the second go-around. So you put those frameworks and structures in place, and there will be a certain cost associated with them, and one could argue that it could eat into profitability. On the other hand, what I would say is that because these are fast-emerging, fluid situations that you have to respond to very quickly, you will end up laying down a foundation pretty much like we did, which enables you to really accelerate your pipeline. With the data ops pipelines we talked about, there's a lot of automation, so you can react very quickly: data ingestion done very rapidly, the metadata generation, the entire pipeline we've been talking about, so that you can respond, quickly bring in new data, aggregate it at the right levels, infuse it into the workflows, and deliver it to the right people at the right time. That will become a must. Now, there is a cost associated with doing that, but we know the cycle time reductions on things like this can run, well, I gave you the example of 95 percent; on average we see something like a 70 percent end-to-end cycle time reduction where we've implemented the approach, and that's been pretty pervasive within IBM, across business processes. So that, in essence, actually becomes a driver for profitability. Yes, the crisis might back people into doing it, but I would argue it's probably something that's going to be very good long term for the enterprises involved; they'll be able to leverage it in their business, and I think the competitive pressure of having to do it will force everybody down that path. But I think it will eventually be a good thing.

That end-to-end cycle time compression is huge, and I like what you're saying, because it's not just a reduction in the expected loss during a crisis; there are other residual benefits to the organization. Inderpal, thanks so much for coming on theCUBE and sharing this really interesting and deep case study. I know there's a lot more information out there, so I really appreciate your time. All right, take care, buddy. Thanks for watching; this is Dave Volante for theCUBE, and we will see you next time. [Music]
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Allante | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
95 percent | QUANTITY | 0.99+ |
95 percent | QUANTITY | 0.99+ |
70% | QUANTITY | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
Dave | PERSON | 0.99+ |
95% | QUANTITY | 0.99+ |
Interpol | ORGANIZATION | 0.99+ |
Nepal | LOCATION | 0.99+ |
Interpol Bhandari | PERSON | 0.99+ |
two points | QUANTITY | 0.99+ |
nineteen | QUANTITY | 0.99+ |
first graphic | QUANTITY | 0.99+ |
first move | QUANTITY | 0.99+ |
Boston | LOCATION | 0.99+ |
first line | QUANTITY | 0.99+ |
one | QUANTITY | 0.98+ |
two big items | QUANTITY | 0.98+ |
one piece | QUANTITY | 0.98+ |
Kovach 19 | EVENT | 0.97+ |
pandemic | EVENT | 0.97+ |
one metric | QUANTITY | 0.96+ |
Inderpal Bhandari | PERSON | 0.96+ |
Kovac | ORGANIZATION | 0.95+ |
each | QUANTITY | 0.94+ |
one component | QUANTITY | 0.94+ |
Kovan | EVENT | 0.94+ |
over 95% | QUANTITY | 0.93+ |
both | QUANTITY | 0.93+ |
several hundred sources | QUANTITY | 0.92+ |
first line of beef | QUANTITY | 0.92+ |
iBM | ORGANIZATION | 0.91+ |
second graphic | QUANTITY | 0.91+ |
second one | QUANTITY | 0.91+ |
one source | QUANTITY | 0.9+ |
one of those things | QUANTITY | 0.9+ |
first things | QUANTITY | 0.88+ |
a lot of people | QUANTITY | 0.88+ |
lot of a lot of points | QUANTITY | 0.79+ |
IBM DataOps | ORGANIZATION | 0.78+ |
coronavirus | OTHER | 0.77+ |
second go | QUANTITY | 0.77+ |
lot | QUANTITY | 0.75+ |
first | QUANTITY | 0.74+ |
a lot of people | QUANTITY | 0.73+ |
19 | OTHER | 0.73+ |
19 situation | QUANTITY | 0.72+ |
one of those fundamental steps | QUANTITY | 0.71+ |
non government | QUANTITY | 0.6+ |
Ovid | ORGANIZATION | 0.55+ |
2020 | DATE | 0.55+ |
more | QUANTITY | 0.51+ |
19 | EVENT | 0.41+ |
Julie Lockner, IBM | IBM DataOps 2020
>>From the Cube Studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a Cube conversation.

>>Hi, everybody. This is Dave Volante with theCUBE. Welcome to this special digital presentation. We're really digging into how IBM is operationalizing and automating the AI and data pipeline, not only for its clients, but also for itself. And with me is Julie Lockner, who looks after offering management in IBM's Data and AI portfolio. Julie, really great to see you again.

>>Great, great to be here. Thank you.

>>Talk a little bit about the role you have here at IBM.

>>Sure. My responsibility in offering management in the Data and AI organization is really twofold. One is I lead a team that implements all of the back-end processes, really the operations behind any time we deliver a product from the Data and AI team to the market: think release cycle management, product management discipline, et cetera. The other role I play is making sure we are working with our customers and that they have the best customer experience, and a big part of that is developing the data ops methodology. It's something I needed internally for my own line-of-business execution, but it's now something our customers are looking to implement in their shops as well.

>>Well, good, I really want to get into that. So let's start with data ops. I think a lot of people are familiar with DevOps; maybe not everybody's familiar with data ops. What do we need to know about it?

>>Well, you bring up the point that everyone knows DevOps, and in fact I think what data ops really does is bring to data management organizations a lot of the benefits that DevOps brought to application development. So, what is data ops? It's a set of data management principles that helps organizations bring business-ready data to their consumers quickly. It borrows from DevOps in that you have a data pipeline associated with a business-value requirement: I have this business initiative, it's going to drive this much revenue or this much cost savings, and this is the data I need to deliver it. How do I develop that pipeline and map to the data sources, know what the data is, know that I can trust it, ensure it has the right quality, make sure I'm actually using the data for what it was meant for, and then put it to use? Historically, most data management practices deployed a waterfall-like implementation methodology, which meant all the data pipeline projects were implemented serially, based on a potentially first-in, first-out program management office. With a DevOps mental model, the idea is to slice through all the different silos required to collect the data, organize it, integrate it, validate its quality, create the data integration pipelines, and then present it to the consumer, whether that's a Cognos dashboard, an operational process, or a data science team. That whole end-to-end process gets streamlined through what we're calling the data ops methodology.

>>So, as you well know, we've been following this market since the early days of Hadoop. People struggle with their data pipelines.
>>So as you well know, we've been following this market since the early days of Hadoop. People struggle with their data pipelines. It's complicated for them; there's a raft of tools, and they spend most of their time wrangling data, preparing data, moving data, improving data quality — different roles within the organization. So it sounds like, to borrow from DevOps, data ops is all about streamlining that data pipeline, helping people really understand and communicate across it, end to end, as you're saying. But what's the ultimate business outcome that you're trying to drive?

>>So when you think about projects that require data — to, again, cut costs, to automate a business process, or to drive new revenue initiatives — how long does it take to get from having access to the data to making it available? Every delay that is spent wasted trying to connect to data sources, trying to find subject matter experts who understand what the data means and can verify its quality — all of those steps across those different teams and different disciplines introduce delay in delivering high-quality data fast. So the business value of data ops is always associated with something that the business is trying to achieve, but with a time element. If, for every day we don't have this data to make a decision, we're either making money or losing money — that's the value proposition of data ops. So it's about taking things that people are already doing today and figuring out the quickest way to do them through automation or workflows, and just cutting through all the political barriers that often come up when this data crosses different organizational boundaries.

>>Yeah, so speed, time to insight, is critical. But, you know, with DevOps you're really bringing together the skill sets into, sort of, one super dev or one super ops. It sounds like with data ops it's really more about everybody understanding their role and having communication and line of sight across the entire organization. It's not trying to make everybody a superhuman data person. It's the group, it's the team effort. It's really a team game here, isn't it?

>>Well, that's a big part of it. So just like any type of practice, there are people aspects, process aspects, and technology aspects — people, process, technology. And while you're describing it as having that super team that knows everything about the data, the only way that's possible is if you have a common foundation of metadata. We've seen a resurgence in the data catalog market in the last, you know, six, seven years, and the innovation in the data catalog market has actually enabled us to drive more data ops pipelines. Meaning, as you identify data assets, you capture the metadata, you capture its meaning, you capture information that can be shared with stakeholders. It really then becomes a central repository for people to really quickly know what data they have, really quickly understand what it means and its quality, and very quickly — with the right proper authority, privacy rules included — put it to use for models, dashboards, operational processes.
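A rough sketch of that common metadata foundation follows: a tiny in-memory catalog where every asset carries meaning, quality, and privacy flags, so anyone can really quickly know what data they have. The fields are assumptions for illustration, not Watson Knowledge Catalog's actual schema:

```python
# A toy data catalog: each asset records captured business meaning, a
# profiled quality score, and a privacy flag that drives rules before use.
from dataclasses import dataclass

@dataclass
class CatalogAsset:
    name: str
    description: str        # captured business meaning
    quality_score: float    # 0.0 - 1.0, from automated profiling
    contains_pii: bool      # drives privacy rules before use

catalog = [
    CatalogAsset("dealer_dim", "One row per dealer, golden record", 0.97, False),
    CatalogAsset("customer_contacts", "CRM extract, refreshed nightly", 0.83, True),
]

def find(term: str):
    """Quickly search over both name and captured meaning."""
    return [a for a in catalog if term.lower() in (a.name + a.description).lower()]

for asset in find("dealer"):
    print(asset.name, asset.quality_score, "PII" if asset.contains_pii else "no PII")
```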
>>Okay, and we're going to talk about some examples — one of them, of course, is IBM's own internal example — but help us understand where you advise clients to start. I want to get into it: where do I get started?

>>Yeah. So traditionally, what we've seen with these large data management, data governance programs is that sometimes our customers feel like this is a big pill to swallow. And what we've said is: look, there's an opportunity here to quickly define a small project, align it to a high-value business initiative, and target something where you can quickly gain access to the data, map out these pipelines, and create a squad of skills. So it includes a person with DevOps-type programming skills to automate and instrument a lot of the technology; a subject matter expert who understands the data sources and their meaning; and the line-of-business executive who can translate that information to the business project and associate it with business value. So when we say, how do you get started? We've developed — I would call it a pretty basic — maturity model to help organizations figure out where they are in terms of the technology, and where they are organizationally in knowing who the right people should be involved in these projects. And then, from a process perspective, we've developed some pretty prescriptive project plans that help you nail down: what are the data elements that are critical for this business initiative? And we have, for each role, what their jobs are: consolidate the data sets, map them together, and present them to the consumer. We find that six-week projects, typically three sprints, are a perfect timeline to create one of these very short, quick-win projects. Take that as an opportunity to figure out where your bottlenecks are in your own organization and where your skill shortages are, then use the outcome of that six-week sprint to focus on filling in the gaps, kick off the next project, and iterate. Celebrate the success and promote the success — because it's typically tied to a business value — to help create momentum for the next one.
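The six-week, three-sprint shape she describes could be skeletoned roughly as follows; the role descriptions and sprint goals here are paraphrased assumptions, not IBM's prescriptive plan itself:

```python
# Hypothetical skeleton of the six-week, three-sprint quick-win project.
from datetime import date, timedelta

SQUAD = {
    "devops_engineer": "automate and instrument the tooling",
    "subject_matter_expert": "verify data sources and their meaning",
    "lob_executive": "tie the pipeline to business value",
}

def sprint_plan(start: date, sprints: int = 3, weeks_each: int = 2):
    goals = [
        "map critical data elements to sources",
        "consolidate and join the data sets",
        "present to consumers; log bottlenecks and skill gaps",
    ]
    for i in range(sprints):
        begin = start + timedelta(weeks=i * weeks_each)
        end = begin + timedelta(weeks=weeks_each)
        print(f"Sprint {i + 1}: {begin} -> {end}: {goals[i]}")

sprint_plan(date(2020, 5, 1))
```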
>>That's awesome. I want to get into some examples. I mean, we're both Massachusetts-based; normally you'd be in our studio and we'd be sitting here face to face. Obviously, with COVID-19, in this crisis, the world is sheltering in place. You're up somewhere in New England; I happen to be in my studio, but I'm the only one here. So relate this to COVID. How has data ops helped — or maybe you have a concrete example — in terms of how it's helped inform, or actually anticipate and keep up to date with, what's happening?

>>Yeah. Well, I mean, we're all experiencing it. I don't think there's a person on the planet who hasn't been impacted by what's been going on with this COVID pandemic crisis. We started down this data ops journey a year ago — this isn't something that we just decided to implement a few weeks ago. We've been working on developing the methodology and getting our own organization in place, so that we could respond the next time we needed to act upon a data-driven decision. So step one of our journey has really been working with our Global Chief Data Officer, Inderpal, who I believe you have had an opportunity to meet and interview. Part of this year's journey has been working with our corporate organization — I'm in a line-of-business organization — where we've established the roles and responsibilities, and we've established the technology stack based on our Cloud Pak for Data and Watson Knowledge Catalog.

So I use that as the context. Now we're faced with a pandemic crisis, and I'm being asked in my business unit to respond very quickly: how can we prioritize the offerings that are going to help those in critical need, so that we can get those products out to market? We can offer a 90-day free use for governments and hospital agencies. So in order for me to do that, as the operations lead for our team, I needed to have access to our financial data, I needed to have access to our product portfolio information, and I needed to understand our cloud capacity. So in order to respond with the offers that we recently announced — and you can take a look at some of the examples, with our Watson Assistant for Citizens program — I was able to provide the financial information required for us to make those products available to governments, hospitals, state agencies, etcetera.

That's a perfect example. Now, to set the stage: back at the corporate global chief data office organization, they implemented some technology that allowed us to ingest data, automatically classify it, automatically assign metadata, and automatically associate data quality, so that when my team started using that data, we knew what the status of that information was when we started to build our own predictive models. And so that's a great example of how we partnered with a corporate central organization and took advantage of an automated set of capabilities — without having to invest in any additional resources or headcount — and were able to release products within a matter of a couple of weeks.

>>And that automation is a function of machine intelligence, is that right? And obviously some experience. But you and I, when we were consultants doing this by hand, we couldn't have done this — we couldn't have done it at scale, anyway. Is it machine intelligence and AI that allows us to do this?

>>That's exactly right. And you know, our organization is Data and AI, so we happen to have the research and innovation teams that are building a lot of this technology, so we have somewhat of an advantage there. But you're right: the alternative to what I've described is manual spreadsheets. It's querying databases. It's sending emails to subject matter experts asking them what this data means — and if they're out sick or on vacation, you have to wait for them to come back. All of this was a manual process. In the last five years, we've seen this data catalog market really become an augmented data catalog, and the augmentation means automation through AI. So with years of experience and natural language understanding, we can comb through a lot of the metadata that's available electronically. We can comb through unstructured data and categorize it. And if you have a set of business terms with industry-standard definitions, through machine learning we can automate what you and I did manually as consultants, in a matter of seconds. That's the impact that AI is having in our organization, and now we're bringing this to the market. It's a big part of where I'm investing my time, both internally and externally: bringing these types of concepts and ideas to the market.
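As a toy illustration of that ingest-and-classify step, here is a sketch in which simple regexes stand in for the ML and natural-language classifiers she refers to; the rules and terms are invented for the example:

```python
# As data is ingested, classify each column and attach a business term plus
# a confidence, so downstream teams know what they can trust. Real systems
# would use learned classifiers; regexes keep the sketch self-contained.
import re

RULES = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[a-zA-Z]{2,}$"),
    "phone": re.compile(r"^\+?[\d\-\s()]{7,15}$"),
}

def classify_column(values):
    """Return (business_term, confidence) for a sampled column."""
    best_term, best_hits = "unknown", 0
    for term, rx in RULES.items():
        hits = sum(1 for v in values if rx.match(str(v)))
        if hits > best_hits:
            best_term, best_hits = term, hits
    confidence = best_hits / max(len(values), 1)
    return best_term, confidence

term, conf = classify_column(["dave@example.com", "julie@example.com", "n/a"])
print(term, round(conf, 2))   # -> email 0.67
```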
>>So I'm hearing — first of all, one of the things that strikes me is you've got multiple data sources, and data that lives everywhere. You might have your supply chain data in your ERP; maybe that sits on-prem. You might have some sales data that's sitting in a SaaS app in a cloud somewhere. You might have, you know, weather data that you want to bring in. In theory, anyway, the more data you have, the better the insights you can gather — assuming you've got the right data quality. So let me start with where the data is, right? It's anywhere; you don't know where it's going to be, but you know you need it. So that's part of this, right — being able to get to the data quickly?

>>Yeah, it's funny you bring it up that way. I actually look at it a little differently. When you start these projects, the data was in one place, and then by the time you get to the end of a project, you find out that it's moved to the cloud. So the data location actually changes while we're in the middle of projects. Even during this pandemic crisis, we have many organizations that are using this as an opportunity to move to SaaS. So what was on-prem is now cloud. But that shouldn't change the definition of the data; it shouldn't change its meaning. It might change how you connect to it, and it might also change your security policies or privacy laws — now, all of a sudden, you have to worry about where that data is physically located and whether you're allowed to share it across national boundaries, where before, we knew physically where it was. So when you think about data ops, data ops is a process that sits on top of where the data physically resides. And because we're mapping metadata and we're looking at these data pipelines and automated workflows, part of the design principle is to set it up so that it's independent of where the data resides. However, you have to have placeholders in your metadata and in your tool chain, where we're automating these workflows, so that you can accommodate when the data decides to move — because of a corporate policy change from on-prem to cloud. That's a big part of what data ops offers. It's the same thing, by the way, for DevOps: they've had to accommodate building on platform-as-a-service versus on-prem development environments.

>>And the other part that strikes me in listening to you is scale — and it's not just about scale with the cloud operating model. It's also about what you were talking about: the auto-classification, the automated metadata. You can't do that manually. You've got to be able to do that in order to scale with automation. That's another key part of data ops, is it not?

>>Well, it's a big part of the value proposition and a big part of the business case. When you and I started in this business, big data became the thing. People just moved all sorts of data sets to these Hadoop clusters without capturing the metadata. And so, as a result, in the last 10 years, this information is out there, but nobody knows what it means anymore. So you can't go back with an army of people and have them curate these data sets, because a lot of the context was lost. But you can use automated technology; you can use automated machine learning with natural language understanding to do a lot of the heavy lifting for you. And a big part of data ops workflows and building these pipelines is to do what we call management by exception. So if your algorithm is, say, 80% confident that this is a phone number, and your organization has a low risk tolerance, that probably will go to an exception. But if you have a match algorithm that comes back and says it's 99% sure this is an email address, and you have a threshold that's 98%, it will automate much of the work that we used to have to do manually.
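The management-by-exception routing she describes might look like this in miniature; the thresholds mirror the 80%/99%/98% figures from the conversation, and the function is purely illustrative:

```python
# Matches above the risk threshold flow straight through; anything below
# goes to a human exception queue for steward review.
def route(term: str, confidence: float, threshold: float = 0.98) -> str:
    if confidence >= threshold:
        return f"auto-accept: tag column as '{term}'"
    return f"exception queue: steward review needed for '{term}'"

print(route("email address", 0.99))  # auto-accept, per the 99% example
print(route("phone number", 0.80))   # human review, per the 80% example
```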
So that's an example of how you can automate, eliminate manual work, and have some human interaction based on your risk threshold.

>>That's awesome. I mean, you're right: the no-schema-on-write thing — I throw it into a data lake, and the data lake becomes a data swamp. We all know that joke. Okay, I want to understand a little bit — and maybe you have some other examples of the use cases here — where customers are on the maturity curve. It seems like you've got to start by just understanding what data you have, cataloging it, getting your metadata act in order. But then you've got a data quality component before you can actually implement and get to insight. So where are customers on the maturity model? Do you have any other examples that you can share?

>>Yeah. So when we look at our data ops maturity model, we tried to simplify it — I mentioned this earlier — so that really anybody can get started. They don't have to have a full governance framework implemented to take advantage of the benefits data ops delivers. So what we did is we said: you can categorize your data ops programs into really three things. One is, how well do you know your data — do you even know what data you have? The second one is, can you trust it? Like, can you trust its quality, can you trust its meaning? And the third one is, can you put it to use? So if you really think about it, when you begin with what data you know, the first step is: how are you determining what data you know? If you are using spreadsheets, replace them with a data catalog. If you have a departmental or line-of-business catalog and you need to start sharing information across departments, then start expanding to an enterprise-level data catalog. Now, you mentioned data quality. So the first step is: do you even have a data quality program? Have you even established what your criteria are for high-quality data? Have you considered what your data quality score is comprised of? Have you mapped out what your critical data elements are to run your business? Most companies have done that for their governed processes — but for these new initiatives, like my example with the COVID crisis, when you identify which products we're going to help bring to market quickly, I need to be able to find out what the critical data elements are, and can I trust them? Have I even done a quality scan, and have teams commented on their trustworthiness to be used in this case? If you haven't done anything like that in your organization, that might be the first place to start: pick the critical data elements for this initiative, assess their quality, and then start to implement the workflows to remediate.
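Her question about what a data quality score "is comprised of" suggests a weighted blend over critical data elements; the dimensions and weights below are illustrative assumptions, not an IBM formula:

```python
# Hypothetical composition of a data quality score for a critical data
# element: a weighted blend of per-dimension scores, each in [0, 1].
WEIGHTS = {"completeness": 0.4, "validity": 0.4, "freshness": 0.2}

def quality_score(metrics: dict) -> float:
    return sum(WEIGHTS[d] * metrics[d] for d in WEIGHTS)

cde_metrics = {"completeness": 0.99, "validity": 0.96, "freshness": 0.80}
score = quality_score(cde_metrics)
print(f"dealer_id quality: {score:.2f}")  # 0.94 -> trusted for this use case
```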
And then, when you get to putting it to use, there are several methods for making data available. One is simply making data available to a small set of users — that's what most people do. Well, first they make a spreadsheet of the data available; but then, if multiple people need to access it, that's when something like a data mart might make sense. Technology like data virtualization eliminates the need for you to move data while you're in this prototyping phase, and that's a great way to get started. It doesn't cost a lot of money to get a virtual query set up to see if this is the right join, or the right combination of fields, for this use case. Eventually, you'll get to the need for a high-performance ETL tool for data integration. But nirvana is when you really get to that self-service data prep, where users can query a catalog and say, "These are the data sets I need." It presents a list of the data assets that are available; I can point and click at the columns I want as part of my data pipeline, I hit go, and it automatically generates that output — for a data science use case or, for the business, a dashboard, right? That's the most mature model: being able to iterate on that so quickly that as soon as you get feedback that a data element is wrong, or you need to add something, you can do it, push-button. And that's where data ops maturity should bring organizations.

>>Well, Julie, I think there's no question that this COVID crisis has accentuated the importance of digital. We talk about digital transformation a lot, and it's certainly real — although I would say a lot of the people we talk to will say, "Well, you know, not on my watch; I'll be retired before that all happens." Well, this crisis is accelerating that transformation, and data is at the heart of it. Digital means data, and if you don't have your data story together and your act together, you're not going to be able to compete. And data ops really is a key aspect of that. So give us a parting word.

>>Yeah, I think this is a great opportunity for us to really assess how well we're leveraging data to make strategic decisions. And if there hasn't been a more pressing time to do it, it's when our entire engagement becomes virtual — like, this interview is virtual, right? Everything now creates a digital footprint that we can leverage to understand where our customers are having problems and where they're having successes. Let's use the data that's available, and use data ops to make sure that we can generate access to that data, know it, trust it, and put it to use, so that we can respond to those in need when they need it.

>>Julie Lockner, you're an incredible practitioner — really hands-on. I really appreciate you coming on theCUBE and sharing your knowledge with us. Thank you.

>>Thank you very much. It was a pleasure to be here.

>>And thank you for watching, everybody. This is Dave Vellante for theCUBE, and we will see you next time.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Julie | PERSON | 0.99+ |
Julie Lockner | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Dave Volante | PERSON | 0.99+ |
New England | LOCATION | 0.99+ |
90 day | QUANTITY | 0.99+ |
99% | QUANTITY | 0.99+ |
80% | QUANTITY | 0.99+ |
Massachusetts | LOCATION | 0.99+ |
Data Mart | ORGANIZATION | 0.99+ |
first step | QUANTITY | 0.99+ |
98% | QUANTITY | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
Boston | LOCATION | 0.99+ |
67 years | QUANTITY | 0.99+ |
six week | QUANTITY | 0.99+ |
Cube Studios | ORGANIZATION | 0.99+ |
both | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
a year ago | DATE | 0.99+ |
first | QUANTITY | 0.98+ |
Dev Ops | ORGANIZATION | 0.98+ |
2nd 1 | QUANTITY | 0.97+ |
One | QUANTITY | 0.97+ |
First | QUANTITY | 0.97+ |
Interpol | ORGANIZATION | 0.97+ |
one place | QUANTITY | 0.97+ |
each role | QUANTITY | 0.97+ |
Hadoop | TITLE | 0.95+ |
Kobe | PERSON | 0.95+ |
SAS | ORGANIZATION | 0.95+ |
Cupid pandemic | EVENT | 0.94+ |
today | DATE | 0.93+ |
3rd 1 | QUANTITY | 0.93+ |
this year | DATE | 0.93+ |
few weeks ago | DATE | 0.88+ |
Prem | ORGANIZATION | 0.87+ |
last five years | DATE | 0.87+ |
2020 | DATE | 0.85+ |
three sprints | QUANTITY | 0.81+ |
one Super | QUANTITY | 0.8+ |
Nirvana | ORGANIZATION | 0.79+ |
Cuban | ORGANIZATION | 0.77+ |
three things | QUANTITY | 0.76+ |
pandemic | EVENT | 0.74+ |
step one | QUANTITY | 0.71+ |
one of them | QUANTITY | 0.7+ |
last 10 years | DATE | 0.69+ |
Dev Ops | TITLE | 0.69+ |
Teoh Artemia | ORGANIZATION | 0.68+ |
Cognos | ORGANIZATION | 0.61+ |
Watson Citizen Assistant | TITLE | 0.6+ |
Dev ops | TITLE | 0.6+ |
Cube | COMMERCIAL_ITEM | 0.57+ |
ops | ORGANIZATION | 0.54+ |
weeks | QUANTITY | 0.48+ |
Cube | ORGANIZATION | 0.47+ |
couple | QUANTITY | 0.47+ |
Watson | TITLE | 0.42+ |
Victoria Stasiewicz, Harley-Davidson Motor Company | IBM DataOps 2020
From the Cube Studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a Cube conversation.

Hi, everybody. This is Dave Vellante, and welcome to this special digital Cube presentation sponsored by IBM. We're going to focus in on data ops — data ops in action. A lot of practitioners tell us that they really have challenges operationalizing and infusing AI into the data pipeline. We're going to talk to some practitioners and really understand how they're solving this problem. And I'm really pleased to bring in Victoria Stasiewicz, who's the Global Information Systems Manager for Information Management at Harley-Davidson. Vik, thanks for coming on the Cube. Great to see you. Wish we were face to face, but I really appreciate your coming on in this manner.

That's okay — that's why technology's great, right?

So you are steeped in a data role at Harley-Davidson. Can you describe a little bit about what you're doing and what that role is like?

Definitely. So, obviously, I'm the manager of information management and governance at Harley-Davidson, and what my team is charged with is building out data governance at an enterprise level, as well as supporting the AI and machine learning technologies within my function. So I have a portfolio, and that portfolio really includes data and AI and governance, and also our master data, reference data, and data quality function — if you're familiar with the DAMA wheel, of course. What I can tell you is that my team did an excellent job within this last year, in 2019, standing up the infrastructure — those technologies specific to governance, as well as the newer, more modern warehouse-on-cloud technologies and cloud object storage, which also included Watson Studio and Watson Explorer. So many of the IBMers of the world might hear about, obviously, IBM ISEE, or work on it directly — we stood that up in the cloud, as well as Db2 Warehouse on Cloud, like I said, and Cloud Object Store. We spent about the first five months of last year standing that infrastructure up, working on the workflow, and ensuring that access and security management was all set up within the platform. And what we did in the last half of the year was really start to collect that metadata, as well as the data itself: bring the metadata into our metadata repository — our metadata base — and then also bring that into our Db2 Warehouse on Cloud environment. So we were able to start with what we would consider our dealer domain for Harley-Davidson and bring those dimensions into Db2 Warehouse on Cloud, which was never done before. A lot of the information that we were collecting and bringing together for the analytics team lived in disparate data sources throughout the enterprise. So the goal was to stop with redundant data across the enterprise, eliminate some of those disparate source data resources, and bring it into a centralized repository for reporting.
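As a sketch of the kind of landing step she describes — standing up a dealer dimension in Db2 Warehouse on Cloud — the snippet below uses IBM's ibm_db Python driver; the connection values are placeholders, and the table shape is an assumption, not Harley-Davidson's actual model:

```python
# Minimal sketch: connect to Db2 Warehouse on Cloud and create a dealer
# dimension table. Fill in real host/credentials before running.
import ibm_db

dsn = ("DATABASE=BLUDB;HOSTNAME=<host>;PORT=50000;PROTOCOL=TCPIP;"
       "UID=<user>;PWD=<password>;SECURITY=SSL")
conn = ibm_db.connect(dsn, "", "")

ibm_db.exec_immediate(conn, """
    CREATE TABLE DEALER_DIM (
        DEALER_ID INTEGER NOT NULL PRIMARY KEY,
        DEALER_NAME VARCHAR(128),
        SELLS_BIKES CHAR(1)   -- vs. apparel-only or parts-only dealers
    )
""")
ibm_db.close(conn)
```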
Okay, wow — we've got a lot to unpack here, Victoria. Let me start with sort of the macro picture. I mean, years ago data was this thing that had to be managed — and it still does — but it was a cost, largely a liability. Governance was sort of front and center; sometimes it was the tail that wagged the value dog. And then the whole big data movement comes in, and everybody wants to be data-driven. So we saw some pretty big changes in just the way in which people looked at data: they wanted to mine that data and make it an asset versus just a straight liability. So what are the changes that you discerned in data and in your organization over the last, let's say, half a decade?

To tell you the truth, we started looking at access management and the ability to allow some of our users to do some rapid prototyping that they could never do before. What more and more we're seeing from data citizens or data scientists, or even analysts throughout most enterprises, is that they want access to the information and they want it now — speed to insight at this moment, using pretty much a minimum viable product. They may not need the entire data set, and they don't want to have to go through leaps and bounds just to get access to that information, or to bring that information into a centralized location. So while I talk about our Db2 Warehouse on Cloud — and that's an excellent example of where we actually need to model data, where we know this is data that we trust, that's going to be called upon many, many times by many, many analysts — there's other information out there that people are collecting, because there's so much big data, so many ways to enrich your data within your organization for your customer reporting; people are really trying to tap into those third-party data sets. So what my team has done, and what we're seeing change throughout the industry, is that a lot of enterprises are looking at, as technologists: how can we enable our scientists and our analysts to access data virtually? So instead of creating redundant data sources, we're actually enabling data virtualization at Harley-Davidson, and we've been doing that first by working with our Db2 Warehouse on Cloud and connecting to some of the other trusted data warehouses that we have throughout the enterprise — that being our dealer warehouse as well — to enable analysts to do some quick reporting without having to bring all that data together. That is a big change, and the fact that we were able to tackle it has allowed technology to get back ahead. Because, to be honest, most organizations have given IT a bad rap: it takes too long to get what we need; my technologists cannot give me my data at my fingertips in a timely manner to allow for speed to insight and answer the business questions at the point of delivery. We've supplied data to our analysts, and they're able to calculate and aggregate the reporting metrics to get those answers back to the business — but a week, two weeks too late, and the information is no longer relevant. So data virtualization through data ops is one of the ways we've been able to speed that up and act as a catalyst for data delivery. We've also said, though — and I see this quite a bit — well, that's excellent, but we still need to start classifying our information and labeling it at the system level. We've seen most enterprises — I worked at Blue Cross as well, with IBM tooling, and they had the same struggle — trying to eliminate their technology debt: reduce their spend, and reduce the time it takes for resources working on technologies to maintain them. They want to reduce their IT portfolio of assets and capabilities that they license today. So what do they do to get there? It's time to start taking a look at which systems should be classified as essential systems versus those systems that are disparate and could be eliminated — and that starts with data governance, right?
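The virtualization pattern she describes — querying across the cloud warehouse and the on-prem dealer warehouse without copying data — amounts to a federated view along these lines; all schema and object names here are invented:

```python
# One virtual view joins the cloud warehouse and the dealer warehouse so
# analysts query a single object while the data stays where it lives.
VIRTUAL_VIEW = """
CREATE VIEW DEALER_360 AS
SELECT d.DEALER_ID, d.DEALER_NAME, s.UNITS_SOLD, s.REPORTING_MONTH
FROM   CLOUDWH.DEALER_DIM d              -- Db2 Warehouse on Cloud
JOIN   ONPREM.DEALER_SALES s             -- federated, not copied
       ON s.DEALER_ID = d.DEALER_ID
"""
print(VIRTUAL_VIEW)  # submitted through the data virtualization layer
```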
Okay. So your main focus is on governance, and you talked about how people want answers now; they don't want to have to wait, and they don't want a big waterfall process. So what would you say were some of the top challenges in terms of just operationalizing your data pipeline, getting to the point that you're at today?

You know, I have to be quite honest: standing up the governance framework, the methodology behind it — getting data owners, data stewards, and a catalog established — that was not necessarily the heavy lifting. The heavy lifting really came with setting up a brand-new infrastructure in the cloud. We partnered with IBM and said, you know, we're going to the cloud, and these tools had never been implemented in the cloud before — we were kind of the first to do it. So some of the struggles that we took on early were actually around standing up the infrastructure: security and access management, network pipeline access, VPN issues, things of that nature. I would say those were some of the initial roadblocks we went through, but after we overcame those challenges — with the help of IBM, and the patience of both the Harley and IBM teams — it became quite easy to roll out these technologies to other users. The nice thing is that we at Harley-Davidson have been taking the time to educate our users. Today, for example, we had what we call Data Bytes, a lunch-and-learn, and in that lunch-and-learn we took our entire GIS team — our Global Information Services team, which is all of IT — through these new technologies. It was a forum of over 250 people, with our CIO and CTO on, taking them through: how do we use these tools, what is the purpose of these tools, why do we need governance to maintain these tools, and why is metadata management important to the organization? That piece of it seems to be much easier than the initial standing-up — so it's good enough to start letting users in.

Well, it sounds like you had real sponsorship from leadership, and input from leadership, and they were kind of leaning into the whole process. First of all, is that true? And how important is that for success?

Oh, it's essential. We often asked, when we were first standing up the tools, to be quite honest: does our CIO really understand what it is we're standing up? Does our CIO really understand governance? Because we didn't have the time to really get that face-to-face interaction with our leadership. So I myself made it a mandate — having done this previously at Blue Cross — to get in front of my CIO and my CTO and educate them on what, exactly, we were standing up. And once we did that, it was very easy to get an executive steering committee, as well as an executive membership council, on board with our governance council, and now they're the champions of it. It's never easy selling governance to leadership, and the ROI is never easy, because it's not something that you can easily calculate — it's something that has to show its return on investment over time. And that means that you're bringing dashboards, you're educating your CIO and CTO on how you're bringing people together, how groups are now talking about solutions and technologies in a domain-like environment — where, at an international level, we have people from Asia, from Europe, from China that join calls every Thursday to talk about the data quality issues specific to, for example, the dealer domain: what systems we're using, and what solutions are on the horizon to solve them.
So now, instead of having people from other countries that work for Harley — as well as just within the US — creating one-off solutions that answer the same business questions using the same data, creating multiple solutions to solve the same problem, we're bringing them together and solving together, and we're prioritizing those as well. So on the return on investment, down the line you can show: you know what, instead of this splintering into five projects, we've now turned it into one; and instead of implementing four systems, we've now implemented one. And guess what — we have the business rules, and we have the classifications tied to this system, so that a CIO or CTO can now go in and reference this information: a glossary, a user interface, something that a C-level can read, interpret, understand quickly, and dissect for their own needs, without having to take the long, lengthy time to talk to a technologist about what this information means and how to use it.

You know, what's interesting — a takeaway based on what you just said — is that Harley-Davidson is an iconic brand, a cool company with, you know, motorcycles, right? But you came out of an insurance background, which is a regulated industry, where governance is sort of de rigueur — I mean, it's table stakes. So how were you able, at Harley, to balance the tension between governance and business flexibility?

So there are different levers, I would call them, right? Obviously, within healthcare and insurance, the importance becomes compliance and risk and regulatory — those are the big pushes: gosh, I don't want to pay millions of dollars in fines, so start classifying this information, enabling security, reducing risk, all that good stuff. For Harley-Davidson it was much different. It was more or less: we have a mission — we want to invest in our technologies, yet we want to save money. How do we cut down the technologies that we have today and reduce our technology spend, yet enable our users to have access to more information in a timely manner? That's not an easy task, right? So what I did is I took that and married governance to our TIME model — and the TIME model is, specifically: are we going to tolerate an application, are we going to invest in an application, are we going to migrate an application, or are we going to eliminate it? In talking to my CIO, I said, you know, we can use governance to classify our systems and help act as a catalyst when we start to implement what we're doing with our technologies — which technologies are we going to eliminate tomorrow? We as IT cannot do that unless we assess some sort of business impact, unless you look at a system and ask: how many users are using it, which reports are essential, do the business teams need this system, is this something that's critical for users today, is this duplicative? That is how I sold it to my CIO, and it made it important to the rest of the organization. They knew we had a mandate in front of us — we had to reduce technology spend — and that really, for me, made it quite easy in talking to other technologists, as well as business users, about why governance is important and why it's going to help Harley-Davidson in its mission to save money going forward. I will tell you, though, that the business's biggest value is the fact that they now own the data.
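A rough sketch of driving the TIME model (tolerate, invest, migrate, eliminate) she describes above from governance metadata follows; the decision rules and application records are invented for illustration — real dispositions would weigh business impact the way she outlines:

```python
# Toy TIME-model disposition driven by usage and duplication metadata.
def time_disposition(app: dict) -> str:
    if app["essential_reports"] == 0 and app["active_users"] < 5:
        return "Eliminate"
    if app["duplicates_capability_of"]:
        return "Migrate"          # fold into the system it duplicates
    return "Invest" if app["business_critical"] else "Tolerate"

apps = [
    {"name": "LegacyProfiler", "active_users": 2, "essential_reports": 0,
     "duplicates_capability_of": None, "business_critical": False},
    {"name": "DealerReports", "active_users": 140, "essential_reports": 12,
     "duplicates_capability_of": "CDR", "business_critical": True},
]
for app in apps:
    print(app["name"], "->", time_disposition(app))
```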
They're more likely to use your master data management systems — like I said, I'm the owner of our MDM services today, as well as our customer knowledge center — and they're more likely to access and reference those systems if they feel that they built the rules and they own the rules in those systems. So that's another big value-add, because many business users will say: okay, you think I need access to this system? I don't know; I'm not sure; I don't know what the data looks like within it; is it easily accessible; is it going to give me the reporting metrics that I need? That's where governance will help them. For example, take our data scientist team using a catalog: you can browse your metadata, you can look at your server, your database, your tables, your fields, understand what those mean, and understand the classifications and the formulas within them — they're all documented in a glossary — versus having to go and ask for access to six different systems throughout the enterprise, hoping that Sally next to you, who told you you needed access to those systems, was right, just to find out that you don't need the access — and it took you three days to get the access anyway. That's why a glossary really acts as a catalyst for a lot of that.

Well, it's really interesting, what you just said: you went through essentially an application rationalization exercise, which saved your organization money. That's not always easy, because even though IT may be spending money on these systems, businesses don't want to give them up. But it sounds like you were able to use data to actually inform which applications you should invest in versus sunset. And it sounds like you were giving the business a real incentive to go through this exercise, because they ended up, as you said, owning the data.

Well, that's the great thing, right? Who wants to keep driving the old car if they can truly own a new car for a cheaper price? Nobody wants to do that. I've even looked at Teslas: I can buy a Tesla for the same price as I can buy a minivan these days — I think I might buy the Tesla. But what I will say is that we also built out a capabilities model with our enterprise architecture team, and in building that capabilities model we started to bucket our technologies within those capabilities — AI and machine learning, warehouse-on-cloud or even warehousing technologies, governance technologies, integration technologies, reporting technologies, those types of classifications. By grouping all those into a capabilities matrix, it was easy for us to then start identifying: all right, who are the system owners for these, and who are the business users for these? And based on that: let's go talk to this team — the dealer management team — about access to this new profiling capability within IBM, or this new catalog within IBM, that they can use today, versus the SharePoint and Excel spreadsheets they were using for their metadata management, or the profiling tools — you know, ten years old — that they were using before. Let's sell them on the new tools and start migrating them.
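The capabilities-matrix exercise she describes can be pictured as bucketing the portfolio by capability and flagging buckets with more than one tool as consolidation candidates; the inventory below is made up:

```python
# Bucket the technology portfolio by capability; any bucket holding more
# than one tool is a candidate for rationalization.
from collections import defaultdict

inventory = [("Watson Knowledge Catalog", "governance"),
             ("SharePoint metadata sheets", "governance"),
             ("Db2 Warehouse on Cloud", "warehousing"),
             ("Legacy profiler", "data profiling"),
             ("Watson Studio", "data science")]

by_capability = defaultdict(list)
for tool, capability in inventory:
    by_capability[capability].append(tool)

for capability, tools in by_capability.items():
    if len(tools) > 1:
        print(f"{capability}: {tools} -> consolidation candidate")
```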
That becomes pretty easy, because, I mean, unless you're buying some really old technology, when you give people a preview into those new tools and those new capabilities — especially with some of IBM's new tools we have today — the buy-in is pretty quick. It's pretty easy to sell somebody on something shiny, and it's much easier to use than some of the older technologies.

Let's talk about the business impact. My understanding is you were trying to improve the effectiveness of the dealers — not just go out and brute-force sign up more dealers. Were you able to achieve that outcome, and what has it meant for your business?

Yes, actually, we were. So we stood up something called the CDR — that's our consumer and dealer development repository — and that's where a lot of our dealer information resides today; it's actually our dealer warehouse. We had some other systems that were collecting that information as well — like Speed, for example — and we were able to bring all that reporting into one location, sunset some of those other technologies, and also enable a centralized reporting layer, where we've used data virtualization to start to marry some of that information to Db2 Warehouse on Cloud for users. So we're allowing those who want to access CDR and our Db2 Warehouse on Cloud dealer information to do that within one reporting layer. In doing so, we were able to create something called a dealer harmonized ID. We have so many dealers today, and some of those dealers actually sell bikes, some sell just apparel, and some sell just parts. For those dealers we now have unique IDs — kind of a golden record, mastered information, if you will — brought back into reporting, so that we can accurately assess dealer performance. Up to two years ago, it was really hard to do that. We had information spread out all over; it was really hard to get a good handle on which dealers were performing and which dealers weren't, because it was tough for our analysts to wrangle that information and bring it together. It took time, and many times you would get multiple answers to one business question, which is never good — one question should have one answer if it's accurate. That is what we worked on within this last year, and that's where the CEO really sees the value: now we can act on which dealers are performing at an optimal level versus which dealers are struggling. And that's allowed even our account reps, our field staff, to go work with those struggling dealers and share with them what some of our stronger-performing dealers are doing today that is making them more effective at selling bikes — these are some of the best practices you can implement. That's where we make our field staff smarter and our dealers smarter. We're not looking to shut down dealers; we just want to educate them on how to do better.
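The dealer harmonized ID is essentially a golden-record merge; in this toy sketch a naive name key stands in for the MDM match rules the business owns, and the records are invented:

```python
# Collapse the same dealer appearing in multiple systems into one mastered
# row keyed by a harmonized ID.
def harmonize(records):
    golden = {}
    for rec in records:
        key = rec["name"].strip().lower()       # naive match key
        row = golden.setdefault(key, {"harmonized_id": f"HD-{len(golden)+1:04d}",
                                      "name": rec["name"], "sells": set()})
        row["sells"].update(rec["sells"])
    return list(golden.values())

records = [
    {"name": "Lakeside H-D", "sells": {"bikes"}},        # from CDR
    {"name": "lakeside h-d ", "sells": {"apparel"}},     # from Speed
    {"name": "Valley Parts", "sells": {"parts"}},
]
for row in harmonize(records):
    print(row["harmonized_id"], row["name"], sorted(row["sells"]))
```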
Well, and to your point about a single version of the truth, the lines of business kind of owning their own data — that's critical, because you're not spending all your time pointing fingers, trying to understand the data. If the users own it, then they own it. So how does self-service fit in? Were you able to achieve some level of self-service, and how far can you go there?

We were. We did use some other tools, I'll be quite honest, aside from just the IBM tools, that have enabled some of that self-service analytics — SAP SAC was one of them, and Alteryx is another big one that our analyst team likes to use today to wrangle and bring that data together. That really allowed our analysts and our reporting teams to start to build their own derivations and transformations for reporting themselves, because those tools are more user-interface-based, versus going into the back-end systems and having to write straight SQL queries — which usually takes time and requires a deeper level of knowledge than what we'd like our analysts to need today. I can say the same thing for the data scientist team. They use a lot of R and Python coding today; what we've tried to do is make sure that the tools are available so that they can do everything they need to do without us really having to touch anything. And I will be quite honest: we have not had to touch much of anything — we have a very skilled data scientist team. The tools that we put in place today — Watson Explorer and some of the other tools as well — have enabled the data scientists to really quickly do what they need to do for reporting. And even in cases where Watson Explorer may not be the optimal technology for them to use, we've also allowed them to use some of our other resources — open-source resources — to build some of the models that they were looking to build.

Well, I'm glad you brought that up, Victoria, because IBM makes a big deal out of being open — so you're kind of confirming that you can use third-party tools, and if you like tool vendor ABC, you can use them as part of this framework.

Yeah, it's really about TCO, right? So take a look at what you have today: if it's giving you at least 80% of what you need for the business, or for your data scientists or your reporting analysts to do what they need to do, then to me it's good enough. It's pretty hard to find anything that's exactly 100 percent. It's about being open, though, to when your scientists or your analysts find another reporting tool that requires minimal maintenance — or, let's just say, a data science flow that requires minimal maintenance and is free, right, because it's open source. IBM can integrate with that, and we can enable that to be a quicker way for them to do what they need to do, versus telling them: no, you can't use the other technologies or the other open-source resources out there; you've got to use just these tools. That's pretty tough to do, and I think it would shut most IT shops down pretty quickly within larger enterprises, because it would really act as a roadblock for most of our teams to do the reporting they need to do.

Well, last question. A big part of data ops — borrowing from DevOps — is this continuous integration, continuous improvement, this kind of ongoing raising of the bar, if you will. Where do you see it going from here?

Oh, I definitely see a world where we're allowing for that rapid prototyping, like I was talking about earlier. I see a very big change in the data industry. You said it yourself: we are on the brink of big data, and it's only going to get bigger. There are organizations right now that have literally understood how much of an asset their data really is, and they're starting to sell their data to others — similar vendors within the industry, similar spaces — so they can make money off of it, because data truly is an asset now.
The key to that, obviously, is making sure that it's curated, that it's cleansed, that it's trusted, so that when you are selling it you can really make money off of it. What I really see on the horizon is the ability to vet that data. In the past — what have we been doing for the past decade? — we were just buying big data sets, trusting that it's good information, and not doing a lot of profiling. At most organizations, you're going to pay big top dollar, you're going to receive this third-party data set, and you're not going to be able to use it the way you need to. What I see on the horizon is us being able to do that vetting. We're building data lakehouses, if you will — those Hadoop-like environments, those data lakes — where we can land information, quickly access it, and quickly profile it with tools, where it would otherwise take hours for an analyst to write a bunch of queries to understand what the profile of that data looks like. We did that recently at Harley-Davidson: we bought some third-party data and evaluated it quickly through our agile scrum team. Within a week, we determined that the data was not as good as the vendor selling it had pretty much sold it to be, and so we told the vendor: we want our money back; the data is not what we thought it would be; please take the data sets back. Now, that's just one use case, but to me that was golden — it's a way to save money and start vetting the data that we're buying. Otherwise, what I've seen in the past is many organizations just buying up big third-party data sets and saying, okay, it's good enough; we think that just because it comes from the motorcycle council, it's good enough. It may not be. It's up to us to start vetting that, and that's where technology is going to change, data is going to change, analytics is going to change.

That's a great example. You're really on the cutting edge of this whole data ops trend. Really appreciate you coming on the Cube and sharing your insights — and there's more in the CrowdChat. Thank you, Victoria, for coming on the Cube.

Well, thank you, Dave. Nice to meet you; it was a pleasure speaking with you.

Yeah, really, the pleasure was all ours. And thank you for watching, everybody. As I say, crowdchat.net/dataops for more detail and more Q&A. This is Dave Vellante for the Cube. Keep it right there; we'll be right back after this short break. [Music]
**Summary and sentiment analysis are not shown because of an improper transcript**
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Volante | PERSON | 0.99+ |
Asia | LOCATION | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
five projects | QUANTITY | 0.99+ |
Victoria Stasiewicz | PERSON | 0.99+ |
China | LOCATION | 0.99+ |
Tesla | ORGANIZATION | 0.99+ |
Victoria | PERSON | 0.99+ |
Harley | ORGANIZATION | 0.99+ |
Harley Davidson | ORGANIZATION | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
Blue Cross | ORGANIZATION | 0.99+ |
Blue Cross | ORGANIZATION | 0.99+ |
Europe | LOCATION | 0.99+ |
Dave | PERSON | 0.99+ |
US | LOCATION | 0.99+ |
Harley-Davidson Motor Company | ORGANIZATION | 0.99+ |
harley-davidson | PERSON | 0.99+ |
six different systems | QUANTITY | 0.99+ |
Dave Volante | PERSON | 0.99+ |
last year | DATE | 0.99+ |
over 250 people | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
three days | QUANTITY | 0.99+ |
100 percent | QUANTITY | 0.99+ |
IG | ORGANIZATION | 0.99+ |
Watson | TITLE | 0.99+ |
Boston | LOCATION | 0.99+ |
tomorrow | DATE | 0.98+ |
one business question | QUANTITY | 0.98+ |
first | QUANTITY | 0.98+ |
ABC | ORGANIZATION | 0.98+ |
one answer | QUANTITY | 0.97+ |
four systems | QUANTITY | 0.97+ |
one | QUANTITY | 0.97+ |
Victoria stayshia | PERSON | 0.96+ |
Watson Explorer | TITLE | 0.96+ |
Explorer | TITLE | 0.96+ |
2019 | DATE | 0.96+ |
agile | ORGANIZATION | 0.95+ |
Vik | PERSON | 0.95+ |
two years ago | DATE | 0.95+ |
one question | QUANTITY | 0.95+ |
two weeks | QUANTITY | 0.94+ |
both | QUANTITY | 0.93+ |
excel | TITLE | 0.93+ |
Sally | PERSON | 0.92+ |
a week | QUANTITY | 0.92+ |
harley | ORGANIZATION | 0.91+ |
Watson Studio | TITLE | 0.91+ |
last half of the year | DATE | 0.89+ |
Alteryx | ORGANIZATION | 0.88+ |
millions of dollars | QUANTITY | 0.87+ |
single version | QUANTITY | 0.86+ |
every Thursday | QUANTITY | 0.86+ |
R | TITLE | 0.85+ |
Dave Vellante DataOps Promo
>> DataOps is a term derived from DevOps. The idea is to apply agile methods to your data pipeline as a way to improve data quality and reduce the elapsed time that it takes to get from a corpus of raw data to actual insights. Now, leading organizations are taking advantage of this approach really to operationalize data and evolve their data-driven culture. Many organizations struggle to deliver sustainable and meaningful business value through data. DataOps addresses this problem and is really designed to help you transform your data operation throughout the entire life cycle. Hi everybody. This is Dave Vellante. I'm with the CUBE and on May 27th we're gathering our community in a crowd chat made possible with support from IBM. Now what we're doing is we're going to hear from four data experts and these practitioners have successfully deployed DataOps within their companies. In this event, you're going to learn how they did it, what skills and processes they required and which technologies they deployed, and very importantly, how it impacted their business, so go to crowdchat.net/dataops and add the chat to your calendar. Hope to see you there. (mellow music) (upbeat music)
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Vellante | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
May 27th | DATE | 0.99+ |
CUBE | ORGANIZATION | 0.91+ |
crowdchat.net/dataops | OTHER | 0.91+ |
DataOps | TITLE | 0.91+ |
DevOps | TITLE | 0.74+ |
experts | QUANTITY | 0.56+ |
four | QUANTITY | 0.51+ |
UNLISTED FOR REVIEW Julie Lockner, IBM | DataOps In Action
from the cube studios in Palo Alto in Boston connecting with thought leaders all around the world this is a cube conversation hi everybody this is David on tape with the cube and welcome to the special digital presentation we're really digging into how IBM is operationalizing and automating the AI and data pipeline not only for its clients but also for itself and with me is Julie Lochner who looks after offering management and IBM's data and AI portfolio Julie great to see you again okay great to be here thank you talk a little bit about the role you have here at IBM sure so my responsibility in offering management in the data and AI organization is really twofold one is I lead a team that implements all of the back-end processes really the operations behind anytime we deliver a product from the data AI team to the market so think about all of the release cycle management pricing product management discipline etc the other roles that I play is really making sure that um we are working with our customers and making sure they have the best customer experience and a big part of that is developing the data ops methodology it's something that I needed internally from my own line of business execution but it's now something that our customers are looking for to implement in their shops as well well good I really want to get into that and so let's let's start with data ops I mean I think you know a lot of people are familiar with DevOps not maybe not everybody's familiar with the data Ops what do we need to know about data well I mean you bring up the point that everyone knows DevOps and and then in fact I think you know what data Ops really does is bring a lot of the benefits that DevOps did for application development to the data management organizations so when we look at what is data ops it's a data management it's a it's a data management set of principles that helps organizations bring business ready data to their consumers quickly it takes it borrows from DevOps similarly where you have a data pipeline that associates a business value requirement I have this business initiative it's gonna drive this much revenue or this much cost savings this is the data that I need to be able to deliver it how do I develop that pipeline and map to the data sources know what data it is know that I can trust it so ensuring that it has the right quality that I'm actually using the data that it was meant for and then put it to use so in in history most dated management practices deployed a waterfall like methodology or implementation methodology and what that meant is all the data pipeline projects were implemented serially and it was dawn based on potentially a first-in first-out program management office with a DevOps mental model and the idea of being able to slice through all of the different silos that's required to collect the data to organize it to integrate it to validate its quality to create those data integration pipelines and then present it to the dashboard like if it's a Cognos dashboard for a operational process or even a data science team that whole end-to-end process gets streamlined through what we're calling data ops methodology so I mean as you well know we've been following this market since the early days of a dupe and people struggle with their data pipelines it's complicated for them there's a raft of tools and and and they spend most of their time wrangling data preparing data improving data quality different roles within the organization so it sounds like you know to borrow from from 
>> So, as you well know, we've been following this market since the early days of Hadoop, and people struggle with their data pipelines. It's complicated for them; there's a raft of tools, and they spend most of their time wrangling data, preparing data, and improving data quality, across different roles within the organization. So it sounds like, to borrow from DevOps, data ops is all about streamlining that data pipeline, helping people really understand and communicate across it end to end, as you're saying. But what's the ultimate business outcome that you're trying to drive?

>> When you think about projects that require data, again, to cut cost, to automate a business process, or to drive new revenue initiatives: how long does it take to get from having access to the data to making it available? Every time delay that is wasted trying to connect to data sources, or trying to find subject matter experts who understand what the data means and can verify its quality, all of those steps across different teams and disciplines introduce delay in delivering high-quality data fast. So the business value of data ops is always associated with something the business is trying to achieve, but with a time element: if for every day we don't have this data to make a decision we're either making money or losing money, that's the value proposition of data ops. It's about taking things people are already doing today and figuring out the quickest way to do them, through automation and through workflows, and cutting through all of the political barriers that often come up when data crosses organizational boundaries.

>> Yeah, so speed, time to insight, is critical. But with DevOps you're really bringing together the skill sets into sort of one super dev or one super ops. It sounds like with data ops it's really more about everybody understanding their role and having communication and line of sight across the entire organization. It's not trying to make everybody a superhuman data person; it's the group, it's the team effort. It's really a team game here, isn't it?

>> Well, that's a big part of it. Just like any type of practice, there are people aspects, process aspects, and technology: people, process, technology. And while you're describing it as having that super team that knows everything about the data, the only way that's possible is if you have a common foundation of metadata. We've seen a surge in the data catalog market in the last six, seven years, and the innovation in the data catalog market is what has actually enabled us to drive more data ops pipelines. Meaning: as you identify data assets, you've captured the metadata, you capture its meaning, you capture information that can be shared with stakeholders. It really then becomes a central repository for people to very quickly know what data they have, very quickly understand what it means and its quality, and very quickly, with the right proper authority, privacy rules included, put it to use for models, dashboards, and operational processes.
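Here is a toy illustration of that common metadata foundation. The catalog structure and field names are invented for the example; they are not any product's actual API.

```python
# A toy data catalog: once assets carry meaning, quality, and authority
# metadata, anyone can quickly discover what data exists and whether it
# can be put to use.

catalog = {}

def register_asset(name, description, owner, quality_score, privacy_class):
    catalog[name] = {
        "description": description,     # captured business meaning
        "owner": owner,                 # subject matter expert to ask
        "quality_score": quality_score, # trust signal, 0.0 to 1.0
        "privacy_class": privacy_class, # governs who may put it to use
    }

def find_usable(min_quality, allowed_privacy):
    """Return assets a consumer may use, given quality and privacy rules."""
    return [n for n, m in catalog.items()
            if m["quality_score"] >= min_quality
            and m["privacy_class"] in allowed_privacy]

register_asset("sales_orders", "Booked orders by region", "j.doe", 0.95, "internal")
register_asset("patient_notes", "Free-text clinical notes", "m.roe", 0.80, "restricted")

print(find_usable(min_quality=0.9, allowed_privacy={"internal", "public"}))
# -> ['sales_orders']
```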
>> Okay, and we're going to talk about some examples, one of which, of course, is IBM's own internal example. But help us understand where you advise clients to start. Where do I get started?

>> Traditionally, what we've seen with these large data management and data governance programs is that sometimes our customers feel like it's a big pill to swallow. And what we've said is, look, there's an opportunity here to quickly define a small project, align it to a high-value business initiative, and target something where you can quickly gain access to the data, map out these pipelines, and create a squad of skills. That includes a person with DevOps-type programming skills to automate and instrument a lot of the technology, a subject matter expert who understands the data sources and their meaning, and a line-of-business executive who can translate, bringing that information to the business project and associating it with business value. So when we say how do you get started: we've developed what I would call a pretty basic maturity model to help organizations figure out where they are in terms of the technology, where they are organizationally in knowing who the right people are to involve in these projects, and then, from a process perspective, we've developed some pretty prescriptive project plans that help you nail down the data elements that are critical for the business initiative, and, for each role, what their jobs are to consolidate the data sets, map them together, and present them to the consumer. We find that six-week projects, typically three sprints, are the perfect timeline for one of these very short, quick-win projects. Take that as an opportunity to figure out where the bottlenecks are in your own organization and where your skill shortages are, and then use the outcome of that six-week sprint to focus on filling in gaps and kicking off the next project. Iterate, celebrate the success, and promote the success, because it's typically tied to a business value, to help create momentum for the next one.

>> All right, that's awesome. I want to now get into some examples. We're both Massachusetts-based; normally you'd be in our studio and we'd be sitting here face to face. Obviously, with this COVID-19 crisis we're all sheltering in place. You're up somewhere in New England, and I happen to be in my studio, but I'm the only one here. So relate this to COVID: how has data ops helped inform, or actually anticipate and keep up to date with, what's been happening?

>> Well, we're all experiencing it; I don't think there's a person on the planet who hasn't been impacted by what's been going on with this COVID pandemic crisis. We started down this data ops journey a year ago; this isn't something we just decided to implement a few weeks ago. We've been working on developing the methodology and getting our own organization in place so that we could respond the next time we needed to act on a data-driven decision. Part of step one of our journey has been working with our global chief data officer, Inderpal, who I believe you've had an opportunity to meet and interview. Part of this journey has been working with the corporate organization and the line-of-business organizations, where we've established the roles and responsibilities, and established the technology stack based on our Cloud Pak for Data and Watson Knowledge Catalog. So use that as the context: now we're faced with a pandemic crisis, and I'm being asked in my business unit to respond very quickly. How can we prioritize the offerings that are going to help those in critical need, so that we can get those products out to market and offer, say, 90-day free use for governments and hospital agencies? In order for me to do that, as the operations lead for our team, I needed access to our financial data, I needed access to our product portfolio information, and I needed to understand our cloud capacity.
For the offers we recently announced, you can take a look at some of the examples with our Watson Citizen Assistant program, where I was able to provide the financial information required for us to make those products available to governments, hospitals, state agencies, et cetera. That's a perfect example. Now, to set the stage: back in the corporate global chief data office organization, they implemented technology that allowed us to ingest data, automatically classify it, automatically assign metadata, and automatically associate data quality, so that when my team started using that data we knew what the status of that information was as we built our own predictive models. That's a great example of how we partnered with a corporate central organization and took advantage of an automated set of capabilities, without having to invest in any additional resources or headcount, and were able to release products within a matter of a couple of weeks.

>> And that automation is a function of machine intelligence, is that right? And obviously some experience. But you and I, when we were consultants doing this by hand, we couldn't have done this at scale anyway. Is it machine intelligence, AI, that allows you to do this?

>> That's exactly right. And as you know, our organization is Data and AI, so we happen to have the research and innovation teams that are building a lot of this technology, which gives us somewhat of an advantage. But you're right: the alternative to what I've described is manual spreadsheets, querying databases, sending emails to subject matter experts asking them what the data means, and if they're out sick or on vacation you have to wait for them to come back. All of it was a manual process. In the last five years we've seen this data catalog market become an augmented data catalog, and that augmentation means automation through AI. With years of experience and natural language understanding, we can comb through a lot of the metadata that's available electronically, we can comb through unstructured data, we can categorize it, and if you have a set of business terms with industry-standard definitions, through machine learning we can automate what you and I used to do manually as consultants, in a matter of seconds. That's the impact AI has had in our organization, and now we're bringing it to the market. A big part of where I'm investing my time, both internally and externally, is bringing these types of concepts and ideas to the market.
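A hedged stand-in for the automated classification step described above: real augmented catalogs use machine learning and natural language understanding, but simple regular expressions are enough to show the flow of classify, assign metadata, and attach a confidence score. All patterns and names are illustrative.

```python
# Guess a data class for each column and report a confidence score,
# sketching the "ingest -> classify -> assign metadata" step.
import re

CLASSIFIERS = {
    "email":        re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone_number": re.compile(r"^\+?[\d\-\s()]{7,15}$"),
}

def classify_column(values):
    """Return the best-matching data class and its confidence (0.0 to 1.0)."""
    best_label, best_conf = "unknown", 0.0
    for label, pattern in CLASSIFIERS.items():
        matches = sum(1 for v in values if pattern.match(str(v)))
        confidence = matches / len(values) if values else 0.0
        if confidence > best_conf:
            best_label, best_conf = label, confidence
    return best_label, best_conf

column = ["dave@example.com", "julie@example.com", "not-an-email"]
label, conf = classify_column(column)
print(f"classified as {label} with confidence {conf:.0%}")  # email, 67%
```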
>> So I'm hearing, first of all, that you've got multiple data sources and data lives everywhere. You might have your supply chain data in your ERP, and maybe that sits on-prem; you might have some sales data sitting in a SaaS store in a cloud somewhere; you might have weather data that you want to bring in. In theory, anyway, the more data you have, the better insights you can gather, assuming you've got the right data quality. So let me start with where the data is: it sits anywhere, you don't know where it's going to be, but you know you need it. That's part of this, right? Being able to reach it quickly?

>> It's funny you bring it up that way; I actually look at it a little differently. When you start these projects, the data was in one place, and by the time you get to the end of the project you find out that it's in a cloud. The data location actually changes while we're in the middle of projects. Even during this pandemic crisis we have many organizations that are using this as an opportunity to move to SaaS, so what was on-prem is now cloud. But that shouldn't change the definition of the data; it shouldn't change its meaning. It might change how you connect to it, and it might also change your security policies or privacy laws: now, all of a sudden, you have to worry about where that data is physically located and whether you're allowed to share it across national boundaries, where before you knew physically where it was. So when you think about data ops: data ops is a process that sits on top of where the data physically resides, and because we're mapping metadata and looking at these data pipelines and automated workflows, part of the design principle is to set it up so that it's independent of where the data resides. However, you have to have placeholders in your metadata and in your tool chain, where we're automating these workflows, so that you can accommodate it when the data moves, because of a corporate policy change, from on-prem to cloud. That's a big part of what data ops offers. It's the same thing, by the way, for DevOps: they've had to accommodate building on platform-as-a-service versus on-prem development environments. It's the same for data ops.

>> And the other part that strikes me in listening to you is scale. It's not just about scale with the cloud operating model; it's also what you're talking about with auto-classification and automated metadata. You can't do that manually; you've got to be able to do it with automation in order to scale. That's another key part of data ops, is it not?

>> It's a big part of the value proposition, and a big part of the business case. When you and I started in this business, Big Data became the thing, and people just moved all sorts of data sets to these Hadoop clusters without capturing the metadata. As a result, over the last 10 years, the information is out there but nobody knows what it means anymore. You can't go back with an army of people and have them query those data sets, because a lot of the context was lost. But you can use automated technology, automated machine learning with natural language understanding, to do a lot of the heavy lifting for you. And a big part of data ops workflows and building these pipelines is what we call management by exception. If your algorithm says it's 80% confident that this is a phone number, and your organization has a low risk tolerance, that will probably go to an exception. But if a match algorithm comes back and says it's 99 percent sure this is an email address, and your threshold is 98%, it will be automated. That's an example of how you can automate away manual work and reserve human interaction for cases based on your risk threshold.
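The management-by-exception workflow can be sketched directly from the numbers Lockner gives: results at or above the organization's risk threshold are applied automatically, and anything below it is routed to a human. The threshold value and the queue handling here are illustrative assumptions.

```python
# Route classification results by confidence: auto-apply above the
# organization's risk threshold, escalate to a data steward below it.

RISK_THRESHOLD = 0.98   # a low-risk-tolerance organization sets this high

exception_queue = []

def apply_or_escalate(column_name, label, confidence):
    if confidence >= RISK_THRESHOLD:
        print(f"auto-tagged {column_name} as {label} ({confidence:.0%})")
    else:
        exception_queue.append((column_name, label, confidence))
        print(f"routed {column_name} to a data steward ({confidence:.0%})")

apply_or_escalate("contact",  "email_address", 0.99)  # automated
apply_or_escalate("contact2", "phone_number",  0.80)  # human review
```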
>> Now, that's awesome. And you're right: the whole no-schema-on-write thing, just throw it into the data lake, and the data lake becomes the data swamp. We all know that joke. Okay, I want to understand the maturity of where customers are, and maybe you have some other examples of use cases. It seems like you've got to start by just understanding what data you have, cataloging it, getting your metadata act in order, but then you've got a data quality component before you can actually implement and get to insight. So where are customers on the maturity model? Do you have any other examples you can share?

>> Yeah. So when we look at our data ops maturity model, we tried to simplify it. I mentioned this earlier: we tried to simplify it so that really anybody can get started; they don't have to have a full governance framework implemented to take advantage of the benefits data ops delivers. What we did is say you can categorize your data ops program into really three things. One is: how well do you know your data? Do you even know what data you have? The second is: can you trust it? Can you trust its quality, can you trust its meaning? And the third is: can you put it to use? So if you really think about it, when you begin with what data you know, the first step is how you're determining that. If you're using spreadsheets, replace them with a data catalog. If you have a department or line-of-business catalog and you need to start sharing information across departments, then start expanding to an enterprise-level data catalog. Now, you mentioned data quality. The first step there is: do you even have a data quality program? Have you established what your criteria are for high-quality data? Have you considered what your data quality score is comprised of? Have you mapped out the critical data elements needed to run your business? Most companies have done that for their governed processes, but for these new initiatives, as in my example with the COVID crisis of deciding which products we're going to help bring to market quickly, I need to be able to find out what the critical data elements are and whether I can trust them. Have I even done a quality scan, and have teams commented on their trustworthiness for use in this case? If you haven't done anything like that in your organization, that might be the first place to start: pick the critical data elements for this initiative, assess their quality, and then start to implement the workflows to remediate. Then, when you get to putting it to use, there are several methods for making data available. One is simply making a data mart available to a small set of users; that's what most people do. Well, first they make a spreadsheet of the data available, but when multiple people need to access it, that's when a data mart might make sense. Technology like data virtualization eliminates the need to move data while you're in this prototyping phase, and that's a great way to get started: it doesn't cost a lot of money to set up a virtual query to see whether this is the right join or the right combination of fields for the use case. Eventually you'll get to the need for a high-performance ETL tool for data integration. But nirvana is when you really get to self-service data prep, where users can query a catalog and say, these are the data sets I need; it presents a list of available data assets; I point and click at the columns I want as part of my data pipeline, hit go, and it automatically generates that output, for data science use cases or for a Cognos dashboard. That's the most mature model: being able to iterate so quickly that as soon as you get feedback that a data element is wrong, or that you need to add something, you can do it at the push of a button. That's where data ops aims to bring organizations.
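One plausible way to compose the data quality score mentioned above is to score each critical data element for completeness and validity and then combine the two. The equal weighting and the specific checks below are assumptions for illustration.

```python
# Compute a quality score for one critical data element from two
# components: completeness (is the value present?) and validity
# (does the present value pass a business rule?).

def completeness(values):
    return sum(v is not None and v != "" for v in values) / len(values)

def validity(values, is_valid):
    present = [v for v in values if v not in (None, "")]
    return sum(is_valid(v) for v in present) / len(present) if present else 0.0

rows = [
    {"order_id": 1, "amount": 100.0},
    {"order_id": 2, "amount": -5.0},   # fails the validity rule
    {"order_id": 3, "amount": None},   # fails completeness
]

amounts = [r["amount"] for r in rows]
score = 0.5 * completeness(amounts) + 0.5 * validity(amounts, lambda a: a >= 0)
print(f"quality score for critical element 'amount': {score:.0%}")  # 58%
```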
>> Well, Julie, I think there's no question that this COVID crisis has accentuated the importance of digital. We talk about digital transformation a lot, and it's certainly real, although I would say a lot of the people we talk to would say, well, not on my watch, or I'll be retired before that all happens. This crisis is accelerating that transformation, and data is at the heart of it. Digital means data, and if you don't have your data story together, your act together, then you're not going to be able to compete. And data ops really is a key aspect of that. So give us a parting word.

>> All right. I think this is a great opportunity for us to really assess how well we're leveraging data to make strategic decisions. And there hasn't been a more pressing time to do it: our entire engagement has become virtual, like this interview is virtual. Everything now creates a digital footprint that we can leverage to understand where our customers are having problems and where they're having successes. Let's use the data that's available, and use data ops to make sure we can iterate, access that data, know it, trust it, and put it to use, so that we can respond to those in need when they need it.

>> Julie Lockner, you're an incredible practitioner, really hands-on. I really appreciate you coming on theCUBE and sharing your knowledge with us. Thank you.

>> Okay, thank you very much. It was a pleasure to be here.

>> All right, and thank you for watching, everybody. This is Dave Vellante for theCUBE, and we will see you next time. [Music]
Inderpal Bhandari, IBM | DataOps In Action
>> From the Cube Studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a Cube conversation.

>> Everybody, welcome to this special digital presentation, where we're covering the topic of data ops, and specifically how IBM is really operationalizing and automating the data pipeline with data ops. And with me is Inderpal Bhandari, who is the global chief data officer at IBM. Inderpal, it's always great to see you. Thanks for coming on.

>> My pleasure.

>> So, you know the standard throwaway question from guys like me: what keeps the chief data officer up at night? Well, I know what's keeping you up at night. It's COVID-19. How are you doing?

>> It's keeping all of us up.

>> Yeah, for sure. So how are you guys making out? As a leader, I'm interested in how you have responded, whether it's communications; obviously you're doing much more remotely, and you're certainly not on airplanes like you used to be. But what was your first move when you actually realized this was going to require a shift?

>> Well, I think one of the first things that I did was test the ability of my organization to work remotely. This was well before the recommendations came in from the government; we wanted to be sure that this was something we could pull off if there were extreme circumstances where everybody had to be remote. So that was one of the first things we did. Along with that, another major activity that was boxed off was making sure that the Central Data and AI platform we have created for IBM, using our hybrid multicloud approach, could adapt very, very quickly to help the organization work through the situation. Those were the two big items that my team and I embarked on, and again, this was before there were any recommendations from the government, or even internally within IBM. We decided that we wanted to run ahead and make sure that we were ready to operate in that fashion, and I believe a lot of my colleagues did the same.

>> There's a conversation going on right now about productivity hits that people may be taking because they really weren't prepared. It sounds like you're pretty comfortable with the productivity impact that you're achieving?

>> Oh, I'm totally comfortable with the productivity. In fact, I will tell you that while we've gone down this path, we've realized that in some cases productivity is actually better when people are working from home, because they're able to focus a lot more on the work. This runs the gamut, depending on the nature of the job. Somebody who basically needs to be in front of a computer, remotely taking care of operations: if they don't have to come in, their productivity is going to go up. Somebody like myself, who had a long drive into work, which I would use for phone calls: that entire time can now be used in a much more productive manner. So we realized there are aspects of productivity that will actually be helped by the situation, provided you're able to deliver the services that you deliver with the same level of quality and satisfaction that you want. Now, there are certain other aspects where the whole activity is going to be affected. Take my team:
there's a lot of whiteboarding that gets done, lots of informal conversations that spark creativity, and those things are much harder to replicate in a remote environment. So we've got a sense of where we'll have to do some work to pull things together. But all in all, we're very comfortable that we can pull this off.

>> That's great. I want to stay on COVID for a moment, in the context of data and data ops, and ask why now. Obviously, with a crisis like this, it increases the imperative to really have your data act together. But I want to ask you specifically, as it relates to COVID, why data ops is so important, and then, just generally, why at this point in time.

>> So, the journey we've been on: when I joined, our data strategy centered around cloud, data, and AI, mainly because IBM's business strategy was around that, and because there wasn't really a notion of AI in the enterprise. Everybody understood what AI meant for the consumer, but for the enterprise, people didn't really understand what it meant. So our data strategy became one of actually making IBM itself into an AI enterprise, and then using that as a showcase for our clients and customers, who look a lot like us, as we make them into AI enterprises. In a nutshell, what that translated to was that one had to infuse AI into the workflow of the key business processes of the enterprise. If you think about it, that workflow is very demanding: you have to be able to deliver data and insights on time, just when they're needed; otherwise you essentially slow down the whole workflow of a major process within the enterprise. But to be able to pull all that off, you need to have your data ops very, very streamlined, so that a lot of it is automated and you're able to deliver those insights as the people involved in the workflow need them. So while we were making IBM into an AI enterprise and infusing AI into our key business processes, we spent a lot of time on what is essentially a data ops pipeline that was very, very streamlined, which then allowed us to adapt very quickly to the COVID-19 situation. And I'll give you one specific example of how one would leverage that capability. One of the key business processes that we had taken on was our supply chain. If you're a global company, your supply chain is critical: you have lots of suppliers, and they are all over the globe. And we have different types of products, so there's a multiplication factor; for each of those you have additional suppliers. And you have events: political events, calamities. So we have to be able to very quickly understand the risks associated with any of those events with regard to our supply chain, and make appropriate adjustments on the fly. That was one of the key applications that we built on our Central Data and AI platform, on that data ops pipeline. That meant the ingestion of those several hundred sources of data had to be blazingly fast, and refreshed very, very quickly. We also had to aggregate data from the outside, from external sources that had to do with weather-related events and political events,
social media feeds, and so on, and overlay that on top of our map of interest with regard to our supply chain sites and where they were supposed to deliver. We also have capabilities to keep track of those shipments as they flow, and have that data flow back as well, so that we know exactly where things are. This is only possible because we had a streamlined data ops capability and we had built this Central Data and AI platform for IBM. Now, flip over to the COVID-19 situation. When COVID-19 emerged and we began to realize that this was going to be a significant pandemic, what we were able to do very quickly was overlay the COVID-19 incidents on top of our sites of interest, as well as pick up what was being reported about those sites of interest, and provide that to our business continuity team. So this became an immediate exercise that we embarked on, and it wouldn't have been possible if we didn't have the foundation of the data ops pipeline, as well as that Central Data and AI platform, in place to help us do that very quickly and adapt.
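A simplified sketch of the overlay Bhandari describes: join an external incident feed against supply-chain sites by region, and flag suppliers whose local incident counts are spiking. The data, the region names, and the alert threshold are invented for illustration.

```python
# Overlay external incident data on supply-chain sites and flag
# suppliers whose regions show fast-rising incident counts.

suppliers = [
    {"name": "Supplier A", "region": "lombardy"},
    {"name": "Supplier B", "region": "bavaria"},
]

# External feed: incident counts by region over two reporting periods.
incidents = {"lombardy": [120, 340], "bavaria": [15, 18]}

GROWTH_ALERT = 2.0  # alert if incidents more than doubled period-over-period

for s in suppliers:
    prev, curr = incidents.get(s["region"], [0, 0])
    if prev and curr / prev >= GROWTH_ALERT:
        print(f"risk alert: {s['name']} in {s['region']} "
              f"({prev} -> {curr} incidents); trigger backup planning")
```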
>> So what I really like about this story, and something I want to drill into, is that a lot of organizations have a really tough time operationalizing AI, infusing it, to use your word, and the fact that you're doing it is really a good proof point that I want to explore a little bit. There were a number of aspects to what you just described. There was the data quality piece: your data quality, in theory anyway, is going to go up with more data, if you can handle it. And the other was speed, time to insight, so you can respond more quickly. In this COVID situation, if you're days behind, or weeks behind, which is not uncommon, sometimes even worse, you just can't respond; things change daily, sometimes within the day. So is that right? That's kind of the business outcome and objective that you guys were after?

>> Yes. So the hallmark of infusing AI into your business processes, the one metric that we focus on, is end-to-end cycle time. You take that process, the end-to-end process, and you try to reduce the end-to-end cycle time by several factors, several orders of magnitude. And there are some examples of things that we did. For instance, in my organization, there's the generation of metadata, which is data about data, and that's usually a very time-consuming process. We've reduced that by over 95%, by using AI to actually help with the metadata generation itself. And that's applied now across the board for many different business processes that IBM has. It's the same kind of principle: that foundation essentially enables you to go after that cycle-time reduction right off the bat. So when you get to a situation like the COVID-19 situation, which demands urgent action, your foundation is already geared to deliver on it.

>> So I think actually we might have a graphic, and then a second graphic. Guys, if you could bring up the second one. So this is maybe not a COVID use case; here it is. That 95% reduction in cycle time, the improvement in data quality that we talked about, and actually some productivity metrics, right? This is what you're talking about here in this metadata example, correct?

>> Yes, the metadata. It's so central to everything that one does with data. It's basically data about data, and this is really the business metadata that you're talking about. Once you have data in your data lake, if you don't have business metadata describing what that data is, then it's very hard for people who are trying to do things to determine whether they can, or whether they even have access to the right data. And typically this process has been done manually: somebody looks at the data, looks at the fields, and describes them, and it can easily take months. What we did was essentially use a deep learning and natural language processing approach to look at all the data that we've had historically at IBM, and we automated the metadata generation. So whether it was data relevant for COVID-19, or for the supply chain, or for the receivables process, any one of our business processes, this is one of those fundamental steps that one must go through to get data ready for action. And if you can take the cycle time for that step and reduce it by 95%, you can imagine the acceleration.
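IBM's deep-learning approach is not reproduced here; this small standard-library sketch only shows the shape of automated business-metadata generation: mapping raw column names to glossary terms instead of having a person describe every field by hand. The glossary and column names are hypothetical.

```python
# Suggest a business glossary term for each raw column name using
# fuzzy string matching from the Python standard library.
from difflib import get_close_matches

business_glossary = ["customer identifier", "invoice amount", "ship date"]

def suggest_term(column_name):
    """Suggest a business term for a raw column name, if any is close."""
    normalized = column_name.replace("_", " ").lower()
    matches = get_close_matches(normalized, business_glossary, n=1, cutoff=0.5)
    return matches[0] if matches else None

for col in ["cust_identifier", "invoice_amt", "flag_x"]:
    print(col, "->", suggest_term(col))
# cust_identifier -> customer identifier
# invoice_amt     -> invoice amount
# flag_x          -> None (left for a human to describe)
```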
>> Yeah, and like you were saying before, when you talk about the end-to-end concept, you're applying systems thinking here, which is very, very important, because a lot of the clients I talk to are so focused on one metric, maybe optimizing one component of that end-to-end, but it's really the overall outcome that you're trying to achieve. You may sometimes be optimizing one piece, but not the whole. So that systems thinking is very, very important, isn't it?

>> Systems thinking is extremely important overall, no matter where you're involved in the process of designing the system. But if you're the data person, it's incredibly important, because not only does it give you an insight into the cycle-time reduction, it also gives you clues as to what standardization is necessary in the data, so that you're able to support the eventual outcome. A lot of people will go down the path of data governance and the creation of data standards, and you can easily boil the ocean trying to do that. But if you actually start with an end-to-end view of your key processes, and by extension the outcomes associated with those processes, as well as the user experience at the end of those processes, and then work backwards to the standards you need for the data that's going to feed into all that, that's how you arrive at a viable, practical data standards effort that you can essentially push forward. So there are multiple aspects where taking that end-to-end system view helps the chief data officer.

>> One of the other tenets of data ops is really the ability for everybody across the organization to have visibility; communications is very key. We've got another graphic that I want to show around the organization and the right regime. It's a complicated situation for a lot of people, but it's imperative to bring the right stakeholders together and actually identify the individuals who are going to participate, so that there's full visibility and everybody understands what their roles are; they're not in silos. So, guys, if you could show us that first graphic, that would be great. But talk about the organization and the right regime there, Inderpal.

>> Yes, I believe what you're going to show is actually my organization, but I think it's very illustrative of what one has to set up to be able to pull off the kind of impact that I talked about. So let's say we have that Central Data and AI platform that's driving the entire enterprise, and you're infusing AI into key business processes like the supply chain, then creating applications like the operational risk insights that we talked about, which extend to a fast-emerging and changing situation like COVID-19. You need an organization that obviously reflects the technical aspects, so you have data engineering and AI; in my case, there's a lot of emphasis around deep learning, because that's one of those skill-set areas that's really quite rare, and also very, very powerful. So those are the major technology arms. There's also the governance arm that I talked about: you have to produce the set of standards, and implement and enforce them, so that you're able to turn this into impact. But then there's also an adoption group that reports in to me, a very empowered group, which essentially has to bring the rest of the organization along. The key to their success has been that empowerment: they find like-minded individuals in our key business processes who are also empowered, and if they agree, they just move forward and go do it, because we've already provided the central capabilities. By central, I don't mean it's all in one location; we're completely global, and it's a hybrid multicloud setup. But it's central in the sense that it's one source to come to for trusted data, as well as for the expertise that you need from an AI standpoint to be able to move forward and deliver the business outcomes. So when those business teams come together with the adoption team, that's where the magic happens. Another aspect of the organization that's critical is the Data Officer Council that I chair, which is made up of the chief data officers of the individual business units that we have. They're kind of my extended team in the rest of the organization, and we leverage that council both for adoption of the platform and for defining and enforcing standards. It helps them do it.

>> I want to come back and talk a little bit about business resiliency. People have probably seen the news that IBM is providing supercomputer resources to the government to fight the coronavirus, and you've also just announced that some RTP folks are helping first responders and nonprofits, providing capabilities at no charge, which is awesome. I mean, it's the kind of thing; look, I'm sensitive that companies like IBM don't want to appear to be ambulance-chasing in these times. However, IBM and other big tech companies are in a position to help, and that's what you're doing here. So maybe you could talk a little bit about what you're doing in this regard, and then we'll tie it up with business resiliency and the importance of data.

>> Right, right.
So, I explained the operational risk insights application that we had been using internally, primarily to assess the risks to our supply chain from various events and then react very quickly to those events so you can manage the situation. Well, we realized that this is something that several NGOs could use, because they have to manage many of these situations, like natural disasters. So we've made that same capability available to NGOs, to help them streamline their planning and their thinking. And by the same token, you talked about COVID-19: with the COVID-19 data overlaid on top, that same capability essentially becomes a business continuity, planning, and resilience tool. Let's say I'm a supply chain officer. I can look at incidents of COVID-19, I know who my suppliers are, and if I can see the incidents going up near a supplier, it's likely to be affected: let me move ahead and start making backup plans, just in case it reaches a crisis level. On the other hand, if you're somebody in revenue planning, on the finance side, and you know where your clients and customers are located, then again, by having that information overlaid on those sites, you can make your own judgments and your own assessments. So that's how it translates into business continuity and resilience planning. And internally, we are now doing that for every department; we're providing them this capability, building rapidly on what we've already done, as we get insight into what each of those departments does with that data. Because once they see that data, and once they overlay it with their sites of interest, and this is anybody and everybody in IBM, because no matter what department they're in there are going to be sites of interest that are affected, they have an understanding of what those sites of interest mean in the context of the planning they're doing, and so they'll be able to make judgments. As we get a better understanding of that, we will automate those capabilities more and more for each of those specific areas. And then you're talking about a comprehensive, AI-driven approach to business continuity and resilience planning in the context of a large organization like IBM, which obviously will be of great interest to our enterprise clients and customers.

>> One of the things that we're researching now is trying to understand what about this crisis is going to be permanent. Some things won't be, but we think many things will be; there are a lot of learnings. Do you think that organizations will rethink business resiliency in this context, that they might sub-optimize profitability, for example, to be more prepared for crises like this with better business resiliency? And what role would data play in that?

>> So, it's a very good and timely question, Dave. Clearly, people have understood that with regard to a pandemic like this, the first line of defense is not going to be so much on the medicine side, because the vaccine is not even available, and won't be available for a period of time;
it has to go through its process. So the first line of defense is actually the kind of containment approach we've seen play out across the world, and that, in effect, results in an impact on the business, on the economic climate, and on businesses there's an impact. I think people have realized this now, and they will factor it into how they do business. To your question about what becomes permanent: I think it's going to become one of those things where, if you're a responsible enterprise, you're going to be leaning forward; you're going to know how to implement this on the second go-round. So obviously you put those frameworks and structures in place, and there will be a certain cost associated with them, and one could argue that would eat into profitability. On the other hand, what I would say is that because these are fast-emerging, fluid situations where you have to respond very, very quickly, you will end up laying out a foundation pretty much like we did, which enables you to really accelerate your pipeline. The data ops pipelines we talked about have a lot of automation, so that you can react very quickly: data ingestion done very rapidly, that metadata generation, the entire pipeline that we're talking about, so that you're able to bring in new data, aggregate it at the right levels, infuse it into the workflows, and deliver it to the right people at the right time. That will become a must. And once you do that, you could argue there's a cost associated with doing it, but we know the cycle-time reductions that result. I gave you the example of 95%, and on average we see about a 70% end-to-end cycle-time reduction where we've implemented the approach, and that's been pretty pervasive within IBM across the business. So that, in essence, actually becomes a driver for profitability. So yes, this might back people into doing it, but I would argue that it's probably something that's going to be very good long-term for the enterprises of the world; they'll be able to leverage it in their business, and the competitive dynamic of having to do it will force everybody down that path. I think everybody will eventually get there.

>> That end-to-end cycle-time compression is huge, and I like what you're saying, because it's not just a reduction in the expected loss during a crisis; there are other residual benefits to the organization. Inderpal, thanks so much for coming on theCUBE and sharing this really interesting and deep case study. I know there's a lot more information out there, so I really appreciate your time.

>> My pleasure.

>> All right, thanks everybody for watching. This is Dave Vellante for theCUBE, and we will see you next time.
Michael Foster & Doron Caspin, Red Hat | KubeCon + CloudNativeCon NA 2022
(upbeat music) >> Hey guys, welcome back to the show floor of KubeCon + CloudNativeCon '22 North America from Detroit, Michigan. Lisa Martin here with John Furrier. This is day one, John, of theCUBE's coverage. >> theCUBE's coverage. >> theCUBE's coverage of KubeCon. Try saying that five times fast. Day one; we have three wall-to-wall days. We've been talking about Kubernetes, containers, adoption, cloud adoption, app modernization all morning. We can't talk about those things without addressing security. >> Yeah, in this segment we're going to hear about container and Kubernetes security for modern applications, 'cause the enterprises are moving there. And this segment with Red Hat's going to be important, because they are the leader in the enterprise when it comes to open source and Linux. So this is going to be a very fun segment. >> Very fun segment. Two guests from Red Hat join us. Please welcome Doron Caspin, Senior Principal Product Manager at Red Hat. Michael Foster joins us as well, Principal Product Marketing Manager and StackRox Community Lead at Red Hat. Guys, great to have you on the program. >> Thanks for having us. >> Thank you for having us. >> It's awesome. So, Michael, the StackRox acquisition's been about a year. You got some news? >> Yeah, 18 months. >> Unpack that for us. >> It's been 18 months, yeah. StackRox, in 2017, originally shifted to be the Kubernetes-native security platform. That was our goal, that was our vision. Red Hat obviously saw a lot of power in that mission statement, and they bought us in 2021. Pre-acquisition we were looking to create a cloud service. Originally we ran on Kubernetes platforms, we had an operator and things like that. Now we are looking to bring customers into our service preview for ACS as a cloud service. That's very exciting. >> The security conversation is top notch right now; it's at an all-time high. You can't go anywhere without talking about security. And specifically in the code, we were talking before we came on camera, the software supply chain is real. It's not just about verification. Where do you guys see the challenges right now? Containers: even scanning them is not good enough. First of all, you've got to scan them, and that may not be good enough. Where are the security challenges, and where's the opportunity? >> I think a little bit of it is a new way of thinking. The speed of security actually does make you more secure. We want to keep our images fresh and updated, and we also want to make sure that the open source and the different images we're bringing in are secure. Doron, I know you have some things to say about that too; he's been working tirelessly on the cloud service. >> Yeah, I think the one thing is, you need to trust your sources. Even in the open source world, you don't want to copy-paste libraries from the web. And most of our customers are using third-party vendors and getting images from different locations, so we need to trust our sources. And even if you have a really good scanning solution, you can't always trust it alone; you need to have a good solution for that. >> And you guys have news: you're announcing the Red Hat Advanced Cluster Security Cloud Service. >> Yes. >> What is that? >> So we took StackRox, and we took the opportunity to make it a cloud service, so customers can consume the product as a cloud service, as a starting offering. Customers can buy it through the Amazon Marketplace and, in the future, the Azure Marketplace.
So customers can use it for EKS and AKS, and also, of course, OpenShift. We are not just for OpenShift; we also provide support for EKS and AKS, so we provide the capability to secure the whole cloud posture. We know customers are not only on OpenShift or only on EKS; we have both, across clouds. So we are open. >> So it's not just OpenShift; it's Kubernetes environments, all together. >> Doron: All together, yeah. >> Lisa: Meeting customers where they are. >> Yeah, exactly. And we focus on: we are not trying to boil the ocean or solve the whole cloud security posture. We try to solve Kubernetes cluster security. It's very unique, and it needs a unique solution. It's not just added value in a cloud security solution; we think it's something special for Kubernetes, and this is what Red Hat is aiming to solve. >> And the ACS platform really doesn't change at all; it's just how they're consuming it. It's a lot quicker in the cloud. Time to value is right there: as soon as you start up a Kubernetes cluster, you can get started with the ACS cloud service and get going really quickly. >> I'm going to ask you guys a very simple question, but I heard it in the bar in the lobby last night: practitioners talking, and they were excited about the Red Hat opportunity. They actually asked, where do I go to get some free Red Hat to test some Kubernetes out and run Helm or whatever? They want to play around. Do you guys have a program for someone to get started for free? >> Yeah, so for the cloud service specifically, we're going into service preview, so if people sign up, they'll be able to test it out and give us feedback. That's what we're looking for. >> John: Is that a sandbox, or is that going to be in the cloud? >> They can run it in their own environment, so they can sign up. >> John: Free. >> Doron: Yeah, free. >> For the service preview, all we're asking for is customer feedback. And I know it's actually getting busy there; it's starting in December, so the quicker people are, the better. >> So, my friend in the lobby I was talking to: I told you it was free. I gave you the sandbox, but check out the cloud too. >> And we also have the open source version, so you can download it and use it. >> Yeah, people want to know how to get involved. I'm getting a lot more folks coming to Red Hat from the open source side who want to get their feet wet; a lot of people are really interested. That's a real testament to the product leadership. Congratulations. >> Yeah, thank you. >> So what are the key challenges that you have on your roadmap right now? You've got the product out there; what's the current state? Can you scope the adoption? Can you share where we're at, what people are doing specifically, and the real challenges? >> I think one of the biggest challenges is talking with customers with a slightly, I don't want to say outdated, but an older approach to security. You hear things like malware pop up, and it's like, well, really what we should be doing is keeping things to low and medium vulnerabilities, looking at the configuration, and managing risk accordingly. With disparate security tools, or different teams doing various things, it's really hard to get a security picture of what's going on in the cluster. Those are some of the biggest challenges that we talk with customers about. >> And in terms of resolving those challenges: you mentioned malware; we talk about ransomware.
It's a household word these days. It's no longer, are we going to get hit? It's when, what's the severity, how often. How are you guys helping customers dial down some of the risk that's inherent and only growing these days? >> Yeah, risk is a tough word to generalize, but our whole goal is to give you as much security information, in a way that's consumable, so that you can evaluate your risk, set policies, and then enforce them early on in the cluster, or early in the development pipeline, so that your developers get the security information they need, hopefully asynchronously. That's the best way to do it: it's nice and quick. But I don't know, Doron, do you want to add to that? >> Yeah. So we know that ransomware, again, is a big worry for everyone, and we understand the boundaries of what we want to protect. We think it's about policies and where we enforce them. And as we discussed before, you can scan the image, but we never know what is in it until you really run it. So one of the things that we provide is runtime scanning: you can have policies in runtime and enforce things in runtime. Even if one image got past you, reached your cluster, and is running somewhere, we can stop it in runtime. >> Yeah. And even with runtime enforcement, the biggest thing we have to educate customers on is that that's the last-ditch effort. We want to get these security controls in as early as possible; that's where the value is going to be. So we don't want to be blocking things from getting to staging six weeks after developers have been working on a project. >> I want to get you guys' thoughts on developer productivity. We had Docker's CEO on earlier, and since then a couple of people have messaged me: love the vision of Docker, but Docker Hub has some legacy, and it might not have the kind of adoption some people think it does. Are people moving because they want to have their own places? Is there one place, or maybe there is? How do you guys see the movement from, say, Docker Hub to just using containers? What's the vis-a-vis competition? >> I mean, working with open source, with Red Hat, you have to meet the developers where they are. If your tool isn't cutting it for developers, they're going to find a new tool, and really they're the growth engine of a lot of these technologies. So again, I don't want to speak about Docker or what they're doing specifically, but I know that they pretty much kicked off the container revolution and got this whole thing started. >> A lot of people are using your environment too; we're hearing a lot of uptake on the Red Hat side. This is open source; it all sorts itself out in the end, like you said. But you guys are getting a lot of traction there. Can you share what's happening? >> I think one of the biggest things from a developer experience that I've seen is the universal base image that people are using. I can speak from a security standpoint: it's awesome that you have a base image where you can make one change, or fix one issue, and it can impact a lot of different applications. That's one of the big benefits that I see in adoption.
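Caspin's point above about policies and where you enforce them can be sketched abstractly: evaluate a workload's attributes against declarative rules, and decide whether to block at deploy time or alert at runtime. The rule format below is invented for illustration; it is not ACS's actual policy schema.

```python
# Evaluate a workload against declarative security policies, each with
# its own enforcement action (block vs. alert).

policies = [
    {"name": "no critical CVEs",
     "check": lambda w: w["critical_cves"] == 0,
     "action": "block"},
    {"name": "no :latest tag",
     "check": lambda w: not w["image"].endswith(":latest"),
     "action": "alert"},
]

def evaluate(workload):
    for p in policies:
        if not p["check"](workload):
            print(f"{p['action'].upper()}: {workload['name']} "
                  f"violates '{p['name']}'")

evaluate({"name": "payments-api",
          "image": "registry.example/pay:latest",
          "critical_cves": 2})
# BLOCK: payments-api violates 'no critical CVEs'
# ALERT: payments-api violates 'no :latest tag'
```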
>> What are some of the business outcomes? You talked about faster time to value, obviously, and being able to get security shifted left, from a control perspective. But if I'm a business, if I'm a telco, or a healthcare organization, or a financial organization, what are some of the top-line benefits that this can bubble up to impact? >> I mean, for me, with those two providers, compliance is a massive one. And just having an overall look at what's going on in your clusters and your environments, so that when audit time comes, you're prepared and can get through it extremely quickly. And then as well, when something inevitably does happen, you can get a good picture of all of it. Let's say a Log4Shell happens: you know exactly which clusters are affected. The triage time is a lot quicker, developers can get back to developing, and, yeah, you can get through it. >> One thing that we see with customers: compliance is huge. >> Yes. >> And the old way was, okay, I will provision a cluster, and I will do scans and find things, but I need to do that for PCI DSS, for example. Today the customer wants to provision a PCI DSS cluster in advance. So you need to do the compliance before you provision the cluster, and have all the configuration already baked in for PCI DSS, or HIPAA compliance, or FedRAMP. And this is where we try to use our compliance tooling. We have tools for compliance today on OpenShift and other clusters and other distributions, but you can do this in advance, before you even provision the cluster. And we also have tools to enforce it after that, after you provision, but you have to do it both before and after to make it feasible. >> Advanced cluster management and the compliance operator really help with that. That's why OpenShift Platform Plus as a bundle is so popular: just being able to know that when a cluster gets provisioned, it's going to be in compliance with whatever the healthcare provider is using. And then you can automatically have ACS pop up as well, so you know exactly what applications are running, and you know it's in compliance. I mean, that's the speed. >> You mentioned the word operator; that's a triggering word for me now, because the operator role is changing significantly in this next wave coming, because of the automation. They're operating, but they're also devs too. They're developing and composing. It's almost like a dashboard of Lego blocks. The operator's not just manually racking and stacking like the old days, I'm oversimplifying it, but the new operators are running stuff: they've got observability, they've got coding, they're servicing policy. There's a lot going on. There are a lot of knobs. Is it going to get simpler? How do you guys see the org structures changing to fill the gap on what should be very simple: turn some knobs, operate at scale? >> Well, when StackRox originally got acquired, one of the first things we did was put ACS into an operator, and it actually made the application lifecycle so much easier. It was very easy in the console to go and say, hey, I want ACS on my cluster, click it, and it would get provisioned. New clusters would get provisioned automatically. So underneath, it might get more complicated, but in terms of the application lifecycle, operators make things so much easier.
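Stepping back to Doron's "compliance before you provision" point, it can be sketched as a toy gate that validates a proposed cluster spec before anything is created. The rules below are illustrative stand-ins, not the actual PCI DSS, HIPAA, or FedRAMP control sets; in practice, the compliance operator and advanced cluster management encode this for you.

```python
# Toy pre-provision compliance gate. The baseline here is a made-up
# stand-in for a real control set; the point is that the check runs
# *before* the cluster exists, not after.
BASELINE = {
    "etcd_encryption": True,        # secrets encrypted at rest
    "audit_logging": True,          # keep an audit trail for the auditors
    "public_api_endpoint": False,   # no internet-facing control plane
}

def check_spec(spec):
    violations = []
    for key, required in BASELINE.items():
        if spec.get(key) != required:
            violations.append(f"{key}: expected {required}, got {spec.get(key)!r}")
    return violations

proposed = {"etcd_encryption": True, "audit_logging": False, "public_api_endpoint": True}
for problem in check_spec(proposed):
    print("blocked before provisioning:", problem)
```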
>> And of course I was lucky enough, with Lisa, to see Project Wisdom at AnsibleFest. You're going to say, hey, Red Hat, spin up the clusters, and it'll just magically be voice-activated. We're starting to see AI come in. So again, the operator has got a dev vibe and an SRE vibe, but it's not that direct. Something's happening there that we're trying to put our finger on. What do you guys think is happening? What's the real story? What's the action? What's transforming? >> That's a good question. I think, in general, things just move to the developers all the time. I mean, we talk about shift-left security; everything's always going that way. Developers are handling everything. I'm not sure exactly. Doron, do you have any thoughts on that? >> Doron, what's your reaction? It's okay, say what you want. >> So I spoke with one of our customers yesterday, and they said that in the last few years they've developed tons of code just to operate their infrastructure. Five or six years ago, when a developer wanted a VM, it would take a week to get one, because they needed all the approvals, and someone needed to actually provision the VM on VMware. Today they've automated it all, end to end, and it takes two minutes for a developer to get a VM. So operators are becoming developers, as you said: they develop code, and they make infrastructure as code, infrastructure as operators, to make it easier for the business to run. >> And then, if you also add in DataOps, AIOps, SecurityOps, that's the new IT. The new IT seems to be the stuff that's scaling: a lot of data's coming in, and you've got security. So all of that has to be brought in. How do you guys view that in the equation? >> Oh, I mean, you become big generalists. I think there's a reason why those cloud security or cloud professional certificates are becoming so popular. You have to know a lot about all the different applications, be able to code it, automate it, like you said, hopefully everything as code. And then it also makes it easy for security tools to come in, look, and examine where the vulnerabilities are, when those things are as code. So because you're going and developing all this automation, you do become, let's say, a generalist. >> We've been hearing on theCUBE, and we've been hearing in the industry, about burnout associated with security professionals and some DataOps folks, because of the tsunami of data, the tsunami of breaches, a lot of engineers getting called in the middle of the night. So that's not automated. This has got to get solved quickly, scaled up quickly. >> Yes. There's a two-part question there. In terms of the burnout aspect, you'd better send some love to your security team, because they only get called when things get broken, and when they're doing a great job, you never hear about them. So I think that's one of the things: it's a thankless profession. On the second part: if you have the right tools in place, so that when something does hit the fan and break, you can make an automated or a specific decision upstream to change it, then things become easy. It's when the tools aren't in place, and you have disparate environments, that when a Log4Shell or something like that comes in, you're scrambling, trying to figure out what clusters are where and where you're impacted.
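That Log4Shell scramble is, at bottom, an inventory query: which clusters are running an affected artifact? A toy sketch, with made-up data and a deliberately simplified version set (the real affected ranges are wider), of what triage looks like when the inventory already exists:

```python
# Toy triage query over a workload inventory. In practice the inventory
# comes from your security platform; both the data and the version set
# here are made up and simplified.
AFFECTED_LOG4J = {"2.0", "2.14.1", "2.15.0"}  # simplified; real CVE ranges are wider

inventory = [
    {"cluster": "prod-east", "workload": "payments", "log4j": "2.14.1"},
    {"cluster": "prod-west", "workload": "search",   "log4j": "2.17.1"},
    {"cluster": "staging",   "workload": "payments", "log4j": "2.15.0"},
]

impacted = [entry for entry in inventory if entry["log4j"] in AFFECTED_LOG4J]
for entry in impacted:
    print(f"{entry['cluster']}/{entry['workload']}: log4j {entry['log4j']} -- remediate")
```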
>> Point of attack, remediate fast. That seems to be the new move. >> Yeah. And you do need to know exactly what's going on in your clusters, and how to remediate it quickly, how to get the most impact with one change. >> And that makes sense. The surface area is expanding, more things are being pushed, so things will happen, whether it's a zero-day vulnerability or just an attack. >> It's a mix, yeah. Customers automate all of their things, but it's good and bad. Some customers told us, I think Spotify lost a whole zone because of one mistake, because they automate everything, and you make one mistake... >> It scales the failure, really. >> Exactly. It scaled the failure really fast. >> That was actually a few years back, I think four years ago. They talked about it. It was a great learning experience. >> A double-edged sword there. >> Yeah. So definitely we need to scale automation, and test the automation too; you need to run the drills around the data. >> Yeah, you have to know the impact. There's a lot of talk in the security space about what you can and can't automate. And by default, when you install ACS, everything is non-enforced. You have to have admission control. >> How are you guys seeing your customers? Obviously Red Hat's got a great customer base. How are they adopting the managed service wave that's coming? People like managed services now, because they may have skills-gap issues, so managed services are becoming a big part of the portfolio. What's your guys' take on the managed services piece? >> It's just time to value. You're developing a new application; you need to get it out there quick. If somebody, your competitor, gets out there a month before you do, that's a huge market advantage. >> So you don't care how you got there. >> Exactly. And we've had so much Kubernetes expertise over the last 10 or so years, well, Kubernetes for seven-plus years at Red Hat, that why wouldn't you leverage that knowledge internally, so you can get your application out? >> Why change your toolchain and your workflows? Go faster, and take advantage of the managed service, because it's just about getting from point A to point B. >> Exactly. >> Well, time to value, you mentioned it, is not a trivial term; it's not a marketing term. There's a lot of impact that can be made. Organizations that can move faster, that can iterate faster, that develop what their customers are looking for, have that competitive advantage. It's definitely not something that's trivial. >> Yeah. And working in marketing, whenever you get that new feature out and I can go and chat about it online, it's always awesome. You always get customer interest. >> Pushing new code, being secure. What's next for you guys? What's on the agenda? What's around the corner? We'll see a lot of Red Hat at re:Invent. Obviously your relationship with AWS is as strong as ever as a company. Multi-cloud is here. Supercloud, as we've been saying; Supercloud is a thing. What's next for you guys? >> So we launched the cloud service with the idea that we'll get feedback from customers. We're not going GA; we're not going to sell it for now. We want to get customers in and get feedback, to make the product the best we can sell and the best we can give our customers. And when we go GA and start selling this product, we will have the best product in the market. That's our goal. We want to get the customer in the loop and get as much feedback as we can. And we're also working very closely with our existing customers to improve the product, to add more and more features that customers need. It's all about the supply chain, I don't like to say it, but it's all about making things more automated and easier for our customers, so they have security in their Kubernetes environment. >> So where can your customers go? Clearly, you've made a big impact on our viewers with your conversation today.
Where are they going to be able to go to get their hands on the release? >> So you can find it online. We have a website to sign up for the program. It's on my blog; we have a blog out there for the ACS cloud service. You can just go there, sign up, and we will contact you. >> Yeah. And there's another way, if you ever want to get your hands on it, and you can do it for free: open source StackRox. The product is completely open source, and I would love feedback in the Slack channel. We also get a ton of feedback from people who aren't actually paying customers, and they contribute upstream, so that's an awesome way to get started. But like you said, search for the ACS cloud service and the service preview. You don't have to be a Red Hat customer; if you're running a CNCF-compliant Kubernetes version, we'd love to hear from you. >> All open source, all out in the open. >> Yep. >> Getting it available to the customers, the non-customers, the hopefully-pending customers. Guys, thank you so much for joining John and me, talking about the new release and the evolution of StackRox over the last 18 months. A lot of good stuff here. I think you've done a great job of getting the audience excited about what you're releasing. Thank you for your time. >> Thank you. >> Thank you. >> For our guests, and for John Furrier, Lisa Martin here in Detroit, at KubeCon + CloudNativeCon North America. Coming to you live; we'll be back with our next guest in just a minute. (gentle music)
Ash McCarty, Dell Technologies & Josh Prewitt, Rackspace Technology | VMware Explore 2022
(modern music) >> Welcome back, everyone, to theCUBE's live coverage here in San Francisco for VMware Explore, formerly VMworld. theCUBE's been here 12 years today; we've been watching the evolution of the user conference. It's been quite a journey to see, and, you know, virtualization just exploded. We've got two great guests here, and we're going to break it all down. Ash McCarty, director of Multicloud Product Management at Dell Technologies, no stranger to VMworld, now VMware Explore, and Josh Prewitt, Chief Product Officer at Rackspace Technology. Great to see you guys, thanks for coming on. >> Absolutely. >> Yeah, thanks so much, thanks for having us. >> So, you know, the theme this year is multicloud, but it's really all about vSphere 8 being out. You've got VxRail, you've got containers, you've got the magic going on around cloud native, which really points to the future state of where this is going: agile enterprises, infrastructure as code, high performance under the hood. I mean, all the things that you guys have been doing for many years and decades in business, but now, with VMware putting it all together, it feels like, this year, you've got visibility into the value proposition. People have clear line of sight into where the performance comes from in the hardware and software, and now the cloud. It's kind of coming together; it feels like it's coming together. Let's talk about that and the relationship between you guys: Rackspace, Dell, and VMware. >> Perfect. That sounds great. Well, thanks so much for having us. You know, I'll sort of kick that off. We've got a huge lifelong partnership and relationship with Dell and VMware, and the technologies that these guys create, which we're able to put in front of our customers, are really what allow us to go drive those business outcomes. So, yeah, happy to dive into it. >> Yeah, and to add to that, we understand that customers have a tremendously complex challenge ahead of them in managing their infrastructure. That's why, with VxRail, we have intelligent infrastructure. We want it to simplify the outcomes for customers, no matter if they're managing VMware or managing the actual hardware infrastructure underneath it. >> Yeah, one of the things that we always talk about, you know, you read about it on the blogs and the news and in the startup world, is "Oh, product-market fit," and, well, it kind of applies here, if you think about what's going on on the product side, with the Edge emerging, hybrid cloud on pace with private cloud, and obviously cloud native is great too, if you have native applications in there. But now, putting it all together, you're hearing things like the telco cloud, I hear buzzwords like that, I hear supercloud, which we're promoting, which you see in companies becoming clouds themselves, with the CapEx being handled by either public cloud or optimized on-premise or hosted hardware. I mean, this is now not all about everything going to the cloud; this is now cloud operations on premise and in hosted hardware. So I'd love to get your perspective on that, because you guys are huge in hosting. You've got huge experience there, modernizing all the time. What does the modern era look like for the customer? >> Yeah, yeah, so, I mean, I think it's very clear to everybody that it's a multicloud world, right? I think the main question is, are you multicloud as a strategy, or are you multicloud as a situation? Because everybody's multicloud. That ship has sailed, right? >> Yeah, exactly.
>> And so, when I look at the capabilities that we have with the partnership with Dell and the VxRail technologies: you know, the life-cycle management that you have to go and perform across your fleet can be extremely difficult, and whenever you take something like VxRail, where you have the hardware and the software all fully integrated, it makes it much easier to do life-cycle management. So for a company like Rackspace, where we have tens of thousands of nodes that we're managing for customers across 29 global data centers, and we're all over the place, the ability to have that strength with Dell's hardware and the VMware platform's improved life-cycle management makes it so much easier for us to manage our fleet and deliver those outcomes even faster for customers. >> So assuming that VxRail isn't a virtual railroad that delivers data to Rackspace data centers, if it's not that, what is it, Ash? Give us a little primer on what VxRail is. >> Well, VxRail is the first and only jointly engineered HCI system with VMware, so everything we do with VMware is better. >> So hyperconverged infrastructure. >> Hyperconverged infrastructure. >> What we used to call a server, because all the bits are in the box, right? >> All the storage and compute is in there. >> Everything's in there. Right. >> It simplifies management. And with the VxRail HCI system software, which is really our secret sauce, we built in those automation capabilities with VMware, so it allows you to scale out very quickly, scale up very quickly. And one of our big capabilities is our life-cycle management, which is full stack, meaning it life-cycles the entire vSphere stack as well as the hardware infrastructure underneath as one continuously validated state, meaning that customers can focus more on their business outcomes and driving their business forward, versus spending time managing their infrastructure. >> And when you talk about customers, it's also the value proposition that's flowing through Rackspace. Because, Rackspace, when you install these systems, how long does it take to spin up, to have a VM available for use, when you install one of these systems? >> Oh, you can have the system up and running very quickly. We automate all the day-one deployment, so you can have the system up and running in your labs, in your data centers, in 45 minutes, and you can have VMs up and provisioned very shortly after that. >> So what do you do with that kind of agility? >> Oh my gosh, we've actually taken that, we've taken the VxRail platform, and we've created what we call Rackspace Services for VMware Cloud. This is our platform that's based on VxRail and on vCloud Director from VMware, and because the VxRail is already RackStacked, ready to go for our customers, we're able to sign a customer up today and then, within a matter of minutes, give them access to a vCloud Director portal where they can go in and spin up a new VM anytime they want. But then it also integrates into all of those cloud management platforms and tools, right? It integrates into your Terraform, so you've got your full CI/CD pipeline, and so you have that full end-to-end capability. If you want to go click around on a portal, you can, using vCloud Director and vSphere and all that great stuff. If you want to automate it, you can do that too. And we do it all on the back of that VxRail hyperconverged infrastructure.
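For flavor, here's a hedged sketch of the self-service flow Josh describes: requesting a VM programmatically instead of clicking through the portal. The endpoint path and payload below are hypothetical placeholders, not the real vCloud Director API; the point is only that the same call slots into a Terraform provisioner or a CI/CD job.

```python
# Hypothetical sketch of programmatic VM provisioning against a
# vCloud Director-style service. The URL, path, and payload fields
# are placeholders invented for illustration -- consult the real
# vCloud Director API documentation for actual endpoints.
import requests

BASE_URL = "https://vcd.example.com"  # hypothetical tenant endpoint

def provision_vm(token: str, name: str) -> dict:
    headers = {"Authorization": f"Bearer {token}"}
    payload = {"name": name, "template": "ubuntu-22.04", "cpu": 2, "memory_gb": 8}
    resp = requests.post(f"{BASE_URL}/api/v1/vms", json=payload, headers=headers)
    resp.raise_for_status()
    return resp.json()  # e.g. the new VM's id and power state
```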
>> Talk about the DPU dynamic. We're hearing a lot about DPUs. VxRail, you guys have some HCI-like vibe there with DPUs. How is that impacting performance, as you can see? Because we're hearing a lot of buzz around VxRail and the VMware DPUs really making things much faster. >> I mean, it's the thing we talk about most with customers now: their challenges with scaling their infrastructure. And VxRail is going to be the first and only jointly engineered system that will have vSphere 8 with DPU functionality and the full life-cycle management. What this really empowers customers to do, as they're growing their environments and scaling out their workloads in the data center, is scale to that next generation of networking and network security, and that's what DPUs allow you to do. They give you that offload and that high-performance capability. >> I'd love to get your guys' perspective, while we're just riffing on this, a real quick sidebar for a second: if VxRail has these capabilities, which you guys are promoting it does, and with some of the things going on in the modern era, the next-gen apps are going to look a lot different. We're kind of calling it supercloud, if you will, for lack of a better description. Yeah, multicloud is a state, I agree; it's a situation and a state. But supercloud is really the functionality of what cloud does. So what do you guys see as, maybe it's tea-leaves reading now, or dots connecting, what are some of those next-gen apps? I mean, the Edge is there, with "Oh, the Edge is going to explode," and I can see the Edge having new kinds of apps that we've never seen before, whether it's on-premise building lights and however they work, or IoT changing. What do you guys see as the next-gen apps coming out that won't look the same as now? Or how are apps today changing for next gen? Because you get more performance at the Edge, you get more action, you get more co-locations in geos, so it's clear multicloud, multi-presence is happening too, right? So what are you guys seeing? >> Yeah, I would say two areas that resonate most with customers. One is customers transitioning to their cloud native journey, beginning it and using things like Tanzu for Kubernetes Operations, which we fully support and have a white paper out there on for customers. Another area is really in the AI/ML space, where we've been partnering with both VMware and Nvidia to simplify how customers deploy new AI/ML infrastructure. I mean, it's challenging and complex, and a lot of customers want to dive in, because it really enables them to better operate on the insights and analytics they get from running their business. >> Josh? >> And, you know, I think it really comes down to, whether you want to call it Edge or IoT or, you know, smart things, whatever, right? It all comes down to how we are now expected to capture all of the data to create a better user experience, and that's what we're seeing the modern applications being built around: how do you leverage all of the data that's now at your fingertips, whether it's from wearables, machine vision, whatever it may be, and drive that improved user experience? And so that's the apps that we're seeing now, right?
You know, of course, you still have all your business apps, all your ERP capabilities that need to exist, and all of that great stuff, but at the same time, now, whenever I'm walking into a store and their machine vision picks me up and they're pinging my phone and pushing me push notifications, I expect to have a better user experience. >> And do a database search on you too, by the way. >> Yeah, exactly, right? >> No search warrants out for 'em, you know, you're good. >> That's exactly it, so, you know, you kind of expect that better user experience, and that's where I'm seeing a lot of the new app development. >> Yeah, it's fun; these use cases are intoxicating to think about, all the weird coolness around them. The thing that I want to get your thoughts on is, we were just talking in the analyst session earlier on theCUBE: if DevOps is here and won, which we believe it has, and infrastructure as code is happening, the cloud native discussion, shifting left, the CI/CD pipeline, that's DevOps in my mind, that's cloud native developers, that's like traditional IT in my mind, so that's all part of the coding. DataOps and SecurityOps seem to be the most robust areas of conversation, where that's the new Ops, right? So, I mean, I made the term up, but "new Ops," in terms of the focus: what are you making more efficient? What are you optimizing for? What's your guys' reaction to that? Because all the conversations that we have are about data and security, and the rest seems to be cool, all good on the developer's side. Yeah, shift-left is happening up there, Kubernetes, containers, but all the action on the Ops side seems to be data and security. >> Yeah. >> What's your reaction to that? Is that right? >> So personally, I do think that it's right. I think that, you know, with great power comes great responsibility, right? And the clouds have brought that to us; all of your infrastructure as code has brought that to us. We have that great power now, right? But then you start to see, kind of, the pipeline attacks that are starting to become more and more popular. And so how you secure something that is as complex as, you know, a cloud native development pipeline is really hard, it's really challenging, so I do think that it warrants the attention. Then on the data side, I think that matters because, when I talked about those examples of a better user experience, I don't want my better user experience tomorrow; I don't want it 20 minutes from now. I want that real-time capability, and with that come massive requirements from a compute and hardware perspective, massive requirements from a software perspective, and from, you know, what folks are now calling a DataOps perspective. >> Data addressability: having the data available to be delivered in real time. >> You know, there's been a lot of talk here at the conference about the disaggregation of, you know, the "brainularism," if we're going to make up words, the horsepower that's involved: CPU, DPU, GPU. I'll make up another word. We're familiar with the thermometers used during COVID to measure temperature. Pretend that I've invented a device called a Care-o-meter and I'm pointing it at various people's foreheads: who needs to care about DPUs and GPUs and CPUs? You know, John was referencing the idea of security at the Edge, data. Well, wow, we've got GPUs that can do things. Who needs to care about that? Obviously, we care about it. You care about it. You care about it.
You're building this stuff, you're deploying this stuff, but at what level in the customer stack do they need to care about it? Is Rackspace engaging customers and saying, "Look, here's the value proposition: we understand your mission to be this; we believe we can achieve your mission"? How far down in the organization do you go before you get to someone where you have to have the DPU conversation? Because we didn't even define DPU yet here, which is always offensive to me. >> I think I defined it, actually. >> Did you define DPU? Good. Thank you, John. >> Yeah, yeah. >> But so who should care? Who should really care about that? >> Oh, that's such a complex question, right? Because everybody, Rackspace included... >> But a good one. A good question. >> Oh, it's a great question. >> Thank you. >> Great question. (laughing) >> Everybody, Rackspace included, is talking about selling business outcomes, right? And ultimately, that is what matters: selling those business outcomes to the customer. And so of course we're dealing with our business buyers, who are just looking for "Hey, improve my KPIs, make this run faster, better, stronger, all of that great stuff," but ultimately you get down to an IT staff, and to the IT staff, these things matter, because the IT staff all have budgets that they have to hit. The realities start to hit them, and they can't just go and spend whatever they want, you know, trying to hit the KPIs of the marketing department or the finance department, right? And so you have your business buyers, who care significantly about buying their outcomes, and we have the business-outcomes conversations with them, and then, oftentimes, they'll come back to us and say, "Okay, but now we need you to talk to this person over in our IT organization. We need you to talk with our CIO, with our VP of infrastructure," whatever that may be, where we really get down to the nuts and bolts, and we talk about how we can stretch the hardware coming from Dell, stretch the software coming from VMware, and deliver a higher-caliber experience and a lower TCO by taking advantage of some of the new technologies coming out. >> Yeah, so there's a reason why I asked that awesome question, and it's because I can imagine a scenario where, and this speaks to Rackspace's position in the market today and moving forward, and what your history has been, people want to know, "Well, why should I work with Rackspace instead of some mega-hyper-monster cloud?" If part of the answer is: well, it's because, for very specific application environments, like the healthcare we talked about earlier, that might be a conversation where you're actually bringing in Dell to talk about how you are specifically optimizing hardware and software to achieve things that otherwise can't be achieved with t-shirt sizes of servers in a hyperscale cloud. I mean, is that part of the Rackspace value proposition moving forward, that you can do things like that with partners like Dell that the other folks aren't going to focus on? >> Absolutely, it is, right?
And a lot of the power of Rackspace is that, you know, we're the best-in-class pure-play cloud solutions provider, and we can talk to you about your AWS, your Azure, your GCP, all of that great stuff, but we can also talk to you about private cloud solutions that are built on the backs of Dell Technologies. And in this multicloud world, you don't have one size fits all for every single application. There are some things that run great in a hyperscale provider, and we can help you get there, but exactly like you said, there are these verticals where you have applications that don't necessarily run all that well, or they're not modernized; they haven't been refactored to be able to take advantage of cloud native services. And if all you're going to do is run that on bare metal in VMs, a hosted private cloud is, by far, the best way to do that, right? And Rackspace provides that hosted private cloud on the backs of Dell technology, on the backs of VMware technology, and we can go deliver those custom, bespoke solutions to customers. >> So the infrastructure and the hardware still matter, Ash, yes? >> Absolutely, and I think he just highlighted it. What he does with his customers, and what's important to his internal organization, is being able to deliver faster, better outcomes, to meet the KPIs of those customers consuming their infrastructure at Rackspace. So I think what the DPU and the underlying infrastructure really enable is all that full-stack integration, to allow them to quickly scale to the demands of those customers and what they need in their infrastructure. >> Guys, while we've got you here, what do you think about this year's VMware Explore? There was a lot of anticipation around how many people were going to show up and, you know, all kinds of things around the new name and Broadcom. Big attendance here; I mean, I was very surprised by the size of the attendance and the show floor, the ecosystem. This train is not stopping. I mean, this is VMware's third act, no matter what the contextual situation is. What's your observation of the show? Do you agree, or is there anything you'd want to share for folks who didn't make it, about what they missed? >> Yeah, I mean, it really highlights, I mean, you've seen the breadth of the show; I know people that aren't here, that aren't able to see it, are really missing the excitement. There are a lot of great announcements around multicloud, around vSphere 8 with the DPUs, the vSAN Express Storage Architecture, a ton of new, exciting technologies that are really empowering how customers, you know, the future of how customers are going to consume their workloads in their data centers. >> Josh, they're not short on products and stuff. A lot of moving parts. vSphere 8, a bunch of new stuff. And the cloud native stuff's looking pretty good too, off the tee. >> You know, it does feel like a focus on the core, though, in a way. So I don't think there's been a lot of peripheral noise at the show. Sometimes it's, you know, "And we got this, and this, and this, and this." It's vSphere 8, vSAN 8, cloud software, you know, really hammering it home and refining it. >> But you don't think of it as a little bit of a circus act? I mean, the general keynote was theatrical, I thought. I mean, I thought they did a good job on that. I think vSphere 8 was buried a little bit; I thought they could have... They checked the box at the beginning. >> That's true, that's true.
>> I mean, they mentioned it, but we didn't see the demos. You know? Demos are usually great. But that's my only criticism. >> Well, that's why we supplemented it with the VxRail announcements, right? With our big announcements around vSphere 8 and the DPUs, as well as the vSAN Express Storage Architecture being integrated into VxRail. So I think, you know, it's always that ongoing partnership and, you know, doing what's best for our customers, showing them the next generation and how they consume that technology. >> Yeah, you guys got good props on VxRail. We had a great chat about it yesterday. Rackspace, you guys doing good? Quick update on what's happening with you guys. Give a quick plug. What's going on at Rackspace? What's hot? What's going on? Give a quick plug for what the services are and the products you've got going on there. >> Yeah, absolutely. So we are that end-to-end cloud provider, right? And so we've got really exciting offers in market, helping customers take advantage of all the hyperscale providers, and then giving them that private cloud experience. We've got everything from single-tenant running in our data centers on the backs of vSphere, vCloud Director, and VxRails, all the way through to, like, multi-tenant burstable capability that runs within our own data centers as well. It's a really exciting time for technology, a really exciting time for Rackspace. >> Congratulations, we've been following your journey for a long time. Dell, you guys continue to do a great job, end to end, phenomenal work. The telco thing's a huge opportunity; we didn't even go there. But Ash, thanks. Josh, thanks for coming on. Appreciate it. >> Yeah, thanks so much. Thanks for having us. >> Thank you very much. >> Okay, thanks for watching theCUBE. We're live, day two of three days of wall-to-wall coverage, two sets here in Moscone West on the ground level, in the lobby, checking out all the action. Stay with us for more coverage after this short break. (modern music)
Clint Sharp, Cribl | Cube Conversation
(upbeat music) >> Hello, welcome to this CUBE Conversation. I'm John Furrier, your host, here in theCUBE in Palo Alto, California, featuring Cribl, a hot startup taking over the enterprise when it comes to data pipelining, and we have a CUBE alumni who's the co-founder and CEO, Clint Sharp. Clint, great to see you again. You've been on theCUBE, you were on in 2013. Great to see you, congratulations on the company that you co-founded and lead as the chief executive officer: over $200 million in funding, doing really strong in the enterprise. Congratulations, thanks for joining us. >> Hey, thanks John, it's really great to be back. >> You know, remember our first conversation: the big data wave coming in, Hadoop World 2010. Now the cloud comes in, and really cloud native takes data to a whole nother level. You're seeing the old data architectures being replaced with cloud scale. So the data landscape is interesting. You know, Data as Code, you're hearing that term; data engineering teams are out there; data is everywhere. It's now part of how developers and companies are getting value, whether it's real time or coming out of data lakes. Data is more pervasive than ever. Observability is a hot area; there are a zillion companies doing it. What are you guys doing? Where do you fit in the data landscape? >> Yeah, so what I'd say is that Cribl and our products solve the problem, for our customers, of the fundamental tension between data growth and budget. If you look at IDC's data, data's growing at a 25% CAGR: you're going to have two and a half times the amount of data in five years that you have today. And I talk to a lot of CIOs, I talk to a lot of CISOs, and the thing that I hear repeatedly is, my budget is not growing at a 25% CAGR, so fundamentally, how do I resolve this tension? We sell very specifically into the observability and security markets. We sell to technology professionals who are operating, you know, observability and security platforms like Splunk, or Elasticsearch, or Datadog, or Exabeam, these types of platforms. They're moving protocols like syslog, they have lots of agents deployed on every endpoint, and they're trying to figure out how to get the right data to the right place and, fundamentally, you know, control cost. And we do that through our product called Stream, which is what we call an observability pipeline. It allows you to take all this data, manipulate it in the stream, get it to the right place, and fundamentally be able to connect all those things that maybe weren't originally intended to be connected. >> So I want to get into that new architecture if you don't mind, but let me first ask you about the problem space that you're in. So cloud native, obviously: instrumenting everything is a key thing. You mentioned data, you've got all these tools. Is the problem that there's been a sprawl of things being instrumented and they have to bring it together, or that it's too costly to run all these point solutions and get them to work? What's the problem space that you're in? >> So I think customers have always been forced to make trade-offs, John. So, hey, I have volumes and volumes and volumes of data that's relevant to securing my enterprise, that's relevant to observing and understanding the behavior of my applications, but there's never been an approach that allows me to really onboard all of that data.
And so where we're coming at it is giving them the tools to be able to, you know, filter out noise and waste, to be able to, you know, aggregate this high-fidelity telemetry data. There are a lot of growing changes; you talk about cloud native, but digital transformation, you know, the pandemic itself, and remote work are all driving significantly greater data volumes. And vendors, unsurprisingly, haven't really been all that aligned with giving customers the tools to reshape that data, to filter out noise and waste, because, you know, many of them are incentivized to get as much data into their platform as possible, whether that's aligned to the customer's interests or not. And so we saw an opportunity to come out and, fundamentally, as a customers-first company, give them the tools that they need in order to take back control of their data.
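A toy illustration of the pipeline idea Clint is describing, and emphatically not Cribl Stream itself: drop noisy events, shed a heavy field, and route what survives to more than one destination.

```python
# Toy observability-pipeline stage: filter, trim, route. Destinations
# are just callables here; in a real pipeline they'd be a SIEM, a
# metrics store, an object store, and so on.
import json

def process(event):
    if event.get("level") == "DEBUG":
        return None                    # filter noise in the pipeline, not at the index
    event.pop("raw_payload", None)     # shed weight before it hits a billed platform
    return event

def route(event, destinations):
    for send in destinations:
        send(event)

to_siem = lambda e: print("-> SIEM:", json.dumps(e))
to_lake = lambda e: print("-> lake:", json.dumps(e))

for raw in [
    {"level": "DEBUG", "msg": "heartbeat"},
    {"level": "ERROR", "msg": "checkout failed", "raw_payload": "...big blob..."},
]:
    event = process(raw)
    if event is not None:
        route(event, [to_siem, to_lake])
```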
>> I remember those conversations even going back six years: the whole cloud-scale, horizontally scalable applications discussion. You're starting to see data now being stuck in silos. To have good data, it has to be observable, which means it has to be addressable. So you now have to have a horizontal data plane, if you will. But then you get to the question of, okay, what data do I need at the right time? So is the Data as Code, data engineering discipline changing what new architectures are needed? What changes in the mind of the customer once they realize that they need this new way to pipe data and route data around, or make it available for certain applications? What are the key new changes? >> Yeah, so one of the things that we've been seeing, in addition to the advent of the observability pipeline that allows you to connect all the things, is the advent of an observability lake as well, which is allowing people to store massively greater quantities of data, and also different types of data. So, data that might not traditionally fit into a data warehouse, or might not traditionally fit into a data lake architecture: things like deployment artifacts, or things like packet captures. These are binary types of data that, you know, aren't designed to work in a database, but yet people want to be able to ask questions like, hey, during the Log4Shell vulnerability, which of all my deployment artifacts actually had Log4j in it, in an affected version? These are hard questions to answer in today's enterprise. Or they might need to go back to full-fidelity packet capture data to try to understand, you know, a malicious actor's movement throughout the enterprise. And we're seeing vendors who have great log indexing engines, and great time series databases, but really what people are looking for is the ability to store massive quantities of data, five times, ten times more data than they're storing today, and they're doing that in places like AWS S3, or in Azure Blob Storage. And we're just now starting to see the advent of technologies that can help them query that data, technologies that are more specifically focused on the type of persona that we sell to, which is a security professional, or an IT professional who's trying to understand the behaviors of their applications. And we also find that, you know, general-purpose data processing technologies are great for the enterprise, but they're not working for the people who are running the enterprise, and that's why you're starting to see concepts like observability pipelines and observability lakes emerge: because they're targeted at these people, who have a very unique set of problems that are not being solved by the general-purpose data processing engines. >> It's interesting: as you see the evolution of more data volume, more data gravity, you have these specialty things that need to be engineered for the business. So it sounds like the observability lake and the pipelining of the data, the stream, as you call it, are new things that they bolt into the architecture, right? Because they have business reasons to do it. What's driving that? Sounds like security is one of them. Are there others driving this behavior? >> Yeah, I mean, it's the need to be able to observe applications and observe end-user behavior at a fine-grained detail. I often use examples of, like, bank teller applications, or perhaps, you know, the app that I'm going to be using when I fly in a couple of days; I'll be using their app to understand whether my flight's on time. Am I getting a good experience in that particular application? Answering the question of "is Clint getting a good experience" requires massive quantities of data. I'm going to sit there and look at, you know, American Airlines, which I'm flying on Thursday, and I'm going to be judging them based off of my experience. I don't care what the average user's experience is; I care what my experience is. And if I call them up and say, hey, and especially for the enterprise, usually this is much more for in-house applications and things like that: users call up their IT department and say, hey, this application is not working well, I don't know what's going on with it, and IT can't answer the question of what that individual's experience was. They're living with, you know, the data that they can afford to store today. And so I think that's why you're starting to see the advent of these new architectures: digital is so absolutely critical to every company's customer experience that they need to be able to answer questions about an individual user's experience, which requires significantly greater volumes of data, and that, in turn, requires entirely new approaches to aggregating that data, bringing the data in, and storing that data.
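A minimal sketch of the lake-landing half of that: newline-delimited JSON, compressed and date-partitioned in S3, so any engine that reads JSON can query it later. It assumes boto3 credentials are configured; the bucket name is made up.

```python
# Land events in an "observability lake" as open, newline-delimited
# JSON, gzipped and partitioned by date. Assumes AWS credentials are
# already configured for boto3; the bucket is a made-up example.
import gzip
import json
from datetime import datetime, timezone

import boto3

def land(events, bucket="example-observability-lake"):
    now = datetime.now(timezone.utc)
    key = f"events/dt={now:%Y-%m-%d}/{now:%H%M%S}.jsonl.gz"
    body = gzip.compress("\n".join(json.dumps(e) for e in events).encode("utf-8"))
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body)
    return key  # open format: no vendor needed to read it back later
```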
>> Talk to me about enabling customer choice when it comes to controlling their data. You mentioned before we came on camera that you guys are known for choice. How do you enable customer choice and control over their data? >> So I think one of the biggest problems I've seen in the industry over the last couple of decades is that vendors come to customers with hugely valuable products that make their lives better, but that also require them to maintain a relationship with that vendor in order to be able to continue to ask questions of that data. And so customers don't get a lot of optionality in these relationships. They sign multi-year agreements; they want to go try out another vendor, they want to add new technologies into their stack, and in order to do that, they're often left with a choice of, well, do I roll out yet another agent? Do I go touch 10,000 computers, or 100,000 computers, in order to onboard this data? And what we've been able to offer them is the ability to reuse their existing deployed footprint of agents and their existing data collection technologies, to be able to use multiple tools and use the right tool for the right job, and really give them that choice. And not only give them the choice once, but, with the concepts of things like the observability lake and replay, they can go back in time and say, you know what? I want to rehydrate all this data into a new tool. I'm no longer locked in to the way one vendor stores this; I can store this data in open formats. And that's one of the coolest things about the observability lake concept: customers are no longer locked in to any particular vendor. The data is stored in open formats, and so that gives them the choice to go back later and choose any vendor, because they may want to do some AI or ML on that type of data and do some model training. They may want to forward that data to a new cloud data warehouse, or try a different vendor for log search, or a different vendor for time series data. And we're really giving them the choice, and the tools to do that, in a way that was simply not possible before.
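And the replay half of that story, sketched against the same made-up bucket: because the lake holds open JSONL, "rehydrate into a new tool" is just reading objects back and handing each event to whatever destination you pick. There's no vendor in the read path.

```python
# Sketch of "replay": rehydrate a day of lake data into a new tool.
# `send` is any callable (the new tool's ingest function). Note that
# list_objects_v2 returns at most 1,000 keys per call; a real version
# would paginate. Bucket and prefix are made-up examples.
import gzip
import json

import boto3

def replay(bucket, prefix, send):
    s3 = boto3.client("s3")
    count = 0
    listing = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    for obj in listing.get("Contents", []):
        raw = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"].read()
        for line in gzip.decompress(raw).decode("utf-8").splitlines():
            send(json.loads(line))
            count += 1
    return count

# e.g. replay("example-observability-lake", "events/dt=2022-03-01/", new_tool_ingest)
```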
>> You know, you bring up a point that's a big part of the upcoming AWS startup series Data as Code: the data engineering role has become so important, and the word "engineering" is a key word in that, but there's not a lot of them, right? Like, how many data engineers are there on the planet? Hopefully more will come in from these great programs in computer science, but you've got to engineer something, and you're talking about developing on data, you're talking about doing replays and rehydrating. This is developing. So Data as Code is now a reality. How do you see Data as Code evolving from your perspective? Because it implies DevOps. Infrastructure as Code was DevOps; if it's Data as Code, then you've got DataOps, and AIOps has been around for a while. What is Data as Code, and what does that mean to you, Clint? >> I think for our customers, it means a number of, I think, sort of after-effects that maybe they have not yet been considering. One you mentioned, which is that it's hard to acquire that talent. I also think it's increasingly critical that people who were working in jobs that used to be purely operational are now being forced to learn, you know, developer-centric tooling: things like Git, things like CI/CD pipelines. And that means there's a lot of education that's going to have to happen, because the vast majority of the people who have been doing things the old way, for the last 10 to 20 years, you know, they're going to have to get retrained and retooled. And I think that, one, that's a huge opportunity for people who have that skill set, and I think they'll find that their compensation is directly correlated to their ability to have those types of skills. But it also represents a massive opportunity for people who can catch this wave and find themselves in a place where they're going to have a significantly better career and more options available to them. >> Yeah, and I've been thinking about what you just said about your customers' environments having all these different things, like Datadog and other agents. The people who rolled those out can still work there. They don't have to rip and replace and then get new training on a new multi-year enterprise service agreement that some other vendor will sell them. You come in, and it sounds like you're saying, hey, stay as you are, use Cribl, and we'll give you some data engineering capabilities. Is that right? >> Yup, you got it. And I think one of the things that's a little bit different about our product and our market, John, from kind of general-purpose data processing, is that our users are often responsible for many tools, and data engineering is not their full-time job; it's actually something they just need to do now. And so we've really built a tool that's designed for your average security professional, your average IT professional. Yes, we can utilize the same kinds of DataOps techniques you've been talking about, CI/CD pipelines, GitOps, that sort of stuff, but you don't have to. And if you're really just familiar with administering a Datadog or a Splunk, you can get started with our product really easily; it's designed to be approachable for anybody with that type of skill set. >> It's interesting: when you're talking, you remind me of the big wave that was coming, and it's still here. Shift left meant security from the beginning. What do you do with data: shift up, right, down? What does that mean? Because what you're getting at here is that if you're a developer, you have to deal with data, but you don't have to be a data engineer, though you can be, right? So we're getting into this new world. Security had that same problem: you had to wait for that group to do things, creating tension in the CI/CD pipeline, so the developers building apps had to wait. Now you've got shift left. What's the data equivalent of shift left? >> Yeah, so we're actually doing this right now. We just announced a new product a week ago called Cribl Edge. And this is enabling us to move the processing of this data, rather than doing it centrally in the stream, to actually push this processing out to the edge, and to utilize a lot of unused capacity that you're already paying AWS for, or paying Azure for, or that's in your own data center, and use that capacity to do the processing, rather than having to centralize and aggregate all of this data. So I think we're going to see a really interesting shift, and "left" from our side is towards the origination point rather than anything else. That allows us to unlock a lot of unused capacity and continue to drive costs down, to make more data addressable. Back to the original thing we talked about, the tension between data growth and budget: if we want to offer more capacity to people, if we want to be able to answer more questions, we need to be able to cost-effectively query a lot more data.
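A toy version of the edge idea, and not Cribl Edge itself: instead of shipping every raw event to a central stream, pre-aggregate on the node and forward one small summary, spending otherwise-idle local capacity to cut what crosses the wire.

```python
# Toy edge-side reduction: 1,000 raw events in, one summary record out.
# The host name and window are made-up illustration values.
from collections import Counter

def summarize(events, host="node-01", window_s=60):
    status_counts = Counter(e["status"] for e in events)
    return {"host": host, "window_s": window_s, "status_counts": dict(status_counts)}

raw_events = [{"status": 200}] * 980 + [{"status": 500}] * 20
print(summarize(raw_events))  # one record leaves the node instead of 1,000
```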
>> You guys have had great success in the enterprise with what you've got going on. Obviously the funding is just the scoreboard for that. You've got good growth. What are the use cases, or what does the customer look like that's working for you, where you're winning? Or maybe said differently, what pain points are out there that the customer might be feeling right now where Cribl could fit in and solve them? How would you describe that ideal persona, or environment, or problem, where the customer says, man, Cribl's a perfect fit? >> Yeah, this is a person who's working on tooling. So they administer a Splunk, or an Elastic, or a Datadog; they may be in a network operations center or a security operations center; they are struggling to get data into their tools; they're always at capacity, their tools are always at the redline; they really wish they could do more for the business. They're kind of tired of being this department of no, where everybody comes to them and says, "hey, can I get this data in?" And they're like, "I wish, but you know, we're all out of capacity, and we wish we could help you, but we frankly can't right now." We help them by routing that data to multiple locations, we help them control costs by eliminating noise and waste, and we've been very successful at that with logos like a Shutterfly, and we continue to be successful with major logos inside of government, inside of banking, telco, et cetera. >> So basically it used to be the old hyperscalers, the ones with the data-full problem; now everyone's full of data, and they've got to really expand capacity and have more agility and more engineering around contributions to the business. Sounds like that's what you guys are solving. >> Yup, and hopefully we help them do a little bit more with less. And I think that's a key problem for our enterprises: there's always a limit on the number of human resources they have available at their disposal, which is why we try to make the software as easy to use as possible and make it as widely applicable to those IT and security professionals who are, you know, kind of your run-of-the-mill tools administrators; our product is very approachable for them. >> Clint, great to see you on theCUBE here, thanks for coming on. Quick plug for the company: are you guys hiring, what's going on? Give a quick update, take 30 seconds to give a plug. >> Yeah, absolutely. We are absolutely hiring, cribl.io/jobs; we need people in every function, from sales, to marketing, to engineering, to back office, G&A, HR, et cetera. So please check out our job site. If you are interested in learning more you can go to cribl.io. We've got some great online sandboxes there which will help you educate yourself on the product, our documentation is freely available, and you can sign up for up to a terabyte a day on our cloud; go to cribl.cloud and sign up free today. The product's easily accessible, and if you'd like to speak with us we'd love to have you in our community, and you can join the community from cribl.io as well. >> All right, Clint Sharp, co-founder and CEO of Cribl, thanks for coming on theCUBE. Great to see you. I'm John Furrier, your host, thanks for watching. (upbeat music)
Sanjeev Mohan, SanjMo & Nong Li, Okera | AWS Startup Showcase
(cheerful music) >> Hello everyone, welcome to today's session of theCUBE's presentation of AWS Startup Showcase, New Breakthroughs in DevOps, Data Analytics, Cloud Management Tools, featuring Okera from the cloud management migration track. I'm John Furrier, your host. We've got two great special guests today, Nong Li, founder and CTO of Okera, and Sanjeev Mohan, principal @SanjMo and former research vice president of big data and advanced analytics at Gartner. He's a legend, been around the industry for a long time, seen the big data trends from the past and present, and knows the future. Got a great lineup here. Gentlemen, thank you for this: life in the trenches, lessons learned across compliance, cloud migration, analytics, and use cases for Fortune 1000s. Thanks for joining us. >> Thanks for having us. >> So Sanjeev, great to see you. I know you've seen this movie; I was saying that in the open, at Gartner you've seen all the visionaries, the leaders, you know everything about this space. It's changing extremely fast, and one of the big topics right out of the gate is not just innovation, we'll get to that, that's the fun part, but it's the regulatory compliance and audit piece of it. It's keeping people up at night, and frankly, if not done right, it slows things down. This is a big part of the showcase here: to solve these problems. Share your thoughts; what's your take on this wide-ranging issue? >> So, thank you, John, for bringing this up, and I'm so happy you mentioned the fact that there's this notion that it can slow things down. Well, I have to say that the old way of doing governance slowed things down, because it was very much about command and control. But the new approach to data governance is actually, in my opinion, liberating data. If you want to democratize or monetize, whatever you want to call it, you cannot do it 'til you know you can trust said data and it's governed in some way, so data governance has actually become very interesting. And today, to talk about three different areas within regulatory compliance, for example: we all know about the EU GDPR, we know California has CCPA, and in fact California is now getting an even more stringent version called CPRA in a couple of years, which is more aligned to GDPR. That is the first area; we know we need to comply with that, we don't have any way out. But then there are other areas: there is insider trading, there is how you secure the data that comes from third parties, you know, vendors, partners, suppliers. So Nong, I'd love to hand it over to you and see if you can maybe throw some light on how our customers are handling these use cases. >> Yeah, absolutely, and I love what you said about balancing agility and liberating data, in the face of what may be seen as things that slow you down. So we work with customers across verticals with old and new regulations, so you know, you brought up GDPR. One of our clients is using this to great effect to power their ecosystem. They are a very large retail company that has operations and customers across the world; obviously the importance of GDPR and the regulations it imposes on them are very top of mind, and at the same time, being able to do effective targeting analytics on customer information is equally critical, right? So they're exactly at that spot where they need this customer insight for powering their business, and the regulatory concerns are extremely prevalent for them.
So in the context of GDPR, you'll hear about things like consent management and right to be forgotten, right? I, as a customer of that retailer, should be able to say "I don't want my information used for this purpose," right? "Use it for this, but not this." And you can imagine at a very, very large scale, when you have a billion customers, managing that, across all the data you've collected over time through all of your devices and all of your telemetry, is really, really challenging. And they're leveraging Okera embedded into their analytics platform so they can do both, right? Their data scientists and analysts can do everything they're doing to power the business without having to think about these very granular customer filtering requirements, and they leverage us to do that. So that's kind of new, right? GDPR is relatively new stuff at this point, but we obviously also work with customers that have regulations from a long, long time ago, right? So I think you also mentioned insider trading and the supply chain. So we'll talk to customers, and they want really data-driven decisions on their supply chain, everything about their production pipeline, right? They want to understand all of that, and of course that makes sense: whether you're the CFO or whoever is going to make business decisions, you need that information readily available, and supply chains as we know get more and more complex, with more and more integrated into manufacturing and other verticals. So you're a little bit stuck, right? You want to be data-driven on those supply chain analytics, but at the same time, knowing the details of all the supply chain across all of your dependencies exposes your internal team to very high blackout periods or insider trading concerns, right? For example, if you knew Apple was buying a bunch of something, that's maybe information that only a select few people can have. And the way that manifests into data policies is that you need very, very scalable, per-employee data restriction policies, so they can do their job more easily, right? If we talk about speeding things up, instead of a very complex process for them to get approved against SEC regulations and all that kind of stuff, you can now go give them access to the part of the supply chain that they need, and no more, and limit their exposure and the company's exposure and all of that kind of stuff. So one of our customers is able to do this, getting a two-orders-of-magnitude, 100x reduction in the policies needed to manage a system like that.
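A minimal sketch of what consent management looks like as a data problem: per-customer purpose flags applied as a row filter at query time, so the analyst never has to think about them. The table shape and function are invented for illustration; this is the general pattern, not Okera's implementation.

```python
# Each customer records the purposes they have consented to. At a
# billion customers this becomes a scale problem, not just a policy one.
consent = {
    "cust-001": {"billing", "marketing"},
    "cust-002": {"billing"},  # opted out of marketing
}

def rows_for_purpose(rows, purpose):
    """Yield only rows whose customer consented to this use of the data."""
    for row in rows:
        if purpose in consent.get(row["customer_id"], set()):
            yield row

# An analyst queries for a marketing model; cust-002 is filtered out
# without the analyst knowing or caring that a filter ran.
events = [
    {"customer_id": "cust-001", "spend": 120},
    {"customer_id": "cust-002", "spend": 80},
]
print(list(rows_for_purpose(events, "marketing")))
# -> [{'customer_id': 'cust-001', 'spend': 120}]
```

The per-employee supply-chain policies Nong mentions are the same shape with the direction flipped: attributes on the employee, rather than thousands of hand-written rules, decide which rows they can see.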
>> When I hear you talking like that, I think of the old days: "Oh yeah, regulatory, it kind of slows down innovation, got to go faster." Pretty basic variables, not a lot of combinations of things to check. Now with cloud there seem to be combinations, Sanjeev, because how complicated has the regulatory compliance and audit environment gotten in the past few years? Because I hear security in the supply chain, I hear insider threats; I mean, these are security concerns, not just compliance-department G&A kinds of functions. You're talking about large-scale, potentially combinations of access and distribution; I mean, it seems complicated. How much more complicated is it now than it was just a few years ago? >> So, you know, the way I look at it is, and I'm just mentioning these companies as an example: when PayPal or Ebay, all these companies started, they started in California. Anybody who ever did business on Ebay or PayPal, guess where that data was? In the US, in some data center. Today you cannot do that. Today, data residency laws are really tough, and so now these organizations have to really understand what data needs to remain where. On top of that, we now have so many regulations. You know, earlier on, if you were healthcare you needed to be HIPAA compliant, or banking, PCI DSS. But today, in the cloud, you really need to know: what data do I have, what sensitive data do I have, how do I discover it? So that data discovery becomes really important. What roles do I have? So for example, let's say I work for a bank in the US, and I decide to move to Germany. Now, the old school is that a new rule will be created for me, because of German... >> John: New email address, all these new things happen, right? >> Right, exactly. So you end up with this mass of rules and... and these are all static. >> Rules and tools, oh my god. >> Yeah. So Okera actually makes a lot of this dynamic, which reduces your cloud migration overhead, and Nong used some great examples. In fact, sorry if I take just a second, without mentioning any names: one of the largest banks in the world is going global in the digital space for the first time, and they're taking Okera with them. So... >> But that's the point. This is my next topic in cloud migration; I want to bring this up because of complexity. When you're in that old-school kind of data center, waterfall world, with these old rules and tools, you have to roll this out, and it's a pain in the butt for everybody, a huge hassle. Cloud gives the agility, we know that, and cloud's becoming more secure, and I think now people see that certain things will stay on-premises for security reasons, I get that. But when you start getting into agility, and you now have cloud regions, you can start being more programmatic. So I want to get you guys' thoughts on cloud migration: companies are now lifting and shifting, replatforming, and what's the refactoring beyond that? Because you can replatform in the cloud, and still some are kind of holding back on that. Then when you're in the cloud, the ones that are winning, the companies that are winning, are the ones that are refactoring in the cloud, doing things differently with new services. Sanjeev, you start. >> Yeah, so you know, in fact a lot of people tell me, "You know, we are just going to lift and shift into the cloud." But then you're literally using the cloud as a data center. You still have all the, if I may say, junk you had on-prem; you just moved it into the cloud, and now you're paying for it. In cloud, nothing is free. Every bit of storage, every bit of processing, you're going to pay for. The most successful companies are the ones that are replatforming; they are taking advantage of platform as a service or software as a service. So that includes things like: you pay as you go, you pay for exactly the amount you use, so you scale up and scale down, or scale out and scale in, pretty quickly, you know? So you're handling that demand. So without replatforming, you are not really utilizing your- >> John: It's just hosting. >> Yeah, you're just hosting. >> It's basically hosting if you're not doing anything right there. >> Right. The reason why people sometimes resist replatforming is that there's a hidden cost we don't really talk about: PaaS adds 3x to IaaS cost.
So some organizations that are very mature, that have a few thousand people in the IT department, for them, they're like, "No, we just want to run it in the cloud; we have the expertise, and it's cheaper for us." But in the long run, to get the most benefit, people should think of using cloud as a service. >> Nong, what's your take? Because you see examples of companies, I'll just call one out, Snowflake for instance: they're essentially a data warehouse in the cloud, they refactored and they replatformed, and they have a competitive advantage with the scale, so they have things that others who are just hosting don't have, even on-premise. There's a new model developing where there are real advantages, and how should companies think about this when they have to manage these data lakes, and they have to manage all these new access methods, but they want to maintain that operational stability and control and growth? >> Yeah, so there are a few topics that are all (indistinct) this topic. (indistinct) enterprises moving to the cloud, they do this maybe for some cost savings, but a ton of it is agility, right? The motor that the business can run at is just so much faster. So we'll work with companies in the context of cloud migration for data, where they might have a data warehouse they've been using for 20 years, building policies over that time, right? And back then, access taking a long time to approve, and those kinds of things, made more sense, right? If it took you months to procure physical infrastructure and get machines shipped to your data center, then data access taking so long feels okay, right? That's kind of the same rate that everything is moving at. In the cloud, you can spin up new infrastructure instantly, so you don't want approvals for getting policies, creating rules, all that stuff Sanjeev was talking about; that being slow is a huge, huge problem. So this is a very common environment that we see where they're trying to do that kind of thing. And then, for replatforming: again, they've been building these roles and processes and policies for 20 years. What they don't want to do is take 20 years to migrate all that stuff into the cloud, right? That's probably an experience nobody wants to repeat, and frankly, for many of them, the people who did it originally may or may not be involved in this kind of effort. So we work with a lot of companies like that: they want stability, they've got to have the business running as normal, and they've got to get moving into the new infrastructure, doing it in a new way, you know, with all the lessons learned. So, as Sanjeev said, one of these big banks that we work with: that classical story of on-premise data warehousing, maybe a little bit of Hadoop, moved onto AWS, S3, Snowflake, that kind of setup, extremely intricate policies, but let's go reimagine how we can do this faster, right? What we like to talk about is: you're an organization, you need a design where, if you onboarded 1000 more data users, that's got to be way, way easier than the first 10 you onboarded, right? You've got to get it to be easier over time, in a really, really significant way. >> Talk about the data authorization safety factor, because I can almost imagine all the intricacies of these different tools create specialism amongst the people who operate them, and each one might have its own little authorization nuance. The trend is not to have that siloed mentality. What's your take on clients who just say, "Hey, you know what?
I want to have the maximum agility, but I don't want to get caught in the weeds on some of these tripwires around access and authorization."? >> Yeah, absolutely. I think it's really important to get the balance of it right, because if you are an enterprise, or if you have diverse teams, you want them to have the ability to use best-of-breed tools for their purpose, right? But you don't want it to be so that every tool has its own access and provisioning and whatever; that's definitely going to be a security risk, or at least a lot of friction, for you to get things going. So we think about that really hard. I think we've seen great success with things like SSO and Okta, right? Unifying authentication. We think there's a very, very similar thing about to happen with authorization. You want that single control plane that can integrate with all the tools, and still get the best of what you need, but it's much, much easier (indistinct). >> Okta's a great example, if people don't want to build their own thing and just go with that; same with what you guys are doing. Those seem to be the dots that are connecting here, Sanjeev: the ease of use, but yet the stability factor. >> Right. Yeah, because John, today I may want to bring up a SQL editor to go into Snowflake, just as an example. Tomorrow, I may want to use Azure Blob, you know? I may not even want to go to Snowflake; I may want to go to an underlying piece of data, or I may use Power BI, you know, for some reason, and come from the Azure side. So the point is that unless we are able to control this in some sort of centralized manner, we will not get that consistency. And security, you know, is all or nothing. You cannot say, "Well, I secured my Snowflake, but if you come through HDFS, Hadoop, or something, you know, that is outside of my realm, or my scope," what's the point? So that is why it is really important to have a watertight way. In fact, I'm using just a few examples: maybe tomorrow I decide to use a data catalog, or I use Denodo as my data virtualization and I run a query. I'm the same identity, but I'm using different tools. I may use it from home, over VPN, or I may use it from the office. So you want this kind of flexibility, all encompassed in a policy, rather than a separate rule for if you do this and this, or if you do that, because then you end up with literally thousands of rules. >> And it's never going to stop, either; it's like fashion, the next tool's going to come out, it's going to be cool, and people are going to want to use it. Again, you don't want to have to then move the train from the compliance side this way or that way; it's a lot of hassle, right? So with that one capability, you can bring on new things pretty quickly. Nong, am I getting it right? This is kind of the trend: you're going to see more and more tools and things that are relevant for certain use cases that might justify them, but then AppSec review, compliance review, I mean, good luck with that, right? >> Yeah, absolutely. I mean, we certainly expect tools to continue to get more and more diverse, and better, right? There's so much innovation in the data space, and I think this is a great time for that; a lot of things need to happen, and so on and so forth. So I think one of the early goals of the company, when we were just brainstorming, is that we don't want data teams to not be able to use a tool because it doesn't have the right security (indistinct), right? Often those tools may not be focused on that particular area.
They're great at what they do, but we want to make sure they're enabled, that they get those enterprise investments, and that they see broader adoption much more easily, a lot of those things. >> And I can hear the sirens in the background; that's someone who's not using your platform, they need some help there. But that's the case: I mean, if you don't get this right, there are some consequences. And one of the things I would like to bring up on the next track, to talk through with you guys, is the persona pigeonhole role: "Oh yeah, a data person, the developer, the DevOps, the SRE." You start to see now developers, and with cloud, developers and data folks, however they get pigeonholed, kind of blending in, okay? You've got data services, you've got analytics, you've got data scientists, you've got more democratization; all these things are being kicked around, but the notion of a developer now is a data developer, because cloud is about DevOps, and data is now a big part of it. It's not just some department; it's actually blending in. It's a cultural shift. Can you guys share your thoughts on this trend of data people versus developers now becoming kind of one? Do you guys see this happening, and if so, how? >> So John, when I started my career, I was a DBA, and then a data architect. Today, I think you cannot have a DBA who's not a developer. That's just my opinion. Because there is so much CI/CD and DevOps that happens today, and you know, you write your code in Python, you put it in version control, you deploy using Jenkins, and you roll back if there's a problem. And then, you are interacting, you're building your data to be consumed as a service. In the past, people would have a thick client that would connect to the database over TCP/IP. Today, people don't necessarily want to connect over TCP/IP; they want to go over HTTP, and they want an API gateway in the middle. So, if you're a data architect or DBA, now you have to worry about: "I have a REST API call that's coming in; how am I going to secure that, and make sure that people are allowed to see that?" And that was just yesterday. >> Exactly, you've got to build an abstraction layer. In the old days, you had to worry about schema and do all that; it was hard work back then, but now it's much different. You've got serverless, functions are going to show the way... It's happening. >> Correct. GraphQL, and the semantic layer, that just blows me away, because it used to be all in the database; then we took it out of the database and we put it in a BI tool. So, like, BusinessObjects started this whole trend; we said, "Let's put the semantic layer there." Well okay, great, but that was when everything surrounded BusinessObjects and Oracle Database, or some other database. But today, what if somebody brings Power BI or Tableau or Qlik, you know? Now you don't have semantic layer access. So you cannot have it in the BI layer, so you move it down into its own layer. So now you've got a semantic layer; then where do you store your metrics? The same story repeats: you have a metrics layer. Then the data scientists want to do feature engineering; where do you store your features? You have a feature store. And before you know it, this stack has disaggregated over and over and over, and you've got layers and layers of specialization happening; there are query accelerators like Dremio or Trino, so you've got your data here, which Nong is trying really hard to protect, and then you've got layers and layers and layers of abstraction. And networks are fast, so the end user gets great service, but it's a nightmare for architects to bring all these things together.
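Sanjeev's point about data being consumed as a service, a REST call arriving through an API gateway rather than a thick client over TCP/IP, can be sketched with nothing but the standard library. The token map and table here are placeholders for a real identity provider and a real policy engine; this is a minimal sketch, not anyone's production design.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder data and entitlements; a real service would consult a
# database and delegate the decision to a central authorization layer.
ORDERS = [{"id": 1, "region": "EMEA"}, {"id": 2, "region": "US"}]
TOKENS = {"token-abc": "analyst"}  # bearer token -> role

class DataService(BaseHTTPRequestHandler):
    def do_GET(self):
        role = TOKENS.get(self.headers.get("Authorization", ""))
        if role != "analyst":  # the check the DBA-turned-developer now owns
            self.send_response(403)
            self.end_headers()
            return
        body = json.dumps(ORDERS).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), DataService).serve_forever()
```

The interesting part is what is not in this sketch: the same identity should get the same answer whether it arrives here, through a SQL editor, or through a BI tool, which is the single-control-plane argument made above.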
>> How do you tame the complexity? What's the bottom line? Nong? >> Yeah, so I think there are a few things you need to do, right? One: we need to rethink how we express security permissions. I think you guys have, maybe in passing, (indistinct) talked about creating all these rules and all that kind of stuff; that's been the way we've done things forever. We've got to think about policies and mechanisms that are much more dynamic, right? You need to really think about not having to do any additional work for the new things you add to the system. That's really, really core to solving the complexity problem, right? 'Cause that gets you those orders-of-magnitude reductions; the system's got to be more expressive and map to those policies. That's one. And then second, it's got to be implemented at the right layer, right? To Sanjeev's point, close to the data, so it can service all of those applications and use cases at the same time, and have that uniformity and breadth of support. So those two things have to happen. >> Love this universal data authorization vision that you guys have. Super impressive. We had a CUBE Conversation earlier with Nick Halsey, who's a veteran in the industry, and he likes it. That's a good sign, 'cause he's seen a lot of stuff, too, Sanjeev, like yourself. This is a new thing: you're seeing compliance being addressed programmatically, and I'm imagining there are going to be bots someday, very quickly, with AI that's going to scale that up, so compliance kind of doesn't get in the way of innovation, and people can still get what they need. You've got cloud migration, which is only going faster and faster. Nong, you mentioned speed; that's what CloudOps is all about. Developers want speed, not things in days or hours; they want it in minutes and seconds. And then finally, ultimately, how does it scale up for the people operating and/or programming? These are three major pieces. What happens next? Where do we go from here? The customer's sitting there saying, "I need help, I need trust, I need scale, I need security." >> So, I just wrote a blog, if I may diverge a bit, on data observability. You know, there are a lot of these little topics that are critical, DataOps is one of them, and to me data observability is really having a transparent view of what the state of your data is in the pipeline, anywhere in the pipeline. So you know, when we talk to these large banks, these banks have over 1000 data pipelines working every night, because they've got a hundred, 200 data sources from which they're bringing data in. Then they're doing all kinds of data integration; they have, you know, we talked about Python or Informatica, or whatever data integration, data transformation product you're using. So you're combining this data, writing it into an analytical data store, and something's going to break. So, to me, data observability becomes a very critical thing, because it shows me something broke; walk me down the pipeline so I know where it broke. Maybe the data drifted.
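A drift check does not have to be exotic; even comparing summary statistics between a trusted baseline window and last night's load catches a lot. A minimal sketch, with the threshold and sample values picked arbitrarily for illustration:

```python
from statistics import mean, stdev

def drifted(baseline, current, z_threshold=3.0):
    """Flag drift when the current batch mean sits far outside the
    baseline distribution: a crude but useful pipeline tripwire."""
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return mean(current) != mu
    return abs(mean(current) - mu) / sigma > z_threshold

baseline_amounts = [100, 102, 98, 101, 99, 103, 97]
tonights_load = [310, 295, 305]  # did an upstream source switch units?
if drifted(baseline_amounts, tonights_load):
    print("data drift detected: halt downstream jobs, walk the pipeline")
```

Real observability tooling layers metadata, lineage, and alerting on top of checks like this, but the closed feedback loop Nong describes next is the same idea.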
And I know Okera does a lot of work in data drift, you know? So this is... Nong, jump in any time, because I know we have use cases for that. >> Nong, before you get in there, I just want to highlight a quick point. I think you're onto something there, Sanjeev, because we've been reporting, and we believe, that data workflows are intellectual property, and they have to be protected. Nong, go ahead, your thoughts, go ahead. >> Yeah, I mean, the observability thing is critically important. I would say, when you want to think about what's next, I think it's really about effectively bridging the tools and processes and systems and teams that are focused on data production with the data analysts and data scientists that are focused on data consumption, right? I think bridging those two, which covers a lot of the topics we talked about, is kind of where security almost meets; that's kind of where you've got to draw it. I think for observability and pipelines and data movement, understanding that is essential. And I think broadly, on all of these topics, where all of us can be better is if we're able to close the loop, to get the feedback loop of success. So data drift is an example of the loop rarely being closed: it drifts upstream, and downstream users can take forever to figure out what's going on. And we'll have similar examples related to biases, or data quality, all those kinds of things, so I think that's really a problem that a lot of us should think about: how do we make sure that loop is closed as quickly as possible? >> Great insight. Quick aside: as the founder and CTO, how's life going for you? You feel good? I mean, you started a company, it's doing great, it's not drifting, it's right in the stream, mainstream, right in the wheelhouse of where the trends are. You guys have really got a crosshairs on the real issues. How are you feeling? Tell us a little bit about how you see the vision. >> Yeah, I obviously feel really good. I mean, we started the company a little over five years ago. There are a few things that we bet would happen, and those things were out of our control; I don't think we would've predicted GDPR and security and those kinds of things being as prominent as they are. Those things have really matured, probably as best as we could've hoped, so that feels awesome. Yeah, (indistinct) really expanded in these years, and it feels good. Feels like we're in the right spot. >> Yeah, it's great. Data's a competitive advantage, and it certainly has a lot of issues; it could be a blocker if not done properly, and you're doing great work. Congratulations on your company. Sanjeev, thanks for kind of being my cohost in this segment; great to have you on, been following your work, and you continue to unpack it at the new place that you started. SanjMo, good to see your Twitter handle taking on the name of your new firm; congratulations. Thanks for coming on. >> Thank you so much, such a pleasure. >> Appreciate it. Okay, I'm John Furrier with theCUBE. You're watching today's session presentation of AWS Startup Showcase, featuring Okera, a hot startup; check 'em out, great solution, with a really great concept. Thanks for watching. (calm music)
Michele Goetz, Forrester Research | Collibra Data Citizens '21
>> From around the globe, it's theCUBE, covering Data Citizens '21. Brought to you by Collibra. >> For the past decade, organizations have been effecting very deliberate data strategies and investing quite heavily in people, processes, and technology specifically designed to gain insights from data, better serve customers, and drive new revenue streams. We've heard this before. The results, quite frankly, have been mixed, as much of the effort has focused on analytics and technology designed to create a single version of the truth, which in many cases continues to be elusive. Moreover, the world of data is changing. Data is increasingly distributed, making collaboration and governance more challenging, especially where operational use cases are a priority. Hello, everyone. My name is Dave Vellante, and you're watching theCUBE's coverage of Data Citizens '21. And we're pleased to welcome Michele Goetz, who's a vice president and principal analyst at Forrester Research. Hello, Michele. Welcome to theCUBE. >> Hi, Dave. Thanks for having me today. >> It's our pleasure. So I want to start: you serve a wide range of roles, including enterprise architects, CDOs, chief data officers that is, analysts, et cetera, and many data-related functions. And my first question is, what are they thinking about today? What's on their minds, these data experts? >> So there are actually two things happening. One is, what is the demand that's placed on data for our new intelligent digital systems? So we're seeing a lot of investment and interest in things like edge computing, and then how that intersects with artificial intelligence to really run your business intelligently and drive new value propositions, to be both adaptive to the market as well as resilient to changes that are unforeseen. The second thing is that you then create massive complexity in managing the data, governing the data, and orchestrating the data, because it's not just a centralized data warehouse environment anymore. You have a highly diverse and distributed landscape that you both control internally and take advantage of through third-party information. So really, the struggle then becomes: how do you trust the data? How do you govern, secure, and protect that data? And then how do you ensure that it's hyper-contextualized to the types of value propositions that our intelligent systems are going to serve? >> Well, I think you're hitting on the key issues here. I mean, you're right, the data, as I sort of referred to as well, is out there; it's distributed at the edge. But generally our data organizations are actually quite centralized, and as well, you talk about the need to trust the data; obviously that's crucial. But are you seeing the organization change? I know you're talking about this with clients, your discussion about collaboration. How are you seeing that change? >> Yeah, so as you have to bring data into the context of the insights that you're trying to get, or the intelligence that's automating and scaling out the value streams and outcomes within your business, we're actually seeing a federated model emerge in organizations. So while there's still a centralized data management and data services organization, led by typical enterprise architects for data, with a data engineering team that's managing warehouses and data lakes.
They're creating this great platform to access and orchestrate information, but we're also seeing data, analytics, and governance teams come together under chief data officers or chief data and analytics officers. And this is really where the insights are being generated, from either BI and analytics or from data science itself, with dedicated data engineers and stewards helping to access and prepare data for analytic efforts. And then lastly, and this is the really interesting part, when you push data into the edge, the goal is that you're actually driving an experience and an application. And so in that case, we are seeing data engineering teams starting to be incorporated into the solutions teams that are aligned to lines of business or divisions themselves. So really, what's happening is that if there is a solution consultant who is also overseeing value-based portfolio management, when you need to instrument the data for these new use cases and keep up with the pace of the business, it's this engineering team, as part of the DevOps workbench, that executes on that. So really, the balance is: we need the core, we need to get to the insights and build our models for AI, and then the next piece is, how do you activate all that? And there's a team over there to help. So it's really spreading the wealth and expertise where it needs to go. >> Yeah, I love that. You hit on a couple of things that really resonated with me. You talked about context a couple of times, and this notion of a federated model, because historically, the sort of big data architecture, the team, they didn't have the context, the business context, and my inference is that's changing, and I think that's critical. Your talk at Data Citizens is called "How Obsessive Collaboration Fuels Scalable DataOps." You talk about the DataOps, the DevOps team. What's the premise you put forth to the audience? >> So the point about obsessive collaboration is sort of taking the hubris out of your expertise on the data. Certainly there's a recognition by data professionals that the business understands and owns their data. They know the semantics, they know the context of it, and just receiving the requirements on that was assumed to be okay, and then you could provide a data foundation, whether it's just a lake or whether you have a warehouse environment you're pulling from for your analytics. The reality is that as we move into more of an AI and machine learning type of model, one, more context is necessary, and you're kind of balancing between the things you can ascribe to the data globally, which is what data engineers can support, and then what is unique about the data and the context of the data that is related to the business value and outcome, as well as the feature engineering that is being done on the machine learning models. So there has to be a really tight link and collaboration between the data engineers, the data scientists and analysts, and the business stakeholders themselves. You see a lot of pods starting up that way to build the intelligence within the system. And then lastly, what do you do with that model? What do you do with that data? What do you do with that insight? You now have to shift your collaboration over to the workbench that is going to pull all these components together to create the experiences and the automation that you're looking for. And that requires a different collaboration model around software development, while still incorporating the business expertise from those stakeholders, so that you're satisfying not only the quality of the code to run the solution, but the quality of the outcome that meets the expectation and the time to value that your stakeholders have.
And still incorporating the business expertise from those stakeholders, so that you're satisfying, not only the quality of the code to run the solution, but the quality towards the outcome that meets the expectation and the time to value that your stakeholders have. So data teams aren't just sitting in the basement or in another part of the organization and digitally disconnected anymore. You're finding that they're having to work much more closely and side by side with their colleagues and stakeholders. >> I think it's clear that you understand this space really well. Hubris out context in, I mean, that's kind of what's been lacking. And I'm glad you said you used the word anymore because I think it's a recognition that that's kind of what it was. They were down in the basement or out in some kind of silo. And I think, and I want to ask you this. I come back to organization because I think a lot of organizations look the most cost effective way for us to serve the business is to have a single data team with hyper specialized roles. That'll be the cheapest way, the most efficient way that we can serve them. And meanwhile, the business, which as you pointed out has the context is frustrated. They can't get to data. So there's this notion of a federated governance model is actually quite interesting. Are you seeing actual common use cases where this is being operationalized? >> Absolutely, I think the first place that you were seeing it was within the operational technology use cases. There the use cases where a lot of the manufacturing industrial device. Any sort of IOT based use case really recognized that without applying data and intelligence to whatever process was going to be executed. It was really going to be challenging to know that you're creating the right foundation, meeting the SLA requirements, and then ultimately bringing the right quality and integrity to the data, let alone any sort of data protection and regulatory compliance that has to be necessary. So you already started seeing the solution teams coming together with the data engineers, the solution developers, the analysts, and data scientists, and the business stakeholders to drive that. But that is starting to come back down into more of the IT mindset as well. And so DataOps starts to emerge from that paradigm into more of the corporate types of use cases and sort of parrot that because there are customer experience use cases that have an IOT or edge component to though. We live on our smart phones, we live on our smart watches, we've got our laptops. All of us have been put into virtual collaboration. And so we really need to take into account not just the insight of analytics but how do you feed that forward. And so this is really where you're seeing sort of the evolution of DataOps as a competency not only to engineer the data and collaborate but ensure that there sort of an activation and alignment where the value is going to come out, and still being trusted and governed. >> I got kind of a weird question, but I'm going. I was talking to somebody in Israel the other day and they told me masks are off, the economy's booming. And he noted that Israel said, hey, we're going to pay up for the price of a vaccine. The cost per dose out, 28 bucks or whatever it was. And he pointed out that the EU haggled big time and they don't want to pay $19. And as a result they're not as far along. Israel understood that the real value was opening up the economy. 
And so there's an analogy here, which I want to bring back to the organization, and it relates to DataOps. What if the real metric is: hey, I have an idea for a data product; how long does it take to go from idea to monetization? That seems to me to be a better KPI than how much storage I have, or how many petabytes I'm managing. So my question is, and it relates to DataOps: can that DataOps individual, and maybe even the data engineer, live inside of the business? And is that even feasible technically with this notion of federated governance? Are you seeing that? Maybe talk a little bit more about this DataOps role. Is it... >> Yeah. >> Fungible. >> Yeah, it's definitely fungible. And in fact, when I talked about those three units, there's your core enterprise data services, there's your BI and analytics, and then there's your line of business. In all of those, the engineering and the ops is the DataOps, which is living in all of those environments and staying as close as possible to where the value proposition is being defined and designed. So absolutely, you're able to federate that. And I think the other piece on DataOps that is really important is recognizing how the practices around continuous integration and continuous deployment, using agile methodologies, are really reshaping a lot of the waterfall approaches that were done before, where data was lagging 12 to 18 months behind any sort of insights. A lot of the platforms today assume that you're moving into a standard, mature software development life cycle, and you can start seeing returns on investment within a quarter, really, so that you can iterate and then speed that up so that you're delivering new value every two weeks. But it does change the mindset: this DataOps team, aligned to solution development and to a broader portfolio management of business capabilities and outcomes, needs to understand how to appropriately scope the data products they're delivering to incremental, value-based milestones, so the business feels they're getting improvements over time and not just waiting. So there's an MVP, you move forward on that, and you optimize, optimize, extend, scale. So again, that CI/CD mindset is helping to not bottleneck and wait for the complete field of dreams to come from your data and your insights. >> Thank you for that, Michele. I want to come back to this idea of collaboration, because over the last decade we've seen attempts, I've seen software come out, to try to help the various roles collaborate, and some of it's been okay, but you have these hyper-specialized roles. You've got data scientists, data engineers, quality engineers, analysts, et cetera, and they tend to be in their own little worlds. But at the end of the day we rely on them all to get answers. So how can these data scientists, all these stewards, how can they collaborate better? What are you seeing there? >> You need to get them onto the same process. That's really what it comes down to. If you're working from different points of view, that's one thing, but if you're working from different processes, collaborating is really challenging. And I think the one thing that's really come out of this move to machine learning and AI is recognizing that you need processes that reinforce collaboration. So that's number one.
So you see agile development and CI/CD not just for DataOps, not just for DevOps, but also encouraging and propelling these projects and iterations for the data science teams as well, even when there are machine learning engineers incorporated. And then certainly the business stakeholders are inserted within there, as appropriate, to accept what it is that's going to be developed. So process is number one. And number two is: what is the platform that's going to reinforce those processes and collaboration? And it's really about what's being shared, and how you share. So certainly, what we're seeing within the platforms themselves is everybody contributing into some sort of library where their components and products are registered, and that's able to help different teams grab those components and build out what those solutions are going to be. And in fact, what gets really cool about that is that you don't always need hardcore data scientists anymore, as you have this social platform for data product and analytic product development. This is where a lot of the AutoML begins, because those who are less data-science-oriented but can build an insight pipeline can grab all the different components, from the pipelines, to the transformations, to capture mechanisms, to bolting into the model itself, and allow that to be delivered to the application. So it's really about balancing between process and platforms that enable and encourage, and almost force, you to collaborate and manage through sharing. >> Thank you for that. I want to ask you about the role of data governance. You've mentioned trust, and that's data quality, and you've got teams and specialists focused on data quality. There's the data catalog. Here's my question: you mentioned edge a couple of times, and I can see a lot of that. I mean, today a lot of AI, I would say most of it, is modeling. In the future, you mentioned edge, it's going to be a lot of inferencing in real time, and people are maybe not going to have the time or be involved in that decision. So what are you seeing in terms of data governance, federated? We talked about federated governance, this notion of a data catalog, and maybe automating data quality without necessarily having it be so labor-intensive. What are the trends you're seeing there? >> Yeah, so I think our new environment, our new normal, is that you have to be composable, interoperable, and portable. Portability is really the key here. So from a cataloging and governance perspective, we would bring everything together into our catalogs and business glossaries, and it would be a reference point, like a massive wiki. Well, that's wonderful, but why just house it in a museum? You really want to activate that. And I think what's interesting about the technologies today for governance is that you can turn those rules, and business logic, and policies into services that are composable components and bring those into the solutions that you're defining. And in that way, what happens is that that creates portability; you can drive them wherever they need to go. And from the composability and interoperability portion of that, you can put those services in the right place at the right time for the outcome you need, so that you start to become behaviorally driven in executing on governance, rather than trying to write all of the governance down into transformations and controls where the data lives.
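Michele's "rules as composable services" idea can be pictured as ordinary functions that travel with the data, rather than logic baked into one warehouse's transformations. A minimal sketch, with the specific rules invented purely for illustration:

```python
# Each policy is a small, portable service: the same callables can run
# in a stream processor, on an edge gateway, or during a lake sync.
def mask_email(record):
    """Masking rule: keep only the first character of the local part."""
    if "email" in record:
        user, _, domain = record["email"].partition("@")
        record["email"] = user[:1] + "***@" + domain
    return record

def require_region(record):
    """Residency rule: refuse records whose region is unknown."""
    assert record.get("region"), "region must be known before processing"
    return record

POLICY_CHAIN = [require_region, mask_email]

def govern(record, chain=POLICY_CHAIN):
    """Apply the composable governance services wherever the data flows."""
    for policy in chain:
        record = policy(record)
    return record

# The same chain, reused at two different points in the architecture:
edge_event = govern({"region": "EU", "email": "ana@example.com"})
lake_row = govern({"region": "US", "email": "bo@example.com"})
print(edge_event, lake_row)
```

Because the policy is a value rather than a hard-coded transformation, it can be placed "in the right place at the right time," which is exactly the portability Michele goes on to describe.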
You can have quality, and observability of that quality and performance, right at the edge, in the context of the behavior and use of that solution. You can run those services and governance on gateways that are managing and routing information at those edge solutions as synchronization between the edge and the cloud comes up. And if it's appropriate, during synchronization of the data back into the data lake, you can run those services there. So there's a lot more flexibility and elasticity in today's modern approaches to cataloging, glossaries, and governance of data than we had before. And that goes back to what we talked about earlier: this is the new wave of DataOps. This is how you bring data products to fruition now. Everything is about activation. >> So how do you see the future of DataOps? I mean, I've kind of been pushing you to a more decentralized model, where the business has more control, 'cause the business has the context. I mean, I feel as though, hey, we've done a great job of contextualizing our operational systems; the sales team, they know when the data is crap within the CRM. But our data systems are generally context-agnostic, and you obviously understand that problem well. So how do you see the future of DataOps? >> So I think what's kind of interesting about that is we're going to go to governance on read versus governance on write, more so. What do I mean by that? It means that from a business perspective, there are two sides of it. There's ensuring that governance runs, as we talked about before, at the appropriate place at the appropriate time; it's semantically, domain-centrically driven, not logical and systems-centric. So that's number one. Number two is also recognizing that business owners, or business operations, actually play a role in this, because as you're working within your CRM systems, like a Salesforce for example, you're using an iPaaS like MuleSoft to connect to other applications, connect to other data sources, connect to other analytics sources. And what's happening there is that the data is being modeled and personalized to whatever view, insight, or task has to happen within those processes. So even CRM environments, which we think of as the sort of traditional technologies we're used to, are getting a lift, both in terms of intelligence from the data, and also in your flexibility in how you execute governance and quality services within that environment. And that actually opens up the data foundations a lot more, and it avoids having to do a lot of moving, copying, and centralizing of data and creating an over-weighted business application, both in terms of the data foundation and in terms of the types of business services, status updates, and processes that happen in the application itself. You're drawing those tasks back down to where they should be, and where performance can be managed, rather than trying to over-customize your application environment. And that gives you a lot more flexibility later, too, for any sort of upgrades or migrations that you want to make, because all of the logic is contained back down in a service layer instead. >> Great perspectives, Michele; you obviously know your stuff, and it's been a pleasure having you on. My last question: when you look out there, is there anything that really excites you, or any specific research you're working on that you want to share, that you're super pumped about? >> I think there are two things.
>> Great perspectives, Michele. You obviously know your stuff, and it's been a pleasure having you on. My last question: when you look out there, is there anything that really excites you, or any specific research that you're working on that you want to share, that you're super pumped about?

>> I think there are two things. One is it's truly incredible the amount of insight and growth that is coming through data profiling and observation: really understanding and contextualizing data anomalies, so that you understand whether data is helping or hurting the business value, and tying it very specifically to processes and metrics, which is fantastic; as well as models themselves, really understanding how data inputs and outputs make a difference in whether the model performs or not. And then I think the second thing is really the emergence of more active data, active insights. As we talked about before, it's your ability to package up services, for governance and quality in particular, that allow you to scale your data out towards the edge, or wherever it's needed. And you do so not just so that you can run analytics, but so that you're also driving overall processes and value. So the research around the operationalization and activation of data is really exciting. And looking at the networks and service mesh to bring those things together is kind of where I'm focusing right now, because what's the point of having data in a database if it's not providing any value?

>> Michele Goetz, Forrester Research, thanks so much for coming on theCUBE. Really awesome perspectives. You're in an exciting space, so appreciate your time.

>> Absolutely, thank you.

>> And thank you for watching Data Citizens '21 on theCUBE. My name is Dave Vellante. (upbeat music)
Octavian Tanase, NetApp and Jason McGee, IBM | IBM Think 2021
>> Narrator: From around the globe, it's theCUBE, with digital coverage of IBM Think 2021, brought to you by IBM.

>> Hi, welcome back to theCUBE's coverage of IBM Think 2021 virtual. We're not yet in real life; we're doing another remote interview with two great guests, Cube alumni. Of course, I'm John Furrier, your host of theCUBE. We've got Jason McGee, IBM Fellow, VP and CTO of IBM's cloud platform, and Octavian Tanase, Senior Vice President of Hybrid Cloud Engineering at NetApp. Both Cube alumni; it's great to see you both. Thanks for coming on theCUBE.

>> Yeah, great to be here.

>> Thanks for having us.

>> So we were just talking before we came on camera that it feels like we've had this conversation a long time ago, and we have: Hybrid Cloud has been on a trajectory for both of you guys, many times on theCUBE. So now it's mainstream, it's here in the real world. Everyone gets it; there's no real debate. Now multicloud, that's what people are debating, which means that's right around the corner. So Hybrid Cloud is here and now, Jason, and this is really the focus. This also brings together the NetApp partnership, so talk about the relationship first with Hybrid Cloud.

>> Yeah, I mean, you know, look, we've talked a number of times together. I think in the industry, maybe a few years ago, people were debating whether Hybrid Cloud was a real thing. We don't have that conversation anymore. I think, you know, enterprises today, especially maybe in the face of COVID and how we work differently now, realize that their cloud journey is going to be a mix of on-prem and off-prem systems, and probably a mix of multiple public cloud providers. And what they're looking for now is: how do I do that? How do I manage that hybrid environment? How do I have a consistent platform across the different environments I want to operate in? And then how do I get more and more of my workload into those environments? And it's been interesting. I think the first waves of cloud were infrastructure-centric and external, application-focused; those were the easier things. And now we're moving into more mission-critical, more stateful, more data-oriented workloads, and that brings with it new challenges on where applications run and how we leverage the cloud.

>> Octavian, you guys have had a great relationship with IBM over the years. NetApp is a data-centric company and has always had a great engineering team, and you're on the cloud, Hybrid Cloud engineering. What's the current status of the relationship? Give us an update on how it's vectoring into the Hybrid Cloud, since you're the Senior Vice President of Hybrid Cloud Engineering.

>> Well, so first of all, I want to recognize 20 years of a successful partnership with IBM. I think NetApp and IBM have been companies that have embraced digital transformation and the technology trends that enable that digital transformation for our customers, and we've been very successful. I think there is a very strong joint Hybrid Cloud value proposition for customers: NetApp's storage and data services complement what IBM does in terms of products and solutions, both for on-premises deployments and in the cloud. I think together we can build more complete solutions, solutions that span data mobility and data governance for the new workloads that Jason has talked about.
>> And what about some of the customer challenges that you're seeing? Obviously software-defined networking, software-defined storage; DevOps has now turned into DevSecOps. So you now have that programmability requirement for dynamic applications, application-driven infrastructure. All these buzzwords point to one thing: the infrastructure has to be resilient and respond to the applications.

>> Yeah, I would say infrastructure will continue to be top of mind for everybody, whether they're building a private cloud or whether they're trying to leverage something like IBM Cloud. I think, you know, people want to consume infrastructure as an API. I think they want simplicity and security. I think they want to manage their costs very well. And I think we're very proud to be partnering with IBM Cloud to build such capabilities.

>> Jason, how are you guys helping some of these customers as they look at new things, sometimes retrofitting and refactoring previous stuff while transforming, but also innovating at the same time? There's a lot of that going on. What are you guys doing to help with the hybrid challenges?

>> Yeah, I mean, you know, there are a lot of dimensions to that problem, but the one that I think has been most interesting over the last year has been how the consumption model of public cloud (API-driven, self-service, capabilities operated for you) is starting to spread. Because I think one of the challenges with hybrid, and one of the challenges as customers are looking at these more mission-critical, data-centric kinds of workloads, was: well, I can't always move that application to the public cloud data center, or I need that application to live out on the network, closer to my end users, out where data is being generated, maybe in an IoT context. And when you had those requirements, you had to switch operating models. You had to move away from a public cloud service consumption model to a software deployment model. We have a common platform in things like OpenShift that can run everywhere, but the missing piece was: how do I consume everything as a service, everywhere? And so recently we launched this thing called IBM Cloud Satellite, which we've been working with Octavian and his team on: how we can actually extend the public cloud experience back into the data center, out to the edge, and allow people to mix location flexibility with public cloud consumption. And when you do that, you're of course running a much more diverse infrastructure environment. You have to integrate with different storage environments, and you wind up with multi-tiered applications, you know, some stuff on the edge and some stuff in the core. And so data replication and data management start to become really interesting, because you're distributing your workloads across this complex environment.
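Jason's point, one consumption model across many locations, can be sketched in a few lines. The `CloudControlPlane` class below is purely illustrative; the names, locations, and fields are invented and are not IBM Cloud Satellite's actual API (which is exposed through the IBM Cloud console, CLI, and REST APIs). It only shows the shape of the idea: the same self-service provisioning request is honored whether the target is a public region, an on-prem location, or an edge site, with operational responsibility staying with the cloud console.

```python
class CloudControlPlane:
    """Hypothetical control plane: one service catalog, many locations
    (public region, on-prem data center, edge site)."""

    def __init__(self):
        self.locations = {
            "us-south": "public-region",
            "dc-chicago": "on-prem-satellite",
            "factory-edge-7": "edge-satellite",
        }
        self.deployments = []

    def provision(self, service: str, location: str) -> dict:
        """The same API call regardless of where the workload lands."""
        if location not in self.locations:
            raise ValueError(f"unknown location: {location}")
        deployment = {
            "service": service,
            "location": location,
            "location_type": self.locations[location],
            "managed_by": "cloud-console",  # operated as-a-service everywhere
        }
        self.deployments.append(deployment)
        return deployment

cp = CloudControlPlane()
# The same "database as a service" request, three very different places:
for loc in ("us-south", "dc-chicago", "factory-edge-7"):
    print(cp.provision("postgresql", loc))
```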
>> Here we've seen that relationship between compute and storage change a lot over the past decade as the evolution goes. Octavian, I've got to ask you, this is the critical path for companies. They want storage-ready infrastructure. You guys have been doing that for many decades, partnering with IBM, for sure. But now they're all going hybrid cloud big time, and it's distributed computing, is what it is. It's the operating model. When someone asks you guys what your capabilities are, how do you answer that in today's world? Because your storage is well known; you've got a great product, people know that. But what are NetApp's capabilities when I say I'm going all in on Hybrid Cloud, a complete changeover?

>> So what we have been doing is basically rewriting a lot of our software with a few design points in mind. Software-defined has definitely been one of the key design points. The second is the Hybrid Cloud and the containerization of our operating systems, so they can run both in traditional environments and in the cloud. I think the last thing that we wanted to do is enable speed at scale, and that has been by building intrinsically into the product both support for, and use of, Kubernetes as an infrastructure to achieve that agility, that scale.

>> So how about this data fabric vision? Because to me, this comes up all the time in my conversations with practitioners; it's the number one problem that they're trying to solve. The conversation tends to be about the control plane, Kubernetes, horizontally scalable; this all points to data being available. So how do you create that availability? What does data fabric mean? What does all this mean in a hybrid context?

>> Well, if you think about it, data fabric is a Hybrid Cloud concept, right? This is about enabling data governance, data mobility, and data security in an environment where some of the applications will run on premises or at the edge, the smart edge, and many of the data lakes and analytics, the rich services, will be in central locations, perhaps in some large data centers. So you need to have the type of capabilities, the data services, to enable that mobility, that governance, that security across this continuum that spans the edge, the core, and the cloud.

>> Jason, you mentioned Satellite before, Cloud Satellite. Could you go into more detail on that? I know it's kind of a new product. What is it about? Tell me what the benefits are, why it exists, and what problems it solves.

>> Yeah, so in the most simple terms, Cloud Satellite is the capability to extend IBM's public cloud into on-prem infrastructure, at the edge, or, in a multicloud context, to other public cloud infrastructures. And so you can consume all the services in the public cloud that you need to build your application (OpenShift as a service, databases, dev tools, AI capabilities), and instead of being limited to consuming those services only in IBM's cloud regions, you can now add your private data center, or add your Metro provider, or add your AWS or Azure accounts, and consume those services consistently across all those environments. And that really allows you to combine the benefits of public cloud with the kind of location independence you see in hybrid, and it lets us solve new problems. Like, you know, it's really interesting: we're seeing AI and data being a primary driver. I need my application to live in a certain country, or to live next to my mainframe, or to live in a Metro because I'm doing video analytics on a bunch of cameras and I'm not going to stream all that data back halfway across the country to some cloud region. So it lets you extend out in that way. And when you do that, of course, you now move the cloud into a more diverse infrastructure environment. And so we've been working with NetApp on how we then expose NetApp storage into this environment: when I'm running in the data center, or I'm running at the edge, and I need to store that data, replicate the data, secure it, well, how do I plug those two things together? I think, John, at the beginning you kind of alluded to this idea of things becoming more application-centric, right? We're trying to run an IT architecture that's more centered around the application. Well, by combining cloud's knowledge of where everything is running with that common platform like OpenShift, with a Kubernetes-aware data fabric and storage layer, you really can achieve that. You can have an application-centric kind of management that spans those environments.
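Jason's multi-tiered, edge-plus-core picture raises the question of where each copy of the data should live. One common answer, which Octavian touches on shortly with "tiering based on data temperature," can be sketched as a tiny placement rule. The tier names and thresholds below are invented for illustration; this is not NetApp's or IBM's actual policy engine.

```python
TIERS = ["nvme-flash", "capacity-flash", "object-archive"]  # hot -> cold

def choose_tier(idle_days: float) -> str:
    """Place a dataset by its 'temperature', i.e. days since last access.
    Thresholds are illustrative, not a real product's defaults."""
    if idle_days < 7:
        return TIERS[0]   # hot: keep on fast flash, close to the app
    if idle_days < 90:
        return TIERS[1]   # warm: cheaper flash in the core data center
    return TIERS[2]       # cold: archive or object tier in the cloud

for days in (1, 30, 365):
    print(f"idle {days:>3} days -> {choose_tier(days)}")
```

In a hybrid layout like the one described here, the same rule can also drive replication direction: hot data stays at the edge where it is generated, while colder copies drain back to the core or the cloud data lake.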
>> Yeah, I want to come back to that whole impact on IT, because this has come up as a major theme here. I think the IT transformation is going to be more about cloud scale. But I want to get Octavian in on Satellite, on NetApp's role and how you complement that. How do you guys fit in? He just mentioned that you guys are working with Cloud Satellite. Obviously this looks like an operating model. How does NetApp fit in?

>> Simply put, we extend and enable the capabilities that the IBM Satellite platform provides. I think Jason referred to the storage aspects, and, you know, what we are doing is enabling not only storage but rich data services: tiering based on data temperature, replicated snapshots, and capabilities around, you know, caching, high availability, encryption, and so forth. So we believe that our technology integrates very well with Red Hat OpenShift, and the Kubernetes aspect enables the application mobility and that notion of really distributed computing at scale, from the traditional data center to the edge and to the massive operations that IBM is building.

>> You know, I've got to say, watching you guys work together for many decades now, and covering you with theCUBE for the past 10 or 11 years, it's been a great partnership. I've got to say one thing that's obvious to me, to our team, and really to the world: now you've got a new CEO over at IBM, and you have a cloud focus that's unwavering. Arvind loves the cloud, we all know that. Ecosystems are changing; you already had a big ecosystem and partnerships, and now it seems to be moving to a level where you've got to have that ecosystem really thrive in the cloud. So I guess we'll use the last couple of minutes: if you guys don't mind, explain how the IBM and NetApp relationship, in the context of this new partnership, new ecosystem, new kind of world, helps customers, and how you guys are working together.

>> Yeah, I mean, I could start. I think you're right that cloud is all about platforms, and about the overall environment people operate in, and the ecosystem is really critical. And I think things like Satellite have given us new ways to work together. I mean, IBM and NetApp, as we said, have been working together for a long time. We rely on them in our public cloud, for example, in our storage tiers. But with this idea of distributed cloud, the boundaries of public cloud spreading to all of these new environments, those are just new places where we can build really interesting, valuable integrations for our clients, so that they can deal with data, deal with these more complex apps, in all the places that they exist. So I think it's been actually really exciting to leverage that opportunity to find new ways to work together and deliver solutions for our clients.

>> Octavian?

>> I will say that data is the ecosystem, and we all know that there's more data right now being created outside of the traditional data center, be it in the cloud or at the edge.
So our mission is to enable that Hybrid Cloud, that data mobility, and to enable persistent, rich data storage services wherever data is being created. I think IBM's new Satellite platform comes in and broadens the aperture of people being able to consume IBM's services at the edge or in a remote office, and I think that's very exciting.

>> You guys are both experts and seasoned executives in DevOps, DevSecOps, DataOps, whatever you want to call it; data's here, and so are ecosystems. Guys, thanks for coming on theCUBE. Really appreciate the insight.

>> Thank you.

>> Thank you.

>> Okay, IBM Think CUBE coverage. I'm John Furrier, your host. Thanks for watching. (upbeat music) (tranquil music)
Sandeep Singh & Omer Asad, HPE
(digital music)

>> Hello everyone, and welcome to theCUBE, where we're covering the recent news from Hewlett Packard Enterprise making moves in storage. And with me are Omer Asad, Vice President and General Manager for Primary Storage, HCI and Data Management at HPE, and Sandeep Singh, who's the Vice President of Storage Marketing at Hewlett Packard Enterprise. Gentlemen, welcome back to theCUBE. Great to see you both.

>> Dave, it's a pleasure to be here.

>> Always a pleasure talking to you, Dave. Thank you so much.

>> Oh, it's my pleasure. Hey, so we just watched HPE make a big announcement, and I wonder, Sandeep, if you could give us a quick recap.

>> Yeah, of course, Dave. In the world of enterprise storage there hasn't been a moment like this in decades: a point at which everything is changing for data and infrastructure. It's really coming at the nexus of data, cloud, and AI, and that's opening up the opportunity for customers across industries to accelerate their data-driven transformation. Building on that, we just unveiled a new vision for data that accelerates that data-driven transformation for customers, edge to cloud. And to pay that off, we introduced a new data services platform that consists of two game-changing innovations. First is Data Services Cloud Console, a SaaS-based console that delivers cloud operational agility for customers and is designed to unify data operations through a suite of cloud services. Our second announcement is HPE Alletra. HPE Alletra is a cloud-native data infrastructure portfolio to power your data, edge to cloud. It's managed natively with Data Services Cloud Console, and it brings that cloud operational model to customers wherever their data lives. These innovations are combined with our industry-leading AIOps platform, which is HPE InfoSight; combined, these innovations radically simplify and bring that cloud operational model to customers' data and infrastructure management, and they open the opportunity to streamline data management across the lifecycle. These innovations are making it possible for organizations across industries to unleash the power of data.

>> That's kind of cool. A lot of the stuff we've been talking about for all these years: sort of this unified layer across all clouds and on-prem, with AI injected in. I can tell you're excited, and it sounds like you can't wait to get these offerings into the hands of customers. But I wonder if we could back up a minute. Omer, maybe you could describe the problem statement that you're addressing with this announcement. What are customers' real pain points?

>> Excellent question, Dave. So in my role as the General Manager for Data Management and Storage here at HPE, I get the wonderful opportunity to talk to hundreds of customers in a year. And, you know, as time has progressed and the amount of data under organizations' management has continued to increase, what I have noticed is that there are three main themes that continuously emerge and are now bubbling to the top. The first one is that storage infrastructure management itself is extremely complex for customers. While there have been leaps and bounds of progress in managing a single array, or managing two arrays, with a lot of simplification of the UI, and maybe some modern UIs are present, as the problem starts to get to scale, as customers acquire more and more assets to store and manage their data on premises, management at scale is extremely complex.
Yes, storage has gotten faster; yes, flash has had a profound effect on performance, availability, and latency of access to the data. But infrastructure management, and storage management as a whole, has become a pain for customers, and it's a constant theme: storage lifecycle management comes up, storage refresh comes up, and deploying and managing storage infrastructure at scale comes up. So that's one of the main problems that I've been seeing as I talk to customers. Now, secondly, a lot of customers are now talking about two different elements. One is storage: storage deployment and lifecycle management. And the second is the management of the data that is stored on those storage devices. As the amount of data grows, the silos continue to grow, and a single view of the lifecycle management of data is something customers don't get to see. And lastly, one of the biggest things we see is that a lot of customers are now asking: how can I extract value from this data under my management? Because they can't seem to parse through the silos. So there is an incredible amount of productivity lost when it comes to data management as a whole, which is fragmented into silos, and the same from a storage management perspective. And when you put these two together, and especially add two more elements to it, hybrid management of data or multicloud management of data, the silos and the sprawl just continue, and there is nothing stitching this together at scale. So these are the three main themes that constantly appear in these discussions, in spite of a lot of modern enhancements in storage.

>> Well, I wonder if I could comment, guys, because I've been following this industry for a number of years, and you're absolutely right, Omer. I mean, if you look at the amount of money and time and energy that's put into data architectures, people are frustrated; they're not getting enough out of it. And I'd note that, you know, the prevailing way in which we've attacked complexity historically is to build a better box. And while that system was maybe easier to manage than the predecessor systems, all it did was create another silo. And then the cloud, despite its purported simplicity, was another disconnected silo. So then we threw siloed management solutions at the problem, and we're left with this collection of point solutions with data sort of trapped inside. So I wonder if you could give us your thoughts on that. Do you agree? And what data do you have around this problem statement?

>> Yeah, Dave, that's a great point. And actually, ESG just recently conducted a survey of over 250 IT decision makers, and it offers a perfect validation of the problems that Omer and you just articulated. What it showed is that 93% of the respondents indicated that storage and data management complexity is impeding their digital transformation. On average, organizations have over 23 different data management tools, which just typifies and perfectly showcases the fragmentation and complexity that exists in data management. And 95% of the respondents indicated that solving storage and data management complexity is a top-10 business initiative for them, and actually a top-five initiative for 67% of the respondents. So it's a great validation across the board.

>> Well, it's fresh in their minds, too, because pre-pandemic there was probably, you know, a mixed picture, right?
There was probably, well, complacency: we're not moving fast enough, we have other priorities. But they were forced into this, and now they know what the real problem is; it's front and center. Yeah, I like what you're putting out there in your announcement, this sort of future state that you're envisioning for customers. I wonder if we could sort of summarize that and share with our listeners the vision that you unveiled. What does it look like, and how are you making it real?

>> Yeah, overall, we feel very strongly that it's time for our customers to reimagine data management. Our vision is that customers need to break down the silos and complexity that plague their distributed data environments, and they need a new data experience across the board that's going to help them accelerate their data-driven transformation. We call this vision Unified DataOps. Unified DataOps integrates data-centric policies across the board to streamline data management, cloud-native control and operations to bring the agility of cloud and that operational model to wherever data lives, and AI-driven insights and intelligence to make the infrastructure invisible. It delivers a whole new experience to customers to radically simplify and bring the agility of cloud to data and data infrastructure, streamline data management, and really help customers innovate faster than ever before. And we're making the promise of Unified DataOps real by transforming the entire HPE storage business into cloud-native, software-defined data services, through a data services platform that expands HPE GreenLake.

>> I mean, the key word I take away there, Sandeep, is invisible. As a customer, I want you to abstract that complexity away, that underlying infrastructure complexity; I just don't want to see it anymore. Omer, I wonder if we could start with the first part of the announcement. Maybe you can help us unpack Data Services Cloud Console. People are immediately going to think it's just another software product to manage infrastructure, but to really innovate, I'm hoping that it's more than that.

>> Absolutely, Dave, it's a lot more than that. What we have done, fundamentally, at the root of the problem, is we have taken the data and infrastructure control away from the hardware, and through that we provide a unified approach to manage the data wherever it lives. It's a full-blown SaaS console that our customers get onto, and from there they can deploy appliances, manage appliances, and lifecycle appliances; and they don't stop at that, but then go ahead and start to get context around their data. And all of that is (indistinct) available through a SaaS platform, a SaaS console. As every customer onboards themselves, their equipment, and their storage infrastructure onto this console, they can go ahead and define role-based access for different parts of their organization. They can also apply role-based access to HPE GreenLake management personnel, so they can come in and perform all the operations for the customers via the same console, simply as another access control methodology within it. And then, in addition to that, as you know, data mobility is extremely important to our customers. How do you make data available in different hyperscaler clouds if the customer's digital transformation requires that?
So again, from that single cloud console, from that single data console, which we are naming the Data Services Cloud Console, customers are able to curate the data, maneuver the data, and pre-position the data into different hyperscalers. But the beautiful thing is that the entire view of the storage infrastructure, the data with its context stored on top of that, and the access control methodologies and management framework are all operational from a single SaaS console, which the customer can decide to open up to whichever management entity or authority comes in to help them. And then what this leads us into is combining these things into a northbound API. So anybody who wants to streamline operational manageability can use these APIs to program against a single API, which will then control the entire infrastructure on behalf of the customer. So if somebody were to ask what this is: it is bringing the cloud operational model that was so desired by each one of our customers into their data centers, and this is what I call an in-place transformation of the management experience for our customers, by making it seamlessly available as a cloud operational model for their infrastructure.
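Omer's "single northbound API" is easy to picture as a thin client that wraps one authenticated endpoint for every fleet operation. The sketch below is a hedged illustration only: the paths, payload fields, and method names are invented and are not HPE's actual Data Services Cloud Console API.

```python
import json
from urllib import request

class FleetClient:
    """Hypothetical client for a single 'northbound' management API.
    Endpoint paths and fields are invented for illustration; they are
    not HPE's actual Data Services Cloud Console API."""

    def __init__(self, base_url: str, token: str):
        self.base_url = base_url.rstrip("/")
        self.token = token

    def _call(self, method: str, path: str, body: dict = None):
        data = json.dumps(body).encode() if body is not None else None
        req = request.Request(
            self.base_url + path,
            data=data,
            method=method,
            headers={
                "Authorization": f"Bearer {self.token}",
                "Content-Type": "application/json",
            },
        )
        with request.urlopen(req) as resp:  # one API for the whole fleet
            return json.load(resp)

    def list_arrays(self):
        return self._call("GET", "/v1/storage-systems")

    def grant_role(self, user: str, role: str):
        return self._call("POST", "/v1/access", {"user": user, "role": role})

# client = FleetClient("https://console.example.com", token="...")
# client.list_arrays()  # every appliance, wherever it is racked
```

The value of the single-API shape is that automation written once (inventory, access grants, provisioning) applies to every appliance the console manages, regardless of which data center it sits in.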
It's extremely simple: plug in the power cable, plug in the network cable, and the data center operational manager just walks out. After that you could be on the beach, you could be at your home, you could be driving in a car, and, well, I advise people not to fiddle with their iPhones when they're driving in a car, but still you could do it if you want to, right. So that's just one part, from a deployment methodology perspective. Now, the second thing that, you know, Sandeep and I often bounce ideas on is provisioning of a workload. It's like a science these days. Is this array going to be able to absorb my workload, is the latency going to go south, does this workload latency profile match this particular piece of device in my data center? All of this is extremely manual, and it literally takes, I mean, if you talk to any of the customers or even analysts, deploying a workload is a massive challenge. It's guesswork that you have to model and, you know, basically see how it works out. With HPE InfoSight, we're collecting hundreds of millions of data points from all these devices. So now we harness that and present it back to the customer in a very simple manner, so that we can model on their behalf through the data services console, which is now workload-aware: you just describe your workload, hey, I'm going to need this many IOPS, and by the way, this happens to be my application. And that's it. On the backend, because we're managing your infrastructure, the cloud console understands your entire fleet. We are seeing the statistics and the telemetry coming off of your systems, and because now you've described the workload for us, we can do that matching for you. And what intent-based provisioning does is let you describe your workloads in two or three clicks, or maybe two or three API construct formats, and we'll do the provisioning, the deployment, and bringing it up for you on your behalf on the right pieces of infrastructure that match it. And if you don't like our choices you can manually change it as well. But from a provisioning perspective, what I think took days can now come down to a couple of minutes and a description. And lastly, then, you know, global data management, distributed infrastructure from edge to cloud, invisible upgrades, only upgrading the right amount of infrastructure that needs the upgrade. All of that just comes rolling along with it, right. So those are some of the things that this data services console, as SaaS management at scale, allows you to do. >> And actually, if I can just jump in and add a little bit to what Omer described, especially with intent-based provisioning, that's really bringing a paradigm shift to provisioning. It's shifting it from LUN-centric to app-centric provisioning. And when you combine it with identity management and role-based access, what it means is that you're enabling self-service, on-demand provisioning of the underlying data infrastructure to accelerate the app workload deployments. And you're eliminating guesswork and providing the ability to be able to optimize service level objectives. >> Yeah, it sounds like you've really nailed that provisioning challenge in an elegant way. I've been saying for years, if your primary expertise is deploying logical unit numbers, you better find some other skills, because the day is coming that that's just going to get automated away. So that's cool.
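To make the intent-based provisioning idea concrete, here is a minimal sketch of what a call against a northbound API of this kind might look like. The endpoint, field names, and token handling are invented for illustration; this is not HPE's published interface, only an approximation of the pattern Omer and Sandeep describe.

```python
# Hypothetical sketch: endpoint and payload fields are illustrative,
# not HPE's actual API.
import requests

CONSOLE = "https://console.example.com/api/v1"  # placeholder SaaS console URL
TOKEN = "..."  # role-based access token issued by the console

# Describe the workload by intent, not by array or LUN details.
intent = {
    "workload": "oracle-oltp",      # application profile
    "capacity_gib": 2048,
    "iops": 50000,
    "max_latency_ms": 1.0,
    "availability": "mission-critical",
}

resp = requests.post(
    f"{CONSOLE}/provisioning/intents",
    json=intent,
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=30,
)
resp.raise_for_status()
# The console matches the intent against fleet telemetry and returns the
# placement it chose; the caller never names a specific array.
print(resp.json())
```

The design point is that the caller states service-level objectives and the control plane, which sees telemetry from the whole fleet, picks the placement; that is what collapses days of modeling into minutes.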
There's another issue that I'm sure you've thought about, but I wonder if you could address it. I mean, you've got the cloud, the definition of cloud is changing, the cloud is expanding to on-prem, on-prem is expanding to the cloud. It's going out to the edge, it's going across clouds, and so, you know, security becomes a big issue, that threat surface is expanding, the operating model is changing. So how are you thinking about addressing those security concerns? >> Excellent question, Dave. So, you know, most of the organizations that we talk to in today's modern world, almost every customer that I talk to, has deployed some sort of a cloud console, whether they're a customer of one of the hyperscalers or, you know, buy-in for SaaS-based applications is pervasive across the customer base. And as you know, we were the first ones to introduce automated telemetry management through HPE InfoSight. That's one of the largest storage SaaS services in production today that we operate on behalf of our customers, which has, you know, Dave, about an 85% connectivity rate. So from that perspective, keeping customers' data secure, keeping customers' telemetry information secure, we're no stranger to that. Again, we follow all security protocols that any cloud operational SaaS service would: firewall compliance, security audit logs that are published to our customers and to customers' chief information security officers. So all of those, you know, what I call crossing the T's and dotting the I's, we do that with security experts and security policies, for which each of our customers has a different set of rules. And we have a proper engagement model where we go through that particular audit process for our customers. Then secondly, Dave, the data services cloud console is actually built on fundamental cloud deployment technology that is not all that new. Aruba Central, which is the Aruba management console, Aruba also being an HPE company, has been deployed and is managing millions of access points in a SaaS framework for our customers. So the fundamental building blocks of the data storage console, from a basic enablement perspective, come from the Aruba Central console. And what we've taken is we've taken those generic cloud-based SaaS services and then built data- and storage-centric SaaS services on top of that and made them available to our customers. >> Yeah, I really like the Aruba example. You picked that up several years ago, and it's the same thing with InfoSight, the way that you bring it to other parts of the portfolio, those are really good signs to watch of successful acquisitions. All right, there's a lot here. I want to talk about the second part of the announcement. I know your branding team, you guys are serious about branding, has a new product brand. Maybe you could talk about that. >> So again, delivering the cloud operational model is just the first piece, right. And now the second part of the announcement is delivering the cloud-native hardware infrastructure, which is extremely performant, to go along with this cloud operational model. So what we have done, Dave, in this announcement is we've announced HPE Electra. This is our new brand for our cloud-native infrastructure to power your data, and its appliances run from core, to the edge, to the cloud, right. And what it does is it takes the cloud operational model, and this hardware is powered by that, it's completely wrapped around data.
And so HPE Electra is available in two models right now: the HPE Electra 9000, which is available for mission-critical workloads, for those high-intensity workloads with a hundred percent availability guarantee, where no failure is ever an option. And then it's also available as the HPE Electra 6000, which is available for general purpose, business-critical workloads, generally trying to address that midrange of the storage market. And both of these systems are full 100% NVMe front and back. And they're powered by the same unified cloud management operational experience that the data services cloud console provides. And what it does is it allows our customers to simplify the deployment model, it simplifies their management model, and really, really allows them to focus on the context, the data, and their app diversity, whereas data mobility, data connectivity, data management in a multi-cloud world is then completely abstracted from them. >> Dave: Yeah. >> Sandeep: And Dave. >> Dave: Go ahead, please. >> Just to jump in, HPE Electra combined with the data services cloud console is delivering a cloud experience that makes deploying and scaling the application workloads as simple as flipping a switch. >> Dave: Nice. >> It really does. And you know, I'm very comfortable in saying this, you know, like HPE InfoSight, we were the first in the industry to bring AI-based telemetry and support-enabled metrics (indistinct). And then here with the data services console and the hardware that goes with it, we're just completely transforming the storage ownership and storage management model. And for our customers, it's a seamless, non-disruptive upgrade, a fully data-in-place upgrade. And they transform to a cloud operational model where they can manage their infrastructure better, wherever they are, through a complete consumer-grade SaaS console, again the first of its kind when you look at storage management, and storage management at scale. >> And I like how you're emphasizing that management layer, but underneath you've got all the modern hardware technologies too, which is important because the performance has got to be there, you know, good price performance. >> Absolutely. >> So now can we bring this back again to the customers, what are the outcomes that this is going to enable for them? >> So I think, Dave, the first and the foremost thing is, as they scale their storage infrastructures, they don't have to think. It's really as simple as, yeah, just send it to the data center, plug in the power cable, plug in the network cable, and up it comes. And from that point onwards, the life cycle and the device management aspects are completely abstracted by the data services console. All they have to focus on is, I just have new capacity available to me, and when I have an application, the system will figure out for me where it needs to deploy. So no more guesswork, the Excel sheets of capacity management, you know, the chargeback models, none of that stuff is needed. And for customers that are looking to transform their applications, customers looking to refactor their applications into a hyperscaler model, or maybe transform from VMs to containers, all they need to think about and focus on is that the data will just follow these workloads from that perspective. >> And Dave, just to add a response here, as I speak with customers, one of the things I'm hearing from IT is that line of business really wants IT to deliver that agility of cloud, yet IT also has to deliver all of the enterprise reliability, availability, all of the data services.
And what's fantastic here is that through this cloud operational model, IT can deliver that agility that line of business owners are looking for. At the same time, they've been under pressure to do a lot more with less. And through this agility, IT is able to get time back, be able to focus more on the strategic projects, and at the same time be able to get time back to spend more time with their families, and that's incredibly important. >> Omer: Right. >> Well, I love the sort of mindset shift that I'm seeing from HPE. We're not talking about how much the box weighs (laughing), we're talking about the customer experience. And I wonder, you know, that kind of leads me, Sandeep, to how this fits in, because really, to me, I'm seeing the transformation before our eyes, but how does it fit into HPE's overall mission? >> Well, Dave, our mission overall is to be the edge-to-cloud platform-as-a-service company, with HPE GreenLake being the key to delivering that cloud experience. And as Omer put it, to be able to deliver that cloud experience wherever the customer's data lives. And today we're advancing the HPE GreenLake as-a-service transformation of the HPE storage business to a software-defined cloud data services business overall. And for our customers, this translates to a whole new operational and ownership experience that unleashes their agility, their data, and their innovation. So we're super excited. >> Guys, I can tell you're excited. Thanks so much for coming to theCUBE and summarizing the announcements, congratulations and best of luck to both of you and to HPE and your customers. >> Thank you, Dave. It was a pleasure. (digital music)
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave | PERSON | 0.99+ |
Sandeep | PERSON | 0.99+ |
two | QUANTITY | 0.99+ |
93% | QUANTITY | 0.99+ |
Sandeep Singh | PERSON | 0.99+ |
Omer Asad | PERSON | 0.99+ |
HPE | ORGANIZATION | 0.99+ |
iPhones | COMMERCIAL_ITEM | 0.99+ |
iPhone | COMMERCIAL_ITEM | 0.99+ |
95% | QUANTITY | 0.99+ |
iPad | COMMERCIAL_ITEM | 0.99+ |
one | QUANTITY | 0.99+ |
first piece | QUANTITY | 0.99+ |
Hewlett Packard Enterprise | ORGANIZATION | 0.99+ |
two models | QUANTITY | 0.99+ |
Omer | PERSON | 0.99+ |
both | QUANTITY | 0.99+ |
67% | QUANTITY | 0.99+ |
HPE Electra | ORGANIZATION | 0.99+ |
First | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
ESG | ORGANIZATION | 0.99+ |
second part | QUANTITY | 0.99+ |
three clicks | QUANTITY | 0.99+ |
one part | QUANTITY | 0.99+ |
Excel | TITLE | 0.99+ |
two arrays | QUANTITY | 0.99+ |
second | QUANTITY | 0.99+ |
100% | QUANTITY | 0.99+ |
three main themes | QUANTITY | 0.99+ |
two different elements | QUANTITY | 0.99+ |
second thing | QUANTITY | 0.99+ |
first one | QUANTITY | 0.98+ |
second announcement | QUANTITY | 0.98+ |
first | QUANTITY | 0.98+ |
first model | QUANTITY | 0.98+ |
over 250 IT decision makers | QUANTITY | 0.98+ |
hundred percent | QUANTITY | 0.98+ |
one example | QUANTITY | 0.98+ |
HPE GreenLake | ORGANIZATION | 0.98+ |
each | QUANTITY | 0.98+ |
first part | QUANTITY | 0.98+ |
hundreds of customers | QUANTITY | 0.97+ |
HCI | ORGANIZATION | 0.97+ |
hundreds of arrays | QUANTITY | 0.97+ |
second part | QUANTITY | 0.97+ |
two more elements | QUANTITY | 0.97+ |
HPE InfoSight | ORGANIZATION | 0.97+ |
Aruba | ORGANIZATION | 0.96+ |
Hewlett Packard Enterprise Making Moves and Storage | ORGANIZATION | 0.96+ |
single | QUANTITY | 0.96+ |
today | DATE | 0.96+ |
about 85% | QUANTITY | 0.95+ |
several years ago | DATE | 0.94+ |
HPE Electra | COMMERCIAL_ITEM | 0.94+ |
over 23 different data management tools | QUANTITY | 0.93+ |
each one | QUANTITY | 0.93+ |
secondly | QUANTITY | 0.93+ |
Ajay Vohora and Duncan Turnbull | Io-Tahoe Data Quality: Active DQ
>> Announcer: From around the globe, it's theCUBE, presenting Active DQ, intelligent automation for data quality, brought to you by Io-Tahoe. (indistinct) >> Got it? All right, if everybody is ready, we'll open on Dave in five, four, three. Now we're going to look at the role automation plays in mobilizing your data on Snowflake. Let's welcome in Duncan Turnbull, who's a partner sales engineer at Snowflake, and Ajay Vohora is back, CEO of Io-Tahoe, he's going to share his insight. Gentlemen, welcome. >> Thank you, David, good to be back. >> Yes, it's great to have you back, Ajay, and it's really good to see Io-Tahoe expanding the ecosystem, so important now, of course bringing Snowflake in. It looks like you're really starting to build momentum, I mean, there's progress that we've seen every month, month by month, over the past 12, 14 months. Your seed investors, they've got to be happy. >> They are, they're happy, and they can see that we're running into a nice phase of expansion here, new customers signing up, and now we're ready to go out and raise that next round of funding. Maybe think of us like Snowflake five years ago. So we're definitely on track with that. A lot of interest from investors, and right now we're trying to focus in on those investors that can partner with us and understand AI, data, and automation. >> Well, so personally, I mean, you've managed a number of early stage VC funds, I think four of them. You've taken several software companies through many funding rounds and growth and all the way to exit. So you know how it works. You have to get product market fit, you've got to make sure you get your KPIs right, and you've got to hire the right salespeople, but what's different this time around? >> Well, you know, the fundamentals that you mentioned, those never change. What I can see that's different, that's shifted this time around, is three things. One is that there used to be this kind of choice of, do we go open source or do we go proprietary? Now that has turned into a nice hybrid model, where we've really keyed into Red Hat doing something similar with CentOS. And the idea here is that there is a core capability of technology that underpins a platform, but it's the ability to then build an ecosystem around that, made up of a community. And that community may include customers, technology partners, other tech vendors, and enabling the platform adoption so that all of those folks in that community can build and contribute, whilst still maintaining the core architecture and platform integrity at the core of it. And that's one thing that's changed. We're seeing a lot of that type of software company emerge into that model, which is different from five years ago. And then leveraging the cloud, every cloud, the Snowflake cloud being one of them here, in order to make use of what customers, end customers in enterprise software, are moving towards. Every CIO is now in some configuration of a hybrid IT estate, whether that is cloud, multi-cloud, on-prem. That's just the reality. The other piece is in dealing with the CIO's legacy. So over the past 15, 20 years they've purchased many different platforms, technologies, and some of those are still established and still (indistinct). How do you enable that CIO to make a purchase whilst still preserving, and in some cases building on and extending, the legacy technology that they've invested their people's time, training, and financial investment into?
Yeah, of course, solving a problem, a customer pain point, with technology, that never goes out of fashion. >> That never changes. You have to focus like a laser on that. And of course, speaking of companies who are focused on solving problems, Duncan Turnbull from Snowflake. You guys have really done a great job, really brilliantly addressing pain points, particularly around data warehousing, simplifying that, and you're providing this new capability around data sharing, really quite amazing. Duncan, Ajay talks about data quality and customer pain points in enterprise IT. Why has data quality been such a problem historically? >> So one of the biggest challenges that's really affected that in the past is that, to address everyone's needs for using data, they've evolved all these kinds of different places to store it, all these different silos or data marts, all this kind of proliferation of places where data lives. And all of those end up with slightly different schedules for bringing data in and out, they end up with slightly different rules for transforming that data and formatting it and getting it ready, and slightly different quality checks for making use of it. And this then becomes like a big problem, in that these different teams are then going to have slightly different or even radically different answers to the same kinds of questions, which makes it very hard for teams to work together on their different data problems that exist inside the business, depending on which of these silos they end up looking at. And what you can do, if you have a single kind of scalable system for putting all of your data into it, is you can kind of sidestep all of this complexity, and you can address the data quality issues in a single way. >> Now, of course, we're seeing this huge trend in the market towards robotic process automation, RPA, that adoption is accelerating. You see it in UiPath's IPO, a 35-plus billion dollar valuation, Snowflake-like numbers, nice comps there for sure. Ajay, you've coined the phrase data RPA, what is that in simple terms? >> Yeah, I mean, it was born out of seeing how, in our ecosystem (indistinct) community, developers and customers, general business users, were wanting to adopt and deploy Io-Tahoe's technology. And we could see that. I mean, it's not marketing automation here, we're not trying to automate that piece, but wherever there is a process that was tied into some form of a manual overhead with handovers and so on, that process is something that we were able to automate with Io-Tahoe's technology, and the employment of AI and machine learning technologies specifically to those data processes, almost as a precursor to getting into marketing automation or financial information automation. That's really where we're seeing the momentum pick up, especially in the last six months. And we've kept it really simple with Snowflake. We've kind of stepped back and said, well, the resource that Snowflake can leverage here is the metadata. So how could we turn Snowflake into that repository, being the data catalog? And by the way, if you're a CIO looking to purchase a data catalog tool, stop, there's no need to. Working with Snowflake, we've enabled that intelligence to be gathered automatically and to be put to use within Snowflake, so reducing that manual effort and putting that data to work. And that's where we've packaged this with our AI and machine learning specific to those data tasks. And it made sense, that's what's resonated with our customers.
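To give a feel for the underlying idea of treating Snowflake itself as the metadata repository, here is a minimal sketch that reads Snowflake's built-in INFORMATION_SCHEMA views into a simple catalog listing. The connection details are placeholders, and Io-Tahoe's actual product does far more than this (ML-based classification, reconciliation, remediation); this only illustrates where the raw metadata lives.

```python
# Minimal sketch: using Snowflake's own metadata views as a lightweight
# catalog. Account, credentials, and object names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",
    user="my_user",
    password="...",
    warehouse="my_wh",
    database="ANALYTICS",
)

cur = conn.cursor()
cur.execute(
    """
    SELECT table_schema, table_name, column_name, data_type
    FROM information_schema.columns
    ORDER BY table_schema, table_name, ordinal_position
    """
)
# Each row is one column of one table: the raw material for a data catalog.
for schema, table, column, dtype in cur:
    print(f"{schema}.{table}.{column}: {dtype}")

cur.close()
conn.close()
```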
>> You know, what's interesting here, just a quick aside, as you know, I've been watching Snowflake now for a while, and of course the competitors come out and maybe criticize, "why don't they have this feature, they don't have that feature." And Snowflake seems to have an answer. And the answer oftentimes is, well, ecosystem, the ecosystem is going to bring that, because we have a platform that's so easy to work with. So I'm interested, Duncan, in what kind of collaborations you are enabling with high-quality data, and of course, your data sharing capability. >> Yeah, so I think the ability to work on data sets isn't just limited to inside the business itself, or even between different business units, like you were kind of discussing with those silos before. When looking at this idea of collaboration, we have these challenges where we want to be able to exploit data to the greatest degree possible, but we need to maintain the security, the safety, the privacy, and governance of that data. It could be quite valuable, it could be quite personal, depending on the application involved. One of these novel applications that we see between organizations for data sharing is this idea of data clean rooms. And these data clean rooms are safe, collaborative spaces which allow multiple companies, or even divisions inside a company where they have particular privacy requirements, to bring two or more data sets together for analysis, but without having to actually share the whole unprotected data set with each other. And this lets you, you know, when you do this inside of Snowflake, you can collaborate using standard tool sets. You can use all of our SQL ecosystem, you can use all of the data science ecosystem that works with Snowflake, you can use all of the BI ecosystem that works with Snowflake. But you can do that in a way that keeps the confidentiality that needs to be preserved inside the data intact. And you can only really do these kinds of collaborations, especially across organizations, but even inside large enterprises, when you have good, reliable data to work with, otherwise your analysis just isn't going to really work properly. A good example of this is one of our large gaming customers, who's an advertiser. They were able to build targeted ads to acquire customers and measure the campaign impact on revenue, but they were able to keep their data safe and secure while doing that, while working with advertising partners. The business impact of that was they were able to get a lift of 20 to 25% in campaign effectiveness through better targeting, and that actually pulled through into a reduction in customer acquisition costs, because they just didn't have to spend as much on the forms of media that weren't working for them. >> So, Ajay, I wonder, I mean, with the way public policy is shaking out, you know, obviously GDPR started it, in the States the California Consumer Privacy Act, and people are sort of taking the best of those, and there's a lot of differentiation, but what are you seeing just in terms of governments really driving this move to privacy? >> Government, public sector, we're seeing a huge wake-up in activity across (indistinct), part of it has been data privacy. The other part of it is being more joined up and more digital, rather than paper- or form-based. We've all gotten used to waiting in the line, holding a form, taking that form to the front of the line and handing it over a desk.
Now government and public sector are really looking to transform their services into being online, (indistinct), self-service. And that whole shift is then driving the need to emulate a lot of what the commercial sector is doing to automate their processes and to unlock the data from silos to put through into those processes. And another thing that I can say about this is the need for data quality, as Duncan mentions, underpins all of these processes: government, pharmaceuticals, utilities, banking, insurance. The ability for a chief marketing officer to drive a loyalty campaign, the ability for a CFO to reconcile accounts at the end of the month to do a quick, accurate financial close, also the ability of customer operations to make sure that the customer has the right details about themselves in the right application so they can self-serve. So all of that is underpinned by data, and it is effective or not based on the quality of that data. So whilst we're mobilizing data to the Snowflake cloud, the ability to then drive analytics, prediction, business processes off that cloud succeeds or fails on the quality of that data. >> I mean, it really is table stakes. If you don't trust the data, you're not going to use the data. The problem is it always takes so long to get to the data quality. There's all these endless debates about it. So we've been doing a fair amount of work and thinking around this idea of decentralized data. Data by its very nature is decentralized, but the fault domain of traditional big data is that everything is just monolithic. The organization's monolithic, the technology's monolithic, the roles are very, you know, hyper-specialized. And so you're hearing a lot more these days about this notion of a data fabric, or what Zhamak Dehghani calls a data mesh, and we've kind of been leaning into that, and the ability to connect various data capabilities, whether it's a data warehouse or a data hub or a data lake, so that those assets are discoverable, they're shareable through APIs, and they're governed on a federated basis. And you're now bringing in machine intelligence to improve data quality. You know, I wonder, Duncan, if you could talk a little bit about Snowflake's approach to this topic. >> Sure, so I'd say that making use of all of your data is the key kind of driver behind these ideas of data meshes or data fabrics. And the idea is that you want to bring together not just your kind of strategic data but also your legacy data and everything that you have inside the enterprise. I think I'd also like to kind of expand upon what a lot of people view as all of the data. And I think that a lot of people kind of miss that there's this whole other world of data they could be having access to, which is things like data from their business partners, their customers, their suppliers, and even stuff that's more in the public domain, whether that's, you know, demographic data or geographic data or all these kinds of other types of data sources. And what I'd say to some extent is that the data cloud really facilitates the ability to share and gain access to this, both between organizations and inside organizations. And you don't have to make lots of copies of the data and kind of worry about the storage, and this federated idea of governance, and all these things that are quite complex to kind of manage.
The Snowflake approach really enables you to share data with your ecosystem or the world, without any latency, with full control over what's shared, without having to introduce new complexities or have complex interactions with APIs or software integration. The simple approach that we provide allows a relentless focus on creating the right data product to meet the challenges facing your business today. >> So Ajay, the key here, as Duncan's talking about it, to my mind, and my key takeaway, is simplicity. If you can take the complexity out of the equation, you're going to get more adoption. It really is that simple. >> Yeah, absolutely. I think that whole journey, maybe five, six years ago the adoption of data lakes was a stepping stone. However, the Achilles heel there was the complexity that it shifted towards consuming that data from a data lake, where there were many, many sets of data to be able to curate and to consume. Whereas actually, the simplicity of being able to go to the data that you need to do your role, whether you're in tax compliance or in customer services, is key. And listen, for Snowflake and Io-Tahoe, one thing we know for sure is that our customers are super smart and they're very capable. They're data-savvy, and they'll want to use whichever tool and embrace whichever cloud platform is going to reduce the barriers to solving what's complex about that data, simplifying that, and using good old-fashioned SQL to access data and to build products from it to exploit that data. So simplicity is key to it, to allow people to make use of that data, and CIOs recognize that. >> So Duncan, the cloud obviously brought in this notion of DevOps and new methodologies, and things like agile have brought in the notion of DataOps, which is a very hot topic right now, basically DevOps applied to data. How does Snowflake think about this? How do you facilitate that methodology? >> So I agree with you absolutely that DataOps takes these ideas of agile development or agile delivery from the kind of DevOps world that we've seen just rise and rise. And it applies them to the data pipeline, which is somewhere where it kind of traditionally hasn't happened. And it's the same kinds of messages as we see in the development world: it's about delivering faster development, having better repeatability, and really getting towards that dream of the data-driven enterprise, where you can answer people's data questions so they can make better business decisions. And we have some really great architectural advantages that allow us to do things like cloning of data sets without having to copy them, and things like time travel, so we can see what the data looked like at some point in the past. And this lets you kind of set up your own kind of little data playpen as a clone, without really having to copy all of that data, so it's quick and easy. And you can also, again with our separation of storage and compute, provision your own virtual warehouse for dev usage, so you're not interfering with anything to do with people's production usage of this data. So these ideas, the scalability, it just makes it easy to make changes, test them, and see what the effect of those changes is. And we've actually seen this, you were talking a lot about partner ecosystems earlier. The partner ecosystem has taken these ideas that are inside Snowflake and they've extended them. They've integrated them with DevOps and DataOps tooling.
So things like version control in Git, and infrastructure automation in things like Terraform. And they've kind of built that out into more of a DataOps product that you can make use of. So we can see there's a huge impact of these ideas coming into the data world. We think we're really well-placed to take advantage of them. The partner ecosystem is doing a great job with doing that. And it really allows us to kind of change that operating model for data, so that we don't have as much emphasis on, like, hierarchy and change windows and all these kinds of things that are maybe viewed as old-fashioned. And we've kind of taken the shift from this batch stage of integration into streaming, continuous data pipelines in the cloud. And this kind of gets you away from, like, a once-a-week, or once-a-month if you're really unlucky, change window to pushing changes in a much more rapid fashion, as the needs of the business change. >> I mean, those hierarchical organizational structures, when we apply those to data, it actually creates the silos. So if you're going to be a silo buster, which, Ajay, I look at you guys as silo busters, you've got to put data in the hands of the domain experts, the business people. They know what data they want, but they have to go through and beg and borrow for new data sets, et cetera. And so that's where automation becomes so key. And frankly, the technology should be an implementation detail, not the dictating factor. I wonder if you could comment on this. >> Yeah, absolutely. I think making the technologies more accessible to the general business users, or those specialist business teams, that's the key to unlocking this. So it is interesting to see, as people move from organization to organization, where they've had those experiences operating in a hierarchical sense, they want to break free from that. And we've been exposed to automation and continuous workflows. Change is continuous in IT, it's continuous in business, the market's continuously changing. So having that flow of work across the organization, using key components such as GitHub and similar to drive process, Terraform to build code into the process and automation, and, with Io-Tahoe, leveraging all the metadata from across those fragmented sources, it's good to see how those things are coming together. And watching people move from organization to organization and say, "Hey, okay, I've got a new start. I've got my first hundred days to impress my new manager. What kind of an impact can I bring to this?" And quite often we're seeing that as, let me take away the good learnings of how to do it, or how not to do it, from my previous role, and this is an opportunity for me to bring in automation. And I'll give you an example, David. We recently started working with a client in financial services who's an asset manager, managing financial assets. They've grown over the course of the last 10 years through M&A, and each of those acquisitions has brought with it technical debt, its own set of data, multiple CRM systems, now multiple databases, multiple bespoke in-house created applications. And when the new CIO came in and had a look at those, he thought, well, yes, I want to mobilize my data. Yes, I need to modernize my data estate, because my CEO is now looking at these crypto assets that are on the horizon, and the new funds that are emerging around digital assets and crypto assets.
But in order to get to that, where absolutely data underpins it and is the core asset, cleaning up that legacy situation and mobilizing the relevant data into the Snowflake cloud platform is where we're giving time back. You know, that is now taking a few weeks, whereas that transition to mobilize that data, to start with that new clean slate to build upon, a new business as a digital crypto asset manager as well as the legacy, traditional financial assets, bonds, stocks, and fixed income assets, you name it, is where we're starting to see a lot of innovation. >> Tons of innovation. I love the crypto examples, NFTs are exploding, and let's face it, traditional banks are getting disrupted. And so I also love this notion of data RPA, especially because, Ajay, I've done a lot of work in the RPA space. And what I would observe is that in the early days of RPA, I call it paving the cow path, taking existing processes and applying scripts, letting software robots do their thing. And that was good because it reduced mundane tasks, but really where it's evolved is a much broader automation agenda. People are discovering new ways to completely transform their processes. And I see a similar analogy for the data operating model. So I wonder, what do you think about that, and how does a customer really get started bringing this to their ecosystem, their data life cycle?
>> Yes, so our drive is really to make that business general user's experience of RPA simpler and using no code to do that where they've also chosen Snowflake to build their Cloud platform. They've got the combination then of using a relatively simple scripting techniques such as SQL without no code approach. And the answer to your question is whichever sector is looking to mobilize their data. It seems like a cop-out but to give you some specific examples, David now in banking, where our customers are looking to modernize their banking systems and enable better customer experience through applications and digital apps, that's where we're seeing a lot of traction in this approach to pay RPA to data. And health care where there's a huge amount of work to do to standardize data sets across providers, payers, patients and it's an ongoing process there. For retail helping to to build that immersive customer experience. So recommending next best actions. Providing an experience that is going to drive loyalty and retention, that's dependent on understanding what that customer's needs, intent are, being able to provide them with the content or the offer at that point in time or all data dependent utilities. There's another one great overlap there with Snowflake where helping utilities telecoms, energy, water providers to build services on that data. And this is where the ecosystem just continues to expand. If we're helping our customers turn their data into services for their ecosystem, that's exciting. Again, they were more so exciting than insurance which it always used to think back to, when insurance used to be very dull and mundane, actually that's where we're seeing a huge amounts of innovation to create new flexible products that are priced to the day to the situation and risk models being adaptive when the data changes on events or circumstances. So across all those sectors that they're all mobilizing their data, they're all moving in some way but for sure form to a multi-Cloud setup with their IT. And I think with Snowflake and with Io Tahoe being able to accelerate that and make that journey simple and less complex is why we've found such a good partner here. >> All right. Thanks for that. And thank you guys both. We got to leave it there really appreciate Duncan you coming on and Ajay best of luck with the fundraising. >> We'll keep you posted. Thanks, David. >> All right. Great. >> Okay. Now let's take a look at a short video. That's going to help you understand how to reduce the steps around your DataOps let's watch. (upbeat music)
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
David | PERSON | 0.99+ |
Ajay Vohora | PERSON | 0.99+ |
Duncan Turnbull | PERSON | 0.99+ |
five | QUANTITY | 0.99+ |
Duncan | PERSON | 0.99+ |
two | QUANTITY | 0.99+ |
Dave | PERSON | 0.99+ |
IO | ORGANIZATION | 0.99+ |
Zhamak Dehghani | PERSON | 0.99+ |
Ajay | PERSON | 0.99+ |
Io Tahoe | ORGANIZATION | 0.99+ |
20 | QUANTITY | 0.99+ |
Io-Tahoe | ORGANIZATION | 0.99+ |
One | QUANTITY | 0.99+ |
California consumer privacy Act | TITLE | 0.99+ |
Tahoe | PERSON | 0.99+ |
Benoit Dageville | PERSON | 0.99+ |
Snowflake | TITLE | 0.99+ |
five years ago | DATE | 0.99+ |
SQL | TITLE | 0.99+ |
first hundred days | QUANTITY | 0.98+ |
four | QUANTITY | 0.98+ |
GDPR | TITLE | 0.98+ |
each | QUANTITY | 0.98+ |
three | QUANTITY | 0.98+ |
both | QUANTITY | 0.98+ |
25% | QUANTITY | 0.97+ |
three things | QUANTITY | 0.97+ |
one | QUANTITY | 0.97+ |
M&A | ORGANIZATION | 0.97+ |
once a week | QUANTITY | 0.97+ |
one thing | QUANTITY | 0.96+ |
Snowflake | ORGANIZATION | 0.95+ |
once a month | QUANTITY | 0.95+ |
DevOps | TITLE | 0.95+ |
snowflake | TITLE | 0.94+ |
single | QUANTITY | 0.93+ |
last six months | DATE | 0.92+ |
States | TITLE | 0.92+ |
six years ago | DATE | 0.91+ |
single way | QUANTITY | 0.91+ |
Snowflake Cloud | TITLE | 0.9+ |
DataOps | TITLE | 0.9+ |
today | DATE | 0.86+ |
12 | QUANTITY | 0.85+ |
35 plus billion dollars | QUANTITY | 0.84+ |
five | DATE | 0.84+ |
Step one | QUANTITY | 0.83+ |
Tons | QUANTITY | 0.82+ |
RedHat | ORGANIZATION | 0.81+ |
Centos | ORGANIZATION | 0.8+ |
One thing | QUANTITY | 0.79+ |
14 months | QUANTITY | 0.79+ |
Scott Buckles, IBM | Actifio Data Driven 2020
>> Narrator: From around the globe, it's theCUBE, with digital coverage of Actifio Data Driven 2020, brought to you by Actifio. >> Welcome back, I'm Stuart Miniman, and this is theCUBE's coverage of Actifio Data Driven 2020. We wish everybody could join us in Boston, but instead we're doing it online this year, of course, and really excited. We're going to be digging into the value of data, how DataOps and data scientists are leveraging data. And joining me on the program, Scott Buckles, he's the North American Business Executive for Database, Data Science, and DataOps with IBM. Scott, welcome to theCUBE. >> Thanks, Stuart, thanks for having me, great to see you. >> Let's start with the Actifio-IBM partnership. Anyone that knows Actifio knows that the IBM partnership is really the oldest one that they've had, whether it's hardware or software, those joint solutions go together. So tell us about the partnership here in 2020. >> Sure. So it's been a fabulous partnership. In the DataOps world, where we are looking to help all of our customers gain efficiency and effectiveness in their data pipeline and getting value out of their data, Actifio really complements a lot of the solutions that we have very well. So the folks, everybody from the top all the way through the engineering team, are a great team to work with. We're very, very fortunate to have them. >> Are there any specific examples or anonymized examples that you can share about joint (indistinct)? >> I'm going to stay safe and go on the anonymized side. But we've had a lot of great wins, several significantly large wins, where we've had clients that have been struggling with their different data pipelines. And when I say data pipeline, I mean getting value from understanding their data, to developing models and doing the testing on that, and we can get into this in a minute, but those folks have really needed a solution where Actifio has stepped in and provided that solution. We've done that at several of the largest banks in the world, including one that was a very recent merger down in the Southeast, where we were able to bring in the Actifio solution and address the customer's needs around how they were testing and how they were trying to really move through that testing cycle, because it was a very iterative process, a very sequential process, and they just weren't doing it fast enough, and Actifio stepped in and helped us deliver that in a much more effective way, in a much more efficient way, especially when you get into a bank, or two banks rather, that are merging and have a lot of work to convert systems into one another and converge data, not an easy task. And that was one of the best wins that we've had in recent months. And again, going back to the partnership, it was an awesome, awesome opportunity to work with them. >> Well, Scott, as I teed up at the beginning of the conversation, you've got data science and DataOps. Help us understand how this isn't just a storage solution when you're talking about VDP. How does DataOps fit into this? Talk a little bit about some of the constituents inside your customers that are engaging with the solution.
So going back 20 years ago, everything was a waterfall approach, everything was very slow , and then you had to wait a long time to figure out whether you had success or failure in the application that you had developed and whether it was the right application. And with the advent of DevOps and continuous delivery, the advent of things like Agile Development methodologies, DataOps is really converging that and applying that to our data pipelines. So when we look at the opportunity ahead of us, with the world exploding with data, we see it all the time. And it's not just structured data anymore, it's unstructured data, it's how do we take advantage of all the data that we have so that we can make that impact to our business. But oftentimes we are seeing where it's still a very slow process. Data scientists are struggling or business analysts are struggling to get the data in the right form so that they can create a model, and then they're having to go through a long process of trying to figure out whether that model that they've created in Python or R is an effective model. So DataOps is all about driving more efficiency, more speed to that process, and doing it in a much more effective manner. And we've had a lot of good success, and so it's part methodology, which is really cool, and applying that to certain use cases within the, in the data science world, and then it's also a part of how do we build our solutions within IBM, so that we are aligning with that methodology and taking advantage of it. So that we have the AI machine learning capabilities built in to increase that speed which is required by our customers. Because data science is great, AI is great, but you still have to have good data underneath and you have to do it at speed. Well, yeah, Scott, definitely a theme that I heard loud and clear read. IBM think this year, we do a lot of interviews with theCUBE there, it was helping with the tools, helping with the processes, and as you said, helping customers move fast. A big piece of IBM strategy there are the Cloud Paks. My understanding you've got an update with regards to BDP and Cloud Pak. So to tell us what the new releases here for the show. >> Yeah. So in our (indistinct) release that's coming up, we will be to launch BDP directly from Cloud Pak, so that you can take advantage of the Activio capabilities, which we call virtual data pipeline, straight from within Cloud Pak. So it's a native integration, and that's the first of many things to come with how we are tying those two capabilities and those two solutions more closely together. So we're excited about it and we're looking forward to getting it in our customer's hands. >> All right. And that's the Cloud Pak for Data, if I have that correct, right? >> That's called Cloud Pak for data, correct, sorry, yes. Absolutely, I should have been more clear. >> No, it's all right. It's, it's definitely, we've been watching that, those different solutions that IBM is building out with the Cloud Paks, and of course data, as we said, it's so important. Bring us inside a little bit, if you could, the customers. What are the use cases, those problems that you're helping your customers solve with these solution? >> Sure. So there's three primary use cases. One is about accelerating the development process. 
Getting into how do you take data from its raw form, which may or may not be usable, in a lot of cases it's not, and getting it to a business ready state, so that your data scientists, your business, your data models can take advantage of it, about speed. The second is about reducing storage costs. As data has exponentially grown so has storage costs. We've been in the test data management world for a number of years now. And our ability to help customers reduce that storage footprint is also tied to actually the acceleration piece, but helping them reduce that cost is a big part of it. And then the third part is about mitigating risk. With the amount of data security challenges that we've seen, customers are continuously looking for ways to mitigate their exposure to somebody manipulating data, accessing production data and manipulating production data, especially sensitive data. And by virtualizing that data, we really almost fully mitigate that risk of them being able to do that. Somebody either unintentionally or intentionally altering that data and exposing a client. >> Scott, I know IBM is speaking at the Data Driven event. I read through some of the pieces that they're talking about. It looks like really what you talk about accelerating customer outcomes, helping them be more productive, if you could, what, what are some of key measurements, KPIs that your customers have when they successfully deploy the solution? >> So when it comes to speed, it's really about, we're looking at about how are we reducing the time of that project, right? Are we able to have a material impact on the amount of time that we see clients get through a testing cycle, right? Are we taking them from months to days, are we taking them from weeks to hours? Having that type of material impact. The other piece on storage costs is certainly looking at what is the future growth? You're not necessarily going to reduce storage costs, but are you reducing the growth or the speed at which your storage costs are growing. And then the third piece is really looking at how are we minimizing the vulnerabilities that we have. And when you go through an audit, internally or externally around your data, understanding that the number of exposures and helping find a material impact there, those vulnerabilities are reduced. >> Scott, last question I have for you. You talk about making data scientists more efficient and the like, what are you seeing organizationally, have teams come together or are they planning together, who has the enablement to be able to leverage some of the more modern technologies out there? >> Well, that's a great question. And it varies. I think the organizations that we see that have the most impact are the ones that are most open to bringing their data science as close to the business as possible. The ones that are integrating their data organizations, either the CDO organization or wherever that may set it. Even if you don't have a CDO, that data organization and who owned those data scientists, and folding them and integrating them into the business so that they're an integral part of it, rather than a standalone organization. I think the ones that sort of weave them into the fabric of the business are the ones that get the most benefit and we've seen have the most success thus far. >> Well, Scott, absolutely. We know how important data is and getting full value out of those data scientists, critical initiative for customers. Thanks so much for joining us. Great to get the updates. 
>> Oh, thank you for having me. Greatly appreciated. >> Stay tuned for more coverage from Actifio Data Driven 2020. I'm Stuart Miniman, and thank you for watching theCUBE. (upbeat music)
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Scott | PERSON | 0.99+ |
Stuart | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Boston | LOCATION | 0.99+ |
Scott Buckles | PERSON | 0.99+ |
Stuart Miniman | PERSON | 0.99+ |
2020 | DATE | 0.99+ |
third piece | QUANTITY | 0.99+ |
Actifio | ORGANIZATION | 0.99+ |
two banks | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
Cloud Pak | TITLE | 0.99+ |
two solutions | QUANTITY | 0.99+ |
Python | TITLE | 0.99+ |
DevOps | TITLE | 0.99+ |
third part | QUANTITY | 0.99+ |
second | QUANTITY | 0.99+ |
first | QUANTITY | 0.99+ |
Actifio Data Driven 2020 | TITLE | 0.98+ |
one | QUANTITY | 0.98+ |
theCUBE | ORGANIZATION | 0.98+ |
two capabilities | QUANTITY | 0.98+ |
Cloud Paks | TITLE | 0.97+ |
20 years ago | DATE | 0.97+ |
this year | DATE | 0.96+ |
three primary use cases | QUANTITY | 0.96+ |
both | QUANTITY | 0.95+ |
DataOps | ORGANIZATION | 0.95+ |
DataOps | TITLE | 0.94+ |
Southeast | LOCATION | 0.94+ |
Agile | TITLE | 0.94+ |
Agile Development | TITLE | 0.92+ |
R | TITLE | 0.88+ |
North American | PERSON | 0.78+ |
Activio Data Driven 2020 | TITLE | 0.74+ |
Cloud | COMMERCIAL_ITEM | 0.74+ |
VDP | TITLE | 0.7+ |
Data Driven | EVENT | 0.67+ |
VDP | ORGANIZATION | 0.53+ |
Paks | TITLE | 0.52+ |
minute | QUANTITY | 0.52+ |
Breaking Analysis: Competition Heats up for Cloud Analytic Databases
(enlightening music) >> From theCUBE's studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE conversation. >> As we've been reporting, there's a new class of workloads emerging in the cloud. Early cloud was all about IaaS: spinning up storage, compute, and networking infrastructure to support startups, SaaS, easy experimentation, dev test, and increasingly moving business workloads into the cloud. Modern cloud workloads are combining data. They're infusing machine intelligence, AI, into applications. They're simplifying analytics and scaling with the cloud to deliver business insights in near real time. And at the center of this mega trend is a new class of data stores and analytic databases, what some call data warehouses, a term that I think is outdated, really, for today's speed of doing business. Welcome to this week's Wikibon CUBE Insights, powered by ETR. In this breaking analysis, we update our view of the emerging cloud native analytic database market. Today, we want to do three things. First, we'll update you on the basics of this market, what you really need to know in the space. The next thing we're going to do is take a look into the competitive environment, and as always, we'll dig into the ETR spending data to see which companies have the momentum in the market and maybe are ahead of some of the others. Finally, we're going to close with some thoughts on how the competitive landscape is likely to evolve, and we want to answer the question: will the cloud giants overwhelm the upstarts, or will the specialists continue to thrive? Let's take a look at some of the basics of this market. We're seeing the evolution of the enterprise data warehouse market space. It's an area that has been critical to supporting reporting and governance requirements for companies, especially post Sarbanes-Oxley, right? However, historically, as I've said many times, EDW has failed to deliver on its promises of a 360-degree view of the business and real-time customer insights. Classic enterprise data warehouses are too cumbersome, they're too complicated, they're too slow, and they don't keep pace with the speed of the business. Now, EDW is about a $20 billion market, but the analytic database opportunity in the cloud, we think, is much larger. Why is that? It's because cloud computing unlocks the ability to rapidly combine multiple data sources, bring data science tooling into the mix, very quickly analyze data, and deliver insights to the business. Even more importantly, it allows line-of-business pros to access data in a self-service mode. It's a new paradigm that uses the notion of DevOps as applied to the data pipeline: agile data, or what we sometimes call DataOps. This is a highly competitive marketplace. In the early part of last decade, you saw Google bring BigQuery to market, Snowflake was founded, and AWS did a one-time license deal to acquire the IP of ParAccel, an MPP database, on which it built Redshift. In the latter part of the decade, Microsoft threw its hat in the ring with SQL DW, which Microsoft has now evolved into Azure Synapse; they did so at the Build conference a few weeks ago. There are other players as well, like IBM. So you can see, there's a lot at stake here. The cloud vendors want your data, because they understand this is one of the key ingredients of the next decade of innovation. No longer is Moore's Law the mainspring of growth; we've said this many times.
Rather, today it's data-driven, and AI to push insights and scale with the cloud. Here's the interesting dynamic that is emerging in the space. Snowflake is a cloud specialist in this field, having raised more than a billion dollars in venture capital, a billion four, a billion five. And it's up against the big cloud players, who are moving fast and often stealing moves from Snowflake and driving customers to their respective platforms. Here's an example that we reported on at last year's re:Invent. It's an article by Tony Baer. He wrote this on ZDNet talking about how AWS' RA3 instances separate compute from storage, and of course, this was a founding architectural principle for Snowflake. Here's another example from The Information, reporting on Microsoft turning up the heat on Snowflake. And you can see the highlighted text, where the author talks about Microsoft trying to divert customers to its database. So you've got this weird dynamic going on. Snowflake doesn't run on-prem; it only runs in the cloud. It runs on AWS, runs on Azure, runs on GCP. The cloud players, again, all want your data to go into their database, so they want you to put your data into their respective platforms. At the same time, they need SaaS ISVs to run in the cloud because it sells infrastructure services. So is Snowflake going to pivot to run on-prem to try to differentiate from the cloud giants? I asked Frank Slootman, Snowflake's CEO, about the on-prem opportunity and his perspective earlier this year. Let's listen to what he said. >> Okay, we're not doing this endless hedging that people have done for 20 years, sort of keeping a leg in both worlds. Forget it, this will only work in the public cloud, because this is how the utility model works, right? I think everybody is coming to this realization, right? I mean, the excuses are running out at this point. We think that people will come to the public cloud a lot sooner than we will ever come to the private cloud. It's not that we can't run a private cloud; it just diminishes the potential and the value that we bring. >> Okay, so pretty definitive statements by Slootman. Now, the question I want to pose today is: can Snowflake compete, given the conventional wisdom that we saw in the media articles that the cloud players are going to hurt Snowflake in this market? And if so, how will they compete? Well, let's see what the customers are saying and bring in some ETR survey data. This chart shows two of our favorite metrics from the ETR data set: Net Score, which is on the y-axis and, remember, is a measure of spending momentum; and market share, which is on the x-axis and is a measure of pervasiveness in the data set. And what we show here are some of the key players in the EDW and cloud native analytic database market. I'll make a couple of points, and we'll dig into this a little bit further. The first thing I want to share is that this is the April ETR survey, which was taken at the height of the US lockdown for the pandemic. The survey captured responses from more than 1,200 CIOs and IT buyers, asking about their spending intentions for analytic databases from the companies that we show here on this x-y chart. So the higher the company is on the vertical axis, the stronger the spending momentum relative to last year, and you can see Snowflake has a 77% Net Score. It leads all players, with AWS Redshift showing very strong as well. Now in the box in the lower right, you see a chart.
Those are the exact Net Scores for all the vendors, with the Shared N. The Shared N is the number of citations for that vendor within the overall N of 1,269. So you can see the Ns are quite large, certainly large enough to feel comfortable with some of the conclusions that we're going to make today. Microsoft has a huge footprint, and it somewhat skews the data with its very high market share due to its volume. And you can see where Google sits: good momentum, not as much presence in the marketplace. We've also added a couple of on-prem vendors, Teradata and Oracle, primarily on-prem, just for context. They're two companies that compete; they obviously have some cloud offerings, but again, most of their base is on-prem. So what I want to do now is drill into this a little bit more by looking at Snowflake within the individual clouds. So let's look at Snowflake inside of AWS. That's what this next chart shows: customer spending momentum, Net Score, inside of AWS accounts. We cut the data to isolate those ETR survey respondents running AWS, so there's an N there of 672, as you can see. The bars show the Net Score granularity for Snowflake and Amazon Redshift. Now, note that we show 96 Shared N responses for Snowflake and 213 for Redshift within the overall N of 672 AWS accounts. The colors show 2020 spending intentions relative to 2019. So let's read left to right here: replacements are the bright red; spending less by 6% or more is the pinkish; flat spending is the gray; increasing spending by more than 6% is the forest green; and adding the platform new is the lime green. Now, remember, Net Score is derived by subtracting the reds from the greens. And you can see that Snowflake has more spending momentum in the AWS cloud than Amazon Redshift, by a small margin, but look: 80% of the AWS accounts plan to spend more on Snowflake, with 35% adding it new. Very strong. 76% of AWS customers plan to spend more on Redshift in 2020 relative to 2019, with only 12% adding the platform new. But nonetheless, both are very, very strong, and you can see here the key point is minimal red and pink: not a lot of people leaving, not a lot of people spending less. It's going to be critical to see in the June ETR survey, which is in the field this month, if Snowflake is able to hold on to these new accounts that it's gained in the last couple of months. Now, let's look at how Snowflake is doing inside of Azure and compare it to Microsoft. So here's the data from the ETR survey, the same view of the data, except we isolate on Azure accounts. The N there is 677 Azure accounts. And we show Snowflake and Microsoft cuts for analytic databases, with 83 and 393 Shared N responses respectively. So again, enough, I feel, to draw some conclusions from this data. Now, note the Net Scores: Snowflake again winning, with 78% versus 51% for Microsoft. 51% is strong, but 78% is a meaningful lead for Snowflake within the Microsoft base. Very interesting. And once again, you see massive new adds, 41% for Snowflake, whereas Microsoft's Net Score is being powered really by growth from existing customers, that forest green. And again, very little red for both companies, so super positive there. Okay, let's take a look now at how Snowflake's doing inside of Google accounts, GCP, Google Cloud Platform. So here's the ETR data, the same view of that data, but now we isolate on GCP accounts.
There are fewer: 298 running GCP. Then you've got those running Snowflake and Google analytic databases, largely BigQuery, though there could be some others in there. The Snowflake Shared N is 49; it's smaller than on the other clouds because the company just announced support for GCP about a year ago, I think it was last June, but it's still large enough to draw conclusions from the data. I feel pretty comfortable with that; we're not slicing and dicing it too finely. And you can see the Google Shared N at 147. Look at the story. I sound like a broken record: Snowflake is again winning by a meaningful margin if you measure Net Score, or spending momentum. So, a 77.6% Net Score versus Google at 54%, with Snowflake at 80% in the green. Both companies, very little red. So this is pretty impressive: Snowflake has greater spending momentum than the captive cloud providers in all three of the big US-based clouds. So the big question is: can Snowflake hold serve and continue to grow, and how are they going to be able to do that? Look, as I said before, this is a very competitive market. We've reported how Snowflake is taking share from some of the legacy on-prem data warehouse players like Teradata and IBM, and, from what our data suggests, Lumen and Oracle too. I've reported how IBM is stretched thin on its research and development budget; it spends about $6 billion a year, but it's got to spend it across a lot of different lines. Oracle's got more targeted R&D spending; it can target more toward database and direct more of its free cash flow to database than IBM can. But Amazon, and Microsoft, and Google don't have that problem. They spend a ton of dough on R&D. And here's an example of the challenge that Snowflake faces. Take a look at this partial list that I drew together of recent innovations. We show here a set of features that Snowflake has launched in 2020, and AWS since re:Invent last year. I don't have time to go into these, but we do know this: AWS is no slouch at adding features. Amazon, as a company, spends 2x more on research and development than Snowflake is worth as a company. So why do I like Snowflake's chances? Well, there are several reasons. First, every dime that Snowflake spends on R&D, go-to-market, and ecosystem goes into making its databases better for its customers. Now, I asked Frank Slootman in the middle of the lockdown how he was allocating precious capital during the pandemic. Let's listen to his response. >> I've said there's no layoffs on our radar, number one. Number two, we are hiring. And number three is, we have a higher level of scrutiny on the hires that we're making. And I am very transparent. In other words, I tell people, "Look, I prioritize the roles that are closest "to the drivetrain of the business." Right, it's kind of common sense, but I wanted to make sure that this is how we're thinking about this. There are some roles that are more postponable than others. I'm hiring in engineering without any reservation, because that is the long-term, strategic interest of the company.
I see multi-cloud as increasingly viable and important to organizations, not only because CIOs are being asked to clean up the crime scene, as I've often joked, but also because it's increasingly becoming a strategy: the right cloud for the right workload. So first, let me reiterate what I said at the top. New workloads are emerging in the cloud; real-time AI, insights extraction, and real-time inferencing are going to be competitive differentiators. It's all about the data. The new innovation cocktail stems from machine intelligence applied to that data, with data science tooling and simplified interfaces that enable scaling with the cloud. You've got to have simplicity if you're going to scale, and cloud is the best way to scale; it's really the only way to scale globally. So as such, we see cross-cloud exploitation as a real differentiator for Snowflake and others that build high quality cloud native capabilities for multiple clouds, and I want to spend a minute on this topic generally and talk about what it means for Snowflake specifically. Now, we've been pounding the table lately saying that building capabilities natively for the cloud, versus putting a wrapper around your stack and making it run in the cloud, is key. It's a big difference. Why is this? Because cloud native means taking advantage of the primitive capabilities within the respective clouds to create the highest performance, the lowest latency, the most efficient, and the most secure services for that cloud, really exploiting that cloud. And this is enabled only by natively building in the cloud, and that's why Slootman is so dogmatic on this issue. Multi-cloud can be a differentiator for Snowflake. Think about it: data lives everywhere, and ideally you want to keep data where it lives; you don't want to have to move it, whether it's on AWS, Azure, or whatever cloud is holding that data. If the answer to your query requires tapping data that lives in multiple clouds across a data network, and the app needs fast answers, then you need low latency access to that data. So here's what I think. I think Snowflake's game is to automate by abstracting away the complexity around data location (of course, latency is a part of that), metadata, bandwidth concerns, the time to get from query to answer, all those factors that build complexity into the data pipeline, and then to optimize that to get insights, irrespective of data location. So the differentiating formula is really to not only be the best analytic database but to be cloud agnostic. AWS, for example, has a cloud agenda, as do Azure and GCP, and their number one answer to multi-cloud is: put everything on our cloud. Yeah, Microsoft and Google Anthos would argue against that, but we know that behind the scenes, that's what they want. They've got offerings across clouds, but Snowflake is going to make this a top priority. They can lead with that, and they must be best at it. And if Snowflake can do this, it's going to have a very successful future, in our opinion. And by all accounts, and the data that we shared, Snowflake is executing well. All right, so that's a wrap for this week's CUBE Insights, powered by ETR. Don't forget, all these breaking analysis segments are available as podcasts; just Google "breaking analysis with Dave Vellante." I publish every week on wikibon.com and siliconangle.com. Check out etr.plus.
That's where all the survey data is. And reach out to me: I'm @dvellante on Twitter, you can hit me up on my LinkedIn posts, or email me at david.vellante@siliconangle.com. Thanks for watching, everyone. We'll see you next time. (enlightening music)
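As a footnote to the methodology in the analysis above, the Net Score arithmetic (greens, meaning accounts adding the platform new or increasing spend, minus reds, meaning accounts replacing it or decreasing spend) is simple enough to sketch in a few lines. The figures below are made-up stand-ins to show the shape of the calculation, not actual ETR survey numbers.

```python
# Illustrative Net Score calculation in the spirit of the ETR methodology
# described above: Net Score = greens - reds, where greens are accounts
# adding the platform new or increasing spend by more than 6%, reds are
# accounts replacing it or decreasing spend by 6% or more, and flat
# spenders count toward neither side.

def net_score(adding_new, spending_more, flat, spending_less, replacing):
    shared_n = adding_new + spending_more + flat + spending_less + replacing
    greens = adding_new + spending_more
    reds = spending_less + replacing
    return 100.0 * (greens - reds) / shared_n

# Hypothetical Shared N of 96 accounts split across the five buckets.
print(round(net_score(adding_new=34, spending_more=43, flat=15,
                      spending_less=2, replacing=2), 1))  # 76.0
```

A score approaching 80 on this scale, as in the Snowflake cuts discussed above, means nearly all cited accounts are either new adopters or growing spenders.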
Ajay Vohora, Io-Tahoe | Enterprise Data Automation
>> Narrator: From around the globe, it's theCUBE! With digital coverage of enterprise data automation, an event series brought to you by Io-Tahoe. >> Okay, we're back. Welcome back to Data Automated. Ajay Vohora is CEO of Io-Tahoe. Ajay, good to see you; how are things in London? >> Things are doing well, things are doing well, we're making progress. Good to see you, hope you're doing well, and it's a pleasure being back here on theCUBE. >> Yeah, it's always great to talk to you. We're talking enterprise data automation; as you know, within our community we've been pounding the whole DataOps conversation. A little different, though, and we're going to dig into that a little bit, but let's start with this, Ajay: how are you seeing the response to COVID, and I'm especially interested in the role that data has played in this pandemic. >> Yeah, absolutely. I think everyone's adapting, both socially and in business. The customers that I speak to day in, day out, that we partner with, are busy adapting their businesses to serve their customers. It's very much a game of ensuring that we can serve our customers to help their customers, and the adaptation that's happening here is trying to be more agile, trying to be more flexible. And there's a lot of pressure on data, a lot of demand on data, to deliver more value to the business, to serve that customer. >> Yeah, I mean data, machine intelligence, and cloud are really three huge factors that have helped organizations in this pandemic, and the machine intelligence or AI piece, that's what automation is all about. How do you see automation helping organizations evolve, maybe faster than they thought they might have to? >> For sure, I think it's the necessity of these times. There's, as they say, a lot of demand for doing something with data; a lot of businesses talk about being data-driven. It's interesting, I sort of look behind that when we work with our customers, and it's all about the customer. My peers, CEOs, investors, shareholders: the common theme here is the customer, and that customer experience starts and ends with data. It's being able to move from a point of reacting to what the customer is expecting, and taking it that step forward where you can be proactive in serving that customer's expectations, and that's definitely come alive now in the current times. >> Yeah, so as I said, we've been talking about DataOps a lot, the idea being DevOps applied to the data pipeline. But talk about enterprise data automation: what is it to you, and how is it different from DataOps? >> Yeah, great question, thank you. I think we're all familiar with, and there's more and more awareness around, DevOps as it's applied to processes and methodologies that have become more mature over the past five years: managing change, managing application life cycles, managing software development. DevOps has been great at breaking down those silos between different roles and functions and bringing people together to collaborate. And we definitely see those tools, those methodologies, those processes, that kind of thinking, lending itself to data with DataOps, and that is exciting; we're excited about that. It shifts the focus from IT versus business users to who are the data producers and who are the data consumers, and in a lot of cases those can sit in many different lines of business.
So with DataOps, those methods, those tools, those processes, what we look to do is build on top of that with data automation. It's the nuts and bolts of the algorithms, the models behind machine learning, the functions; that's where we invest our R&D. We bring that in to build on top of the methods, the ways of thinking that break down those silos, and inject that automation into the business processes that are going to drive a business to serve its customer. It's a layer beyond DevOps and DataOps, taking it to that point where, the way I like to think about it, it's the automation behind the automation. I'll give you the example of a bank where we've done a lot of work to accelerate their digital transformation. What we're finding is that as we're able to automate the jobs related to data, managing that data and serving that data, that feeds into them as a business automating their processes for their customer. So it's definitely having a compound effect. >> Yeah, I mean I think that DataOps for a lot of people is somewhat new; the whole DevOps, the DataOps thing is good and it's a nice framework, a good methodology, and there is obviously a level of automation in there and collaboration across different roles, but it sounds like you're talking about sort of supercharging it, if you will: the automation behind the automation. You know, organizations talk about being data-driven; you hear that thrown around a lot. A lot of times people will sit back and say, "We don't make decisions without data." Okay, but really, being data-driven involves a lot of aspects. There's the cultural piece, but there's also putting data at the core of your organization, understanding how it affects monetization, and, as you know well, silos have been built up, whether it's through M&A, data sprawl, or outside data sources. So I'm interested in your thoughts on what data-driven means and specifically how Io-Tahoe plays there. >> Yeah, sure, I'd be happy to speak to that, David. We've come a long way in the last three or four years. We started out by automating some of those tasks that are simple to codify but have a high impact on an organization, across a data lake, across a data warehouse: those data-related tasks that help classify data. And a lot of our original patents and the IP portfolio that we've built up are very much around there: automating, classifying data across different sources, and then being able to serve that for some purpose. So originally, some of those simpler challenges that we helped our customers solve were around data privacy. I've got a huge data lake here, I'm a telecoms business, so I've got millions of subscribers, and quite often a chief data officer's challenge is: how do I cover the operational risk here, where I've got so much data? I need to simplify my approach to automating and classifying that data. The reason is, you can't do that manually; we can't throw people at it, and the scale of it is prohibitive. Quite often, if you were to do it manually, by the time you've got a good picture of it, it's already out of date. So, starting with those simple challenges that we've been able to address, we've then gone on and built on that to see: what else do we serve?
What else do we serve for the chief data officer, the chief marketing officer, and the CFO? In these times, those decision-makers have a lot of choices in the platforms and tooling they adopt, and they're very much looking for that Swiss army knife. Being able to do one thing really well is great, but more and more, as that cost pressure challenge comes in, it's about how we offer more across the organization and bring in those lines-of-business activities that depend on data, not just IT. >> So, in theCUBE we sometimes like to talk about: okay, what is it, then how does it work, and what's the business impact? We kind of covered what it is; I'd love to get into the tech a little bit in terms of how it works, and I think we have a graphic here that gets into that a little bit. So guys, if you could bring that up; I wonder, Ajay, if you could tell us, what is the secret sauce behind Io-Tahoe, and if you could take us through this slide. >> Ajay: Sure. I mean, right there in the middle, the heart of what we do, is the intellectual property that we've built up over time, which takes heterogeneous data sources, your Oracle relational database, your mainframe, your data lake, and increasingly APIs and devices that produce data, and creates the ability to automatically discover that data and classify that data. After it's classified, we then have the ability to form relationships across those different source systems, silos, and lines of business, and once we've automated that, we can start to do some cool things, such as putting context and meaning around that data. So it's moving now toward being data-driven, and increasingly we have really smart, bright people in our customer organizations who want to do some of those advanced knowledge tasks: data scientists, and quants in some of the banks that we work with. The onus is on serving them, putting everything we've done there with automation, classifying the data, the relationships, understanding data quality and the policies that you can apply to that data, and putting it in context. Once you've got the ability to empower a professional who's using data to put that data in context and search across the entire enterprise estate, they can start to do some exciting things and piece together the tapestry, the fabric, across their different systems: could be CRM, ERP systems such as SAP, and some of the newer cloud databases that we work with; Snowflake is a great one. >> Yeah, so you're describing sort of one of the reasons why there are so many stovepipes in organizations, 'cause data is kind of locked into these silos and applications. And I also want to point out that previously, to do discovery, to do that classification that you talked about, to form those relationships, to glean context from data, a lot of that, if not most of that, in some cases all of that, would've been manual. And of course it's out of date so quickly; nobody wants to do it because it's so hard, so this again is where automation comes into the idea of really becoming data-driven. >> Sure. I mean, if I look back maybe five years ago, we had a prevalence of data lake technologies at the cutting edge, and those have started to converge and move to some of the cloud platforms that we work with, such as Google and AWS.
And I think, very much as you've said, those manual attempts to try and grasp what is such a complex challenge at scale quickly run out of steam, because once you've got your fingers on the details of what's in your data estate, it's changed. You've onboarded a new customer, you've signed up a new partner, a customer has adopted a new product that you've just launched, and that slew of data keeps coming. To keep pace with that, the only answer really is some form of automation. And what we've found is that if we can tie automation with what I said before, the expertise, the subject matter experience that sometimes goes back many years within an organization's people, that augmentation between machine learning, AI, and the knowledge that sits inside the organization really tends to unlock a lot of value in data. >> Yeah, so you know well, Ajay, you can't, as a smaller company, be all things to all people, so the ecosystem is critical. You're working with AWS, you're working with Google, you've got Red Hat and IBM as partners. What is attracting those folks to your ecosystem? And give us your thoughts on the importance of ecosystem. >> Yeah, that's fundamental. I mean, when I came into Io-Tahoe here as CEO, one of the trends that I wanted us to be part of was being open, having an open architecture that allowed one thing that was close to my heart: as a CEO or a CIO, you've got a budget vision, and you've already made investments into your organization, and some of those are pretty long-term bets. They could be going out five, 10 years sometimes, with a CRM system, training up your people, getting everybody working together around a common business platform. What I wanted to ensure is that we could openly plug in, using the APIs that were available, to a lot of that sunk investment and the cost that has already gone into managing an organization's IT, for business users to perform. So, part of the reason why we've been able to be successful with some of our partners like Google, AWS, and increasingly a number of technology players, such as Red Hat, and MongoDB is another one that we're doing a lot of good work with, and Snowflake, is that those investments have been made by the organizations that are our customers, and we want to make sure we're adding to that and then leveraging the value that they've already committed to. >> Okay, so we've talked about what it is and how it works; now I want to get into the business impact. I would say what I would be looking for from this would be: can you help me lower my operational risk? I've got tasks that I do, many are sequential, some are in parallel, but can you reduce my time to task? Can you help me reduce the labor intensity, and ultimately my labor cost, so I can put those resources elsewhere? And ultimately, I want to reduce the end-to-end cycle time, because that is going to drive telephone-number ROI. So am I missing anything? Can you do those things? Maybe you can give us some examples of the ROI and the business impact. >> Yeah. I mean, the ROI, David, is built upon the three things that I've mentioned. It's a combination of leveraging the existing investment, the existing estate, whether that's on Microsoft Azure, or AWS, or Google, or IBM, and putting that to work, because the customers that we work with have made those choices. On top of that, it's ensuring that we have got the automation working right down to the level of the data, at the column level or the file level.
So we don't just deal with metadata; it's about being very specific, down to the most granular level. So as we run our processes and the automation, the classification, the tagging, applying the policies an organization has across different compliance and regulatory needs to the data, everything that then happens downstream from that is ready to serve a business outcome. It could be a customer who wants an experience on a mobile device, a tablet, or face to face within a store. Being able to provision the right data, and enable our customers to do that for their customers with the right data that they can trust, at the right time, just in that real-time moment where a decision or an action is being expected: that's driving the ROI to be, in some cases, 20x plus, and that's really satisfying to see. That kind of impact is taking years down to months, and in many cases months of work down to days, and in some cases hours, in the time to value. I'm impressed with how quickly, out of the box, with very little training, a customer can pick up our tool and use features such as search, data discovery, the knowledge graph, and identifying duplicate and redundant data. Straight off the bat, within hours. >> Well, it's why investors are interested in this space. I mean, they're looking for a big total available market; they're looking for a significant return. 10x is, well, you've got to have 10x; 20x is better. So that's exciting, and obviously strong management and a strong team. I want to ask you about people and culture. So you've got people, process, technology; we've seen with this pandemic that the processes are really unpredictable, and the technology has to be able to adapt to any process, not the reverse. You can't force your process into some static software, so that's very, very important. But at the end of the day, you've got to get people on board. So I wonder if you could talk about this notion of culture, and a data-driven culture. >> Yeah, that's so important. I mean, the current times are forcing the necessity of the moment to adapt, but as we start to work our way through these changes and adapt and work with our customers to adapt to these changing economic times, what we're seeing here is the ability to have the technology complement, in a really smart way, what those business users and IT knowledge workers are looking to achieve together. So, I'll give you an example. Quite often the data operations teams in the companies that we're partnering with have a lot of inbound inquiries on a day-to-day level: "I really need this set of data because I think it can help my data scientists run a particular model," or "What would happen if we combined these two different silos of data and got some enrichment going?" Now, those requests can sometimes take weeks to realize. What we've been able to do with the power of (audio glitches) technology is to get those answers addressed by the business users themselves, and now, with our customers, they're coming to the data and IT folks saying, "Hey, I've now built something in a development environment; why don't we see how that can scale up with these sets of data?" I don't need terabytes of it; I know exactly the columns and the features in the data that I'm going to use, and that cuts out a lot of wastage, and time, and cost, to innovate.
>> Well, that's huge. I mean, the whole notion of self-service, of the lines of business actually feeling like they have ownership of the data, as opposed to IT or some technology group owning the data; because then you've got data quality issues, or if it doesn't line up with their agenda, you're going to get a lot of finger pointing. So that is a really important piece of it. I'll give you the last word, Ajay, your final thoughts, if you would. >> Yeah, we're excited to be on this path, and I think we've got some great customer examples here, where we're having a real impact at a really fast pace, whether it's helping them migrate to the cloud or helping them clean up their legacy data lake. And quite often now, the conversation is around data quality. More of the applications that we enable to work more proficiently, could be RPA, robotic process automation, or a lot of the APIs that are now available in the cloud platforms, are dependent on data quality. And being able to automate so that business users can take accountability for looking at the trend of their data quality over time, and get those signals, is really driving trust, and that trust in data is in turn helping the IT teams, and the data operations teams they partner with, do more, and more quickly. So it comes back to culture: being able to apply the technology in such a way that it's visual, it's intuitive, and, just like DevOps has helped with IT, DataOps is putting the intelligence in at the data level to drive that collaboration. We're excited. >> You know, you remind me of something. I lied; I don't want to go yet, if it's okay. I know we're tight on time, but you mentioned migration to the cloud, and I'm thinking about the conversation with Paula from Webster Bank. Migrations are a nasty word for organizations, and we saw this with Webster. How are you able to help minimize the migration pain, and why is that something that you guys are good at? >> Yeah, I mean, there are many large, successful companies that we've worked with; Webster's a great example. I'd like to give you an analogy here: if you're running a business as a CEO, you've got a lot of bright people in your teams, and it's a bit like a living brain. But imagine if those different parts of your brain were not connected; that would certainly diminish how you're able to perform. So, what we're seeing, particularly with migration, is that where banks, retailers, and manufacturers have grown over the last 10 years through acquisition and through different initiatives to drive customer value, that sprawl in their data estate hasn't been fully dealt with. It's sometimes been a good thing to leave whatever you've acquired or created in situ, side by side with that legacy mainframe and your Oracle ERP. And what we're able to do very quickly with that migration challenge is shine a light on all the different parts of the data and applications, at the column level, or at the file level if it's a data lake, and show an enterprise architect or a CDO how everything's connected, where there may not be any documentation.
The bright people that created some of those systems have long since moved on, or retired, or been promoted into other roles, and within days, being able to automatically generate, and keep refreshed, the state of that data across that landscape, and put it into context, then allows you to look at a migration with the confidence that you're dealing with the facts, rather than what we've often seen in the past: teams of consultants, business analysts, and data analysts spending months getting an approximation, a good idea of what the current state could be, and trying their very best to map that to the future target state. Now, with Io-Tahoe, you're able to run those processes within hours of getting started and build that picture, visualize that picture, and bring it to life. The ROI starts off the bat with finding data that should've been deleted, data that there are copies of, and being able to equip the architect, whether we're working on GCP, or on a migration to any of the clouds such as AWS, or, quite often now, a multi-cloud landscape. >> Yeah, that visibility is key to reducing operational risk, giving people confidence that they can move forward, and being able to do that, and update it on an ongoing basis, means you can scale. Ajay Vohora, thanks so much for coming to theCUBE and sharing your insights and your experiences; great to have you. >> Thank you, David, I look forward to talking again. >> All right, and keep it right there, everybody. We're here with Data Automated on theCUBE. This is Dave Vellante, and we'll be right back after this short break. (calm music)
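To ground the discover-classify-relate flow Ajay describes, here is a deliberately simple, hypothetical sketch of column-level classification. Io-Tahoe's actual approach relies on machine learning and a much richer set of signals; this toy version only uses regular expressions over sampled values, but it shows the shape of tagging columns so that downstream policies, masking, and relationship discovery have something machine-readable to act on.

```python
import re

# Toy column-level classifier (illustrative only): tag a column when
# enough of its sampled values match a simple pattern. Real tools use
# ML models and many more signals than fixed regexes.

PATTERNS = {
    "EMAIL": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "PHONE": re.compile(r"^\+?[\d\s\-()]{7,15}$"),
    "CARD":  re.compile(r"^\d{4}([ -]?\d{4}){3}$"),
}

def classify_column(values, threshold=0.8):
    """Return the tag whose pattern matches at least `threshold` of the
    non-null sampled values, or UNCLASSIFIED if none does."""
    non_null = [v for v in values if v]
    for tag, pattern in PATTERNS.items():
        if non_null:
            hits = sum(1 for v in non_null if pattern.match(v))
            if hits / len(non_null) >= threshold:
                return tag
    return "UNCLASSIFIED"

sample_columns = {
    "contact": ["ada@example.com", "grace@example.com", None],
    "notes":   ["call back Tuesday", "prefers email"],
}
tags = {col: classify_column(vals) for col, vals in sample_columns.items()}
print(tags)  # {'contact': 'EMAIL', 'notes': 'UNCLASSIFIED'}
```

Once columns carry tags like these, the automation Ajay describes, applying privacy policies, flagging duplicates, and linking related columns across silos, has a foundation to build on.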
Aliye Ozcan, IBM | IBM DataOps 2020
>> Hi everybody, this is Dave Vellante with theCUBE, and when we talk to practitioners about data and AI, they have trouble infusing AI into their data pipeline and automating that data pipeline. So we're bringing together the community, brought to you by IBM, to really understand how successful organizations are operationalizing the data pipeline, and with me to talk about that is Aliye Ozcan. Aliye, hello, introduce yourself. Tell us about who you are. >> Hi Dave, how are you doing? Yes, my name is Aliye Ozcan; I'm the Data Operations, DataOps, Global Marketing Leader at IBM. >> So I'm very excited about this project. Go to crowdchat.net/dataops, add it to your calendar, and check it out. So we have practitioners, Aliye, from Harley-Davidson, Standard Bank, Associated Bank. What are we going to learn from them? >> What we are going to learn from them is their data experiences. What are the data challenges that they are going through? What are the data bottlenecks that they have had? Especially in these challenging times right now: the industry is going through this challenging time, and we are all going through this. How is the foundation that they invested in now helping them pivot quickly to the new market demands, fast? That is fascinating to see, and I'm very excited to be having individual conversations with those experts and bringing those stories to the audience here. >> Awesome, and we also have Inderpal Bhandari from the CDO office at IBM, so go to crowdchat.net/dataops, add it to your calendar, and we'll see you in the crowd chat.
Katie Kupec, IBM | IBM DataOps 2020
>> Hi, I'm Katie Kupec, Global Portfolio Product Marketing Manager for IBM Master Data Management. Master Data Management is a key part of the DataOps toolchain, there to deliver a trusted, complete view of your customers and products and to offer unique and personalized digital experiences. To learn more about this, join us at our DataOps crowd chat event on May 27th. Hope to chat with you there.
Jerry Chen, Greylock | AWS re:Invent 2019
>> Narrator: Live from Las Vegas, it's theCUBE, covering AWS re:Invent 2019. Brought to you by Amazon Web Services and Intel, along with its ecosystem partners. >> Well, welcome back, everyone, to theCUBE's live coverage in Las Vegas for AWS re:Invent. It's theCUBE's 10th year of operations, it's our seventh AWS re:Invent, and every year it gets better and better, and every year we've had theCUBE at re:Invent, Jerry Chen has been on as a guest. He's a VIP: Jerry Chen, now a general partner at Greylock, a tier-one, one of the leading global venture capital firms in Silicon Valley. Jerry, you've been on the journey with us the whole time. >> I guess I'm your good luck charm. >> (laughs) Well, keep it going. Keep on changing the game. So, thanks for coming on. >> Jerry: Thanks for having me. >> So now you're a seasoned partner at Greylock. You've got a lot of investments under your belt. How's it going? >> It's great, I mean look, every single year I look around the landscape thinking, "What else could be coming? What might surprise us this year?" What are the new trends, both macro-trends and also company trends, like who's going to buy whom, who's going to go public? Every year, it just gets busier and busier and bigger and bigger. >> All these new categories are emerging with this new architecture. I call it Cloud 2.0, maybe next-gen cloud, whatever you want to call it; there's clear visibility now into the fact that DevOps is working. Cloud operations, large-scale operations with cloud, is certainly a great value proposition. You're seeing now multiple databases, pick the tool; I think Jassy got that right in his keynote, I believe that. But now the data equation comes over the top. So you've got DevOps, infrastructure as code; you've got data now looking like it's going to go down that same path of data as code, where developers don't have to deal with all the different nuances of how data's stored, how it's handled, where it is, warm or cold or in Glacier. So, developers still don't have that yet today. Seems to be an area of focus for Amazon. What's your take on all this? >> I think you saw, so, what drove DevOps? Speed, right? It's basically developers fusing with operations, the merging of two groups. So we're seeing the same trend in DataOps, right? How data engineers and data scientists can now have the same speed developers have had for the past 10 years: DataOps. So, A, what does that mean? Give me the menu of what I want, like Goldilocks: too big, too small, just right; too hot, too cold, just right. Like, give me the storage tier, the data tier, the size I want, the temperature I want, and the speed I want. So you're seeing DataOps give data the same kind of Goldilocks treatment that developers have had. >> And in terms of cloud evolution, again, you've seen the movie from the beginning, at VMware, now through Amazon, its seventh year. What jumps out at you? What do you look at, squinting through the trend lines and the fashion of the features? It still seems to be the same old game: compute, memory, storage, and software. >> Well, I mean, compute, memory, storage: those are the atomic building blocks of compute, right? So regardless of services and these high-level frameworks, deep down you still have compute, networking, and storage. So those are the building blocks, but I think in this 10th year of re:Invent we're seeing, it's not one-size-fits-all but this really big, fat long tail: small instances, micro-instances, serverless, big instances for jumbo VMs, bare metal, right?
So you're seeing not one architecture; folks can kind of pick and choose, buy compute by the drip, by the drop, or buy compute by the whole VM or a whole server full. >> And a lot of people are like, the builders love that. Amazon owns the builder market. I mean, anyone who's doing a startup pretty much starts on Amazon. It's the most robust: you pick your tools, you build. But Steve Malaney, who was just on before us, says, "Enterprises don't want power tools; they're going to cut their hand off." (laughs) Right? So, Microsoft's been winning with this approach of consumable cloud, and it's a nice card to play because they're not yet there on capabilities with Amazon, so it's a good call; they've got an enterprise sales force. Microsoft is playing a different game than AWS because they have to. >> Sure, I mean, it's like football: you have a running game, you need a passing game, right? So if you can't beat them with the running game, you go with a passing game, and so Amazon has kind of the fundamental building blocks, or power tools, for the builders. There's a large segment of the population out there that doesn't want that level of building blocks; they want something a little bit more prescriptive. Microsoft's been around the enterprise for many, many years; they understand prescriptive tools and architectures. So they're going to be a little bit more prefab, if you will: here's how you can actually construct the right application, ML apps, AI apps, et cetera. Let me give you the building blocks at a higher level of abstraction. >> So I want to get your take on value creation. >> Jerry: Sure. >> So if it's still early, there's a lot more growth ahead. You start to see Jassy even admit that in his keynotes; he said, quote, "There are two types of developers and customers: people who want the building blocks, or people who want solutions," or prefab, or some sort of more consumable... >> More prescriptive, yeah. >> So I think Amazon's going to start going that way, but that being said, there are still opportunities for startups. You're an investor; you invest in startups. Where do you see opportunities? If you're looking at the startup landscape, what is the playbook? How should you advise startups? Because, ya know, you can have the best team or whatever, but you look at Amazon and it's like, okay, they've got large scale. >> Jerry: Yeah. >> I'm going to be a little nervous. Are they going to eat my lunch? Do I take advantage of them? Do I draft off them? There are white spaces available as vertical markets explode. What's your view on how startups should attack the wealth creation, the value creation opportunity? >> There, I mean, Amazon's creating a new market, right? So you look at their list of services; there are just like 175 services out there, which is basically too many for any one company to win every single service. So you look at that menu of services: each one of those services can itself be a startup, or a collection of services can be a startup. So I look at that as a roadmap of opportunity, where companies can actually go in and create value around AI, around data, around security, around observability, because Amazon's not going to naturally win all of those markets. What they do have is distribution, right? They have a lot of developer mind share. So if you're a startup, you play one of three themes. So, one is: how do I pick one area and go deep for IP, right?
Like, cheaper, better, faster, own some IP, and then execute better, and that's doable over and over again in different markets. Number two is, and we talked about this before, there's not going to be one cloud that wins all. Amazon's clearly in the lead; they have won most of the cloud so far, but it'll be a multi-cloud world, it'll be an on-premise world. So, how do I play a multi-cloud world is another angle: so, go deep in IP, or go multi-cloud. Number three is this end-to-end solution, kind of prescriptive. Amazon can get you 80% of the way there, 70% of the way there, but if you're, say, an AI developer, or you're a CMO, a marketing developer, you kind of want an end-to-end solution. So, how can I put together a full suite of tools from beginning to end that can give me a product that's a better experience? So, either I have something that's a deeper IP play, or a seam between multiple clouds, or an end-to-end solution around a problem, solving that one problem for a customer. >> And in most cases, the underlay is Amazon or Azure. >> Or Google, or Alibaba Cloud, or on-premises. Not going away any time soon, right? And so, how do I create a single fabric, if you will, that looks similar? >> I want to riff with you in real time here on theCUBE around data. So, data scale is obviously a big discussion that's starting to happen now; data tsunami, we've heard that for years. So there are two scale benefits: horizontal scale with data, and then vertical specialism, vertical scale, or, ya know, using AI and machine learning in apps, having data. So, how do you view that? What's your reaction to the notion of creating the horizontal scale value and vertical specialism value? >> Both are a great place for startups, right? They're not mutually exclusive, but I think if you go horizontal, the amount of data being created by your applications, your infrastructure, your sensors, time series data, is a ridiculously large amount, right? And that's not going away any time soon. I recently did an investment in Chronosphere, which you guys covered over at KubeCon a few weeks ago; that's tackling metrics and observability data, time series data. So they're going to handle that horizontal amount of data, petabytes and petabytes: how can we query this quickly, deeply, with a lot of insight? That's one play, right? Cheaper, better, faster at scale. The next play, like you said, is vertical. It's how do I own data, or slice the data, with more context than anyone else? We talked about the virtuous cycle of data, right? That's the system of intelligence, as well. If I own a set of data, be it healthcare, government, or self-driving car data, that no one else has, I can build a solution end to end and go deep. And so either pick a lane or pick a geography; you can go either way. It's hard to do both, though. >> It's hard for a startup. >> For a startup. >> Any big company. >> Very few companies can do two things well; startups especially succeed by doing one thing very well. >> I think my observation, looking at Amazon, is that they want the horizontal, and they're leaving opportunity on the table for startups: the vertical. >> Yeah, if you look at their strategy, the lower level Amazon gets, the more open-source, the more ubiquitous they try to be: for containers, serverless, networking, S3, basic substrates. So, horizontal, horizontal, low price.
As you get higher up, to, like, DeepMind-like AI technologies, perception, prediction, they're getting a little bit more specialized, right? You see these solutions around retail, healthcare, voice. So the higher up in the stack, the more narrow the solutions they can build, because, like any startup, any product, you need the right wedge. What's the right wedge into the customers? At the base level of developers: building blocks, ubiquitous. For solutions, marketing, healthcare, financial services, retail: how do I find a fine-point wedge? >> So, the old venture business was all enamored with consumers over the years, and then, maybe four years ago, enterprise got hot. We were lowly enterprise guys where no one... >> Enterprise has been hot forever in my mind, John, but maybe... >> Well, first of all, we've been hot on enterprise; we love enterprise. But then all of a sudden it just seemed like, oh my God, people had an awakening, like, there's real value to be had here. The IT spend has been trillions, and the stats are that roughly 20 or so percent has moved to the cloud, with the rest yet to move to this new next-gen architecture that you're investing companies in. So, a big market; that's an investment thesis. A huge enterprise market: Steve Malaney of Aviatrix called it a thousand-foot wave. So there's going to be massive enterprise money, a big bag of money on the table. (laughs) A lot of re-transformations, a lot of reborn-on-the-cloud, a lot of action. What's your take on that? Do you see it the same way? Because look how they're getting in big time, Goldman Sachs on stage here. It's a lot of cash. How do you think it's going to be deployed, and who's going to be fighting for it? >> Well, I think we've talked about this in the past. When you look to make an investment, as a startup founder or as a VC, you want to pick a wave bigger than you, bigger than your competitors. Right? So, on the consumer side, ya know, the classic example is Instagram fighting Facebook in photo sharing: they picked the mobile-first wave, the iPhone wave, right, the first mobile-native photo sharing. If you're fighting enterprise infrastructure, you pick the cloud data wave, right? You pick the big data wave, you pick the AI waves. So first, as a founder or a startup, I'm looking for these macro-waves that I see not going away any time soon. So, moving from batch data to streaming real-time data: that's a wave that's happening, that's inevitable. Dollars are flowing from slower batch databases to streaming real-time analytics. So Rockset, one of the investments we talked about, is riding that wave from batch to real time: how to do analytics and SQL on real-time data. Likewise time series: you're going from batch, slow data to massive amounts of time series data; Chronosphere is playing that wave. So I think you have to look for these macro-waves of cloud, which anyone knows, but then you pick these small wavelets, if that's a word, a smaller wave within a wave, that says, "Okay, I'm going to pick this one trend," and ride it as a startup, ride it as an investor, because that's going to be more powerful than my competitors. >> And then get inside the wave, or inside the tornado, whatever metaphor... >> We're going to torch the metaphors, but yeah, ride that wave. >> All right, Jerry, great to have you on. Seven years of CUBE action. Great to have you on; congratulations, you're a VIP, you've been with us the whole time. >> Congratulations to you, theCUBE, and the entire staff here.
It's amazing to watch your business grow over the past seven years, as well.
>> And we soft launched our CUBE 365, search it, it's on Amazon's marketplace.
>> Jerry: Amazing.
>> SaaS, our first SaaS offering.
>> I love it, I mean--
>> John: No Venture funding. (laughs) Ya know, we're going to be out there. Ya know, maybe let you in on the deal.
>> But now, like, you broadcast the deal to the rest of the market.
>> (laughs) Jerry, great to have you on. Again, great to watch your career at Greylock. Always happy to have ya on, great commentary, awesome time. Jerry Chen, general partner of Greylock. This is theCUBE coverage, breaking down the commentary, extracting the signal from the noise here at reInvent 2019. I'm John Furrier, back with more after this short break. (energetic electronic music)
Seth Dobrin, IBM | IBM CDO Summit 2019
>> Live from San Francisco, California, it's theCUBE, covering the IBM Chief Data Officer Summit, brought to you by IBM.
>> Welcome back to San Francisco everybody. You're watching theCUBE, the leader in live tech coverage. We go out to the events, we extract the signal from the noise, and we're here at the IBM Chief Data Officer Summit, 10th anniversary. Seth Dobrin is here, he's the Vice President and Chief Data Officer of the IBM Analytics Group. Seth, always a pleasure to have you on. Good to see you again.
>> Yeah, thanks for having me back, Dave.
>> You're very welcome. So I love these events, you get a chance to interact with chief data officers, guys like yourself. We've been talking a lot today about IBM's internal transformation, how IBM itself is operationalizing AI, and maybe we can talk about that, but I'm most interested in how you're pointing that at customers. What have you learned from your internal experiences, and what are you bringing to customers?
>> Yeah, so, you know, I was hired at IBM to lead part of our internal transformation, so I spent a lot of time doing that.
>> Right.
>> I've also, you know, when I came over to IBM I had just left Monsanto, where I led part of their transformation. So I spent the better part of the first year or so at IBM not only focusing on our internal efforts, but helping our clients transform. And out of that I found that many of our clients needed help and guidance on how to do this. And so I started a team we call the Data Science and AI Elite Team, and really what we do is we sit down with clients and we share not only our experience, but the methodology that we use internally at IBM, leveraging things like design thinking, DevOps, Agile, and how you implement that in the context of data science and AI.
>> I've got a question. So Monsanto, obviously a completely different business than IBM--
>> Yeah.
>> But when we talk about digital transformation, and the difference between a business and a digital business, it comes down to the data. And you've seen a lot of examples where companies traverse industries, which never used to happen before, you know, Apple getting into music, there are many, many examples, and the theory is, well, it's 'cause it's data. So when you think about your experiences, coming from a completely different industry and bringing that expertise to IBM, were there similarities you were able to draw upon, or was it a completely different experience?
>> No, I think there's tons of similarities, which is part of why I was excited about this, and I think IBM was excited to have me.
>> Because the chances for success were quite high in your mind?
>> Yeah, yeah, because the chances for success were quite high, and also, you know, if you think about it, on the how-you-implement, how-you-execute side, the differences are really cultural more than anything to do with the business, right? So the whole role of a Chief Data Officer, or Chief Digital Officer, or a Chief Analytics Officer, is to drive fundamental change in the business, right? So it's how do you manage that cultural change, how do you build bridges, how do you make people a little uncomfortable, but at the same time get them excited about how to leverage things like data, and analytics, and AI, to change how they do business.
And really this concept of a digital transformation is about moving away from traditional products and services, more towards outcome-based services, not selling things, but selling as-a-Service, right? And it's the same whether it's IBM, you know, moving away from fully transactional to Cloud and subscription-based offerings, or it's a bank reimagining how they interact with their customers, or it's an oil and gas company, or it's a company like Monsanto really thinking about how do we provide outcomes.
>> But how do you make sure that every as-a-Service is not a snowflake, and it can scale, so that you can actually, you know, make it a business?
>> So underneath the as-a-Service is a few things. One is data, one is machine learning and AI, and the other is really understanding your customer, right, because truly digital companies do everything through the eyes of their customer. And every company has many, many versions of their customer, until they go through an exercise of creating a single version, a customer or a Client 360, if you will, and we went through that exercise at IBM. And those are all very consistent things, right? They're all pieces that kind of happen the same way in every company, regardless of the industry, and then you get into understanding what the desires of your customer are to do business with you differently.
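To make that Client 360 exercise concrete, here is a minimal sketch of the core step: collapsing the many versions of a customer that live in different silos into one golden record. The matching rule (normalized email) and the survivorship rule (most recently updated record wins) are illustrative assumptions, not IBM's actual methodology, and the record fields are invented.

```python
from collections import defaultdict

# Hypothetical duplicate customer records pulled from separate silos
# (CRM, billing, support); field names are illustrative assumptions.
records = [
    {"source": "crm",     "name": "Ada King",  "email": "Ada.King@example.com ", "updated": "2019-05-01"},
    {"source": "billing", "name": "A. King",   "email": "ada.king@example.com",  "updated": "2019-06-12"},
    {"source": "support", "name": "Bob Marsh", "email": "bob@example.com",       "updated": "2019-04-20"},
]

def match_key(rec):
    # Naive matching rule: two records describe the same person if their
    # normalized emails agree. Real entity resolution uses fuzzy matching
    # across many attributes, with human review for low-confidence matches.
    return rec["email"].strip().lower()

def merge(dupes):
    # Survivorship rule (an assumption): the most recently updated record
    # supplies the surviving field values; keep provenance of all sources.
    golden = max(dupes, key=lambda r: r["updated"]).copy()
    golden["sources"] = sorted(r["source"] for r in dupes)
    return golden

clusters = defaultdict(list)
for rec in records:
    clusters[match_key(rec)].append(rec)

customer_360 = [merge(dupes) for dupes in clusters.values()]
print(customer_360)  # one golden record per real-world customer
```

Production-grade Customer 360 programs add fuzzy matching, governance, and stewardship workflows, but the shape, cluster then merge, stays the same.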
>> So you were talking before about the Chief Digital Officer, a Chief Data Officer, a Chief Analytics Officer, as a change agent, making people feel a little bit uncomfortable. Explore that a little bit. What is that, asking them questions that intuitively they know they need the answer to, but they don't have it through data? What did you mean by that?
>> Yeah, so here's the conversation that usually happens, right? You go and you talk to your peers in the organization, and you start having conversations with them about what decisions they're trying to make, right? And you're the Chief Data Officer, you're responsible for that, and inevitably the conversation goes something like this, and I'm going to paraphrase: give me the data I need to support my preconceived notions.
>> (laughing) Yeah.
>> Right?
>> Right.
>> And that's what they want to (voice covers voice).
>> Here's the answer, give me the data that--
>> That's right. So, I want a dashboard that helps me support this. And the uncomfortableness comes in a couple of things in that. It's getting them to let go of that, and allow the data to provide some inkling of things they didn't know were going on. That's one piece. The other is, then you start leveraging machine learning, or AI, to actually help start driving some decisions: limiting the scope from infinity down to two or three things, surfacing those two or three things, and telling people in your business, your choices are one of these three things, right? That starts to make people feel uncomfortable, and it really is a challenge for that cultural change, getting people used to trusting the machine, or in some instances even trusting the machine to make the decision for you, or part of the decision for you.
>> That's got to be one of the biggest cultural challenges, because you've got somebody who, let's say they run a big business, it's a profitable business, it's the engine of cashflow at the company, and you're saying, well, that's not what the data says. And you say, okay, here's a future path--
>> Yeah.
>> For success, but it's going to be disruptive, there's going to be a change, and I can see people not wanting to go there.
>> Yeah, and to the point about even the businesses that are making the most money, or the parts of a business contributing the most revenue: if you look at what the business journals say, when you start leveraging data and AI you get double-digit increases in your productivity and in differentiation from your competitors. That happens inside of businesses too. So the conversation, even with the most profitable parts of the business, is really about what we could do better, right? You could get better margins on the revenue you're driving. That's the whole point: get better at leveraging data and AI to increase your margins, increase your revenue. And then things like moving to as-a-Service from single-point transactions, that's a whole different business model, and it takes you from getting revenue once every two or three or five years to getting revenue every month. That's highly profitable for companies, because you don't have to send your sales force in every time to sell something; customers buy once, and they continue to pay as long as you keep 'em happy.
>> But I can see that scaring people, because the incentives have to shift to go from, you know, pay all up front, right? There are so many parts of the organization that have to align with that in order for that culture to actually occur. So can you give some examples of how you've, I mean, obviously you ran through that at IBM, you saw--
>> Yeah.
>> I'm sure a lot of that, got a lot of learnings, and then took that to clients. Maybe some examples of client successes that you've had, or even not-so-successes that you've learned from?
>> Yeah, so in terms of client success, I think many of our clients are just beginning this journey, certainly the ones I work with, so it's hard for me to say, client X has successfully done this. But I can certainly talk about how we've gone in, and some of the use cases we've done--
>> Great.
>> With certain clients, to think about how they transform their business. So maybe the biggest bang-for-the-buck one is in the oil and gas industry. ExxonMobil was on stage with me at Think, talking about--
>> Great.
>> Some of the work that we've done with them in their upstream business, right? So every time they drop a well, it costs them not thousands of dollars, but hundreds of millions of dollars. And in the oil and gas industry you're talking massive data, right, tens or hundreds of petabytes of data that constantly changes. And no one in that industry really had a data platform that could handle this dynamically. And it takes them months to even start to be able to make a decision. So they really wanted us to help them figure out, how do we build a data platform on this massive scale that enables us to make decisions more rapidly? And the aim was really to cut this down from 90 days to less than a month. And through leveraging some of our tools, as well as some open-source technology, and teaching them new ways of working, we were able to lay down this foundation. Now, this is before AI; we haven't even started thinking about helping them with AI, and the oil and gas industry has been doing this type of thing for decades, but they really were struggling with this platform.
So that's a big success where, at least for the pilot, which was a small subset of their fields, we were able to help them reduce that timeframe by a lot, to be able to start making a decision.
>> So an example of a decision might be, where to drill next?
>> That's exactly the decision they're trying to make.
>> Because for years in that industry it was boop, oh, no oil, boop, oh, no oil.
>> Yeah, well.
>> And they got more sophisticated, they started to use data, but I think what you're saying is, the time it took for that analysis was quite long.
>> So the time it took to even overlay things like seismic data, topography data, what's happened in wells, and cores as they've drilled around that, was really protracted, just to pull the data together, right? And then once they got the data together, there were some really, really smart people looking at it going, well, my experience says here. It was informed by the data, but it was not driven by an algorithm.
>> A little bit of art.
>> True, a lot of art, right? And it still is. So now they want some AI, or some machine learning, to help guide those geophysicists, to help determine, based on the data, where they should be dropping wells. And these are hundred million and billion dollar decisions they're making, so it's really about how do we help them.
>> And that's just one example, I mean--
>> Yeah.
>> Every industry has its own use cases, or--
>> Yeah, and so that's the front end, right, the data foundation. Then if you go to a company that was really advanced in leveraging analytics, or machine learning, JPMorgan Chase: they have a division, and they were also on stage with me at Think, where basically everything is driven by a model. They give traders a series of models and the traders make decisions, and now they need to monitor those hundreds of models for misuse, right? So they needed to build a series of models to monitor their models. This was a tremendous deep-learning use case, and they had just bought a PowerAI box from us, so they wanted to start leveraging GPUs. And we really helped them figure out how you navigate that, and what's the difference between building a model leveraging GPUs compared to CPUs, and how do you use it to accelerate the output. And again, this was really a cost-avoidance play, because if people misuse these models they can get in a lot of trouble. But they also need to make these decisions very quickly: a trader goes to make a trade, they need to decide whether the model was used properly before that trade is kicked off, and milliseconds make a difference in the stock market, so they needed a model. And one of the things about, you know, when you start leveraging GPUs and deep learning: sometimes you need these GPUs to do training, and sometimes you need 'em to do training and scoring. And this was a case where you also need to build a pipeline that can leverage the GPUs for scoring, which is actually quite complicated and not as straightforward as you might think.
>> In near real time, in real time.
>> Pretty close to real time.
>> You can't get much more real time than those things, potentially to stop a trade before it occurs, to protect the firm.
>> Yeah.
>> Right, or (inaudible).
>> Yeah, and don't quote me, I think this is right, I think they actually don't do trades until it's confirmed, and so--
>> Right.
>> Or that's the desire, as to not (voice covers voice).
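As a rough sketch of that training-versus-scoring distinction, using PyTorch as a stand-in since the transcript doesn't say which framework or model JPMorgan actually used, the scoring path keeps the trained model resident on the GPU, disables gradient tracking, and answers one small request at a time under a latency budget. The model architecture and the 0.5 threshold below are invented for illustration.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for a trained model-misuse classifier; the architecture is an
# illustrative assumption, not JPMorgan's actual model.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())
model.to(device).eval()  # scoring mode: no dropout updates, inference behavior

@torch.no_grad()  # skip autograd bookkeeping to keep per-call latency low
def score(features: torch.Tensor) -> bool:
    """Return True if the proposed trade's model usage looks acceptable."""
    risk = model(features.to(device)).item()
    return risk < 0.5  # decision threshold is a made-up example

# A trade blocks on this call, so the round trip must stay in the
# low-millisecond range; keeping the weights resident on the GPU and
# copying only the small feature tensor per request is the point.
ok = score(torch.randn(1, 32))
print("trade allowed" if ok else "trade held for review")
```

The design point is that a scoring pipeline is not just a training pipeline run fast: weight residency, batching, and host-to-device copies dominate latency once the model itself is small.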
>> Well, and then now you're in a competitive situation where, you know--
>> Yeah, I mean, people put these trading floors as close to the stock exchange as they can--
>> Physically.
>> Physically to (voice covers voice)--
>> To the speed of light, right?
>> Right, so every millisecond counts.
>> Yeah, read Flash Boys--
>> Right, yeah.
>> So, what's the biggest challenge you're finding, both at IBM and in your clients, in terms of operationalizing AI? Is it technology? Is it culture? Is it process? Is it--
>> Yeah, so culture is always hard, but I think the real challenge is as we start to integrate AI and data into our operations. Look at what software development did with this whole concept of DevOps, right, rapidly iterating but getting things into a production-ready pipeline, continuous integration, continuous delivery. What does that mean for data and AI? Hence these concepts of DataOps and AIOps, right? And I think DataOps is very similar to DevOps in that things don't change that rapidly. You build your data pipeline, you build your data assets, you integrate them. They may change on a weeks or months timeframe, but they're not changing on an hours or days timeframe. As you get into some of these AI models, some of them need to be retrained within a day, right, because the data changes and they fall out of parameters, or the parameters are very narrow and you need to keep 'em in there. What does that mean? How do you integrate this into your CI/CD pipeline? How do you know when you need to do regression testing on the whole thing again? Does your data science and AI pipeline even allow you to integrate into your current CI/CD pipeline? So this is actually an IBM-wide effort that my team is leading, to think about how we incorporate what we're doing into people's CI/CD pipelines so we can enable AIOps, if you will, or MLOps. And IBM is really the only company positioned to do that, for so many reasons. One is, we're the only one with an end-to-end toolchain: we do everything from data, to feature engineering, to generating and selecting models, whether it's AutoAI, hand coding, or visual modeling, through to things like trust and transparency. Secondly, we've got IBM Research, we've got decades of industry experience, we've got our IBM Services Organization, and all of us have been tackling this with large enterprises, so we're uniquely positioned to tackle this in a very enterprise-grade manner.
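Here is a hedged sketch of the "falls out of parameters" check gating a retraining job that Dobrin describes. The drift statistic (a simple mean shift in prediction scores) and the tolerance are invented for illustration; a production AIOps setup would use proper distribution tests (PSI, Kolmogorov-Smirnov) and wire the trigger into the same CI/CD system as the rest of the codebase.

```python
import statistics

DRIFT_TOLERANCE = 0.05  # illustrative threshold, not a recommended value

def mean_shift(baseline_scores, live_scores):
    # Crude drift statistic: absolute shift in the mean prediction score.
    # Real systems use PSI, KS tests, per-feature monitors, and more.
    return abs(statistics.mean(live_scores) - statistics.mean(baseline_scores))

def check_model(baseline_scores, live_scores):
    drift = mean_shift(baseline_scores, live_scores)
    if drift <= DRIFT_TOLERANCE:
        print(f"drift {drift:.3f} within tolerance; model stays in production")
    else:
        # Out of parameters: hand off to the CI/CD pipeline so the retrained
        # model passes the same regression gates as any code change.
        # 'retrain-and-validate' is a hypothetical pipeline job name.
        print(f"drift {drift:.3f} exceeds tolerance; triggering retrain-and-validate")

# Example: yesterday's scores as the baseline, today's as the live sample.
check_model([0.42, 0.40, 0.44, 0.41], [0.52, 0.55, 0.50, 0.51])
```

The point of routing the trigger through CI/CD rather than retraining in place is exactly the regression-testing question raised above: a freshly retrained model should clear the same gates as any other production change before it serves traffic.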
>> Well, and the leverage that you can get within IBM and for your customers.
>> And leveraging our clients, right?
>> It's off the charts.
>> We have six clients, our most advanced clients, that are working with us on this, so it's not just us in a box, it's us with our clients working on this.
>> So what are you hoping to have happen today? We're just about to get started with the keynotes.
>> Yeah.
>> We're going to take a break and then come back after the keynotes, and we've got some great guests, but what are you hoping to get out of today?
>> Yeah, so I've been with IBM for 2 1/2 years, and this is my eighth CDO Summit, so I've been to many more of these than I've been at IBM. And I went to these religiously before I joined IBM, really for two reasons. One, there's no sales pitch, right, it's not a trade show. The second is, it's the only place where I get the opportunity to listen to my peers and have really open and candid conversations about the challenges they're facing and how they're addressing them, giving me insights into what other industries are doing and letting me benchmark myself and my organization against the leading edge of what's going on in this space.
>> I love it, and that's why I love coming to these events. It's practitioners talking to practitioners. Seth Dobrin, thanks so much for coming to theCUBE.
>> Yeah, thanks as always, Dave.
>> Always a pleasure. All right, keep it right there everybody, we'll be right back right after this short break. You're watching theCUBE, live from San Francisco. Be right back.