Seth Rao, FirstEigen | AWS re:Invent 2021
(upbeat music) >> Hey, welcome back to Las Vegas. theCUBE is live at AWS re:Invent 2021. I'm Lisa Martin. We have two live sets here at theCUBE. We are running one of the largest hybrid tech events, one of the most important events of the year, with AWS and its massive ecosystem of partners. As I said, two live sets, two remote sets, over a hundred guests on the program talking about the next generation of cloud innovation. I'm pleased to welcome a first-timer to theCUBE. Seth Rao, the CEO of FirstEigen, joins me. Seth, nice to have you on the program. >> Thank you, nice to be here. >> Talk to me about FirstEigen. Also explain to me the name. >> So FirstEigen is a startup company based out of Chicago. The name Eigen is a German word. It's a mathematical term. It comes from eigenvectors and eigenvalues, which are used in what's called principal component analysis, a technique for detecting anomalies, which is related to what we do. We look for errors in data, and hence our name, FirstEigen. >> Got it. That's excellent. So talk to me. One of the things that has been a resounding theme of this year's re:Invent is that, especially in today's age, every company needs to be a data company. >> Yeah. >> It's one thing to say it; it's a whole other thing to be able to put that into practice with reliable data, with trustworthy data. Talk to me about some of the challenges that you help customers solve, because part of the theme is not just being a data company, but if you're not a data company, you're probably not going to be around much longer. >> Yeah, absolutely. So what we have seen across the board, across all verticals, with the customers we work with, is that data governance teams and data management teams are constantly firefighting to find errors in data and fix them. So what we have done is we have created the software DataBuck that autonomously looks at every data set, and it will discover errors that are hidden to the human eye. They're hard to find, hard to detect.
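The eigenvector connection behind the name can be made concrete. As a minimal sketch — this is the generic textbook technique, not FirstEigen's actual algorithm — principal component analysis flags anomalous records by how poorly they reconstruct from the data's top eigenvectors:

```python
import numpy as np

def pca_anomaly_scores(X, n_components=1):
    """Score rows of X by PCA reconstruction error.

    Rows that don't fit the principal components of the bulk of the
    data get high scores -- errors "hidden to the human eye" stand out
    numerically even when each field looks plausible on its own.
    """
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the centered data are the eigenvectors
    # of its covariance matrix, i.e. the principal components.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components]              # top principal directions
    proj = Xc @ V.T @ V                # projection onto the PCA subspace
    return np.linalg.norm(Xc - proj, axis=1)  # reconstruction error

# 50 records that follow the pattern y = 2x, plus one corrupted record.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
X = np.column_stack([x, 2 * x + rng.normal(scale=0.05, size=50)])
X = np.vstack([X, [0.0, 5.0]])         # the hidden data error
scores = pca_anomaly_scores(X)
print(int(np.argmax(scores)))          # index of the corrupted row
```

The corrupted row (index 50) scores far above the rest because it sits off the dominant eigenvector, which is the intuition behind anomaly detection via PCA.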
Our machine learning algorithms figure out those errors before those errors impact the business. The usual way things get sorted out is very laborious, time-consuming, and expensive. We have taken a process that takes man-months, or even man-years, and compressed it to a few hours. >> So dramatic time savings there. >> Absolutely. >> So six years ago when you guys were founded, you realized this gap in the market, thought it's taking way too long, we don't have this amount of time. Gosh, can you imagine if you guys weren't around the last 22 months, when certainly time was of the essence? >> Absolutely, yeah. Six years ago when we founded the company, my co-founder, who's also the CTO, had extensive experience in validating data and data quality, and my own background and experience is in AI and ML. And what we saw was that people were spending an enormous amount of time and yet errors were still getting through to the business side. And at that point it comes back and people are still firefighting. So it was a waste of time, waste of money, waste of effort. >> Right. But also there's the potential for brand damage, brand reputation. Whatever products and services you're producing, if your employees don't have the right data, if there are errors there and what's going out to the consumers is wrong, then you've got a big problem. >> Absolutely. Interesting you should mention that, because over the summer there was a Danish bank, a very big-name Danish bank, that had to send apology letters to its customers because it had overcharged them on their mortgages. The data in the backend had some errors in it and they didn't realize; it was inadvertent. But somebody ultimately caught it and did the right thing. Absolutely correct. If the data is incorrect and then you're doing analytics, or you're doing reporting, or you're sending people a bill that they need to pay, it better be very accurate. Otherwise it's serious brand damage.
It has real implications and it has a whole bunch of other issues as well. >> It does, and those things can snowball very quickly. >> Yeah. >> So talk to me — one of the things that we've seen in recent months and years is this explosion of data. And then when the pandemic struck, we had this scattering of people and data sources, so much data. The edge is persistent. We've got this work-from-anywhere environment. What are some of the risks for organizations? They come to you saying, help us ensure that our data is trustworthy. I mean, the trust is key, but how do you help organizations that are in somewhat of a flux figure out how to solve that problem? >> Yeah. So you're absolutely correct. There is an explosion of data, number one. And along with that, there is also an explosion of analytical tools to mine that data. So as a consequence, there is exponential growth in microservices, in how people are consuming that data. Now, in the old world, when there were a few consumers of data, it was a lot easier to validate the data. You had a few people who were the gatekeepers, or the data stewards. But with an explosion of data consumers within a company, you have to take a completely different approach. You cannot now have people manually looking and creating rules to validate data. So there has to be a change in the process. As soon as the data comes into your system, you start validating whether the data is reliable, at point zero. >> Okay. >> And then it goes downstream. And at every stage the data hops, there is a chance that the data can get corrupted. These are called systems risks: because there are multiple systems, and data comes from multiple systems onto the cloud, errors creep in. So you validate the data from the beginning all the way to the end, and the kinds of checks you do also increase in complexity as the data goes downstream. You don't want to boil the ocean upfront. You want to do the essential checks.
Is my water drinkable at this point, right? I'm not trying to cook with it as soon as it comes out of the tap. Is it drinkable? >> Right. >> Good enough quality? If not, then we go back to the source and say, guys, send me better quality data. So sequence the right process and check every step along the way. >> How much of a cultural shift is FirstEigen helping to facilitate within organizations? There isn't time — like we talked about, if an error gets in, there are so many downstream effects that can happen. But how do you help organizations shift their mindset? 'Cause that's a hard thing to change. >> Fantastic point. In fact, what we see is that the mindset change is the biggest wall for companies to have good data. People have been living in the old world where there is a team, a group, much downstream, that is responsible for accurate data. But the volume of data, the complexity of data, has gone up so much that that team cannot handle it anymore. It's just beyond their scope. It's not fair for us to expect them to save the world. So the mindset shift has to come from the organization's leadership, which says: guys, the data engineers who are up front, who are getting the data into the organization, who are taking care of the data assets, have to start thinking about trustable data. Because if they start doing it, everything downstream becomes easy. Otherwise it's much, much more complex for these guys. And that's what we do. Our tool provides an autonomous solution to monitor the data. It comes out with a data trust score, with zero human input. Our software is able to validate the data and give an objective trust score. Right now it's a popularity contest: people vote. Yeah, I think I like this, I like this, and I like that. That's okay, maybe it's acceptable. But the reason they do it is because there is no way to objectively say the data is trustable. If there is a small error somewhere, it's a needle in a haystack.
It's hard to find, but we can. With machine learning algorithms, our software can detect the errors, even the minutest errors, and give an objective score from zero to a hundred: trust or no trust. So along with the mindset, now they have the tool to implement that mindset, and we can make it happen. >> Talk to me about some of the things that you've seen from a data governance perspective, as we've seen the explosion, the edge, people working from anywhere, this hybrid environment that we're going to be in for quite some time. >> Yeah. >> From a data governance perspective — and Dave Vellante did his residency — we're seeing so many more things pop up, you know, different regulations. How do you help facilitate data governance for organizations as the data volume is just going to continue to proliferate? >> Absolutely correct. So, data governance. We are a key component of data governance, and data quality — data trustworthiness, reliability — is a key component of it. And one of the central pillars of data governance is the data catalog. Just like a catalog in the library, it's cataloging every data asset. But right now the catalogs, which are the mainstay, are not as good as they can be. The key information that is missing: I know where my data is; what I don't know is how good my data is. How usable is it? If I'm using it for accounts receivable or accounts payable, for example, the data better be very, very accurate. So what our software will do is help data governance by linking with any data governance tool and contributing an important component — a data quality, reliability, trustability score, which is objective, for every data asset. So imagine I open the catalog. I see where my book is in the library. I also know if there are pages missing in the book — is the book readable? So it's not good enough to know that I have a book somewhere; it's how good is it? >> Right. >> So DataBuck will make that happen.
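The objective zero-to-a-hundred trust score described above can be sketched in miniature. Assuming a simple weighted-checks scheme — a toy illustration, not FirstEigen's actual scoring — a score is just the weighted share of validation rules that each record passes:

```python
# Toy data trust score: run a pipeline of validation checks over a
# data set and report the weighted fraction that passes, scaled 0-100.

def trust_score(rows, checks):
    """checks: list of (name, weight, predicate-over-row) tuples."""
    total = sum(w for _, w, _ in checks) * len(rows)
    earned = sum(
        w
        for row in rows
        for _, w, check in checks
        if check(row)
    )
    return round(100 * earned / total, 1)

# Hypothetical invoice records with deliberately seeded errors.
invoices = [
    {"id": "A1", "amount": 120.0, "currency": "USD"},
    {"id": "A2", "amount": -5.0,  "currency": "USD"},   # bad amount
    {"id": "",   "amount": 310.0, "currency": "usd"},   # two problems
]

checks = [
    ("id present",        2, lambda r: bool(r["id"])),
    ("amount positive",   3, lambda r: r["amount"] > 0),
    ("currency ISO code", 1, lambda r: r["currency"].isupper()),
]

print(trust_score(invoices, checks))  # prints 66.7
```

The appeal of a score like this is that it is objective and repeatable: two people running the same checks over the same data get the same number, replacing the "popularity contest" Seth describes.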
>> So when customers come to you, how do you help them start? 'Cause obviously the data, the volume — it's intimidating. >> Yeah. >> Where do they start? >> Great. This is, interestingly enough, a challenge that every customer has. >> Right. >> Everybody is ambitious enough to say, no, I want to make the change. But to the previous point, if you want to make such a big change, it's an organizational change management problem. So the way we recommend customers start is with a small problem. Get some early victories. And this software is very easy: just bring it in and automate a small part. You have your sales data, or transactional data, or operational data — take a small portion of it, automate it, get reliable data, get good analytics, get the results, and start expanding to other places. Trying to do everything at one time is just too much inertia; organizations don't move, you don't get anywhere, data initiatives fail. >> Right. So you're helping customers identify, where are those quick wins? >> Yes. >> And where are the landmines that we need to find, so we can navigate around them? >> Yeah. We have enough experience, over 20 years of working with different customers, that if something can go wrong, we know where it'll go wrong, and we can help steer them away from the landmines and take them to areas where they'll get quick wins. 'Cause we want the customer to win. We want them to go back and say, look, because of this we were able to do better analytics, we were able to do better reporting, and so on and so forth. We can help them navigate this area. >> Do you have a favorite example, a customer example, that you think really articulates that value — that we're helping customers, we can't boil the ocean like you said, it doesn't make any sense — a customer that you helped with small quick wins that really just opened up the opportunity to unlock the value of trustable data? >> Absolutely.
So we're working with a Fortune 50 company in the US; it's a manufacturing company. Their CFO was a little concerned whether the data that she's reporting to Wall Street is acceptable — does it have any errors? — because ultimately she's signing off on it. So she had a large team on the technology side that was supporting her, and they were doing their best. But in spite of that — she's a very sharp woman — she was able to look and find errors and say, "Something does not look right here, guys. Go back and check." Then it goes back to the IT team and they go, "Oh yeah, actually, there was an error." Some errors had slipped through. So they brought us in and we were able to automate the process. Where they could do a few checks within that audit window, we were able to do an enormous number of checks more — more detailed, more accurate. And we were able to reduce the number of errors that were slipping through by over 98%. >> Big number. >> So, absolutely. Really fast, really good. Now that this has gone through, they feel a lot more comfortable, and the question is, okay, in addition to financial reporting, can I use it to iron out my supply chain data? 'Cause they have thousands of vendors, they have hundreds of distributors, they have products all over the globe. Now they want to validate all that data, because even if your data is off by one or 2%, if you're a hundred-plus-billion-dollar company, it has an enormous impact on your balance sheet and your income statement. >> Absolutely, yeah. >> So we are slowly expanding as they allow us. They like us, and now they're taking it to other areas beyond finance. >> Well, it sounds like you have not only great technology, Seth, but a great plan for helping customers with those quick wins and then learning and expanding within, and really developing that trusted relationship between FirstEigen and your customers. Thank you so much for joining me on the program today.
Introducing the company and the really cool stuff you guys are doing. Appreciate your time. >> Thank you very much. >> All right. >> Pleasure to be here. >> For Seth Rao, I'm Lisa Martin. You're watching theCUBE, the global leader in live tech coverage. (upbeat music)
Rohit Seth | KubeCon + CloudNativeCon NA 2021
>> Hey everyone, this is theCUBE's live coverage from Los Angeles of KubeCon and CloudNativeCon North America '21. Lisa Martin with Dave Nicholson. We're going to be talking next with the founder and CEO of CloudNatics, Rohit Seth. Rohit, welcome to the program. >> Thank you very much, Lisa. Pleasure to meet you. >> Good to meet you too. Welcome. So tell the audience about CloudNatics — what you guys do, when you were founded, and what was the gap in the market that you saw that said, we need a solution? >> So just to start, CloudNatics was started in 2019 by me. And the reason for starting CloudNatics was, as I started to look at cloud adoption and how enterprises are almost blindly jumping on this cloud bandwagon, I started reading about the key challenges the market is facing, and it started resonating with what I saw at Google 15 years before. When I joined Google, the first thing I noticed was, of course, the scale would just overwhelm anyone. But at the same time, how well utilized they were at that scale was the key thing I started looking for. And over the next couple of months I did all the scripting and such with my teams and found out that utilization of their compute servers was in the lower teens. And lower utilization means if you're spending a billion dollars, you're basically wasting the major portion of that. If that's the state of affairs at a tech-savvy company like Google, you can imagine what would be happening in other companies. So in any case, at that time we started working on a technology so that more groups, more business units, could share the same machine in an efficient fashion, and that's what led to the invention of containers. Over the next six years we rolled out containers across the whole Google fleet, and utilization went up at least three times. Fast forward 15 years, and you start reading that 125 billion dollars are spent on cloud and 60 billion dollars of it is waste — some would say 90 billion dollars is waste. You know what, I don't care whether it's 60 or 90 billion; it's a very large number. And if a tech-savvy company like Google couldn't fix it on its own, I bet you it's not an easy problem for enterprises to fix. So I started talking to several executives in the Valley about whether this problem is real or not. The worst thing I found was that not only did they not know how bad the problem was, they actually didn't have any means to find out how bad the problem could be. One CFO just ran around like a headless chicken for about two months to figure out: okay, I know I'm spending this much, but where is that spend going? So I started treading those waters, and I said, okay, visibility is the first thing that we need to provide to the end customer. It doesn't need to be rocket science for you to figure out how much your marketing is spending, how much your different business units are spending. So the first line of action is basically to give them the visibility they need to make educated business decisions about how well or how badly they are running their operations. Once they have the visibility, the next thing is what to do if there is waste. There are a thousand different types of VMs on AWS alone. People talk about complexity on multi-cloud and hybrid cloud, and that's all right, but even on a single cloud you have a thousand VMs, and the heterogeneity of the VMs, with dynamic pricing that changes every so often, is a killer. >> And so, Rohit, when you talk about driving levels of efficiency, you're not just talking about abstraction versus bare-metal utilization — you're talking about even environments that have used sort of traditional virtualization? >> Yes, absolutely. I think all clouds run on VMs, but within VMs sometimes you have containers and sometimes you don't. If you don't have containers, there is no way for you to securely have a protagonist and an antagonist job running on the same machines. So containers basically came into the world just so that different
applications could share the same resources in a meaningful fashion. We are basically extending that landscape to the enterprises, so that that utilization benefit exists for everyone. So the first order of business for CloudNatics is to provide them visibility into how well or badly they are doing. The second is to give them recommendations: if you are not doing well, what to do about it to do well. And we can slice and dice the data based on what is important for you. We don't tell you that these are the dimensions you should be looking at — of course we have our recommendations — but we want you to decide: do you want to look at your marketing organization, or your engineering organization, or your product organization, to see where they are spending money? You can slice and match that data accordingly, and we'll give you recommendations for those organizations. But now you have the visibility, now you have the recommendations — then what? If you ask a Kubernetes administrator to go and apply those recommendations, I bet you, the moment you have more than five clusters — which is kind of an ordinary thing — it'll take at least two hours just to figure out how to go from where you are, to be able to log in, and to be able to apply those recommendations. And then changing the CI/CD pipelines and asking your developers to be cognizant about your resources next time is a month-long ordeal. No one follows it. That's why those recommendations fall on deaf ears most of the time. What we do is give you the choice: you can apply those recommendations manually, or you can put the whole system on autopilot, in which case, once you have enough confidence in the CloudNatics platform, we will apply those recommendations for you dynamically, on the fly, as your workloads increase or decrease in utilization. >> And where are your customer conversations happening? You mentioned the CFO, you mentioned the billions in cloud waste — where do you start having these conversations within an organization? Because clearly, as you mentioned, marketing, services — you can give them that visibility across the organization. Who are you talking to within these customers? >> So we start mostly with the CIOs, CTOs, VPs of engineering. But it's very interesting: we say it's waste, and I think the waste is more of an effect than a cause. The real cause is the complexity, and the ones facing the complexity are the DevOps teams and the developers. So in 99% of our customer interactions we start with CIOs and CTOs, but very soon — over a week — we have these conversations with developers and DevOps leads also sitting in the room, saying, but this is a challenge, and why I cannot do this. So to address the real cause, as well as the waste aspect of cloud computing, we have what we call the management console, through which we reduce the complexity of Kubernetes operations themselves. Think about how you can log into a crashing pod within two minutes rather than two hours. And this is where CloudNatics starts differentiating from the rest of the competition out there, because we provide you not only "do this recommendation, do this right-sizing of a VM here or there", but also how you structurally fix the issue going forward. I'm not going to tell you that your containers are never going to crash-loop. Failures are a regular part of distributed systems. How you deal with them, how you debug them, and how you get things back up and running is a core, integral part of how businesses run. That's what we provide in the CloudNatics platform. A lot of this learning is actually coming from our experience at hyperscalers. We have a chief architect who is also from Google — he was a lead on a technology called Borg — and then we have Sonic, who was the head of products at Mesosphere before. So we understand what it takes for an enterprise that's primarily coming from on-prem, or even for companies that are
starting from the cloud, to scale in the cloud. Often you hear this trillion-dollar paradox: hey, you're stupid if you don't start in the cloud, and you're stupid if you scale in the cloud. We are saying that if you're really careful about how you operate in the cloud, it has a value prop that can take you to webscaler heights without even blinking twice. >> Can you share an example of one of your favorite customer stories — even by industry only — where you've really shown them tremendous value in savings? >> Absolutely. In a couple of discussions that happened, they said, oh, but we already have a team of four people trying to optimize our operations over the last year. And we said, that's fine. You know what, our onboarding exercise takes only 20 minutes. Let's do the onboarding, and in about a week we will tell you whether we can save you any money or not. Put your best DevOps on this POV — proof of value — exercise, to see if it actually helps their daily operations or not. This particular customer only has 30 clusters, so it's not very small, but it's not very big in terms of what we are seeing in the market. First thing: the maximum benefit, the cost optimization, that they could achieve over the past year, using some of the tools and their own top-class engineering chops, was about seven to ten percent. Within a week we told them 38%, without those engineers spending more than two hours in that week. We gave them the recommendations. Another two weeks — because they did not want to put it on autopilot in production, just because it's a new platform — and within the next two hours they were able to apply, I think, at least close to 16 recommendations to their platform to get that 37% improvement in cost. >> What are some examples of recommendations? Obviously you don't want to reveal too much of the secret sauce behind the scenes, but what are some classic recommendations that are made? >> So some of them could be as
of the system comes up and when the demand goes up that's when the systems break but you're not prepared for that breakage just because you have not really done the all the things that you would have done if you had all the time that you needed to do the right thing it sounds like some of the microservices that are in containers that are that run the convention center here have just crashed i think it's gone hopefully the background noise didn't get picked up too much yeah but you're the so the the time to value the roi that you're able to deliver to customers is significant yes you talked about that great customer use case are there any kind of news or announcements anything that you want to kind of share here that folks can can be like looking forward to without the index absolutely so two things even though this is kubecon and everyone is focused on kubernetes kubernetes is still only about three to five percent of enterprise market okay we differentiate ourselves by saying that it doesn't matter whether you're running kubernetes or you're in running legacy vms we will come on board in your environment without you making a single line of change in less than 20 minutes and either we give you the value prop in one week or we don't all right that's number one number two we have a webinar coming on november 3rd uh please go to cloudnetix.com and subscribe or sign up for that webinar sonic and i will be presenting that webinar giving you the value proposition going through some use cases that oh we have seen with our customers so far so that we can actually educate the broader audience and let them know about this beautiful platform i think that my team has built up here all right cloudnatics.com rohit thank you for joining us sharing with us what you're doing at cloud natives why you founded the company and the tremendous impact and roi that you're able to give to your customers we appreciate learning more about the technology thank you so much and i really believe 
that cloud is here for stay for a long long time it's a trillion dollar market out there and if we do it right i do believe we will accelerate the adoption of cloud even further than what we have seen so far so thanks a lot lisa it's been a pleasure nice to meet you it's a pleasure we want to thank you for watching for dave nicholson lisa martin coming to you live from los angeles we are at kubecon cloudnativecon north america 21. dave and i will be right back with our next guest thank you you
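The right-sizing recommendation Rohit calls the lowest-hanging fruit can be sketched in a few lines. This is a hypothetical calculation — the function name and the 30% headroom figure are assumptions, not CloudNatics' actual logic: compare what a workload requests with its observed peak usage, and suggest a smaller request with a safety margin.

```python
# Toy VM/pod right-sizing: recommend a CPU request based on observed
# peak utilization plus headroom, never recommending an increase here.

def rightsize(requested_cpu, observed_peak_cpu, headroom=0.3):
    """Suggest a CPU request: observed peak plus a safety headroom."""
    suggestion = observed_peak_cpu * (1 + headroom)
    if suggestion >= requested_cpu:
        return requested_cpu      # already well sized; leave it alone
    return round(suggestion, 2)

# A service requesting 4 cores but peaking at 0.9 is over-provisioned:
print(rightsize(requested_cpu=4.0, observed_peak_cpu=0.9))  # prints 1.17
```

Scaled across thousands of over-provisioned workloads, even this naive rule recovers a large share of the idle spend; the harder part, as the interview notes, is the federated view across clusters and applying the changes safely.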
SUMMARY :
gap in the market that you saw that said
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
2019 | DATE | 0.99+ |
60 | QUANTITY | 0.99+ |
two hours | QUANTITY | 0.99+ |
99 | QUANTITY | 0.99+ |
dave | PERSON | 0.99+ |
november 3rd | DATE | 0.99+ |
125 billion dollars | QUANTITY | 0.99+ |
90 billion dollars | QUANTITY | 0.99+ |
86 percent | QUANTITY | 0.99+ |
dave nicholson | PERSON | 0.99+ |
86 percent | QUANTITY | 0.99+ |
30 clusters | QUANTITY | 0.99+ |
los angeles | LOCATION | 0.99+ |
60 billion dollars | QUANTITY | 0.99+ |
more than two hours | QUANTITY | 0.99+ |
90 billion | QUANTITY | 0.99+ |
two minutes | QUANTITY | 0.99+ |
37 | QUANTITY | 0.99+ |
two weeks | QUANTITY | 0.99+ |
north america | LOCATION | 0.99+ |
two things | QUANTITY | 0.99+ |
lisa martin | PERSON | 0.99+ |
less than 20 minutes | QUANTITY | 0.99+ |
15 years | QUANTITY | 0.99+ |
lisa martin | PERSON | 0.99+ |
lisa | PERSON | 0.98+ |
first thing | QUANTITY | 0.98+ |
Rohit Seth | PERSON | 0.98+ |
first line | QUANTITY | 0.98+ |
KubeCon | EVENT | 0.98+ |
twice | QUANTITY | 0.98+ |
first | QUANTITY | 0.98+ |
one | QUANTITY | 0.98+ |
second | QUANTITY | 0.97+ |
four people | QUANTITY | 0.97+ |
CloudNativeCon | EVENT | 0.97+ |
one week | QUANTITY | 0.97+ |
cloud natives | ORGANIZATION | 0.97+ |
five percent | QUANTITY | 0.96+ |
cloudnetix.com | OTHER | 0.96+ |
38 | QUANTITY | 0.96+ |
16 recommendations | QUANTITY | 0.96+ |
more than five cluster | QUANTITY | 0.96+ |
ten percent | QUANTITY | 0.96+ |
rohit | PERSON | 0.96+ |
about two months | QUANTITY | 0.96+ |
last year | DATE | 0.95+ |
thousands of containers | QUANTITY | 0.95+ |
cloudnatics | ORGANIZATION | 0.95+ |
15 years before | DATE | 0.95+ |
about a week | QUANTITY | 0.94+ |
a week | QUANTITY | 0.93+ |
over a week | QUANTITY | 0.93+ |
billions | QUANTITY | 0.93+ |
rohit seth rohit | PERSON | 0.93+ |
trillion dollar | QUANTITY | 0.91+ |
north america | LOCATION | 0.9+ |
billion dollars | QUANTITY | 0.89+ |
cloudnatics.com | OTHER | 0.89+ |
single cloud | QUANTITY | 0.88+ |
single | QUANTITY | 0.88+ |
next couple of months | DATE | 0.87+ |
kubecon | ORGANIZATION | 0.87+ |
about three | QUANTITY | 0.87+ |
a lot of companies | QUANTITY | 0.86+ |
trillion dollar | QUANTITY | 0.84+ |
several executives | QUANTITY | 0.83+ |
one of the key challenges | QUANTITY | 0.82+ |
about seven | QUANTITY | 0.81+ |
thousand | QUANTITY | 0.8+ |
20 minutes | QUANTITY | 0.79+ |
NA 2021 | EVENT | 0.79+ |
thecube | ORGANIZATION | 0.79+ |
at least two hours | QUANTITY | 0.75+ |
5 million | QUANTITY | 0.72+ |
least three times | QUANTITY | 0.72+ |
37 improvement | QUANTITY | 0.71+ |
cloudnativecon | EVENT | 0.71+ |
borg | ORGANIZATION | 0.7+ |
past year | DATE | 0.69+ |
six years | DATE | 0.68+ |
cloud native con | ORGANIZATION | 0.67+ |
cloud netex | TITLE | 0.64+ |
Seth Dobrin, IBM | IBM Think 2021
>> Narrator: From around the globe, it's The Cube, with digital coverage of IBM Think 2021, brought to you by IBM. >> Okay, we're back with our coverage of IBM Think 2021. This is Dave Vellante, and this is The Cube. We're now going to dig into AI and explore the issue of trust in machine intelligence. And we're very excited to welcome longtime Cube alum Seth Dobrin, who is the global chief AI officer at IBM. Seth, always good to see you. Thanks so much for making time for us. >> Yeah, always good to see you, Dave. Thanks for having me on again. >> It's our pleasure. Look, language matters, and IBM has been talking about language, automation, and trust. And I know you're doing a session on trustworthy AI. Can you talk about trust in the world of machine intelligence and AI? What do we need to know? >> Yeah. So, as you mentioned, language matters, automation matters, and to do either of those well, you need to really trust the AI that's coming out of your models. And trusting the AI really means five things; I think of it as five pillars. The AI needs to be transparent, the AI needs to be fair, it needs to be explainable, it needs to be robust, and it needs to ensure privacy. Without all five of those combined, you don't really have the ability to trust your AI, either as a consumer or even as an end user. So imagine I'm a business user who's supposed to use this cool new AI that's going to help me make a decision. If I don't understand it, and I can't figure out how it got to a decision, I'm going to be less likely to consume it and use it in my day-to-day work.
Whereas if I can really understand how and why it got to a decision, and know that it's protecting the ultimate end user's data, it's going to be a lot more likely that I'll end up using that AI. >> But there is the black box problem with AI. Is that a technical issue? How do you get around that? It seems non-trivial. >> Yeah. Solving the black box problem of AI, specifically of either complex traditional machine learning models or what we think of as deep learning models, is not a trivial problem. But IBM and others over the years have been really trying to tackle this problem of explainable AI, and we've come to a point where we really believe we have a good handle on how to explain these black box models: how you interpret their results and explain them from the endpoint, and understand what went into each decision at each layer in the model, if it's a deep learning model, to be able to extract why and how it's making a certain decision. Three years ago we thought of it as an intractable problem, but we knew we'd be able to solve it in the future. I think we're at that future today, where unless you get into something incredibly complex, we can explain how and why it got to a decision. And we do this through various sets of tooling that we have. Some of it is open source; we're so committed to explainable, fair, and trustworthy AI that probably half of what we do is in the open source community, in the form of what we call our AI Fairness 360 toolkit. >> Great. Thank you for that. So let me ask you another sort of probing question here: is there a risk? I mean, people talk about that.
There's maybe a risk of putting too much attention on trust in these early days of the AI journey, and people are worried that it's going to stifle projects, maybe slow down innovation, or maybe even be a headwind to AI adoption and scale. What are your thoughts on that? >> You know, I think it's a slippery slope to say it's too soon to be concerned about fairness, trust, privacy, bias, explainability, what we think of as trustworthy AI. I think you can do very interesting and very exciting and very innovative, game-changing things in the context of doing what's right. And it is right, especially when you're building AIs or anything that actually impacts people's lives, to make sure that it's trustworthy: to make sure that it's transparent, it's fair, it's explainable, it's robust, and it ensures the privacy of the underlying data in that model. Otherwise, you get to a point where you may be able to do cool things, but those cool things get undermined by previous missteps that have caused the industry, or the tools, or the technology to get a bad rap.
We want to make sure that when, you know, when the, the world's largest companies are deploying, IBM's AI, we're using IBM's pools to deploy their own AI that it's done in a way that gives them the ability to make sure that things don't go off the rails quickly, because we're not talking about a conversational Twitter bot we're talking about potentially, you know, an AI that's going to help make a life changing decision. Like, do you, you know, do we keep Dave a mortgage? Uh, do we let Dave at a jail? Is he likely to recidivate? Um, you know, is he, you know, things like that that are actually life-changing and not just going to be embarrassment for the company, it's important to keep trucking >>Great points you make. I mean, you're right. And it did turn bad, uh, very quickly. And it's not resolved. I mean, a lot of the social companies are saying, well, government, you figure it out. We can't. So let's bring it back to the enterprise. That's what we're kind of interested in where IBM's main focuses right now. And where do you see it going? I mean, you mentioned, you know, things like recidivism and mortgages. I mean, these, these really are events that you can predict with very high probability. You know, maybe you don't get it a hundred percent. Right. Uh, but it really is world changing in, in many ways. Where's the focus now? Where do you see it headed? >>Yeah. So, so I think the focus is now, or has been for a little while and continues to be, and probably will be in the future on augmenting intelligence. So especially when it comes to life-changing decisions, we don't really want an AI making that decision independent of the human. Uh, we want that AI guiding the decisions that humans make. Um, and, you know, but reducing the, the, the, the universe of those decisions, something down to something that's digestible, uh, by, by a human and also at the same time using the AI to help eliminate biases, cognitive biases that may exist within, within us as humans. 
So, and when we think about things like bias, uh, we have to remember too, that the, by the math itself is not where the bias is coming from, right? The bias is coming from the data. And the bias in the data comes from prior decisions of humans that were themselves done for, for bias reasons, right? >>And so by leveraging AI to remove the bias from the decisions that are surface to humans, it helps eliminate some of these things. So for instance, you know, back to the mortgage example, right, if we look at the impact of redlining, right, where people in certain zip codes or certain addresses, certain areas didn't get mortgages that red lining still exists in the data. I don't know of a single mortgage provider today that wants to have that in part of their day-to-day practices helps remove that from them and surface a decision that's based on the context of the individual, based on their, you know, their, their ability to repay in the case of a mortgage, as opposed to what they look like or where they live. >>I mean, I liked the concept of the common editorial power of machines and humans. Um, but, but I think there's, well, there's all, I wonder if you could sort of educate me on them. And there seems to be a lot of potential use cases for many companies, but IBM as well for inference, you know, at the edge. I mean, everybody's talking about the edge OpenShift obviously is a big play there hybrid cloud. Uh, so how do you see that, that kind of real-time inference playing it is that date, certain data comes back to the model and that's where the humans come in. How do you see that? >>Yeah, so, so I often get asked, what do you, you know, what do you see as the future of AI? And my response is the future of AI is edge. And, and the reason for that is if I can solve for an edge use case, I can solve for every use case between the edge and the data center. Right. 
And some of the challenges that you brought up, uh, as far as being on the edge, get back to, you know, and it actually helps address other problems too, such as how do I handle data sovereignty regulations when it comes to, to training models, even models themselves. Right. Um, but when you think about, you know, I have 50 models deployed around the world, there are 50 50 versions of the same model deployed around the world at different scoring end points or different places where I'm inferencing. >>Um, how do I, without having to bring the model back or all the data back, how do I keep all those models in sync? How do I make sure that, you know, back to the social media example, that one of them doesn't go completely off the rails. And we do this through federated learning, right? And this is, or distributed, distributed learning, federated learning, whatever you want to call it. It's this concept of you have models running at discrete edge locations or discrete distributed locations, those models over time, learn from the data that they're, that they're scoring on. Um, and instead of sending the data back to retrain, or even the model back to retrain, you send back the, the height, the changes in the parameters, uh, that have been updated in that model. You can then pull, put all those together. So you have 50 different distributions that you're managing. You pull all those together, and you can even assign weights to the different ones based on bias that might exist, you know, uh, not biopsies, but different distributions that may exist. That one node or another, you can do it based on the amount of data that's been going into econ into changing those weights. And so you, you combine these models back into a single model, then you push them back out to the edge. >>So I just thinking about, I mean, cause Mo most of the work in AI today, correct me if I'm wrong is in, in, in, I would say modeling versus influencing. 
And, but, but if you're laying out a future where that's going to change, and if I think about some of the things that we're familiar with today, things like fraud detection and maybe weather, um, supply chains and, and, and that's just going to get better and better and better. But I think about some of the areas, and I'm curious if you could maybe talk to some of the use cases you're seeing in examples both today and the future, but I think of things like, you know, smart power grid, smart factories, automated retail, I mean, he seemed like wheelhouses for, for IBM. So maybe you could share with us your thoughts on that. And some other examples. Yeah. >>So you brought up, you know, fraud fraud is a really good example of an edge use case that might not seem like an edge use case. Um, and so, you know, as you're swiping a credit card and let's just focus on credit card transactions, most of those transactions occur on a mainframe and, you know, they, they need a response time that's, you know, less than a millisecond. And so if I'm, if I'm, if I'm responsible for making sure that my bank doesn't have any credit card fraud, and I have a model that's going to do it, I can't have, in this case, the mainframe call-out to someplace else to score the model and then come back and this gets back to the power of hybrid cloud, right? And so if I can deploy that model on my main frame, where those transactions are happening, you know, half a millisecond at a time, I can then score every single transaction that comes back and on the mainframe directly without having to go out, which enables me to keep my SLA enables me to keep that less than half a millisecond response time while also preventing any fraudulent transactions from happening. >>That's a great example of what I would call let's. Let's fall into my, everything is edge bucket, right? 
Where, yeah, you're training the model somewhere else, where you don't have the cost of running and training it on the mainframe, but you want to score it back there. And we've actually done this with a couple of banks, where we've trained models in the cloud on GPUs and done the inferencing and scoring on the mainframe, for exactly that: for fraud. >> Edge: if you can make it there, you can make it anywhere. Seth, we've got to leave it there. Really, really appreciate your time. >> All right. Great to have... great for having me. Thanks, Dave. I appreciate it, as always. >> Great to see you. Hope we can see you later this year, face to face, or at least in '22. And thank you. >> I hope so. >> Yeah, let's make that happen, Seth. We'll virtual-shake on it. Thanks, everybody, for watching our ongoing coverage of IBM Think 2021, the virtual edition. This is Dave Vellante for The Cube. Keep it right there for more great content from the show.
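The federated-learning loop Dobrin describes above (ship parameter updates rather than raw data, weight each node's contribution, merge, and redeploy) can be sketched in a few lines. This is a rough illustration of the standard federated-averaging idea, with invented parameter names and sample counts; it is not IBM's actual implementation.

```python
# Rough sketch of the federated-averaging loop described above:
# each edge node trains locally and sends back only its parameter
# deltas plus a sample count, never the raw data.

def federated_average(global_params, node_updates):
    """Merge per-node parameter deltas into one global model.

    global_params: dict of parameter name -> value
    node_updates:  list of (delta_dict, num_samples), one per
                   edge deployment (e.g. the 50 scoring endpoints).
    """
    total = sum(n for _, n in node_updates)
    merged = dict(global_params)
    for name in merged:
        # Weight each node's delta by its share of the observed
        # data, so heavily used deployments count for more.
        merged[name] += sum(
            delta[name] * (n / total) for delta, n in node_updates
        )
    return merged  # the merged model is pushed back out to the edge

# Two hypothetical edge nodes nudging one weight in opposite directions:
updates = [({"w": 0.5}, 300), ({"w": -0.25}, 100)]
print(federated_average({"w": 1.0}, updates))  # {'w': 1.3125}
```

Weighting by sample count is one choice; as the interview notes, the weights could also reflect differences in the data distribution seen at each node.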
SUMMARY :
Think 20, 21 brought to you by IBM. Yeah, always good to see you, Dave. Can you talk about trust it got to a decision and know that it's, you know, protecting the, and, uh, you know, how is that a technical issue? And we do this through various sets of tooling that we have, some are open source, you know, So let me ask you another sort of probing question here is, Otherwise, you know, you get into a point where you may be able to do cool things, And we want to make sure that, you know, our, we don't, And where do you see it going? Um, and, you know, but reducing the, the, the, the universe of those decisions, that's based on the context of the individual, based on their, you know, for inference, you know, at the edge. And some of the challenges that you brought up, the data back to retrain, or even the model back to retrain, you send back the future, but I think of things like, you know, smart power grid, smart factories, Um, and so, you know, as you're swiping the mainframe for just exactly that for fraud Uh, Seth, we gotta leave it there. Great to have great for having me. Great to see you hope we can see you later this year, face to face, or at least in 22.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave | PERSON | 0.99+ |
Seth Dobrin | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
Seth | PERSON | 0.99+ |
Dave Volante | PERSON | 0.99+ |
50 models | QUANTITY | 0.99+ |
five things | QUANTITY | 0.99+ |
five things | QUANTITY | 0.99+ |
50 | QUANTITY | 0.99+ |
each layer | QUANTITY | 0.99+ |
less than half a millisecond | QUANTITY | 0.99+ |
less than a millisecond | QUANTITY | 0.99+ |
later this year | DATE | 0.99+ |
five pillars | QUANTITY | 0.98+ |
three years ago | DATE | 0.98+ |
50 different distributions | QUANTITY | 0.98+ |
half a millisecond | QUANTITY | 0.98+ |
today | DATE | 0.98+ |
one | QUANTITY | 0.97+ |
both | QUANTITY | 0.97+ |
single model | QUANTITY | 0.95+ |
Think 2021 | COMMERCIAL_ITEM | 0.95+ |
Think 20 | COMMERCIAL_ITEM | 0.92+ |
each decision | QUANTITY | 0.91+ |
60 tool kits | QUANTITY | 0.88+ |
hundred percent | QUANTITY | 0.86+ |
single mortgage | QUANTITY | 0.84+ |
IBM think 2021 | TITLE | 0.78+ |
22 | QUANTITY | 0.75+ |
OpenShift | TITLE | 0.74+ |
2021 | TITLE | 0.74+ |
50 versions | QUANTITY | 0.72+ |
21 | COMMERCIAL_ITEM | 0.72+ |
three | QUANTITY | 0.69+ |
single transaction | QUANTITY | 0.65+ |
Think | COMMERCIAL_ITEM | 0.55+ |
couple | QUANTITY | 0.51+ |
BOS10 Seth Dobrin VTT
(Upbeat music) >> Narrator: From around the globe. It's The Cube with digital coverage of IBM Think 2021, brought to you by IBM. >> Okay. We're back with our coverage of IBM Think 2021. This is Dave Vellante and this is The Cube. We're now going to dig into AI and explore the issue of trust in machine intelligence. And we're very excited to welcome longtime Cube alum Seth Dobrin, who is a global chief AI officer at IBM. Seth, always good to see you. Thanks so much for making time for us. >> Yeah, always good to see you, Dave. Thanks for having me on again. >> It's our pleasure. Look, language matters and IBM has been talking about language, automation, and-- and trust. And I know you're doing a session on trustworthy AI. Can you talk about trust in the world of machine intelligence and AI, what we should-- what do we need to know? >> Yeah. So, you know as you mentioned, you know, language, language matters. Automation matters. And to do either of those well, you need to really trust the AI that's coming out of-- of your models. And trusting the AI really means five things. There's-- I think of it as there's five pillars. The AI needs to be transparent. The AI needs to be fair. It needs to be explainable. It needs to be robust. And it needs to ensure privacy. And without those five things all combined, you don't really have the ability to trust your AI either as a consumer or even as an end user. So if I imagine I'm a business user that's supposed to use this cool, new AI that's going to help me make a decision, if I don't understand it and I can't figure out how it got to a decision, I'm going to be less likely to consume it and use it in my day-to-day work. Whereas if I can really understand how and why it got to a decision and know that it's, you know, protecting the ultimate end user's data, it's going to be a lot more easy a lot more likely that I'll end up using that AI. 
>> But there is the black box problem with, with, with AI and, and, you know, how-- is that a technical issue? I mean, how do you get around that? I mean, this is, it seems non-trivial. >> Yeah. So, you know I think solving the black box problem of AI specifically of either complex traditional machine learning models or what we think of as deep learning models is not a trivial problem, but, you know, IBM and others over the years have been really trying to tackle this this problem of explainable AI. And we've come to a point in the world now where we really believe that we have a good handle on how to, how to explain these black box models. How do you basically interpret it-- interpret their results and explain them from the end point and understand what went into each decision at each layer in the model, if it's deep learning model, to kind of be able to extract why and how it's making a certain decision. So I don't, you know, three years ago it was, you know we thought of it as an intractable problem but we knew we'd be able to solve it in the future. I think we're at that future today, where unless you get into something incredibly complex, we can, we can explain how and why it got to a decision. >> Awesome >> And we do this through various sets of tooling that we have. Some are open source, you know, we're so committed to, to explainable and fair and trust when it comes to AI, that probably half of what we do is at the open source community in the form of what we call our AI fairness 360 tool kits. >> Great. Thank you for that. So let me ask you another sort of probing question here. Is there a risk-- I mean, people talk about that there's maybe a risk of putting too much attention on trust and these are early days in the AI journey and people are worried that it's going to stifle projects, maybe slow down innovation, or maybe even be a headwind to AI adoption and scale. What are your thoughts on that? 
>> You know, I, I, I-- I think it's a slippery slope to say it's too soon to ever be concerned about fairness, trust, privacy, bias, explainability, you know, what we think of as trustworthy AI. I think if you-- you can do very interesting and very exciting and very innovative and game changing things in the context of doing what's right. And it is right, especially when you're building AIs or anything that actually impacts people's lives to make sure that it's trustworthy. To make sure that it's, you know, it's, it's transparent, it's fair, it's explainable, it's robust. And it ensures the privacy of the underlying data in that model. Otherwise, you know, you get into a point where you may be able to do cool things but those cool things get undermined by previous missteps that have caused the industry or the tools or the technology to get a bad rap. I mean, a great example of that is, I mean, look at the conversational AI that were released in the wild in Twitter and Facebook without any kind of thought about how do you keep them trustworthy? You know, that-- you know, that went bad really quick. And we want to make sure that, you know, our-- we don't-- we don't, you know, IBM's not a consumer facing company, we're kind of the, you know, the IBM inside, if you will. >> Dave: Right >> Right? We want to make sure that when, you know, when the-- the world's largest companies are deploying, IBM's AI, or using IBM's tools to deploy their own AI, that it's done in a way that gives them the ability to make sure that things don't go off the rail quickly. 'Cause we're not talking about a conversational Twitter bot. We're talking about potentially, you know, an AI that's going to help make a life-changing decision. Like, do you, you know, do we give Dave a mortgage? Do we let Dave out of jail? Is he likely to recidivate? You know, is he, you know-- things like that that are actually life changing and not just going to be embarrassment for the company. 
So it's important to keep trust upfront. >> Great points you make. I mean, you're right. And it did turn bad very quickly and it's not resolved. I mean, a lot of the social companies are saying, "Oh, government you figure it out." We can't. (chuckles) So let's bring it back to the enterprise. That's what we're kind of interested in where IBM's main focus is right now. And where do you see it going? I mean, you mentioned, you know, things like recidivism and mortgages. I mean, these, these really are events that you can predict with very high probability. You know, maybe you don't get it a hundred percent right. But it really is world changing in, in many ways. Where's the focus now? And where do you see it headed? >> Yeah, so, so I think the focus is now, or has been for a little while and continues to be, and probably will be in the future on augmenting intelligence. >> Dave: Mmm hmm >> So especially when it comes to life-changing decisions, we don't really want an AI making that decision independent of the human. We want that AI guiding the decisions that humans make and you know-- but reducing the universe of those decisions, something-- down to something that's digestible by, by a human, and also at the same time using the AI to help eliminate biases, cognitive biases that that may exist within us as humans. So, and when we think about things like bias, we have to remember too, that the bi- the math itself is not where the bias is coming from, right? The bias is coming from the data. And the bias in the data comes from prior decisions of humans that were themselves done for bias reasons, right? And so by leveraging AI to remove the bias from the decisions that are surface to humans, it helps eliminate some of these things. So for instance, you know, back to the mortgage example, right? If we look at the impact of redlining, right? Where people in certain zip codes or certain addresses, certain areas didn't get mortgages, that redlining still exists in the data. 
I don't know of a single mortgage provider today that wants to have that in part of their day-to-day practice. This helps remove that from them and surface a decision that's based on the context of the individual, based on their-- you know, their ability to repay, in the case of a mortgage, as opposed to what they look like or where they live. >> So, I mean, I liked the concept of the common and pictorial power of machines and humans, but I think there's-- well, there's all-- I wonder if you could sort of educate me on them. There seems to be a lot of potential use cases for many companies, but IBM as well, for inference, you know, at the edge. I mean, everybody's talking about the edge. Open shift obviously is a big play there. Hybrid cloud... So how do you see-- that-- that kind of real time inference playing in? Does that-- certain data comes back to the model and that's where the humans come in? W-- H-- How do you see that? >> Yeah, so, so-- I often get asked, what do you, you know what do you see as the future of AI? And my response is the future of AI is edge. And, and the reason for that is if I can solve for an edge use case, I can solve for every use case between the edge and the data center. >> Dave: Uh huh. >> Right? And some of the challenges that you brought up as far as being on the edge, get back to, you know, and it actually helps address other problems too, such as how do I handle data sovereignty regulations when it comes to training models, and even models themselves. >> Dave: Right. >> But when you think about, you know, I have 50 models deployed around the world, there are 50 versions of the same model deployed around the world at different scoring end points or different places where I'm inferencing. How do I-- without having to bring the model back or all the data back, how do I keep all those models in sync? How do I make sure that, you know, back to the social media example, that one of them doesn't go completely off the rails. 
And we do this through federated learning, right? And this is-- or distributed, distributed learning, federated learning, whatever you want to call it. It's this concept of you have models running at the discrete edge locations or discrete distributed locations, those models over time learn from the data that they're, that they're scoring on. And instead of sending the data back to retrain or even the model back to retrain you send back the the-- the changes in the parameters that have been updated in that model. You can then pull-- put all those together. So you have 50 different distributions that you're managing. You pull all those together and you can even assign weights to the different ones based on bias that might, you know, not biases but different distributions that may exist at one node or another, you can do it based on the amount of data that's been going into-- gone into changing those weights. And so you-- you combine these models back into a single model and then you push them back out to the edge. >> So I just thinking about, I mean, cause mm-mo-- most of the work in AI today, correct me if I'm wrong is in, in, in-- I would say modeling versus inferencing. And, but, but if you're laying out a future where that's going to change, and if I think about some of the things that we're familiar with today, things like fraud detection, maybe weather... supply chains and, and-- and that's just going to get better and better and better. But I think about some of the areas and I'm curious if you could maybe talk to some of the use cases you're seeing in examples, both today and the future, but I think of things like, you know, smart power grid, smart factories, automated retail, I mean these seemed like wheelhouses for IBM. So maybe you could share with us your thoughts on that. And some other examples. >> Yeah, so you brought up, you know, fraud. Fraud is a really good example of an edge use case that might not seem like an edge use case. >> Dave: Yeah. 
>> And so, you know, as you're swiping a credit card, and let's just focus on credit card transactions, most of those transactions occur on a mainframe and you know, they, they need a response time that's, you know, less than a millisecond. And so if I'm, if I'm, you know, if I'm responsible for making sure that my bank doesn't have any credit card fraud and I have a model that's going to do it, I can't have, in this case, the mainframe call-out to someplace else to score the model and then come back. And this gets back to the power of hybrid cloud, right? And so if I can deploy that model on my mainframe, where those transactions are happening, you know, half a millisecond at a time, I can then score every single transaction that comes back and on the mainframe directly without having to go out, which enables me to keep my SLA, enables me to keep that less than half a millisecond response time, while also preventing any fraudulent transactions from happening. That's a great example of what I would call-- That falls into my "everything is edge bucket," right? Where, yeah, you're training the model somewhere else where you don't have the cost of running and training it on the mainframe, but I want to score it back there. And we've actually done this with a couple of banks, where we've trained models in the cloud on GPUs and done the inferencing and scoring on the mainframe for just exactly that, for fraud. >> Edge: if you can make it there, you can make it anywhere. (chuckles) Seth, we-- we got to leave it there. Really, really appreciate your time. >> All right. Great to have-- great for having me. Thanks, Dave, I appreciate it, as always. >> Oh, it was great to see you. Hope we can see you later this year, face to face, or at least in '22. And, and thank you! >> I hope so. >> Yeah, let's, let's make that happen Seth. We'll virtual shake on it. (chuckles) Thanks everybody for watching our ongoing coverage of IBM Think 2021, the virtual edition. 
This is Dave Vellante for theCUBE. Keep it right there for more great content from the show. (light-hearted music) (upbeat music)
Seth Juarez, Microsoft | Microsoft Ignite 2019
>> Live from Orlando, Florida. It's theCUBE, covering Microsoft Ignite, brought to you by Cohesity.
>> Good afternoon everyone, and welcome back to theCUBE's live coverage of Microsoft Ignite. 26,000 people are here at this conference at the Orange County Convention Center. I'm your host, Rebecca Knight, alongside my cohost, Stu Miniman. We are joined by Seth Juarez. He is a cloud developer advocate at Microsoft. Thank you so much for coming on the show. >> Glad to be here. You have such a lovely set, and you're lovely people. We just met; you don't know any better yet. (laughs) Maybe after the end of the 15 minutes we'll have another discussion. >> You're starting off on the right foot. So tell us a little bit about what you do. You're also a host on Channel 9; tell us about your role as a cloud developer advocate. >> So a cloud advocate's job is primarily to help developers be successful on Azure. My particular expertise lies in AI and machine learning, and so my job is to help developers be successful with AI in the cloud, whether they be developers, data scientists, machine learning engineers, or whatever it is that people call it nowadays, because you know how the titles change a lot. My job is to help them be successful. And what's interesting is that sometimes, when our customers can't find success in the cloud, that's actually a win for me too, because then I have a deep integration with the product group, and my job is to help them understand, from a customer perspective, what it is they need and why. So I'm like the ombudsman, so to speak, because the product groups are the product groups. I don't report up to them. So I usually go in there and I'm like, hey, I don't report to any of you, but this is what the customers are saying. We are very keen on being customer-centered, and that's why I do what I do. >> Seth, I have to imagine when you're dealing with customers, some of that skills gap and learning is something that they need to deal with.
You know, we've been hearing for a long time that there are not enough data scientists, that we need to learn these environments. Satya Nadella spent a lot of time talking about the citizen developers out there. So bring us inside the customers you're talking to: where do you usually start, and how do they pull the right people in? Or are they bringing in outside people? >> Great question. It turns out that at Microsoft we have our product groups, and then right outside we have our advocates, who are very closely aligned to the product groups.
And so anytime we do have an interaction with a customer, it's for the benefit of all the other customers. And so I meet with a lot of customers, and I don't get to talk about them too much. But I go in there and I see what they're doing. For example, one time I went to the Turing Institute in the UK. I went in there, and because I'm not there to sell, I'm there to figure out: what are you trying to do, and does this actually match up? It's a very different kind of conversation. They tell me about what they're working on, I tell them about how we can help them, and then they tell me where the gaps are or where they're very excited, and I take both of those pieces of feedback to the product group. They just love being able to have someone on the ground talking to people, because sometimes, when you work on stuff, you get a little siloed, and it's good to have an ombudsman, so to speak, to make sure that we're doing the right thing for our customers. >> As somebody that works on AI, you must have been geeking out working with the Turing Institute, though. >> Oh yeah. Those people are absolutely wonderful, and as I was walking in I was a little giddy. But the problems that they're facing in AI are very similar.
The problems that other people in big organizations are facing as they try to onboard to AI are much the same. Everyone says, "I need to be using this hammer," and they're trying to hammer in some screws with it. So it's good to figure out when it's appropriate to use AI and when it isn't. And I help customers with that, too. >> And I'm sure the answer is "it depends" in terms of when it's appropriate, but do you have any sort of broad-brush advice for helping an organization determine: is this a job for AI? >> Absolutely. It's a question I get often. As developers, we have this thing called a code smell, which tells us maybe we should refactor. For me, there's an AI smell: if you can't precisely figure out the series of steps to execute an algorithm and you're having a hard time writing code, or, for example, if every week you need to change your if-else statements, or if you're changing numbers from 0.5 to 0.7 and now it works, that's the smell that says you should think about using AI or machine learning, right? There's also a class of algorithms that, it's not that we've solved them, but they're pretty much solved. Like, for example, detecting what's in an image, or understanding sentiment in text, right? Those are problems we have solutions for that are just done. But if you have a code smell where you have a lot of data and you don't want to write an algorithm to solve that problem, machine learning and AI might be the solution. >> Alright, a lot of announcements this week. Any highlights from your area? Last year, AI was mentioned specifically many times; now, with autonomous systems, it feels like AI is in there, not necessarily just rubbing AI on everything.
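The "AI smell" described above, a hand-maintained magic number that has to be re-edited every time the data drifts, can be made concrete with a toy sketch. The data and thresholds here are invented for illustration; the "learning" is the simplest possible kind, picking the cutoff that best separates labeled examples instead of editing it by hand:

```python
def learn_threshold(scores, labels):
    """Pick the cutoff that best separates the labeled examples,
    instead of hand-editing a magic number (0.5 last week, 0.7 this week...)."""
    return min(
        sorted(scores),
        key=lambda t: sum((s > t) != bool(l) for s, l in zip(scores, labels)),
    )

# Toy risk scores with known outcomes (invented data for illustration).
scores = [0.1, 0.3, 0.6, 0.8, 0.9]
labels = [0, 0, 1, 1, 1]          # 1 = turned out to be risky
cutoff = learn_threshold(scores, labels)  # learned from data, not hand-tuned

def is_risky(score):
    return score > cutoff
```

When the data drifts, you refit from new labeled examples rather than rewrite the if-else statement, which is the essence of the transition the smell is pointing at.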
>> I think it's because we have such a good solution for people building custom machine learning that now it's time to talk about the things you can do with it. We're talking about autonomous systems because they're built upon the foundation of the AI that we've already built. We released something called Azure Machine Learning, a set of tools, including a studio, where you can do end-to-end machine learning. Because what's happening is, most data scientists nowadays, and I'm guilty of this myself, we put stuff in things called Jupyter notebooks, we release models, we email them to each other, we're emailing Python files around, and that's kind of like how programming was in 1995. What we're doing now is building a set of tools that let machine learning developers go end to end and see how data scientists are working, et cetera. For example, let's say you have a data scientist, Bill. He did an awesome job, but then he goes somewhere else, and Sally, who is absolutely amazing, comes in as the new data scientist. Usually Sally starts from zero, and all of the stuff that Bill did is lost. With Azure Machine Learning, you're able to see all of your experiments: see what Bill tried, see what he learned, and Sally can pick right up and go on. And that's just the experiments. If you want to get machine learning models into production, we also have the ability to take these models, version them, and put them into a CI/CD-style process with Azure DevOps and machine learning. So you can go from data all the way to machine learning in production very easily, very quickly, and in a team environment. That's what I'm excited about, mostly. >> So at a time when AI and technology companies in general are under fire, and are considered to not always have their users' best interests at heart, I'd like you to talk about the Microsoft approach to ethical AI and responsible AI. >> Yeah, I was a part of the keynote.
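The Bill-and-Sally handoff problem above is, at its core, experiment tracking: every run is persisted with who ran it, what parameters were used, and what it scored, so the next person inherits it. The tiny class below is a hypothetical, toy illustration of that idea, not Azure Machine Learning's actual API:

```python
import json
import pathlib
import time

class ExperimentLog:
    """Toy experiment tracker: each run (author, params, metric) is persisted,
    so a new data scientist can pick up where the last one left off."""

    def __init__(self, path="demo_experiments.jsonl"):
        self.path = pathlib.Path(path)

    def log(self, author, params, metric):
        run = {"author": author, "params": params,
               "metric": metric, "ts": time.time()}
        with self.path.open("a") as f:
            f.write(json.dumps(run) + "\n")

    def best_run(self):
        runs = [json.loads(l) for l in self.path.read_text().splitlines()]
        return max(runs, key=lambda r: r["metric"])

log = ExperimentLog()
log.log("bill",  {"model": "xgboost", "depth": 6}, metric=0.81)
log.log("bill",  {"model": "logreg"},              metric=0.74)
log.log("sally", {"model": "xgboost", "depth": 8}, metric=0.85)
best = log.best_run()  # Sally can see everything Bill tried, and build on it
```

Real platforms add model versioning and deployment hooks on top of this record-keeping core, which is what turns individual experiments into a team workflow.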
Scott Hanselman is a very famous dev, and he did a keynote, and I got to form part of it. One of the things we were very careful about, even in a dumb demo where he was doing rock-paper-scissors, was consent. I said, "Scott, we were watching, with your permission, what sequence of throws you were doing." We believe that through and through: we will never use our customers' data to enhance any of our models. In fact, there was a time when we were building a machine learning model for NLP, and I saw the email thread, and it was like: we don't have enough data for this language, I don't remember which one it was. Let's pay some people to ethically source this particular language data. We will never use any of our customers' data, and I've had this question asked a lot. For example, our Cognitive Services, which have built-in AI: we will never use any of our customers' data to build those either. Or, for example, Custom Vision, where you upload your own pictures: those are your pictures, and we're never going to use them for anything. In anything that we do, there's always consent, and we want to make sure that everyone understands that AI is a powerful tool, but it also needs to be used ethically. And that's just how we use data from our customers. We also have tools inside of Azure Machine Learning to help them use AI ethically. We have tools to explain models. For example, if you vary gender, does the model change its prediction? Or if you vary class or race, is your model being a little iffy? We have those tools in Azure Machine Learning, so our customers can also be ethical with the AI they build on our platform. So we have ethics built into how we build our models, and we have ethics built into how our customers can build their models too, which to me is very important. >> And is that a selling point? Are customers gravitating to it? I mean, we've talked a lot about it on the show.
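The model-explanation check mentioned above, varying a sensitive attribute and seeing whether the prediction flips, can be sketched in a few lines. The model and data below are invented stand-ins for illustration; this is not Azure Machine Learning's actual fairness tooling:

```python
def counterfactual_flip_rate(model, rows, attribute, values):
    """Fraction of rows whose prediction changes when only `attribute`
    is swapped among the given values -- a crude fairness probe."""
    flips = 0
    for row in rows:
        preds = set()
        for v in values:
            probe = dict(row)        # copy, so only `attribute` differs
            probe[attribute] = v
            preds.add(model(probe))
        flips += len(preds) > 1      # prediction depended on the attribute
    return flips / len(rows)

# Invented stand-in model: approves on income, but (badly) also on gender.
def biased_model(row):
    return row["income"] > 50 or row["gender"] == "m"

rows = [{"income": 80, "gender": "f"},
        {"income": 30, "gender": "f"},
        {"income": 60, "gender": "m"}]
rate = counterfactual_flip_rate(biased_model, rows, "gender", ["m", "f"])
# a nonzero rate means the sensitive attribute alone changes decisions
```

Production fairness tooling goes well beyond this single probe, but the underlying question it asks is exactly the one in the interview: does varying gender, class, or race change the model's answer?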
About the trust that customers have in Microsoft, and the image that Microsoft has in the industry right now. But also the idea that it is trying to make everyone else more ethical. Do you think that that is one of the reasons customers are gravitating to you? >> I hope so. And as far as a selling point, I absolutely think it's a selling point, but we've just released it, and so I'm going to go out there and evangelize the fact that not only are we ethical with what we do in AI, but we want our customers to be ethical as well. Because, you know, trust pays. As Satya said in his keynote, trust is the exponent that allows tech intensity to actually be tech intensity. And we believe that through and through: not only do we believe it for ourselves, but we want our customers to also believe it and see the benefits of having trust with their customers. >> One of the things we talked to Scott Hanselman about a little bit yesterday, around that demo, is that the Microsoft of today isn't just "use all the Microsoft products," right? It's "use any tool, any platform, your own environment." Tell us how that plays into your world. >> You know, in my opinion, and I don't know if it's the official opinion, we are in the business of renting computer cycles. We don't care how you use them; just come into our house and use them. You want to use Java? We've recently announced a ton of things with Spring, and we've become an OpenJDK contributor. One of my colleagues works very hard on that. I work primarily in Python because of machine learning. I have a friend and colleague, David Smith, who works in R. I have other colleagues that work in a number of different languages. We don't care.
What we are doing is trying to empower every organization and every person on the planet to achieve more, where they are, how they are, and hopefully bring a little bit of it to our cloud. >> What are you doing that's really exciting to you right now? I know you're doing a new .NET library. Any other projects that are sparking your interest? >> Yeah, so next week I'm going to France, and this is before anyone's going to see this. There is a company, I think it's called Surf, I'll have to look it up and we'll put it in the notes, but they are basically trying to use AI to be more environmentally conscious. They're taking pictures of trash in rivers, and they're using AI to figure out where it's coming from so they can clean up the environment. I get to go over there, see what they're doing, see how I can help them improve it, and promote this kind of ethical way of doing AI. We also do stuff with snow leopards. I was watching some Netflix thing with my kids, and we were watching snow leopards, and there were, like, two of them. And this was impressive, because as I'm watching this with my kids, I'm like, hey, we at Microsoft are helping this population, you know, perpetuate, with AI. >> And those are the things I've actually seen on TV: rather than spending thousands of hours of people's time out there, the AI can identify the shape through the cameras. I love that powerful story to explain some of those pieces, because otherwise it's tough to get the nuance of what's happening here. >> Absolutely. With this technology, these models are incredibly easy to build on our platform, and fairly easy to build with what you have. People use TensorFlow? Use TensorFlow. People use PyTorch? That's great. Caffe? Whatever you want to use. We are happy to rent out our computer cycles, because we want you to be successful.
Maybe speak a little bit to that. When you talk about the cloud, one of the things it does is democratize the availability of all this.
There are usually free tiers out there, especially in the emerging areas. How is Microsoft helping to get that compute and that world of technology to people that might not have had it in the past? >> I was in Peru a number of years ago, and I had a discussion with someone on the Channel 9 show, and I suddenly understood the value of this. He said: Seth, if I wanted to do a startup here in Peru, right, and it was the capital of Peru, a very industrialized city, I would have to buy a server. It would come from California on a boat. It would take a couple of months to get here, and then it would be in a warehouse for another month as it goes through customs. And then I would have to put it into a building that has AC, and then I could start. Now, Seth, with the click of a button,
I can provision an entire cluster of machines on Azure and start right now. That's what the cloud is doing in places like Peru, places that maybe don't have a lot of infrastructure. Now infrastructure is for everyone, and maybe someone even in the United States, in a rural area, can start up their own business right now, anywhere. It's not just because it's Peru, or because it's some other place that's becoming industrialized; it's everywhere. Any kid with a dream can spin up an app service and have a website done in, like, five minutes. >> So what does this mean? As you said: any kid, any person in a rural area, any developing country. What does this mean five or ten years from now, in terms of the future of commerce and work and business? >> Honestly, some people feel like computers are out stealing human ingenuity. I think they are really augmenting it.
Like, for example, back when I was a kid, if I wanted to know something, sometimes I had to go without knowing, like, "I guess we'll never know," right? And then five years later we're like, okay, we found out it was that character on that show. And now we just look at our phone, and it's like, oh, you were wrong. And I liked not knowing that I was wrong for a lot longer, you know what I'm saying? But nowadays, with our phones and with other devices, we have information readily available, so that we can give appropriate answers to the questions that we have. AI is going to help us with that by augmenting human ingenuity, by looking at the underlying structure.
For example, if you look at an Excel spreadsheet with, like, five rows and maybe five columns, you and I as humans can look at it and see a trend. But what if it's 10 million rows and 5,000 columns? Our ingenuity has been stretched too far. But with computers, now we can aggregate, we can build some machine learning models, and then we can see the patterns that the computer found, aggregated, and make the decisions we could make with five columns and five rows. It's not taking our jobs; it's augmenting our capacity to do the right thing. >> Excellent. Seth, thank you so much for coming on theCUBE. Really fun conversation. >> Glad to be here. Thanks for having me. >> Alright, I'm Rebecca Knight, for Stu Miniman. Stay tuned for more of theCUBE's live coverage of Microsoft Ignite.
Seth Dobrin, IBM | IBM Data and AI Forum
>> Live from Miami, Florida, it's theCUBE, covering the IBM Data and AI Forum. Brought to you by IBM. >> Welcome back to the Port of Miami, everybody. We're here at the Intercontinental Hotel. You're watching theCUBE, the leader in live tech coverage. Seth Dobrin is here. He's the vice president of data and AI, and the chief data officer of cloud and cognitive software at IBM. Good to see you again. >> Good to see you, Dave. Thanks for having me. >> The Data and AI Forum: it's amazing here, 1,700 people, and everybody's got a hands-on appetite for learning. What do you see out in the marketplace? What's new since we last talked? >> Well, if you look at what's really needed in the marketplace, it's been around filling the skills shortage, and how you operationalize and industrialize your AI. There's been a real need for ways to get more productivity out of your data scientists, not necessarily to replace them, but to get more productivity. A few months ago we released something called AutoAI, which is probably the only tool out there that automates the end-to-end pipeline, automates 80% of the work on that pipeline, but isn't a black box. It actually kicks out code, so your data scientists can then take it, optimize it further, understand it, and really feel more comfortable about it. >> AI for AI, essentially. >> That's exactly what it is: AI for AI. >> So how does that work? You're applying machine intelligence to data to make AI more productive: picking algorithms, best fit? >> Yeah. Basically, you feed it your data, and it identifies the features that are important. It does feature engineering for you. It does model selection for you. It does hyperparameter tuning and optimization, and it does deployment, and it also monitors for bias. >> So what does the data scientist do?
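The AutoAI pipeline described above, feature engineering, model selection, hyperparameter tuning, follows the general pattern of automated search over candidate pipelines, scored on held-out data. Here is a highly simplified, pure-Python sketch of that search loop; it is not IBM's implementation, and the candidate "models" are toy threshold rules invented for illustration:

```python
def evaluate(predict, data):
    """Accuracy of a predictor on labeled (feature, label) pairs."""
    return sum(predict(x) == y for x, y in data) / len(data)

def auto_select(train, holdout, candidates):
    """Try every (name, hyperparameters, builder) candidate and keep the
    one that scores best on held-out data -- the core loop behind
    automated model selection and hyperparameter tuning."""
    best = None
    for name, params, build in candidates:
        model = build(train, **params)
        score = evaluate(model, holdout)
        if best is None or score > best[2]:
            best = (name, params, score, model)
    return best

# Toy model builder: classify by a single threshold on the feature.
def threshold_model(train, cutoff):
    return lambda x: x > cutoff

train = [(1, False), (2, False), (5, True), (7, True)]
holdout = [(3, False), (6, True)]
candidates = [("thr", {"cutoff": c}, threshold_model) for c in (1, 4, 6)]
name, params, score, model = auto_select(train, holdout, candidates)
```

Real systems search over far richer spaces (feature transforms, algorithm families, learned hyperparameters) with smarter strategies than exhaustive enumeration, but the select-by-held-out-score skeleton is the same.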
>> The data scientist takes the code out the back end, and really there are some tweaks: AutoAI may not get the model perfect, so they customize it for the business and the needs of the business. >> So the data scientist can apply it in a way that is unique to their business, and that essentially becomes their IP. It's not generic AI for everybody; it's customized. And that's where data scientists complain they don't have the time: for all this wrangling of data. >> Exactly. And it was built with a combination of IBM Research, which has some great assets, plus some Kaggle masters who work here at IBM and really designed and optimized the algorithm selection and things like that. At the keynote today, Wunderman Thompson was up there talking, and this is probably one of the most impactful use cases of AutoAI to date. It was also my former team, the data science elite team, that was engaged. Wunderman Thompson had this problem where they had, like, 17,000 features in their data sets, and what they wanted was a custom solution for each of their customers. So every time they got a customer, they had to have a data scientist sit down and figure out the right features and how to engineer them for that customer. It was an intractable problem for them. The person from Wunderman Thompson who presented today said he'd been trying to solve this problem for eight years. AutoAI, plus the data science elite team, solved it in two months, and after those two months it went right into production. So in this case AutoAI isn't doing the whole pipeline; it's helping them identify and engineer the features that are important, and giving them a head start on the model.
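Automated feature selection of the kind that helped with the 17,000-feature problem above can be illustrated with a minimal correlation-based ranking. The data is synthetic and this is a deliberately crude stand-in, not AutoAI's actual algorithm:

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def rank_features(columns, target, top_k):
    """Score each candidate feature by |correlation| with the target and
    keep the strongest -- a crude stand-in for automated selection."""
    scored = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    return sorted(scored, key=scored.get, reverse=True)[:top_k]

target = [1, 2, 3, 4, 5]
columns = {
    "useful": [2, 4, 6, 8, 10],  # perfectly correlated with the target
    "noisy":  [5, 1, 4, 2, 3],   # weakly related
    "flat":   [7, 7, 7, 7, 7],   # zero variance, carries no signal
}
best = rank_features(columns, target, top_k=1)
```

With thousands of candidate features, even this naive ranking shrinks the search space before a human ever looks at it; real tools layer interaction detection and engineered transforms on top.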
>> What's the acquisition model for AutoAI? Is it licensed software? Is it SaaS? Part
of Cloud Pak for Data? >> It's part of Cloud Pak for Data, and it's available on IBM Cloud. On IBM Cloud you can use it pay-per-use, so you get a license as part of Watson Studio. If you invest in Cloud Pak for Data, it can be a perpetual license or a committed-term license, which is essentially SaaS. >> So
it's essentially a feature, an add-on, of Cloud Pak for Data. >> It's part of Cloud Pak for Data, and you're >> saying it can be usage-based. That's key. >> Consumption-based. Cloud Pak for Data is all consumption-based. >>
So people want to use AI for competitive advantage. As I said in my open, we're not marching to the cadence of Moore's Law in this industry anymore; it's a combination of data, and then cloud for scale. So people want competitive advantage, and you've talked about some things folks are doing to gain it. But at the same time, we heard from Rob Thomas that there's only about 4 to 10% penetration for AI. What are the key blockers that you see, and how are you knocking them
down? >> Well, I think there are a number of key blockers. One is access to data, right? Companies have tons of data, but being able to even know what data is there, being able to pull it all together, and being able to do it in a way that is compliant with regulation, because you can't do AI in a vacuum. You have to do it in the context of ever-increasing regulation like GDPR and CCPA and all these other privacy regulations that are popping up. So that's really two: access to data, and regulation, can be blockers. The third is access to appropriate skills, which we talked a little bit about: how do you retrain or upskill the talent you have, and how do you bring in new talent that can execute what you want?
And then sometimes, in some companies, it's a lack of strategy with appropriate measurement, right? So what is your AI strategy, and how are you going to measure success? And you and I have talked about this on theCUBE before: you've got to measure your success in dollars and cents, right? Cost savings, net-new revenue. That's really all your CFOs care about. That's how you have to measure and monitor your success. >> Yes, and that last one is probably where most organizations start. Let's prioritize the use cases that give us the best bang for the buck, and then the business guys get really excited and say, okay, let's go. But to truly operationalize that, you've got to worry about these other things, the compliance issues, and you've got to have the skill sets. It's a scale thing. >> And sometimes the first thing you said is actually a mistake. Focusing on the use case with the most bang for the buck is not necessarily the best place to start, for a couple of reasons. One: you may not have the right data. It may not be available, and it may not be governed properly. Two: the business you're building it for may not be ready to consume it, right? They may not be bought in, or the processes may need to change so much that it's not going to get used. And you can build the best AI in the world; if it doesn't get used, it creates zero value. So for the first couple of projects, you really want to focus on the ones where you can deliver the best value, not necessarily the most value, but the best value in the shortest amount of time, and ensure they get into production. Because especially when you're starting off, if you don't show adoption, people are going to lose interest. >> What are you seeing in terms of experimentation now in the customer base? When you talk to buyers, and you look at the IT spending surveys:
People are concerned about tariffs, the trade war, the 2020 election; they're being a little bit cautious. But in the last two or three years there's been a lot of experimentation going on, and a big part of that is AI and machine learning. What are you seeing in terms of that experimentation turning into actual production projects that we can learn from, and maybe use to do some new experiments? >> Yeah, and I think it depends on how you're doing the experiments. There's what I'd call academic experimentation, where you have data science teams working on cool stuff that may or may not have business value and may or may not be implemented, right? The business isn't really involved; the teams latch on, they do projects, and I think that's actually bad experimentation if you let it run your program. Good experimentation is when you start by having a strategy: you identify the use cases you want to go after, and you experiment by leveraging agile delivery methodologies. You deliver value in two-week sprints, and you can start delivering value quickly. In the case of Wunderman Thompson again: eight weeks, four sprints, and they got value. That was an experiment, right? But because it was done with agile methodologies, good coding practices, and good design-up-front practices, they were able to take it and put it right into production. If you're doing experimentation where you have to rewrite your code at the end, it's a waste of time.
Apply agile methodologies, get a quick return, learn, develop those skills, and then build up to the moon shot. >> Or you break that moon shot down into consumable pieces, right? Because the moon shot may take you two years to get to, but maybe there are subcomponents of that moon shot that you can deliver in three or four months, and you start delivering those, and you work up to the moon shot. >> I always like to ask about dogfooding, although I don't like that term, I call it sipping your own champagne. What have you guys done internally? When we first met, it was, I think, a snowy day in Boston, at the Spark Summit years ago. And you did a big career switch, and it's obviously working out for you. But what are some of the things? You were, in part, brought in to help IBM internally, as well as Inderpal, to help IBM really become data driven internally. How has that gone? What have you learned? And how are you taking that to customers? >> Yeah, so I was hired three years ago now, believe it was that long, to lead our internal transformation. Over the last couple of years I got, I don't want to say distracted, but there were really important business things I needed to focus on, like GDPR, and helping our customers get up and running with data science, and I built a data science elite team. So as of a couple months ago, I'm back, you know, almost entirely focused on our internal transformation. And, you know, it's really about making sure that we use data and AI to make appropriate decisions. And so now we have an app on our phones that leverages Cognos Analytics, where at any point Ginni Rometty or Rob Thomas or Arvind Krishna can pull it up and look at what we call EPM, which is enterprise performance management, and understand where the business is, right? What did we do in third quarter, which just wrapped up? What's the pipeline for fourth quarter? And it's at your fingertips. We're working on revamping our planning cycle.
So today, planning has been done in Excel. We're leveraging Planning Analytics, which is a great planning and scenario-planning tool that, with a click of a button, really lets you understand how your business can perform in the future and what you need to do to get it to perform. We're also looking across all of Cloud and Cognitive Software, which Data and AI sits in. Within each business unit in Cloud and Cognitive Software, the sales teams do a great job of cross-sell and upsell. But there's a huge opportunity in how we cross-sell and upsell across the five different businesses that live inside of Cloud and Cognitive Software. So Data and AI, Hybrid Cloud Integration, IBM Cloud, Cognitive Applications, and IBM Security. There's a lot of potential interplay that our customers do across there, and we're providing AI that helps the salespeople understand when they can create more value for our customers. >> It's interesting. This is the 10th year of doing theCUBE, and when we first started it was sort of the beginning of the big data craze, and a lot of people said, oh, okay, here's the disruption, crossing the chasm, innovator's dilemma, all that old stuff going away, all the new stuff coming in. But you mentioned Cognos on mobile, and this is the thing we learned: the key ingredients to data strategies comprise the existing systems. You can't throw those out. Those are the systems of record that were the single version of the truth, if you will, that people trusted. To go back to trust, all this other stuff built up around it, which kind of created dissonance. And so it sounds like one of the initiatives that you're working on at IBM is really bringing in the new pieces, modernizing sort of the existing, so that you've got consistent data sets that people can work with.
>> And one of the capabilities that has really enabled this transformation in the last six months, for us internally and for our clients, is inside Cloud Pak for Data: we have this capability called IBM Data Virtualization. We have all these independent sources of truth, and then we have all these other data sources that may or may not be as trusted, but you're able to bring them together literally with the click of a button. You drop your data sources in, and the AI within Data Virtualization actually identifies keys across the different sources so you can link your data. You look at it, you check it, and it really enables you to do this at scale. All you need to do is point it at the data, here's the IP address of where the data lives, and it will bring that in and help you connect it. >> So you mentioned variances in data quality, and the consumer of the data has to have trust in that data. Can you use machine intelligence and AI to sort of give you a data confidence meter, if you will? >> Yeah, so there are two things that we use for data confidence. I call it the dodginess factor, right, understanding what the dodginess factor of the data is. So we definitely leverage AI. If you have a data dictionary and you have metadata, the AI can understand data quality, and it can also look at what your data stewards do, and it can do some of the remediation of the data quality issues. But also in Watson Knowledge Catalog, which again is in Cloud Pak for Data, we have the ability to vote up and vote down data. So as much as the team is using data internally, if there's a data set that had a high data quality score but wasn't really valuable, it'll get voted down, and when you search for data in the system, it will sort the results, kind of like a search on the internet, and down-rank that one depending on how many down votes it got.
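That up-and-down-vote ranking, blended with an automated quality score, can be sketched in a few lines of Python. This is an illustrative toy, not Watson Knowledge Catalog's actual scoring; the 60/40 weighting, field names, and example datasets are all assumptions:

```python
# Illustrative toy: ranking datasets by a blend of an automated quality score
# and crowd-sourced votes. The 60/40 weighting and all names are assumptions,
# not Watson Knowledge Catalog's actual logic.

from dataclasses import dataclass

@dataclass
class DatasetRecord:
    name: str
    quality_score: float  # 0.0-1.0, e.g. from automated profiling against a data dictionary
    upvotes: int = 0
    downvotes: int = 0

    def confidence(self) -> float:
        # Blend machine-derived quality with user sentiment: a dataset with a
        # high automated score but many downvotes still gets ranked down.
        total = self.upvotes + self.downvotes
        crowd = 0.5 if total == 0 else self.upvotes / total  # neutral when unvoted
        return 0.6 * self.quality_score + 0.4 * crowd

def rank_search_results(datasets):
    # Like a web search: order results by descending confidence.
    return sorted(datasets, key=lambda d: d.confidence(), reverse=True)

results = rank_search_results([
    DatasetRecord("sales_q3", quality_score=0.90, upvotes=2, downvotes=18),
    DatasetRecord("sales_q3_curated", quality_score=0.85, upvotes=30, downvotes=2),
])
print([d.name for d in results])  # 'sales_q3_curated' ranks first
```

Here the heavily downvoted dataset ends up ranked below a slightly lower-scoring but well-liked one, which is the down-ranking behavior described above.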
>> So it's a wisdom-of-the-crowd type of thing. >> It's crowdsourcing combined with the AI. >> Has that, in your experience, at all changed the dynamics of politics within organizations? In other words, I'm sure we've all been in a lot of meetings where somebody puts forth some data, and if the most senior person in the room doesn't like the data, or doesn't like the implication, he or she will attack the data source, and then the meeting's over, and it might not necessarily be the best decision for the organization. >> So I think it's maybe not the upvoting and downvoting that does that, but it's things like the EPM tool that I said we have here. You know, there is a single source of truth for our finance data. It's on the phone of everyone who needs access to it, right? When you have a conversation about how the company or the division or the business unit is performing financially, it comes from EPM, whether it's in the Cognos app, or whether it's in a separate dashboard in Cognos, or it's being fed into an AI that we're building. This is the source of truth. Similarly for product data, how our individual products perform, it comes from here. So the conversations at the senior meetings are no longer, your data is different from my data, I don't believe it. You've eliminated that conversation. This is the data, this is the only data. Now you can have a conversation about what's really important. >> An adult conversation. Okay, now what are we going to do about it? >> It's not bickering about my data versus your data. >> So what's next for you? You know, you've been pulled in a lot of different places. You started at IBM as an internal transformation change agent, and you got pulled into a lot of customer situations because, you know, the sales guys want to drag you along and help facilitate activity with clients. What's next for you?
>> So really, you know, I've only been refocused on the internal transformation for a couple of months now. So it's really about extending IBM's, our Cloud and Cognitive Software, data and AI strategy, and starting to quickly implement some of these projects. Just like I said, we're starting projects without even knowing what the prioritized list is. Intuitively, these ones are important, and the teams are going to start working on them. One of them is an AI project around the cross-sell and upsell across the portfolio that I mentioned. And the other one we just got done talking about: in the senior leadership meeting for Cloud and Cognitive Software, how do we all work from a Cognos dashboard instead of Excel, data that's been exported and put into Excel? The challenge with that is not that people don't trust the data, it's that if there's a question, you can't drill down. So if there's a question about an Excel document or a PowerPoint that's up there, you'll get the answer back next meeting, in a month or in two weeks, or we'll have an email conversation about it. If it's presented in a real live dashboard, you can drill down and you can actually answer questions in real time. The value of that is immense, because now you as a leadership team can make a decision at that point and decide what direction you're going to go, based on data. >> I said last time I had one more question. You're a CDO, but you're a polymath. So my question is, what should people look for in a chief data officer? What are the characteristics and the attributes, given your experience? >> That's kind of a loaded question, because there is no single, good job description for a chief data officer. I think there's a good, solid set of skill sets defined for a chief data officer, and actually, as part of the chief data officer summits that you guys attend,
we're having sessions with the chief data officers, kind of defining a curriculum for chief data officers with our clients, so that we can help build the chief data officer of the future. But if you look at the qualities, a chief data officer is also a chief disruption officer. So it needs to be someone who is really good at driving change, really good at disrupting processes, and good at getting people excited about it. Change is hard, people don't like change, so you need someone who can get people excited about change. That's one thing. And depending on what industry you're in, if you're in financial services or a heavily regulated industry, you want someone that understands governance. That's kind of what Gartner and other analysts call a defensive CDO, very governance focused. And then you also have some CDOs, and I fit into this bucket, which is more the offensive CDO: how do you create value from data? How do you save money? How do you create net new revenue? How do you create new business models leveraging data and AI? And now there's kind of a third type of CDO emerging, which is the CDO not as a cost center but as a P&L. How do you generate revenue for the business directly from your CDO office? >> I like that framework. >> I can't take credit for it. That's Gartner. >> It's governance, what they call defensive, and offensive. And the first time I met Inderpal, he said, look, you start with how data affects the monetization of my organization, and that means making money or saving money. Seth, thanks so much for coming on. It's great to see you again. >> Thanks for having me again. >> All right, keep it right there everybody. We'll be back at the IBM Data and AI Forum from Miami. You're watching theCUBE.
Seth Dobrin, IBM | IBM CDO Summit 2019
>> Live from San Francisco, California, it's theCUBE, covering the IBM Chief Data Officer Summit, brought to you by IBM. >> Welcome back to San Francisco everybody. You're watching theCUBE, the leader in live tech coverage. We go out to the events, we extract the signal from the noise, and we're here at the IBM Chief Data Officer Summit, 10th anniversary. Seth Dobrin is here, he's the Vice President and Chief Data Officer of the IBM Analytics Group. Seth, always a pleasure to have you on. Good to see you again. >> Yeah, thanks for having me back Dave. >> You're very welcome. So I love these events, you get a chance to interact with chief data officers, guys like yourself. We've been talking a lot today about IBM's internal transformation, how IBM itself is operationalizing AI, and maybe we can talk about that, but I'm most interested in how you're pointing that at customers. What have you learned from your internal experiences, and what are you bringing to customers? >> Yeah, so, you know, I was hired at IBM to lead part of our internal transformation, so I spent a lot of time doing that. >> Right. >> I've also, you know, when I came over to IBM I had just left Monsanto, where I led part of their transformation. So I spent the better part of the first year or so at IBM not only focusing on our internal efforts, but helping our clients transform. And out of that I found that many of our clients needed help and guidance on how to do this. And so I started a team we call the Data Science and AI Elite Team, and really what we do is we sit down with clients and we share not only our experience, but the methodology that we use internally at IBM, leveraging things like design thinking, DevOps, Agile, and how you implement that in the context of data science and AI. >> I've got a question, so Monsanto, obviously completely different business than IBM-- >> Yeah. >> But when we talk about digital transformation, and then talk about the difference between a business and a digital business, it comes down to the data. And you've seen a lot of examples where companies traverse industries, which never used to happen before. You know, Apple getting into music, there are many, many examples, and the theory is, well, it's because it's data. So when you think about your experience of coming from a completely different industry and bringing that expertise to IBM, were there similarities that you were able to draw upon, or was it a completely different experience? >> No, I think there's tons of similarities, which is part of why I was excited about this and I think IBM was excited to have me. >> Because the chances for success were quite high in your mind? >> Yeah, yeah, because the chances for success were quite high, and also, you know, if you think about it, in how you implement, how you execute, the differences are really cultural more than anything to do with the business, right? So the whole role of a Chief Data Officer, or Chief Digital Officer, or a Chief Analytics Officer, is to drive fundamental change in the business, right? So it's how do you manage that cultural change, how do you build bridges, how do you make people a little uncomfortable but at the same time get them excited about how to leverage things like data, and analytics, and AI, to change how they do business. And really this concept of a digital transformation is about moving away from traditional products and services, more towards outcome-based services, not selling things, but selling as a Service, right? And it's the same whether it's IBM, you know, moving away from fully transactional to Cloud and subscription-based offerings. Or it's a bank reimagining how they interact with their customers, or it's an oil and gas company, or it's a company like Monsanto really thinking about, how do we provide outcomes? >> But how do you make sure that every, as a Service, is not a snowflake, and that it can scale so that you can actually, you know, make it a business? >> So underneath the, as a Service, are a few things. One is data, one is machine learning and AI, and the other is really understanding your customer, right? Because truly digital companies do everything through the eyes of their customer, and every company has many, many versions of their customer until they go through an exercise of creating a single version, right, a Customer or Client 360, if you will, and we went through that exercise at IBM. And those are all very consistent things, right? They're all pieces that kind of happen the same way in every company regardless of the industry, and then you get into understanding what the desires of your customer are to do business with you differently. >> So you were talking before about the Chief Digital Officer, the Chief Data Officer, the Chief Analytics Officer as a change agent, making people feel a little bit uncomfortable. Explore that a little bit. What is that, asking them questions that intuitively they know they need to have the answer to, but they don't, through data? What did you mean by that? >> Yeah, so here's the conversation that usually happens, right? You go and you talk to your peers in the organization and you start having conversations with them about what decisions they're trying to make, right? And you're the Chief Data Officer, you're responsible for that, and inevitably the conversation goes something like this, and I'm going to paraphrase: give me the data I need to support my preconceived notions. >> (laughing) Yeah. >> Right? >> Right. >> And that's what they want to (voice covers voice). >> Here's the answer, give me the data that-- >> That's right. So I want a dashboard that helps me support this. And the uncomfortableness comes in a couple of ways. It's getting them to let go of that and allow the data to provide some inkling of things that they didn't know were going on, that's one piece. The other is, then you start leveraging machine learning, or AI, to actually help start driving some decisions, so limiting the scope from infinity down to two or three things, surfacing those two or three things, and telling people in your business, your choices are one of these three things, right? That starts to make people feel uncomfortable, and really the challenge for that cultural change is getting people used to trusting the machine, or in some instances even trusting the machine to make the decision for you, or part of the decision for you. >> That's got to be one of the biggest cultural challenges, because you've got somebody who, let's say they run a big business, it's a profitable business, it's the engine of cashflow at the company, and you're saying, well, that's not what the data says, and okay, here's a future path-- >> Yeah. >> For success, but it's going to be disruptive, there's going to be a change, and I can see people not wanting to go there. >> Yeah, and to the point about even businesses that are making the most money, or parts of a business that are making the most money: the business journals say that when you start leveraging data and AI, you get double-digit increases in your productivity and in differentiation from your competitors. That happens inside of businesses too. So the conversation, even with the parts of the business that are most profitable or contributing the most revenue, is really about what we could do better, right?
You could get better margins on the revenue you're driving, you could, you know, that's the whole point: to get better at leveraging data and AI to increase your margins, increase your revenue. And then things like moving to, as a Service, from single-point transactions, that's a whole different business model, and it takes you from getting revenue once every two or three or five years to getting revenue every month, right? That's highly profitable for companies, because you don't have to send your sales force in every time to sell something; they buy something once, and they continue to pay as long as you keep 'em happy. >> But I can see that scaring people, because if the incentives don't shift from, you know, pay all up front, right, there are so many parts of the organization that have to align with that in order for that culture to actually occur. So can you give some examples of how you've, I mean obviously you ran through that at IBM, you saw-- >> Yeah. >> I'm sure a lot of that, got a lot of learnings and then took that to clients. Maybe some examples of client successes that you've had, or even not-so-successes that you've learned from. >> Yeah, so in terms of client success, I think many of our clients are just beginning this journey, certainly the ones I work with are beginning their journey, so it's hard for me to say, client X has successfully done this. But I can certainly talk about how we've gone in, and some of the use cases we've done-- >> Great. >> With certain clients to think about how they transformed their business. So maybe the biggest bang-for-the-buck one is in the oil and gas industry. So ExxonMobil was on stage with me at Think, talking about-- >> Great. >> Some of the work that we've done with them in their upstream business, right? So every time they drop a well it costs them not thousands of dollars, but hundreds of millions of dollars. And in the oil and gas industry you're talking massive data, right, tens or hundreds of petabytes of data that constantly changes. And no one in that industry really had a data platform that could handle this dynamically. It took them months to even start to be able to make a decision. So they really wanted us to help them figure out, well, how do we build a data platform at this massive scale that enables us to make decisions more rapidly? And so the aim was really to cut this down from 90 days to less than a month. And through leveraging some of our tools, as well as some open-source technology, and teaching them new ways of working, we were able to lay down this foundation. Now this is before we've even started thinking about helping them with AI; the oil and gas industry has been doing this type of thing for decades, but they really were struggling with this platform. So that's a big success where, at least for the pilot, which was a small subset of their fields, we were able to help them reduce that timeframe by a lot, to be able to start making a decision. >> So an example of a decision might be where to drill next? >> That's exactly the decision they're trying to make. >> Because for years, in that industry, it was boop, oh, no oil, boop, oh, no oil. >> Yeah, well. >> And they got more sophisticated, they started to use data, but I think what you're saying is, the time it took for that analysis was quite long. >> So the time it took to even overlay things like seismic data, topography data, what's happened in wells, and cores as they've drilled around that, was really protracted, just to pull the data together, right? And then once they got the data together there were some really, really smart people looking at it going, well, my experience says here, and it was informed by the data, but it was not driven by an algorithm. >> A little bit of art. >> True, a lot of art, right, and it still is. So now they want some AI, or some machine learning, to help guide those geophysicists, to help determine where, based on the data, they should be dropping wells. And these are hundred-million and billion-dollar decisions they're making, so it's really about how do we help them. >> And that's just one example, I mean-- >> Yeah. >> Every industry has its own use cases, or-- >> Yeah, and so that's on the front end, right, about the data foundation. And then if you go to a company that was really advanced in leveraging analytics, or machine learning, JPMorgan Chase, they have a division, and they were also on stage with me at Think, where basically everything is driven by a model, so they give traders a series of models and they make decisions. And now they need to monitor those models, those hundreds of models they have, for misuse of those models, right? And so they needed to build a series of models to monitor their models. >> Right. >> And this was a tremendous deep-learning use case, and they had just bought a PowerAI box from us, so they wanted to start leveraging GPUs. And we really helped them figure out how you navigate, and what's the difference between building a model leveraging GPUs compared to CPUs? How do you use it to accelerate the output? And again, this was really a cost-avoidance play, because if people misuse these models they can get in a lot of trouble. But they also need to make these decisions very quickly, because when a trader goes to make a trade, they need to make a decision, was this model used properly or not, before that trade is kicked off, and milliseconds make a difference in the stock market, so they needed a model. And one of the things about, you know, when you start leveraging GPUs and deep learning is sometimes you need these GPUs to do training, and sometimes you need 'em to do training and scoring. And this was a case where you need to also build a pipeline that can leverage the GPUs for scoring as well, which is actually quite complicated and not as straightforward as you might think. In near real time, in real time. >> Pretty close to real time. >> You can't get much more real time than those things, potentially to stop a trade before it occurs to protect the firm. >> Yeah. >> Right, or RELug it. >> Yeah, and don't quote, I think this is right, I think they actually don't do trades until it's confirmed and so-- >> Right. >> Or that's the desire, as to not (voice covers voice). >> Well, and then now you're in a competitive situation where, you know. >> Yeah, I mean people put these trading floors as close to the stock exchange as they can-- >> Physically. >> Physically to (voice covers voice)-- >> To the speed of light, right? >> Right, so every millisecond counts. >> Yeah, read Flash Boys-- >> Right, yeah. >> So, what's the biggest challenge you're finding, both at IBM and in your clients, in terms of operationalizing AI? Is it technology? Is it culture? Is it process? Is it-- >> Yeah, so culture is always hard, but I think as we start getting to really think about integrating AI and data into our operations, right? Look at what software development did with this whole concept of DevOps, really rapidly iterating but getting things into a production-ready pipeline, looking at continuous integration and continuous delivery. What does that mean for data and AI? Hence these concepts of DataOps and AIOps, right? And I think DataOps is very similar to DevOps in that things don't change that rapidly, right? You build your data pipeline, you build your data assets, you integrate them. They may change on a weeks or months timeframe, but they're not changing on an hours or days timeframe.
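Where DataOps assets change on a weeks-or-months cadence, the AIOps side often needs an automated check to know when a model has drifted and retraining should kick off. A hedged sketch of one common approach (the PSI metric and the 0.2 threshold are industry rules of thumb and illustrative assumptions, not IBM's tooling):

```python
# Hedged sketch: detecting that a model's input data has drifted so that a
# retraining job should be triggered. The PSI metric and the 0.2 threshold
# are common rules of thumb, not IBM's implementation.

import math

def psi(expected, actual, eps=1e-6):
    # Population Stability Index between two binned distributions
    # (each a list of proportions summing to 1). Higher PSI = more drift.
    return sum((a - e) * math.log((a + eps) / (e + eps))
               for e, a in zip(expected, actual))

def needs_retraining(training_dist, live_dist, threshold=0.2):
    # Rule of thumb: PSI above ~0.2 usually signals meaningful drift.
    return psi(training_dist, live_dist) > threshold

# Feature distribution captured at training time vs. today's scored traffic.
training = [0.25, 0.25, 0.25, 0.25]
today = [0.10, 0.15, 0.30, 0.45]

if needs_retraining(training, today):
    print("drift detected: trigger retraining job")
```

A check like this, run on each batch of scored traffic, is one way a CI/CD pipeline can decide automatically that a model has fallen out of its parameters.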
As you get into some of these AI models some of them need to be retrained within a day, right, because the data changes, they fall out of parameters, or the parameters are very narrow and you need to keep 'em in there, what does that mean? How do you integrate this for your, into your CI/CD pipeline? How do you know when you need to do regression testing on the whole thing again? Does your data science and AI pipeline even allow for you to integrate into your current CI/CD pipeline? So this is actually an IBM-wide effort that my team is leading to start thinking about, how do we incorporate what we're doing into people's CI/CD pipeline so we can enable AIOps, if you will, or MLOps, and really, really IBM is the only company that's positioned to do that for so many reasons. One is, we're the only one with an end-to-end toolchain. So we do everything from data, feature development, feature engineering, generating models, whether selecting models, whether it's auto AI, or hand coding or visual modeling into things like trust and transparency. And so we're the only one with that entire toolchain. Secondly, we've got IBM research, we've got decades of industry experience, we've got our IBM Services Organization, all of us have been tackling with this with large enterprises so we're uniquely positioned to really be able to tackle this in a very enterprised-grade manner. >> Well, and the leverage that you can get within IBM and for your customers. >> And leveraging our clients, right? >> It's off the charts. >> We have six clients that are our most advanced clients that are working with us on this so it's not just us in a box, it's us with our clients working on this. >> So what are you hoping to have happen today? We're just about to get started with the keynotes. >> Yeah. >> We're going to take a break and then come back after the keynotes and we've got some great guests, but what are you hoping to get out of today? 
>> Yeah, so I've been with IBM for 2 1/2 years, and this is my eighth CEO Summit, so I've been to many more of these than I've been at IBM. And I went to these religiously before I joined IBM, really for two reasons. One, there's no sales pitch, right, it's not a trade show. The second is it's the only place where I get the opportunity to listen to my peers and really have open and candid conversations about the challenges they're facing and how they're addressing them, and really giving me insights into what other industries are doing and being able to benchmark me and my organization against the leading edge of what's going on in this space. >> I love it, and that's why I love coming to these events. It's practitioners talking to practitioners. Seth Dobrin, thanks so much for coming to theCUBE. >> Yeah, thanks as always, Dave. >> Always a pleasure. All right, keep it right there everybody, we'll be right back right after this short break. You're watching theCUBE, live from San Francisco. Be right back.
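The retraining question Seth raises above, how do you know a model has fallen "out of parameters" and needs to go back through the pipeline, can be sketched as a simple drift check that a CI/CD job might run on a schedule. This is a hypothetical illustration only, not IBM's actual toolchain: the feature name, data values, and z-score threshold are all made up for the example.

```python
# Hypothetical sketch of a drift check: compare live scoring data
# against the training baseline, and flag the model for retraining
# when a feature's live mean drifts outside its allowed band.
from statistics import mean, stdev

def needs_retraining(baseline, live, z_threshold=3.0):
    """Return True if any live feature has drifted too far from the
    training baseline (a crude per-feature z-score test)."""
    for name in baseline:
        mu, sigma = mean(baseline[name]), stdev(baseline[name])
        live_mu = mean(live[name])
        if sigma > 0 and abs(live_mu - mu) / sigma > z_threshold:
            return True  # out of parameters -> kick off a retraining job
    return False

baseline = {"trade_size": [10, 12, 11, 9, 10, 13]}   # training-time data
drifted  = {"trade_size": [48, 52, 50, 47, 51, 49]}  # live data, shifted
print(needs_retraining(baseline, drifted))  # True
```

A real pipeline would use a proper statistical test and pull both datasets from a feature store, but the shape is the same: a boolean gate in the CI/CD pipeline that decides whether regression testing and retraining need to run again.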
Seth Ravin, Rimini Street | CUBE Conversation, December 2018
(inspiring music) >> Hey, welcome back everybody, Jeff Rick here with theCUBE. We're in our Palo Alto studios for a Cube Conversation. 2018 is winding down, I think we're at our last big show of the week, this week at KubeCon. It's always nice to get back into the studio, things are a little bit calmer, a little bit less hectic, and learn about new businesses, new companies. So we're excited to have, I think first time to theCUBE, Seth Ravin, he is the CEO and co-founder of Rimini Street. Seth, great to see you. >> Thank you very much, good to be here. >> Yeah, welcome. So for the folks that aren't familiar with Rimini Street, give us a quick overview. >> Sure, Rimini Street is a 13-year-old company. We went public last year on the NASDAQ, RMNI. >> Congratulations. >> Thank you. We have 1,100 people or so, operating in 18 countries, and we're servicing nearly 2,700 companies that have moved to and use Rimini Street services around the world, including about 150 of the Fortune 500.
>> Well what they get is, they would get upgrades, they would get updates, which includes tax, legal, and reg updates, which everybody needs when you're running a global company, for, whether it's payroll taxes or financial taxes. You also need, when things break, you need to get them fixed. You also need advice and counsel in these very complex, large systems. And Rimini Street comes in, and what we don't do is we don't offer upgrades. We don't offer new versions of the product. What we do do is extend the life of these existing products for 15 to 20 years, beyond what the vendor would consider their normal support life. >> Right. So there's a whole bunch of things that work into that. One of 'em is, they want it to be supported, or they want you to have the latest patches and stuff so that they'll continue to support them. You've basically, per our earlier conversation, just basically taken over that whole responsibility, so not only the software patches and changes, but also then the support on top of that. >> That's right. In fact, we have support with less than five minute turnaround time, with a senior engineer, 24 by seven. So we've really offered a concierge level of service at half the annual fees. So our customers can save up to 90% total operating costs on these large, complex systems. >> So it's pretty, (laughs) it's kind of hard to grasp at first, but I think you gave a great analogy before we went on air, which is kind of like getting your car fixed. Take it to the dealer, or take it to Bob's Mechanic. And if you can get the quality of service, customer service, same parts, it's actually for a lot of people a better alternative. And that's kind of what you're doing, right? >> That's right. Think of it as, you could take your car to the dealer, you could take it to your local mechanic, who you might think is better at fixing that system. 
We have hundreds of engineers around the world, and we think that we are very very good at fixing these core systems and providing the updates that are needed to keep them moving forward. And I think one of the other parts that's really important here, is this is a difference between two different directions that every single licensee of someone like an Oracle or SAP product has to make a decision on. Number one, do you go down the vendor's road map, which includes, if you follow their road map, that's upgrades and updates and costs that are very expensive, that are designed around what the vendor wants to do, and the vendor's needs. Or, you choose to go down a business-driven road map, which is the focus around the company, and focus around competitive advantage and growth. And we are the company that works on the business-driven road maps. >> And you made a good point earlier, 'cause we talked about updates on our mobile phones, and DevOps, and we're at KubeCon, it's all about DevOps, and patches are coming out all the time, and updates are coming all the time. But the systems you're talking about are big, nasty, hairy ERP systems. These are not things that you want to be changing all the time, and in fact for a lot of cases, you probably don't, I would imagine the biggest value, one of the biggest values for your customers is extending that lifetime of that current install, and continuing to get the support which is threatened if they don't continue to pay the tax to the big red machine. >> That's correct. Instead of paying a 20% annual fee just for the maintenance on it, they can focus at saving half of that money, putting it into new innovation into their environment. And the kind of changes you're talking about, they're systems of engagement. So on the front end, where we interface with customers, and vendors, on the front end that's constantly changing, that's a dynamic system. 
On the back end where we work, these are big core transaction systems where change introduces risk into the system. We want to run these systems for a long time. We don't have competitive advantage on the back end of our financial system. Competitive advantage is done on the front end, where we compete against other people in the industry. >> So how'd you come up with the idea? I mean, it seems so obvious in hindsight, again, with the car repair analogy, which is just dead dumb simple. But what did you see, you were in the business, and what was your kind of experience in kind of the other side that got you to think, hmm, here's an opportunity that I think a lot of people would like to take advantage of? >> So I was part of the management team building out PeopleSoft for many years, and I was in a business where I was part of the team that had to try and force customers to take these upgrades. That was one of my jobs, was to move people forward onto these new releases. And I had an epiphany one day, that said, really, I am tired of selling people things they don't want, and let's focus on selling people what they really need. And this is a function of the maturity of the products that we deal with. They are so mature that they don't need to be changed out that frequently. So we want to move away from what the vendor wants, and we want to focus in on what is right for the company to allow them to shift more spending into these systems of innovation, that they have to do because the CIO world, IT is changing. The mission of IT is to support competitive advantage and growth now. It's not just to run a data center. >> Right, and as you said, taking a patch is not just like a quick update on your phone. You got to bring systems down, you've got labor components, you've got, again, complex APIs and connections that have to be managed, so, so these are pretty disruptive processes that people had to do, you had no choice if you wanted to keep your support active, right? 
You had to do it. >> And thousands of Rimini Street customers don't have to worry about all of those risks being introduced into their environment. And when things do need to be changed, proactively, like a tax, legal, and regulatory update, they get those. And if they need support, they have a very fast turnaround with an assigned engineer. So we've really changed the dynamics of the support model into one that people rave about, because it works very well compared to your typical call center model. We have no call center. So, our customers call their engineers directly, which allows them to get support from senior people very quickly. >> Right. So the other part you touched on, is then that frees up the CIO, and the inside team, to worry about front end innovation, to worry about some of these other more dynamic processes, where you do have to be a little more active, you do have to be on the cutting edge, you do have to be more responsive to competitive threats, which is not an ERP upgrade, but it's a new app, it's a new, you know, whatever. Versus (laughs) the unplanned, unwanted, and unanticipated forced process on a back end that you didn't even, maybe, want to, or see the benefit of. >> That's correct, and CIOs have to decide how they're going to distribute their budget between, what we say, keeping the lights on, a day to day operating cost, and then how much they're going to spend in innovation. And many customers wind up spending, just like a federal budget deficit issue, they spend 90% of their budget keeping the lights on, paying maintenance bills, running a data center, and that leaves very little money for innovation, which they need for that competitive advantage and growth. We are helping them shift money from the side of keeping the lights on, into innovation. >> So where do you guys go next? 
Is it just more of the same, the big giant TAM, obviously a whole lot of Oracle, SAP, and other enterprise applications, is that really your mission going forward, freeing up people to do the more innovative and creative, you're basically kind of offloading a big headache. >> Sure, but I think what you're going to watch is we expand the services that we cover. Today, we replace the vendor's maintenance. Tomorrow, we may do more of the work inside the IT organization. All support, but expanding the definition of support so that we can provide freed-up capital, time, and resources, to focus on innovation. As you know, in today's world, you're either growing or you're dyin'. There is no status quo left in this world. It's too competitive. And so we are helping companies make sure they keep their competitive edge, and gain new ones. >> Well it's a great story, and now that you're public, we can all watch it unfold and it looks like you've paid off a bunch of debt recently, I was goin' through some of the financial information, so congratulations, and, really interesting model. I know I use my local car repairman Bob. As long as he keeps deliverin', I'll keep goin' back, and so I imagine once you get seated in, it's probably a good long term relationship. >> Yes, that's the thing, it's a recurring revenue business. We're a subscription, just like a SAS business, only we're subscription revenue on maintenance. >> Alright, well Seth, thanks for taking a few minutes of your day and sharin' your story. >> Thank you much. >> Alright, he's Seth, I'm Jeff, you're watchin' theCUBE, we're at our Palo Alto studio havin' Cube Conversations. Thanks for watchin', we'll see you next time. (inspiring music)
Seth Morrell, Hub International & Jeremy Embalabala, Hub International | AWS re:Invent 2018
>> Live from Las Vegas, it's theCUBE, covering AWS re:Invent 2018, brought to you by Amazon Web Services, Intel, and their ecosystem partners. >> And welcome back here to Las Vegas. We're in the Sands expo, we're in Hall D. If you happen to be at the show or dropping in just to watch, come on by and say hi to us. Love to see you here on theCUBE, as we continue our coverage, day two. And along with Justin Warren, I'm John Walls. And now we're joined by a couple of gents from HUB International, Seth Morrell, who's the vice president of enterprise, architecture and design. Seth, good morning to you. >> Good morning. >> And Jeremy Embalabala, who is the director of security architecture and engineering, also at HUB International. Good morning, Jeremy. >> Good morning. >> Seth, by the way, playing hurt, broken finger with a snowblower in Chicago on Monday. >> On Monday. >> Yeah, good luck though with the winter. >> Yeah, yeah, yeah, it started off well. >> Sorry to see that, but thanks for coming regardless. >> No problem. >> All right, tell us about HUB International a little bit, about primary mission and then the two of you, what you're doing for them primarily. >> Right, right, so HUB International is an insurance brokerage. Personal, commercial, we do employee benefits, retirement as well. We're based in the US in Chicago, operate in US and Canada. 500 plus locations, 12,000 employees. >> Okay, and then primary responsibilities between the two of you? >> Well, I'm the director of security architecture. I'm responsible for all things technical with regards to security, both on the architecture side, engineering and operations. >> All right, so yesterday we were talking about this early, you did a session, you're big Splunk guys, right? So let's talk about what you're doing with that, how that's working for you in general, if you would. >> Yeah, yeah, go ahead. 
Yeah, we're using Splunk Enterprise Security, the on-premise version. Actually, people always ask me, are you using Splunk Cloud or Splunk On Prem? And I always joke, well, we're using Splunk On Prem in the cloud, in AWS. But for us, we're really focused on Splunk as a SIEM, to enable our security operations center to provide insights into our environment and help us detect and understand threats that are going on in the environment. So we have a managed partner that runs our security operations center for us. They also manage our Splunk environment. It helps us keep an eye on both our AWS environment that we have, our Azure environment, and our on-premise data center as well. 
>> The biggest thing for us is the aggregation of all of our logs, our data sources in AWS, data sources on prem, our Windows file servers, our network traffic flow data, all of that's aggregated into Splunk. And that allows us to do some correlation with third-party threat intelligence feeds. Take indicators of compromise that are streamed, that are observed out there in the real world, and apply those to data that we're seeing on our actual data sources in our environment. It allows us to detect threats that we wouldn't have been able to detect otherwise. >> Right, how does that translate through to what you're actually doing as a business? I mean, this is a very sort of technology-centric thing, but you're an insurance agent. So how does this investment in security translate into the business value? >> One, it just gives us visibility into the environment, and we can proactively identify potential threats and remediate them before they actually cause an impact to the business. Without these tools and without these capabilities, it'd be a much riskier endeavor. And so it's helped us throughout, and we've been good partners with Splunk, they're been good partners with us. And coupled with all the other things that we're doing in the security space and in the cloud space, we're able to build a nice secure environment for our customers and ourselves. >> We're also a very highly regulated industry, so we have regulations that we have to comply with for security. And our customers also care about security very, very deeply. So it allows us to be able to protect our customers' data and really assure our customers that their data is safe with us, whether that data is hosted on-prem or it's in the cloud. >> What about that battle? There's often a battle between private enterprise and regulation, just in general, right? It's making sure the policy makers understand capabilities and real threats as opposed to maybe perceptions or whatever. 
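The correlation Jeremy describes, matching observed events from aggregated data sources against streamed indicators of compromise, can be sketched roughly as follows. This is a hypothetical Python illustration of the idea, not HUB's Splunk configuration: the IP addresses, field names, and feed are all made up for the example.

```python
# Hypothetical sketch of SIEM-style correlation: check aggregated
# events (e.g. network flow logs) against a third-party threat-intel
# feed of indicators of compromise (IOCs).
ioc_feed = {"198.51.100.7", "203.0.113.9"}  # example IOC IPs from a feed

events = [
    {"src": "10.0.0.5", "dst": "198.51.100.7", "action": "allow"},
    {"src": "10.0.0.8", "dst": "93.184.216.34", "action": "allow"},
]

def correlate(events, iocs):
    """Return the events whose destination matches a known indicator."""
    return [e for e in events if e["dst"] in iocs]

for hit in correlate(events, ioc_feed):
    print("ALERT:", hit["src"], "->", hit["dst"])
```

In a real SIEM this runs as a scheduled correlation search over many data sources at once, which is exactly why aggregating the logs in one place matters: an indicator observed "out there in the real world" can be applied to every source you collect.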
What do you see in terms of the federal regulatory environment and what you deal with in a Balkanized system where you're dealing with 50 states and Canada? So you've got your hands full, I assume. >> So at HUB, we view security and compliance a little differently. Instead of trying to build security programs and achieve compliance by abiding by all the regulations, we do the right thing from a security perspective. We make the right investments. We put the right controls into our environment. When those new regulations come out, for provincial law in Canada or different states or GDPR in Europe, we'll be 95% of the way there, by just building the right controls into our environment at a foundational level. Then we have to just spend our efforts kind of aligning ourselves with the other 5% that vary from regulation to regulation. >> Was that a shift in management philosophy at all? Because quite often or maybe in the past, it's like, I'm only going to do something, I'm not saying HUB, but in general, when I have to. As opposed to you, who appear to be preemptive. Right, you're doing things because you should. So there's a different mindset there, right? >> It sounds like a much more strategic view of security rather than a tactical, reactive kind of security. How long has that been the philosophy at HUB? >> So we really built out our security program starting at the beginning of last year. There's all new leadership that came in, Seth came in, myself came in, all new leadership across the organization. And that's really where that mindset came from. And the need and recognition to make an investment in security. We view security as a driver of business, not just a cost center. It's a way we can add to the bottom line and be able to generate revenue for the business by being able to show our customers that we really care about their data, and we're going to do our best to take care of them. 
So with that mindset, we can actually help market, and use that as a marketing tool to be able to help drive business. >> So what are some of the things that you've seen here at the show that you're thinking about, well actually that will support my strategy? Some of the more longer term things. Is there anything that's sort of stuck out to you as sort of going, ooh, that's something that we should actually take back? >> Yeah, well, there's some tactical announcements that are very important to us. The announcement of Windows File Server support. File Server support is big deal for us. We're a heavy File Server organization. And having that native within AWS is very interesting. There's been some other announcements with SFTP. Other items that we're going to be trying to take advantage of in a fairly quick fashion. And we're excited about that. We've been on our journey to cloud since essentially the summer of 2017 through now. And we're kind of ready for the next steps, the next set of capabilities. And so, a conference like this and all these announcements, we're excited to take a look at the menu and start picking out what we want to eat. >> It's a great buffet. >> Yeah, yeah. >> In a city that's famous for it. >> That's true, that's true. >> All you can eat. >> Yeah. >> All right, so let's talk about the journey then. You said 2017, so it's been a year, year plus into that. And you're excited about what's coming, but what do you need? So I know you got this great buffet that you're looking at, but maybe you don't want the pork. Maybe you want the turkey. What do you need, what do you want the most, you think, to service your clients? >> Right, so, we spent most of our migration just essentially moving what we had over to the cloud. 
And so, what our next steps are, let's really understand our workloads, let's be smarter about how we're running them, let's take advantage of the appropriate technology, the menu items that are out there, per work load, just to be smarter. We're going to be spending much more time this year looking at more automation, orchestration, and basically maturing our cloud capabilities so that we're ready for the next big thing. And as we acquire another company or there's a new business need, we're working to be more proactive and being able to anticipate those needs with building a platform that we can really extend and build upon. >> I'm sorry, go ahead. >> I have a question on the choosing of workloads then. So are you going to be moving everything to the cloud? Or do you think that there'll be some things that will actually remain on-prem or is it going to be a hybrid cloud? >> Our goal is to go from a data center to a network closet. >> Right. >> So we have moved almost all of our application workloads out of our data center right now. We have a large VDI environment we're looking to move as well. Once that's done, we'll be down to our phone system and a couple other legacy applications that we're trying to determine what we actually want to do with strategically. >> Right, okay. That's a pretty common sort of story. There's a lot of people who are moving as much as they possibly can, and then there's a few little bits that just sort of sit there that you need to decide, do we rewrite this, do we actually need this at all, maybe we just turn it off. >> Right. >> Yeah. >> Are there any capabilities specific to your industry that you need or that you'd like to have refined? Something that would allow you to do your job, specifically in the insurance space, that would be unique to you? Anything floating out there that you say, if we had that, that'll fine-tune this to a better degree or a greater degree? >> So for us, it's all about flexibility. 
We grow very, very rapidly through our mergers and acquisitions. We bought 52 companies last year and we're on pace to do almost 70 companies this year. So for us, the cloud really enables us to be able to absorb those organizations that we acquire, bring them in much, much faster. Part of the story of our cloud migration, we were able to move the integration time for mergers and acquisitions from six months down to under 90 days. Because we're now able to move those workloads in much, much quicker with the clouds. For us that's really a key capability. >> Well you guys are used to writing checks, dinner's on them tonight, right? >> Definitely. >> Seth, Jeremy, thanks for being with us. >> Thank you. >> Glad to be here. >> We appreciate the time. Good luck with the winter, I think you might need it. >> Yeah, yeah, exactly. >> All right, we'll be back with more from AWS re:Invent. You're watching theCUBE from Las Vegas. (snappy techno music)
Sreesha Rao, Niagara Bottling & Seth Dobrin, IBM | Change The Game: Winning With AI 2018
>> Live, from Times Square, in New York City, it's theCUBE covering IBM's Change the Game: Winning with AI. Brought to you by IBM. >> Welcome back to the Big Apple, everybody. I'm Dave Vellante, and you're watching theCUBE, the leader in live tech coverage, and we're here covering a special presentation of IBM's Change the Game: Winning with AI. IBM's got an analyst event going on here at the Westin today in the theater district. They've got 50-60 analysts here. They've got a partner summit going on, and then tonight, at Terminal 5 off the West Side Highway, they've got a customer event, a lot of customers there. We've talked earlier today about the hard news. Seth Dobrin is here. He's the Chief Data Officer of IBM Analytics, and he's joined by Sreesha Rao, who is the Senior Manager of IT Applications at California-based Niagara Bottling. Gentlemen, welcome to theCUBE. Thanks so much for coming on. >> Thank you, Dave. >> Well, thanks, Dave, for having us. >> Yes, always a pleasure, Seth. We've known each other for a while now. I think we met in that snowstorm in Boston; that sparked something a couple years ago. >> Yep. When we were both trapped there. >> Yep, and at that time, we spent a lot of time talking about your internal role as the Chief Data Officer, working closely with Inderpal Bhandari, and what you guys are doing inside of IBM. I want to talk a little bit more about your other half, which is working with clients and the Data Science Elite Team. We'll get into what you're doing with Niagara Bottling, but let's start there. In terms of that side of your role, give us the update. >> Yeah, like you said, we spent a lot of time talking about how IBM is implementing the CDO role.
While we were doing that internally, I spent quite a bit of time flying around the world, talking to our clients over the last 18 months since I joined IBM, and we found a consistent theme with all the clients, in that they needed help learning how to implement data science, AI, machine learning, whatever you want to call it, in their enterprise. There's a fundamental difference between doing these things at a university or as part of a Kaggle competition and doing them in an enterprise, so we felt really strongly that it was important for the future of IBM that all of our clients become successful at it, because what we don't want is, in two years, for them to go "Oh my God, this whole data science thing was a scam. We haven't made any money from it." And it's not because the data science thing is a scam. It's because the way they're doing it is not conducive to business. So we set up this team we call the Data Science Elite Team, and what this team does is sit with clients around a specific use case for 30, 60, or 90 days; it's really about 3 or 4 sprints, depending on the material, the client, and how long it takes, and we help them learn, through this use case, how to use Python, R, and Scala in our platform obviously, because we're here to make money too, to implement these projects in their enterprise. Now, because it's all built on open source, if they're not happy with what the product looks like, they can take their toys and go home afterwards. It's on us to prove the value as part of this, but there's a key point here. My team is not measured on sales. They're measured on adoption of AI in the enterprise, and so it creates a different behavior for them. So they're really about "Make the enterprise successful," right, not "Sell this software." >> Yeah, compensation drives behavior. >> Yeah, yeah. >> So, at this point, I ask, "Well, do you have any examples?" so Sreesha, let's turn to you.
(laughing softly) Niagara Bottling -- >> As a matter of fact, Dave, we do. (laughing) >> Yeah, so you're not a bank with a trillion dollars in assets under management. Tell us about Niagara Bottling and your role. >> Well, Niagara Bottling is the biggest private label bottled water manufacturing company in the U.S. We make bottled water for Costcos, Walmarts, major national grocery retailers. These are our customers whom we service, and as with all large customers, they're demanding, and we provide bottled water at relatively low cost and high quality. >> Yeah, so I used to have a CIO consultancy. We worked with every CIO up and down the East Coast, so I really got into a lot of organizations, and I always observed that it was really the heads of Applications that drove AI, because they were the glue between the business and IT, and that's really where you sit in the organization, right? >> Yes. My role is to support the business and business analytics, and I also support some of the distribution technologies and planning technologies at Niagara Bottling. >> So take us through the project if you will. What were the drivers? What were the outcomes you envisioned? And we can kind of go through the case study. >> So the current project that we leveraged IBM's help on was a stretch wrapper project. We produce, obviously, cases of bottled water. These are stacked into pallets and then shrink wrapped or stretch wrapped with a stretch wrapper, and this project is to be able to save money by trying to optimize the amount of stretch wrap that goes around a pallet. We need to be able to maintain the structural stability of the pallet while it's transported from the manufacturing location to our customer's location, where it's unwrapped and then the cases are used. >> And over breakfast we were talking. You guys produce 2,833 bottles of water per second. >> Wow. (everyone laughs) >> It's enormous. The manufacturing line is a high speed manufacturing line, and we have a lights-out policy where everything runs in an automated fashion, with raw materials coming in from one end and the finished goods, pallets of water, going out the other. It's called pellets to pallets: pellets of plastic coming in through one end and pallets of water going out through the other end. >> Are you sitting on top of an aquifer? Or are you guys using sort of some other techniques? >> Yes, in fact, we do bore wells and extract water from the aquifer. >> Okay, so the goal was to minimize the amount of material that you used but maintain its stability? Is that right? >> Yes, during transportation, yes. So if we use too much plastic, we're not optimal; I mean, we're wasting material, and cost goes up. We produce almost 16 million pallets of water every single year, so that's a lot of shrink wrap that goes around those, so what we can save, maybe 15-20% of shrink wrap costs, will amount to quite a bit. >> So, how does machine learning fit into all of this? >> So, machine learning is a way to understand, if we can measure what is happening as we wrap the pallets, whether we are wrapping too tight or stretching too much, and whether that results in either a conservative way of wrapping the pallets or an aggressive way of wrapping the pallets. >> I.e. too much material, right? >> Too much material is conservative, and aggressive is too little material, and so we can achieve some savings if we were to alternate between the profiles. >> So, too little material means you lose product, right? >> Yes, and there's a risk of breakage, so essentially, while the pallet is being wrapped, if you are stretching it too much there's a breakage, and then it interrupts production, so we want to try and avoid that. We want continuous production; at the same time, we want the pallet to be stable while saving material costs.
>> Okay, so you're trying to find that ideal balance, and how much variability is in there? Is it a function of distance and how many touches it has? Maybe you can share that. >> Yes, so each pallet takes about 16-18 wraps of the stretch wrapper going around it, and that's how much material is laid out. About 250 grams of plastic goes on there. So we're trying to optimize the gram weight, which is the amount of plastic that goes around each pallet. >> So it's about predicting how much plastic is enough without having breakage and disrupting your line. So they had labeled data that was, "If we stretch it this much, it breaks. If we don't stretch it this much, it doesn't break," but then it was about predicting what's good enough, avoiding both of those extremes, right? >> Yes. >> So it's a truly predictive and iterative model that we've built with them. >> And you're obviously injecting data in terms of the trip to the store as well, right? You're taking that into consideration in the model, right? >> Yeah, that's mainly to make sure that the pallets are stable during transportation. >> Right. >> And that has already determined how much containment force is required when you stretch and wrap each pallet. So that's one of the variables that is measured, but the inputs and outputs are-- the input is the amount of material that is being used in terms of gram weight. We are trying to minimize that. So that's what the whole machine learning exercise was. >> And the data comes from where? Is it observation, maybe instrumented? >> Yeah, the instruments. Our stretch-wrapper machines run on Ignition, which is a SCADA platform that allows us to measure all of these variables. We would be able to get machine variable information from those machines and then be able to, hopefully, one day automate that process, so the feedback loop that says, "On this profile, we've not had any breaks. We can continue," or, if there have been frequent breaks on a certain profile or machine setting, then we can change that dynamically as the product is moving through the manufacturing process. >> Yeah, so think of it as a kind of traditional manufacturing production line optimization and prediction problem, right? It's minimizing waste while maximizing the output and throughput of the production line. When you optimize a production line, the first step is to predict what's going to go wrong, and then the next step would be to include prescriptive optimization to say, "Using the constraints that the predictive models give us, how do we maximize the output of the production line?" This is not a unique situation. It's a unique material that we haven't really worked with, but they had some really good data on this material and how it behaves, and that's key. As you know, Dave, and probably most of the people watching this know, labeled data is the hardest part of doing machine learning, and building those features from that labeled data, and they had some great data for us to start with. >> Okay, so you're collecting data at the edge essentially, then you're using that to feed the models, which are running, I don't know, where's it running, your data center? Your cloud? >> Yeah, in our data center, there's an instance of DSX Local. >> Okay. >> That we stood up. Most of the data is running through that. We build the models there. And then our goal is to be able to deploy to the edge, where we can complete the loop in terms of the feedback that happens. >> And iterate. (Sreesha nods) >> And DSX Local is Data Science Experience Local? >> Yes. >> Slash Watson Studio, so they're the same thing. >> Okay now, what role did IBM and the Data Science Elite Team play? You could take us through that. >> So, as we discussed earlier, adopting data science is not that easy. It requires subject matter expertise.
It requires understanding of data science itself, the tools and techniques, and IBM brought that as a part of the Data Science Elite Team. They brought both the tools and the expertise so that we could get on that journey towards AI. >> And it's not a "do the work for them." It's a "teach them to fish," and so my team sat side by side with the Niagara Bottling team, and we walked them through the process, so it's not a consulting engagement in the traditional sense. It's, how do we help them learn how to do it? So it's side by side with their team. Our team sat there and walked them through it. >> For how many weeks? >> We've had about two sprints already, and we're entering the third sprint. It's been about 30-45 days between sprints. >> And you have your own data science team. >> Yes. Our team is coming up to speed using this project. They've been trained, but they needed help from people who have done this, been there, and have handled some of the challenges of modeling and data science. >> So it accelerates that time to --- >> Value. >> Outcome and value, and there's a knowledge transfer component -- >> Yes, absolutely. >> It's occurring now, and I guess it's ongoing, right? >> Yes. The engagement is unique in the sense that IBM's team came to our factory and understood what the stretch-wrap process looks like, so they had an understanding of the physical process and how it's modeled with the help of the variables, and they understood the data science modeling piece as well. Once they know both sides of the equation, they can help put the physical problem and the digital equivalent together, and then be able to correlate why things are happening with the appropriate data that supports the behavior. >> Yeah, and within the constraints of the one use case and up to 90 days, there's no charge for those. Like I said, it's paramount that our clients like Niagara know how to do this successfully in their enterprise. >> It's a freebie? >> No, it's no charge. Free makes it sound too cheap. (everybody laughs) >> But it's part of, obviously, a broader arrangement with buying hardware and software, or whatever it is. >> Yeah, it's a strategy for us to help make sure our clients are successful, and I want to minimize the activation energy to do that, so there's no charge, and the only requirements from the client are that it's a real use case, that they at least match the resources I put on the ground, and that they sit with us, do things like this, act as a reference, and talk about the team, our offerings, and their experiences. >> So you've got to have skin in the game, obviously, an IBM customer. There's got to be some commitment for some kind of business relationship. How big was the collective team for each, if you will? >> So IBM had 2-3 data scientists. (Dave takes notes) Niagara matched that, 2-3 analysts. There were some working with the machines who were familiar with the machines, and others who were more familiar with the data acquisition and data modeling. >> So each of these engagements, they cost us about $250,000 all in, so they're quite an investment we're making in our clients. >> I bet. I mean, 2-3 of them over many, many weeks of super-geek time. So you're bringing in hardcore data scientists, math wizzes, stats wizzes, data hackers, developers --- >> Data viz people, yeah, the whole stack. >> And the level of skills that Niagara has? >> We've got actual employees who are responsible for production, our manufacturing analysts, who help aid in troubleshooting problems. If there are breakages, they go analyze why that's happening. Now they have data to tell them what to do about it, and that's the whole journey that we are on: trying to quantify with the help of data, and being able to connect our systems with data, systems and models that help us analyze what happened, why it happened, and what to do before it happens. >> Your team must love this because they're sort of elevating their skills.
They're working with rock star data scientists. >> Yes. >> And we've talked about this before. A point that was made here is that it's really important in these projects to have people acting as product owners, if you will, subject matter experts that are on the front line, that do this every day, and not just for the subject matter expertise. I'm sure there are executives that understand it, but when you're done with the model, bringing it to the floor and talking to their peers about it, there's no better way to drive this cultural change of adopting these things than having one of your peers that you respect talk about it, instead of some guy or lady sitting up in the ivory tower saying "thou shalt." >> Now, you don't know the outcome yet. It's still early days, but you've got a model built that you've got confidence in, and then you can iterate that model. What's your expectation for the outcome? >> We're hoping that preliminary results help us get up the learning curve of data science and how to leverage data to be able to make decisions. So that's our idea. There are obviously optimal settings that we can use, but it's going to be a trial and error process. And through that, as we collect data, we can understand what settings are optimal and what we should be using in each of the plants. And if the plants decide, hey, they have a subjective preference for one profile versus another, with the data we are capturing we can measure when they deviated from what we specified. We have a lot of learning coming from the approach that we're taking. You can't control things if you don't measure them first. >> Well, your objectives are to transcend this one project and to do the same thing across. >> And to do the same thing across, yes. >> Essentially pay for it with a quick return. That's the way to do things these days, right? >> Yes. >> You've got more narrow, small projects that'll give you a quick hit, and then leverage that expertise across the organization to drive more value. >> Yes. >> Love it. What a great story, guys. Thanks so much for coming to theCUBE and sharing. >> Thank you. >> Congratulations. You must be really excited. >> No, it's a fun project. I appreciate it. >> Thanks for having us, Dave. I appreciate it. >> Pleasure, Seth. Always great talking to you, and keep it right there, everybody. You're watching theCUBE. We're live from New York City here at the Westin Hotel, #cubenyc. Check out ibm.com/winwithai and Change the Game: Winning with AI tonight. We'll be right back after a short break. (minimal upbeat music)
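The closed feedback loop discussed in this segment (hold a wrap profile while it runs clean, step the gram weight back up when breaks become frequent) could one day be automated with a rule as simple as the following sketch; the profile steps and threshold are illustrative assumptions, not actual machine settings:

```python
# Hypothetical feedback rule for choosing the next wrap profile from
# the break rate observed on the current one. Values are invented.

PROFILES = [210, 225, 240, 250]  # candidate gram weights, lightest first

def next_profile(current, breaks, pallets, max_break_rate=0.01):
    """Step up on frequent breaks, step down when running clean."""
    i = PROFILES.index(current)
    rate = breaks / pallets
    if rate > max_break_rate and i < len(PROFILES) - 1:
        return PROFILES[i + 1]  # too many breaks: use more plastic
    if rate == 0 and i > 0:
        return PROFILES[i - 1]  # running clean: try saving material
    return current              # otherwise hold the setting

print(next_profile(225, breaks=20, pallets=1000))  # -> 240
print(next_profile(240, breaks=0, pallets=1000))   # -> 225
```

In production this rule would read its inputs from the SCADA platform's machine variables rather than function arguments, and a human would stay in the loop before any setting changed.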
Seth Dobrin, IBM & Asim Tewary, Verizon | IBM CDO Summit Spring 2018
>> Narrator: Live from downtown San Francisco, it's theCUBE, covering the IBM Chief Data Officer Strategy Summit 2018, brought to you by IBM. (playful music) >> Welcome back to the IBM Chief Data Officer Strategy Summit in San Francisco. We're here at the Parc 55. My name is Dave Vellante, and you're watching theCUBE, the leader in live tech coverage, #IBMCDO. Seth Dobrin is here. He's the Chief Data Officer for IBM Analytics. Seth, good to see you again. >> Good to see you again, Dave. >> Many-time Cube alum; thanks for coming back on. Asim Tewary, Tewary? Tewary; sorry. >> Tewary, yes. >> Asim Tewary; I can't read my own writing. Head of data science and advanced analytics at Verizon, and from Jersey. Two east coast boys, three east coast boys. >> Three east coast boys. >> Yeah. >> Welcome, gentlemen. >> Thank you. >> Asim, you guys had a panel earlier today. Let's start with you. What's your role? I mean, we talked; you're the de facto chief data officer at Verizon. >> Yes, I'm responsible for the data ingestion platform, big data, and the data science for Verizon, for the wireless, wireline, and enterprise businesses. >> It's a relatively new role at Verizon? You were saying previously you were CDO at a financial services organization. It's common for a financial services organization to have a chief data officer. How did the role come about at Verizon? Are you Verizon's first CDO, or-- >> I was actually brought in to really pull together the analytics and data across the enterprise, because there was a realization that data only creates value when you're able to get it from all the different sources. We had separate teams in the past. My role was to bring it all together, to have a common platform and a common data science team to drive revenue across the businesses. >> Seth, this is a big challenge, obviously. We heard Caitlyn this morning talking about the organizational challenges. You've got data in silos. Inderpal and your team are basically, I call it dog-fooding. You're drinking your own champagne. >> Champagne-ing, yeah. >> Yeah, okay, but you have a similar challenge. You have a big, complex company, a lot of data silos. Yeah, I mean, IBM is really, think of it as five companies, right? Any one of them would be a Fortune 500 company in and of themselves. Even within each of those, there were silos, and then Inderpal trying to bring them together, you know, the data from across all of them, is really challenging. Honestly, the technology part, the bringing it together, is the easy part. It's the cultural change that goes along with it that's really, really hard: to get people to think about it as IBM's or Verizon's data, and not their data. That's really how you start getting value from it. >> The cultural challenge you face is, "Okay, I've got my data; I don't want to share." How do you address that? >> Absolutely. Governance and ownership of data, having clear roles and responsibilities, ensuring there's this culture where people realize that data is an asset of the firm. It is not your data or my data; it is the firm's data, and the value you create for the business is from that data. It is a transformation. It's changing the people and culture aspect, so there's a lot of education. You know, you have to be an evangelist. You wear multiple hats to show people the value and why they should do it. Obviously, I had an advantage because, coming in, Verizon management was completely sold on the idea that the data has to be managed as an enterprise asset. The business was ready and willing to own data as an enterprise asset, and so it was relatively easier. However, it was a journey to try to get everyone on the same page in terms of ensuring that it wasn't a siloed mentality. This was an enterprise asset that we needed to manage together. >> A lot of organizations tell me that, first of all, you've got to have top-down buy-in.
Clearly, you had that, but a lot of the time I hear that the C-suite says, "Okay, we're going to do this," but middle management is sort of, they've got a P&L, they've got to make their plan, and it takes them longer to catch up. Did you face that challenge, and how were you addressing it? >> Absolutely. What we had to do was really make sure that we were not trying to boil the ocean, that we were trying to show the value. We found champions. For example, finance, you know, was a good champion for us, where we used the data and analytics to actually launch some very critical initiatives for the firm: asset-backed securities. For the first time, Verizon launched ABS, and we actually enabled that. That created the momentum, if you will, as in, "Okay, there's value in this." That then created the opportunity for all the other businesses to jump on and start leveraging data. Then everyone was willing to help and be part of the journey. >> Seth, before you joined IBM, obviously the company was embarking on this cognitive journey. You know, Watson, the evolution of Watson, kind of betting a lot on cognitive, but internally you must have said, "Well, if we're going to market this externally, we'd better become a cognitive enterprise." One of the questions that came up on the panel was, "What is a cognitive enterprise?" You guys, have you defined it? Love to ask Asim the same question. >> Yeah, so I mean, a cognitive enterprise is really an enterprise that uses data, analytics, and cognition to run its business, right? You can't just jump to being a cognitive enterprise, right? It's a journey or a ladder, right? Where you've got to get that foundational data in order. Then you've got to start even being able to do basic analytics. Then you can start doing things like machine learning and deep learning, and then you can get into cognition. It's not a "just jump to the top of the ladder" thing, because there's just a lot of work that's required to do it. You can do that within a business unit. The whole company doesn't need to get there, and in fact, you'll see within a company, different parts of the company will be at different stages. Kind of to Asim's point about partnering with finance, and that's my experience both at IBM and before I joined: you find a partner that's going to be a champion for you. You make them immensely successful, and everyone else will follow because of shame, because they don't want to be out-competed by their peers. >> So, a similar definition of a cognitive enterprise? >> Absolutely. In fact, what I would say is cognitive is a spectrum, right? Most companies are at the low end of that spectrum, where they're using data for decision-making, but those are reports, BI reports, and stuff like that. As you evolve to become smarter, with more AI and machine learning, that's when you get into predictive, where you're using the data to predict what might happen based on prior historical information. Then that evolution goes all the way to being prescriptive, where you're not only looking back and being able to predict, but you're actually able to recommend actions that you want to take. Obviously, with human involvement, because governance is an important aspect of all of this, right? So I completely agree that cognitive really covers the spectrum of prescriptive, predictive, and using data for all your decision making. >> This actually gets into a good point, right? I mean, I think Asim has implemented some deep learning models at Verizon, but you really need to think about what's the right technology or, you know, the right use case for that. There are some use cases where descriptive analytics is the right answer, right? There's no reason to apply machine learning or deep learning. You just need to put that in front of someone.
Then there are use cases where you do want deep learning, either because the problem is so complex, or because the accuracy needs to be there. I go into a lot of companies to talk to senior executives, and they're like, "We want to do deep learning." You ask them what the use case is, and you're like, "Really, that's rules," right? It gets back to Occam's razor, right? The simplest solution is always the best answer. Really understanding from your perspective, having done this at a couple of companies now, kind of when do you know when to use deep learning versus machine learning, versus just basic statistics? >> How about that? >> Yeah. >> How do you parse that? >> Absolutely. You know, like anything else, it's very important to understand what problem you're trying to solve. When you have a hammer, everything looks like a nail, and deep learning might be one of those hammers. What we do is make sure that for any problem that requires explainability and interpretability, you cannot use deep learning, because you cannot explain when you're using deep learning. It's a multi-layered neural network algorithm. You can't really explain why the outcome was what it was. For that, you have to use simpler algorithms, like decision trees, regression, classification. By the way, 70 to 80% of the problems that you have in the company can be solved by those algorithms. You don't always use deep learning, but deep learning is a great algorithm to use when you're solving complex problems. For example, when you're looking at doing friction analysis, such as customer journey path analysis, that tends to be very noisy. You know, you have billions of data points that you have to go through for an algorithm.
That is, you know, good for deep learning, so we're using that today, but you know, those are a narrow set of use cases where it is required, so it's important to understand what problem you're trying to solve and where you want to use deep learning. >> To use deep learning, you need a lot of labeled data, right? >> Yes. >> And that's-- >> A lot of what? Labeled data? >> Labeled data. So, and that's often a hurdle for companies using deep learning, even when they have a legitimate deep learning use case. Just the massive amount of labeled data you need for that use case. >> As well as scale, right? >> Yeah. >> The whole idea is that when you have massive amounts of data with a lot of different variables, you need deep learning to be able to make that decision. That means you've got to have scale and real-time capability within the platform, that has the elasticity and compute to be able to crunch all that data. >> Yeah. >> Initially, when we started on this journey, our infrastructure was not able to handle that. You know, we had a lot of failures, and so obviously we had to enhance our infrastructure to-- >> You spoke to Samit Gupta and Ed earlier about, you know, GPUs, and flash storage, and the need for those types of things to do these complex, you know, deep learning problems. We struggled with that even inside of IBM when we first started building this platform: how do we get the best performance out of ingesting the data, getting it labeled, and putting it into these models, these deep learning models, and some of the instances we use for that. >> Yeah, my takeaway is that infrastructure for AI has to be flexible, you've got to have great granularity. It's got to not only be elastic, but it's got to be, sometimes we call it plastic. It's got to sometimes retain its form. >> Yes. >> Right? Then when you bring in some new unknown workload, you've got to be able to adjust it without ripping down the entire infrastructure.
You had to purpose-build a whole next set of infrastructure, which is kind of how we built IT over the years. >> Exactly. >> I think, Dave, too, when you and I first spoke four or five years ago, it was all about commodity hardware, right? It was going to the Hadoop ecosystem, minimizing, you know, getting onto commodity hardware, and now you're seeing a shift away from commodity hardware, in some instances, toward specialized hardware, because you need it for these use cases. So we're kind of making that shift. We went to one extreme, and now we're shifting back, and I think we're going to get to a good equilibrium where it's a balance of commodity and specialized hardware for big data, as much as I hate that word, and advanced analytics. >> Well, yeah, even your cloud guys, all the big cloud guys, they used to, you know, five, six years ago, say, "Oh, it's all commodity stuff," and now it's a lot of custom, because they're solving problems that you can't solve with a commodity. I want to ask you guys about this notion of digital business. To us, the difference between a business and a digital business is how you use data. As you become a digital business, which is essentially what you're doing with cognitive and AI, historically, you may have organized around, I don't know, your network, and certainly you've got human skills that are involved, and your customers. I mean, IBM in your case, it's your products, your services, your portfolio, your clients. Increasingly, you're organizing around your data, aren't you? Which brings us back to cultural change, but what about the data model? I presume you're trying to get to a data model where the customer service, and the sales, and the marketing aren't separate entities. I don't have to deal with them when I talk to Verizon. I deal with just Verizon, right? That's not easy when the data's all siloed. How are you dealing with that challenge? >> The customer is at the center of the business model.
Our motto and our goal is to provide the best products to the customers, but even more important, provide the best experience. It is all about the customer, agnostic of which channel the customer is interacting with. For the customer, it's one Verizon. The way we are organizing our data platform is, first of all, breaking all the silos. You know, we need to have data from all interactions with the customer, that is all digital, that's coming through, and creating one unified model, essentially, that captures all the journeys and all the information about the customer, their events, their behavior, their propensities, and stuff like that. Then that information, using algorithms, predictive, prescriptive, and all of that, is made available in all channels of engagement. Essentially, you have common intelligence that is made available across all channels. Whether the customer goes to the point of sale in a retail store, or calls a call center and talks to a rep, or is on the digital channel, it's the same intelligence driving the experience. Whether a customer is trying to buy a phone, or has an issue with a service-related aspect of it, and that's the key, which is centralized intelligence from a common data lake, and then delivering a seamless experience across all channels for that customer-- >> Independent of where I bought that phone, for example, right? >> Exactly. Maintaining the context is critical. If you went to the store and, you know, you're looking for a phone, and you didn't find what you're looking for, you want to do some research, if you go to the digital channel, you should be able to have a seamless experience where we should know that you went, that you're looking for the phone, or you called care and you asked the agent about something.
Having that context be transferred across channels and be available, so the customer feels that we know who the customer is, and providing them with a good experience, is the key. >> We have limited time, but I want to talk about skills. They're hard to come by; we talked about that. It's number five on Inderpal's sort of, list of things you've got to do as a CDO. Sometimes you can do M&A, like buying The Weather Company. You've got a lot of skills, but that's not always so practical. How have you been dealing with the skills gap? >> Look, skills are hard to find, data scientists are hard to find. The way we are envisioning our talent management is, there are two things we need to take care of. One, we need solid big data engineers, because having a solid platform that has real-time streaming capability is very critical. Second, data scientists, they're hard to get. However, our plan is to really take the domain experts, who really understand the business, who understand the business process and the data, and give them the tools, automation tools for data science, that essentially, you know, will put it in a box for them, in terms of which algorithm to use, and enable them to create more value. While we will continue to hire specialized data scientists who are going to work on the much more complex problems, the skill will come from empowering and enabling the domain experts with data science capabilities that automate model development and algorithm development. >> Presumably grooming people in house, right? >> Grooming people in house, and I actually break it down a little more granular. I even say there's data engineers, there's machine learning engineers, there's optimization engineers, then there's data journalists. They're the ones that tell the story. I think we were talking earlier, Asim, about you know, it's not just PhDs, right? You're not just looking for PhDs to fill these roles anymore.
You're looking for people with master's degrees, and even in some cases, bachelor's degrees. With IBM's new collar job initiative, we're even bringing on some, what we call P-TECH students, which are five-year high school students, and we're building a data science program for them. We're building apprenticeships, which is, you know, you've had a couple years of college, building a data science program, and people look at me like I'm crazy when I say that, but the bulk of the work of executing data science is not implementing machine learning models. It's engineering features, it's cleaning data. With basic Python skills, this is something that you can very easily teach these people to do, and then under the supervision of a principal data scientist or someone with a PhD or a master's degree, they can start learning how to implement models, but they can start contributing right away with just some basic Python skills. >> Then five, seven years in, they're-- >> Yeah. >> domain experts. All right, guys, got to jump, but thanks very much, Asim, for coming on and sharing your story. Seth, always a pleasure. >> Yeah, good to see you again, Dave. >> All right. >> Thank you, Dave. >> You're welcome. Keep it right there, buddy. >> Thanks. >> We'll be back with our next guest. This is theCUBE, live from the IBM CDO Strategy Summit in San Francisco. We'll be right back. (playful music) (phone dialing)
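A footnote on Asim's point above about automation tools that "put it in a box" for domain experts, choosing which algorithm to use: a minimal sketch of that idea is a model-selection loop that tries candidate algorithms and keeps the one that cross-validates best. Everything here, the two toy models and the data, is hypothetical and is not any IBM or Verizon tool:

```python
# Candidate model builders: each takes training pairs (x, label)
# and returns a predict function.

def majority_model(train):
    # Baseline: always predict the most common label.
    labels = [y for _, y in train]
    winner = max(set(labels), key=labels.count)
    return lambda x: winner

def nearest_neighbor_model(train):
    # Predict the label of the closest training example.
    def predict(x):
        _, label = min(train, key=lambda pair: abs(pair[0] - x))
        return label
    return predict

def pick_best(data, builders):
    # Score each builder with leave-one-out cross-validation
    # and return the (name, builder) pair that predicts best.
    def loo_accuracy(build):
        hits = 0
        for i, (x, y) in enumerate(data):
            model = build(data[:i] + data[i + 1:])
            hits += model(x) == y
        return hits / len(data)
    return max(builders, key=lambda b: loo_accuracy(b[1]))

candidates = [("majority", majority_model),
              ("1-nn", nearest_neighbor_model)]
```

A domain expert never touches the algorithm choice; they call `pick_best` on their data and get the better-performing model back, which is the spirit of the "in a box" automation Asim describes, minus everything that makes real AutoML hard.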
Seth Dobrin, IBM | Big Data SV 2018
>> Announcer: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to theCUBE's continuing coverage of our own event, Big Data SV. I'm Lisa Martin, with my cohost Dave Vellante. We're in downtown San Jose at this really cool place, Forager Eatery. Come by, check us out. We're here tomorrow as well. We're joined next by one of our CUBE alumni, Seth Dobrin, the Vice President and Chief Data Officer at IBM Analytics. Hey, Seth, welcome back to theCUBE. >> Hey, thanks for having me again. Always fun being with you guys. >> Good to see you, Seth. >> Good to see you. >> Yeah, so last time you were chatting with Dave and company was back in the fall at the Chief Data Officers Summit. What's kind of new with you in IBM Analytics since then? >> Yeah, so at the Chief Data Officers Summit, I was talking with one of the data governance people from TD Bank, and we spent a lot of time talking about governance. We're still doing a lot with governance, especially with GDPR coming up, but we've really started to ramp up my team to focus on data science, machine learning. How do you do data science in the enterprise? How is it different from doing a Kaggle competition, or someone getting their PhD or Masters in Data Science? >> Just quickly, who is your team composed of in IBM Analytics? >> So IBM Analytics represents, think of it as our software umbrella, so it's everything that's not pure cloud or Watson or services. It's all of our software franchise. >> But in terms of roles and responsibilities, data scientists, analysts. What's the mixture of-- >> Yeah. So on my team I have a small group of people that do governance, and so they're really managing our GDPR readiness inside of IBM in our business unit. And then the rest of my team is really focused on this data science space.
And so this is set up from the perspective of: we have machine-learning engineers, we have predictive-analytics engineers, we have data engineers, and we have data journalists. And that's really focused on helping IBM and other companies do data science in the enterprise. >> So what's the dynamic amongst those roles that you just mentioned? Is it really a team sport? I mean, initially it was the data scientist on a pedestal. Have you been able to attack that problem? >> So I know a total of two people that can do that all themselves. So I think it absolutely is a team sport. And it really takes a data engineer, or someone with deep expertise in there that also understands machine learning, to really build out the data assets, engineer the features appropriately, provide access to the model, and ultimately to what you're going to deploy, right? Because the way you do it as a research project or an activity is different than using it in real life, right? And so you need to make sure the data pipes are there. And when I look for people, I actually look for a differentiation between machine-learning engineers and optimization. I don't even post for data scientists, because then you get a lot of people who aren't really data scientists, right? If you're specific and ask for machine-learning engineers or decision optimization, OR-type people, you really get a whole different crowd in. But the interplay is really important, because for most machine-learning use cases you want to be able to give information about what you should do next. What's the next best action? And to do that, you need decision optimization. >> So in the early days of, when we, I mean, data science has been around forever, right? We always hear that. But in the, sort of, more modern use of the term, you never heard much about machine learning. It was more like stats, math, some programming, data hacking, creativity. And then now, machine learning sounds fundamental.
Is that a new skillset that the data scientists had to learn? Did they get them from other parts of the organization? >> I mean, when we talk about math and stats, what we call machine learning today is what we've been doing with statistics for years, right? I mean, a lot of the same things we apply in what we call machine learning today, I did during my PhD 20 years ago, right? It was just with a different perspective. And the models you applied then were more static, right? So I would build a model to predict something, and it was only for that. It really didn't apply beyond that, so it was very static. Now, when we're talking about machine learning, I want to understand Dave, right? And I want to be able to predict Dave's behavior in the future, and learn how you're changing your behavior over time, right? So one of the things that a lot of people don't realize, especially senior executives, is that machine learning creates a self-fulfilling prophecy. You're going to drive a behavior, so your data is going to change, right? So your model needs to change. And so that's really the difference between what you think of as stats and what we think of as machine learning today. So what we were looking for years ago is all the same, we just described it a little differently. >> So how fine is the line between a statistician and a data scientist? >> I think any good statistician can really become a data scientist. There's some issues around data engineering and things like that, but if it's a team sport, I think any really good, pure mathematician or statistician could certainly become a data scientist. Or machine-learning engineer. Sorry. >> I'm interested in it from a skillset standpoint. You were saying how you're advertising to bring on these roles. I was at the Women in Data Science Conference with theCUBE just a couple of days ago, and we hear so much excitement about the role of data scientists. It's so horizontal.
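Seth's "self-fulfilling prophecy" point above, that a deployed model drives behavior, the data shifts, and so the model has to keep changing, can be sketched with a toy online estimator that keeps adapting, unlike a model fit once and frozen. The class and numbers are invented for illustration:

```python
class OnlineEstimator:
    """Tracks a user's evolving behavior instead of fitting once."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha      # how quickly old behavior is forgotten
        self.estimate = None

    def update(self, observation):
        # Exponentially weighted update: new data pulls the
        # estimate toward current behavior.
        if self.estimate is None:
            self.estimate = float(observation)
        else:
            self.estimate = (self.alpha * observation
                             + (1 - self.alpha) * self.estimate)
        return self.estimate
```

A static model fit once on early behavior would keep predicting the old average forever; this estimator drifts with the user, which is the difference Seth draws between the static statistical models of twenty years ago and machine learning that tracks changing behavior.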
People have the opportunity to make an impact in policy change, healthcare, etc. So the hard skills, the soft skills: mathematician, what are some of the other elements that you would look for, or that companies, enterprises that need to learn how to embrace data science, should look for? Someone that's not just a mathematician, but someone that has communication skills, collaboration, empathy, openness, to not lead data down a certain path. What do you see as the right mix there for a data scientist? >> Yeah, so I think that's a really good point, right? It's not just the hard skills. When my team goes out, because part of what we do is we go out and sit with clients and teach them our philosophy on how you should integrate data science in the enterprise, a good part of that is sitting down and understanding the use case. And working with people to tease out, how do you get to this ultimate use case? Because any problem worth solving is not one model; any use case is not one model, it's many models. How do you work with the people in the business to understand, okay, what's the most important thing for us to deliver first? And it's almost a negotiation, right? Talking them back. Okay, we can't solve the whole problem. We need to break it down into discrete pieces. Even when we break it down into discrete pieces, there's going to be a series of sprints to deliver that, right? And so having these soft skills to be able to tease that out, and really help people understand that their way of thinking about this may or may not be right, and doing that in a way that's not offensive, is key. There's a lot of really smart people that can say that, but they can come across as being offensive, so those soft skills are really important. >> I'm going to talk about GDPR in the time we have remaining. We've talked about it in the past; the clock's ticking, and in May the fines go into effect.
The relationship between data science, machine learning, and GDPR: is it going to help us solve this problem? This is a nightmare for people, and many organizations aren't ready. Your thoughts. >> Yeah, so I think there's some aspects that we've talked about before, like how important it's going to be to apply machine learning to your data to get ready for GDPR. But I think there's some aspects that we haven't talked about before here, and that's around what impact GDPR has on being able to do data science, and being able to implement data science. So one of the aspects of the GDPR is this concept of consent, right? It really requires consent to be understandable and very explicit, and it allows people to be able to retract that consent at any time. And so what does that mean when you build a model that's trained on someone's data? If you haven't anonymized it properly, do I have to rebuild the model without their data? And then it also brings up some points around explainability. So you need to be able to explain your decision, how you used analytics, how you got to that decision, to someone if they request it. To an auditor if they request it. For traditional machine learning, that's not too much of a problem. You can look at the features and say, this contributed 20%, this contributed 50%. But as you get into things like deep learning, this concept of explainable AI, or XAI, becomes really, really important. And there were some talks earlier today at Strata about how you apply traditional machine learning to interpret your deep learning, or black box AI. So that's really going to be important, those two things, in terms of how they affect data science. >> Well, you mentioned the black box. I mean, do you think we'll ever resolve the black box challenge? Or is it really that people are just going to be comfortable that what happens inside the box, how you got to that decision, is okay? >> So I'm inherently both cynical and optimistic.
(chuckles) But I think there's a lot of things we looked at five years ago and said there's no way we'll ever be able to do them, that we can do today. And so while I don't know how we're going to get to be able to explain this black box AI, I'm fairly confident that in five years, this won't even be a conversation anymore. >> Yeah, I kind of agree. I mean, somebody said to me the other day, well, it's really hard to explain how you know it's a dog. >> Seth: Right (chuckles). But you know it's a dog. >> But you know it's a dog. And so, we'll get over this. >> Yeah. >> I love that you just brought up dogs as we're ending. That's my favorite thing in the world, thank you. Yes, you knew that. Well, Seth, I wish we had more time. Thanks so much for stopping by theCUBE and sharing some of your insights. Look forward to the next update from you in the next few months. >> Yeah, thanks for having me. Good seeing you again. >> Pleasure. >> Nice meeting you. >> Likewise. We want to thank you for watching theCUBE, live from our event Big Data SV, down the street from the Strata Data Conference. I'm Lisa Martin, for Dave Vellante. Thanks for watching. Stick around, we'll be right back after a short break.
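A footnote on the consent point Seth raises above, "do I have to rebuild the model without their data?": one minimal sketch is a model that keeps its training rows attributable to a person, so a consent retraction can drop that person's data and refit. This is a toy illustration of the idea, not IBM's approach, and the names, values, and mean-predictor stand-in are all invented:

```python
class ConsentAwareModel:
    """Keeps training data attributable so consent can be retracted."""

    def __init__(self):
        self.rows = []          # (user_id, value) training pairs

    def add(self, user_id, value):
        self.rows.append((user_id, value))
        return self._fit()

    def forget(self, user_id):
        # Honor a consent retraction: drop the user's rows, refit.
        self.rows = [(u, v) for u, v in self.rows if u != user_id]
        return self._fit()

    def _fit(self):
        # Stand-in for real training: predict the mean of the data.
        values = [v for _, v in self.rows]
        return sum(values) / len(values) if values else None
```

The design point is the one Seth makes: if training data were properly anonymized, `forget` would be unnecessary; if it is not, the model must support exactly this drop-and-retrain operation.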
Seth Myers, Demandbase | George Gilbert at HQ
>> This is George Gilbert. We're on the ground at Demandbase, the B2B CRM company based on AI, a very special company that's got some really unique technology. We have the privilege to be with Seth Myers today, Senior Data Scientist and resident wizard, who's going to take us on a journey through some of the technology Demandbase is built on, and some of the technology coming down the road. So Seth, welcome. >> Thank you very much for having me. >> So, we talked earlier with Aman Naimat, Senior VP of Technology, and we talked about some of the functionality in Demandbase, and how it's very flexible, and reactive, and adaptive in helping guide, or react to, a customer's journey through the buying process. Tell us about what that journey might look like, how it's different, and the touchpoints, and the participants, and then how your technology rationalizes that, because we know old CRM packages were really just lists of contact points. So this is something very different. How does it work? >> Yeah, absolutely, so at the highest level, each customer's going to be different; each customer's going to make decisions and look at different marketing collateral, and respond to different marketing collateral in different ways. You know, as the companies get bigger, and the products they're offering become more sophisticated, that's certainly the case, and also, sales cycles take a long time. You're engaged with an opportunity over many months, and so there's a lot of touchpoints, there's a lot of planning that has to be done, so that actually offers a huge opportunity to be solved with AI, especially in light of recent developments in this thing called reinforcement learning. So reinforcement learning is basically machine learning that can think strategically; it can actually plan ahead in a series of decisions, and it's actually the technology behind AlphaGo, which is the Google technology that beat the best Go players in the world.
And what we basically do is we say, "Okay, if we understand you're a customer, we understand the company you work at, we understand the things they've been researching elsewhere on third-party sites, then we can actually start to predict content they will be likely to engage with." But more importantly, we can start to predict content they're more likely to engage with next, and after that, and after that, and after that, and so what our technology does is it looks at all possible paths that your potential customer can take, all the different content you could ever suggest to them, all the different routes they will take, and it looks at ones that they're likely to follow, but also ones that are likely to turn them into an opportunity. And so we basically, in the same way Google Maps considers all possible routes to get you from your office to home, we do the same, and we choose the one that's most likely to convert the opportunity, the same way Google chooses the quickest road home. >> Okay, that's a great example, because people can picture that, but how do you know what's the best path? Is it based on learning from previous journeys from customers? >> Yes. >> And then, if you make a wrong guess, you sort of penalize the engine and say, "Pick the next best, what you thought was the next best path." >> Absolutely, so the nuts and bolts of how it works is we start working with our clients, and they have all this data of different customers, and how they've engaged with different pieces of content throughout their journey, and so the machine learning model, what it's really doing at any moment in time, given any customer in any stage of the opportunity that they find themselves in, it says, what piece of content are they likely to engage with next, and that's based on historical training data, if you will.
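The path-to-conversion idea Seth describes, score the possible next pieces of content, then plan the whole route most likely to convert, can be sketched as a tiny planner over a hypothetical content graph. The content names and edge weights (standing in for learned engagement probabilities) are invented; Demandbase's actual models are of course far richer:

```python
# Each edge carries the assumed probability the visitor follows it;
# "convert" is the goal state. All numbers are made up.
graph = {
    "whitepaper": [("webinar", 0.6), ("pricing", 0.2)],
    "webinar":    [("pricing", 0.5)],
    "pricing":    [("convert", 0.3)],
    "convert":    [],
}

def best_path(node, graph):
    """Return (probability, path) for the route most likely to convert,
    the way a router picks the quickest road home."""
    if node == "convert":
        return 1.0, ["convert"]
    best_p, best = 0.0, [node]
    for nxt, p in graph[node]:
        sub_p, sub_path = best_path(nxt, graph)
        if p * sub_p > best_p:
            best_p, best = p * sub_p, [node] + sub_path
    return best_p, best
```

Starting from "whitepaper", the planner prefers the webinar route (0.6 x 0.5 x 0.3) over going straight to pricing (0.2 x 0.3), choosing a longer path because it is more likely to end in a conversion, which is the core of the Google Maps analogy.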
And then once we make that decision on a step-by-step basis, then we kind of extrapolate, and we basically say, "Okay, if we showed them this page, or if they engage with "this material, what would that do, what situation would "we find them in at the next step, and then what would "we recommend from there, and then from there, "and then from there," and so it's really kind of learning the right move to make at each time, and then extrapolating that all the way to the opportunity being closed. >> The picture that's in my mind is like, the Deep Blue, I think it was chess, where it would map out all the potential moves. >> Very similar, yeah. >> To the end game. >> Very similar idea. >> So, what about if you're trying to engage with a customer across different channels, and it's not just web content? How is that done? >> Well, that's something that we're very excited about, and that's something that we're currently really starting to devote resources to. Right now, we already have a product live that's focused on web content specifically, but yeah, we're working on kind of a multi-channel type solution, and we're all pretty excited about it. >> Okay so, obviously you can't talk too much about it. Can you tell us what channels that might touch? >> I might have to play my cards a little close to my chest on this one, but I'll just say we're excited. >> Alright. Well I guess that means I'll have to come back. >> Please, please. >> So, um, tell us about the personalized conversations. Is the conversation just another way of saying, this is how we're personalizing the journey? Or is there more to it than that? >> Yeah, it really is about personalizing the journey, right? 
Like you know, a lot of our clients now have a lot of sophisticated marketing collateral, and a lot of time and energy has gone into developing content that different people find engaging, that kind of positions products towards pain points, and all that stuff, and so really there's so much low-hanging fruit by just organizing and leveraging all of this material, and actually forming the conversation through a series of journeys through that material. >> Okay, so, Aman was telling us earlier that we have so many sort of algorithms, they're all open source, or they're all published, and they're only as good as the data you can apply them to. So, tell us, where do companies, startups, you know, not the Googles, Microsofts, Amazons, where do they get their proprietary information? Is it that you have algorithms that now are so advanced that you can refine raw information into proprietary information that others don't have? >> Really I think it comes down to, our competitive advantage I think is largely in the source of our data, and so, yes, you can build more and more sophisticated algorithms, but again, you're starting with a public data set, you'll be able to derive some insights, but there will always be a path to those datasets for, say, a competitor. For example, we're currently tracking about 700 billion web interactions a year, and then we're also able to attribute those web interactions to companies, meaning the employees at those companies involved in those web interactions, and so that's able to give us an insight that no amount of public data or processing would ever really be able to achieve. >> How do you, Aman started to talk to us about how, like there were DNS, reverse DNS registries. >> Reverse IP lookups, yes. >> Yeah, so how are those, if they're individuals within companies, and then the companies themselves, how do you identify them reliably? 
>> Right, so reverse IP lookup is, we've been doing this for years now, and so we've kind of developed a multi-source solution, so reverse IP lookups are a big one. Also machine learning, you can look at traffic coming from an IP address, and you can start to make some very informed decisions about what the IP address is actually doing, who they are, and so if you're looking at the account level, which is what we're tracking at, there's a lot of information to be gleaned from that kind of traffic. >> Sort of the way, and this may be a weird-sounding analogy, but the way a virus or some piece of malware has a signature in terms of its behavior, you find signatures in terms of users associated with an IP address. >> And we certainly don't de-anonymize individual users, but if we're looking at things at the account level, then you know, the bigger the data, the more signal you can infer, and so if we're looking at a company-wide usage of an IP address, then you can start to make some very educated guesses as to who that company is, the things that they're researching, what they're in market for, that type of thing. >> And how do you find out, if they're not coming to your site, and they're not coming to one of your customer's sites, how do you find out what they're touching? >> Right, I mean, I can't really go into too much detail, but a lot of it comes from working with publishers, and a lot of this data is just raw, and it's only because we can identify the companies behind these IP addresses, that we're able to actually turn these web interactions into insights about specific companies. >> George: Sort of like how advertisers or publishers would track visitors across many, many sites, by having agreements. >> Yes. Along those lines, yeah. >> Okay.
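A minimal sketch of one ingredient of the account-level identification described here: mapping an IP address to the company behind its netblock. The table below is invented for illustration (the CIDR ranges are documentation-reserved test networks, and the company names are made up); a real multi-source system would combine reverse DNS, registration data, and machine learning over traffic patterns.

```python
import ipaddress

# Hypothetical netblock-to-company table; illustrative stand-ins only.
netblocks = {
    "203.0.113.0/24": "ExampleCo",   # TEST-NET-3, reserved for documentation
    "198.51.100.0/25": "DemoCorp",   # part of TEST-NET-2, also reserved
}

def company_for_ip(ip: str):
    """Return the company whose netblock contains `ip`, or None if unknown."""
    addr = ipaddress.ip_address(ip)
    for cidr, company in netblocks.items():
        if addr in ipaddress.ip_network(cidr):
            return company
    return None
```

Once raw publisher traffic can be attributed to an account this way, aggregate behavior at the company level, rather than any individual user, is what yields the "what are they researching" signal discussed above.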
So, tell us a little more about natural language processing, I think where most people have assumed or have become familiar with it is with the B2C capabilities, with the big internet giants, where they're trying to understand all language. You have a more well-scoped problem, tell us how that changes your approach. >> So a lot of really exciting things are happening in natural language processing and the research right now, and in general, it's being measured against this yardstick of, can it understand language as well as a human can, obviously we're not there yet, but that doesn't necessarily mean you can't derive a lot of meaningful insights from it, and the way we're able to do that is, instead of trying to understand all of human language, let's understand very specific language associated with the things that we're trying to learn. So obviously we're a B2B marketing company, so it's very important to us to understand what companies are investing in other companies, what companies are buying from other companies, what companies are suing other companies, and so if we said, okay, we only want to be able to infer a competitive relationship between two businesses in an actual document, that becomes a much more solvable and manageable problem, as opposed to, let's understand all of human language. And so we actually started off with these kinds of open source solutions, with some of these proprietary solutions that we paid for, and they didn't work because their scope was this broad, and so we said, okay, we can do better by just focusing in on the types of insights we're trying to learn, and then work backwards from them.
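This "narrow the scope" approach can be made concrete with a toy example in the same spirit: instead of understanding all language, a classifier that answers exactly one question, such as whether a sentence mentioning "Amazon" refers to the company or the river. The cue-word lists and scoring below are hand-invented for illustration, not a trained production model.

```python
# Single-purpose word-sense classifier: company vs. river for "Amazon".
# Cue words are hand-picked for illustration; a real system would learn
# them from labeled documents.
COMPANY_CUES = {"aws", "cloud", "acquired", "revenue", "ceo", "retail", "stock"}
RIVER_CUES = {"river", "rainforest", "brazil", "basin", "tributary", "jungle"}

def is_company_mention(sentence: str) -> bool:
    """True if the context words around an 'Amazon' mention look corporate."""
    words = set(sentence.lower().replace(",", " ").replace(".", " ").split())
    # Tie goes to the company sense, the more common one in B2B documents.
    return len(words & COMPANY_CUES) >= len(words & RIVER_CUES)
```

The point is the shape of the problem: a yes/no decision over a tiny vocabulary is solvable and measurable, whereas "understand all of human language" is not.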
>> I mean yeah, you can treat the algorithms as tools, but you know, a bag of tools a product does not make, right? So our secret sauce becomes how we use these tools, how we deploy them, and the datasets we put them against. So as mentioned before, we're not trying to understand all of human language, actually the exact opposite. So we actually have a single machine learning algorithm that all it does is it learns to recognize when Amazon, the company, is being mentioned in a document. So if you see the word Amazon, is it talking about the river, is it talking about the company? So we have a classifier that all it does is it fires whenever Amazon is being mentioned in a document. And that's a much easier problem to solve than understanding, than Siri basically. >> Okay. I still get rather irritated with Siri. So let's talk about, um, broadly this topic that sort of everyone lays claim to as their great higher calling, which is democratizing machine learning and AI, and opening it up to a much greater audience. Help set some context, just the way you did by saying, "Hey, if we narrow the scope of a problem, it's easier to solve." What are some of the different approaches people are taking to that problem, and what are their sweet spots? >> Right, so the talk of the data science community right now is some of the work that's coming out of DeepMind, which is a subsidiary of Google, they just built AlphaGo, which solved the strategy game that we thought we were decades away from actually solving, and their approach of restricting the problem to a game, with well-defined rules, with a limited scope, I think that's how they're able to propel the field forward so significantly.
They started off by playing Atari games, then they moved to long-term strategy games, and now they're doing video games, like video strategy games, and I think the idea of, again, narrowing the scope to well-defined rules and well-defined limited settings is how they're actually able to advance the field. >> Let me ask just about playing the video games. I can't remember Star... >> Starcraft. >> Starcraft. Would you call that, like, where the video game is a model, and you're training a model against that other model, so it's almost like they're interacting with each other. >> Right, so it really comes down to, you can think of it as pulling levers, so you have a very complex machine, and there's certain levers you can pull, and the machine will respond in different ways. If you're trying to, for example, build a robot that can walk around a factory and pick out boxes, like how you move each joint, where you look, all the different things you can see and sense, those are all levers to pull, and that gets very complicated very quickly, but if you narrow it down to, okay, there's certain places on the screen I can click, there's certain things I can do, there's certain inputs I can provide in the video game, you basically limit the number of levers, and then optimizing and learning how to work those levers is a much more scoped and reasonable problem, as opposed to learning everything all at once. >> Okay, that's interesting, now, let me switch gears a little bit. We've done a lot of work at Wikibon about IoT and increasingly edge-based intelligence, because you can't go back to the cloud for your analytics for everything, but one of the things that's becoming apparent is, it's not just the training that might go on in a cloud, but there might be simulations, and then the sort of low-latency response is based on a model that's at the edge. Help elaborate where that applies and how that works.
>> Well in general, when you're working with machine learning, in almost every situation, training the model is really the data-intensive process that requires a lot of extensive computation, and that's something that makes sense to have localized in a single location where you can leverage resources and you can optimize it. Then you can say, alright, now that I have this model that understands the problem, that's trained, it becomes a much simpler endeavor to basically put that as close to the device as possible. And so that really is how they're able to say, okay, let's take this really complicated billion-parameter neural network that took days and weeks to train, and let's actually derive insights right at the device level. Recent technology though, like I mentioned deep learning, that in itself, just the act of deploying the technology creates new challenges as well, to the point that actually Google invented a new type of chip to just run... >> The tensor processing. >> Yeah, the TPU. The tensor processing unit, just to handle what is now a machine learning algorithm so sophisticated that even deploying it after it's been trained is still a challenge. >> Is there a difference in the hardware that you need for training vs. inferencing? >> So they initially deployed the TPU just for the sake of inference. In general, the way it actually works is that, when you're building a neural network, there is a type of mathematical operation you do a whole bunch of, and it's based on the idea of working with matrices. That's still absolutely the case with training as well as inference, which is actually querying the model, so if you can solve that one mathematical operation, then you can deploy it everywhere.
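The "one mathematical operation" being referred to is matrix multiplication: a neural network's layers, in both training and inference, reduce largely to multiplying an input matrix by a weight matrix. A pure-Python sketch of that core operation, with toy weights and a one-example batch:

```python
def matmul(a, b):
    """Multiply matrix a (m x n) by matrix b (n x p), pure Python."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Forward pass of a 2-input, 3-unit layer on a batch of one example:
weights = [[0.1, 0.2, 0.3],
           [0.4, 0.5, 0.6]]
inputs = [[1.0, 2.0]]
activations = matmul(inputs, weights)   # one row of three activation values
```

Hardware like the TPU exists to make exactly this multiply-accumulate pattern fast at billion-parameter scale, which is why solving the one operation well lets the same trained model be queried anywhere, cloud or edge.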
So, one of our CTOs was talking about how, in his view, what's going to happen in the cloud is richer and richer simulations, and as you say, querying the model, getting an answer in realtime or near realtime, is out on the edge. What exactly is the role of the simulation? Is that just a model that understands time, and not just time, but many multiple parameters that it's playing with? >> Right, so simulations are particularly important in, taking us back to reinforcement learning, where you basically have many decisions to make before you actually see some sort of desirable or undesirable outcome, and so, for example, the way AlphaGo trained itself is basically by running simulations of the game being played against itself, and really what those simulations are doing is allowing the artificial intelligence to explore the entire space of possible games. >> Sort of like WarGames, if you remember that movie. >> Yes, with uh... >> Matthew Broderick, and it actually showed all the war game scenarios on the screen, and then figured out, you couldn't really win. >> Right, yes, it's a similar idea where they, for example in Go, there's more board configurations than there are atoms in the observable universe, and so the way Deep Blue won at chess was basically to more or less explore the vast majority of chess moves; that's really not an option here, you can't really play that same strategy with Go, and so this constant simulation is how it explored the meaningful game configurations that it needed to win. >> So in other words, they were scoped down, so the problem space was smaller. >> Right, and in fact, basically one of the reasons, like AlphaGo was really kind of two different artificial intelligences working together, one that decided which solutions to explore, like which possibilities it should pursue more, and which ones to ignore, and then the second piece was, okay, given a certain board configuration, what's the likely outcome?
And so those two working in concert, one that narrows and focuses, and one that comes up with the answer, given that focus, is how it was actually able to work so well. >> Okay. Seth, on that note, that was a very, very enlightening 20 minutes. >> Okay. I'm glad to hear that. >> We'll have to come back and get an update from you soon. >> Alright, absolutely. >> This is George Gilbert, I'm with Seth Myers, Senior Data Scientist at Demandbase, a company I expect we'll be hearing a lot more about, and we're on the ground, and we'll be back shortly.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
George Gilbert | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
ORGANIZATION | 0.99+ | |
George | PERSON | 0.99+ |
Amazons | ORGANIZATION | 0.99+ |
Microsofts | ORGANIZATION | 0.99+ |
Siri | TITLE | 0.99+ |
Googles | ORGANIZATION | 0.99+ |
Demandbase | ORGANIZATION | 0.99+ |
20 minutes | QUANTITY | 0.99+ |
Starcraft | TITLE | 0.99+ |
second piece | QUANTITY | 0.99+ |
WikiBound | ORGANIZATION | 0.99+ |
two businesses | QUANTITY | 0.99+ |
Seth Myers | PERSON | 0.99+ |
Aman Naimat | PERSON | 0.99+ |
two | QUANTITY | 0.99+ |
Atari | ORGANIZATION | 0.99+ |
Seth | PERSON | 0.98+ |
each customer | QUANTITY | 0.98+ |
each joint | QUANTITY | 0.98+ |
Go | TITLE | 0.98+ |
single | QUANTITY | 0.98+ |
Matthew Broderick | PERSON | 0.98+ |
one | QUANTITY | 0.98+ |
today | DATE | 0.97+ |
Aman | PERSON | 0.96+ |
Deep Blue | TITLE | 0.96+ |
billion-parameter | QUANTITY | 0.94+ |
each time | QUANTITY | 0.91+ |
two different artificial intelligences | QUANTITY | 0.88+ |
decades | QUANTITY | 0.88+ |
Google Maps | TITLE | 0.86+ |
AlphaGo | ORGANIZATION | 0.82+ |
about 700 billion web interactions a year | QUANTITY | 0.81+ |
Star | TITLE | 0.81+ |
AlphaGo | TITLE | 0.79+ |
one mathematical | QUANTITY | 0.78+ |
lot | QUANTITY | 0.76+ |
years | QUANTITY | 0.74+ |
DeepMind | ORGANIZATION | 0.74+ |
lot of information | QUANTITY | 0.73+ |
bag of tools | QUANTITY | 0.63+ |
IOT | TITLE | 0.62+ |
WarGames | TITLE | 0.6+ |
sites | QUANTITY | 0.6+ |
Seth Dobrin & Jennifer Gibbs | IBM CDO Strategy Summit 2017
>> Live from Boston, Massachusetts. It's The Cube! Covering IBM Chief Data Officer's Summit. Brought to you by IBM. (techno music) >> Welcome back to The Cube's live coverage of the IBM CDO Strategy Summit here in Boston, Massachusetts. I'm your host Rebecca Knight along with my Co-host Dave Vellante. We're joined by Jennifer Gibbs, the VP Enterprise Data Management of TD Bank, and Seth Dobrin who is VP and Chief Data Officer of IBM Analytics. Thanks for joining us Seth and Jennifer. >> Thanks for having us. >> Thank you. >> So Jennifer, I want to start with you can you tell our viewers a little about TD Bank, America's Most Convenient Bank. Based, of course, in Toronto. (laughs). >> Go figure. (laughs) >> So tell us a little bit about your business. >> So TD is a, um, very old bank, headquartered in Toronto. We do have, ah, a lot of business as well in the U.S. Through acquisition we've built quite a big business on the Eastern seaboard of the United States. We've got about 85 thousand employees and we're servicing 42 lines of business when it comes to our Data Management and our Analytics programs, bank wide. >> So talk about your Data Management and Analytics programs a little bit. Tell our viewers a little bit about those. >> So, we split up our office of the Chief Data Officer, about 3 to 4 years ago and so we've been maturing. >> That's relatively new. >> Relatively new, probably, not unlike peers of ours as well. We started off with a strong focus on Data Governance. Setting up roles and responsibilities, data storage organization and councils from which we can drive consensus and discussion. And then we started rolling out some of our Data Management programs with a focus on Data Quality Management and Meta Data Management, across the business. So setting standards and policies and supporting business processes and tooling for those programs. >> Seth when we first met, now you're a long timer at IBM. (laughs) When we first met you were a newbie. 
But we heard today about how it used to be that the Data Warehouse was king but now Process is king. Can you unpack that a little bit? What does that mean? >> So, you know, to make value of data, it's more than just having it in one place, right? It's what you do with the data, how you ingest the data, how you make it available for other uses. And so it's really, you know, data is not for the sake of data. Data is not a digital dropping of applications, right? The whole purpose of having and collecting data is to use it to generate new value for the company. And that new value could be cost savings, it could be a cost avoidance, or it could be net new revenue. Um, and so, to do that right, you need processes. And the processes are everything from business processes, to technical processes, to implementation processes. And so it's the whole, you need all of it. >> And so Jennifer, I don't know if you've seen kind of a similar evolution from data warehouse to data everywhere, I'm sure you have. >> Yeah. >> But the data quality problem was hard enough when you had this sort of central master data management approach. How are you dealing with it? Is there less of a single version of the truth now than there ever was, and how do you deal with the data quality challenge? >> I think it's important to scope out the work effort in a way that you can get the business moving in the right direction without overwhelming it, focusing on the areas that are most important to the bank. So, we've identified and scoped out what we call critical data. So each line of business has to identify what's critical to them. It does relate very strongly to what Seth said around what are your core business processes and what data are you leveraging to provide value to that, to the bank. So, um, data quality for us is about a consistent approach, to ensure the most critical elements of data that are used for business processes are where they need to be from a quality perspective.
>> You can go down a huge rabbit hole with data quality too, right? >> Yeah. >> Data quality is about what's good enough, and defining, you know. >> Right. >> Mm-hmm (affirmative) >> It's not, I liked your, someone, I think you said, it's not about data quality, it's about, you know it's, you got to understand what good enough is, and it's really about, you know, what is the state of the data, it's really about understanding the data, right? More than it is about perfection. There are some cases, especially in banking, where you need perfection, but there's tons of cases where you don't. And you shouldn't spend a lot of resources on something that's not value added. And I think it's important to do, even things like, data quality, around a specific use case so that you do it right. >> And what you were saying too, is that it's good enough but then that, that standard is changing too, all the time. >> Yeah and that changes over time and it's, you know, if you drive it by use case and not just, we get this boil-the-ocean kind of approach where all data needs to be perfect. And all data will never be perfect. And back to your question about processes, usually, a data quality issue, is not a data issue, it's a process issue. You get bad data quality because a process is broken or it's not working for a business or it's changed and no one's documented it so there's a work around, right? And so that's really where your data quality issues come from. Um, and I think that's important to remember.
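The "good enough, scoped to critical data" idea discussed here can be sketched as rule-based checks: each critical data element gets its own validity test and completeness threshold, rather than a blanket demand for perfection. The records, field names, and thresholds below are hypothetical, purely to show the shape of such a check.

```python
# Hypothetical records and data-quality rules for two "critical data elements".
records = [
    {"account_id": "A-100", "balance": 2500.0},
    {"account_id": "A-101", "balance": None},
    {"account_id": None,    "balance": 90.0},
]

rules = {
    # field: (validity check, minimum share of valid rows that is "good enough")
    "account_id": (lambda v: isinstance(v, str) and v.startswith("A-"), 1.00),
    "balance":    (lambda v: isinstance(v, (int, float)), 0.60),
}

def quality_report(rows, rules):
    """Score each critical field against its own 'good enough' threshold."""
    report = {}
    for field, (is_valid, threshold) in rules.items():
        ratio = sum(1 for r in rows if is_valid(r.get(field))) / len(rows)
        report[field] = {"valid_ratio": round(ratio, 2),
                         "passes": ratio >= threshold}
    return report

report = quality_report(records, rules)
```

Note that a failing check is usually the start of the conversation, not the end: as discussed above, the root cause is most often a broken or undocumented upstream process rather than the data itself.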
So it's really generating a lot of conversation across lines of business so that we can start talking about data in more of a shared way versus more of a business by business point of view. So those conversations are important by-products I would say of the individual data quality efforts that we're doing across the bank. >> Well, and of course, you're in a regulated business so you can have the big hammer of hey, we've got regulations, so if somebody spins up a Hadoop Cluster in some line of business you can reel 'em in, presumably, more easily, maybe not always. Seth you operate in an unregulated business. You consult with clients that are in unregulated businesses, is that a bigger challenge for you to reel in? >> So, I think, um, I think that's changing. >> Mm-hmm (affirmative) >> You know, there's new regulations coming out in Europe that basically have global impact, right? This whole GDPR thing. It's not just if you're based in Europe. It's if you have a subject in Europe and that's an employee, a contractor, a customer. And so everyone is subject to regulations now, whether they like it or not. And, in fact, there was some level of regulation even in the U.S., which is kind of the wild, wild, west when it comes to regulations. But I think, um, you should, even doing it because of regulation is not the right answer. I mean it's a great stick to hold up. It's great to be able to go to your board and say, "Hey if we don't do this, we need to spend this money 'cause it's going to cost us, in the case of GDPR, four percent of our revenue per instance.". Yikes, right? But really it's about what's the value and how do you use that information to drive value. A lot of these regulation are about lineage, right? Understanding where your data came from, how it's being processed, who's doing what with it. A lot of it is around quality, right? >> Yep. >> And so these are all good things, even if you're not in a regulated industry. 
And they help you build a better connection with your customer, right? I think lots of people are scared of GDPR. I think it's a really good thing because it forces companies to build a personal relationship with each of their clients. Because you need to get consent to do things with their data, very explicitly. No more of these 30 pages, two point font, you know ... >> Click a box. >> Click a box. >> Yeah. >> It's, I am going to use your data for X. Are you okay with that? Yes or no. >> So I'm interested from, to hear from both of you, what are you hearing from customers on this? Because this is such a sensitive topic and, in particularly, financial data, which is so private. What are you, what are you hearing from customers on this? >> Um, I think customers are, um, are, especially us in our industry, and us as a bank. Our relationship with our customer is top priority and so maintaining that trust and confidence is always a top priority. So whenever we leverage data or look for use cases to leverage data, making sure that that trust will not be compromised is critically important. So finding that balance between innovating with data while also maintaining that trust and frankly being very transparent with customers around what we're using it for, why we're using it, and what value it brings to them, is something that we're focused on with, with all of our data initiatives. >> So, big part of your job is understanding how data can affect and contribute to the monetization, you know, of your businesses. Um, at the simplest level, two ways, cut costs, increase revenue. Where do you each see the emphasis? I'm sure both, but is there a greater emphasis on cutting costs 'cause you're both established, you know, businesses, with hundreds of thousands, well in your case, 85 thousand employees. Where do you see the emphasis? Is it greater on cutting costs or not necessarily? >> I think for us, I don't necessarily separate the two. 
Anything we can do to drive more efficiency within our business processes is going to help us focus our efforts on innovative use of data, innovative ways to interact with our customers, innovative ways to understand more about our customers. So, I see them both as, um, I don't see them as mutually exclusive, I see them as contributing to each. >> Mm-hmm (affirmative) >> So our business cases tend to have an efficiency slant to them or a productivity slant to them and that helps us redirect effort to other, other things that provide extra value to our clients. So I'd say it's a mix. >> I mean I think, I think you have to do the cost savings and cost avoidance ones first. Um, you learn a lot about your data when you do that. You learn a lot about the gaps. You learn about how would I even think about bringing external data in to generate that new revenue if I don't understand my own data? How am I going to tie 'em all together? Um, and there's a whole lot of cultural change that needs to happen before you can even start generating revenue from data. And you kind of cut your teeth on that by doing the really simple cost savings, cost avoidance ones first, right? Inevitably, maybe not in the bank, but in most companies it's the supply chain. Let's go find money we can take out of your supply chain. Most companies, if you take out one percent of the supply chain budget, you're talking a lot of money for the company, right? And so you can generate a lot of money to free up to spend on some of these other things.
>> And then there's gain share there 'cause we're going to put that thing there. >> And then there's a gain share and then other people are like, "Well, how do I do that?". And how do I do that, and how do I do that? And it kind of picks up. >> Mm-hmm (affirmative) But I don't think you can jump just to making new revenue. You got to kind of get there iteratively. >> And it becomes a virtuous circle. >> It becomes a virtuous circle and you kind of change the culture as you do it. But you got to start with, I don't, I don't think they're mutually exclusive, but I think you got to start with the cost avoidance and cost savings. >> Mm-hmm (affirmative) >> Great. Well, Seth, Jennifer thanks so much for coming on The Cube. We've had a great conversation. >> Thanks for having us. >> Thanks. >> Thanks you guys. >> We will have more from the IBM CDO Summit in Boston, Massachusetts, just after this. (techno music)
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Seth | PERSON | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
Jennifer | PERSON | 0.99+ |
Rebecca Knight | PERSON | 0.99+ |
Jennifer Gibbs | PERSON | 0.99+ |
Europe | LOCATION | 0.99+ |
Seth Dobrin | PERSON | 0.99+ |
TD Bank | ORGANIZATION | 0.99+ |
Toronto | LOCATION | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
TD | ORGANIZATION | 0.99+ |
42 lines | QUANTITY | 0.99+ |
two | QUANTITY | 0.99+ |
Boston, Massachusetts | LOCATION | 0.99+ |
30 pages | QUANTITY | 0.99+ |
United States | LOCATION | 0.99+ |
one percent | QUANTITY | 0.99+ |
both | QUANTITY | 0.99+ |
two point | QUANTITY | 0.99+ |
U.S. | LOCATION | 0.99+ |
IBM Analytics | ORGANIZATION | 0.99+ |
each line | QUANTITY | 0.99+ |
GDPR | TITLE | 0.99+ |
today | DATE | 0.98+ |
each | QUANTITY | 0.98+ |
85 thousand employees | QUANTITY | 0.98+ |
hundreds of thousands | QUANTITY | 0.98+ |
four percent | QUANTITY | 0.97+ |
first | QUANTITY | 0.97+ |
one place | QUANTITY | 0.97+ |
two ways | QUANTITY | 0.97+ |
about 85 thousand employees | QUANTITY | 0.95+ |
4 years ago | DATE | 0.93+ |
IBM | EVENT | 0.93+ |
IBM CDO Summit | EVENT | 0.91+ |
IBM CDO Strategy Summit | EVENT | 0.91+ |
Data Warehouse | ORGANIZATION | 0.89+ |
billion dollars | QUANTITY | 0.89+ |
IBM Chief Data Officer's | EVENT | 0.88+ |
about 3 | DATE | 0.81+ |
tons of cases | QUANTITY | 0.79+ |
America | ORGANIZATION | 0.77+ |
CDO Strategy Summit 2017 | EVENT | 0.76+ |
single version | QUANTITY | 0.67+ |
Data Officer | PERSON | 0.59+ |
Cube | ORGANIZATION | 0.58+ |
money | QUANTITY | 0.52+ |
lot | QUANTITY | 0.45+ |
The Cube | ORGANIZATION | 0.36+ |
Gaurav Seth, Microsoft | Node Summit 2017
(switch clicking) >> Hey, welcome back, everybody. Jeff Frick, here with theCUBE. We're at the Mission Bay Conference Center in downtown San Francisco at Node Summit 2017. TheCUBE's been coming here for a number of years. In fact, Ryan Dahl's one of our most popular interviews in the history of the show, talking about Node. And, the community's growing, the performance is going up and there's a lot of good energy here, so we're excited to be here and there's a lot of big companies that maybe you would or wouldn't expect to be involved. And, we're excited to have Gaurav Seth. He is the Product Manager for Several Things JavaScript. I think that's the first time we've ever had that title on. He's from Microsoft. Thanks for stopping by. >> Yeah, hey, Jeff, nice to be here. Thanks for having me over. >> Absolutely, >> Yes. >> so let's just jump right into it. What is Microsoft doing here in such a big way? >> So, one of the things that Microsoft is, like, I think we really are, now, committed and, you know, we have the mantra that we are trying to follow which is any app, any developer, any platform. You know, Node actually is a great growing community and we've been getting soaked more and more and trying to help the community and build the community and play along and contribute and that's the reason that brings us here, like, it's great to see the energy, the passion with people around here. It's great to get those connections going, have those conversations, hear from the customers as to what they really need, hear from developers about their needs and then having, you know, a close set of collaboration with the Core community members to see how we can even evolve the project further. >> Right, right, and specifically on Azure, which is interesting. You know, it's been interesting to watch Microsoft really go full bore into cloud, via Azure. >> Right. >> I just talked to somebody the other day, I was talking about 365 being >> Uh huh. 
>> such a game-changer in terms of cloud implementation, as a big company. There was a report that came out about, you know, the path to 20 billion, >> Right. >> so, clearly, Microsoft is not only all-in, but really successfully >> Right. >> executing on that strategy >> Yeah, I mean-- >> and you're a big piece of that. >> Yes, I mean, I think one of the big, big, big pieces, really, is as the developer paradigms are changing, as the app paradigms are changing, you know, how do you really help developers make this transition to a cloud-native world? >> Right, right. >> How do you make sure that the app platforms, the underlying infrastructure, the cloud, the tools that developers use, how do you combine all of them and make sure that you're making it a much easier experience for developers to move on >> Right. >> from their existing paradigms to these new cloud-native paradigms? You know, one of the things we've been doing on the Azure side of the house, especially when we look at Node.js as a platform, is we've been working on making sure that Node.js has a great story across all the different compute models that we support on Azure, starting from, like, hey, if you want to do serverless functions, if you want to do PaaS, if you want to go the container way, if you want to just use VMs, and, in fact, we just announced Azure Container Instances, today, >> Right. >> so some of the work we are doing is really focused on making sure that the developer experiences, as you migrate your workloads from old traditional, monolithic apps, are also getting ready to move to this cloud-native era.
>> you know, it's different, especially as you guys go more heavily into cloud, >> Right. >> you need to be more open to the various tools of the developer community. >> That's absolutely true, and one of the focus areas for us, really, has been, you know, as we think through the cloud-native transition, what are the big pieces, the main open source tools, the frameworks that are available, and how do we provide great experiences for those on Azure? >> Right, right. >> Right, because, at times, people come with the notion that, hey, Azure probably might just be good for .NET or might just be good for Windows, but, you know, the actual fact, today, is really that Azure has a great supporting story for Linux, Azure has a great story for a lot of these open source tools, and we are continuing to grow our story in that perspective. >> Right. >> So, we really want to make sure that open source developers who come and work on our platform are successful. >> And then, specifically for Node, and you're actually on the Board, so you've got >> Right. >> a leadership position, >> Yep. >> when you look at Node.js within the ecosystem of open source projects and the growth that we keep hearing about in the sessions, >> Yep. >> you know, how are you, and you specifically and Microsoft generally, kind of helping to guide the growth of this community and the development of this community as it gets bigger and bigger and bigger? >> Right, I think that's a great question. I think from my perspective, and also Microsoft's perspective, there are a bunch of things we are actually doing to engage with the community, so I'll kind of list out three or four things that we are doing. I think the first and foremost is, you know, we are a participant in the Node.js Foundation. >> Right. >> You know, that's where, like, hey, we kind of look at the administrative stuff.
We are a sponsor, you know, at the needed levels, et cetera, so that's just the initial monetary support, but then it gets to really being a part of the Node Core Committee, like, as we work on some of the Core pieces, as we evolve Node, how can we actually bring more perspectives, more value, into the actual project? So, that's, you know, we have many sets of engineers who are, right now, working across different working groups with Node and helping evolve Node. You know, you might have heard about the N-API effort. We are working with the Diagnostics Working Group, we are working with the Benchmarking Working Group, and, you know, bringing those things along. The third thing that we did, a while back, was we also did this integration of bringing Chakra, which is the JavaScript runtime from Microsoft that powers Microsoft Edge. We made Node work with Chakra because we wanted to bring the power of Node to this new platform called Windows IoT >> Right, right. >> and, you know, the existing Node could not get there because of some of the platform limitations. So, those are like some of the few examples of how we've been actually communicating and contributing. And then, I think the biggest and the foremost for me, really, are the two pillars, like when I think about Microsoft's contribution, it's really, like, you know, the big story or the big pivot for us is, we kind of go create developer tools and help make developers' lives easier by giving them the right set of tools to achieve what they want to achieve in less time, be more productive >> Right, right. >> and the second thing is, really, like the cloud platforms, as things are moving. I think across both of those areas, our focus really has been to make sure that Node as a language, Node as a platform, has great first-class experiences that we can help define. >> Right. Well, you guys are so fortunate. You have such a huge install base of developers, >> Right.
>> but, again, traditionally, it wasn't necessarily cloud application developers and that's been changing >> Yep. >> over time >> Yep. >> and there's such a fierce competition for that guy, >> Yep. >> or gal, who wakes up >> Yep. >> in the morning or not, maybe, the morning, at 10:00, >> Yep. >> has a cup of coffee >> Yep. >> and has to figure out what they're going to develop today >> Right. >> and there's so many options >> Right. >> and it's a fierce competition, >> Right. >> so you need to have an easy solution, you need to have a nice environment, you need to have everything that they want, so they're coding on your stuff and not on somebody else's. >> That's true, I mean, you know, somehow, instead of calling it competition, I have started using this term coopetition because between a lot of the companies and vendors that we talk about, right, it's more about, for all of us, it's working together to grow the community. >> Right. >> It's working together to grow the pie. You know, with open source, it's not really one over the other. It's like the more players you have and the more players who engage with great ideas, I think better things come out of that, so it's all about that coopetition, >> rather than competition, >> Right. >> I would say. >> Well, certainly, around an open source project, here, >> Yes, exactly. >> and we see a lot of big names, >> Exactly. >> but I can tell you, I've been to a lot of big shows where they are desperately trying to attract >> Right, right, yes. >> the developer ecosystem. "Come develop on our platforms." >> Yes, yes. >> So, you're in a fortunate spot, you started, >> Yes, I mean that-- >> not from zero, but open source is different >> Yes. >> and it's an important ethos because it is much more community >> Exactly, exactly. >> and people look at the name, they don't necessarily look at the title >> Exactly. >> or even the company >> Yep, exactly. >> that people work for.
>> Exactly, and I think having more players involved also means, like, it's going to be great for the developer ecosystem, right, because everybody's going to keep pushing for making it better and better, >> Right. >> so, you know, as we grow from a smaller stage to, like, hey, there's actually a lot of enterprise adoption of these use cases and scenarios that people are coming up with, et cetera, it's always great to have more parties involved and more people involved. >> Gaurav, thank you very much >> Yeah. >> and, again, congratulations on your work here in Node. Keep this community strong. >> Sure. >> It looks like you guys are well on your way. >> Yeah. Thanks, Jeff. >> All right. >> Thanks for your time, take care, yeah. >> Gaurav Seth, he's a Project Lead at Microsoft. I'm Jeff Frick. You're watching theCUBE from Node Summit 2017. Thanks for watching. (upbeat synthpop music)
Seth Dobrin, IBM Analytics - IBM Fast Track Your Data 2017
>> Announcer: Live from Munich, Germany; it's The Cube. Covering IBM Fast Track Your Data. Brought to you by IBM. (upbeat techno music) >> For you here at the show, generally; and specifically, what are you doing here today? >> There's really three things going on at the show, three high level things. One is we're talking about our new... How we're repositioning our hybrid data management portfolio, specifically some announcements around DB2 in a hybrid environment, and some highly transactional offerings around DB2. We're talking about our unified governance portfolio; so actually delivering a platform for unified governance that allows our clients to interact with governance and data management kind of products in a more streamlined way, and help them actually solve a problem instead of just offering products. The third is really around data science and machine learning. Specifically we're talking about our machine learning hub that we're launching here in Germany. Prior to this we had a machine learning hub in San Francisco, Toronto, one in Asia, and now we're launching one here in Europe. >> Seth, can you describe what this hub is all about? This is a data center where you're hosting machine learning services, or is it something else? >> Yeah, so this is where clients can come and learn how to do data science. They can bring their problems, bring their data to our facilities, learn how to solve a data science problem in a more team-oriented way; interacting with data scientists, machine learning engineers, basically, data engineers, developers, to solve a problem for their business around data science. These previous hubs have been completely booked, so we wanted to launch them in other areas to try and expand the capacity of them. >> You're hosting a round table today, right, on the main tent? >> Yep. >> And you've got a customer on, you guys are going to be talking about sort of applying practices in financial and other areas. Maybe describe that a little bit.
We have a customer on from ING, Heinrich, who's the chief architect for ING. ING, IBM, and Hortonworks have a consortium, if you would, or a framework that we're doing around Apache Atlas and Ranger, as the kind of open-source operating system for our unified governance platform. So much as IBM has positioned Spark as a unified, kind of open-source operating system for analytics, for a unified governance platform... For a governance platform to be truly unified, you need to be able to integrate metadata. The biggest challenge about connecting your data environments, if you're an enterprise that was not internet born, or cloud born, is that you have proprietary metadata platforms that all want to be the master. When everyone wants to be the master, you can't really get anything done. So what we're doing around Apache Atlas is we are setting up Apache Atlas as kind of a virtual translator, if you would, or a dictionary between all the different proprietary metadata platforms so that you can get a single unified view of your data environment across hybrid clouds, on premises, in the cloud, and across different proprietary vendor platforms. Because it's open-sourced, there are these connectors that can go in and out of the proprietary platforms. >> So Seth, you seem like you're pretty tuned in to the portfolio within the analytics group. How are you spending your time as the Chief Data Officer? How do you balance it between customer visits, maybe talking about some of the products, and then your sort of day job? >> I actually have three day jobs. My job's actually split into kind of three pieces. The first, my primary mission, is really around transforming IBM's internal business unit, internal business workings, to use data and analytics to run our business. So kind of internal business unit transformation. Part of that business unit transformation is also making sure that we're compliant with regulations like GDPR and other regulations.
Another third is really around kind of rethinking our offerings from a CDO perspective. As a CDO, and as you know, Dave, I've only been with IBM for seven months. As a former client recently, and as a CDO, what is it that I want to see from IBM's offerings? We kind of hit on it a little bit with the unified governance platform, where I think IBM makes fantastic products. But as a client, if a salesperson shows up to me, I don't want them selling me a product, 'cause if I want an MDM solution, I'll call you up and say, "Hey, I need an MDM solution. Give me a quote." What I want is them showing up and saying, "I have a solution that's going to solve your governance problem across your portfolio." Or, "I'm going to solve your data science problem." Or, "I'm going to help you master your data, and manage your data across all these different environments." So really working with the offering management and the Dev teams to define what are these three or four kind of business platforms that we want to settle on. We know three of them at least, right? We know that we have hybrid data management. We have unified governance. We have data science and machine learning, and you could think of the Z franchise as a fourth platform. >> Seth, can you net out how governance relates to data science? 'Cause there is governance of the statistical models, machine learning, and so forth, version control. I mean, in an end to end machine learning pipeline, there's various versions of various artifacts that have to be managed in a structured way. Is your unified governance bundle, or portfolio, does it address those requirements? Or just the data governance? >> Yeah, so the unified governance platform really kind of focuses today on data governance and how good data governance can be an enabler of rapid data science.
So if you have your data all pre-governed, it makes it much quicker to get access to data and understand what you can and can't do with data; especially being here in Europe, in the context of the EU GDPR. You need to make sure that your data scientists are doing things that are approved by the user, because basically your data, you have to give explicit consent to allow things to be done with it. But long term vision is that... essentially the output of models is data, right? And how you use and deploy those models also needs to be governed. So the long term vision is that we will have a governance platform for all those things, as well. I think it makes more sense for those things to be governed in the data science platform, if you would. And we... >> We often hear, separate from GDPR and all that, something called algorithmic accountability; that's being discussed more and more in policy circles, in government circles around the world, and it's strongly related to everything you're describing. Being able to trace the lineage of any algorithmic decision back to the data, the metadata, and so forth, and the machine learning models that might have driven it. Is that where IBM's going with this portfolio? >> I think that's the natural extension of it. We're thinking really in the context of them as two different pieces, but if you solve them both and you connect them together, then you have that problem solved. But I think you're absolutely right. As we're leveraging machine learning and artificial intelligence, in general, we need to be able to understand how we got to a decision, and that includes the model, the data, how the data was gathered, how the data was used and processed. So it is that entire pipeline, 'cause it is a pipeline. You're not doing machine learning or AI in a vacuum. You're doing it in the context of the data, and you're doing it in the context of the individuals or the organizations that you're trying to influence with the output of those models.
>> I call it Dev ops for data science. >> Seth, in the early Hadoop days, the real headwind was complexity. It still is, by the way. We know that. Companies like IBM are trying to reduce that complexity. Spark helps a little bit. So the technology will evolve, we get that. It seems like one of the other big headwinds right now is that most companies don't have a great understanding of how they can take data and monetize it, turn it into value. Most companies, many anyway, make the mistake of, "Well, I don't really want to sell my data," or, "I'm not really a data supplier." And they're kind of thinking about it, maybe not in the right way. But we seem to be entering a next wave here, where people are beginning to understand that I can cut costs, I can do predictive maintenance, I can maybe not sell the data, but I can enhance what I'm doing and increase my revenue, maybe my customer retention. They seem to be tuning in, more so; largely, I think, 'cause of the chief data officer roles, helping them think that through. I wonder if you would give us your point of view on that narrative. >> I think what you're describing is kind of the digital transformation journey. I think the end game, as enterprises go through a digital transformation, is how do I sell services, outcomes, those types of things. How do I sell an outcome to my end user? That's really the end game of a digital transformation in my mind. But before you can get to that, before you transform your business's objectives, there's a couple of intermediary steps that are required for that. The first is what you're describing: those kinds of data transformations. Enterprises need to really get a handle on their data and become data driven, and start then transforming their current business model; so how do I accelerate my current business leveraging data and analytics? I kind of frame that as the data science kind of transformation aspect of the digital journey.
Then the next aspect of it is how do I transform my business and change my business objectives? Part of that first step is, in fact, how do I optimize my supply chain? How do I optimize my workforce? How do I optimize my goals? How do I get to my current, you know, the things that Wall Street cares about for business; how do I accelerate those, make those faster, make those better, and really put my company out in front? 'Cause really in the grand scheme of things, there's two types of companies today; there's the company that's going to be the disruptor, and there are companies that are going to get disrupted. Most companies want to be the disruptors, and it's a process to do that. >> So the accounting industry doesn't have standards around valuing data as an asset, and many of us feel as though waiting for that is a mistake. You can't wait for that. You've got to figure it out on your own. But again, it seems to be somewhat of a headwind because it puts data and data value in this fuzzy category. But there are clearly the data haves and the data have-nots. What are you seeing in that regard? >> I think the first... When I was in my former role, my former company went through an exercise of valuing our data and our decisions. I'm actually doing that same exercise at IBM right now. We're going through IBM, at least in the analytics business unit, the part I'm responsible for, and going to all the leaders and saying, "What decisions are you making?" "Help me understand the decisions that you're making." "Help me understand the data you need to make those decisions." And that does two things. Number one, it does get to the point of, how can we value the decisions? 'Cause each one of those decisions has a specific value to the company. You can assign a dollar amount to it. But it also helps you change how people in the enterprise think.
Because the first time you go through and ask these questions, they talk about the dashboards they want to help them make their preconceived decisions, validated by data. They have a preconceived notion of the decision they want to make. They want the data to back it up. So they want a dashboard to help them do that. So when you come in and start having this conversation, you kind of stop them and say, "Okay, what you're describing is a dashboard. That's not a decision. Let's talk about the decision that you want to make, and let's understand the real value of that decision." So you're doing two things: you're building a portfolio of decisions that then becomes, to your point, Jim, about Dev ops for data science, your backlog for your data scientists, in the long run. You then connect those decisions to data that's required to make those, and you can extrapolate the data for each decision to the component that each piece of data makes up of it. So you can group your data logically within an enterprise; customer, product, talent, location, things like that, and you can assign a value to those based on decisions they support. >> Jim: So... >> Dave: Go ahead, please. >> As a CDO, following on that, are you also, as part of that exercise, trying to assess the value of not just the data, but of data science as a capability? Or particular data science assets, like machine learning models? In the overall scheme of things, that kind of valuation can then drive IBM's decision to ramp up their internal data science initiatives, or redeploy it, or, give me a... >> That's exactly what happened. As you build this portfolio of decisions, each decision has a value. So I am now assigning a value to the data science models that my team will build. CDOs are a relatively new role in many organizations. When money gets tight, they say, "What's this guy doing?" (Dave laughing) Having a portfolio of decisions that's saying, "Here's real value I'm adding..."
So, number one, "Here's the value I can add in the future," and as you check off those boxes, you can kind of go and say, "Here's value I've added. Here's where I've changed how the company's operating. Here's where I've generated X billions of dollars of new revenue, or cost savings, or cost avoidance, for the enterprise." >> When you went through these exercises at your previous company, and now at IBM, are you using standardized valuation methodologies? Did you kind of develop your own, or come up with a scoring system? How'd you do that? >> I think there's some things around, like net promoter score, where there's pretty good standards on how to assign value to increases in net promoter score, or decreases in net promoter score for certain aspects of your business. In other ways, you need to kind of decide as an enterprise, how do we value our assets? Do we use a three year, five year, ten year NPV? Do we use some other metric? You need to kind of frame it in the reference that your CFO is used to talking about so that it's in the context that the company is used to talking about. Most companies, it's net present value. >> Okay, and you're measuring that on an ongoing basis. >> Seth: Yep. >> And fine tuning as you go along. Seth, we're out of time. Thanks so much for coming back on The Cube. It was great to see you. >> Seth: Yeah, thanks for having me. >> You're welcome, good luck this afternoon. >> Seth: Alright. >> Keep it right there, buddy. We'll be back. Actually, let me run down the day here for you, just take a second to do that. We're going to end our Cube interviews for the morning, and then we're going to cut over to the main tent. So in about an hour, Rob Thomas is going to kick off the main tent here with a keynote, talking about where data goes next. Hilary Mason's going to be on. There's a session with Dez Blanchfield on data science as a team sport. Then the big session on changing regulations, GDPR.
Seth, you've got some customers that you're going to bring on and talk about these issues. And then, sort of balancing act, the balancing act of hybrid data. Then we're going to come back to The Cube and finish up our Cube interviews for the afternoon. There's also going to be two breakout sessions; one with Hilary Mason, and one on GDPR. You got to go to IBMgo.com and log in and register. It's all free to see those breakout sessions. Everything else is open. You don't even have to register or log in to see that. So keep it right here, everybody. Check out the main tent. Check out siliconangle.com, and of course IBMgo.com for all the action here. Fast track your data. We're live from Munich, Germany; and we'll see you a little later. (upbeat techno music)
Seth Dobrin, IBM - IBM CDO Strategy Summit - #IBMCDO - #theCUBE
>> (lively music) >> [Narrator] Live, from Fisherman's Wharf in San Francisco, it's theCUBE. Covering IBM Chief Data Officers Strategy Summit Spring 2017. Brought to you by IBM. >> Hey, welcome back everybody. >> Jeff Frick here with theCUBE alongside Peter Burris, our chief research officer from Wikibon. We're at the IBM Chief Data Officers Strategy Summit Spring 2017. It's a mouthful but it's an important event. There's 170 plus CDO's here sharing information, really binding their community, sharing best practices and of course, IBM is sharing their journey, which is pretty interesting 'cause they're taking their own transformational journey, writing up a blueprint and going to deliver it in October. Drinking their own champagne, as they like to say. We're really excited to have CUBE alumni, many time visitor Seth Dobrin. He is the chief data officer of IBM Analytics. Seth, welcome. >> Yeah, thanks for having me again. >> Absolutely, so again, these events are interesting. There's a series of them. They're in multiple cities. They're, now, going to go to multiple countries. And it's really intended, I believe, or tell me, it's a learning experience in this great, little, tight community for this, very specific, role. >> Yeah, so these events are, actually, really good. I've been participating in these since the second one.
Not just the role, per se, but some of the specific challenges or implementation issues that these people have had in trying to deliver value inside their company. >> Yeah, so when they started, three years ago, there, really, were not a whole lot of tools that CDO's could use to solve your data science problems, to solve your cloud problems, to solve your governance problem. We're starting to get to a place in the world where there are actual tools out there that help you do these things. So you don't struggle to figure out how do I find talent that can build the tools internally and deploy 'em. It's now getting the talent to, actually, start implementing things that already exist. >> Is the CDO job well enough defined at this point in time? Do you think that you can, actually, start thinking about tools as opposed to the challenges of the business? In other words, is every CDO different, or are the practices becoming a little bit more standard and the conventions becoming a little bit better understood and stable, so you can do a better job of practicing the CDO role? >> Yeah, I think today, the CDO role is still very ill defined. It's, really, industry by industry and company by company even; CDO's play different roles within each of those. I've only been with IBM for the last four months. I've been spending a lot of that time talking to our clients. Financial services, manufacturing, all over the board and really, the CDO's in those companies are all industry specific, they're in different places and even company by company, they're in different places. It really depends on where the companies are on their data and digital journey what role the CDO has. Is it really a defensive play to make sure we're not going to violate any regulations, or is it an offensive play and how do we disrupt our industry instead of being disrupted? Because, really, every industry is in a place where you're either going to be the disruptor or you're going to be the disruptee.
And so, that's the scope, the breadth of, I think, the role the CDO plays. >> Do you see it all eventually converging to a common point? 'Cause, obviously, the CFO and the CMO, those are pretty well standardized functions now; over time, that wasn't always the way. >> Well, I sure hope it does. I think CDO's are becoming pretty pervasive. I think you're starting to see, when this started, the first one I went to, there were, literally, 35 people and only half of them were called CDO's. We've progressed now to where we've got over 170-some-odd people that are here that are CDO's. Most of them have the CDO title even. >> The fact that that title is much more pervasive says that we're heading that way. I think industry by industry you'll start seeing similar responsibilities for CDO's, but I don't think you'll start seeing it across the board like a CFO, where a CFO does the same thing regardless of the industry. I don't think you'll see that in a CDO for quite some time. >> Well one of the things, certainly, we find interesting is the role that data's playing in business. And part of the CDO's job is to explain to his or her peers, at that chief level, how using data is going to change the way that they do things, the way that their function works. And that's part of the reason, I think, why you're suggesting that on a vertical basis the CDO's job is different. 'Cause different industries are being impacted themselves by data differently. So as you think about the job that you're performing and the job the CDO's are performing, what part is technical? What part is organizational? What part is political? Et cetera. >> I think a lot of the role of a CDO is political. Most of the CDO's that I know have built their careers on stomping on people's toes. How do I drive change by infringing on other people's turf effectively? >> Peter: In a nice way. >> Well, it depends. In the appropriate way, right? >> Peter: In a productive way.
>> In the appropriate way. It could be nice, it could not be nice, depending on the politics and the culture of the organization. I think a lot of the role of a CDO, it's, almost, like chief disruption officer as much as it is data officer. I think it's a lot about using data but, I think, more importantly, it's about using analytics. >> So how do you use analytics to, actually, drive insights and next best action from the data? I think just looking at data and still using gut based on data is not good enough. For chief data officers to really have an impact and really be successful, it's how do you use analytics on that data, whether it's machine learning, deep learning, operations research, to really change how the business operates? Because as chief data officers, you need to justify your existence a lot. The way you do that is you tie real value to decisions that your company is making: the data and the analytics that are needed for those decisions. That's, really, the role of a CDO in my mind: how do I tie value of data based on decisions, and how do I use analytics to make those decisions more effective? >> Were the early days more defensive, and now, shifting to offensive? It sounds like it. That's a typical case where you use technology, initially, often to save money before you start to use it to create new value, new revenue streams. Is that consistent here? By answering that, you say they have to defend themselves sometimes, when you would think it'd be patently obvious that if you're not getting on a data, software-defined train, you're going to be left behind. >> I think there's two types. There's CDO's that are there to protect freedom to operate, and that's what I think of as defensive. And then, there's offensive CDO's and that's really bringing more value out of existing processes. In my mind, every company is on this digital transformation journey and there's two steps to it.
>> One is this data science transformation, which is where you use data and analytics to accelerate your business's current goals. How do I use data and analytics to accelerate my business's march towards its current goals? Then there's the second stage, which is the true digital transformation, which is how do I use data and analytics to, fundamentally, change how my industry and my company operates? So, actually, changing the goals of the industry. For example, moving from selling physical products to selling outcomes. You can't do that until you've done this data transformation, till you've started operating on data, till you've started operating on analytics. You can't sell outcomes until you've done that. It's this two step journey. >> You said this a couple of times and I want to test an idea on you and see what you think. Industry classifications are tied back to assets. So, you look at industries and they have common organization of assets, right? >> Seth: Yep. >> Data, as an asset, has very, very, different attributes because it can be shared. It's not scarce, it's something that can be shared. As we become more digital, and as this notion of data as an asset and analytics as an asset becomes more pervasive, does that start to change the notion of industry? Because, now, by using data differently, you can use other assets and deploy other assets differently. >> Yeah, I think it, fundamentally, changes how business operates and even how businesses are measured, because you hit on this point pretty well, which is data is reusable. And so as I build these data or digital assets, the quality of a company's margins should change. For every dollar of revenue I generate, maybe today I generate 15% profit. As you start moving to being a more digital company built on data and analytics, that percent of profit based on revenue should go up. Because these assets that you're building are extremely cheap to reuse.
I don't have to build another factory to scale up, I buy a little bit more compute time. Or I develop a new machine learning model. And so it's very scalable, unlike building physical products. I think you will see a fundamental shift in how businesses are measured, in what standards investors hold businesses to. I think another good point is, a mindset shift that needs to happen for companies is that companies need to stop thinking of data as a digital dropping of applications and start thinking of it as an asset. 'Cause data has value. It's no longer just something that's dropped on the table from applications that I built. It's that we are building to, fundamentally, create data to drive analytics, to generate value, to build new revenue for a company that didn't exist today. >> Well the thing that changes the least, ultimately, is the customer. And so it suggests that companies that have customers can use data to get into new product or new service domains faster than companies who don't think about data as an asset and are locked into how can I take my core setup, my organization, my plant, my machinery and keep stamping out something that's common to it or similar to it. So this notion of customer becomes the driver, increasingly, of what industry you're in or what activities you perform. Does that make sense? >> I think everything needs to be driven from the perspective of the customer. As you become a data driven or a digital company, everything needs to be shifted in that organization from the perspective of the customer. Even companies that are B to B. B to B companies need to start thinking about what is the ultimate end user. How are they going to use what I'm building? For my business partner, my B to B partner, what is the actual human being that's sitting down using it, how are they going to use it? How are they going to interact with it? It really, fundamentally, changes how businesses approach B to B relationships.
It, fundamentally, changes the type of information that, if I'm a B to B company, how do I get more information about the end users and how do I connect? Even if I don't come in direct contact with them, how do I understand how they're using my product better? That's a fundamental shift, just like you need to stop thinking of data as a digital dropping. Every question needs to come from how is the end user, ultimately, going to use this? How do I better deploy that? >> So the utility that the customer gets, capturing data about the use of that, the generation of that utility, and driving it all the way back. Does the CDO have to take a more explicit role in getting people to see that? >> Yes, absolutely. I think that's part of the cultural shift that needs to happen. >> Peter: So how does the CDO do that? >> I think every question needs to start with what impact does this have on the end user? What is the customer perspective on this? Really starting to think about-- >> I'm sorry for interrupting. I'd turn that around. I would say it's what impact does the customer have on us? Because you don't know unless you capture data. That notion of the customer impact measurement, which we heard last time, the measurability, and then drive that all the way back. That seems like it's going to become, increasingly, a central design point. >> Yeah, it's a loop and you got to start using these new methodologies that are out there. These design thinking methodologies. It's not just about building an Uber app. It's not just about building an app. It's about how do I, fundamentally, shift my business to this design thinking methodology, 'cause that's what design thinking is all about. It's all about how is this going to be used? And every aspect of your business you need to approach that way. >> Seth, I'm afraid they're going to put us in the chafing dish here if we don't get off soon. >> Seth: I think so too, yeah. >> So we're going to leave it there.
It's great to see you again and we look forward to seeing you at the next one of these things. >> Yeah, thanks so much. >> He's Seth, he's Peter, I'm Jeff. You're watching theCUBE from the IBM Chief Data Officers Strategy Summit Spring 2017, I got it all in in a mouthful. We'll be back after lunch, which they're setting up right now. (laughs) (lively music) (drum beats)
Seth Dobrin, IBM - IBM Interconnect 2017 - #ibminterconnect - #theCUBE
>> Announcer: Live from Las Vegas, it's theCUBE, covering InterConnect 2017. Brought to you by IBM. >> Okay welcome back everyone. We are here live in Las Vegas from Mandalay Bay for IBM InterConnect 2017. This is theCUBE's three day coverage of IBM InterConnect. I'm John Furrier with my co-host Dave Vellante. Our next guest is Seth Dobrin, Vice President and Chief Data Officer for IBM Analytics. Welcome to theCUBE, welcome back. >> Yeah, thanks for having me again. I love sittin' down and chattin' with you guys. >> You're a CDO, Chief Data Officer, and that's a really pivotal role because you got to look, as a chief, over all of the data with IBM Analytics. Also you have customers you're delivering a lot of solutions to, and it's cutting edge. I like the keynote on day one here. You had Chris Moody at Twitter. He's a data guy. >> Seth: Yep. >> I mean you guys have a deal with Twitter so he got more data. You've got The Weather Company, you got that data set. You have IBM customer data. You guys are full of data right now. >> We've got a front-row seat with data, and that's a good thing. >> So what's the strategy and what are you guys working on, and what are the key points that you guys are honing in on? Obviously, Cognitive to the Core is Rometty's theme. How are you guys making data work for IBM and your customers? >> If you think about IBM Analytics, we're really focusing on five key areas, five things that we think, if we get right, will help our clients learn how to drive their business and data strategies right. One is around how do I manage data across hybrid environments? So what's my hybrid data management strategy? It used to be how do I get to public cloud, but really what it is, it's a conversation about every enterprise has their business critical assets, what people call legacy. If we call them business critical and we think about-- These are how companies got here today. This is what they make their money on today.
The real challenge is how do we help them tie those business critical assets to their future state cloud, whether it's public cloud, private cloud, or something in between, our hybrid cloud. One of the key strategies for us is hybrid data management. Another one is around unified governance. If you look at governance in the past, governance in the past was an inhibitor. It was something that people went (groan) "Governance, so I have to do it." >> Barbed wire. >> Right, you know. When I've been at companies before and thought about building a data strategy, we spent the first six months of building the data strategy trying to figure out how to avoid data governance, or the word data governance, and really, we need to embrace data governance as an enabler. If you do it right, if you do it upfront, if you wrap in things that include model management, how do I make sure that my data scientists can get to the data they need upfront by classifying data ahead of time, understanding entitlements, understanding what the intent was when people gave consent. You also take out of the developer's hands the need to worry about governance, because now, in a unified governance platform, it's all API-driven. Just like our applications are all API-driven, how do we make our governance platform API-driven? If I'm an application developer, by the way, I'm not, I can now call an API to manage governance for me, so I don't need to worry about am I giving away the shop. Am I going to get the company sued? Am I going to get fired? Now I'm calling an API. That's only two of them, right? The third one is really around data science and machine learning. So how do we make machine learning pervasive across enterprises, with things like Data Science Experience, Watson, IBM Machine Learning. We're now bringing that machine-learning capability to the private cloud, because 90% of data that exists can't be Googled; it's behind firewalls. How do we bring machine learning to that? >> One more!
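The governance-as-an-API pattern Dobrin describes can be sketched in a few lines. This is a hypothetical illustration only; the policy store, datasets, and function name are invented, not an IBM interface. The point is that the application asks a governance service for a decision instead of encoding consent policy itself.

```python
# Hypothetical sketch of "governance as an API": the application calls a
# governance check instead of embedding policy logic. All names here are
# illustrative, not a real product interface.

POLICIES = {
    # dataset -> purposes that recorded consent covers
    "customer_emails": {"billing", "support"},
    "clickstream": {"analytics", "personalization"},
}

def check_access(dataset: str, purpose: str) -> bool:
    """Return True only if the stated purpose matches recorded consent."""
    allowed = POLICIES.get(dataset, set())
    return purpose in allowed

# The developer never reasons about consent directly; the service decides.
print(check_access("customer_emails", "billing"))    # True
print(check_access("customer_emails", "marketing"))  # False
```

In a real platform the lookup would be a call to a central governance service, so policy changes take effect without redeploying applications.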
>> One more! That's around, God, I gave you quite a list-- >> Hybrid data management, unified governance, data science and machine learning-- >> Oh, the other one is Open Source, our commitment to Open Source. Our commitment to Open Source, like Hadoop, Spark. As we think about unified governance, a truly unified governed platform needs to be built on top of Open Source, so IBM is doubling down on our commitment to Apache Spark as a framework backbone, a metadata framework for our unified governed platform. >> What's the biggest para-- >> Wait, did we miss one? Hybrid data management, unified governance, data science machine learning (talking over one another), pervasive, and open source. >> That's four. >> I thought it was five. >> No. >> Machine learning and data science are two, so technically five. >> There's only four. If I said five, there's only four. >> Cover the data governance thing, because this unification is interesting to me. One of the things we see in the marketplace is people hungry for data ops. Like what DevOps was for cloud, a whole application development model, there's a new developer persona emerging where I want to code and I want to just tap data handled by brilliant people and cognitive engines that just serve me up what I need, like a routine or a procedure, or a subroutine, whatever you want to call it. That's a data DevOps model kind of thing. How will you guys do it? Do you agree with that and how does that play out? >> That's a combination, in my mind, that's a combination of an enterprise creating data assets, so treating data as the asset it is and not a digital dropping of applications, and it's that combined with metadata. It gets back to the Apache Atlas conversation. If you want to understand your data and know where it is, it's a metadata problem.
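The "it's a metadata problem" point lends itself to a small sketch. The catalog below is a toy in the spirit of, but not the actual schema of, Apache Atlas: each dataset records where it lives, what it derives from, and who is entitled to it, so a lineage question becomes one call. All dataset names and fields are invented for illustration.

```python
# Toy single metadata catalog: location, lineage, and entitlements per
# dataset. The structure is illustrative, not the Apache Atlas type system.

catalog = {
    "raw_orders":        {"location": "on-prem/db2", "lineage": [],               "entitlements": {"etl"}},
    "clean_orders":      {"location": "cloud/s3",    "lineage": ["raw_orders"],   "entitlements": {"etl", "data_science"}},
    "churn_model_input": {"location": "cloud/s3",    "lineage": ["clean_orders"], "entitlements": {"data_science"}},
}

def full_lineage(name):
    """Walk upstream ancestry so 'where did this data come from' is one query."""
    result = []
    for parent in catalog[name]["lineage"]:
        result.extend(full_lineage(parent))
        result.append(parent)
    return result

print(full_lineage("churn_model_input"))  # ['raw_orders', 'clean_orders']
```

Because everything hangs off one catalog, the same lookup can answer "what can I do with it" (entitlements) and "where does it live" (location) without touching the data itself.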
What's the data; what's the lineage; where is it; where does it live; how do I get to it; what can I and can't I do with it? That just reinforces the need for an Open Source, ubiquitous metadata catalog, a single catalog, and then a single catalog of policies associated with that, all driven in a composable way through APIs. >> That's a fundamental, cultural thinking shift because you're saying, "I don't want to just take exhaust from apps, which is just how people have been dealing with data." You're saying, "Get holistic and say you need to create an asset class or layer or something that is designed." >> If enterprises are going to be successful with data, now we're getting to five things, right, so there's five things. They need to treat data as an asset. It's got to be a first-class citizen, not a digital dropping, and they need a strategy around it. So what are, conceptually, what are the pieces of data that I care about? My customers, my products, my talent, my finances, what are the limited number of things. What is my data science strategy? How do I build deployable data science assets? I can't be developing machine-learning models and deploying them in Excel spreadsheets. They have to be integrated into my processes. I have to have a cloud strategy: am I going to be on premise? Am I going to be off premise? Am I going to be something in between? I have to get back to unified governance. I have to govern it, right? Governing in a single place is hard enough, let alone multiple places, and then my talent disappears. >> Could you peg a progress bar of the industry on where these would be, what you just said, because, I think-- >> Dave: Again, we only got through four. >> No, talent was the last one.
>> In the progress bar of work, how are the enterprises right now 'cause actually the big conversation on the cloud side is enterprise-readiness, enterprise-grade, that's kind of an ongoing conversation, but now, if you take your premise, which I think is accurate, is that I got to have a centralized data strategy and platform, not a data (mumbles), more than that, software, et cetera, where's the progress bar? Where are people, Pegeninning? >> I think they are all over the map. I've only been with IBM for four months and I've been spending much of that time literally traveling around the world talking to clients, and clients are all over the map. Last week I spent a week in South America with a media company, a cable company down there. Before setting up the meeting, the guy was like, "Well, you know, we're not that far along "down this journey," and I was like, "Oh, my God, "you guys are like so far ahead of everyone else! "That's not even funny!" And then I'm sitting down with big banks that think they're like way out there and they haven't even started on the journey. So it's really literally all over the place and it's even within industry. There's financial companies that are also way out there. There's another bank in Brazil that uses biometrics to access ATMs, you don't need a pin anymore. They have analytics that drive all that. That's crazy. We don't have anything like that here. >> Are you meeting with CDOs? >> Yeah, mostly CDOs, or kind of defacto like we talked about before this show. Mostly CDOs. >> So you may be unique in the sense that you are working for a technology company, so a lot of your time is outward focused, but when you travel around and meet with the CDOs, how much of their time is inward-focused versus outward-focused? >> My time is actually split between inward and outward focus because part of my time is transforming our own business using data and analytics because IBM is a company and we got to figure out how to do that. 
Is it correct that yours is probably a higher percentage outward? >> Mine's probably a higher percentage outward than most CDOs, yeah. So I think most CDOs are 70 to 80% inward-focused and 20% outward-focused, and a lot of that outward focus is just trying to understand what other people are doing. >> I guess it's okay for now, but will that change over time? >> I think that's about right. It gets back to the other conversation we had before the show about your monetization strategy. I think if a company progresses to where it's no longer about how do I change my processes and use data to monetize my internal process, if I'm going to start figuring out how I sell data, then CDOs need to get a more external-- >> But you're supporting the business in that role, and that's largely going to be an internal function of data quality, governance, and the like, like you say, the data science strategy. >> Yeah, and I think it's important, when I talk about data governance, that the things we used to talk about as data management are all part of data governance. Data governance is not just controlling. It's all of that. It's how do I understand my data, how do I provide access to my data. It's all those things you need to enable your business to thrive on data. >> My question for you is a personal one. How did you get to be a CDO? Do you go to a class? I'm going to be a CDO someday. Not that you do that, I'm just-- >> CDO school. >> CDO school. >> Seth: I was staying in a Holiday Inn Express last night. (laughing) >> Tongue in cheek aside, people are getting into CDO roles from interesting vectors, right? Anthropology, science, art, I mean, it's really interesting; math geeks certainly thrive there, but there's not one, I haven't yet seen one, sweet spot. Take us through how you got into it and what-- >> I'm not going to fit any preconceived notion of what a CDO is, especially in a technology company. My background is in molecular and statistical genetics.
Dave: Well, that explains it. >> I'm a geneticist. >> Data has properties that could be kind of biological. >> And actually, if you think about the roots of big data and data science, or big data, at least, the two putative, probably fundamental drivers of the concept of big data were genetics and astrophysics. So 20 years ago when I was getting my PhD, we were dealing with tens and hundreds of gigabyte-sized files. We were trying to figure out how do we get stuff out of 15 Excel files, because they weren't big enough, into a single CSV file. Millions of rows, crude by today's standards, but it was still, how do we do this? And so 20 years ago I was learning to be a data scientist. I didn't know it. I stopped doing that field and I started managing labs for a while, and then in my last role, we kind of transformed how the research group within that company, in the agricultural space, handled and managed data, and I was simultaneously the biggest critic and biggest advocate for IT, and they said, "Hey, come over and help us figure out how to transform the company the way we've transformed this group." >> It looks like, when you talk about your PhD experience, it's almost like you were so stuck in the mud with not having the compute power or the sort of tooling. It's like a hungry man saying "Oh, it's an unlimited abundance of compute, oh, I love what's going on." So you almost get gravitated, pulled into that, right? >> It was funny, I was doing a demo upstairs today with, one of the sales guys was doing a demo with some clients, and in one line of code, they had expressed what was part of my dissertation. It was a single line of code in a script, and it was like, that was someone's entire four-year career 20 years ago. >> Great story, and I think that's consistent with people who just get attracted to it, and they end up being captains of industry. This is a hot field. You guys have a CDO event happening in San Francisco.
We'll be doing some live streaming there. What's the agenda? Because this is a very accelerating field. You mentioned now dealing practically with compliance and governance, which you'd run in the other direction from in the old days; now there's this embracing of that. It's got to get (mumbles) and discipline in management. What's going to go on at the CDO Summit, or do you know? >> At the CDO Summit next week, I think we're going to focus on three key areas, right? What does a cloud journey look like? Maybe four key areas, right. So a cloud journey, how do you monetize data and what does that even mean, and talent; at all these CDO Summits, the IBM CDO Summits have been going on for three or four years now, every one of them has a talent conversation, and then governance. I think those are four key concepts, and not surprising, they were four of the five on my list. I think that's what really we're going to talk about. >> The unified governance, tell us how that happens in your vision, because that's something, you hear unified identity, we hear blockchain, looking at a whole new disruptive way of dealing with value digitally. How do you see the data governance thing unifying? >> Well, I think again, it's around... IBM did a great job of figuring out how to take an Open Source product that was Spark and make it the heart of our products. It's going to be the same thing with governance, where you're going to see Apache Atlas, which is in its infancy right now, having that open backbone so that people can get in and out of it easily. If you're going to have a unified governance platform, it's going to be open by definition, because I need to get other people's products on there. I can't go to an enterprise and say we're going to sell you a unified governance platform, but you got to buy all IBM, or you got to spend two years doing development work to get it on there.
So open is the framework, and composable, API-driven, and pro-active are, I think, the key pieces of it. >> So we all remember the client-server days, where it took a decade and a half to realize, "Oh, my gosh, this is out of control and we need to bring it back in." And the Wild West days of big data, it feels like enterprises have nipped that governance issue in the bud at least; maybe they don't have it under control yet, but they understand the need to get it under control. Is that a fair statement? >> I think they understand the need. The data is so big and grows so fast that another component that I didn't mention, maybe it was implied a little bit, is automation. You need to be able to capture metadata in an automated fashion. We were talking to a client earlier who has 400 terabytes a day of data changes, not even talking about what new data they are ingesting; how do they keep track of that? It's got to be automated. This unified governance, you need to capture this metadata in as automated a fashion as possible. Master data needs to be automated when you think about-- >> And make it available in real time, low-latency, because otherwise it becomes a data swamp. >> Right, it's got to be pro-active, real-time, on-demand. >> Another thing I wanted to ask you, Seth, and get your opinion on, is sort of the mid-2000s, when the federal rules of civil procedure changed and electronic documents and records became admissible; it was always about how do I get rid of data, and that's changed. Everybody wants to keep data and analyze it, and so forth, so what about that balance? And one of the challenges back then was data classification. I can't scale, by governance, I can't eliminate and defensively delete data unless I can classify it.
Is the analog true, where with data as an opportunity, I can't do a good job, or a good enough job, analyzing my data and keeping my data under control without some kind of automated classification? And has the industry solved that? >> I don't think the industry has completely solved it yet, but I think with cognitive tools... there are tools out there, that we have and that other people have, that, if you give them parameters and train them, can automatically classify the data for you, and I think classification is one of the keys. You need to understand how the data's classified so you understand who can access it and how long you should keep it, so it's key, and that's got to be automated also. I think we've done a fair job as an industry of doing that. There's still a whole lot of work, especially as you get into the kind of specialized sectors, so I think that's a key, and we've got to do a better job of helping companies train those things so that they work. I'm a big proponent of don't give your data away to IT companies. It's your asset. Don't let them train their models with your data and sell it to other people. But there are some caveats. There are some core areas where industries need to get together and let IT companies, whether it's IBM or someone else, train models for things just like that, for classification, because if someone gets it wrong, it can bring the whole industry down. >> It's almost an open (talking over each other) source paradigm, almost. It's like Open Source software. Share some data, but I-- >> Right, and there's some key things that aren't differentiating, that as an industry you should get together and share. >> You guys are making, IBM is making, a big deal out of this, and I think it's super important. I think it's probably the top thing that CDOs and CIOs need to think about right now: if I really own my data, and that data is needed to train my big data models, who owns the models, and how do I protect my IP?
>> And are you selling it to my competitors? Are you going down the street and taking away my IP, my differentiating IP, and giving it to my competitor? >> So do I own the model, 'cause the data and models are coming together, and that's what IBM's telling me. >> Seth: Absolutely. >> I own the data and the models that it informs, is that correct? >> Yeah, that's absolutely correct. You guys made the point earlier about IBM bursting at the seams on data. That's really the driver for it. We need to do a key set of training. We need to train our models with content for industries, bring those trained models to companies, and let them train specific versions for their company with their data that, unless there's a reason they tell us to do it, is never going to leave their company. >> I think that's a great point about you being full of data, because a lot of people who are building solutions and scaffolding for data, aka software, are never data-full themselves. The typical "Oh, I'm going to be a software company," and they build something that they don't (mumbles) for. You're data full, so you know the problem. You're living it every day. It's opportunity. >> Yeah, and that's why when a startup comes to you and says, "Hey, we have this great AI algorithm. Give us your data," they want to resell that model, because they don't have access to the content. If you look at what IBM's done with Watson, right? That's why there are specialized verticals where we're focusing Watson: Watson Health, Watson Financial. Because we are investing in data in those areas; you can look at the acquisitions we've done, right. We're investing in data to train those models. >> We should follow up on this, because this brings up the whole scale point. If you look at all the innovators of the past decade, even two decades, Yahoo, Google, Facebook, these are companies that were webscalers before there was anything that they could buy. They built their own because they had their own problem at scale.
>> At scale. >> And data at scale is a whole other mind-blowing issue. Do you agree? >> Absolutely. >> We're going to put that on the agenda for the CDO Summit in San Francisco next week. Seth, thanks so much for joining us on theCUBE. Appreciate it. Chief Data Officer... this is going to be a hot field. The CDO is going to be a very important opportunity for anyone watching in the data field. These are going to be new opportunities. Get that data, get it controlled, taming the data, making it valuable. This is theCUBE, taming all of the content here at InterConnect. I'm John Furrier with Dave Vellante. More content coming. Stay with us. Day Two coverage continues. (innovative music tones)
Seth Dobrin, IBM Analytics - Spark Summit East 2017 - #sparksummit - #theCUBE
>> Narrator: Live from Boston, Massachusetts, this is theCUBE! Covering Spark Summit East 2017. Brought to you by Databricks. Now, here are your hosts, Dave Vellante and George Gilbert. >> Welcome back to Boston, everybody. Seth Dobrin is here; he's the vice president and chief data officer of the IBM Analytics Organization. Great to see you, Seth, thanks for coming on. >> Great to be back, thanks for having me again. >> You're welcome. So chief data officer is the hot title. It was predicted to be the hot title, and now it really is. There are many more of you around the world, and IBM's got an interesting sort of structure of chief data officers. Can you explain that? >> Yeah, so there's a global chief data officer, that's Inderpal Bhandari, and he's been on this podcast or videocast a few times. Then he's set up structures within each of the business units in IBM, where each of the major business units has a chief data officer, also. And so I'm the chief data officer for the analytics business unit. >> So one of Inderpal's things when I've interviewed him is culture. The data culture, you've got to drive that in. And he talks about the five things that chief data officers really need to do to be successful. Maybe you could give us your perspective on how that flows down through the organization, what the key critical success factors are for you, and how you are implementing them. >> I agree, there's five key things, and maybe I frame them a little differently than Inderpal does. There's this whole cloud migration, so every chief data officer needs to understand what their cloud migration strategy is. Every chief data officer needs to have a good understanding of what their data science strategy is. So how are they going to build the deployable data science assets? So not data science assets that are delivered through spreadsheets. Every chief data officer needs to understand what their approach to unified governance is.
So how do I govern all of my platforms in a way that enables that last point about data science? And then there's a piece around people. How do I build a pipeline for today and the future? >> So the people piece is both the skills and, presumably, a relationship with the line of business as well. There's sort of two vectors there, right? >> Yeah, the people piece, when I think of it, is really about skills. There's a whole cultural component that goes across all of those five pieces that I laid out. Finding the right people, with the right skillset, where you need them, is hard. >> Can you talk about cloud migration, why that's so critical and so hard? >> If you look at kind of where the industry's been, the IT industry, it's been this race to the public cloud. I think it's been a little misguided all along. If you look at how business is run, right? Today, enterprises that are not internet-born make their money from what's running their businesses today. So these business-critical assets. And just thinking that you can pick those up and move them to the cloud and take advantage of cloud is not realistic. So the race, really, is to a hybrid cloud. Our futures really lie in how do I connect these business-critical assets to the cloud, and how do I migrate those things to the cloud? >> So Seth, the CIO might say to you, "Okay, let's go there for a minute. I kind of agree with what you're saying, I can't just shift everything into the cloud. But what can I do in a hybrid cloud that I can't do in a public cloud?" >> Well, there's some drivers for that. I think one driver for hybrid cloud is what I just said. You can't just pick everything up and move it overnight; it's a journey. And it's not a six-month journey, it's probably not a year journey, it's probably a multi-year journey. >> Dave: So you can actually keep running your business? >> So you can actually keep running your business. And the other piece is there's new regulations that are coming up.
And these regulations, the EU GDPR is the biggest example of them right now, carry very stiff fines for violations of those policies. And the party that's responsible for paying those fines is the party the consumer engaged with. It's you, it's whoever owns the business. And as a business leader, I don't know that I would very willingly give up and trust just any third party to manage that for me. And so there's certain types of data that some enterprises may never want to move to the cloud, because they're not going to trust a third party to manage that risk for them. >> So it's more transparent from a governance standpoint. It's not opaque. >> Seth: Yup. >> You feel like you're in control? >> Yeah, you feel like you're in control, and if something goes wrong, it's my fault. It's not something that I got penalized for because someone else did something wrong. >> So at the data layer, help us sort of abstract one layer up, to the applications. How would you partition the applications, the ones that are managing that critical data that has to stay on premises? What would you build up, potentially, to complement it in the public cloud? >> I don't think you need to partition applications. The way you build modern applications today, it's all API-driven. You can reduce some of the costs of latency through design. So you don't really need to partition the applications, per se. >> I'm thinking more along the lines that the systems of record are not going to be torn out, and those are probably the last ones, if ever, to go to the public cloud. But other applications leverage them. If that's not the right way of looking at it, where do you add value in the public cloud versus what stays on premise? >> So some of the system of record data, there's no reason you can't replicate some of it to the cloud.
So if it's not this personal information, or highly regulated information, there's no reason that you can't replicate some of that to the cloud. And I think we get caught up in "we can't replicate data, we can't replicate data." I don't think that's the right answer. I think the right answer is to replicate the data if you need to; or, if the data in the system of record is not in the right structure for what I need to do, then let's put the data in the right structure. Let's not have the conversation about how I can't replicate data. Let's have the conversation about where's the right place for the data, where does it make the most sense, and what's the right structure for it? And if that means you've got 10 copies of a certain type of data, then you've got 10 copies of a certain type of data. >> Would you be, on that data, would it typically be other parts of the systems of record that you might have in the public cloud, or would they be new apps, sort of greenfield apps? >> Seth: Yes. >> George: Okay. >> Seth: I think both. And that's part of, I think in my mind, how you build... that question you just asked right there is one of the things that guides how you build your cloud migration strategy. So we said you can't just pick everything up and move it. So how do you prioritize? You look at what you need to build to run your business differently. And you start there, and you start thinking about how do I migrate information to support those to the cloud? And maybe you start by building a local private cloud, so that everything's close together until you kind of master it. And then once you get enough critical mass of data and applications around it, then you start moving stuff to the cloud. >> We talked earlier off camera about reframing governance, Seth. I used to head a CIO consultancy, and we worked with a number of CIOs that were within legal IT, for example, and were worried about compliance and governance and things of that nature.
And their ROI pitch was always to scare the board. But the holy grail was, can we turn governance into something of value for the organization? Can we? >> I think in the world we live in today, with ever-increasing regulations, with a need to be agile, and with everyone needing and wanting to apply data science at scale, you need to reframe governance, right? Governance needs to be reframed from something that is seen as a roadblock to something that is truly an enabler. And not just giving it lip service. And what do I mean by that? For governance to be an enabler, you really have got to think about, how do I, up front, classify my data so that all data in my organization is bucketed into some version of public, proprietary, and confidential? Different enterprises may have 30 classification levels, and some may only have two, or some may have one. And so you do that up front, so you know what can be done with data, when it can be done, and who it can be done with. You need to capture intent. So what are allowed intended uses of data? And as a data scientist, what am I intending to do with this data? So that you can then mesh those two things together. 'Cause that's important in these new regulations I talked about: people give you access to data, their personal data, for an intended purpose. And then you need to be able to apply these governance policies actively. So it's not passive, after the fact, or you've got to stop and you've got to wait; it's leveraging services, leveraging APIs, and building a composable system of policies that are delivered through APIs. So if I want to create a sandbox to run some analytics on, I'm going to call an API to get that data. That API is going to call a policy API that's going to say, "Okay, does Seth have permission to see this data? Can Seth use this data for this intended purpose?" If yes, the sandbox is created. If not, there's a conversation about why Seth really needs access to this data.
It's really moving governance to actively enable me to do things. And it changes the conversation from "hey, it's your data, can I have it?" to "there are really solid reasons as to why I can and can't have data." >> And then some potential automation around a sandbox that creates value. >> Seth: Absolutely. >> But it's still, the example you gave, public, proprietary, or confidential, is still very governance-like, where I was hoping you were going with the data classification, and I think you referenced this. Can I extend that schema, that nomenclature, to include other attributes of value? And can I do it, automate it, at the point of creation or use, and scale it? >> Absolutely, that is exactly what I mean. I just used those three because they're the three that are easy to understand. >> So I can give you, as a business owner, some areas that I would like to see in a classification schema, and then you could automate that for me at scale? In theory? >> In theory, that's where we're hoping to go: to be able to automate. And it's going to be different based on what industry vertical you're in, what risk profile your business is willing to take. So that classification scheme is going to look very different for a bank than it will for a pharmaceutical company, or for a research organization. >> Dave: Well, if I can then defensively delete data, that's of real value to an organization. >> With new regulations, you need to be able to delete data. And you need to be able to know where all of your data is, so that you can delete it. Today, most organizations don't know where all their data is. >> And that problem is solved with math and data science, or? >> I think that problem is solved with a combination of governance... >> Dave: Sure. >> And technology, right? >> Yeah, technology kind of got us into this problem. We'll say technology can get us out.
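To make that policy-API flow concrete, here is a toy sketch of the check Seth describes: classify data up front, capture allowed intents, and gate sandbox creation on both. Everything in it is hypothetical (the dataset names, intents, and the in-memory dict standing in for a real policy service); it only illustrates the pattern, not any actual product API.

```python
from enum import Enum

class Classification(Enum):
    PUBLIC = "public"
    PROPRIETARY = "proprietary"
    CONFIDENTIAL = "confidential"

# Hypothetical catalog: each dataset carries a classification and the
# intended uses it was collected for, captured up front.
DATASETS = {
    "customer_claims": {
        "classification": Classification.CONFIDENTIAL,
        "allowed_intents": {"fraud_modeling"},
    },
    "weather_history": {
        "classification": Classification.PUBLIC,
        "allowed_intents": {"fraud_modeling", "marketing_analysis"},
    },
}

def request_sandbox(dataset: str, intent: str) -> bool:
    """Policy-API stand-in: approve a sandbox only when the declared
    intent matches an allowed intended use of the dataset."""
    entry = DATASETS.get(dataset)
    if entry is None:
        return False  # unknown data never leaves by default
    return intent in entry["allowed_intents"]
```

When the check fails, the denial becomes the "conversation about why" rather than a silent roadblock; a real policy service would also log every request for audit.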
>> On the technology subject, it seems like, with the explosion of data, where it's not just volume but also many copies of the truth, you would need some sort of curation and catalog system that goes beyond what you had in a data warehouse. How do you address that challenge? >> Seth: Yeah, and that gets into what I said when you guys asked me about CDOs: what do they care about? One of the things is unified governance. And so part of unified governance, the first piece of unified governance, is having a catalog of your data. That is, all of your data. And it's a single catalog for your data, whether it's one of your business-critical systems that's running your business today, whether it's a public cloud, or it's a private cloud, or some combination of both. You need to know where all your data is. You also need to have a policy catalog that's single for both of those. Catalogs like this fall apart by entropy, and the more you have, the more likely they are to fall apart. And so if you have one, and you have a lot of automation around it to do a lot of these things... so you have automation that allows you to go through your data and discover what data is where, and keep track of lineage in an automated fashion, keep track of provenance in an automated fashion. Then we start getting into a system of truly unified governance that's active, like I said before. >> There's a lot of talk about digital transformations. Of course, digital equals data. If it ain't data, it ain't digital. So one of the things that, in the early days of the whole big data theme, you'd hear people say is, "You have to figure out how to monetize the data." And that seems to have changed and morphed into: you have to understand how your organization gets value from data. If you're a for-profit company, it's monetizing something, and seeing how data contributes to that monetization; if you're a health care organization, maybe it's different.
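The single catalog with automated lineage capture that Seth describes can be reduced to a small illustrative sketch. The dataset names below are invented, and a real implementation would be a metadata service rather than an in-memory dictionary, but the shape of the idea is:

```python
# One catalog for all data, regardless of where it lives.
CATALOG = {}

def register(name, location, classification, derived_from=()):
    """Automated metadata capture: every dataset, on-premises or in any
    cloud, lands in the one catalog with lineage recorded at write time."""
    CATALOG[name] = {
        "location": location,
        "classification": classification,
        "derived_from": list(derived_from),
    }

def provenance(name):
    """Walk lineage back to the root sources, depth-first."""
    sources = []
    for parent in CATALOG.get(name, {}).get("derived_from", []):
        sources.append(parent)
        sources.extend(provenance(parent))
    return sources

# Hypothetical chain: raw data on premises, features in a private cloud,
# scores in a public cloud -- all tracked in the same catalog.
register("raw_sales", "on_prem_db", "proprietary")
register("sales_features", "private_cloud", "proprietary", derived_from=["raw_sales"])
register("churn_scores", "public_cloud", "confidential", derived_from=["sales_features"])
```

Knowing where every copy of a dataset lives, and what it was derived from, is also what makes defensible deletion possible: you can only delete what the catalog can find.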
I wonder if you could talk about that in terms of the importance, to the CDO specifically, of understanding how an organization makes money. >> I think you bring up a good point. Monetization of data and analytics is often interpreted differently. If you're a CFO, you're going to say, "You're going to create new value for me, I'm going to start getting new revenue streams." And that may or may not be what you mean. >> Dave: Sell the data... it's not always so easy. >> It's not always so easy, and it's hard to demonstrate value for data, to sell it. There's certain types... like IBM owns The Weather Company. Clearly, people want to buy weather data, it's important. But if you're talking about how do you transform a business unit, it's not necessarily about creating new revenue streams; it's how do I leverage data and analytics to run my business differently, and maybe even what are new business models that I could never do before I had data and data science. >> Would it be fair to say that, as Dave was saying, there's the data side, and people were talking about monetizing that. But when you talk about analytics increasingly, machine learning specifically, it's a fusion of the data and the model, and a feedback loop. Is that something where that becomes a critical asset? >> I would actually say that you really can't generate a tremendous amount of value from just data. You need to apply something like machine learning to it. And machine learning has no value without good data. You need to be able to apply machine learning at scale. You need to build the deployable data science assets that run your business differently. So for example, I could run a report that shows me how my business did last quarter, how my sales team did last quarter, or how my marketing team did last quarter. That's not really creating value. That's giving me a retrospective look on how I did. Where you can create value is how do I run my marketing team differently.
So what data do I have, and what types of learning can I get from that data that will tell my marketing team what they should be doing? >> George: And the ongoing process. >> And the ongoing process. And part of actually discovering, doing this cataloging of your data and understanding your data, is you find data quality issues. And data quality issues are not necessarily an issue with the data itself or the people; they're usually process issues. And by discovering those data quality issues, you may discover processes that need to be changed, and in changing those processes you can create efficiencies. >> So it sounds like you guys have got a pretty good framework. Having talked to Inderpal a couple of times, what you're saying makes sense. Do you have nightmares about IoT? (laughing) >> Do I have nightmares about IoT? I don't think I have nightmares about IoT. IoT is really just a series of connected devices, is really what it is. In my talk tomorrow, I'm going to talk about hybrid cloud, and the connected car is actually one of the things I'm going to talk about. And really, a connected car is just a bunch of devices connected to a private cloud that's on wheels. I'm less concerned about IoT than I am about people manually changing data. With IoT you get data, you can track it; if something goes wrong, you know what happened. I would say no, I don't have nightmares about IoT. If you do security wrong, that's a whole other conversation. >> But it sounds like you're doing security right; sounds like you've got a good handle on governance. Obviously scale is a key part of that. It could break the whole thing if you can't scale. And you're comfortable with the state of technology being able to support that? At least with IBM. >> At least with IBM, I think I am. Like I said, a connected car is basically a bunch of IoT devices and a private cloud. How do we connect that private cloud to other private clouds, or to a public cloud? There's tons of technologies out there to do that.
Spark, Kafka. Those two things together allow you to do things that we could never do before. >> Can you elaborate? Like in a connected car environment, or some other scenario? Other people have called it a data center on wheels; think of it as a private cloud, that's a wonderful analogy. How do Spark and Kafka on that very, very smart device cooperate with something on the edge, like the cities, buildings, versus in the cloud? >> If you're a connected car, and you're this private cloud on wheels, you can't drive the car just on that information. You can't drive it just on the LIDAR and knowing how well the wheels are in contact; you need weather information. You need information about other cars around you. You need information about pedestrians. You need information about traffic. All of this information you get from that connection. And the way you do that is leveraging Spark and Kafka. Kafka's a messaging system; you could leverage Kafka to send the car messages, or send pedestrian messages: "This car is coming, you shouldn't cross." Or vice versa: get a car to stop because there's a pedestrian in the way, before even the systems on the car can see it. So if you can get that kind of messaging system in near real time... if I'm the pedestrian and I'm 300 feet away, the half a second that it would take for that to go through isn't that big of a deal, because you'll be stopped before you get there. >> What about, again, the intelligence between not just the data, but the advanced analytics, where some of that would live in the car and some in the cloud? Is it just you're making real-time decisions in the car and you're retraining the models in the cloud, or how does that work? >> No, I think some of those decisions would be done through Spark, in transit. And so one of the nice things about Spark is, we can do machine learning transformations on data. Think ETL, but think ETL where you can apply machine learning as part of that ETL.
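As a minimal stand-in for that ML-inside-ETL idea (no real Spark or Kafka here; the "model" is just a hypothetical distance threshold, and in practice this scoring would run inside a streaming job fed by Kafka topics):

```python
def pedestrian_model(event):
    """Toy stand-in for a trained model: flag a pedestrian closer than 50 ft."""
    return event["kind"] == "pedestrian" and event["distance_ft"] < 50.0

def score_in_transit(stream):
    """Apply the model to each message as it flows to the car:
    the transform-with-ML step, applied inside the data movement itself."""
    return [
        {"action": "brake", "source": event["id"]}
        for event in stream
        if pedestrian_model(event)
    ]

# Hypothetical messages arriving over the connection described above.
events = [
    {"id": "p1", "kind": "pedestrian", "distance_ft": 30.0},
    {"id": "c7", "kind": "car", "distance_ft": 10.0},
    {"id": "p2", "kind": "pedestrian", "distance_ft": 300.0},
]
```

In a real pipeline the same shape holds: messages stream through, a trained model scores each one in transit, and only the actionable results reach the vehicle.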
So I'm transferring all this weather data and positioning data, and I'm applying a machine learning algorithm for a given purpose in that car. So the purpose is navigation, or making sure I'm not running into a building. So that's happening in real time as it's streaming to the car. >> That's the prediction aspect that's happening in real time. >> Seth: Yes. >> But at the same time, you want to be learning from all the cars in your fleet. >> That would happen up in the cloud. I don't think that needs to happen on the edge. Maybe it does, but I don't think it needs to happen on the edge. And today, while I said a car is a data center, a private cloud on wheels, there's a cost to the computation you can have on that car. And I don't think the cost is quite low enough yet where it makes sense to do all that computation on the edge. So some of it you would want to do in the cloud. Plus you would want to have all the information from as many cars in the area as possible. >> Dave: We're out of time, but some closing thoughts. They say may you live in interesting times. Well, you can sum up the changes that are going on in the business: Dell buys EMC, IBM buys The Weather Company. And that gave you a huge injection of data scientists, which, talk about data culture. Just last thoughts on that, in terms of the acquisition and how that's affected your role. >> I've only been at IBM since November, so all that happened before my role. >> Dave: So you inherited it? >> So from my perspective it's a great thing. Before I got there, the culture was starting to change. Like we talked about before we went on air, the hardest part about any kind of data science transformation is the cultural aspects. >> Seth, thanks very much for coming back on theCUBE. Good to have you. >> Yeah, thanks for having me again. >> You're welcome. All right, keep it right there, everybody; we'll be back with our next guest.
This is theCUBE, we're live from Spark Summit in Boston. Right back. (soft rock music)
Gene Kolker, IBM & Seth Dobrin, Monsanto - IBM Chief Data Officer Strategy Summit 2016 - #IBMCDO
>> Live from Boston, Massachusetts, it's theCUBE, covering the IBM Chief Data Officer Strategy Summit, brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. >> Welcome back to Boston, everybody. This is theCUBE, the worldwide leader in live tech coverage. Stu and I are pleased to have Gene Kolker, a Cube alum. He's IBM vice president and chief data officer of the Global Technology Services division. And Seth Dobrin, who's the Director of Digital Strategies at Monsanto. You may have seen them in the news lately. Gentlemen, welcome to theCUBE. Gene, welcome back. Good to see you guys again. >> Thanks. >> Thank you. >> So let's start with the customer. Seth, tell us about what you're doing here, and then we'll get into your role. >> Yes. So, you know, the CDO summit has been going on for a couple of years now, and I've been lucky enough to be participating for a year and a half or so. And really, the nice thing about the summit is the interaction with peers, and the networking with people who are facing similar challenges from a similar perspective. >> Yes, it's a relatively new role and topic, one that's evolved. Gene, we talked about this before, but now you've come from industry into a non-regulated environment. What's happened? >> So I think the deal is this: we developed some approaches, and we got some successes, in a regulated environment, right? And we were a client of IBM for years, using their technologies and approaches. So now I feel it's time for me personally to move on to something different and try to serve and empower IBM clients, irrespective of industry, starting from healthcare. Their approaches, and what IBM can do for clients, go across the different industries, right? And doing that at scale, that's very beneficial, I think, for clients.
>> So Monsanto, obviously you guys do a lot of stuff in the physical world. You're the head of digital strategy. So what does that entail? What is Monsanto doing for digital? >> Yes, so as head of digital strategies for Monsanto, my role is, number one, to help Monsanto internally reposition itself so that we behave and act like a digital company — so leveraging data and analytics, and also the cultural shifts associated with being more digital, which is that whole customer-first approach you started this conversation with. So what is the real impact to what we're doing for our customers, and driving that. And then, based on those things, how can we create new business opportunities for us as a company? And how can we even create new adjacent markets, or new revenues in adjacent areas, based on technologies and things we already have existing within the company? >> Is the scope analytics, customer engagement, digital experiences, all of the above? >> The scope is really looking at our portfolio across the gamut, and seeing how we can better serve our customers and society leveraging what we're doing today. So it's really leveraging the reuse factor of the whole digital concept, right? So we have analytics for geospatial — a big part of agriculture is geospatial. Are there other adjacent areas where we could apply some of that technology, some of that learning? Can we monetize those data, monetize the outputs of those models based on that? Or is there just a whole new way of doing business as a company, because we're in this digital era? >> We've talked about how a lot of the companies that have CDOs today are highly regulated. What are you learning from them? What's different? And being a newer kind of organization, you know, it might be an opportunity for you that they don't have. And do you have a CDO yet, or is that something you're planning on having?
>> Yes, so we don't have a CDO. We do have someone who acts as one; he's a de facto CDO, and he has all of the data organizations on his team. It's very recent for Monsanto. And so, in terms of what we can learn from the regulated side: it's about half financial people and half non-financial people, half heavily regulated industries. And on the surface you would think there was not a lot of overlap, but the level of rigor that needs to go into governance in a financial institution — that same thought process can really be used as a way to enable more R&D-centered, more growth-centered companies to use data more broadly. So think of governance not as a roadblock or an inhibitor, but really think about governance as an enabler. How does it enable us to be more agile? How does it enable us to be more innovative? Right? If people in the company can get access to data by a known process, under known conditions — good, bad, or ugly — as long as people know, they can do things more quickly, because the data is there, it's available, it's curated. And if they shouldn't have access to it in their current situation, what do they need to do to be able to access that data? So if I'm a data scientist and I want to access data about my customers, what can and can't I do with that data? Number one, does it have to be anonymized, right? Or if I want to access it in its current form, what steps do I need to go through, what types of approval do I need, to access that data? So it's really about removing roadblocks through governance instead of putting them in place. >> Gene, I'm curious. We've been digging into how IBM has a very multifaceted role here. How much of this is platforms? How much of it is education and services?
How much of it is, you know, being part of the data that your customers are using? >> So I think there are actually different approaches to these issues. My take is basically this: we're in the cognitive era, right? And data is the new natural resource worldwide. So data as a service, cognitive as a service — I think this is where IBM is coming from. And IBM traditionally was not like that, but it's under a lot of transformation as we speak: a lot of new people coming in, a lot of innovation happening along these lines, because cognitive is something really new, and it's just getting started. Data as a service is really new; it's just getting started. So there's a lot to do. And my role specifically, Global Technology Services — the largest business unit at IBM, 30-plus billion — is that we support a lot of different industries. Basically, going across all different types of industries, how do we transition from our offerings to new business offerings: services, integrated services. I think that's the key for us. >> Just curious, where's Monsanto with the adoption of cognitive? Where are you in that journey? >> So we are actually fairly advanced in the journey in terms of using analytics. I wouldn't say that we're using cognitive per se. We do use a lot of machine learning. We have some applications that on the back end run on AI, so some form of artificial intelligence beyond machine learning. We haven't really gotten into what IBM defines as cognitive, in terms of systems that you can interact with by voice in a natural, normal course, and that you spend a whole lot of time constantly teaching. But we do use, like I said, artificial intelligence. >> Gene, I'm interested in the organizational aspects. We had Inderpal on before.
He's the global CDO; you're a divisional CDO. You've got a matrix into your leadership within the Global Technology Services division as well as into the chief data officer for all of IBM. Okay, sounds reasonable. He laid out for us a really excellent framework, if you will — this is Inderpal: understand your data strategy, identify your data sources, make those data sources trusted — and those are sequential activities. And in parallel, you have to partner with the line of business, and then you've got to get into the human resource planning and development piece, which has to start right away. So that's the framework — a sensible framework; a lot of thought, I'm sure, went into it, and a lot of depth and meaning behind it. How does that framework translate into the division? Is it sort of plug and play, or are there divisional goals that create dissonance? Can you comment? >> Basically, you know, I'm only 100-plus days into my journey with IBM, right? But I can feel that Global Technology Services is transforming itself into an integrated services business. Okay, so this framework you just described is very applicable to this, right? So basically what we're trying to do — I mean, it was the case before for many industries, for many of our clients — is we want to transform ourselves into a trusted broker. So this framework is helping tremendously with what we need to do, because again, there are things we can do in sequence, one after another, and things we can do in parallel. So we're trying to get those things put on the agenda for Global Technology Services, okay? And this is new for them in some respects. But in some respects it's kind of what they were doing before, with a new emphasis on data as a service and cognitive as a service — one of the major things for Global Technology Services delivery. So cognitive delivery.
That's a new type of business offering, which we need to work on how to make, in one sense, automated, and in another sense, cognitive, and deliver to our clients new value and added value compared to what was done up until recently. >> What do you mean by cognitive delivery? Explain that. >> Yeah, so basically, in plain English: what's happening right now, usually, when you have large computer IT systems supporting a lot of businesses, a lot of organizations and corporations, is that it's really done like this — it's people-run, technology-assisted, okay? A lot of decisions are of course being made by people, but some of the decisions can be simple decisions, right? Decisions which can be automated, standardized, normalized can now be done by technology, okay, and people are going to be used for more complex decisions, right? So basically it's going to turn from people-run, technology-assisted to technology-run, people-assisted. Okay? That's a very different value proposition, right? So again, it's not about eliminating jobs — it's very different. It's taking the routine and automatable part of the business to technology, and giving the options, basically the options to choose in more complex decision-making, to people. That's, I would say, the approach. >> It's about scale, too, of course. IBM — when Gerstner made the decision to reorganize as a services company, IBM became a global leader, if not the global leader, in the services business. Hard to scale. You can scale with bodies, but the bigger it gets, the more complicated it gets, the more expensive it gets. So you're saying, if I understand correctly, that IBM is using cognitive and software essentially to scale its services business where possible, assisted by humans. >> So that's exactly the deal. And this is very different.
A very different value proposition, so to say, compared to what was happening recently or earlier with other players. We're not building a shinier and much more powerful cognitive-empowered mousetrap. No — we're trying to become a trusted broker, okay? And how to do that at scale, that's an open, interesting question. But we think that this transition, from people-run, technology-assisted to technology-run, people-assisted — that's the way to go. >> So what does that mean to you? How does that resonate? >> Yeah, you know, I think it brings up a good point, actually. If you think of the whole litany of the scope of analytics, you have everything from describing what happened in the past all the way to cognitive. And I think you need to understand the power of each of those and what they should and shouldn't be used for. People talk a lot about predictive analytics, right? And when you hear predictive analytics, that's really where you start doing things that fully automate processes, that really enable you to replace decisions that people make. But those are more transactional-type decisions, more binary-type decisions. As you get into things where you can apply cognitive, you're moving away from those more binary, more transactional decisions, and you're moving more toward a situation where, yes, the system — the silicon brain, right — is giving you some advice on the types of decisions you should make, based on an amount of information that it can absorb that you can't even fathom absorbing. But there still needs to be some human judgment involved, right? Some understanding of the context outside of what the computer can give. And I think that's really where something like cognitive comes in. And so you talk about this move to, you know, computer-run, human-assisted, right?
There's a whole lot of descriptive and predictive and even prescriptive analytics that go on before you get to that cognitive decision, but it enables the people to make more value-added decisions, right? So really enabling the people to truly add value to what the data and the analytics have said, instead of thinking about it as replacing people — because you're never going to replace people. You know, I've heard people at some of these conferences talking about how cognitive and AI are going to get rid of data scientists. I don't buy that. I think it's really going to enable data scientists to do more valuable, more incredible things
I don't know that in the next 10 years, we're gonna be able to teach a computer to innovate, and we can free up the smart minds today that are focusing on How do we make a decision? Two. How do we be more innovative in leveraging this decision and applying this decision? That's a huge win, and it's not about replacing that person. It's about freeing their time up to do more valuable things. >> Yes, sure. So, for example, from my previous experience writing healthcare So physicians, right now you know, basically, it's basically impossible for human individuals, right to keep up with spaced of changes and innovations happening in health care and and by medical areas. Right? So in a few years it looks like there was some numbers that estimate that in three days you're going to, you know, have much more information for several years produced during three days. What was done by several years prior to that point. So it's basically becomes inhuman to keep up with all these innovations, right? Because of that decision is going to be not, you know, optimal decisions. So what we'd like to be doing right toe empower individuals make this decision more, you know, correctly, it was alternatives, right? That's about empowering people. It's not about just taken, which is can be done through this process is all this information and get in the routine stuff out of their plate, which is completely full. >> There was a stat. I think it was last year at IBM Insight. Exact numbers, but it's something like a physician would have to read 1,500 periodic ALS a week just to keep up with the new data innovations. I mean, that's virtually impossible. That something that you're obviously pointing, pointing Watson that, I mean, But there are mundane examples, right? So you go to the airport now, you don't need a person that the agent to give you. Ah, boarding pass. It's on your phone already. You get there. 
Okay, so that's that's That's a mundane example we're talking about set significantly more complicated things. And so what's The gate is the gate. Creativity is it is an education, you know, because these are step functions in value creation. >> You know, I think that's ah, what? The gate is a question I haven't really thought too much about. You know, when I approach it, you know the thinking Mohr from you know, not so much. What's the gate? But where? Where can this ad the most value um So maybe maybe I have thought about it. And the gate is value, um, and and its value both in terms of, you know, like the physician example where, you know, physicians, looking at images. And I mean, I don't even know what the error rate is when someone evaluates and memory or something. And I probably don't want Oh, right. So, getting some advice there, the value may not be monetary, but to me, it's a lot more than monetary, right. If I'm a patient on DH, there's a lot of examples like that. And other places, you know, that are in various industries. That I think that's that's the gate >> is why the value you just hit on you because you are a heat seeking value missile inside of your organisation. What? So what skill sets do you have? Where did you come from? That you have this capability? Was your experience, your education, your fortitude, >> While the answer's yes, tell all of them. Um, you know, I'm a scientist by training my backgrounds in statistical genetics. Um, and I've kind of worked through the business. I came up through the RND organization with him on Santo over the last. Almost exactly 10 years now, Andi, I've had lots of opportunities to leverage. Um, you know, Data and analytics have changed how the company operates on. I'm lucky because I'm in a company right now. That is extremely science driven, right? Monsanto is a science based company. And so being in a company like that, you don't face to your question about financial industry. 
I don't think you face the same barriers and Monsanto about using data and analytics in the same way you may in a financial types that you've got company >> within my experience. 50% of diagnosis being proven incorrect. Okay, so 50% 05 0/2 summation. You go to your physician twice. Once you on average, you get in wrong diagnosis. We don't know which one, by the way. Definitely need some someone. Garrett A cz Individuals as humans, we do need some help. Us cognitive, and it goes across different industries. Right, technologist? So if your server is down, you know you shouldn't worry about it because there is like system, you know, Abbas system enough, right? So think about how you can do that scale, and then, you know start imagined future, which going to be very empowering. >> So I used to get a second opinion, and now the opinion comprises thousands, millions, maybe tens of millions of opinions. Is that right? >> It's a try exactly and scale ofthe data accumulation, which you're going to help us to solve. This problem is enormous. So we need to keep up with that scale, you know, and do it properly exactly for business. Very proposition. >> Let's talk about the role of the CDO and where you see that evolving how it relates to the role of the CIA. We've had this conversation frequently, but is I'm wondering if the narratives changing right? Because it was. It's been fuzzy when we first met a couple years ago that that was still a hot topic. When I first started covering this. This this topic, it was really fuzzy. Has it come in two more clarity lately in terms of the role of the CDO versus the CIA over the CTO, its chief digital officer, we starting to see these roles? Are they more than just sort of buzzwords or grey? You know, areas. >> I think there's some clarity happening already. So, for example, there is much more acceptance for cheap date. Office of Chief Analytics Officer Teo, Chief Digital officer. Right, in addition to CEO. 
So basically station similar to what was with Serious 20 plus years ago and CEO Row in one sentence from my viewpoint would be How you going using leverage in it. Empower your business. Very proposition with CDO is the same was data how using data leverage and data, your date and your client's data. You, Khun, bring new value to your clients and businesses. That's kind ofthe I would say differential >> last word, you know, And you think you know I'm not a CDO. But if you think about the concept of establishing a role like that, I think I think the name is great because that what it demonstrates is support from leadership, that this is important. And I think even if you don't have the name in the organization like it, like in Monsanto, you know, we still have that executive management level support to the data and analytics, our first class citizens and their important, and we're going to run our business that way. I think that's really what's important is are you able to build the culture that enable you to leverage the maximum capability Data and analytics. That's really what matters. >> All right, We'll leave it there. Seth Gene, thank you very much for coming that you really appreciate your time. Thank you. Alright. Keep it right there, Buddy Stew and I'll be back. This is the IBM Chief Data Officer Summit. We're live from Boston right back.
SUMMARY :
IBM Chief Data Officer Strategy Summit brought to you by IBM. Good to see you guys again. be participating for a couple of a year and 1/2 or so, Um, and you know, Yes, kind of a relatively new Roland topic, one that's evolved, approaches, you know, and what IBM can do for clients go across the different industries, So Monsanto obviously guys do a lot of stuff in the physical world. the cultural shifts associated with being more digital, which is that whole kind like you start out this So it's really leveraging the re use factor of the whole digital concept. And, you know, do you have a CDO I think, you know, in terms of from the regular, what can we learn from, you know, there there are. How much of it is, you know, being part of the data that your your customers And the BM is, you know, tradition. Um, we haven't really gotten into what, you know, what? And in parallel, uh, you have to partner with line of business. because again, there's things we can do in concert, you know, one after another, do you mean by cognitive delivery? and given options and, you know, basically options to choose for more complex decision So you saying, If I understand correctly, the IBM is using cognitive and software That's an open, interesting question, but we think that this transition from you know people you know, in this in this move to have, you know, computer run, know, what's next for the human and I mean, you know, with physical labor, And I think, you know, I think you hit on a good point there when you said in driving innovation, decision is going to be not, you know, optimal decisions. So you go to the airport now, you don't need a person that the agent to give you. of, you know, like the physician example where, you know, physicians, is why the value you just hit on you because you are a heat seeking value missile inside of your organisation. 
I don't think you face the same barriers and Monsanto about using data and analytics in the same way you may So think about how you can do that scale, So I used to get a second opinion, and now the opinion comprises thousands, So we need to keep up with that scale, you know, Let's talk about the role of the CDO and where you So basically station similar to what was with Serious And I think even if you don't have the name in the organization like it, like in Monsanto, Seth Gene, thank you very much for coming that you really appreciate your time.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Monsanto | ORGANIZATION | 0.99+ |
Gina | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Seth Dobrin | PERSON | 0.99+ |
Seth | PERSON | 0.99+ |
Jean Kolker | PERSON | 0.99+ |
CIA | ORGANIZATION | 0.99+ |
Gene Kolker | PERSON | 0.99+ |
thousands | QUANTITY | 0.99+ |
Boston | LOCATION | 0.99+ |
50% | QUANTITY | 0.99+ |
Jean | PERSON | 0.99+ |
three days | QUANTITY | 0.99+ |
Seth Gene | PERSON | 0.99+ |
Stillman | PERSON | 0.99+ |
Boston, Massachusetts | LOCATION | 0.99+ |
Teo | PERSON | 0.99+ |
Andi | PERSON | 0.99+ |
Khun Gay | PERSON | 0.99+ |
last year | DATE | 0.99+ |
D. Mohr | PERSON | 0.99+ |
today | DATE | 0.99+ |
second opinion | QUANTITY | 0.99+ |
one sentence | QUANTITY | 0.99+ |
Nana Mayes | PERSON | 0.99+ |
Buddy Stew | PERSON | 0.99+ |
tonight | DATE | 0.99+ |
twice | QUANTITY | 0.99+ |
both | QUANTITY | 0.98+ |
100 plus days | QUANTITY | 0.98+ |
IBM Insight | ORGANIZATION | 0.98+ |
first | QUANTITY | 0.98+ |
Cuba | LOCATION | 0.98+ |
Gene | PERSON | 0.97+ |
tens of millions | QUANTITY | 0.97+ |
each | QUANTITY | 0.97+ |
Monte Santo | ORGANIZATION | 0.97+ |
English | OTHER | 0.97+ |
Moore | PERSON | 0.96+ |
Khun | PERSON | 0.96+ |
first time | QUANTITY | 0.96+ |
Global Technology Services | ORGANIZATION | 0.96+ |
10 years | QUANTITY | 0.96+ |
Watson | PERSON | 0.95+ |
RND | ORGANIZATION | 0.95+ |
Gerstner | PERSON | 0.95+ |
CDO | EVENT | 0.95+ |
millions | QUANTITY | 0.95+ |
Maur R | PERSON | 0.94+ |
first approach | QUANTITY | 0.94+ |
IBM Chief Data Officer Summit | EVENT | 0.93+ |
two | QUANTITY | 0.93+ |
Global Services | ORGANIZATION | 0.93+ |
20 plus years ago | DATE | 0.92+ |
Santo | ORGANIZATION | 0.92+ |
1,000,000,000 | QUANTITY | 0.9+ |
50 | QUANTITY | 0.88+ |
Serious | ORGANIZATION | 0.88+ |
30 plus | QUANTITY | 0.87+ |
Cube | ORGANIZATION | 0.86+ |
DEA | ORGANIZATION | 0.85+ |
1/2 | QUANTITY | 0.85+ |
Inderpal | PERSON | 0.84+ |
1,500 periodic ALS a week | QUANTITY | 0.84+ |
Garrett A | PERSON | 0.84+ |
next 10 years | DATE | 0.84+ |
#IBMCDO | EVENT | 0.84+ |
Programmable Quantum Simulators: Theory and Practice
>>Hello. My name is Isaac Chuang, and I am on the faculty at MIT in electrical engineering and computer science, and in physics. It is a pleasure for me to be presenting at today's NTT Research symposium of 2020, to share a little bit with you about programmable quantum simulators, theory and practice. The simulation of physical systems as described by their Hamiltonian is a fundamental problem which Richard Feynman identified early on as one of the most promising applications of a hypothetical quantum computer. The real world around us, especially at the molecular level, is described by Hamiltonians which capture the interaction of electrons and nuclei. What we desire to understand from Hamiltonian simulation is properties of complex molecules, such as this iron molybdenum cofactor, an important catalyst. We desire their ground states, reaction rates, reaction dynamics, and other chemical properties, among many things. For a molecule of N atoms, a classical simulation must scale exponentially with N, but for a quantum simulation, there is the potential for this simulation to scale polynomially instead. >>
It is known as being BQ P complete a simplification, which is physically reasonable and important in practice is to assume that the Hamiltonian is a sum over terms which are local. >>For example, due to allow to structure these local terms, typically do not commute, but their locality means that each term is reasonably small, therefore, as was first shown by Seth Lloyd in 1996, one way to compute the time evolution that is the exponentiation of H with time is to use the lead product formula, which involves a successive approximation by repetitive small time steps. The cost of this charterization procedure is a number of elementary steps, which scales quadratically with the time desired and inverse with the error desired for the simulation output here then is the number of local terms in the Hamiltonian. And T is the desired simulation time where Epsilon is the desired simulation error. Today. We know that for special systems and higher or expansions of this formula, a better result can be obtained such as scaling as N squared, but as synthetically linear in time, this however is for a special case, the latest Hamiltonians and it would be desirable to scale generally with time T for a order T time simulation. >>So how could such an optimal quantum simulation be constructed? An important ingredient is to transform the quantum simulation into a quantum walk. This was done over 12 years ago, Andrew trials showing that for sparse Hamiltonians with around de non-zero entries per row, such as shown in this graphic here, one can do a quantum walk very much like a classical walk, but in a superposition of right and left shown here in this quantum circuit, where the H stands for a hazard market in this particular circuit, the head Mar turns the zero into a superposition of zero and one, which then activate the left. And the right walk in superposition to graph of the walk is defined by the Hamiltonian age. 
And in doing so, Childs and collaborators were able to show the walk produces a unitary transform which goes as e to the minus i arccosine of H, times time. >>So this comes close, but it still has this transcendental function of H, instead of simply H. This can be fixed with some effort, which results in an algorithm which scales approximately as tau log one over epsilon, with tau proportional to the sparsity of the Hamiltonian and the simulation time. But again, the scaling here is a multiplicative product rather than an additive one. An interesting insight into the dynamics of a qubit, the simplest component of a quantum computer, provides a way to improve upon this. Single qubits evolve as rotations on a sphere. For example, here is shown a rotation operator which rotates around the axis phi in the X-Y plane by angle theta. If one measures the result of this rotation as a projection along the Z axis, the result is a cosine-squared function that is well known as a Rabi oscillation. On the other hand, if a qubit is rotated around multiple angles in the X-Y plane, say around the phi equals zero, phi equals 1.5, and phi equals zero axes again, then the resulting response function looks like a flat top. >>And in fact, generalizing this to five or more pulses gives not just flat tops but, in fact, arbitrary functions, such as the Chebyshev polynomial shown here, which gives responses like Boolean OR and majority functions, remarkably. If one does rotations by angle theta about d different angles in the X-Y plane, the result is a response function which is a polynomial of order d in cosine. Furthermore, as captured by this theorem, given a nearly arbitrary degree-d polynomial, there exist angles phi such that one can achieve the desired polynomial. This is the result that derives from the Remez exchange algorithm used in classical discrete-time signal processing. So how does this relate to quantum simulation?
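Before continuing, the claim that d rotations give a degree-d polynomial in cosine can be checked against its extremal case, the Chebyshev polynomial, via the identity T_d(cos theta) = cos(d theta). A small numerical sketch (my own illustration, not from the talk):

```python
import numpy as np
from numpy.polynomial.chebyshev import Chebyshev

# T_d(cos(theta)) == cos(d*theta): the extremal degree-d response
# achievable with d rotations is the Chebyshev polynomial.
theta = np.linspace(0.0, np.pi, 201)
d = 5
T5 = Chebyshev.basis(d)   # degree-5 Chebyshev polynomial of the first kind
deviation = np.max(np.abs(T5(np.cos(theta)) - np.cos(d * theta)))
print(deviation)  # numerically zero, up to floating-point roundoff
```

The same machinery, with suitably chosen phase angles, produces the flat-top and Boolean-like responses mentioned in the talk.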
Well, recall that a quantum walk essentially embeds a Hamiltonian inside the unitary transform of a quantum circuit. This embedding, generalized, might be called qubitization, and it involves the use of a qubit acting as a projector to control the application of H. If we generalize the quantum walk to include a rotation about axis phi in the X-Y plane, it turns out that one obtains a polynomial transform of H itself. >>And this is the same as the polynomial in the quantum signal processing theorem. This is a remarkable result known as the quantum singular value transformation theorem, from András Gilyén, Yuan Su, Guang Hao Low, and Nathan Wiebe, published last year. This provides a quantum simulation algorithm using quantum signal processing. For example, one can start with the quantum walk result and then apply quantum signal processing to undo the arccosine transformation, and therefore obtain the ideal expected Hamiltonian evolution, e to the minus i H T. The resulting algorithm costs a number of elementary steps which scales as just the sum of the evolution time and the log of one over the error desired. This saturates the known lower bound, and thus is the optimal quantum simulation algorithm. This table from a recent review article summarizes a comparison of the query complexities of the known major quantum simulation algorithms, showing that the qubitization and quantum signal processing algorithm is indeed optimal. >>Of course, this optimality is a theoretical result. What does one do in practice? Let me now share with you the story of a hardware-efficient realization of a quantum simulation on actual hardware. The promise of quantum computation traditionally rests on a circuit model, such as the one we just used, with quantum circuits acting on qubits. In contrast, consider a real physical problem from quantum chemistry: finding the structure of a molecule. The starting point is the Born-Oppenheimer separation of the electronic and vibrational states.
For example, two connected nuclei share a vibrational mode; the potential energy of this nonlinear spring may be modeled as a harmonic oscillator. Since the spring's energy is determined by the electronic structure, when the molecule becomes electronically excited, this vibrational mode changes: one obtains a different frequency and different equilibrium positions for the nuclei. This corresponds to a change in the spring constant as well as a displacement of the nuclear positions. >>And we may write down a full Hamiltonian for this system. The interesting quantum chemistry question is known as the Franck-Condon problem: what is the probability of transition between the original ground state and a given vibrational state in the excited-state spectrum of the molecule? The Franck-Condon factor, which gives this transition probability, is foundational to quantum chemistry and a very hard and generic question to answer, which may be amenable to solution on a quantum computer. In particular, a natural quantum computer to use might be one which already has harmonic oscillators, rather than one which has just qubits. This is provided by bosonic quantum processors, such as the superconducting qubit system shown here. This processor has both qubits, as embodied by the Josephson junctions shown here, and a harmonic oscillator, as embodied by the resonant mode of the transmission cavity given here. Moreover, the output of this planar superconducting circuit can be connected to three-dimensional cavities. Instead of using qubit gates, >>one may perform direct transformations on the bosonic state using, for example, beam splitters, phase shifters, and displacement and squeezing operators, and the harmonic oscillator may be initialized and manipulated directly. The availability of the qubit allows photon-number-resolved counting for simulating a triatomic, two-mode Franck-Condon factor problem.
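As an aside, the simplest Franck-Condon setting, a single vibrational mode whose equilibrium position shifts while its frequency stays fixed, has a closed form worth keeping in mind: the 0 to n transition probabilities follow a Poisson distribution in the Huang-Rhys parameter S, the squared dimensionless displacement. A sketch under that simplifying assumption (my own illustration, not the experiment's model, which also allows the frequency to change; the value of S is arbitrary):

```python
from math import exp, factorial

def franck_condon(n, S):
    """0 -> n transition probability for a purely displaced mode; S is the Huang-Rhys parameter."""
    return exp(-S) * S ** n / factorial(n)

S = 1.3  # arbitrary example displacement
probs = [franck_condon(n, S) for n in range(60)]
print(sum(probs))  # ~1.0: transitions to all vibrational levels exhaust the probability
```

The measured vibronic spectra discussed next are, roughly, multi-mode generalizations of this kind of intensity distribution.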
This superconducting qubit system with 3D cavities used two resonators. Cavity A and cavity B represent the breathing and wiggling modes of a triatomic molecule, as depicted here. The coupling of these modes was mediated by a superconducting qubit, and readout was accomplished by two additional superconducting qubits coupled to each one of the cavities. Due to the superconducting resonators used, each one of the cavities had a long coherence time, while resonator states could be prepared and measured using the strong coupling of qubits to the cavity, and bosonic quantum operations could be realized by modulating the coupling qubit in between the two cavities. The cavities are holes drilled into pure aluminum, kept superconducting by millikelvin-scale temperatures. >>Microfabricated chips with superconducting qubits are inserted into ports to couple via an antenna to the microwave cavities. Each of the cavities has a quality factor so high that the coherence times can reach milliseconds. A coupling qubit chip is inserted into the port in between the cavities, and the readout and preparation qubit chips are inserted into ports on the sides. For the sake of brevity, I will skip the experimental details and present just the results. Shown here is the vibronic spectrum obtained for a water molecule using the bosonic superconducting processor. This is a typical Franck-Condon spectrum, giving the intensity of lines versus frequency in wavenumbers, where the solid line depicts the theoretically expected result and the purple and red dots show two sets of experimental data, one taken quickly and another taken with exhaustive statistics. In both cases, the experimental results have good agreement with the theoretical expectations. >>The programmability of this system is demonstrated by showing how it can easily calculate the Franck-Condon spectrum for a wide variety of molecules. Here's another one, the ozone anion.
Again, we see that the experimental data, shown in points, agree well with the theoretical expectation, shown as a solid line. Let me emphasize that this quantum simulation result was obtained not by using a quantum computer with qubits, but rather one with resonators, one resonator representing each of the modes of vibration in this triatomic molecule. This approach represents a far more efficient utilization of hardware resources compared with the standard qubit model, because of the natural match of the resonators with the physical system being simulated. In comparison, if qubit gates had been utilized to perform the same simulation, on the order of a thousand qubit gates would have been required, compared with the order of 10 operations which were performed for this bosonic realization. >>Asymptotically, the qubit model would have required significantly more operations, because of the need to truncate each one of the harmonic oscillators into some maximum Hilbert-space size. Compared with the optimal quantum simulation algorithms shown in the first half of this talk, we see that there is a significant gap between what available quantum computing hardware can perform and what optimal quantum simulations demand in terms of the number of gates required for a simulation. Nevertheless, many of the techniques that are used for optimal quantum simulation algorithms may become useful, especially if they are adapted to available hardware. Moving forward, the future holds some interesting challenges for this field. Real physical systems are not qubits; rather, they are composed from bosons and fermions, and fermions need global antisymmetrization. This is a huge challenge for electronic structure calculation in molecules. Real physical systems also have symmetries, but current quantum simulation algorithms are largely governed by a theorem which says that the number of time steps required is proportional to the simulation time desired.
Finally, real physical systems are not purely quantum or purely classical, but rather have many messy quantum-classical boundaries. In fact, perhaps the most important systems to simulate are really open quantum systems, and these dynamics are described by a mixture of quantum and classical evolution, where the desired results are often thermal and statistical properties. >>I hope this presentation of the theory and practice of quantum simulation has been interesting and worthwhile. Thank you.
Yusef Khan, Io Tahoe | Enterprise Data Automation
>>From around the globe, it's the Cube, with digital coverage of enterprise data automation, an event series brought to you by Io Tahoe. Everybody, we're back. We're talking about enterprise data automation. The hashtag is #DataAutomated, and we're going to really dig into data migrations. Data migrations are risky, they're time consuming, and they're expensive. Yusef Khan is here. He's the head of partnerships and alliances at Io Tahoe, coming again from London. Hey, good to see you, Yusef. Thanks very much. >>Thank you. >>So your role is interesting. We're talking about data migrations, and you're head of partnerships. What is your role specifically, and how is it relevant to what we're going to talk about today? >>Uh, I work with various businesses, such as cloud companies, systems integrators, companies that sell operating systems, middleware, all of whom are often quite well embedded within a company's IT infrastructure and have existing relationships. Because what we do fundamentally makes migrating to the cloud easier, and data migration easier, a lot of businesses are interested in partnering with us, and we're interested in partnering with them. >>Let's set up the problem a little bit, and then I want to get into some of the data. You know, I said that migrations are risky, time consuming, and expensive. They're oftentimes a blocker for organizations to really get value out of data. Why is that? >>Uh, I think all migrations have to start with knowing the facts about your data, and you can try to do this manually. But when you have an organization that may have been going for decades or longer, it will probably have a pretty large legacy data estate. They'll have everything from on-premise mainframes, they may have stuff which is already in the cloud, but they probably have hundreds if not thousands of applications and potentially hundreds of different data stores. Um, now, their understanding of what they have
is often quite limited, because you can try to draw manual maps, but they're outdated very quickly. Every time the data changes, the manual map is out of date, and people obviously leave organizations over time, so the kind of tribal knowledge that gets built up is limited as well. So you can try to map all of that manually; you might need a DBA, a data analyst, or a business analyst, and they would go in and explore the data for you. But doing that manually is very, very time consuming. This can take teams of people months and months. Or you can use automation, just like Webster Bank did with Io Tahoe, and they managed to do this with a relatively small team in a timeframe of days. >>Yeah, we talked to Paul from Webster Bank. Awesome discussion. So I want to dig into this migration, and let's pull up a graphic that will show what a typical migration project looks like. So what you see here is very detailed. I know it's a bit of an eye test, but let me call your attention to some of the key aspects of this, and then, Yusef, I want you to chime in. So at the top here, you see that area graph; that's operational risk for a typical migration project, and you can see the timeline and the milestones. That blue bar is the time to test, so you can see the second step, data analysis, taking 24 weeks: so, you know, very time consuming. Let's not dig into the fine print in the middle, but there's some real good detail there. Go down to the bottom: that's labor intensity, and you can see high is that sort of brown, and you can see a number of phases, data analysis, data staging, data prep, the trial, the implementation, post-implementation fixes, and the transition to BAU, which is business as usual, are all very labor intensive. So what do you take away from this typical migration project? What do we need to know, Yusef?
>>I mean, I think the key thing is, when you don't understand your data up front, it's very difficult to scope and set up a project, because you go to business stakeholders and decision makers and you say, okay, we want to migrate these data stores, we want to put them in the cloud, most often. But actually, you probably don't know how much data is there. You don't necessarily know how many applications it relates to; you don't know the relationships between the data; you don't know the flow of the data, so the direction in which the data is going between different data stores and tables. So you start from a position of pretty high risk, and to alleviate that risk you could be stacking the project team with lots and lots of people to do the next phase, which is analysis. And so you set up a project which has a pretty high cost: a bigger project, more people, heavier governance, obviously. And then, in the phase where they're trying to do lots and lots of manual analysis, that, as we all know, is the work of trying to relate data that's in different data stores, relating individual tables and columns: very, very time consuming and expensive. You might be hiring in resource from consultants or systems integrators externally; you might need to buy or use third-party tools, as I said earlier. The people who understood some of those systems may have left a while ago. So you're in a pretty high-risk, quite high-cost situation from the off, and the same thing develops through the project. Um, what we've done with Io Tahoe is we're able to automate a lot of this process from the very beginning, because we can do the initial data discovery run, for example, automatically. You very quickly have automated validation, and a data map and the data flow are generated automatically: much less time and effort, and much less cost.
>>Okay, so I want to bring back that first chart, and I want to call your attention again to the area graph, the blue bars, and then, down below that, labor intensity. And now let's bring up the same chart, but with automation injected into it. So you now see, let's say, "Accelerated by Io Tahoe." Okay, great. And we're going to talk about this, but look what happens to the operational risk: a dramatic reduction in that graph. And then look at the bars, those blue bars. You know, data analysis went from 24 weeks down to four weeks. And then look at the labor intensity: all of these were high, data analysis, data staging, data prep, trial, post-implementation fixes, and transition to BAU. All of those went from high labor intensity, and we've now attacked that and gone to low labor intensity. Explain how that magic happened. >>I think the best example is a data catalog. So every large enterprise wants to have some kind of repository where they put all their understanding about their data, an enterprise data catalog, if you like. Um, imagine trying to do that manually. You need to go into every individual data store. You need a DBA or business analyst for each data store; they need to extract the data table structures individually; they need to cross-reference that with other data stores and schemas and tables. You'd probably end up with the mother of all Excel spreadsheets. It would be a very, very difficult exercise to do. In fact, one of our reflections as we automate lots of these things is that automation not only accelerates the work; in some cases, it also makes it possible at all for enterprise customers with legacy systems. Um, take banks, for example. They quite often end up staying on mainframe systems that they've had in place for decades,
not migrating away from them, because they're not able to actually do the work of understanding the data, deduplicating the data, deleting data that isn't relevant, and then confidently going forward to migrate. So they stay where they are, with all the attendant problems of systems that are out of support. Go back to the data catalog example. Um, whatever you discover in data discovery has to persist in a tool like a data catalog, and so we automate data catalogs; we can populate others, but we have our own. The only alternative to this kind of automation is to build out a very large project team of business analysts, DBAs, project managers, and process analysts, together with data stewards, to understand the data, gather it, put it in the repository, validate it, et cetera, et cetera. We've gone into organizations and we've seen them ramp up teams of 20 to 30 people, costs of two, three, four million pounds a year, on a timeframe of 15 to 20 years, just to try and get a data catalog done. And that's something that we can typically do in a timeframe of months, if not weeks, and the difference is using automation. If you do what I've just described in this manual situation, you make migrations to the cloud prohibitively expensive: whatever saving you might make from shutting down your legacy data stores will get eaten up by the cost of doing it, unless you go with the more automated approach. >>Okay, so the automated approach reduces risk, because, ideally, you're going to stay on project plan. It's all these out-of-scope surprises that come up with the manual processes that kill you in the rework. And then that data catalog: people are afraid that their family jewels, their data, is not going to make it through to the other side. So that's something that you're addressing. And then you're also not boiling the ocean: you're really taking the pieces that are critical and leaving the stuff you don't need,
that you don't have to pay for or process. >>It's a very good point. I mean, one of the other things that we do, and we have specific features to do it, is to automatically analyze data for duplication at a row or record level and redundancy at a column level. So, as you say, before you go into a migration process, you can then understand: actually, this stuff is replicated, we don't need it. Quite often, if you put data in the cloud, you're paying, obviously, for the storage as well as for compute time, and the more data you have in there that's duplicated, that is pure cost. You should take it out before you migrate. Again, if you're trying to do that process of understanding what's duplicated manually across tens or hundreds of data stores, it takes months, if not years. Use machine learning to do that in an automated way, and it's much, much quicker. I mean, that illustrates the costs and benefits of Io Tahoe. Every organization we work with has a lot of money in existing sunk costs in their IT, so ERP systems like Oracle, or data lakes, which they've spent good time and money investing in. But what we do, by enabling them to transition everything to their strategic future repositories, is accelerate the value of that investment and the time to value of that investment. So we're trying to help people get value out of their existing investments in the data estate, and close down the things that they don't need, to enable them to go to a kind of brighter future.
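The two checks described, duplication at the row level and redundancy at the column level, can be illustrated with a toy sketch. This is my own minimal illustration of the idea, not Io Tahoe's actual implementation, and the sample records are invented:

```python
from collections import Counter

# Invented sample records for illustration only.
rows = [
    {"id": 1, "name": "Ada",  "country": "UK", "region": "UK"},
    {"id": 2, "name": "Alan", "country": "UK", "region": "UK"},
    {"id": 1, "name": "Ada",  "country": "UK", "region": "UK"},  # exact duplicate
]

# Row-level duplication: identical records appearing more than once.
counts = Counter(tuple(sorted(r.items())) for r in rows)
duplicate_rows = [dict(k) for k, n in counts.items() if n > 1]

# Column-level redundancy: two columns that always agree carry no extra
# information, so one can be dropped before paying to migrate it.
def always_equal(col_a, col_b):
    return all(r[col_a] == r[col_b] for r in rows)

print(len(duplicate_rows), always_equal("country", "region"))  # → 1 True
```

In practice these checks run at scale and use machine learning to catch near-duplicates, not just the exact matches this sketch finds.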
Data's plentiful, but insights aren't, and that is what's going to drive competitive advantage over the next decade and beyond. >>Yeah, definitely. And you could only really do that if you get your data estate cleaned up in the first place. Um, I worked with the managed teams of data scientists, data engineers, business analysts, people who are pushing out dashboards and trying to build machine learning applications. You know, you know, the biggest frustration for lots of them and the thing that they spend far too much time doing is trying to work out what the right data is on cleaning data, which really you don't want a highly paid thanks to scientists doing with their time. But if you sort out your data stays in the first place, get rid of duplication. If that pans migrate to cloud store, where things are really accessible on its easy to build connections and to use native machine learning tools, you're well on the way up to date the maturity curve on you can start to use some of those more advanced applications. >>You said. What are some of the pre requisites? Maybe the top few that are two or three that I need to understand as a customer to really be successful here? Is it skill sets? Is it is it mindset leadership by in what I absolutely need to have to make this successful? >>Well, I think leadership is obviously key just to set the vision of people with spiky. One of the great things about Ayatollah, though, is you can use your existing staff to do this work. If you've used on automation, platform is no need to hire expensive people. Alright, I was a no code solution. It works out of the box. You just connect to force on your existing stuff can use. It's very intuitive that has these issues. User interface? >>Um, it >>was only to invest vast amounts with large consultants who may well charging the earth. Um, and you already had a bit of an advantage. 
If you've got existing staff who are close to the data subject matter experts or use it because they can very easily learn how to use a tool on, then they can go in and they can write their own data quality rules on. They can really make a contribution from day one, when we are go into organizations on way. Can I? It's one of the great things about the whole experience. Veritas is. We can get tangible results back within the day. Um, usually within an hour or two great ones to say Okay, we started to map relationships. Here's the data map of the data that we've analyzed. Harrison thoughts on where the sensitive data is because it's automated because it's running algorithms stater on. That's what they were really to expect. >>Um, >>and and you know this because you're dealing with the ecosystem. We're entering a new era of data and many organizations to your point, they just don't have the resources to do what Google and Amazon and Facebook and Microsoft did over the past decade To become data dominant trillion dollar market cap companies. Incumbents need to rely on technology companies to bring that automation that machine intelligence to them so they can apply it. They don't want to be AI inventors. They want to apply it to their businesses. So and that's what really was so difficult in the early days of so called big data. You have this just too much complexity out there, and now companies like Iot Tahoe or bringing your tooling and platforms that are allowing companies to really become data driven your your final thoughts. Please use it. >>That's a great point, Dave. In a way, it brings us back to where it began. In terms of partnerships and alliances. I completely agree with a really exciting point where we can take applications like Iot. Uh, we can go into enterprises and help them really leverage the value of these type of machine learning algorithms. 
And we work with all the major cloud providers: AWS, Microsoft Azure, Google Cloud Platform, IBM and Red Hat, and others. I think, for us, the key thing is that we want to be the best in the world at enterprise data automation. We don't aspire to be a cloud provider or even a workflow provider, but what we want to do is really help customers with their data, with automated data functionality, in partnership with some of those other businesses, so we can leverage the great work they've done in the cloud, the great work they've done on workflows, on virtual assistants, and in other areas. And we help customers leverage those investments as well. But at heart, we're really targeted at just being the best enterprise data automation business in the world. >>Massive opportunities, not only for technology companies but for those organizations that can apply technology for business advantage. Yusef Khan, thanks so much for coming on the Cube. I appreciate it. All right, and thank you for watching, everybody. We'll be right back right after this short break. (upbeat music)
Enterprise Data Automation | Crowdchat
>>From around the globe, it's the Cube, with digital coverage of enterprise data automation, an event series brought to you by Io-Tahoe. Welcome, everybody, to Enterprise Data Automation, a co-created digital program on the Cube with support from Io-Tahoe. My name is Dave Volante, and today we're using the hashtag #DataAutomated. You know, organizations really struggle to get more value out of their data. Time to data-driven insights that drive cost savings or new revenue opportunities simply takes too long. So today we're going to talk about how organizations can streamline their data operations through automation, machine intelligence, and really simplifying data migrations to the cloud. We'll be talking to technologists, visionaries, hands-on practitioners, and experts who are not just talking about streamlining their data pipelines — they're actually doing it. So keep it right there. We'll be back shortly with Ajay Vohora, the CEO of Io-Tahoe, to kick off the program. You're watching the Cube, the leader in digital global coverage. We're right back after this short break. Innovation, impact, influence. Welcome to the Cube. Disruptors, developers, and practitioners learn from the voices of leaders who share their personal insights from the hottest digital events around the globe. Enjoy the best this community has to offer on the Cube, your global leader in high-tech digital coverage. From around the globe, it's the Cube, with digital coverage of enterprise data automation, an event series brought to you by Io-Tahoe. Okay, we're back. Welcome back to Data Automated. Ajay Vohora is CEO of Io-Tahoe. Ajay, good to see you. How are things in London? >>Thanks, doing well. The customers that I speak to day in, day out, that we partner with, they're busy adapting their businesses to serve their customers. It's very much a game of ensuring that we can serve our customers to help them serve their customers.
The adaptation that's happening here is trying to be more agile, to be more flexible. There's a lot of pressure on data, a lot of demand on data, to deliver more value to the business so that our customers can serve theirs. >>As I said, we've been talking about DataOps a lot, the idea being DevOps applied to the data pipeline. But talk about enterprise data automation. What is it to you, and how is it different from DataOps? >>DevOps, you know, has been great for breaking down the silos between different roles and functions and bringing people together to collaborate. And we definitely see those tools, those methodologies, those processes, that kind of thinking, lending itself to data. What we look to do is build on top of that with data automation. It's the nuts and bolts of the algorithms, the models behind machine learning, the functions — that's where we invest our R&D, bringing that in to build on top of the methods and ways of thinking that break down silos, and injecting that automation into the business processes that are going to drive a business to serve its customers. It's a layer beyond DevOps and DataOps. The way I think about it is the automation behind a new dimension. We've come a long way in the last few years. We started out by automating some of those simple-to-codify, high-impact data-related tasks an organization faces across its data estate, in a cost-effective way — tasks that classify data — and a lot of our original patents and the value we built up is very much around that. >>Love to get into the tech a little bit in terms of how it works. And I think we have a graphic here that gets into that a little bit. So, guys, if you bring that up. >>Sure.
I mean, right there in the middle, at the heart of what we do, is the intellectual property we've built up over time. It takes heterogeneous data sources — your Oracle relational database, your mainframe, your data lake, and increasingly APIs and devices that produce data — and creates the ability to automatically discover that data and classify it. After it's classified, we have the ability to form relationships across those different source systems, silos, and lines of business. And once we've automated that, we can start to do some cool things, like put context and meaning around that data. So it's moving it toward being data-driven. And increasingly, where we have really smart people in our customer organizations who want to do some of those advanced knowledge tasks — data scientists and, yeah, quants in some of the banks that we work with — the onus is on putting everything we've done there with automation — classifying the data, understanding the relationships, the quality, the policies you can apply to that data — into context. Once a professional using data can put that data in context and search across the entire enterprise estate, they can start to do some exciting things and piece together the tapestry, that fabric, across the different systems — it could be a CRM, an ERP system such as SAP, and some of the newer cloud databases that we work with; Snowflake is a great one. If I look back maybe five years ago, we had a prevalence of data lake technologies at the cutting edge. Those are converging to some of the cloud platforms that we work with, Google and AWS. And I think, very much as you said, there have been manual attempts to try and grasp this.
But it is such a complex challenge at scale that manual effort quickly runs out of steam, because once you've got your fingers on the details of what's in your data estate, it's already changed. You've onboarded a new customer, you've signed up a new partner, a customer has adopted a new product that you've just launched, and that slew of data keeps coming. So it's about keeping pace with that, and the only answer, really, is some form of automation. >>You're working with AWS, you're working with Google, you've got Red Hat and IBM as partners. What is attracting those folks to your ecosystem, and give us your thoughts on the importance of ecosystem. >>That's fundamental. I mean, when I came in to Io-Tahoe as the CEO, one of the trends I wanted us to be part of was being open, having an open architecture. That was something close to my heart, because as a CIO you've got a budget, a vision, and you've already made investments into your organization, and some of those are pretty long-term bets. They could be going out five to ten years, sometimes, with a CRM system: training up your people, getting everybody working together around a common business platform. What I wanted to ensure is that we could openly plug into that, using the APIs that were available, to leverage the investment and cost that has already gone into managing an organization's IT, for the business users to benefit. So part of the reason we've been able to be successful with partners like Google and AWS and, increasingly, a number of technology players — Red Hat; MongoDB is another one where we're doing a lot of good work; and Snowflake — is that those investments have been made by the organizations that are our customers, and we want to make sure we're adding to that, so they're leveraging the value they've already committed to.
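To make the discover-and-classify step Ajay describes concrete, here is a minimal sketch of rule-based column classification. Everything in it — the tag names, the regex rules, the majority-vote threshold — is an illustrative assumption, not Io-Tahoe's actual method, which he describes as machine-learning driven:

```python
import re

# Illustrative classification rules; a real platform learns these,
# but simple regex patterns make the idea concrete.
RULES = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?[\d\-\s()]{7,15}$"),
    "postcode": re.compile(r"^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$"),  # UK-style
}

def classify_column(values):
    """Tag a column by the rule that matches the majority of its sample values."""
    best_tag, best_hits = "unclassified", 0
    for tag, pattern in RULES.items():
        hits = sum(1 for v in values if pattern.match(str(v).strip()))
        if hits > best_hits:
            best_tag, best_hits = tag, hits
    # require a majority match before committing to a tag
    return best_tag if best_hits > len(values) / 2 else "unclassified"

def discover(tables):
    """Walk every table and column and build a simple catalog of tags."""
    return {
        (table, column): classify_column(sample)
        for table, columns in tables.items()
        for column, sample in columns.items()
    }
```

Running `discover` over sampled values from each source would yield the seed of the catalog discussed above — for instance tagging a `crm.email` column as `email` while leaving free-text columns unclassified.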
>>Yeah, and maybe you could give us some examples of the ROI and the business impact. >>Yeah, the ROI, Dave, is built upon the three things I mentioned. It's a combination of leveraging the existing investment in the existing estate, whether that's on Microsoft Azure or AWS or Google or IBM, and putting that to work, because the customers we work with have made those choices. On top of that, it's ensuring that the automation works right down to the level of the data, at the column level or the file level. We don't just deal with metadata; we're being very specific, at the most granular level. So as we run our processes and the automation — classification, tagging, applying the policies an organization has, across its different compliance and regulatory needs, to the data — everything that then happens downstream is ready to serve a business outcome. With Io-Tahoe you can run those processes within hours of getting started, build that picture, visualize it, and bring it to life. The ROI right off the bat is finding data that should have been deleted, data that was copied, and being able to allow the architect, whether we're working on GCP or a migration to any other cloud such as AWS or a multi-cloud landscape, to map it right off the bat. >>Ajay, thanks so much for coming on the Cube and sharing your insights and your experience. It's great to have you. >>Thank you, David. Look forward to speaking again. >>Now we want to bring in the customer perspective. We have a great conversation with Paula D'Amico, Senior Vice President of Enterprise Data Architecture at Webster Bank. So keep it right there. >>Io-Tahoe: Data Automated. Improve efficiency, drive down costs, and make your enterprise data work for you. We're on a mission to enable our customers to automate the management of data to realize maximum strategic and operational benefits.
We envisage a world where data users consume accurate, up-to-date, unified data distilled from many silos to deliver transformational outcomes. Activate your data and avoid manual processing: accelerate data projects by enabling non-IT resources and data experts to consolidate, categorize, and master data. Automate your data operations: power digital transformations by automating a significant portion of data management through human-guided machine learning. Get value from the start: increase the velocity of business outcomes with complete, accurate data curated automatically for data visualization tools and analytic insights. Improve the security and quality of your data: data automation improves security by reducing the number of individuals who have access to sensitive data, and it can improve quality — many companies report double-digit error reduction in data entry and other repetitive tasks. Trust the way data works for you: data automation by Io-Tahoe learns as it works and can augment business user behavior. It learns from exception handling and scales up or down as needed to prevent system or application overloads or crashes. It also allows innate knowledge to be socialized rather than individualized: no longer will your company struggle when the employee who knows how a report is done retires or takes another job; the work continues without the need for detailed information transfer. Continue supporting the digital shift: perhaps most importantly, data automation allows companies to begin making moves toward a broader, more aspirational transformation, on a small scale that is easy to implement and manage and delivers quick wins. Digital is the buzzword of the day, but many companies recognize that it is a complex strategy that requires time and investment.
Once you get started with data automation, the digital transformation is initiated, and leaders and employees alike become more eager to invest time and effort in a broader digital transformation agenda. >>Everybody, we're back. This is Dave Volante, and we're covering the whole notion of automating data in the enterprise. I'm really excited to have Paula D'Amico here. She's Senior Vice President of Enterprise Data Architecture at Webster Bank. Good to see you. Thanks for coming on. >>Nice to see you too. >>So let's start with Webster Bank. You guys are kind of a regional bank — New York, New England, I believe headquartered out of Connecticut — but tell us a little bit about the bank. >>Yeah, Webster Bank is regional — Boston and into New York — very focused on Westchester and Fairfield County. It's a really highly rated regional bank for this area, it holds quite a few awards for being supportive of the community, and it's really moving forward technology-wise. Currently we have a small group that is working toward moving into a more futuristic, more data-driven data warehouse. That's our first item. The other item is to drive new revenue by anticipating what customers do when they go to the bank, or when they log in online, to be able to give them the best offer. The only way to do that is to have timely, accurate, complete data on the customer and what's really of value to them, so you have something worthwhile to offer. >>At the top level, what are some of the key business drivers catalyzing your desire for change? >>The ability to give the customer what they need at the time when they need it. And what I mean by that is that we have customer interactions in multiple ways, right? And I want the customer to be able to
walk into a bank, or go online, and see the same format, have the same look and feel, and also be offered the next best product for them. >>Part of it is really the cycle time, the end-to-end cycle time, that you're compressing. And then, if I understand it, there are residual benefits that are pretty substantial from a revenue opportunity. >>Exactly. It's to drive new customers to new opportunities, to manage risk, to optimize the banking process, and then, obviously, to create new business. And the only way we're going to be able to do that is if we have the ability to look at the data right when the customer walks in the door, or right when they open up their app. >>Do you see the potential to increase the data sources, and hence the quality of the data, or is that sort of premature? >>Oh no, exactly right. So right now we ingest a lot of flat files from the mainframe-type systems that we've had running for quite a few years. But now we're moving to the cloud and off-prem — moving, you know, into an S3 bucket, where we can process that data and get it faster by using real-time tools to move it into a place where Snowflake can utilize it, or we can give it out to our data marts. The data scientists are out in the lines of business right now, which is great, because I think that's where data science belongs. What we're working towards now is giving them more self-service, giving them the ability to access the data in a more robust way, from a single source of truth, so they're not pulling the data down into their own Tableau dashboards and then pushing the data back out.
I have eight engineers — data architects and database administrators — and then traditional data warehousing people. And some of the customers I have, business customers in the lines of business, just want to subscribe to a report. They don't want to go out and do any data science work, and we still have to provide that. So we still want to provide them some kind of regular report: they wake up in the morning, they open up their email, and there's the report they subscribed to, which is great, and it works out really well. And one of the reasons we purchased Io-Tahoe was to give the lines of business the ability to do search within the data. It reads the data flows, finds data redundancy and things like that, helps me clean up the data, and also helps the data analysts. Say someone asks for a certain report. It used to be, "Okay, in four weeks we're going to go look at the data, and then we'll come back and tell you what we can do." But now, with Io-Tahoe, they're able to look at the data and, within one or two days, go back and say, "Yes, we have the data. This is where it is, and these are the data flows we've found." There's also what I call the birth of a column: where the column was created, where it went live as a teenager, and then where it went to die in the archive. >>In researching Io-Tahoe, it seems like one of the strengths of the platform is the ability to visualize the data structure and actually dig into it, but also see it, and that speeds things up and gives everybody additional confidence. And then the other piece is essentially infusing AI or machine intelligence into the data pipeline — that's really how you're attacking automation, right? >>Exactly.
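Paula's "birth of a column" is essentially column-level lineage: a directed graph from the system where a field originates, through every downstream copy, to its final archive. A minimal sketch of that idea, with made-up table names and a hand-built adjacency map rather than anything discovered automatically:

```python
# Hypothetical column-level lineage: each key is a (table, column) node,
# each value lists the downstream nodes its data flows into.
LINEAGE = {
    ("mainframe.customer", "cust_email"): [("staging.customers", "email")],
    ("staging.customers", "email"): [("warehouse.dim_customer", "email_addr")],
    ("warehouse.dim_customer", "email_addr"): [("archive.dim_customer_2019", "email_addr")],
}

def trace(node, graph):
    """Follow a column from its birth to every place it ends up (depth-first)."""
    path, stack, seen = [], [node], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles in messy real-world flows
        seen.add(current)
        path.append(current)
        stack.extend(graph.get(current, []))
    return path
```

Calling `trace(("mainframe.customer", "cust_email"), LINEAGE)` walks the column from the mainframe through staging and the warehouse to the archive — the four life stages Paula jokes about.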
So, let's say I have seven core lines of business that are asking me questions, and one of the questions is, "Is this customer okay to contact?" And there are different avenues: you can go online and say, "Do not contact me," or you can go to the bank and say, "I don't want email, but I'll take texts and phone calls." Seven different lines of business ask me that question in different ways, and before I got there, each project used to be siloed: one request would be 100 hours of analytical work, and then another analyst would do another 100 hours on the other project. Well, now I can do that all at once. I can do those types of searches and say, "Yes, we already have that documentation. Here it is, and this is where you can find where the customer has said they don't want email from us, or that they've subscribed to get emails from us." I'm using Io-Tahoe's automation right now to bring in the data and start analyzing it closely, to make sure that I'm not missing anything and that I'm not bringing over redundant data. The data warehouse I'm working off of is on-prem. It's an Oracle database, and it's 15 years old, so it has extra data in it, things that we don't need anymore, and Io-Tahoe is helping me shake out the extra data that does not need to be moved into my S3. So it's saving me money in the move from on-prem. >>What's your vision for your data-driven organization? >>I want the bankers to be able to walk around with an iPad in their hands, access data for a customer really fast, and be able to give them the best deal they can get.
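An aside on the "okay to contact" consolidation Paula described a moment ago: when several silos hold copies of a customer's preferences, the safe merge rule is most-restrictive-wins. A sketch of that rule — the channel names and precedence policy here are invented for illustration, not Webster Bank's actual logic:

```python
def okay_to_contact(records, channel):
    """A customer is contactable on a channel only if some silo recorded an
    opt-in and no silo recorded an explicit opt-out (most-restrictive rule).
    Each record maps channel name -> True (opt-in) / False (opt-out);
    a missing channel means that silo has no record."""
    opted_in = False
    for record in records:
        preference = record.get(channel)
        if preference is False:
            return False  # an explicit opt-out anywhere is final
        if preference is True:
            opted_in = True
    return opted_in
```

With one merged answer like this, seven lines of business can ask the same question once instead of burning 100 analyst-hours each.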
I want Webster to be right there on top, able to add new customers and to serve our existing customers — people who have had bank accounts there since they were 12 years old and are now, you know, well beyond that. I want them to have the best experience with our bankers. >>That's really what I want as a banking customer. I want my bank to know who I am, anticipate my needs, and create a great experience for me, and then let me go on with my life. So that's a great story. Love your experience, your background, and your knowledge. Can't thank you enough for coming on the Cube. >>No, thank you very much, and you guys have a great day. >>Next we'll talk with Lester Waters, the CTO of Io-Tahoe, who takes us through the key considerations of moving to the cloud. >>The entire platform: automated data discovery. Data discovery is the first step to knowing your data. Auto-discover data across any application on any infrastructure, and identify all unknown data relationships across the entire siloed data landscape. Smart data catalog: know how everything is connected, understand everything in context, regain ownership and trust in your data, and maintain a single source of truth across cloud platforms, SaaS applications, reference data, and legacy systems. Empower business users to quickly discover and understand the data that matters to them with a smart data catalog, continuously updated, ensuring business teams always have access to the most trusted data available. Automated data mapping and linking: automate the identification of unknown relationships within and across data silos throughout the organization. Build your business glossary automatically, using in-house common business terms, vocabulary, and definitions. Discovered relationships appear as connections or dependencies between data entities such as customer, account, address, and invoice, and these data entities have many discovery properties.
At a granular level, data signals dashboards: get up-to-date feeds on the health of your data for faster, improved data management. See trends, view fix history, compare versions, and get accurate and timely visual insights from across the organization. Automated data flows: automatically capture every data flow to locate all the dependencies across systems, visualize how they work together collectively, and know who within your organization has access to data. Understand the source and destination of all your business data with comprehensive data lineage, constructed automatically during the data discovery phase, with results continuously loaded into the smart data catalog. Automated data quality assessments, powered by ActiveDQ: ensure data is fit for consumption and meets the needs of enterprise data users, and keep information about the current data quality state readily available for faster, improved decision making. Data policy governance: automate data governance end-to-end over the entire data lifecycle, with automation, instant transparency, and control. Automate data policy assessments with glossaries, metadata, and policies for sensitive data discovery that automatically tag, link, and annotate with metadata to provide enterprise-wide search for all lines of business. Self-service knowledge graph: digitize and search your enterprise knowledge, turning multiple siloed data sources into machine-understandable knowledge on a single data canvas, and search and explore data content across systems — including ERP, CRM, billing systems, and social media — to fuel data pipelines. >>We're focusing on enterprise data automation, and we're going to talk about the journey to the cloud. Remember, the hashtag is #DataAutomated, and we're here with Lester Waters, who's the CTO of Io-Tahoe. Give us a little background, Lester. You've got deep expertise in a lot of different areas, but what do we need to know?
>>Well, David, I started my career at Microsoft, where I founded the information security cryptography group, the very first one the company had, and that led to a career in information security. And of course, in information security, data is the key element to be protected, so I always had my hands in data. That naturally progressed into my role at Io-Tahoe as their CTO. >>What's the prescription for that automation journey, and for simplifying the migration to the cloud? >>Well, I think the first thing is understanding what you've got: discovering and cataloging your data and your applications. If I don't know what I have, I can't move it, I can't improve it, I can't build upon it, and I have to understand the dependencies. So building that data catalog is the very first step: what have I got? >>Okay, so we've done the audit, we know what we've got. What's next? Where do we go next? >>So the next thing is remediating that data: where do I have duplicate data? Oftentimes in an organization, data will get duplicated. Somebody will take a snapshot of the data and then end up building a new application, which suddenly becomes dependent on that data. So it's not uncommon for an organization to have 20 master instances of a customer, and you can see where that will go — trying to keep all that stuff in sync becomes a nightmare all by itself. So you want to understand where all your redundant data is, because when you go to the cloud, you may have an opportunity to consolidate that data. >>Then what? You figure out what to get rid of, or actually get rid of it. What's next? >>Yes, that would be the next step: figure out what you need and what you don't. Oftentimes I've found that there are obsolete columns of data in your databases that you just don't need, or that have been superseded by others.
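Lester's remediation step — finding the 20 master instances of a customer — comes down to asking whether two tables hold the same content under different names. A rough sketch of content fingerprinting; the normalization choices and table names here are assumptions, not Io-Tahoe's algorithm:

```python
import hashlib

def table_fingerprint(rows):
    """Order-insensitive fingerprint of a table: hash each normalized row,
    then hash the sorted row digests."""
    digests = sorted(
        hashlib.sha256(
            "|".join(str(v).strip().lower() for v in row).encode()
        ).hexdigest()
        for row in rows
    )
    return hashlib.sha256("".join(digests).encode()).hexdigest()

def find_duplicate_tables(tables):
    """Group table names that hold identical content despite different names."""
    by_print = {}
    for name, rows in tables.items():
        by_print.setdefault(table_fingerprint(rows), []).append(name)
    return [sorted(names) for names in by_print.values() if len(names) > 1]
```

Because rows are normalized and the digests sorted, a marketing snapshot with reordered rows and stray whitespace still fingerprints the same as the CRM original, surfacing it as a consolidation candidate.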
You've got tables that have been superseded by other tables in your database, so you have to understand what's being used and what's not. And from that you can decide: I'm going to leave this stuff behind, I'm going to archive this stuff because I might need it for data retention, or I'm just going to delete it because I don't need it at all. >>We're plowing through your steps here. What's next on the journey? >>The next one, in a nutshell: preserve your data format. Don't boil the ocean here, to use a cliche. You want to do a certain degree of lift and shift, because you've got application dependencies on that data and on the data format — the tables, the columns, and the way they're named. So to some degree you are going to be doing a lift and shift, but it's an intelligent lift and shift. >>The data lives in silos. How do you deal with that problem? Is that part of the journey? >>That's a great point, because you're right: data silos happen because this business unit is chartered with this task and another business unit has that task, and that's how you get those instantiations of the same data occurring in multiple places. So as part of your cloud migration, you really want to plan where there's an opportunity to consolidate your data, because that means there will be less to manage, less data to secure, and a smaller footprint, which means reduced costs. >>Maybe you could address data quality. Where does that fit in on the journey? >>That's a very important point. First of all, you don't want to bring your legacy issues with you. As the point I made earlier: if you've got data quality issues, this is a good time to find and remediate them. But that can be a laborious task, and accomplishing it manually would take a lot of work.
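The data-quality pass Lester calls laborious is exactly where simple automated checks pay off. As one illustrative stand-in — not Io-Tahoe's method — a z-score scan flags numeric values that sit far from the rest of a column:

```python
from statistics import mean, stdev

def find_outliers(values, threshold=3.0):
    """Return the values whose z-score exceeds the threshold — a crude but
    automated way to surface suspect entries in a numeric column."""
    if len(values) < 2:
        return []
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []  # constant column: nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]
```

Run against a column of account balances, a fat-fingered million-dollar entry pops out immediately, while a uniform column yields nothing to review.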
So the opportunity to use tools and automate that process will really help you find those outliers. >>What's next? I think we're through — I think I've counted six. What's the lucky seven? >>Lucky seven: involve your business users. Really, when you think about it, your data is in silos, and part of this migration to cloud is an opportunity to break down the silos that naturally occur. You've got to break the cultural barriers that sometimes exist between business and IT. So, for example, I always advise that there's an opportunity here to consolidate your sensitive data — your PII, personally identifiable information. If three different business units have the same data, there's an opportunity to consolidate it into one source of truth. >>Well, great advice, Lester. Thanks so much. I mean, it's clear that the CapEx investments in data centers are generally not a good investment for most companies. Lester Waters, CTO of Io-Tahoe — really appreciate it. Let's watch this short video, and we'll come right back. >>Use cases. Data migration: accelerate the digitization of business by providing automated data migration workflows that save time in achieving project milestones, eradicate operational risk, and minimize labor-intensive manual processes that demand costly overhead. Data quality: drain the data swamp and re-establish trust in the data to enable data science and data analytics. Data governance: ensure that business and technology understand critical data elements and have control over the enterprise data landscape. Data analytics enablement: data discovery to enable data scientists and data analytics teams to identify the right data set through self-service, for business demands or analytical reporting that is advanced or complex. Regulatory compliance: government-mandated data privacy requirements — GDPR, CCPA, ePR, HIPAA. And data lake management:
identify lake contents, clean up, and manage ongoing activity. Data mapping and knowledge graph: create business knowledge-graph models of enterprise data, with automated mapping to a specific ontology, enabling semantic search across all sources in the data estate. DataOps: scale, as a foundation to automate data management processes. >>Are you interested in test-driving the Io-Tahoe platform? Kickstart the benefits of data automation for your business through the Io-Tahoe Labs program: a flexible, scalable sandbox environment on the cloud of your choice, with setup, service, and support provided by Io-Tahoe. Click on the link and connect with a data engineer to learn more and see Io-Tahoe in action. >>Everybody, we're back. We're talking about enterprise data automation. The hashtag is #DataAutomated, and we're going to really dig into data migrations. Data migrations are risky, time consuming, and expensive. Yusef Khan is here. He's the head of partnerships and alliances at Io-Tahoe, coming again from London. Hey, good to see you, Yusef. Thanks very much. >>Thank you. >>So let's set up the problem a little bit, and then I want to get into some of the data. You said that migrations are risky, time consuming, and expensive, and they're oftentimes a blocker for organizations to really get value out of data. Why is that? >>I think all migrations have to start with knowing the facts about your data. You can try to do this manually, but when you have an organization that may have been going for decades or longer, it will probably have a pretty large legacy data estate, with everything from on-premise mainframes to stuff that's already in the cloud. They probably have hundreds, if not thousands, of applications, and potentially hundreds of different data stores.
So what you see, here it is. It's very detailed. I know it's a bit of an eye test, but let me call your attention to some of the key aspects of this, uh and then use if I want you to chime in. So at the top here, you see that area graph that's operational risk for a typical migration project, and you can see the timeline and the the milestones That Blue Bar is the time to test so you can see the second step. Data analysis. It's 24 weeks so very time consuming, and then let's not get dig into the stuff in the middle of the fine print. But there's some real good detail there, but go down the bottom. That's labor intensity in the in the bottom, and you can see hi is that sort of brown and and you could see a number of data analysis data staging data prep, the trial, the implementation post implementation fixtures, the transition to be a Blu, which I think is business as usual. >>The key thing is, when you don't understand your data upfront, it's very difficult to scope to set up a project because you go to business stakeholders and decision makers, and you say Okay, we want to migrate these data stores. We want to put them in the cloud most often, but actually, you probably don't know how much data is there. You don't necessarily know how many applications that relates to, you know, the relationships between the data. You don't know the flow of the basis of the direction in which the data is going between different data stores and tables. So you start from a position where you have pretty high risk and probably the area that risk you could be. Stack your project team of lots and lots of people to do the next phase, which is analysis. And so you set up a project which has got a pretty high cost. 
The bigger the project, the more people, the heavier the governance, obviously. And then you're in the phase where you're trying to do lots and lots of manual analysis. Manual processes, as we all know, at the level of trying to relate data that's in different data stores, relating individual tables and columns, are very time consuming and expensive. You might be hiring in resource from consultants or systems integrators externally, and you might need to buy or use third-party tools. As was said earlier, the people who understood some of those systems may have left a while ago. So you're in an even higher-risk, quite high-cost situation from the off, and the same thing follows through the project. What we're doing with Io-Tahoe is automating a lot of this process from the very beginning. Because we can do the initial data discovery run automatically, for example, you very quickly have automated validation: a data map and the data flow have been generated automatically, with much less time and effort and much less cost up front. >>Yeah. And now let's bring up the same chart, but with automation injected into it. So you now see the same cycle accelerated by Io-Tahoe. Okay, great. And we're going to talk about this, but look what happens to the operational risk: a dramatic reduction in that graph. And then look at the bars, those blue bars. Data analysis went from 24 weeks down to four weeks. And then look at the labor intensity. All of these were high: data analysis, data staging, data prep, trialling, post-implementation fixes, and the transition to BAU. All those went from high labor intensity, and we've now attacked that and gone to low labor intensity. Explain how that magic happened. >>Take the example of a data catalog. Every large enterprise wants to have some kind of repository where they put all their understanding about their data, an enterprise data catalog.
If you like, imagine trying to do that manually. You need to go into every individual data store. You need a DBA and a business analyst for each data store. They need to do an extract of the data, go through the tables individually, and then cross-reference that with other data stores, schemas and tables, probably with the mother of all Excel spreadsheets. It would be a very, very difficult exercise to do. In fact, one of our reflections as we automate lots of these things is that it doesn't just accelerate the work; in some cases it makes it possible at all for enterprise customers with legacy systems. Take banks, for example. They quite often end up staying on mainframe systems that they've had in place for decades, not migrating away from them, because they're not able to actually do the work of understanding the data, de-duplicating the data, deleting data that isn't relevant, and then confidently going forward to migrate. So they stay where they are, with all the attendant problems of systems that are out of support. The biggest frustration for lots of data scientists, and the thing they spend far too much time doing, is trying to work out what the right data is and cleaning data, which really isn't what you want highly paid data scientists doing with their time. But if you sort out your data in the first place, get rid of duplication, and then migrate to a cloud store where things are really accessible, it's easy to build connections and to use native machine learning tools, and you're well on the way up the maturity curve. You can start to use some of the more advanced applications. >>Massive opportunities, not only for technology companies but for those organizations that can apply technology for business advantage. Yusef Khan, thanks so much for coming on theCUBE. Much appreciated.
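The manual cross-referencing described above, a DBA and an analyst extracting each store and matching columns in a giant spreadsheet, is exactly the kind of work a machine can brute-force. As a rough illustration (a toy sketch, not Io-Tahoe's actual algorithm; the sample tables and the 0.5 threshold are invented for the example), candidate relationships between columns can be scored by value overlap:

```python
# Toy illustration of automated data discovery: score candidate
# relationships between columns of different tables by value overlap.
# This is NOT a vendor's real algorithm, just a sketch of why a machine
# can cross-reference columns far faster than a DBA with a spreadsheet.

def jaccard(a, b):
    """Similarity of two collections of column values (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def discover_relationships(tables, threshold=0.5):
    """tables: {table_name: {column_name: [values]}}.
    Returns candidate ("table.col", "table.col", score) triples,
    highest-scoring first."""
    cols = [(f"{t}.{c}", vals)
            for t, columns in tables.items()
            for c, vals in columns.items()]
    candidates = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            score = jaccard(cols[i][1], cols[j][1])
            if score >= threshold:
                candidates.append((cols[i][0], cols[j][0], round(score, 2)))
    return sorted(candidates, key=lambda x: -x[2])

# Two stores with differently named but overlapping key columns.
tables = {
    "crm": {"customer_id": [1, 2, 3, 4], "region": ["EMEA", "APAC"]},
    "billing": {"cust_id": [2, 3, 4, 5], "amount": [10.0, 20.0]},
}
print(discover_relationships(tables))
# → [('crm.customer_id', 'billing.cust_id', 0.6)]
```

Real discovery tools layer type inference, sampling, and semantic matching on top of this kind of overlap scoring, but the automation advantage is the same: the machine checks every pair of columns, which a human with a spreadsheet cannot.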
SUMMARY :
Highlights from the enterprise data automation event series, brought to you by Io-Tahoe. There is a lot of pressure and a lot of demand on data to deliver more value into the business processes that drive a business, and the conversations get into the tech in terms of how it works, including the ability to automatically discover data, and A.J.'s thoughts on what is attracting partners such as IBM to the ecosystem. Paul Damico of Webster Bank describes increasing the velocity of business outcomes with complete, accurate data curated automatically: complete data on the customer, shorter end-to-end cycle times, reduced risk, and an optimized banking process as workloads move off-prem into the cloud, with the platform's ability to visualize data flows called out as a strength, letting the bank serve each customer fast with the best deal. Lester Waters, CTO of Io-Tahoe, then discusses the journey to the cloud: automated data discovery is the first step to knowing your data, data is the key element to be protected, and building the data catalog is the very first step.
The next step is remediating that data: figuring out what to get rid of and actually getting rid of it, and from there deciding what's next, including what degree of lift and shift belongs in the cloud migration. There is an opportunity to use tools to automate that process, to consolidate multiple data stores into one, to make the CapEx case clear, to drain the data swamp, and to re-establish trust in the data so organizations can really get value out of it. Yusef Khan of Io-Tahoe then walks through a typical migration timeline, where the blue bar is the time to test and data analysis is the second step, showing how automation cuts the testing time, the up-front risk, and the labor intensity of manually extracting tables and cross-referencing them against other data stores and schemas.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
David | PERSON | 0.99+ |
Dave Volante | PERSON | 0.99+ |
Paul Damico | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Aziz | PERSON | 0.99+ |
Webster Bank | ORGANIZATION | 0.99+ |
Westchester | LOCATION | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
24 weeks | QUANTITY | 0.99+ |
Seth | PERSON | 0.99+ |
London | LOCATION | 0.99+ |
one | QUANTITY | 0.99+ |
hundreds | QUANTITY | 0.99+ |
Connecticut | LOCATION | 0.99+ |
New York | LOCATION | 0.99+ |
100 hours | QUANTITY | 0.99+ |
iPad | COMMERCIAL_ITEM | 0.99+ |
Cisco | ORGANIZATION | 0.99+ |
four weeks | QUANTITY | 0.99+ |
Siri | TITLE | 0.99+ |
thousands | QUANTITY | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
six | QUANTITY | 0.99+ |
first item | QUANTITY | 0.99+ |
20 master instances | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
second step | QUANTITY | 0.99+ |
S three | COMMERCIAL_ITEM | 0.99+ |
I o ta ho | ORGANIZATION | 0.99+ |
first step | QUANTITY | 0.99+ |
Fairfield County | LOCATION | 0.99+ |
five years ago | DATE | 0.99+ |
first | QUANTITY | 0.99+ |
each project | QUANTITY | 0.99+ |
France | LOCATION | 0.98+ |
two days | QUANTITY | 0.98+ |
Leicester Waters | ORGANIZATION | 0.98+ |
Iot Tahoe | ORGANIZATION | 0.98+ |
Cap Ex | ORGANIZATION | 0.98+ |
seven cause | QUANTITY | 0.98+ |
Lester Waters | PERSON | 0.98+ |
5 10 years | QUANTITY | 0.98+ |
Boston | LOCATION | 0.97+ |
Iot | ORGANIZATION | 0.97+ |
Tahoe | ORGANIZATION | 0.97+ |
Tom | PERSON | 0.97+ |
First | QUANTITY | 0.97+ |
15 years old | QUANTITY | 0.96+ |
seven different lines | QUANTITY | 0.96+ |
single source | QUANTITY | 0.96+ |
Utah | LOCATION | 0.96+ |
New England | LOCATION | 0.96+ |
Webster | ORGANIZATION | 0.95+ |
12 years old | QUANTITY | 0.95+ |
Iot Labs | ORGANIZATION | 0.95+ |
Iot. Tahoe | ORGANIZATION | 0.95+ |
1st 1 | QUANTITY | 0.95+ |
U. S. | LOCATION | 0.95+ |
J ahora | ORGANIZATION | 0.95+ |
Cube | COMMERCIAL_ITEM | 0.94+ |
Prem | ORGANIZATION | 0.94+ |
one customer | QUANTITY | 0.93+ |
Oracle | ORGANIZATION | 0.93+ |
I O ta ho | ORGANIZATION | 0.92+ |
Snowflake | TITLE | 0.92+ |
seven | QUANTITY | 0.92+ |
single | QUANTITY | 0.92+ |
Lester | ORGANIZATION | 0.91+ |
Greg DeKoenigsberg & Robyn Bergeron, Red Hat | AnsibleFest 2019
>>Live from Atlanta, Georgia, it's theCUBE, covering AnsibleFest 2019. Brought to you by Red Hat. >>Welcome back, everyone, to theCUBE. Live coverage in Atlanta, Georgia for AnsibleFest. This is Red Hat's event where all the practitioners come together, the community, to talk about automation. John Furrier with my co-host Stu Miniman. Our next two guests are Robyn Bergeron, principal community architect for Ansible, now Red Hat, and Greg DeKoenigsberg, senior director of community, Ansible. Well, thanks for coming on. Appreciate it. >>Thank you. >>Okay, so we were talking before camera about what you guys had. This is a two-day event we're covering on theCUBE, AnsibleFest, but you got your community day yesterday, the day before, when the people came in early, the core community. Heard great things about it. Love to get an update. Could you share just what happened yesterday? And then we'll get into some of the community news. >>Sure. For all of our AnsibleFests for a while now, we've started them with a community contributor conference. And the goal of that conference is to get together a lot of the people we work with online, people we see as IRC nicks or GitHub handles, right, to get them together in the same room, have them interact with core members of our team. And that's where we really do make a lot of decisions about how we're going to be going forward, and get really direct feedback from some of our key contributors about the decisions we're making and the things we're thinking about, with the goal of involving our community deeply in a lot of the decisions we make. >>That's a working session meets social get-together. >>That's right. Several working sessions and then, you know, drinks afterward for those who want the drinks, and just hangout time. >>The drinks last night were really good. I got there at the end of it. I missed the session, but >>they had the peaches out on the table.
That was good. But this is a dynamic community. This is one of the things we notice here: not a seat open in the house at the keynote, standing room only, an active participant base, and it's organic as well as now going mainstream. How are you guys handling it, how are you riding this wave? Because you certainly do get great feedback from the community, but as you commercialize open source and Ansible, it's a tough task. >>Well, I'd like to think part of it is, I guess, maybe that it's not our first rodeo. Is that what we'd say? I mean, yeah. Before Ansible I worked at Elasticsearch doing community stuff, and before that I worked at Red Hat. I was the Fedora project leader, number five. And you were Fedora project leader... what number was that? Number one? >>Depends >>on how you count, but >>you're the one that got us to be able to call it having a Fedora project leader. So I sort of was number one. So we've been dealing with this stuff for a really long time. It's different in Ansible in that, unlike a lot of old-school things like Fedora, a lot of this stuff is newer, and part of the reason it's really important for us to get some of these folks here to talk to us in person is that, and you saw my keynote this morning where we talked about modularity, a lot of these folks are really just focused on their one little bit, and they don't always have as much time. People are working in lots of open source projects now, right, and it's hard to pay deep attention to every single little thing all the time. So this gives them a day of "in case you missed it": here's the deep dive into everything that we're planning or thinking about. And they really are, you know, the people who are managing those smaller parts all around Ansible really are some of our best feedback loops, right?
Because they're people who probably wrote that model because they're using it every single day and their hard core Ansel users. But they also understand how to participate in community so we can get those people actually talking with the rest of us who a lot of us used to be so sad. Men's. I used to be a sis admin, lots of us. You know. A lot of our employees actually just got into wanting to work on Ansel because they loved using it so much of their jobs. And when you're not, actually, since admitting every day, you you lose a little bit of >>the front lines with the truth of what's around. Truth is right there >>and putting all these people together in room make sure that they all also, you know, when you have to look at someone in the eye and tell them news that they might not like you have a different level of empathy and you approach it a little bit differently than you may on the Internet. So, >>Robin So I lived in your keynote this morning. You talked about answerable. First commit was only back in 2012. So that simplicity of that modularity and the learnings from where open source had been in the past Yes, they're a little bit, you know, what could answerable do, being a relatively young project that it might not have been able to dio if it had a couple of decades of history? >>Maybe Greg should tell the story about the funk project >>way. There was a There was a project, a tread hat that we started in 2007 in a coffee shop in Chapel Hill, North Carolina is Ah, myself and Michael the Han and Seth the doll on entry likens Who still works with this with us? A danceable Ah, and we we put together Ah, an idea with all the same underpinnings, right? Ah, highly modular automation tool We debated at the time whether it should be based on SSL or SS H for funk. We chose SSL Ah, and you know, after watching that grow to a certain point and then stagnates and it being inside of red Hat where, you know, there were a lot of other business pressures, things like that. 
We learned a lot from that experience and we were able to take that experience. And then in 2012 there there's the open source community was a little different. Open source was more acceptable. Get Hubbell was becoming a common plat platform for open source project hosting. And so a lot of things came together in a short pier Time All that experience, although, >>and also market conditions, agenda market conditions in 2007 Cloud was sort of a weird thing that not really everyone was doing 2012 rolls around. Everyone has these cloud images and they need to figure out how to get something in it. Um, and it turns out that Hansel's a really great way to actually do that. And, you know, even if we had picked SS H back in the beginning, I don't know, you know, not have had time projects grow to a certain point. And I could point a lots of projects that were just It's a shame they were so ahead of their time. And because of that, you know, >>timing is everything with the key. I think now what I've always admired about the simplicity is automation requires that the abstract, the way, the complexities and so I think you bring a cloud that brings up more complexity, more use cases for some of the underlying paintings of the plumbing. And this is always gonna This is a moving train that's never going to stop. What was the feedback from the community this year around? As you guys get into some of these analytical capabilities, so the new features have a platform flair to it. It's a platform you guys announced answerable automation platform that implies that enables some value. >>You know, I >>think in >>a way. We've always been a platform, right, because platform is a set of small rules and then modules that attached to it. It's about how that grows, right? And, uh, traditionally, we've had a batteries included model where every module and plug in was built to go into answerable Boy, that got really big bright and >>we like to hear it. 
I don't even know how many I keep say, I'll >>say 2000. Then it'll be 3000 say 3000 >>something else, a lot of content. And it's, you know, in the beginning, it was I can't imagine this ever being more than 202 150 batteries included, and at some point, you know, it's like, Whoa, yeah, taking care of this and making sure it all works together all the time gets >>You guys have done a great You guys have done a great job with community, and one of the things that you met with Cloud is as more use cases come, scale becomes a big question, and there's real business benefits now, so open source has become part of the business. People talk about business, models will open source. You guys know that you've been part of that 28 years of history with Lennox. But now you're seeing Dev Ops, which is you'll go back to 78 2009 10 time frame The only the purest we're talking Dev ops. At that time, Infrastructures Co was being kicked around. We certainly been covering the cubes is 2010 on that? But now, in mainstream enterprise, it seems like the commercialization and operational izing of Dev ops is here. You guys have a proof point in your own community. People talk about culture, about relationships. We have one guest on time, but they're now friends with the other guy group dowels. So you stay. The collaboration is now becoming a big part of it because of the playbook because of the of these these instances. So talk about that dynamic of operational izing the Dev Ops movement for Enterprise. >>All right, so I remember Ah, an example at one of the first answer professed I ever went thio There were there were a few before I came on board. 
Ah, but it was I >>think it was >>the 1st 1 I came to when I was about to make the jump from my previous company, and I was just There is a visitor and a friend of the team, and there was an adman who talked to me and said, For the first time, I have this thing, this playbook, that I can write and that I can hand to my manager and say this is what we're going to D'oh! Right? And so there was this artifact that allowed for a bridging between different parts of the organization. That was the simplicity of that playbook that was human readable, that he could show to his boss or to someone else in the organ that they could agree on. And suddenly there was this sort of a document that was a mechanism for collaboration that everyone could understand buy into that hadn't really existed before. Answerable existed after me. That was one of the many, you know, flip of the light moments where I was like, Oh, wow, maybe we have something >>really big. There were plenty of other infrastructures, code things that you could hand to someone. But, you know, for a lot of people, it's like I don't speak that language right? That's why we like to say like Ansel sort of this universal automation language, right? Like everybody can read it. You don't have to be a rocket scientist. Uh, it's, you know, great for your exact example, right? I'm showing this to my manager and saying This is the order of operations and you don't have to be a genius to read it because it's really, really readable >>connecting system which connects people >>right. It's fascinating to May is there was this whole wave of enterprise collaboration tools that the enterprise would try to push down and force people to collaborate. But here is a technology tool that from the ground up, is getting people to do that collaboration. And they want to do it. And it's helping bury some >>of those walls. And it's interesting you mention that I'm sure that something like slack is a thing that falls into that category. 
And they've built around making sure that the 20 billion people inside a company all sign up until somebody in the I T departments like, What do you mean? These random people are just everyone's using it. No one saving it isn't secure, and they all freak out, and, um, well, I mean, this is sort of, you know, everybody tells her friend about Ansel and they go, Oh, right, Tool. That's gonna save the world Number 22 0 wait, actually, yeah. No, this is This actually is pretty cool. Yeah, yeah, yeah, I get started. >>Well, you know, sometimes the better mouse trap will always drive people to that solution. You guys have proven that organic. What's interesting to me is not only does it keep win on capabilities, it actually grew organically. And this connective tissue between different groups, >>right? Got it >>breaks down that hole silo mentality. And that's really where I tease been stuck? Yes. And as software becomes more prominent and data becomes more prominent, it's gonna just shift more power in the hands of developer and to the, uh, just add mons who are now being redeployed into being systems, architects or whatever they are. This transitional human rolls with automation, >>transformation architect >>Oh my God, that's a real title. I don't >>have it, but >>double my pay. I'll take it. >>So collections is one of the key things talked about when we talk about the Antelope Automation platform. Been hearing a lot discussion about how the partner ecosystems really stepping up even more than before. You know, 4600 plus contributors out there in community, But the partners stepping up Where do you see this going? Where? Well, collections really catalyze the next growth for your >>It's got to be the future for us that, you know, there there were a >>few >>key problems that we recognize that the collections was ultimately the the dissolution that we chose. 
Uh, you know, one key problem is that with the batteries included model that put a lot of pressure on vendors to conform to whatever our processes were, they had to get their batteries in tow. Are thing to be a part of the ecosystem. And there was a huge demand to be a part of our ecosystem. The partners would just sort of, you know, swallow hard and do what they needed to d'oh. But it really wasn't optimized Tol partners, right? So they might have different development processes. They might have different release cycles. They might have different testing on the back end. That would be, you know, more difficult to hook together collections, breaks a lot of that out and gives our partners a lot of freedom to innovate in their own time. Uh, >>release on their own cycle, the down cycle. We just released our new version of software, but you can't actually get the new Ansel modules that are updated for it until answerable releases is not always the thing that you know makes their product immediately useful. You know, you're a vendor, you really something new. You want people to start using it right away, not wait until, you know answerable comes around so >>and that new artifact also creates more network effects with the, you know, galaxy and automation hub. And you know, the new deployment options that we're gonna have available for that stuff. So it's, I think it's just leveling up, right? It's taking the same approach that's gotten us this foreign, just taking out to, uh, to another level. >>I certainly wouldn't consider it to be like that. Partners air separate part of our They're still definitely part of the community. It's just they have slightly different problems. And, you know, there were folks from all sorts of different companies who are partners in the contributor summit. Yesterday >>there were >>actually, you know, participating and you know, folks swapping stories and listening to each other and again being part of that feedback. 
>>Maybe just a little bit broader. You know, the other communities out there, I think of the Cloud Native Computing Foundation, the Open Infrastructure Foundation. You're wearing your soul pin. I talk a little bit of our handsome How rentable plays across these other communities, which are, you know, very much mixture of the vendors and the end users. >>Well, I mean and will certainly had Sorry. Are you asking about how Ansel is relating to those other communities? Okay, Yeah, because I'm all about that. I mean, we certainly had a long standing sort of, ah fan base over in the open stacks slash open infrastructure foundation land. Most of the deployment tools for all of you know, all the different ways. So many ways to deploy open stack. A lot of them wound up settling on Ansel towards the end of time. You know, that community sort of matured, and, you know, there's a lot of periods of experimentation and, you know, that's one of the things is something's live. Something's didn't but the core parts of what you actually need to make a cloud or, you know, basically still there. Um And then we also have a ton of modules, actually unanswerable, that, you know, help people to operationalize all their open stack cloud stuff. Just like we have modules for AWS and Google Cloud and Azure and whoever else I'm leaving out this week as far as the C N. C f stuff goes, I mean again, we've seen a lot of you know how to get this thing up and running. Turns out Cooper Daddy's is not particularly easy to get up and running. It's even more complicated than a cloud sometimes, because it also assumes you've got a cloud of some sort already. And I like working on our thing. It's I can actually use it. It's pretty cool. Um, cube spray on. Then A lot of the other projects also have, you know, things that are related to Ansel. Now there's the answer. Will operator stuff? I don't know if you want to touch on that, but >>yeah, uh, we're working on. 
We know one of the big questions is ah, how do answerable, uh, and open shift slash kubernetes work together frequently and in sort of kubernetes land Open shift land. You want to keep his much as you can on the cluster. Lots of operations on the cluster. >>Sometimes you got >>to talk to things outside of the cluster, right? You got to set up some networking stuff, or you gotta go talk to an S three bucket. There's always something some storage thing. As much as you try to get things in a container land, there's all there's always legacy stuff. There's always new stuff, maybe edge stuff that might not all be part of your cluster. And so one of the things we're working on is making it easier to use answerable as part of your operator structure, to go and manage some of those things, using the operator framework that's already built into kubernetes and >>again, more complexity out there. >>Well, and and the thing is, we're great glue. Answerable is such great glue, and it's accessible to so many people and as the moon. As we move away from monolithic code bases to micro service's and vastly spread out code basis, it's not like the complexity goes away. The complexity simply moves to the relationship between the components and answerable. It's excellent glue for helping to manage those relationships between. >>Who doesn't like a glue layer >>everyone, if it's good and easy to understand, even better, >>the glue layers key guys, Thanks for coming on. Sharing your insights. Thank you so much for a quick minute to give a quick plug for the community. What's up? Stats updates. Quick projects Give a quick plug for what's going on the community real quick. >>You go first. >>We're big. We're 67 >>snow. It was number six. Number seven was kubernetes >>right. Number six out of 96 million projects on Get Hub. So lots of contributors. Lots of energy. >>Anytime. I tried to cite a stat, I find that I have to actually go and look it up. And I was about to sight again. 
>>So active, high, high numbers of people activity. What's that mean? You're running the plumbing, so obviously it's it's cloud on premise. Other updates. Projects of the contributor day. What's next, what's on the schedule. >>We're looking to put together our next contributor summit. We're hoping in Europe sometime in the spring, so we've got to get that on the plate. I don't know if we've announced the next answer will fast yet >>I know that happens tomorrow. So don't Don't really don't >>ruin that for everybody. >>Gradual ages on the great community. You guys done great. Work out in the open sores opened business. Open everything these days. Can't bet against open. >>But again, >>I wouldn't bet against open. >>We're here. Cube were open. Was sharing all the data here in Atlanta with the interviews. I'm John for his stupid men. Stayed with us for more after this short break.
SUMMARY :
Brought to you by Red hat. The community to talk about automation anywhere. Okay, So we were talking before camera that you guys had. And the goal of that conference is to get together. a working session, meets social, get together. I got the end of it. Not a seat open in the house on the keynote Skinny Ramon Lee, active participant But as you have the commercial eyes open sores and answerable, And you were Fedora project Leader. some of these folks here to talk to us in person is that you know especially. the front lines with the truth of what's around. and putting all these people together in room make sure that they all also, you know, when you have to look at someone in the eye and So that simplicity of that modularity and the learnings from where open source had been in the past We chose SSL Ah, and you know, And because of that, you know, requires that the abstract, the way, the complexities and so I think you bring a cloud that brings up more complexity, It's about how that grows, I don't even know how many I keep say, I'll And it's, you know, in the beginning, You guys have done a great You guys have done a great job with community, and one of the things that you met with Cloud is All right, so I remember Ah, an example at one of the first answer That was one of the many, you know, flip of the light moments where I was like, saying This is the order of operations and you don't have to be a genius to read it because it's really, that the enterprise would try to push down and force people to collaborate. And it's interesting you mention that I'm sure that something like slack is a thing that falls into that Well, you know, sometimes the better mouse trap will always drive people to that solution. it's gonna just shift more power in the hands of developer and to the, uh, I don't double my pay. But the partners stepping up Where do you see this going? 
That would be, you know, more difficult to hook together collections, breaks a lot of that out and gives our always the thing that you know makes their product immediately useful. And you know, the new deployment options that we're gonna have available And, you know, there were folks from all sorts of different companies who are partners in the contributor actually, you know, participating and you know, folks swapping stories and listening to each other and again handsome How rentable plays across these other communities, which are, you know, very much mixture of the vendors on. Then A lot of the other projects also have, you know, things that are related to Ansel. You want to keep his much as you can on the cluster. You got to set up some networking stuff, or you gotta go talk to an S three bucket. Well, and and the thing is, we're great glue. Thank you so much for a quick minute to give a quick plug for the community. We're big. It was number six. So lots of contributors. And I was about to sight again. Projects of the contributor day. in the spring, so we've got to get that on the plate. I know that happens tomorrow. Work out in the open sores opened business. Was sharing all the data here in Atlanta with the interviews.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
2007 | DATE | 0.99+ |
2012 | DATE | 0.99+ |
Robyn Bergeron | PERSON | 0.99+ |
John Kerry | PERSON | 0.99+ |
Europe | LOCATION | 0.99+ |
Cloud Native Computing Foundation | ORGANIZATION | 0.99+ |
Atlanta | LOCATION | 0.99+ |
John | PERSON | 0.99+ |
Open Infrastructure Foundation | ORGANIZATION | 0.99+ |
2019 | DATE | 0.99+ |
28 years | QUANTITY | 0.99+ |
two day | QUANTITY | 0.99+ |
Red Hat | ORGANIZATION | 0.99+ |
Atlanta, Georgia | LOCATION | 0.99+ |
Greg DeKoenigsberg | PERSON | 0.99+ |
Bergeron | PERSON | 0.99+ |
Greg | PERSON | 0.99+ |
Greg Dankers Berg | PERSON | 0.99+ |
Robin | PERSON | 0.99+ |
Infrastructures Co | ORGANIZATION | 0.99+ |
Red hat | ORGANIZATION | 0.99+ |
Ansel | ORGANIZATION | 0.99+ |
20 billion people | QUANTITY | 0.99+ |
4600 plus contributors | QUANTITY | 0.99+ |
2010 | DATE | 0.99+ |
yesterday | DATE | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
2000 | QUANTITY | 0.98+ |
ELASTICSEARCH | ORGANIZATION | 0.98+ |
Yesterday | DATE | 0.98+ |
tomorrow | DATE | 0.98+ |
67 | QUANTITY | 0.98+ |
May | DATE | 0.98+ |
first time | QUANTITY | 0.98+ |
3000 | QUANTITY | 0.98+ |
Red Hats | EVENT | 0.98+ |
one | QUANTITY | 0.98+ |
this week | DATE | 0.98+ |
Fedora | ORGANIZATION | 0.97+ |
one guest | QUANTITY | 0.97+ |
more than 202 150 batteries | QUANTITY | 0.97+ |
two guests | QUANTITY | 0.96+ |
96 million projects | QUANTITY | 0.96+ |
Chapel Hill, North Carolina | LOCATION | 0.95+ |
Lennox | ORGANIZATION | 0.95+ |
Minutemen | LOCATION | 0.94+ |
fedora | ORGANIZATION | 0.93+ |
first | QUANTITY | 0.91+ |
first rodeo | QUANTITY | 0.91+ |
Anselm | LOCATION | 0.91+ |
one key problem | QUANTITY | 0.91+ |
Get Hub | ORGANIZATION | 0.91+ |
this year | DATE | 0.91+ |
Michael the Han | PERSON | 0.9+ |
Cooper | PERSON | 0.89+ |
2009 | DATE | 0.89+ |
Number seven | QUANTITY | 0.87+ |
Community Ansel | ORGANIZATION | 0.87+ |
Azure | TITLE | 0.86+ |
first answer | QUANTITY | 0.84+ |
Cloud | TITLE | 0.84+ |
this morning | DATE | 0.83+ |
First commit | QUANTITY | 0.79+ |
one little | QUANTITY | 0.79+ |
Number six | QUANTITY | 0.76+ |
last night | DATE | 0.75+ |
AnsibleFest | EVENT | 0.75+ |
a day | QUANTITY | 0.74+ |
single day | QUANTITY | 0.73+ |
10 time | QUANTITY | 0.71+ |
C N. C f | TITLE | 0.7+ |
single little thing | QUANTITY | 0.69+ |
1st 1 | QUANTITY | 0.67+ |
D'oh | ORGANIZATION | 0.66+ |
Google Cloud | ORGANIZATION | 0.64+ |
couple | QUANTITY | 0.62+ |
Influencer Panel | IBM CDO Summit 2019
>> Live from San Francisco, California, it's theCUBE covering the IBM Chief Data Officers Summit, brought to you by IBM. >> Welcome back to San Francisco everybody. I'm Dave Vellante and you're watching theCUBE, the leader in live tech coverage. This is the end of the day panel at the IBM Chief Data Officer Summit. This is the 10th CDO event that IBM has held and we love to gather these panels. This is a data all-star panel and I've recruited Seth Dobrin who is the CDO of the analytics group at IBM. Seth, thank you for agreeing to chip in and be my co-host in this segment. >> Yeah, thanks Dave. Like I said before we started, I don't know if this is a promotion or a demotion. (Dave laughing) >> We'll let you know after the segment. So, the data all-star panel and the data all-star awards that you guys are giving out a little later in the event here, what's that all about? >> Yeah so this is our 10th CDO Summit. So two a year, so we've been doing this for five years. The data all-stars are those people that have been to at least four of the ten. And so these are five of the 16 people that got the award. And so thank you all for participating. I attended these, like I said earlier, before I joined IBM; they were immensely valuable to me, and I was glad to see 16 other people that think it's valuable too. >> That is awesome. Thank you guys for coming on. So, here's the format. I'm going to introduce each of you individually and then ask you to talk about your role in your organization. What role you play, how you're using data, however you want to frame that. And the first question I want to ask is, what's a good day in the life of a data person? Or if you want to answer what's a bad day, that's fine too, you choose. So let's start with Lucia Mendoza-Ronquillo. Welcome, she's the Senior Vice President and the Head of BI and Data Governance at Wells Fargo. You told us that you work within the line of business group, right? 
So introduce your role and what's a good day for a data person? >> Okay, so my role basically is, again, business intelligence, so I support what's called cards and retail services within Wells Fargo. And I also am responsible for data governance within the business. We roll up into what's called a data governance enterprise. So we comply with all the enterprise policies, and my role is to make sure our line of business complies with data governance policies for the enterprise. >> Okay, good day? What's a good day for you? >> A good day for me is really when I don't get a call that the regulators are knocking on our doors. (group laughs) Asking for additional reports or have questions on the data, so that would be a good day. >> Yeah, especially in your business. Okay, great. Parag Shrivastava is the Director of Data Architecture at McKesson, welcome. Thanks so much for coming on. So we got a healthcare, couple of healthcare examples here. But, Parag, introduce yourself, your role, and then what's a good day, or if you want to choose a bad day, it'd be fun to mix that up. >> Yeah, sounds good. Yeah, so mainly I'm responsible for the data strategy and architecture at McKesson. What that means is McKesson has a lot of data around the pharmaceutical supply chain, around one-third of the world's pharmaceutical supply chain, clinical data, also around pharmacy automation data, and we want to leverage it for the better engagement of the patients and better engagement of our customers. And my team, which includes the data product owners and data architects, we are all responsible for looking at the data holistically and creating the data foundation layer. So I lead the team across North America. So that's my current role. And going back to the question around what's a good day, I think I would say, I'll start with the good day. It's really looking at when the data improves the business. 
And the first thing that comes to my mind is sort of like an example: McKesson did an acquisition of an eight-billion-dollar pharmaceutical company in Europe, and we were creating the synergy solution, which was based around the analytics and data. And actually IBM was one of the partners in implementing that solution. When the solution got implemented, I mean, that was a big deal for me, to see that all the effort that we did in plumbing the data and doing the analytics was really helping improve the business. I think that is really a good day, I would say. I mean, I wouldn't say a bad day as such; there are challenges, constant challenges, but I think one of the top priorities that we are having right now is to deal with the demand. As we look at the demand around the data, the role of data has got multiple facets to it now. For example, some of the very foundational, evidentiary, and compliance types of needs, as you just talked about, and then also profitability and cost avoidance and those kinds of aspects. So how to balance between that demand is the other aspect. >> All right good. And we'll get into a lot of that. So Carl Gold is the Chief Data Scientist at Zuora. Carl, tell us a little bit about Zuora. People might not be as familiar with how you guys do software for billing et cetera. Tell us about your role and what's a good day for a data scientist? >> Okay, sure, I'll start with a little bit about Zuora. Zuora is a subscription management platform. So if you're any company who wants to offer a product or service as a subscription and you don't want to build your billing and subscription management, revenue recognition, from scratch, you can use a product like ours. I say it lets anyone build a telco with a complicated plan, with tiers and stuff like that. I don't know if that's a good thing or not. You guys'll have to make up your own mind. My role is an interesting one. 
It's split, so as I said I'm a chief data scientist, and we work about 50% on product features based on data science. Things like churn prediction or predictive payment retries are product areas where we offer AI-based solutions. And then, because Zuora is a subscription platform, we have an amazing set of data on the actual performance of companies using our product. So a really interesting part of my role has been leading what we call the subscription economy index and subscription economy benchmarks, which are reports around best practices for subscription companies. And it's all based off this amazing dataset created from anonymized data of our customers. So that's a really exciting part of my role. And for me, maybe this speaks to our level of data governance, I might be able to get some tips from some of my co-panelists, but for me a good day is when all the data for me and everyone on my team is where we left it the night before. And no schema changes, no, you know, records that you were depending on suddenly missing. >> Pipeline failures. >> Yeah, pipeline failures. And a bad day is a schema change, some crucial data just went missing, and someone on my team is like, "The code's broken." >> And everybody's stressed >> Yeah, so those are bad days. But, data governance issues maybe. >> Great, okay thank you. Jung Park is the COO of Latitude Food Allergy Care. Jung welcome. >> Yeah hi, thanks for having me and the rest of us here. So, I guess my role, I like to put it as, I'm really the support team. I'm part of the support team really for the medical practice. So, Latitude Food Allergy Care is a specialty practice that treats patients with food allergies. So, I don't know if any of you guys have food allergies or maybe have friends, kids, who have food allergies, but food allergies unfortunately have become a lot more prevalent. 
And what we've been able to do is take research and data really from clinical trials and other research institutions, and really use that from the clinical trial setting, back to the clinical care model, so that we can now treat patients who have food allergies by using a process called oral immunotherapy. It's fascinating, and this is really personal to me because my son has food allergies and he's been to the ER four times. >> Wow. >> And one of the scariest events was when he went to an ER out of the country and as a parent, you know, you prepare your child, right? With the food, he takes the food. He was 13 years old and you had the chaperones, everyone all set up, but you get this call because accidentally he ate some peanut, right. And so I saw this unfold and it scared me so much that this is something I believe we just have to get people treated. So this process allows people to really eat a little bit of the food at a time; you eat the food at the clinic and then you go home and eat it. Then you come back two weeks later and you eat a little bit more, until your body desensitizes. >> So you build up that immunity >> Exactly. >> and then you watch the data obviously. >> Yeah. So what's a good day for me? When our patients are done for the day and they have a smile on their face because they were able to progress to that next level. >> Now do you have a chief data officer or are you the de facto CDO? >> I'm the de facto. So, my career has been pretty varied. So I've been essentially chief data officer, CIO, at companies small and big. And what's unique, I guess, about this role is that I'm able to really think about the data holistically through every component of the practice. 
So I like to think of it as a patient journey, and I'm sure you guys all think of it similarly when you talk about your customers, but from a patient's perspective, before they even come in, you have to make sure the data behind the science of whatever you're treating is proper, right? Once that's there, then you have to have the acquisition part. How do you actually work with the community to make sure people are aware of the services that you're providing? And when they're with you, how do you engage them? How do you make sure that they are compliant with the process? So in healthcare especially, oftentimes patients don't actually succeed all the way through because they don't continue all the way through. So it's that compliance. And then finally, it's really long-term care. And when you get the long-term care, you know that the patient that you've treated is able to really continue on six months, a year from now, and be able to eat the food. >> Great, thank you for that description. Awesome mission. Rolland Ho is the Vice President of Data and Analytics at Clover Health. Tell us a little bit about Clover Health and then your role. >> Yeah, sure. So Clover is a startup Medicare Advantage plan. So we provide Medicare, private Medicare, to seniors. And because of the way we run our health plan, we're able to really lower a lot of the copay costs and protect seniors against out-of-pocket costs. If you're on regular Medicare and you get cancer, you have some horrible accident, your out of pocket is potentially infinite. Whereas with a Medicare Advantage plan it's limited to like five, $6,000 and you're always protected. One of the things I'm excited about being at Clover is our ability to really look at how we can bring the value of data analytics to healthcare. I've been in this industry for close to 20 years at this point, and there's a lot of waste in healthcare. 
And there's also a lot of very poor application of preventive measures to the right populations. So one of the things that I'm excited about is that with today's models, if you're able to better identify, with precision, the right patients to intervene with, then you fundamentally transform the economics of what can be done. Like if you had to pay $1,000 to intervene but you were only right 20% of the time, that's very expensive for each success. But now if your model is 60, 70% right, it opens up a whole new world of what you can do. And that's what excites me. In terms of my best day? I'll give you two different angles. One, as an MBA, one of my best days was, client calls me up, says, "Hey Rolland, you know, "your analytics brought us over $100 million "in new revenue last year." and I was like, cha-ching! Excellent! >> Where's my half? >> Yeah right. And then on the data geek side, the best day was really, you run a model, you train a model, you get a ridiculous AUC score, so area under the curve, and then you expect that to just disintegrate as you go into validation testing and actual live production. But the .98 AUC score held up through production. And it's like holy cow, the model actually works! And literally we could cut out half of the workload because of how good that model was. >> Great, excellent, thank you. Seth, anything you'd add to the good day, bad day, as a CDO? >> So for me, well as a CDO or as CDO at IBM? 'Cause at IBM I spend most of my time traveling. So a good day is a day I'm home. >> Yeah, when you're not in an (group laughing) aluminum tube. >> Yeah. Hurtling through space (laughs). No, but a good day, when GDPR compliance just happened, a good day for me was May 20th of last year when IBM was done, or as done as we needed to be, for GDPR, so that was a good day for me last year. 
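Rolland's point above about precision transforming intervention economics reduces to simple expected-value arithmetic. A quick sketch using the hypothetical figures from the conversation ($1,000 per intervention, 20% versus roughly 65% precision); this is an illustration, not Clover's actual model:

```python
def cost_per_success(cost_per_intervention: float, precision: float) -> float:
    """Expected spend per successful intervention: every flagged patient costs
    the same to reach, but only the fraction the model gets right (its
    precision) counts as a success."""
    return cost_per_intervention / precision

# A 20%-precision model makes each success cost $5,000 in expectation...
broad = cost_per_success(1000, 0.20)
# ...while a roughly 65%-precision model cuts that to about $1,538.
targeted = cost_per_success(1000, 0.65)
```

Tripling the precision cuts the cost per success by the same factor, which is what makes previously uneconomical preventive programs viable.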
This year, a good day is really when we start implementing some new models to help IBM become a more effective company and increase our bottom line or increase our margins. >> Great, all right so I got a lot of questions as you know and so I want to give you a chance to jump in. >> All right. >> But, I can get it started or have you got something? >> I'll go ahead and get started. So this is the 10th CDO Summit. So five years. I know personally I've had three jobs at two different companies. So over the course of the last five years, how many jobs, how many companies? Lucia? >> One job with one company. >> Oh my gosh you're boring. (group laughing) >> No, but actually, because I support basically the head of the business, we go into various areas. So, we're not just from an analytics perspective and business intelligence perspective and of course data governance, right? It's been a real journey. I mean there's a lot of work to be done. A lot of work has been accomplished, constantly improving the business, which is the first goal, right? Increasing market share through insights and business intelligence, tracking product performance, to really helping us respond to regulators (laughs). So it's a variety of areas I've had to be involved in. >> So one company, 50 jobs. >> Exactly. So right now I wear different hats depending on the day. So that's really what's happening. >> So it's a good question, have you guys been jumping around? >> Sure, I mean I think of it as same company, one company, but two jobs. And I think those two jobs have two different layers. When I started at McKesson I was a solution leader or solution director for business intelligence, and I think that's how I started. And over the five years I've seen the complete shift towards machine learning, and my new role is actually focused around machine learning and AI. 
That's why we created this layer, so our own data product owners who understand the data science side of things and the ongoing business architecture. So, same company, but it has seen a very different shift of data over the last five years. >> Anybody else? >> Sure, I'll say two companies. I'm going on four years at Zuora. I was at a different company for a year before that, although it was kind of the same job, first at the first company, and then at Zuora I was really focused on subscriber analytics and churn for my first couple of years. And then actually I kind of got a new job at Zuora by becoming the subscription economy expert. I became like an economist, even though I don't honestly have a background in it. My PhD's in biology, but now I'm a subscription economy guru. And a book author, I'm writing a book about my experiences in the area. >> Awesome. That's great. >> All right, I'll give a bit of a riddle. How do you have four jobs, five companies? >> In five years. >> In five years. (group laughing) >> Through a series of acquisition, acquisition, acquisition, acquisition. Exactly, so yeah, you have to really, really count on that one (laughs). >> I've been with three companies over the past five years and I would say I've had seven jobs. But what's interesting is I think it kind of mirrors and kind of mimics what's been going on in the data world. So I started my career in data analytics and business intelligence. But then along with that I had the fortune to work with the IT team. So the IT came under me. And then after that, the opportunity came about in which I was presented to work with compliance. So I became a compliance officer. So in healthcare, it's very interesting because these things are tied together. When you look at the data, and then the IT, and then the regulations as it relates to healthcare, you have to have the proper compliance, both internal compliance, as well as external regulatory compliance. 
And then from there I became CIO and then ultimately the chief operating officer. But what's interesting is, as I go through this, it's all still the same common themes. It's how do you use the data? And if anything it just gets to a level in which you become closer with the business, and that is the most important part. If you stand alone as a data scientist, or a data analyst, or the data officer, and you don't incorporate the business, you alienate the folks. There's a math I like to do. It's different from your basic math, right? I believe one plus one is equal to three, because when you get the data and the business together, you create that synergy and then that's where the value is created. >> Yeah, I mean if you think about it, data's the only commodity that increases value when you use it correctly. >> Yeah. >> Yeah so then that kind of leads to a question that I had. There's this mantra, the more data the better. Or is it more of an Einstein derivative? Collect as much data as possible but not too much. What are your thoughts? Is more data better? >> I'll take it. So, I would say the curve has shifted over the years. Before, it used to be that data was the bottleneck. But now, especially over the last five to 10 years, I feel like data is no longer oftentimes the bottleneck as much as the use case. The definition of what exactly we're going to apply it to, how we're going to apply it. Oftentimes once you have that clear, you can go get the data. And then in the case where there is no data, like with Mechanical Turk, you can always set up experiments, gather data; the cost of that is now so cheap to experiment that I think the bottleneck's really around the business understanding the use case. >> Mm-hmm. >> Mm-hmm. >> And I think the wave that we are seeing, I'm seeing this as, in some cases more data is good, in some cases more data is not good. And I think I'll start with where it is not good. 
I think where quality is more required is the area where more data is not good. For example, regulation and compliance. So for example in McKesson's case, we have to report on opioid compliance for different states. How much in opioid drugs we are supplying to states, and making sure we have very, very tight reporting and compliance with regulations. There, the highest quality of data is important. In our data organization, we have a very, very dedicated focus around maintaining that quality. So, quality is most important, quantity is not, if you will, in that case. Having the right data. Now on the other side of things, where we are doing some kind of exploratory analysis. Like what could be the right category management for our stores? Or what the right product pricing could be. Product has around 140 attributes. We would like to look at all of them and see what patterns we are finding in our models. So there you could say more data is good. >> Well you could definitely see a lot of cases. But certainly in financial services and a lot of healthcare, particularly in pharmaceutical, where you don't want work in process hanging around. >> Yeah. >> Some lawyer could find a smoking gun and say, "Ooh see." And then if that data doesn't get deleted. So, let's see, I would imagine it's a challenge in your business. I've heard people say, "Oh, now we can keep all the data, "it's so inexpensive to store." But that's not necessarily such a good thing, is it? >> Well, we're required to store data. >> For N number of years, right? >> Yeah, N number of years. But, sometimes they go beyond that number of years when there are legal requirements to comply or to answer questions. So we do keep more than, >> Like a legal hold for example. >> Yeah. So we keep more than seven years for example, and seven years is the regulatory requirement. But in the case of more data, I'm a data junkie, so I like more data (laughs). Whenever I'm asked, "Is the data available?" 
I always say, "Give me time, I'll find it for you." So that's really how we operate, because again, we're the go-to team; we need to be able to respond to regulators and to the business, and make sure we understand the data. So that's the other key. I mean, more data, but make sure you understand what that means. >> But has that perspective changed? Maybe go back 10 years, maybe 15 years ago, when you didn't have the tooling to be able to say, "Give me more data." "I'll get you the answer." Maybe, "Give me more data." "I'll get you the answer in three years." Whereas today, you're able to, >> I'm going to go get it off the backup tapes (laughs). >> (laughs) Yeah, right, exactly. (group laughing) >> Fortunately for us, Wells Fargo has had a data warehouse implemented for many years, I think more than 10 years. So we do have that capability. There's certainly a lot of platforms you have to navigate through, but if you are able to navigate, you can get to the data >> Yeah. >> within the required timeline. >> So it helps to have the technology and team behind you. Jung, you want to add something? >> Yeah, so that's an interesting question. So, clearly in healthcare there is a lot of data, and as I've come closer to the business, I also realize that there's a fine line between collecting the data and actually asking our folks, our clinicians, to generate the data. Because if you are focused only on generating data, with electronic medical records systems for example, there's burnout; you don't want the clinicians to be working to make sure you capture every element, because if you do so, yes, on the back end you have all kinds of great data, but on the other side, on the business side, it may not necessarily be a productive thing. And so we have to make a fine-line judgment as to the data that's generated, who's generating that data, and then ultimately how you end up using it. >> And I think there's a bit of a paradox here too, right? 
The geneticist in me says, "Don't ever throw anything away." >> Right. >> Right? I want to keep everything. But the most interesting insights often come from small data, which are a subset of that larger, keep-everything inclination that we as data geeks have. I think also, as we're moving into kind of the next phase of AI, when you can start really doing things like transfer learning, that small data becomes even more valuable, because you can take a model trained on one thing or a different domain and move it over to yours, to have a starting point where you don't need as much data to get the insight. So, I think from my perspective, the answer is yes. >> Yeah (laughs). >> Okay, go. >> I'll go with that, just to run with that question. I think it's a little bit of both, 'cause people touched on different definitions of more data. In general, more observations can never hurt you. But more features, or more types of things associated with those observations, actually can if you bring in irrelevant stuff. So going back to Rolland's answer, the first thing that's good is like a good mental model. My PhD is actually in physical science, so I think about physical science, where you actually have a theory of how the thing works and you collect data around that theory. I think the approach of just, oh let's put in 2,000 features and see what sticks, you know, you're leaving yourself open to all kinds of problems. >> That's why data science is not democratized, >> Yeah (laughing). >> because (laughing). >> Right, but first Carl, in your world, you don't have to guess anymore right, 'cause you have real data. >> Well yeah, of course, we have real data, but the collection, I mean for example, I've worked on a lot of customer churn problems. It's very easy to predict customer churn if you capture data that pertains to the value customers are receiving. 
If you don't capture that data, then you'll never predict churn by counting how many times they log in, or other more crude measures of engagement. >> Right. >> All right guys, we got to go. The keynotes are spilling out. Seth thank you so much. >> That's it? >> Folks, thank you. I know, I'd love to carry on, right? >> Yeah. >> It goes fast. >> Great. >> Yeah. >> Guys, great, great content. >> Yeah, thanks. And congratulations on participating and being data all-stars. >> We'd love to do this again sometime. All right and thank you for watching everybody, it's a wrap from IBM CDOs, Dave Vellante from theCUBE. We'll see you next time. (light music)
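Both Rolland's .98 AUC story and Carl's churn models lean on ROC AUC as their yardstick; it is simply the probability that a randomly chosen positive case outranks a randomly chosen negative one. A minimal pure-Python sketch on toy labels and scores (not anyone's real data):

```python
def roc_auc(labels, scores):
    """Mann-Whitney formulation of ROC AUC: the fraction of (positive, negative)
    pairs where the positive example gets the higher score; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]
ideal = roc_auc(labels, [0.9, 0.8, 0.7, 0.3, 0.2, 0.1])  # perfect ranking -> 1.0
coin = roc_auc(labels, [0.5] * 6)                         # no signal -> 0.5
```

An AUC near 1.0 on training data that stays near 1.0 through validation and production, as in Rolland's story, is the signal that the ranking generalizes rather than being overfit.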
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
IBM | ORGANIZATION | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
Europe | LOCATION | 0.99+ |
Seth Dobrin | PERSON | 0.99+ |
McKesson | ORGANIZATION | 0.99+ |
Wells Fargo | ORGANIZATION | 0.99+ |
May 20th | DATE | 0.99+ |
five companies | QUANTITY | 0.99+ |
Zuora | ORGANIZATION | 0.99+ |
two jobs | QUANTITY | 0.99+ |
seven jobs | QUANTITY | 0.99+ |
$1,000 | QUANTITY | 0.99+ |
50 jobs | QUANTITY | 0.99+ |
three companies | QUANTITY | 0.99+ |
last year | DATE | 0.99+ |
Seth | PERSON | 0.99+ |
Dave | PERSON | 0.99+ |
Clover | ORGANIZATION | 0.99+ |
Lucia Mendoza-Ronquillo | PERSON | 0.99+ |
seven years | QUANTITY | 0.99+ |
five | QUANTITY | 0.99+ |
two companies | QUANTITY | 0.99+ |
Clover Health | ORGANIZATION | 0.99+ |
four years | QUANTITY | 0.99+ |
Parag Shrivastava | PERSON | 0.99+ |
San Francisco | LOCATION | 0.99+ |
five years | QUANTITY | 0.99+ |
Rolland Ho | PERSON | 0.99+ |
$6,000 | QUANTITY | 0.99+ |
Lucia | PERSON | 0.99+ |
eight billion dollar | QUANTITY | 0.99+ |
5 years | QUANTITY | 0.99+ |
Carl | PERSON | 0.99+ |
more than seven years | QUANTITY | 0.99+ |
one company | QUANTITY | 0.99+ |
San Francisco, California | LOCATION | 0.99+ |
today | DATE | 0.99+ |
North America | LOCATION | 0.99+ |
One | QUANTITY | 0.99+ |
Four | QUANTITY | 0.99+ |
Jung | PERSON | 0.99+ |
three jobs | QUANTITY | 0.99+ |
Latitude Food Allergy Care | ORGANIZATION | 0.99+ |
One job | QUANTITY | 0.99+ |
2,000 features | QUANTITY | 0.99+ |
Carl Gold | PERSON | 0.99+ |
four jobs | QUANTITY | 0.99+ |
over $100 million | QUANTITY | 0.99+ |
first | QUANTITY | 0.99+ |
both | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
Einstein | PERSON | 0.99+ |
first question | QUANTITY | 0.99+ |
16 people | QUANTITY | 0.99+ |
three | QUANTITY | 0.99+ |
first goal | QUANTITY | 0.99+ |
Parag | PERSON | 0.99+ |
IBM Chief Data Officers Summit | EVENT | 0.99+ |
Rolland | PERSON | 0.99+ |
six months | QUANTITY | 0.98+ |
15 years ago | DATE | 0.98+ |
Jung Park | PERSON | 0.98+ |