Seth Dobrin, IBM - IBM Interconnect 2017 - #ibminterconnect - #theCUBE


 

>> Announcer: Live from Las Vegas, it's theCUBE, covering InterConnect 2017. Brought to you by IBM.
>> Okay, welcome back everyone. We are here live in Las Vegas from Mandalay Bay for IBM InterConnect 2017. This is theCUBE's three-day coverage of IBM InterConnect. I'm John Furrier with my co-host Dave Vellante. Our next guest is Seth Dobrin, Vice President and Chief Data Officer for IBM Analytics. Welcome to theCUBE, welcome back.
>> Yeah, thanks for having me again. I love sittin' down and chattin' with you guys.
>> You're a CDO, Chief Data Officer, and that's really kind of a pivotal role because you've got to look, as a chief, over all of the data with IBM Analytics. Also you have customers you're delivering a lot of solutions to and it's cutting edge. I liked the keynote on day one here. You had Chris Moody at Twitter. He's a data guy.
>> Seth: Yep.
>> I mean you guys have a deal with Twitter so he got more data. You've got The Weather Company, you got that data set. You have IBM customer data. You guys are full with data right now.
>> We're bursting at the seams with data and that's a good thing.
>> So what's the strategy, what are you guys working on, and what are the key points that you guys are honing in on? Obviously, Cognitive to the Core is Rometty's theme. How are you guys making data work for IBM and your customers?
>> If you think about IBM Analytics, we're really focusing on five key areas, five things that we think, if we get them right, will help our clients learn how to drive their business and data strategies right. One is around how do I manage data across hybrid environments? So what's my hybrid data management strategy? It used to be how do I get to public cloud, but really it's a conversation about how every enterprise has its business critical assets, what people call legacy. If we call them business critical and we think about-- These are how companies got here today. This is what they make their money on today.
The real challenge is how do we help them tie those business critical assets to their future state cloud, whether it's public cloud, private cloud, or something in between, a hybrid cloud. One of the key strategies for us is hybrid data management. Another one is around unified governance. If you look at governance in the past, governance in the past was an inhibitor. It was something that people went (groan) "Governance, so I have to do it."
>> John: Barbed wire.
>> Right, you know. When I've been at companies before, and thought about building a data strategy, we spent the first six months of building the data strategy trying to figure out how to avoid data governance, or the word data governance, and really, we need to embrace data governance as an enabler. If you do it right, if you do it upfront, if you wrap in things that include model management, how do I make sure that my data scientists can get to the data they need upfront by classifying data ahead of time, understanding entitlements, understanding what the intent was when people gave consent. You also take out of the developer's hands the need to worry about governance because now, in a unified governance platform, right, it's all API-driven. Just like our applications are all API-driven, how do we make our governance platform API-driven? If I'm an application developer, by the way, I'm not, I can now call an API to manage governance for me, so I don't need to worry about am I giving away the shop? Am I going to get the company sued? Am I going to get fired? Now I'm calling an API. That's only two of them, right? The third one is really around data science and machine learning. So how do we make machine learning pervasive across enterprises, with things like Data Science Experience, Watson, IBM Machine Learning. We're now bringing that machine-learning capability to the private cloud, right, because 90% of data that exists can't be Googled, so it's behind firewalls. How do we bring machine learning to that?
>> One more!
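[Editor's sketch] Seth's point that a developer can call an API and let it answer the governance question can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not IBM's platform or any real product interface: `GovernanceService`, `DatasetRecord`, `can_use`, the role names, and the classification labels are all hypothetical. It only shows the shape of the idea, where classification, entitlements, and consent are recorded up front and the application asks a single question.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetRecord:
    """Governance facts captured ahead of time, when the data is classified."""
    name: str
    classification: str            # hypothetical labels: "public", "internal", "pii"
    consented_purposes: frozenset  # purposes people agreed to when they gave consent


@dataclass
class GovernanceService:
    """Hypothetical in-memory stand-in for a governance API."""
    datasets: dict = field(default_factory=dict)
    entitlements: dict = field(default_factory=dict)  # role -> allowed classifications

    def register(self, record: DatasetRecord) -> None:
        self.datasets[record.name] = record

    def can_use(self, role: str, dataset: str, purpose: str) -> bool:
        record = self.datasets.get(dataset)
        if record is None:
            return False  # data that was never classified is denied by default
        allowed = self.entitlements.get(role, set())
        return (record.classification in allowed
                and purpose in record.consented_purposes)


governance = GovernanceService(entitlements={
    "data_scientist": {"public", "internal", "pii"},
    "app_developer": {"public", "internal"},
})
governance.register(
    DatasetRecord("customer_profiles", "pii", frozenset({"billing", "support"}))
)

# The application code asks one question instead of encoding policy itself.
print(governance.can_use("data_scientist", "customer_profiles", "billing"))
print(governance.can_use("data_scientist", "customer_profiles", "marketing"))
print(governance.can_use("app_developer", "customer_profiles", "billing"))
```

The design choice worth noting is the default deny for unregistered data, which mirrors the argument for classifying data ahead of time rather than at the point of use.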
>> One more! That's around, God, I gave you quite a list--
>> Hybrid data management, unified governance, data science and machine learning--
>> Oh, the other one is Open Source, our commitment to Open Source. Our commitment to Open Source, like Hadoop, Spark. As we think about unified governance, a truly unified governed platform needs to be built on top of Open Source, so IBM is doubling down on our commitment to Apache Atlas as a framework backbone, a metadata framework for our unified governed platform.
>> What's the biggest para
>> Wait, did we miss one? Hybrid data management, unified governance, data science machine learning (talking over another), pervasive, and open source.
>> That's four.
>> I thought it was five.
>> No.
>> Machine learning and data science are two, so typically five.
>> There's only four. If I said five, there's only four.
>> Cover the data governance thing because this unification is interesting to me, because one of the things we see in the marketplace is people hungry for data ops. Like what DevOps was for cloud, there's a whole application developer model developing, with a new developer persona emerging, where I want to code and I want to just tap data handled by brilliant people or cognitive engines that just serve me up what I need, like a routine or a procedure, or a subroutine, whatever you want to call it. That's a data DevOps model kind of thing. How will you guys do it? Do you agree with that and how does that play out?
>> That's a combination, in my mind, that's a combination of an enterprise creating data assets, so treating data as the asset it is and not a digital dropping of applications, and it's that combined with metadata. It gets back to the Apache Atlas conversation. If you want to understand your data and know where it is, it's a metadata problem.
What's the data; what's the lineage; where is it; where does it live; how do I get to it; what can I, can't I do with it, and so that just reinforces the need for an Open Source, ubiquitous metadata catalog, a single catalog, and then a single catalog of policies associated with that, all driven in a composable way through APIs.
>> That's a fundamental, cultural thinking shift because you're saying, "I don't want to just take exhaust from apps," which is just how people have been dealing with data. You're saying, "Get holistic and say you need to create an asset class or layer or something that is designed."
>> If enterprises are going to be successful with data, now we're getting to five things, right, so there's five things. They need to treat data as an asset. It's got to be a first-class citizen, not a digital dropping, and they need a strategy around it. So, conceptually, what are the pieces of data that I care about? My customers, my products, my talent, my finances, what are the limited number of things? What is my data science strategy? How do I build deployable data science assets? I can't be developing machine-learning models and deploying them in Excel spreadsheets. They have to be integrated into my processes. I have to have a cloud strategy, so am I going to be on premise? Am I going to be off premise? Am I going to be something in between? I have to get back to unified governance. I have to govern it, right? Governing in a single place is hard enough, let alone multiple places, and then my talent disappears.
>> Could you peg a progress bar of the industry where these would be, what you just said, because, I think--
>> Dave: Again, we only got through four.
>> No, talent was the last one.
>> Talent, sorry, missed it.
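[Editor's sketch] The questions Seth lists for a metadata catalog (what is the data, what is its lineage, where does it live, what can I do with it) can be made concrete with a small example. This is a hedged illustration only: it is not Apache Atlas or its API, and `CatalogEntry`, `MetadataCatalog`, the dataset names, and the locations are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One asset in a hypothetical single catalog."""
    name: str
    location: str             # where the data lives
    derived_from: tuple = ()  # direct upstream assets (lineage)
    allowed_uses: tuple = ()  # policy attached to the asset


class MetadataCatalog:
    def __init__(self):
        self._entries = {}

    def add(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def describe(self, name: str) -> CatalogEntry:
        return self._entries[name]

    def lineage(self, name: str) -> list:
        """Return every upstream asset, nearest first, without repeats."""
        seen = []
        queue = list(self._entries[name].derived_from)
        while queue:
            parent = queue.pop(0)
            if parent in seen:
                continue
            seen.append(parent)
            if parent in self._entries:
                queue.extend(self._entries[parent].derived_from)
        return seen


catalog = MetadataCatalog()
catalog.add(CatalogEntry("raw_orders", "warehouse/raw/orders"))
catalog.add(CatalogEntry("customers", "warehouse/raw/customers"))
catalog.add(CatalogEntry("clean_orders", "warehouse/clean/orders",
                         derived_from=("raw_orders",)))
catalog.add(CatalogEntry("revenue_report", "marts/revenue",
                         derived_from=("clean_orders", "customers"),
                         allowed_uses=("finance",)))

print(catalog.lineage("revenue_report"))
print(catalog.describe("revenue_report").allowed_uses)
```

Keeping lineage and policy on the same entry is one way to read "a single catalog of policies associated with that"; a real system could just as well store them separately and join on the asset name.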
>> In the progress bar of work, how are the enterprises right now? 'Cause actually the big conversation on the cloud side is enterprise-readiness, enterprise-grade, that's kind of an ongoing conversation, but now, if you take your premise, which I think is accurate, that I've got to have a centralized data strategy and platform, not a data (mumbles), more than that, software, et cetera, where's the progress bar? Where are people, Pegeninning?
>> I think they are all over the map. I've only been with IBM for four months and I've been spending much of that time literally traveling around the world talking to clients, and clients are all over the map. Last week I spent a week in South America with a media company, a cable company down there. Before setting up the meeting, the guy was like, "Well, you know, we're not that far along down this journey," and I was like, "Oh, my God, you guys are like so far ahead of everyone else! That's not even funny!" And then I'm sitting down with big banks that think they're like way out there and they haven't even started on the journey. So it's really literally all over the place, and it's even within industry. There's financial companies that are also way out there. There's another bank in Brazil that uses biometrics to access ATMs, you don't need a PIN anymore. They have analytics that drive all that. That's crazy. We don't have anything like that here.
>> Are you meeting with CDOs?
>> Yeah, mostly CDOs, or kind of de facto ones, like we talked about before this show. Mostly CDOs.
>> So you may be unique in the sense that you are working for a technology company, so a lot of your time is outward-focused, but when you travel around and meet with the CDOs, how much of their time is inward-focused versus outward-focused?
>> My time is actually split between inward and outward focus because part of my time is transforming our own business using data and analytics, because IBM is a company and we got to figure out how to do that.
>> Is it correct that yours is probably a higher percentage outward?
>> Mine's probably a higher percentage outward than most CDOs, yeah. So I think most CDOs are 70%, 80% inward-focused and 20% outward-focused, and a lot of that outward focus is just trying to understand what other people are doing.
>> I guess it's okay for now, but will that change over time?
>> I think that's about right. It gets back to the other conversation we had before the show about your monetization strategy. I think if a company progresses to where it's no longer about how do I change my processes and use data to monetize my internal process, if I'm going to start figuring out how I sell data, then CDOs need to get a more external--
>> But you're supporting the business in that role, and that's largely going to be an internal function of data quality, governance, and the like, like you say, the data science strategy.
>> Yeah, and I think it's important when I talk about data governance, I think things that we used to talk about as data management are all part of data governance. Data governance is not just controlling. It's all of that. It's how do I understand my data, how do I provide access to my data. It's all those things you need to enable your business to thrive on data.
>> My question for you is a personal one. How did you get to be a CDO? Do you go to a class? I'm going to be a CDO someday. Not that you do that, I'm just--
>> CDO school.
>> CDO school.
>> Seth: I was staying in a Holiday Express last night. (laughing)
>> Tongue in cheek aside, people are getting into CDO roles from interesting vectors, right? Anthropology, science, art, I mean, it's really interesting. Math geeks certainly love it, they thrive there, but there's not one, I haven't yet seen one sweet spot. Take us through how you got into it and what--
>> I'm not going to fit any preconceived notion of what a CDO is, especially in a technology company. My background is in molecular and statistical genetics.
>> Dave: Well, that explains it.
>> I'm a geneticist.
>> Data has properties that could be kind of biological.
>> And actually, if you think about the roots of big data and data science, or big data, at least, two of the probably fundamental drivers of the concept of big data were genetics and astrophysics. So 20 years ago when I was getting my PhD, we were dealing with tens and hundreds of gigabyte-sized files. We were trying to figure out how do we get stuff out of 15 Excel files, because they weren't big enough, into a single CSV file. Millions of rows, crude by today's standard, but it was still, how do we do this, and so 20 years ago I was learning to be a data scientist. I didn't know it. I stopped doing that field and I started managing labs for a while, and then in my last role, we kind of transformed how the research group within that company, in the agricultural space, handled and managed data, and I was simultaneously the biggest critic and biggest advocate for IT, and they said, "Hey, come over and help us figure out how to transform the company the way we've transformed this group."
>> It looks like when you talk about your PhD experience, it's almost like you were so stuck in the mud with not having the compute power or sort of tooling. It's like a hungry man saying, "Oh, it's an unlimited abundance of compute, oh, I love what's going on." So you almost get gravitated, pulled into that, right?
>> It was funny, I was doing a demo upstairs today with, one of the sales guys was doing a demo with some clients, and in one line of code, they had expressed what was part of my dissertation. It was a single line of code in a script and it was like, that was someone's entire four-year career 20 years ago.
>> Great story, and I think that's consistent with just people who are just attracted to it, and they end up being captains of industry. This is a hot field. You guys have a CDO event happening in San Francisco.
We'll be doing some live streaming there. What's the agenda, because this is a very accelerating field? You mentioned now dealing practically with compliance and governance, which you'd run in the other direction from in the old days, and now it's about embracing that. It's got to get (mumbles) and discipline in management. What's going to go on at CDO Summit, or do you know?
>> At the CDO Summit next week, I think we're going to focus on three key areas, right? What does a cloud journey look like? Maybe four key areas, right. So a cloud journey, how do you monetize data and what does that even mean, and talent, so at all these CDO Summits, the IBM CDO Summits have been going on for three or four years now, every one of them has a talent conversation, and then governance. I think those are four key concepts, and not surprisingly, they were four of the five on my list. I think that's what really we're going to talk about.
>> The unified governance, tell us how that happens in your vision, because that's something that, you hear unified identity, we hear blockchain looking at a whole new disruptive way of dealing with value digitally. How do you see the data governance thing unifying?
>> Well, I think again, it's around... IBM did a great job of figuring out how to take an Open Source product, that was Spark, and make it the heart of our products. It's going to be the same thing with governance, where you're going to see Apache Atlas, which is in its infancy right now, as that open backbone so that people can get in and out of it easy. If you're going to have a unified governance platform, it's going to be open by definition because I need to get other people's products on there. I can't go to an enterprise and say we're going to sell you a unified governance platform, but you got to buy all IBM, or you got to spend two years doing development work to get it on there.
So open as the framework, and composable, API-driven, and proactive, those are really, I think, kind of the key pieces for it.
>> So we all remember the client-server days where it took a decade and a half to realize, "Oh, my gosh, this is out of control and we need to bring it back in." And the Wild West days of big data, it feels like enterprises have nipped that governance issue in the bud, at least. Maybe they don't have it under control yet, but they understand the need to get it under control. Is that a fair statement?
>> I think they understand the need. The data is so big and grows so fast that another component that I didn't mention, maybe it was implied a little bit, is automation. You need to be able to capture metadata in an automated fashion. We were talking to a client earlier who has 400 terabytes a day of data changes, not even talking about what new data they are ingesting. How do they keep track of that? It's got to be automated. With this unified governance, you need to capture this metadata in as automated a fashion as possible. Master data needs to be automated when you think about--
>> And make it available in real time, low-latency, because otherwise it becomes a data swamp.
>> Right, it's got to be proactive, real-time, on-demand.
>> Another thing I wanted to ask you, Seth, and get your opinion on: in sort of the mid-2000s, when the Federal Rules of Civil Procedure changed and electronic documents and records became admissible, it was always about how do I get rid of data, and that's changed. Everybody wants to keep data and analyze it, and so forth, so what about that balance? And one of the challenges back then was data classification. I can't scale, by governance, I can't eliminate and defensively delete data unless I can classify it.
Is the analog true where, with data as an opportunity, I can't do a good job or a good enough job analyzing my data and keeping my data under control without some kind of automated classification, and has the industry solved that?
>> I don't think the industry has completely solved it yet, but I think with cognitive tools, there's tools out there, that we have and that other people have, that can automatically, if you give them parameters and train them, classify the data for you, and I think classification is one of the keys. You need to understand how the data's classified so you understand who can access it and how long you should keep it, and so it's key, and that's got to be automated also. I think we've done a fair job as an industry of doing that. There's still a whole lot of work, especially as you get into the kind of specialized sectors, and so I think that's a key and we've got to do a better job of helping companies train those things so that they work. I'm a big proponent of don't give your data away to IT companies. It's your asset. Don't let them train their models with your data and sell it to other people, but there are some caveats. There are some core areas where industries need to get together and let IT companies, whether it's IBM or someone else, train models for things just like that, for classification, because if someone gets it wrong, it can bring the whole industry down.
>> It's almost as if (talking over each other) source paradigm almost. It's like Open Source software. Share some data, but I--
>> Right, and there's some key things that aren't differentiating that, as an industry, you should get together and share.
>> You guys are making, IBM is making a big deal out of this, and I think it's super important. I think it's probably the top thing that CDOs and CIOs need to think about right now: if I really own my data and that data is needed to train my big data models, who owns the models and how do I protect my IP?
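[Editor's sketch] Seth's claim that classification tells you who can access data and how long to keep it can be shown with a toy example. Everything here is an assumption made for illustration: the two regular expressions stand in for the trained cognitive tools he mentions, and the labels, team names, and retention periods are invented, not taken from any real policy or product.

```python
import re

# Hypothetical patterns standing in for a trained classifier.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "card_number": re.compile(r"^\d{4}([ -]?\d{4}){3}$"),
}

# Hypothetical policies keyed by classification label.
POLICIES = {
    "card_number": {"access": {"payments"}, "retention_days": 90},
    "email": {"access": {"payments", "marketing"}, "retention_days": 365},
    "unclassified": {"access": set(), "retention_days": 0},
}


def classify_column(values):
    """Label a column by the first pattern that matches every non-empty value."""
    samples = [v for v in values if v]
    for label, pattern in PATTERNS.items():
        if samples and all(pattern.match(v) for v in samples):
            return label
    return "unclassified"


def policy_for(values):
    """Look up access and retention rules from the column's classification."""
    return POLICIES[classify_column(values)]


emails = ["a@example.com", "b@example.org"]
cards = ["4111 1111 1111 1111", "4111-1111-1111-1111"]
notes = ["call back on Tuesday"]

for column in (emails, cards, notes):
    label = classify_column(column)
    print(label, policy_for(column)["retention_days"])
```

Unclassified data gets no access and no retention entitlement in this sketch, which matches the interviewer's point that data cannot be defensibly kept or deleted until it has been classified.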
>> And are you selling it to my competitors? Are you going down the street and taking away my IP, my differentiating IP, and giving it to my competitor?
>> So do I own the model? 'Cause the data and models are coming together, and that's what IBM's telling me.
>> Seth: Absolutely.
>> I own the data and the models that it informs, is that correct?
>> Yeah, that's absolutely correct. You guys made the point earlier about IBM bursting at the seams on data. That's really the driver for it. We need to do a key set of training. We need to train our models with content for industries, bring those trained models to companies, and let them train specific versions for their company with their data that, unless there's a reason they tell us to do it, is never going to leave their company.
>> I think that's a great point about you being full of data, because a lot of people who are building solutions and scaffolding for data, aka software, are never data-full. The typical, "Oh, I'm going to be a software company," and they build something that they don't (mumbles) for. You're data-full, so you know the problem. You're living it every day. It's opportunity.
>> Yeah, and that's why when a startup comes to you and says, "Hey, we have this great AI algorithm. Give us your data," they want to resell that model, because they don't have access to the content. If you look at what IBM's done with Watson, right? That's why there's specialized verticals that we're focusing Watson on, Watson Health, Watson Financial, because we are investing in data in those areas. You can look at the acquisitions we've done, right. We're investing in data to train those models.
>> We should follow up on this because this brings up the whole scale point. If you look at all the innovators of the past decade, even two decades, Yahoo, Google, Facebook, these are companies that were webscalers before there was anything that they could buy. They built their own because they had their own problem at scale.
>> At scale.
>> And data at scale is a whole other mind-blowing issue. Do you agree?
>> Absolutely.
>> We're going to put that on the agenda for the CDO Summit in San Francisco next week. Seth, thanks so much for joining us on theCUBE. Appreciate it. Chief Data Officer, this is going to be a hot field. The CDO is going to be a very important opportunity for anyone watching in the data field. This is going to be new opportunities. Get that data, get it controlled, taming the data, making it valuable. This is theCUBE, taming all of the content here at InterConnect. I'm John Furrier with Dave Vellante. More content coming. Stay with us. Day Two coverage continues. (innovative music tones)

Published Date : Mar 22 2017
