Daniel Hernandez, Analytics Offering Management | IBM Data Science For All


 

>> Announcer: Live from New York City, it's theCUBE. Covering IBM Data Science For All. Brought to you by IBM. >> Welcome to the Big Apple, John Walls and Dave Vellante here on theCUBE; we are live at IBM's Data Science For All. We're going to be here throughout the day, with a big panel discussion wrapping up our day, so be sure to stick around all day long on theCUBE for that. Dave, always good to be here in New York, is it not? >> Well you know, it's been kind of the data science weeks, months. Last week we were in Boston at an event with the chief data officer conference. All the Boston Datarati were there; now we bring it all down to New York City, getting really hardcore with data science. So it's from the chief data officers to the hardcore data scientists. >> The CDO, a hot term right now. Daniel Hernandez now joins us as our first guest here at Data Science For All. He's a VP of IBM Analytics. Good to see you, Daniel; thanks for being with us. >> Pleasure. >> Alright, well give us first off your take; let's just step back high level here. Data science, it's certainly been evolving for decades, if you will. First off, how do you define it today? And then, just from the IBM side of the fence, how do you see it in terms of how businesses should be integrating this into their mindset? >> So the way I describe data science simply to my clients is: it's using the scientific method to answer questions or deliver insights. It's kind of that simple. Or answering questions quantitatively. So it's a methodology, it's a discipline; it's not necessarily tools. So that's kind of the way I approach describing what it is. >> Okay, and then from the IBM side of the fence, in terms of how wide of a net you are casting these days, I assume it's as big as you can get your arms out. >> So when you think about any particular problem that's a data science problem, you need certain capabilities. We happen to deliver those capabilities. You need the ability to collect, store, and manage any and all data.
You need the ability to organize that data so you can discover it and protect it. You've got to be able to analyze it. Automate the mundane, explain the past, predict the future. Those are the capabilities you need to do data science, and we deliver a portfolio of them, including, on the analyze part of our portfolio, the data science tools that we would declare as such. >> So data science for all is very aspirational, and when you guys made the announcement of the Watson data platform last fall, one of the things that you focused on was collaboration between data scientists, data engineers, quality engineers, application development, the whole sort of chain. And you made the point that most of the time that data scientists spend is on wrangling data. You're trying to attack that problem, and you're trying to break down the stovepipes between those roles that I just mentioned. All that has to happen before you can actually have data science for all. I mean, that's just data science for all hardcore data people. Where are we in terms of sort of the progress that your clients have made in that regard? >> So you know, I would say there are two major vectors of progress we've made. If you want data science for all, you need to be able to address people that know how to code and people that don't know how to code. So consider kind of the history of IBM in the data science space, especially with SPSS, which has been around for decades: we're mastering and solving data science problems for non-coders. The data science experience really started with embracing coders, developers that grew up in open source, that lived in and learned Jupyter or Python and were more comfortable there. And integration of these is kind of our focus. So that's one aspect: serving the needs of people that know how to code and don't, in the kind of data science role.
And then 'for all' means supporting an entire analytics life cycle: from collecting the data you need in order to answer the question that you're trying to answer, to organizing that information once you've collected it so you can discover it inside of tools like our own data science experience and SPSS, and then of course the set of tools around exploratory analytics. All integrated so that you can do that end-to-end life cycle. So where clients are, I think they're getting certainly much more sophisticated in understanding that. You know, most people have approached data science as a tool problem, or as a data prep problem. It's a life cycle problem. And that's kind of how we're thinking about it. We're thinking about it in terms of: alright, if our job is answering questions and delivering insights through scientific methods, how do we decompose that problem into a set of things that people need to get the job done, serving the individuals that have to work together? >> And when you think about it, go back to the days when the data warehouse was king, something we talked about in Boston last week. It used to be that the data warehouse was king; now the process is much more important. But very few people had access to that data, you had the elapsed time of getting answers, and the inflexibility of the systems. Has that changed, and to what degree has it changed? >> I think if you were to go ask anybody in business whether or not they have all the data they need to do their job, they would say no. Why? We've invested in EDWs, we've invested in Hadoop. In part, sometimes the problem might be, I just don't have the data. Most of the time it's, I have the data, I just don't know where it is.
So there's a pretty significant issue around data discoverability. I might have data in my operational systems, I might have data inside my EDW, but I don't have everything inside my EDW; I've stood up one or more data lakes; and to solve a problem like customer segmentation I have data everywhere, so how do I find it and bring it in? >> That seems like that should be a fundamental consideration, right? If you're going to gather this much more information, make it accessible to people. And if you don't, it's a big flaw, it's a big gap, is it not? >> So yes, and I think part of the reason why is that governance professionals, which I am, you know, I've spent quite a bit of time trying to solve governance-related problems, have been focusing pretty maniacally on kind of the compliance, regulatory, and security related issues. Like, how do we keep people from going to jail? How do we ensure regulatory compliance with things like e-discovery and records, for instance? And it just so happens the same discipline that you use, even though in some cases in lighter weight implementations, is what you need in order to solve this data discovery problem. So the discourse around governance has historically been about compliance, about regulations, about cost takeout, not analytics. And so a lot of our time, certainly in R&D, is spent trying to solve that data discovery problem, which is: how do I discover data using the semantics that I have, which as a regular user are not physical understandings of my data; and once I find it, how am I assured that what I get is what I should get, so that I'm not subject to compliance-related issues and not making the company more vulnerable to a data breach? >> Well, so presumably part of that anyway involves automating classification at the point of creation or use, which actually was a technical challenge for a number of years. Has that challenge been solved in your view?
>> I think machine learning is, and in fact later on today I will be doing some demonstrations of technology which will show how we're making the application of machine learning easy. Inside of everything we do, we're applying machine learning techniques, including to classification problems that help us solve the problem. So it could be that we're automatically harvesting technical metadata. Are there business terms that could be automatically extracted that don't require some data steward to have to know and assert, right? Or can we automatically suggest, and still have the steward for a case where I need a canonical data model, so I don't want the machine to tell me everything, but I want the machine to assist the data curation process? We are not just exploring the application of machine learning to solve that data classification problem, which historically was a manual one; we're embedding it into most of the stuff that we're doing. Often you won't even know that we're doing it behind the scenes. >> So that means that oftentimes the machine, ideally, is making the decisions as to who gets access to what, and is helping at least automate that governance. But there's a natural friction that occurs, and I wonder if you can talk about the balance sheet, if you will, between information as an asset and information as a liability. You know, the more restrictions you put on that information, the more it constricts, you know, a business user's ability. So how do you see that shaping up? >> I think it's often a people and process problem, not necessarily a technology problem. I don't think as an industry we've figured it out. Certainly a lot of our clients haven't figured out that balance. I mean, there are plenty of conversations I'll go into where I'll talk to a data science team in the same line of business as a governance team, and what the data science team will tell us is: I'm building my own data catalog, because the stuff that the governance guys are doing doesn't help me.
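As an aside on the machine-assisted curation Hernandez describes, where the machine suggests business terms and the data steward confirms them, a minimal sketch of the idea might look like the following. The glossary, column names, and string-similarity matching are illustrative stand-ins, not IBM product APIs:

```python
# Hypothetical sketch of machine-assisted data curation: suggest business
# glossary terms for raw technical column names, leaving a data steward to
# confirm the canonical term. Names and matching logic are illustrative only.
import difflib

GLOSSARY = ["customer_id", "customer_name", "postal_code", "order_date", "revenue"]

def suggest_terms(column_name, glossary=GLOSSARY, n=3):
    """Return up to n glossary terms resembling the technical column name."""
    normalized = column_name.lower().replace("-", "_")
    return difflib.get_close_matches(normalized, glossary, n=n, cutoff=0.4)

# The machine suggests candidates; the steward still asserts the final term.
suggestions = suggest_terms("CUST_ID")
```

A real curation assistant would use trained classifiers and steward feedback rather than plain string similarity; the design point is that the machine proposes and the human asserts.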
And the reason why it doesn't help me is that they're going through this top-down data curation methodology, while I've got a question and need to go find the data that's relevant, and I might not know what that is straight away. So the CDO function in a lot of organizations is helping bridge that. You'll see governance responsibilities line up under the CDO along with analytics, and I think that's gone a long way to bridge that gap. But that conversation I was just mentioning is not unique to one or two customers; still a lot of customers are doing it, often customers that either haven't started a CDO practice or are early days on it still. >> So about that, because this is being introduced to the workplace as a fairly new concept, right, the CDO, as opposed to CIO or CTO, you know, these other roles. I mean, how do you talk to your clients about trying to broaden their perspective on that, and I guess emphasizing the need for them to consider giving somebody sole responsibility, or primary responsibility, for their data, instead of just lumping it in somewhere else? >> So we happen to have one of the best CDOs inside of our group, which is like a handy tool for me. So if I go into a client and it's purporting to be a data science problem, and it turns out they have a data management issue around data discovery, and they haven't yet figured out how to install the process and people designed to solve that particular issue, one of the key things I'll do is bring in our CDO and his delegates to have a conversation with them on what we're doing inside of IBM and what we're seeing in other customers, to help institute that practice inside of their own organization. We have forums like the CDO event in Boston last week, which are designed, you know, not to be 'here's what IBM can do in technology,' but to say: here's how the discipline impacts your business, and here's some best practices you should apply.
>> So if ultimately I enter into those conversations and I find that there's a need, I typically am like, alright: tools are part of the problem but not the only issue, so let me bring someone in that can describe the people and process related issues, which you've got to get right in order for, in some cases, the tools that I deliver to matter. >> We had Seth Dobrin on last weekend in Boston, and Inderpal Bhandari as well, and he put forth this enterprise, sort of data blueprint if you will. CDOs are sort of-- >> Daniel: We're using that in IBM, by the way. >> Well this is the thing, it's a really well thought out sort of structure that seems to be trickling down to the divisions. And so it's interesting to hear how you're applying Seth's expertise. I want to ask you about the Hortonworks relationship. You guys have made a big deal about that this summer. To me it was a no-brainer. Really, what was the point of IBM having a Hadoop distro? And Hortonworks gets this awesome distribution channel. IBM has always had an affinity for open source, so that made sense there. What's behind that relationship, and how's it going? >> It's going awesome. Perhaps what we didn't say, and we probably should have focused on, is the why-customers-care aspect. There are three main use cases that customers are implementing where, even before the relationship, they were asking IBM and Hortonworks to work together. And so we were coming to the table working together as partners before the deeper collaboration we started in June. The first one was bringing data science to Hadoop: running data science models, doing data exploration where the data is. And if you were to actually rewind the clock on the IBM side and consider what we did with Hortonworks in full consideration of what we did prior, we brought the data science experience and machine learning to Z in February. The highest value transactional data was there.
The next step was bringing data science to what is, often for a lot of clients, the second most valuable set of data, which is Hadoop. So that was kind of part one. And then we've kind of continued that by bringing the data science experience to the private cloud. So that's one use case: I've got a lot of data, I need to do data science, I want to do it in residence, I want to take advantage of the compute grid I've already laid down, and I want to take advantage of the performance benefits and the integrated security and governance benefits of having these things co-located. That's kind of play one. So we're bringing the data science experience and HDP and HDF, which are the Hortonworks distributions, way closer together and optimized for each other. Another component of that is that not all data is going to be in Hadoop, as we were describing. Some of it's in an EDW, and that data science job is going to require data outside of Hadoop, and so we brought Big SQL. It was already supporting Hortonworks; we just optimized the stack. So the combination of the data science experience and Big SQL allows you to do data science against a broader surface area of data. That's kind of play one. Play two is: I've got an EDW, and either for cost or agility reasons I want to augment it, or in some cases I might want to offload some data from it to Hadoop. And so the combination of Hortonworks plus Big SQL and our data integration technologies is a perfect combination there, and we have plenty of clients using that for kind of analytics offloading from the EDW. And then the third piece that we're doing quite a bit of engineering and go-to-market work around is governed data lakes. So I want to enable self-service analytics throughout my enterprise. I want to give self-service analytics tools to everyone that has access to them. I want to make data available to them, but I want that data to be governed, so that they can discover what's in the lake, and whatever I give them is what they should have access to.
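To make the federation idea concrete: the value of an engine like Big SQL is issuing one SQL statement whose tables live in different systems. The sketch below fakes that with two tables in a single in-memory SQLite database, standing in for an EDW table and a Hadoop extract; it illustrates the single-query join, not the Big SQL engine itself, and all names and figures are made up:

```python
# Illustrative only: sqlite3 stands in for two separate systems (an EDW
# table and a Hadoop extract) so the shape of a federated query is visible.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE edw_customers (customer_id INTEGER, segment TEXT);
    CREATE TABLE lake_clicks (customer_id INTEGER, clicks INTEGER);
    INSERT INTO edw_customers VALUES (1, 'premium'), (2, 'basic');
    INSERT INTO lake_clicks VALUES (1, 40), (2, 7);
""")

# One query spanning both "systems" -- the point of query federation.
rows = conn.execute("""
    SELECT c.segment, SUM(k.clicks)
    FROM edw_customers c JOIN lake_clicks k USING (customer_id)
    GROUP BY c.segment ORDER BY c.segment
""").fetchall()
```

In a real federated setup the engine pushes the per-table work down to each source system and performs the join itself; SQLite here is simply a stand-in so the query shape is visible.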
>> So those are kind of the three tracks that we're working with Hortonworks on, and all of them are producing stunning results inside of clients. >> And so that involves actually some serious engineering as well-- >> Big time. It's not just sort of a Barney deal or just a pure go-to-market-- >> It's certainly more than go-to-market; the architecture just works. >> Big picture down the road, then: whatever challenges you see on your side of the business for the next 12 months, what are you going to tackle? What's that monster out there that you think, okay, this is our next hurdle to get by? >> I forget if Rob said this before, but you'll hear him say often, and it's statistically proven, that the majority of the data that's available is not available to be Googled; it's behind a firewall. And so we started last year with the Watson data platform, creating an integrated data and analytics system. What if customers have data that's on-prem that they want to take advantage of? What if they're not ready for the public cloud? How do we deliver public cloud benefits to them when they want to run that workload behind a firewall? So we're doing a significant amount of engineering, really starting with the work that we did on the data science experience, bringing it behind the firewall but still delivering similar benefits to what you would expect if we were delivering it in the public cloud. A major advancement that IBM made is IBM Cloud Private. I don't know if you guys are familiar with that announcement; we made it, I think, two weeks ago. So it's a (mumbles) foundation, on top of which we have microservices, on top of which our stack is going to be made available. So when I think of kind of where the future is: our customers, we believe, ultimately want to run data and analytic workloads in the public cloud. How do we get them there, considering they're not there now, in a stepwise fashion that is sensible economically, project-management-wise, and culturally?
Without having them have to wait. That's kind of the big picture, kind of a big problem space we're spending considerable time thinking through. >> We've been talking a lot about this on theCUBE in the last several months, or even years: people realize they can't just reform their business and stuff it into the cloud. They have to bring the cloud model to their data, wherever that data exists. If it's in the cloud, great. And the key there is you've got to have a capability and a solution that substantially mimics that public cloud experience. That's kind of what you guys are focused on. >> What I tell clients is: if you're ready for certain workloads, especially greenfield workloads, and the capability exists in a public cloud, you should go there now, because you're going to want to go there eventually anyway. And if not, then a vendor like IBM helps you take advantage of that behind a firewall, often in form factors that are ready to go. The integrated analytics system, I don't know if you're familiar with that, includes our super advanced data warehouse, the data science experience, and our query federation technology powered by Big SQL, all in a form factor that's ready to go. You get started there for data and data science workloads, and that's a major step in the direction of the public cloud. >> Alright, well Daniel, thank you for the time; we appreciate that. We didn't get to touch at all on baseball, but next time, right? >> Daniel: Go Cubbies. (laughing) >> Sore spot with me, but it's alright, go Cubbies. Alright, Daniel Hernandez from IBM; back with more here from Data Science For All, IBM's event here in Manhattan. Back with more in theCUBE in just a bit. (electronic music)

Published Date : Nov 1 2017



Seth Dobrin, IBM Analytics - IBM Fast Track Your Data 2017


 

>> Announcer: Live from Munich, Germany, it's theCUBE. Covering IBM Fast Track Your Data. Brought to you by IBM. (upbeat techno music) >> For you here at the show, generally and specifically, what are you doing here today? >> There are really three things going on at the show, three high-level things. One is we're talking about our new... how we're repositioning our hybrid data management portfolio, specifically some announcements around DB2 in a hybrid environment, and some highly transactional offerings around DB2. We're talking about our unified governance portfolio; so actually delivering a platform for unified governance that allows our clients to interact with governance and data management kind of products in a more streamlined way, and helps them actually solve a problem instead of just offering products. The third is really around data science and machine learning. Specifically, we're talking about our machine learning hub that we're launching here in Germany. Prior to this we had machine learning hubs in San Francisco and Toronto, one in Asia, and now we're launching one here in Europe. >> Seth, can you describe what this hub is all about? Is this a data center where you're hosting machine learning services, or is it something else? >> Yeah, so this is where clients can come and learn how to do data science. They can bring their problems, bring their data to our facilities, and learn how to solve a data science problem in a more team-oriented way, interacting with data scientists, machine learning engineers, data engineers, and developers to solve a problem for their business around data science. These previous hubs have been completely booked, so we wanted to launch them in other areas to try to expand the capacity. >> You're hosting a round table today, right, on the main tent? >> Yep. >> And you've got a customer on; you guys are going to be talking about sort of applying practices in financial and other areas. Maybe describe that a little bit.
>> We have a customer on from ING, Heinrich, who's the chief architect for ING. ING, IBM, and Hortonworks have a consortium, if you would, or a framework that we're doing around Apache Atlas and Ranger as the kind of open-source operating system for our unified governance platform. Much as IBM has positioned Spark as a unified, kind of open-source operating system for analytics, for a governance platform to be truly unified, you need to be able to integrate metadata. The biggest challenge about connecting your data environments, if you're an enterprise that was not internet-born or cloud-born, is that you have proprietary metadata platforms that all want to be the master. When everyone wants to be the master, you can't really get anything done. So what we're doing around Apache Atlas is setting it up as kind of a virtual translator, if you would, or a dictionary between all the different proprietary metadata platforms, so that you can get a single unified view of your data environment across hybrid clouds, on premise, in the cloud, and across different proprietary vendor platforms. Because it's open source, there are these connectors that can go in and out of the proprietary platforms. >> So Seth, you seem like you're pretty tuned in to the portfolio within the analytics group. How are you spending your time as the Chief Data Officer? How do you balance it between customer visits, maybe talking about some of the products, and then your sort of day job? >> I actually have three day jobs. My job's actually split into kind of three pieces. The first, my primary mission, is really around transforming IBM's internal business unit, internal business workings, to use data and analytics to run our business. So kind of internal business unit transformation. Part of that business unit transformation is also making sure that we're compliant with regulations like GDPR and other regulations.
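To illustrate the "virtual translator" role Dobrin describes for Apache Atlas: each proprietary catalog record gets mapped into a common, Atlas-style entity representation. The sketch below builds such a payload following the general shape of Atlas's v2 entity JSON; the source record, the mapping, and the qualified-name convention are hypothetical, and no Atlas server is involved:

```python
# A rough sketch of the "metadata translator" idea: map one proprietary
# catalog record into an Apache Atlas v2-style entity payload. The field
# names follow Atlas's documented entity shape; the source record and the
# "@prod" qualified-name convention are illustrative assumptions.
import json

def to_atlas_entity(source_record):
    """Translate one proprietary metadata record to an Atlas-style entity."""
    return {
        "entity": {
            "typeName": "hive_table",
            "attributes": {
                "qualifiedName": f"{source_record['db']}.{source_record['table']}@prod",
                "name": source_record["table"],
                "description": source_record.get("notes", ""),
            },
        }
    }

payload = to_atlas_entity({"db": "sales", "table": "orders", "notes": "EDW extract"})
# In practice this JSON would be POSTed to an Atlas REST endpoint.
body = json.dumps(payload)
```

A connector per proprietary platform would own its half of the translation, which is what lets one unified view span EDWs, data lakes, and vendor catalogs.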
Another third is really around kind of rethinking our offerings from a CDO perspective. As a CDO, and as you know, Dave, I've only been with IBM for seven months. As a former client recently, and as a CDO, what is it that I want to see from IBM's offerings? We kind of hit on it a little bit with the unified governance platform, where I think IBM makes fantastic products. But as a client, if a salesperson shows up to me, I don't want them selling me a product, 'cause if I want an MDM solution, I'll call you up and say, "Hey, I need an MDM solution. Give me a quote." What I want is them showing up and saying, "I have a solution that's going to solve your governance problem across your portfolio." Or, "I'm going to solve your data science problem." Or, "I'm going to help you master your data, and manage your data across all these different environments." So it's really working with the offering management and the Dev teams to define: what are these three or four kind of business platforms that we want to settle on? We know three of them at least, right? We know that we have hybrid data management. We have unified governance. We have data science and machine learning. And you could think of the Z franchise as a fourth platform. >> Seth, can you net out how governance relates to data science? 'Cause there is governance of the statistical models, machine learning, and so forth, version control. I mean, in an end-to-end machine learning pipeline, there are various versions of various artifacts that have to be managed in a structured way. Is your unified governance bundle, or portfolio, does it address those requirements? Or just data governance? >> Yeah, so the unified governance platform really kind of focuses today on data governance, and how good data governance can be an enabler of rapid data science.
So if you have your data all pre-governed, it makes it much quicker to get access to data and understand what you can and can't do with it, especially being here in Europe, in the context of the EU GDPR. You need to make sure that your data scientists are doing things that are approved by the user, because basically, with your data, you have to give explicit consent to allow things to be done with it. But the long-term vision is that... essentially the output of models is data, right? And how you use and deploy those models also needs to be governed. So the long-term vision is that we will have a governance platform for all those things as well. I think it makes more sense for those things to be governed in the data science platform, if you would. And we... >> We often hear, separate from GDPR and all that, about something called algorithmic accountability, which is being discussed more in policy circles, in government circles around the world, as strongly related to everything you're describing: being able to trace the lineage of any algorithmic decision back to the data, the metadata, and so forth, and the machine learning models that might have driven it. Is that where IBM's going with this portfolio? >> I think that's the natural extension of it. We're really thinking of them as two different pieces, but if you solve them both and you connect them together, then you have that covered. But I think you're absolutely right. As we're leveraging machine learning and artificial intelligence in general, we need to be able to understand how we got to a decision, and that includes the model, the data, how the data was gathered, how the data was used and processed. So it is that entire pipeline, 'cause it is a pipeline. You're not doing machine learning or AI in a vacuum. You're doing it in the context of the data, and you're doing it in the context of the individuals or the organizations that you're trying to influence with the output of those models.
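The consent point can be made concrete with a small sketch: a governed view hands the data scientist only the records whose owners consented to the stated purpose. The field names and records here are illustrative, not GDPR tooling:

```python
# Hedged sketch: before a data science job runs, drop records whose owners
# have not consented to the stated analytical purpose -- the "pre-governed"
# data access described above. All field names are illustrative assumptions.
records = [
    {"user": "a", "consent_analytics": True,  "spend": 120},
    {"user": "b", "consent_analytics": False, "spend": 80},
    {"user": "c", "consent_analytics": True,  "spend": 45},
]

def governed_view(rows, purpose="analytics"):
    """Return only rows whose consent covers the stated purpose."""
    flag = f"consent_{purpose}"
    return [r for r in rows if r.get(flag)]

usable = governed_view(records)  # the data scientist only ever sees these
```

The same pattern generalizes: the purpose string selects which consent flag must be present, so a "marketing" job would see a different slice than an "analytics" job.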
>> I call it DevOps for data science. >> Seth, in the early Hadoop days, the real headwind was complexity. It still is, by the way; we know that. Companies like IBM are trying to reduce that complexity. Spark helps a little bit. So the technology will evolve, we get that. It seems like one of the other big headwinds right now is that most companies don't have a great understanding of how they can take data and monetize it, turn it into value. Most companies, many anyway, make the mistake of, "Well, I don't really want to sell my data," or, "I'm not really a data supplier." And they're kind of thinking about it, maybe not in the right way. But we seem to be entering a next wave here, where people are beginning to understand: I can cut costs, I can do predictive maintenance, I can maybe not sell the data but enhance what I'm doing and increase my revenue, maybe my customer retention. They seem to be tuning in, more so, largely I think because of the chief data officer roles helping them think that through. I wonder if you would give us your point of view on that narrative. >> I think what you're describing is kind of the digital transformation journey. I think the end game, as enterprises go through a digital transformation, is how do I sell services, outcomes, those types of things. How do I sell an outcome to my end user? That's really the end game of a digital transformation, in my mind. But before you can get to that, before you transform your business's objectives, there are a couple of intermediary steps that are required. The first is what you're describing, those kinds of data transformations. Enterprises need to really get a handle on their data and become data driven, and then start transforming their current business model; so, how do I accelerate my current business leveraging data and analytics? I kind of frame that as the data science transformation aspect of the digital journey.
Then the next aspect of it is how do I transform my business and change my business objectives? Part of that first step is, in fact: how do I optimize my supply chain? How do I optimize my workforce? How do I optimize my goals? How do I get to my current, you know, the things that Wall Street cares about for a business; how do I accelerate those, make those faster, make those better, and really put my company out in front? 'Cause really, in the grand scheme of things, there are two types of companies today: the companies that are going to be the disruptors, and the companies that are going to get disrupted. Most companies want to be the disruptors, and it's a process to do that. >> So the accounting industry doesn't have standards around valuing data as an asset, and many of us feel as though waiting for that is a mistake. You can't wait for that; you've got to figure it out on your own. But again, it seems to be somewhat of a headwind, because it puts data and data value in this fuzzy category. But there are clearly the data haves and the data have-nots. What are you seeing in that regard? >> I think the first... When I was in my former role, my former company went through an exercise of valuing our data and our decisions. I'm actually doing that same exercise at IBM right now. We're going through IBM, at least in the analytics business unit, the part I'm responsible for, and going to all the leaders and saying, "What decisions are you making? Help me understand the decisions that you're making. Help me understand the data you need to make those decisions." And that does two things. Number one, it does get to the point of: how can we value the decisions? 'Cause each one of those decisions has a specific value to the company. You can assign a dollar amount to it. But it also helps you change how people in the enterprise think.
Because the first time you go through and ask these questions, they talk about the dashboards they want to help them make their preconceived decisions, validated by data. They have a preconceived notion of the decision they want to make. They want the data to back it up. So they want a dashboard to help them do that. So when you come in and start having this conversation, you kind of stop them and say, "Okay, what you're describing is a dashboard. "That's not a decision. "Let's talk about the decision that you want to make, "and let's understand the real value of that decision." So you're doing two things, you're building a portfolio of decisions that then becomes to your point, Jim, about DevOps for data science. It's your backlog for your data scientists, in the long run. You then connect those decisions to data that's required to make those, and you can extrapolate the data for each decision to the component that each piece of data contributes to it. So you can group your data logically within an enterprise; customer, product, talent, location, things like that, and you can assign a value to those based on decisions they support. 
So, number one, "Here's the value I can add in the future," and as you check off those boxes, you can kind of go and say, "Here's value I've added. "Here's where I've changed how the company's operating. "Here's where I've generated X billions of dollars "of new revenue, or cost savings, or cost avoidance, "for the enterprise." >> When you went through these exercises at your previous company, and now at IBM, are you using standardized valuation methodologies? Did you kind of develop your own, or come up with a scoring system? How'd you do that? >> I think there's some things around, like net promoter score, where there's pretty good standards on how to assign value to increases in net promoter score, or decreases in net promoter score for certain aspects of your business. In other ways, you need to kind of decide as an enterprise, how do we value our assets? Do we use a three-year, five-year, ten-year NPV? Do we use some other metric? You need to kind of frame it in the reference that your CFO is used to talking about so that it's in the context that the company is used to talking about. Most companies, it's net present value. >> Okay, and you're measuring that on an ongoing basis. >> Seth: Yep. >> And fine tuning as you go along. Seth, we're out of time. Thanks so much for coming back in The Cube. It was great to see you. >> Seth: Yeah, thanks for having me. >> You're welcome, good luck this afternoon. >> Seth: Alright. >> Keep it right there, buddy. We'll be back. Actually, let me run down the day here for you, just take a second to do that. We're going to end our Cube interviews for the morning, and then we're going to cut over to the main tent. So in about an hour, Rob Thomas is going to kick off the main tent here with a keynote, talking about where data goes next. Hilary Mason's going to be on. There's a session with Dez Blanchfield on data science as a team sport. Then the big session on changing regulations, GDPRs. 
Seth, you've got some customers that you're going to bring on and talk about these issues. And then, sort of balancing act, the balancing act of hybrid data. Then we're going to come back to The Cube and finish up our Cube interviews for the afternoon. There's also going to be two breakout sessions; one with Hilary Mason, and one on GDPR. You got to go to IBMgo.com and log in and register. It's all free to see those breakout sessions. Everything else is open. You don't even have to register or log in to see that. So keep it right here, everybody. Check out the main tent. Check out siliconangle.com, and of course IBMgo.com for all the action here. Fast track your data. We're live from Munich, Germany; and we'll see you a little later. (upbeat techno music)
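The valuation approach Seth lands on above, a three-, five-, or ten-year net present value assigned to each decision in the portfolio, can be sketched in a few lines. This is a minimal illustration, not IBM's actual methodology; the decision names, yearly values, and discount rate are all hypothetical placeholders.

```python
def npv(yearly_values, rate):
    """Net present value of a stream of yearly values, discounted at `rate`."""
    return sum(v / (1 + rate) ** t for t, v in enumerate(yearly_values, start=1))

# Hypothetical portfolio of decisions, each with an estimated value per year
# ($M) over a three-year horizon. Real decisions and values would come from
# the kind of leader interviews described in the conversation above.
portfolio = {
    "optimize_supply_chain": [2.0, 2.5, 3.0],
    "predictive_maintenance": [1.0, 1.5, 1.5],
}

rate = 0.08  # assumed discount rate; in practice, use whatever your CFO uses
valuations = {name: round(npv(flows, rate), 2) for name, flows in portfolio.items()}
total = round(sum(valuations.values()), 2)
```

Grouping data logically by the decisions it supports, as Seth describes, then reduces to summing the NPVs of the decisions each data domain feeds.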

Published Date : Jun 24 2017



Rob Thomas, IBM Analytics | IBM Fast Track Your Data 2017


 

>> Announcer: Live from Munich, Germany, it's theCUBE. Covering IBM: Fast Track Your Data. Brought to you by IBM. >> Welcome, everybody, to Munich, Germany. This is Fast Track Your Data brought to you by IBM, and this is theCUBE, the leader in live tech coverage. We go out to the events, we extract the signal from the noise. My name is Dave Vellante, and I'm here with my co-host Jim Kobielus. Rob Thomas is here, he's the General Manager of IBM Analytics, and longtime CUBE guest, good to see you again, Rob. >> Hey, great to see you. Thanks for being here. >> Dave: You're welcome, thanks for having us. So we're talking about, we missed each other last week at the Hortonworks DataWorks Summit, but you came on theCUBE, you guys had the big announcement there. You're sort of getting out of doing a Hadoop distribution, right? TheCUBE gave up our Hadoop distributions several years ago so. It's good that you joined us. But, um, that's tongue-in-cheek. Talk about what's going on with Hortonworks. You guys are now going to be partnering with them essentially to replace BigInsights, you're going to continue to service those customers. But there's more than that. What's that announcement all about?
And, we also talked about extending that to things like Big SQL, where they're partnering with us on Big SQL, around modernizing data environments. And then third, which relates a little bit to what we're here in Munich talking about, is governance, where we're partnering closely with them around unified governance, Apache Atlas, advancing Atlas in the enterprise. And so, it's a lot of dimensions to the relationship, but I can tell you since I was on theCUBE a week ago with Rob Bearden, client response has been amazing. Rob and I have done a number of client visits together, and clients see the value of unlocking insights in their Hadoop data, and they love this, which is great. >> Now, I mean, the Hadoop distro, I mean early on you got into that business, just, you had to do it. You had to be relevant, you want to be part of the community, and a number of folks did that. But it's really sort of best left to a few guys who want to do that, and Apache open source is really, I think, the way to go there. Let's talk about Munich. You guys chose this venue. There's a lot of talk about GDPR, you've got some announcements around unified government, but why Munich? >> So, there's something interesting that I see happening in the market. So first of all, you look at the last five years. There's only 10 companies in the world that have outperformed the S&P 500, in each of those five years. And we started digging into who those companies are and what they do. They are all applying data science and machine learning at scale to drive their business. And so, something's happening in the market. That's what leaders are doing. And I look at what's happening in Europe, and I say, I don't see the European market being that aggressive yet around data science, machine learning, how you apply data for competitive advantage, so we wanted to come do this in Munich. And it's a bit of a wake-up call, almost, to say hey, this is what's happening. 
We want to encourage clients across Europe to think about how do they start to do something now. >> Yeah, of course, GDPR is also a hook. The European Union and you guys have made some talk about that, you've got some keynotes today, and some breakout sessions that are discussing that, but talk about the two announcements that you guys made. There's one on DB2, there's another one around unified governance, what do those mean for clients? >> Yeah, sure, so first of all on GDPR, it's interesting to me, it's kind of the inverse of Y2K, which is there's very little hype, but there's huge ramifications. And Y2K was kind of the opposite. So look, it's coming, May 2018, clients have to be GDPR-compliant. And there's a misconception in the market that that only impacts companies in Europe. It actually impacts any company that does any type of business in Europe. So, it impacts everybody. So we are announcing a platform for unified governance that makes sure clients are GDPR-compliant. We've integrated software technology across analytics, IBM security, some of the assets from the Promontory acquisition that IBM did last year, and we are delivering the only platform for unified governance. And that's what clients need to be GDPR-compliant. The second piece is data has to become a lot simpler. As you think about my comment, who's leading the market today? Data's hard, and so we're trying to make data dramatically simpler. And so for example, with DB2, what we're announcing is you can download and get started using DB2 in 15 minutes or less, and anybody can do it. Even you can do it, Dave, which is amazing. >> Dave: (laughs) >> For the first time ever, you can-- >> We'll test that, Rob. >> Let's go test that. I would love to see you do it, because I guarantee you can. Even my son can do it. I had my son do it this weekend before I came here, because I wanted to see how simple it was. 
So that announcement is really about bringing, or introducing a new era of simplicity to data and analytics. We call it Download And Go. We started with SPSS, we did that back in March. Now we're bringing Download And Go to DB2, and to our governance catalog. So the idea is make data really simple for enterprises. >> You had a community edition previous to this, correct? There was-- >> Rob: We did, but it wasn't this easy. >> Wasn't this simple, okay. >> Not anybody could do it, and I want to make it so anybody can do it. >> Is simplicity, the rate of simplicity, the only differentiator of the latest edition, or I believe you have Kubernetes support now with this new addition, can you describe what that involves? >> Yeah, sure, so there's two main things that are new functionally-wise, Jim, to your point. So one is, look, we're big supporters of Kubernetes. And as we are helping clients build out private clouds, the best answer for that in our mind is Kubernetes, and so when we released Data Science Experience for Private Cloud earlier this quarter, that was on Kubernetes, extending that now to other parts of the portfolio. The other thing we're doing with DB2 is we're extending JSON support for DB2. So think of it as, you're working in a relational environment, now just through SQL you can integrate with non-relational environments, JSON, documents, any type of no-SQL environment. So we're finally bringing to fruition this idea of a data fabric, which is I can access all my data from a single interface, and that's pretty powerful for clients. >> Yeah, more cloud data development. Rob, I wonder if you can, we can go back to the machine learning, one of the core focuses of this particular event and the announcements you're making. Back in the fall, IBM made an announcement of Watson machine learning, for IBM Cloud, and World of Watson. In February, you made an announcement of IBM machine learning for the z platform. 
What are the machine learning announcements at this particular event, and can you sort of connect the dots in terms of where you're going, in terms of what sort of innovations are you driving into your machine learning portfolio going forward? >> I have a fundamental belief that machine learning is best when it's brought to the data. So, we started with, like you said, Watson machine learning on IBM Cloud, and then we said well, what's the next big corpus of data in the world? That's an easy answer, it's the mainframe, that's where all the world's transactional data sits, so we did that. Last week with the Hortonworks announcement, we said we're bringing machine learning to Hadoop, so we've kind of covered all the landscape of where data is. Now, the next step is about how do we bring a community into this? And the way that you do that is we don't dictate a language, we don't dictate a framework. So if you want to work with IBM on machine learning, or in Data Science Experience, you choose your language. Python, great. Scala or Java, you pick whatever language you want. You pick whatever machine learning framework you want, we're not trying to dictate that because there's different preferences in the market, so what we're really talking about here this week in Munich is this idea of an open platform for data science and machine learning. And we think that is going to bring a lot of people to the table. >> And with open, one thing, with open platform in mind, one thing to me that is conspicuously missing from the announcement today, correct me if I'm wrong, is any indication that you're bringing support for the deep learning frameworks like TensorFlow into this overall machine learning environment. Am I wrong? I know you have Power AI. Is there a piece of Power AI in these announcements today? >> So, stay tuned on that. We are, it takes some time to do that right, and we are doing that. 
But we want to optimize so that you can do machine learning with GPU acceleration on Power AI, so stay tuned on that one. But we are supporting multiple frameworks, so if you want to use TensorFlow, that's great. If you want to use Caffe, that's great. If you want to use Theano, that's great. That is our approach here. We're going to allow you to decide what's the best framework for you. >> So as you look forward, maybe it's a question for you, Jim, but Rob I'd love you to chime in. What does that mean for businesses? I mean, is it just more automation, more capabilities as you evolve that timeline, without divulging any sort of secrets? What do you think, Jim? Or do you want me to ask-- >> What do I think, what do I think you're doing? >> No, you ask about deep learning, like, okay, that's, I don't see that, Rob says okay, stay tuned. What does it mean for a business, that, if like-- >> Yeah. >> If I'm planning my roadmap, what does that mean for me in terms of how I should think about the capabilities going forward? >> Yeah, well what it means for a business, first of all, is what they're going, they're using deep learning for, is doing things like video analytics, and speech analytics and more of the challenges involving convolution of neural networks to do pattern recognition on complex data objects for things like connected cars, and so forth. Those are the kind of things that can be done with deep learning. >> Okay. And so, Rob, you're talking about here in Europe how the uptick in some of the data orientation has been a little bit slower, so I presume from your standpoint you don't want to over-rotate, to some of these things. But what do you think, I mean, it sounds like there is difference between certainly Europe and those top 10 companies in the S&P, outperforming the S&P 500. What's the barrier, is it just an understanding of how to take advantage of data, is it cultural, what's your sense of this? 
>> So, to some extent, data science is easy, data culture is really hard. And so I do think that culture's a big piece of it. And the reason we're kind of starting with a focus on machine learning, simplistic view, machine learning is a general-purpose framework. And so it invites a lot of experimentation, a lot of engagement, we're trying to make it easier for people to on-board. As you get to things like deep learning as Jim's describing, that's where the market's going, there's no question. Those tend to be very domain-specific, vertical-type use cases and to some extent, what I see clients struggle with, they say well, I don't know what my use case is. So we're saying, look, okay, start with the basics. A general purpose framework, do some tests, do some iteration, do some experiments, and once you find out what's hunting and what's working, then you can go to a deep learning type of approach. And so I think you'll see an evolution towards that over time, it's not either-or. It's more of a question of sequencing. >> One of the things we've talked to you about on theCUBE in the past, you and others, is that IBM obviously is a big services business. This big data is complicated, but great for services, but one of the challenges that IBM and other companies have had is how do you take that service expertise, codify it to software and scale it at large volumes and make it adoptable? I thought the Watson data platform announcement last fall, I think at the time you called it Data Works, and then so the name evolved, was really a strong attempt to do that, to package a lot of expertise that you guys had developed over the years, maybe even some different software modules, but bring them together in a scalable software package. So is that the right interpretation, how's that going, what's the uptake been like? >> So, it's going incredibly well. 
What's interesting to me is what everybody remembers from that announcement is the Watson Data Platform, which is a decomposable framework for doing these types of use cases on the IBM cloud. But there was another piece of that announcement that is just as critical, which is we introduced something called the Data First method. And that is the recipe book to say to a client, so given where you are, how do you get to this future on the cloud? And that's the part that people, clients, struggle with, is how do I get from step to step? So with Data First, we said, well look. There's different approaches to this. You can start with governance, you can start with data science, you can start with data management, you can start with visualization, there's different entry points. You figure out the right one for you, and then we help clients through that. And we've made Data First method available to all of our business partners so they can go do that. We work closely with our own consulting business on that, GBS. But that to me is actually the thing from that event that has had, I'd say, the biggest impact on the market, is just helping clients map out an approach, a methodology, to getting on this journey. >> So that was a catalyst, so this is not a sequential process, you can start, you can enter, like you said, wherever you want, and then pick up the other pieces from majority model standpoint? Exactly, because everybody is at a different place in their own life cycle, and so we want to make that flexible. >> I have a question about the clients, the customers' use of Watson Data Platform in a DevOps context. So, are more of your customers looking to use Watson Data Platform to automate more of the stages of the machine learning development and the training and deployment pipeline, and do you see, IBM, do you see yourself taking the platform and evolving it into a more full-fledged automated data science release pipelining tool? Or am I misunderstanding that? 
>> Rob: No, I think that-- >> Your strategy. >> Rob: You got it right, I would just, I would expand a little bit. So, one is it's a very flexible way to manage data. When you look at the Watson Data Platform, we've got relational stores, we've got column stores, we've got in-memory stores, we've got the whole suite of open-source databases under the Compose umbrella, we've got Cloudant. So we've delivered a very flexible data layer. Now, in terms of how you apply data science, we say, again, choose your model, choose your language, choose your framework, that's up to you, and we allow clients, many clients start by building models on their private cloud, then we say you can deploy those into the Watson Data Platform, so therefore then they're running on the data that you have as part of that data fabric. So, we're continuing to deliver a very fluid data layer which then you can apply data science, apply machine learning there, and there's a lot of data moving into the Watson Data Platform because clients see that flexibility. 
>> Rob: Thanks, guys, great to see you. >> You're welcome; all right, keep it right there, buddy, We'll be back with our next guest. This is theCUBE, we're live from Munich, Fast Track Your Data, right back. (upbeat electronic music)

Published Date : Jun 22 2017



Rob Bearden, Hortonworks & Rob Thomas, IBM Analytics - #DataWorks - #theCUBE


 

>> Announcer: Live from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2017, brought to you by Hortonworks. >> Hi, welcome to theCUBE. We are live in San Jose, in the heart of Silicon Valley at the DataWorks Summit, day one. I'm Lisa Martin, with my co-host, George Gilbert. And we're very excited to be talking to two Robs. With Rob squared on the program this morning. Rob Bearden, the CEO of Hortonworks. Welcome, Rob. >> Thank you for having us. >> And Rob Thomas, the VP, GM rather, of IBM Analytics. So, guys, we just came from this really exciting, high energy keynote. The laser show was fantastic, but one of the great things, Rob, that you kicked off with was really showing the journey that Hortonworks has been on, and in a really pretty short period of time. Tremendous inertia, and you talked about the four mega-trends that are really driving enterprises to modernize their data architecture. Cloud, IoT, streaming data, and the fourth, next leg of this is data science. Data science, you said, will be the transformational next leg in the journey. Tell our viewers a little bit more about that. What does that mean for Hortonworks and your partnership with IBM? >> Well, what I think what IBM and Hortonworks now have the ability to do is to bring all the data together across a connected data platform. The data in motion, the data at rest, now have in one common platform, irrespective of the deployment architecture, whether it's on prem across multiple data centers or whether deployed in the cloud. And now that the large volume of data and we have access to it, we can now start to begin to drive the analytics in the end as that data moves through each phase of its life cycle. 
And what really happens now, is now that we have visibility and access to the inclusive life cycle of the data we can now put a data science framework over that to really now understand and learn those patterns and what's the data telling us, what's the pattern behind that. And we can bring simplification to the data science and turn data science actually into a team sport. Allow them to collaborate, allow them to have access to it. And sort of take the black magic out of doing data science with the framework of the tool and the power of DSX on top of the connected data platform. Now we can advance rapidly the insights in the end of the data and what that really does is drive value really quickly back into the customer. And then we can then begin to bring smart applications via the data science back into the enterprise. So we can now do things like connected car in real time, and have connected car learn as it's moving and through all the patterns, we can now, from a retail standpoint really get smart and accurate about inventory placement and inventory management. From an industrial standpoint, we know in real time, down to the component, what's happening with the machine, and any failures that may happen and be able to eliminate downtime. Agriculture, same kind of... Healthcare, every industry, financial services, fraud detection, money laundering advances that we have but it's all going to be attributable to how machine learning is applied and the DSX platform is the best platform in the world to do that with. >> And one of the things that I thought was really interesting, was that, as we saw enterprises start to embrace Hadoop and Big Data and saying, okay, this needs to co-exist and inter-operate with our traditional applications, our traditional technologies. Now you're saying and seeing data science is going to be a strategic business differentiator. You mentioned a number of industries, and there were several of them on stage today. 
Give us some, maybe some, one of your favorite examples of one of your customers leveraging data science and driving a pretty significant advantage for their business. >> Sure. Yeah, well, to step back a little bit, just a little context, only ten companies have outperformed the S&P 500 in each of the last five years. We start looking at what are they doing. Those are companies that have decided data science and machine learning is critical. They've made a big bet on it, and every company needs to be doing that. So a big part of our message today was, kind of, I'd say, open the eyes of everybody to say there is something happening in the market right now. And it can make a huge difference in how you're applying data analytics to improve your business. We announced our first focus on this back in February, and one of our clients that spoke at that event is a company called Argus Healthcare. And Argus has massive amounts of data, sitting on a mainframe, and they were looking for how can we unleash that to do better care of patients, better care for our hospital networks, and they did that with data they had in their mainframe. So they brought data science experience and machine learning to their mainframe, that's what they talked about. What Rob and I have announced today is there's another great trove of data in every organization which is the data inside Hadoop. HDP, leading distribution for that, is a great place to start. So the use case that I just shared, which is on the mainframe, that's going to apply anywhere where there's large amounts of data. And right now there's not a great answer for data science on Hadoop, until today, where data science experience plus HDP brings really, I'd say, an elegant approach to it. It makes it a team sport. You can collaborate, you can interact, you can get education right in the platform. So we have the opportunity to create a next generation of data scientists working with data and HDP. That's why we're excited. 
>> Let me follow up with a question on your intro: in terms of the data science experience as this next major building block, to extract, or to build on, the value from the data lake, your two companies have different sorts of go-to-markets. Especially at IBM, with industry solutions and global business services, you guys can actually build semi-custom solutions around this platform, both the data and the data science experience. With Hortonworks, what's your go-to-market motion going to look like and what are the offerings going to look like to the customer? >> There'll be several. You just described a great example: with IBM professional services, they have the ability to take those industry templates and take these data science models and instantly be able to bring those to the data. And so as part of our joint go-to-market motion, we'll be able to partner, bring those templates, bring those models to not only our customer base, but also, as part of the new sales go-to-market motion in the white space, to new customer opportunities. And the whole point is, now we can use the enterprise data platforms to bring the data under management in a mission critical way, and then bring value to it through these kinds of use cases and templates that drive the smart applications into quick time to value. And just accelerate that time to value for the customers. >> So, how would you look at the mix changing over time, in terms of data scientists working with the data to experiment on the model development and the two hard parts that you talked about, data prep and operationalization? So in other words, custom models, the issue of deploying a model 11 months later because there's no real process for that that's packaged, and then packaged enterprise apps that are going to bake these models in as part of their functionality, the way Salesforce is starting to do and Workday is starting to do. How does that change over time? 
>> It'll be a layering effect. So today, we now have the ability to bring, through the connected data platforms, all the data under management in a mission critical manner from point of origination through the entire stream till it comes at rest. Now with the data science, through DSX, we can now have that data science framework where, you know, the analogy I would say is, instead of it being a black science of how you do data access and go through and build the models and determine what the algorithms are and how that yields a result, the analogy is you don't have to be a mechanic to drive a car anymore. The common person can drive a car. So now we really open it up to the community of business analysts, who can now participate and enable data science through collaboration, and then we can take those models and build the smart apps and evolve the smart apps that go with that very rapidly. And we can accelerate that process also now through the partnership with IBM, bringing their core domain value drivers that they've already built and dropping that into the DSX environments, and so I think we can accelerate the time to value now much faster and more efficiently than we've ever been able to do before. >> You mentioned teamwork a number of times, and I'm curious about, you also talked about the business analyst, what's the governance like to facilitate business analysts and different lines of business that have particular access? And what is that team composed of? >> Yeah, well, so let's look at what's happening in the big enterprises in the world right now. There's two major things going on. One is everybody's recognizing this is a multi-cloud world. There's multiple public cloud options, most clients are building a private cloud. They need a way to manage data as a strategic asset across all those multiple cloud environments. 
The second piece is, we are moving towards what I would call the next generation data fabric, which is your warehousing capabilities, your database capabilities, married with Hadoop, married with other open source data repositories, and doing that in a seamless fashion. So you need a governance strategy for all of that. And the way I describe governance, simple analogy, we do for data what libraries do for books. Libraries create a catalog of books, they know they have different copies of books, some they archive, but they can access all of the intelligence in the library. That's what we do for data. So when we talk about governance and working together, we're both big supporters of the Atlas project, that will continue, but the other piece, kind of this point around enterprise data fabric, is what we're doing with Big SQL. Big SQL is the only 100% ANSI-SQL compliant SQL engine for data across Hadoop and other repositories. So we'll be working closely together to help enterprises evolve in a multi-cloud world to this enterprise data fabric, and Big SQL's a big capability for that. >> And an immediate example of that is in our EDW optimization suite that we have today: we will be loading Big SQL as the platform to do the complex query sector of that. That we will go to market with almost immediately. >> Follow-up question on the governance: to what extent is it end-to-end governance, meaning from the point of origin through the last mile, you know, where the last mile might be some specialized analytic engine, versus having all the data management capabilities in that fabric? You mentioned operational and analytic, so, like, are customers going to be looking for a provider who can give them sort of end-to-end capabilities on both the governance side and on all the data management capabilities? Is that sort of a critical decision? >> I believe so. I think there's really two use cases for governance. It's either insights or it's compliance. 
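[Editor's note: Big SQL's pitch above is one ANSI-SQL dialect spanning warehouse and Hadoop data. As a rough illustration of that federated idea, here is a single ANSI-style join over two tables standing in for different repositories. SQLite is used only as a stand-in engine, and the table names and rows are invented for the example.]

```python
import sqlite3

# Hypothetical stand-in for a federated SQL engine: one ANSI-style query
# joining what would, in Big SQL, be a warehouse table and a Hadoop-backed
# table. SQLite here is illustration only, not the actual product.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE warehouse_sales (cust_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE hadoop_clicks (cust_id INTEGER, clicks INTEGER)")
conn.executemany("INSERT INTO warehouse_sales VALUES (?, ?)",
                 [(1, 100.0), (2, 250.0)])
conn.executemany("INSERT INTO hadoop_clicks VALUES (?, ?)",
                 [(1, 7), (2, 3)])

# A single ANSI SQL statement spanning both "repositories".
rows = conn.execute(
    """SELECT s.cust_id, s.amount, c.clicks
       FROM warehouse_sales s
       JOIN hadoop_clicks c ON s.cust_id = c.cust_id
       ORDER BY s.cust_id"""
).fetchall()
print(rows)  # [(1, 100.0, 7), (2, 250.0, 3)]
```

The point is that the analyst writes one dialect of SQL; where each table physically lives is the engine's problem, not the query's.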
And if your focus is on compliance, something like GDPR, as an example, that's really about the life cycle of data from when it starts to when it can be disposed of. So for the compliance use case, absolutely. When I say insights as a governance use case, that's really about self-service. The ideal world is you can make your data available to anybody in your organization, knowing that they have the right permissions, that they can access it, that they can do it in a protected way, and most companies don't have that advantage today. Part of the idea around data science on HDP is, if you've got the right governance framework in place, suddenly you can enable self-service, which is any data scientist or any business analyst can go find and access the data they need. So it's a really key part of delivering on data science, is this governance piece. Now, I just talk to clients to understand where they're going. Is this about compliance or is this about insights? Because there's probably a different starting point, but the end game is similar. >> Curious about your target markets, Tyler talked about the go-to-market model a minute ago, are you targeting customers that are on mainframes? And you said, I think, in your keynote, 90% of transactional data is in a mainframe. Is that one of the targets, or is it the target? Like you mentioned, Rob, with the EDW optimization solution, are you working with customers who have an existing enterprise data warehouse that needs to be modernized? Is it both? >> The good news is it's both. Really, the opportunity and mission is about enabling the next generation data architecture. And within that, again, back to the layering approach, is being able to bring the data under management from point of origination through the point where it comes to rest. 
Now if we look at it, you know, probably 90% of, at least transactional data, sits in the mainframe, so you have to be able to span all data sets and all deployment architectures, on-prem, multi-data center, as well as public cloud. And that then is the opportunity, but for that to then drive value ultimately back, you've got to be able to have the simplification of the data science framework and toolset, to be able to then have the proper insights and basis on which you can bring the new smart applications. And drive the insights, drive the governance, through the entire life cycle. >> On the value front, you know, we talk about, and Hortonworks talks about, the fact that this technology can really help a business unlock transformational value across their organization, across lines of business. This conversation, we just talked about a couple of the customer segments, is this a conversation that you're having at the C-suite initially? Where are the business leaders in terms of understanding? We know there's more value here, we probably can open up new business opportunities, or are you talking more at the data science level? >> Look, it's at different levels. So, data science, machine learning, that is a C-suite topic. A lot of times I'm not sure the audience knows what they're asking for, but they know it's important and they know they need to be doing something. When you go to things like a data architecture, the C-suite discussion there is, I just want to become more productive in how I'm deploying and using technology, because my IT budget's probably not going up, if anything it may be going down, so I've got to become a lot more productive and efficient to do that. So it depends on who you're talking to, there's different levels of dialogue. But there's no question in my mind, I've seen, you know, just look at the major press: Financial Times, Wall Street Journal last year. CEOs are talking about AI, machine learning, using data as a competitive weapon. 
It is happening and it's happening right now. What we're doing together, saying how do we make data simple and accessible? How do we make getting there really easy? Because right now it's pretty hard. But we think with the combination of what we're bringing, we make it pretty darn easy. >> So one quick question following up on that, and then I think we're getting close to the end. Which is when the data lakes started out, it was sort of, it seemed like, for many customers a mandate from on high, we need a big data strategy, and that translated into standing up a Hadoop cluster, and that resulted in people realizing that there's a lot to manage there. It sounds like, right now people know machine learning is hot so they need to get data science tools in place, but is there a business capability sort of like the ETL offload was for the initial Hadoop use cases, where you would go to a customer and recommend do this, bite this off as something concrete? >> I'll start and then Rob can comment. Look, the issue's not Hadoop, a lot of clients have started with it. The reason there hasn't been, in some cases, the outcomes they wanted is because just putting data into Hadoop doesn't drive an outcome. What drives an outcome is what do you do with it. How do you change your business process, how do you change what the company's doing with the data, and that's what this is about, it's kind of that next step in the evolution of Hadoop. And that's starting to happen now. It's not happening everywhere, but we think this will start to propel that discussion. Any thoughts you had, Rob? >> Spot on. Data lake was about releasing the constraints of all the silos and being able to bring those together and aggregate that data. 
And it was the first basis for being able to have a 360-degree or holistic, centralized insight about something, or a pattern. But what data science then does is it actually accelerates those patterns and those lessons learned, and the ability to have a much more detailed and higher velocity insight that you can react to much faster, and actually accelerate the business models around this aggregate. So it's a foundational approach with Hadoop. And then, as I mentioned in the keynote, data science platforms, machine learning, and AI are actually the things that transformationally open up and accelerate those insights, so then new models and patterns and applications get built to accelerate value. >> Well, speaking of transformation, thank you both so much for taking time to share your transformation and the big news and the announcements with Hortonworks and IBM this morning. Thank you, Rob Bearden, CEO of Hortonworks, and Rob Thomas, General Manager of IBM Analytics. I'm Lisa Martin with my co-host, George Gilbert. Stick around. We are live from day one at DataWorks Summit in the heart of Silicon Valley. We'll be right back. (tech music)

Published Date : Jun 13 2017


Seth Dobrin, IBM Analytics - Spark Summit East 2017 - #sparksummit - #theCUBE


 

>> Narrator: Live from Boston, Massachusetts, this is theCUBE! Covering Spark Summit East 2017. Brought to you by Databricks. Now, here are your hosts, Dave Vellante and George Gilbert. >> Welcome back to Boston, everybody, Seth Dobrin is here, he's the vice president and chief data officer of the IBM Analytics Organization. Great to see you, Seth, thanks for coming on. >> Great to be back, thanks for having me again. >> You're welcome, so chief data officer is the hot title. It was predicted to be the hot title and now it really is. Many more of you around the world, and IBM's got an interesting sort of structure of chief data officers, can you explain that? >> Yeah, so there's a global chief data officer, that's Inderpal Bhandari, and he's been on this podcast or videocast a few times. Then he's set up structures within each of the business units in IBM, where each of the major business units has a chief data officer as well. And so I'm the chief data officer for the analytics business unit. >> So one of Inderpal's things when I've interviewed him is culture. The data culture, you've got to drive that in. And he talks about the five things that chief data officers really need to do to be successful. Maybe you could give us your perspective on how that flows down through the organization and what are the key critical success factors for you and how are you implementing them? >> I agree, there's five key things, and maybe I frame it a little differently than Inderpal does. There's this whole cloud migration, so every chief data officer needs to understand what their cloud migration strategy is. Every chief data officer needs to have a good understanding of what their data science strategy is. So how are they going to build the deployable data science assets. So not data science assets that are delivered through spreadsheets. Every chief data officer needs to understand what their approach to unified governance is. 
So how do I govern all of my platforms in a way that enables that last point about data science? And then there's a piece around people. How do I build a pipeline of people for today and the future? >> So the people piece is both the skills, and it's presumably a relationship with the line of business, as well. There's sort of two vectors there, right? >> Yeah, the people piece, when I think of it, is really about skills. There's a whole cultural component that goes across all of those five pieces that I laid out. Finding the right people, with the right skillset, where you need them, is hard. >> Can you talk about cloud migration, why that's so critical and so hard? >> If you look at kind of where the industry's been, the IT industry, it's been this race to the public cloud. I think it's a little misguided, all along. If you look at how business is run, right? Today, enterprises that are not internet born make their money from what's running their businesses today. So these business critical assets. And just thinking that you can pick those up and move them to the cloud and take advantage of cloud is not realistic. So the race, really, is to a hybrid cloud. Our futures really lie in how do I connect these business critical assets to the cloud? And how do I migrate those things to the cloud? >> So Seth, the CIO might say to you, "Okay, let's go there for a minute, I kind of agree with what you're saying, I can't just shift everything into the cloud. But what can I do in a hybrid cloud that I can't do in a public cloud?" >> Well, there's some drivers for that. I think one driver for hybrid cloud is what I just said. You can't just pick everything up and move it overnight, it's a journey. And it's not a six-month journey, it's probably not a year journey, it's probably a multi-year journey. >> Dave: So you can actually keep running your business? >> So you can actually keep running your business. And then the other piece is there's new regulations that are coming up. 
And these regulations, the EU GDPR is the biggest example of them right now. There are very stiff fines for violations of those policies. And the party that's responsible for paying those fines is the party who the consumer engaged with. It's you, it's whoever owns the business. And as a business leader, I don't know that I would very willingly give up and trust a third party to manage that, just any third party to manage that for me. And so there's certain types of data that some enterprises may never want to move to the cloud, because they're not going to trust a third party to manage that risk for them. >> So it's more transparent from a government standpoint. It's not opaque. >> Seth: Yup. >> You feel like you're in control? >> Yeah, you feel like you're in control, and if something goes wrong, it's my fault. It's not something that I got penalized for because someone else did something wrong. >> So at the data layer, help us sort of abstract one layer up to the applications. How would you partition the applications? The ones that are managing that critical data that has to stay on premises. What would you build up potentially to complement it in the public cloud? >> I don't think you need to partition applications. The way you build modern applications today, it's all API driven. You can reduce some of the costs of latency through design. So you don't really need to partition the applications, per se. >> I'm thinking more along the lines that the systems of record are not going to be torn out, and those are probably the last ones, if ever, to go to the public cloud. But other applications leverage them. If that's not the right way of looking at it, where do you add value in the public cloud versus what stays on premise? >> So some of the system of record data, there's no reason you can't replicate some of it to the cloud. 
So if it's not this personal information, or highly regulated information, there's no reason that you can't replicate some of that to the cloud. And I think we get caught up in, we can't replicate data, we can't replicate data. I don't think that's the right answer, I think the right answer is to replicate the data if you need to, or if the data in the system of record is not in the right structure for what I need to do, then let's put the data in the right structure. Let's not have the conversation about how I can't replicate data. Let's have the conversation about where's the right place for the data, where does it make most sense, and what's the right structure for it? And if that means you've got 10 copies of a certain type of data, then you've got 10 copies of a certain type of data. >> Would you be, on that data, would it typically be, other parts of the systems of record that you might have in the public cloud, or would they be new apps, sort of green field apps? >> Seth: Yes. >> George: Okay. >> Seth: I think both. And that's part of, I think, in my mind, that's kind of how you build, that question you just asked right there is one of the things that guides how you build your cloud migration strategy. So we said you can't just pick everything up and move it. So how do you prioritize? You look at what you need to build to run your business differently. And you start there and you start thinking about how do I migrate information to support those to the cloud? And maybe you start by building a local private cloud. So that everything's close together until you kind of master it. And then once you get enough critical mass of data and applications around it, then you start moving stuff to the cloud. >> We talked earlier off camera about reframing governance. I used to head a CIO consultancy and we worked with a number of CIOs that were within legal IT, for example. And were worried about compliance and governance and things of that nature. 
And their ROI was always "scare the board." But the holy grail was: can we turn governance into something of value? For the organization? Can we? >> I think in the world we live in today, with ever increasing regulations, and with a need to be agile, and with everyone needing and wanting to apply data science at scale, you need to reframe governance, right? Governance needs to be reframed from something that is seen as a roadblock to something that is truly an enabler. And not just giving it lip service. And what do I mean by that? For governance to be an enabler, you really got to think about how do I, up front, classify my data, so that all data in my organization is bucketed into some version of public, proprietary, and confidential. Different enterprises may have three scales and some may only have two. Or some may have one. And so you do that up front, and so you know what can be done with data, when it can be done, and who it can by done with. You need to capture intent. So what are allowed intended uses of data? And as a data scientist, what am I intending to do with this data? So that you can then mesh those two things together. Because that's important in these new regulations I talked about: people give you access to data, their personal data, for an intended purpose. And then you need to be able to apply these governance policies actively. So it's not passive, after the fact, or you've got to stop and you've got to wait; it's leveraging services. Leveraging APIs. And building a composable system of policies that are delivered through APIs. So if I want to create a sandbox to run some analytics on, I'm going to call an API to get that data. That API is going to call a policy API that's going to say, "Okay, does Seth have permission to see this data? Can Seth use this data for this intended purpose?" If yes, the sandbox is created. If not, there's a conversation about really why does Seth need access to this data? 
It's really moving governance to actively enable me to do things. And it changes the conversation from, hey it's your data, can I have it? To there's really solid reasons as to why I can and can't have data. >> And then some potential automation around a sandbox that creates value. >> Seth: Absolutely. >> But it's still, the example you gave, public, proprietary, or confidential, is still very governance-like, where I was hoping you were going with the data classification, and I think you referenced this. Can I extend that, that schema, that nomenclature, to include other attributes of value? And can I do it, automate it, at the point of creation or use, and scale it? >> Absolutely, that is exactly what I mean. I just used those three because it was the three that are easy to understand. >> So I can give you, as a business owner, some areas that I would like to see, a classification schema, and then you could automate that for me at scale? In theory? >> In theory, that's where we're hoping to go. To be able to automate. And it's going to be different based on what industry vertical you're in. What risk profile your business is willing to take. So that classification scheme is going to look very different for a bank than it will for a pharmaceutical company. Or for a research organization. >> Dave: Well, if I can then defensively delete data. That's of real value to an organization. >> With new regulations, you need to be able to delete data. And you need to be able to know where all of your data is. So that you can delete it. Today, most organizations don't know where all their data is. >> And that problem is solved with math and data science, or? >> I think that problem is solved with a combination of governance. >> Dave: Sure. >> And technology. Right? >> Yeah, technology kind of got us into this problem. We'll say technology can get us out. 
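[Editor's note: the active-governance flow described above, classify data up front, capture intent, and gate sandbox creation on a policy check, can be sketched as follows. The classification levels, intents, datasets, and rules here are all hypothetical, invented for illustration; this is not IBM's actual policy engine or API.]

```python
# Hypothetical sketch of "active governance": data is classified up front,
# intended uses are recorded, and a sandbox is only created when the
# policy check approves the request. All names and rules are invented.
CLASSIFICATION = {"weather_obs": "public",
                  "sales_pipeline": "proprietary",
                  "patient_records": "confidential"}

# Allowed intents per classification level (illustrative rules only).
ALLOWED_INTENTS = {"public": {"analytics", "marketing", "research"},
                   "proprietary": {"analytics", "research"},
                   "confidential": {"research"}}

LEVELS = ["public", "proprietary", "confidential"]

def policy_check(dataset, user_clearance, intent):
    """Return True if this user may use this dataset for this intent."""
    cls = CLASSIFICATION[dataset]
    # The user must be cleared at least to the dataset's level...
    if LEVELS.index(user_clearance) < LEVELS.index(cls):
        return False
    # ...and the stated intent must be allowed for that level.
    return intent in ALLOWED_INTENTS[cls]

def create_sandbox(datasets, user_clearance, intent):
    """Create a sandbox holding only the datasets policy permits."""
    approved = [d for d in datasets if policy_check(d, user_clearance, intent)]
    denied = [d for d in datasets if d not in approved]
    return {"approved": approved, "denied": denied}

sandbox = create_sandbox(["weather_obs", "patient_records"],
                         user_clearance="proprietary", intent="analytics")
print(sandbox)  # {'approved': ['weather_obs'], 'denied': ['patient_records']}
```

The denied list is where the "conversation about why Seth needs access" would start; the point is that the check happens automatically, before any data moves.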
>> On the technology subject, it seems like, with the explosion of data, it's not just volume but also many copies of the truth. You would need some sort of curation and catalog system that goes beyond what you had in a data warehouse. How do you address that challenge? >> Seth: Yeah, and that gets into what I said when you guys asked me about CDOs, what do they care about? One of the things is unified governance. And so part of unified governance, the first piece of unified governance, is having a catalog of your data. That is all of your data. And it's a single catalog for your data, whether it's one of your business critical systems that's running your business today, whether it's a public cloud, or it's a private cloud, or some combination of both. You need to know where all your data is. You also need to have a policy catalog that's single for both of those. Catalogs like this fall apart by entropy. And the more you have, the more likely they are to fall apart. And so if you have one, and you have a lot of automation around it to do a lot of these things, so you have automation that allows you to go through your data and discover what data is where, and keep track of lineage in an automated fashion, keep track of provenance in an automated fashion, then we start getting into a system of truly unified governance that's active, like I said before. >> There's a lot of talk about digital transformations. Of course, digital equals data. If it ain't data, it ain't digital. One of the things, in the early days of the whole big data theme, you'd hear people say, "You have to figure out how to monetize the data." And that seems to have changed and morphed into you have to understand how your organization gets value from data. If you're a for-profit company, it's monetizing something and seeing how data contributes to that monetization; if you're a health care organization, maybe it's different. 
I wonder if you could talk about the importance, to the CDO specifically, of understanding how an organization makes money. >> I think you bring up a good point. Monetization of data and analytics is often interpreted differently. If you're a CFO you're going to say, "You're going to create new value for me, I'm going to start getting new revenue streams." And that may or may not be what you mean. >> Dave: Sell the data, it's not always so easy. >> It's not always so easy, and it's hard to demonstrate value for data, to sell it. There are certain types; like, IBM owns a weather company. Clearly, people want to buy weather data, it's important. But if you're talking about how do you transform a business unit, it's not necessarily about creating new revenue streams, it's how do I leverage data and analytics to run my business differently. And maybe even what are new business models that I could never do before I had data and data science. >> Would it be fair to say that, as Dave was saying, there's the data side and people were talking about monetizing that. But when you talk about analytics increasingly, machine learning specifically, it's a fusion of the data and the model. And a feedback loop. Is that something that becomes a critical asset? >> I would actually say that you really can't generate a tremendous amount of value from just data. You need to apply something like machine learning to it. And machine learning has no value without good data. You need to be able to apply machine learning at scale. You need to build the deployable data science assets that run your business differently. So for example, I could run a report that shows me how my business did last quarter. How my sales team did last quarter. Or how my marketing team did last quarter. That's not really creating value. That's giving me a retrospective look on how I did. Where you can create value is how do I run my marketing team differently. 
So what data do I have, and what types of learning can I get from that data, that will tell my marketing team what they should be doing? >> George: And the ongoing process. >> And the ongoing process. And part of actually discovering, doing this cataloging of your data and understanding your data, you find data quality issues. And data quality issues are not necessarily an issue with the data itself or the people, they're usually process issues. And by discovering those data quality issues you may discover processes that need to be changed, and in changing those processes you can create efficiencies. >> So it sounds like you guys got a pretty good framework. Having talked to Inderpal a couple times, what you're saying makes sense. Do you have nightmares about IOT? (laughing) >> Do I have nightmares about IOT? I don't think I have nightmares about IOT. IOT is really just a series of connected devices. Is really what it is. In my talk tomorrow, I'm going to talk about hybrid cloud, and connected car is actually one of the things I'm going to talk about. And really, a connected car, you just have a bunch of connected devices to a private cloud that's on wheels. I'm less concerned about IOT than I am about people manually changing data. With IOT you get data, you can track it, if something goes wrong, you know what happened. I would say no, I don't have nightmares about IOT. If you do security wrong, that's a whole nother conversation. >> But it sounds like you're doing security right, sounds like you got a good handle on governance. Obviously scale is a key part of that. Could break the whole thing if you can't scale. And you're comfortable with the state of technology being able to support that? At least with IBM. >> I think, at least with IBM, I am. Like I said, a connected car, which is basically a bunch of IOT devices, a private cloud. How do we connect that private cloud to other private clouds or to a public cloud? There's tons of technologies out there to do that. 
Spark, Kafka. Those two things together allow you to do things that we could never do before. >> Can you elaborate? Like in a connected car environment or some other scenario? Other people have called it a data center on wheels; think of it as a private cloud, that's a wonderful analogy. How do Spark and Kafka on that very, very smart device cooperate with something on the edge, like the cities and buildings, versus in the cloud? >> If you're a connected car and you're this private cloud on wheels, you can't drive the car just on that information. You can't drive it just on the LIDAR, knowing how well the wheels are in contact; you need weather information. You need information about other cars around you. You need information about pedestrians. You need information about traffic. All of this information you get from that connection. And the way you do that is leveraging Spark and Kafka. Kafka's a messaging system; you could leverage Kafka to send the car messages. Or send pedestrian messages: "This car is coming, you shouldn't cross." Or vice versa, get a car to stop because there's a pedestrian in the way before even the systems on the car can see it. So you can get that kind of messaging system in near real time. If I'm the pedestrian and I'm 300 feet away, the half a second it would take for that to go through isn't that big of a deal, because you'll be stopped before you get there. >> What about, again, the intelligence between not just the data, but the advanced analytics, where some of that would live in the car and some in the cloud? Is it just that you're making real-time decisions in the car and you're retraining the models in the cloud, or how does that work? >> No, I think some of those decisions would be done through Spark, in transit. And so one of the nice things about Spark is, we can do machine learning transformations on data. Think ETL. But think ETL where you can apply machine learning as part of that ETL.
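The messaging pattern Seth describes, Kafka relaying near-real-time warnings between cars and pedestrians, can be sketched roughly as follows. This is a minimal stand-in, not production code: the topic name, message fields, and the 500-foot threshold are illustrative assumptions, and an in-memory queue substitutes for a real Kafka broker (where a client such as kafka-python would be used instead).

```python
import json
import time
from queue import Queue

# In-memory stand-in for a Kafka topic. A real deployment would use a
# Kafka producer/consumer against a broker; the topic name
# "pedestrian-warnings" and the message schema are illustrative.
topic = Queue()

def publish_warning(car_id, pedestrian_id, distance_ft):
    """Serialize and 'send' a near-real-time warning message."""
    message = {
        "type": "pedestrian_warning",
        "car": car_id,
        "pedestrian": pedestrian_id,
        "distance_ft": distance_ft,
        "ts": time.time(),
    }
    # In a real system: producer.send("pedestrian-warnings", ...)
    topic.put(json.dumps(message))

def consume_warning():
    """Deserialize the next message and decide whether to act on it."""
    message = json.loads(topic.get())
    # Assumed rule: warn/stop when the pedestrian is within 500 feet.
    should_stop = message["distance_ft"] < 500
    return message, should_stop

publish_warning(car_id="car-42", pedestrian_id="ped-7", distance_ft=300)
msg, stop = consume_warning()
print(msg["type"], stop)  # pedestrian_warning True
```

The point of the sketch is the latency budget Seth mentions: at 300 feet, the half second the message takes to go through still leaves ample time to stop.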
So I'm transferring all this weather data, positioning data, and I'm applying a machine learning algorithm for a given purpose in that car. So the purpose is navigation, or making sure I'm not running into a building. So that's happening in real time as it's streaming to the car. >> That's the prediction aspect that's happening in real time. >> Seth: Yes. >> But at the same time, you want to be learning from all the cars in your fleet. >> That would happen up in the cloud. I don't think that needs to happen on the edge. Maybe it does, but I don't think it needs to happen on the edge. And today, while I said a car is a data center, a private cloud on wheels, there's a cost to the computation you can have on that car. And I don't think the cost is quite low enough yet where it makes sense to do all that computation on the edge. So some of it you would want to do in the cloud. Plus you would want to have all the information from as many cars in the area as possible. >> Dave: We're out of time, but some closing thoughts. They say may you live in interesting times. Well, that kind of sums up the changes that are going on in the business. Dell buys EMC, IBM buys The Weather Company, and that gave you a huge injection of data scientists. Which, talk about data culture. Just last thoughts on that in terms of the acquisition and how that's affected your role. >> I've only been at IBM since November, so all that happened before I arrived. >> Dave: So you inherited it? >> So from my perspective it's a great thing. Before I got there, the culture was starting to change. Like we talked about before we went on air, the hardest part about any kind of data science transformation is the cultural aspect. >> Seth, thanks very much for coming back on theCUBE. Good to have you. >> Yeah, thanks for having me again. >> You're welcome. All right, keep it right there everybody, we'll be back with our next guest.
This is theCUBE, we're live from Spark Summit in Boston. Right back. (soft rock music)
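Seth's idea of applying machine learning inside the ETL step, scoring records in transit rather than after they land, can be sketched in plain Python. In Spark this would be a Structured Streaming job with an MLlib model; the telemetry fields and the tiny least-squares model below are illustrative assumptions, not an actual connected-car schema.

```python
# Sketch of "ETL with machine learning applied in transit": each record is
# cleaned (the ETL step) and scored by a trained model (the ML step) while
# streaming, rather than after landing in a store.

def train(history):
    """Fit y = a*x + b by least squares on (speed, stopping_distance) pairs."""
    n = len(history)
    mx = sum(x for x, _ in history) / n
    my = sum(y for _, y in history) / n
    a = sum((x - mx) * (y - my) for x, y in history) / \
        sum((x - mx) ** 2 for x, _ in history)
    return a, my - a * mx

def stream_etl(records, model):
    """Generator pipeline: clean each record, then score it in transit."""
    a, b = model
    for rec in records:
        if rec.get("speed_mph") is None:          # ETL: drop bad telemetry
            continue
        rec["speed_mph"] = float(rec["speed_mph"])  # ETL: normalize types
        rec["pred_stop_ft"] = a * rec["speed_mph"] + b  # ML applied in transit
        yield rec

model = train([(10, 50), (20, 120), (30, 210), (40, 320)])
stream = [{"speed_mph": "25"}, {"speed_mph": None}, {"speed_mph": 35}]
scored = list(stream_etl(stream, model))
print(len(scored))  # 2
```

The design point is that the model is applied record by record as data flows, so the car (or any consumer) receives already-scored events rather than raw telemetry.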

Published Date : Feb 8 2017


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave | PERSON | 0.99+
IBM | ORGANIZATION | 0.99+
George | PERSON | 0.99+
George Gilbert | PERSON | 0.99+
Seth | PERSON | 0.99+
Dave Vellante | PERSON | 0.99+
Inderpal Bhandari | PERSON | 0.99+
10 copies | QUANTITY | 0.99+
Seth Dobrin | PERSON | 0.99+
Dell | ORGANIZATION | 0.99+
300 feet | QUANTITY | 0.99+
one | QUANTITY | 0.99+
two | QUANTITY | 0.99+
six month | QUANTITY | 0.99+
both | QUANTITY | 0.99+
Boston | LOCATION | 0.99+
30 scales | QUANTITY | 0.99+
last quarter | DATE | 0.99+
five things | QUANTITY | 0.99+
five pieces | QUANTITY | 0.99+
IBM Analytics Organization | ORGANIZATION | 0.99+
Boston, Massachusetts | LOCATION | 0.99+
each | QUANTITY | 0.99+
two things | QUANTITY | 0.99+
today | DATE | 0.99+
November | DATE | 0.99+
tomorrow | DATE | 0.99+
Today | DATE | 0.99+
single | QUANTITY | 0.99+
The Weather Company | ORGANIZATION | 0.99+
two vectors | QUANTITY | 0.99+
EMC | ORGANIZATION | 0.98+
Spark | TITLE | 0.98+
Interpol | ORGANIZATION | 0.98+
IBM Analytics | ORGANIZATION | 0.98+
one driver | QUANTITY | 0.98+
One | QUANTITY | 0.97+
first piece | QUANTITY | 0.97+
Kafka | PERSON | 0.97+
three | QUANTITY | 0.97+
Spark Summit East 2017 | EVENT | 0.93+
a year | QUANTITY | 0.93+
Spark Summit | EVENT | 0.92+
five key things | QUANTITY | 0.91+
single catalog | QUANTITY | 0.9+
EU GDPR | TITLE | 0.9+
one layer | QUANTITY | 0.9+
Spark | PERSON | 0.88+
Kafka | TITLE | 0.86+
half a second | QUANTITY | 0.84+
Databricks | ORGANIZATION | 0.82+

Alyse Daghelian, IBM | IBM Data and AI Forum


 

>>Live from Miami, Florida. It's theCUBE, covering IBM's Data and AI Forum. Brought to you by IBM. >>We're back in Miami. Welcome everybody. You're watching theCUBE, the leader in live tech coverage. We're here at the IBM Data and AI Forum. Wow. What a day. 1700 customers. A lot of hands-on labs sessions. What used to be the IBM analytics university has sort of morphed into this event. Now you see the buzz that's going on. Alyse Daghelian is here. She's the vice president of global sales for IBM data and AI. Welcome to theCUBE. Thank you for coming on. So this event is buzzing, double from last year almost. And uh, congratulations. >>Well, thank you very much. We have con, uh, lots of countries represented here. We have customers from small to large, every industry represented. And a, it's a, I can see a marked difference in the conversations in just a year around how customers want to figure out how to embark on this journey to AI. >>So yeah. So why do they come here? What's the, what's the primary motivation? >>Well, I think one, IBM is recognized as the leader in AI, and we just came out in the IDC survey as the three-time, you know, recognized leader in AI. And when they come here they know they're going to hear from other clients who have embarked on similar journeys. They know they're going to have access to experts, hands-on labs, and we bring our entire IBM team that's focused on data and AI to this event. So it's intimate, it's high skilled, it's high energy, and they are learning a ton while they're here. >>Yeah, a lot of content and you're educating but you're also trying to inspire people. I mean, Ray Zahab was up this morning, he wrote this book, but he's this extreme, extreme, extreme like ultra marathoner. Uh, which I thought was a great talk this morning. And then you did a, I thought a good job of sort of connecting, you know, his talk of anything's possible to now bringing AI into the equation.
What are you hearing from customers in terms of what they want to make possible and, and what's that conversation like in the field? >>Well, it's interesting, because there is a huge recognition, with every client that I talk to, and they all want to understand this, that they have to be transforming their businesses on this journey to AI. So they all recognize that they need to start now. What I find when I talk to clients is that they're all coming in at different entry points. There's a maturity curve. So some are figuring out, you know, how do I move away from just Excel spreadsheets? I'm still running my business on Excel, right? And these are, you know, major banks that are operating on Excel spreadsheets, and they're looking at niche competitors, you know, digital banks that are entering the scene. And if they don't change the way they operate, they're not going to survive. So a lot of companies are coming in knowing that they're low on the maturity curve and they better do something to move up that curve pretty fast. >>Some are in almost the second turn of the crank, where they've invested in a lot of the AI technologies, they've built data science platforms, and now they're figuring out how do they get that next rev of productivity improvement? How do they come up with that next business idea that's going to give them that competitive advantage? So what I find is every client is embarking on this journey, which is a big difference from where I think we were even a year, 18 months ago, where they were sort of just, okay, this is interesting. Now they're like, I better do something. >>Okay, so you're a resource, you know, as the head of global sales for this group. So when you talk to customers that are immature, if I hear you right, they're saying, help us get started because we're going to fall behind. Uh, we're inefficient right now. We're drowning in spreadsheets, data. Our data quality is not where it needs to be. Help. Where do we start? What do you tell them?
>>Well, one, we have a formula that we've proven works with clients. Um, we bring them into our garages where we will do design thinking, architectural workshops, and we figure out a use case, because what we try not to do with our clients is boil the ocean. We want them to have something that they can prove success around very quickly, create that minimal viable product, bring it back to the business so that the business can see, Oh, I understand. And then evolve that use case. So we will bring technical specialists, we will bring folks that are our own data scientists to these garage environments, and we will work with them on building out this first use case. >>Explain the garage a little bit more. Is that those, those are sort of centers of excellence around the world, or how do I tap them as a customer? Is it, is it a freebie? Is it for pay? Is it like the data science elite team? How does it all work? >>Well, it is. There are a number of physical locations and it's open to all clients. We have created these with co-leadership from across the entire IBM company. So our services organization, our cloud and cognitive organization, all play a role in these garages. So we have a formal structure where a team can engage through a request process into the garages. We will help them define the use case they want to bring into the garage. We will bring them in for a period of time and provide the resources and capabilities and skills, and that's not charged to the client. So we're trying to get them started, knowing that they'll take that back to their company, and then they will look at follow-on opportunities, and those may, you know, work out to be different services opportunities as they move forward. But we're on that get-started phase. >>Yeah. Yeah. I mean, you're a for-profit company, so it's great to have a loss leader, but the line outside the door at the garage must be huge for people that want to get in. How are you managing the demand?
>>Yes, well we're increasing obviously our capacity around the garages. Um, and we're still making customers aware of the garages. There's still, because it's a commitment on their side, like they just can't come in and kick the tires, we ask them to bring their line of business along with their technical teams into the garages, because that's where you get the best product coming out of it. When you know you've got something that's going to solve a business problem, but you have to have buy-in from both sides. >> I want to ask you about the AI ladder. You know, Rob Thomas has been using this construct for a while. It didn't just come out of thin air. I'm sure there was a lot of customer input and a lot of debate about what should be on the ladder. When I first heard of the AI ladder, it was data and analytics, ML and AI, sort of the building, the technical or technology building blocks. It's now become verbs, which I love, which is collect, organize, analyze and infuse, which is all about operationalizing and scaling. How is that resonating with customers, and how do they fit into that methodology or framework? >> Well, I'll tell you, I use that framework with every single client, and I describe that there is a set of steps, you know, obviously the ladder, that every customer has to embark upon. And it starts with some very basic principles, and as soon as you start with the very basic principles, every client is like, of course, like it seems so obvious, that first and foremost you have to have data as the foundation, right? AI is not created out of, you know, someone in a back room. The foundation to AI is, is information and data. Yet every customer, every customer struggles with that: data is coming from multiple systems, multiple sources, they can't get to the data fast enough. They're shipping data around an organization. It's not managed.
And yet they know that in five years, the data they think they need today is going to be completely different. It could be 12 months, but certainly in the future. So how do you build out that architecture that allows them to build that now, but have the agility to grow as the requirements change? You start with that basic discussion, and they're like, well of course. So that's collect, and then you bring it up and you talk about how do you govern that data? How do you know where that data originated? Who is the owner? How do you know what that data means? What system did it come from? What's the, you know, who has access to it? How do you create that set of governed data? And of course every client recognizes they have that set of issues. So I could continue working my way up the ladder, and every client realizes that, okay, here's where I am today. What you just painted for me is absolutely what I need to focus on and address. Now help me get from A to B. >>So I'm really interested in this discussion, because it sounds like you're a very disciplined sales leader, and you said you use the ladder with virtually every client, and I presume your sales teams use the ladder. So you train your salespeople how to converse using the ladder. And then the other observation, I'd love your thoughts on this, is every step of the ladder has these questions. So you're asking customers questions, and I'm sure it catalyzes conversation, the, the answers to which you have solutions for, presumably, for any of them. But I wonder if you could talk about that. >>Well, let me tell you about the ladder and how we're using it with our sales force, because it was a unifying approach, not just within our own team, our data and AI team, but outside of data and AI. Because not only did we explain it to clients this way, but to the rest of IBM, our business partners, our whole ecosystem.
So unifying in that we started every single conversation with our sales team on enabling them on how they talk to their clients; our materials, our use cases, our references, our marketing campaigns, we tied everything to this unified approach, and it's made a huge difference in how we communicate our value to clients and explain this journey to AI in comprehensive steps that everyone could understand and relate to. >> Love it. How is the portfolio evolving to map into that framework? And what can we expect going forward? What can you share with us, Alyse? >> Well, the other amazing feat, I'll call it, that we produced around this is, I'll talk to a client and I'll describe these capabilities, and then I will say to a customer, you don't have to do every one of these things that I've just described, but you can implement what you need when you need it. Because we have built all of this into a unified platform called Cloud Pak for Data, and it's a modern data platform. It's built on an open infrastructure, built on Red Hat OpenShift, so that you can run it on your own premises as a private cloud or on public clouds, whether that be IBM or Amazon or Azure. It allows you to have a framework, a platform built on this open modern infrastructure with access to all these capabilities I've just described as services, and you decide, completely open, what services you need to deploy when, and you grow the platform as you need it. And, Oh, by the way, if you don't have the Red Hat OpenShift environment set up, we'll package that in a system, and I will roll in the system to you and allow you to have access to the capabilities in hours.
How's that conversation going in the field? >>Well, I will tell you we've been a hundred percent consistent in terms of everything that you've heard Jenny and Arvin Krishna talk about in the fact that we are going to maintain their culture, keep them as that separate entity inside of IBM. It's absolutely perpetrated throughout the entire IBM company. Um, we have a lot to learn from, from them as I'm sure they have to learn from us, but it truly is operating and I see it in the clients that I'm working with as a real win-win. >>If you had to take one thing away from this event that you want customers to, to remember, what would it be? >>Start now. Um, because if you don't begin on this journey to AI, you will find yourselves, you know, fighting against new competitors, uh, increasing costs, you know, you have to improve productivity. Every client is embarking on this journey to AI start now. >>And when you were talking about, uh, the maturity model and, and one of those levels was folks that had started already and they wanted to get to the next level, when you go into those clients, do you discern a different sort of attitude? We've started, we're down the path. Did they have more of a spring in their step? Are they like chomping at the bit to really go faster and extend their lead relative to the competition competition? What's the dynamic like in those accounts? >>That's a great question because I was with a client this afternoon, um, a large manufacturer of, uh, of goods and they are at this turning point where they did kind of phase one, they implemented cloud pack for data and they did it to just join some of their disparate systems. Now, I mean, I, I barely got a word in because he was so excited cause he's, now what I'm going to do is I'm going to figure out where my factories should go based on where my products are selling. 
So he's now looking at how he can change his whole distribution process as a result of getting access to this data and analytics that he never had before. Um, and I was like, okay, well just tell me how I can help you. And he was like, no, we're way ahead. >>So this was the big kickoff day. I know yesterday there was sort of deep learning hands-on stuff, the big keynotes. Today, we're only here for one day. What are we going to miss? What's, what's happening tomorrow? >>Well, it's a bit of a repeat of today. So we'll have another keynote tomorrow from Beth Smith, who runs our Watson, uh, business for IBM. We'll have more hands-on labs. We have a lot of customer presentations where they're sharing their best practices. Um, lots of fun. >>Where do you want to see this event go? And what kind of, what's next in IBM event land? >>Well, the feedback from last year and this year says we have to do this again next year. It's, it will be bigger, because I think this year proves that; it's already doubled, and we'll probably see a similar dynamic. Um, so I fully expect us to be here. Well, maybe not here. We're sort of outgrowing this hotel. Um, but doing this event again next year. >>AI, machine learning, automation, uh, I'll throw in cloud. These are the hottest topics going. Alyse, thanks very much for coming on theCUBE, it was great to have you. >>It's great. It's great meeting with you. >>Thank you for watching everybody. That's a wrap from Miami. Go to siliconangle.com to check out all the news, and theCUBE.net is where you'll find all these videos. And follow the, uh, the Twitter handles, @theCUBE and @theCUBE365. I'm Dave Vellante. We're out. We'll see you next time.

Published Date : Oct 22 2019


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Doug | PERSON | 0.99+
Dave Vellante | PERSON | 0.99+
Doug Schmitt | PERSON | 0.99+
Jenny | PERSON | 0.99+
Dave Volante | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
Michael | PERSON | 0.99+
Jen | PERSON | 0.99+
Jen Felch | PERSON | 0.99+
IBM | ORGANIZATION | 0.99+
Miami | LOCATION | 0.99+
Alyse Daghelian | PERSON | 0.99+
Dell Technologies | ORGANIZATION | 0.99+
Alex Barretto | PERSON | 0.99+
Dell Technologies Services | ORGANIZATION | 0.99+
Arvin Krishna | PERSON | 0.99+
Dell | ORGANIZATION | 0.99+
Las Vegas | LOCATION | 0.99+
2020 | DATE | 0.99+
Rob Thomas | PERSON | 0.99+
Today | DATE | 0.99+
17% | QUANTITY | 0.99+
Amazon | ORGANIZATION | 0.99+
tomorrow | DATE | 0.99+
next year | DATE | 0.99+
Alex | PERSON | 0.99+
Beth Smith | PERSON | 0.99+
yesterday | DATE | 0.99+
APEX | ORGANIZATION | 0.99+
JJ Davis | PERSON | 0.99+
Dave | PERSON | 0.99+
Elise | PERSON | 0.99+
last year | DATE | 0.99+
Excel | TITLE | 0.99+
1700 customers | QUANTITY | 0.99+
third item | QUANTITY | 0.99+
two | QUANTITY | 0.99+
59 billion | QUANTITY | 0.99+
one | QUANTITY | 0.99+
Dell Technologies and Services | ORGANIZATION | 0.99+
one day | QUANTITY | 0.99+
12 months | QUANTITY | 0.99+
today | DATE | 0.99+
Miami, Florida | LOCATION | 0.99+
third | QUANTITY | 0.99+
three things | QUANTITY | 0.99+
five years | QUANTITY | 0.99+
Sunday | DATE | 0.99+
both sides | QUANTITY | 0.99+
one unit | QUANTITY | 0.99+
siliconangle.com | OTHER | 0.99+
this year | DATE | 0.99+
three years | QUANTITY | 0.99+
telco | ORGANIZATION | 0.99+
third one | QUANTITY | 0.98+
hundred percent | QUANTITY | 0.98+
second priority | QUANTITY | 0.98+

Keynote Analysis | IBM Data and AI Forum


 

>>Live from Miami, Florida. It's theCUBE, covering IBM's Data and AI Forum. Brought to you by IBM. >>Welcome everybody to the port of Miami. My name is Dave Vellante and you're watching theCUBE, the leader in live tech coverage. We go out to the events, we extract the signal from the noise, and we're here at the IBM Data and AI Forum. The hashtag is data AI forum. This is IBM's, it's formerly known as the, uh, IBM analytics university. It's a combination of learning, peer networking, and really the focus is on AI and data. And there are about 1700 people here, up from, Oh, about half of that last year, uh, when it was the IBM, uh, analytics university; about 600 customers, a few hundred partners. There's press here, there's, there's analysts, and of course theCUBE is covering this event. We'll be here for one day: 128 sessions, 35 hands-on labs. As I say, a lot of learning, a lot of technical discussions, a lot of best practices. >>What's happening here? For decades, our industry has marched to the cadence of Moore's law, the idea that you could double the processor performance every 18 months, doubling the number of transistors, you know, within, uh, the footprint. That's no longer what's driving innovation in the IT and technology industry today. It's a combination of data with machine intelligence applied to that data, and cloud. So data: we've been collecting data, we've always talked about all this data that we've collected, and over the past 10 years, with the advent of lower-cost warehousing technologies and file stores like Hadoop, um, with activity going on at the edge, with new databases and lower-cost data stores that can handle unstructured data as well as structured data, we've amassed this huge amount of, of data that's growing at a, at a nonlinear rate. It's, you know, the curve is steepening, it's exponential.
>>So there's all this data, and then applying machine intelligence, artificial intelligence with machine learning, to that data is the sort of blending of a new cocktail. And then the third piece, the third leg of that stool, is the cloud. Why is the cloud important? Well, it's important for several reasons. One is that's where a lot of the data lives too. It's where agility lives. So cloud-native and dev ops, and being able to spin up infrastructure as code, really started in the cloud, and it's sort of seeping to on-prem, slowly, and to hybrid and multi-cloud architectures. But cloud gives you not only that data access, not only the agility, but also scale, global scale. So you can test things out very cheaply. You can experiment very cheaply with cloud and data and AI. And then once your POC is set and you know it's going to give you business value and the business outcomes you want, you can then scale it globally.
Uh, ward Kimball and Stanley, uh, ward Kimball went to, uh, uh, Walt Disney with this amazing animation. And Walter said, I love it. It was so funny. It was so beautiful, was so amazing. Your work 283 days on this. I'm cutting it out. So Rob talked about cutting out the good to find, uh, the great, um, also talking about AI is penetrated only about four to 10% within organizations. Why is that? Why is it so low? He said there are three things that are blockers. They're there. One is data and he specifically is referring to data quality. The second is trust and the third is skillsets. So he then talked about, you know, of course dovetailed a bunch of IBM products and capabilities, uh, into, you know, those, those blockers, those challenges. >>He talked about two in particular, IBM cloud pack for data, which is this way to sort of virtualize data across different clouds and on prem and hybrid and and basically being able to pull different data stores in, virtualize it, combine join data and be able to act on it and apply a machine learning and AI to it. And then auto AI a way to basically machine intelligence for artificial intelligence. In other words, AI for AI. What's an example? How do I choose the right algorithm and that's the best fit for the use case that I'm using. Let machines do that. They've got experience and they can have models that are trained to actually get the best fit. So we talked about that, talked about a customer, a panel, a Miami Dade County, a Wunderman Thompson, and the standard bank of South Africa. These are incumbents that are using a machine intelligence and AI to actually try to super supercharge their business. We heard a use case with the Royal bank of Scotland, uh, basically applying AI and driving their net promoter score. So we'll talk some more about that. 
We're going to be here all day today, interviewing executives from IBM and getting feedback from the analysts on what customers are doing with AI. So this is what we do. Keep it right there, buddy. We're in Miami all day long. This is Dave Vellante. You're watching theCUBE. We'll be right back right after this short break.

Published Date : Oct 22 2019

SUMMARY :

Dave Vellante opens theCUBE's coverage of IBM's Data and AI Forum in Miami. Ultramarathoner Ray Zahab keynoted on the will of humans to rise above any challenge; Rob Thomas framed data, AI, and cloud as superpowers, cited data quality, trust, and skill sets as the three blockers keeping AI adoption at roughly four to 10% of organizations, and highlighted IBM Cloud Pak for Data and AutoAI alongside customer stories from Miami-Dade County, Wunderman Thompson, Standard Bank of South Africa, and the Royal Bank of Scotland.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Ray Zahab | PERSON | 0.99+
Miami | LOCATION | 0.99+
Dave Vellante | PERSON | 0.99+
IBM | ORGANIZATION | 0.99+
Rob Thomas | PERSON | 0.99+
Dave Olanta | PERSON | 0.99+
4,500 miles | QUANTITY | 0.99+
35 hands | QUANTITY | 0.99+
Stanley | PERSON | 0.99+
two | QUANTITY | 0.99+
six countries | QUANTITY | 0.99+
128 hands | QUANTITY | 0.99+
111 days | QUANTITY | 0.99+
Walter | PERSON | 0.99+
Rob | PERSON | 0.99+
Africa | LOCATION | 0.99+
Jude | PERSON | 0.99+
one day | QUANTITY | 0.99+
283 days | QUANTITY | 0.99+
third piece | QUANTITY | 0.99+
Miami, Florida | LOCATION | 0.99+
Wunderman Thompson | ORGANIZATION | 0.99+
Royal bank of Scotland | ORGANIZATION | 0.99+
One | QUANTITY | 0.99+
third | QUANTITY | 0.99+
today | DATE | 0.99+
second | QUANTITY | 0.98+
last year | DATE | 0.98+
about 600 customers | QUANTITY | 0.98+
third leg | QUANTITY | 0.98+
South Africa | LOCATION | 0.97+
one time | QUANTITY | 0.97+
three things | QUANTITY | 0.96+
IBM Data | ORGANIZATION | 0.96+
about 1700 people | QUANTITY | 0.96+
three superpowers | QUANTITY | 0.96+
two ultra marathon | QUANTITY | 0.95+
Kimball | PERSON | 0.95+
two showers | QUANTITY | 0.94+
10% | QUANTITY | 0.94+
about four | QUANTITY | 0.88+
IBM analytics university | ORGANIZATION | 0.86+
Miami Dade County | LOCATION | 0.8+
18 months | QUANTITY | 0.78+
hundred partners | QUANTITY | 0.76+
decades | QUANTITY | 0.74+
university | ORGANIZATION | 0.73+
ward | PERSON | 0.69+
Disney | ORGANIZATION | 0.69+
Hadoop | TITLE | 0.67+
Moore | PERSON | 0.6+
years | DATE | 0.59+
Walt | PERSON | 0.58+
Disney | PERSON | 0.5+
10 | QUANTITY | 0.46+
half | QUANTITY | 0.4+
past | DATE | 0.39+

Seth Dobrin, IBM | IBM CDO Summit 2019


 

>> Live from San Francisco, California, it's theCUBE, covering the IBM Chief Data Officer Summit, brought to you by IBM. >> Welcome back to San Francisco everybody. You're watching theCUBE, the leader in live tech coverage. We go out to the events, we extract the signal from the noise and we're here at the IBM Chief Data Officer Summit, 10th anniversary. Seth Dobrin is here, he's the Vice President and Chief Data Officer of the IBM Analytics Group. Seth, always a pleasure to have you on. Good to see you again. >> Yeah, thanks for having me back Dave. >> You're very welcome. So I love these events; you get a chance to interact with chief data officers, guys like yourself. We've been talking a lot today about IBM's internal transformation, how IBM itself is operationalizing AI, and maybe we can talk about that, but I'm most interested in how you're pointing that at customers. What have you learned from your internal experiences and what are you bringing to customers? >> Yeah, so, you know, I was hired at IBM to lead part of our internal transformation, so I spent a lot of time doing that. >> Right. >> I've also, you know, when I came over to IBM I had just left Monsanto where I led part of their transformation. So I spent the better part of the first year or so at IBM not only focusing on our internal efforts, but helping our clients transform. And out of that I found that many of our clients needed help and guidance on how to do this. And so I started a team we call The Data Science and AI Elite Team, and really what we do is we sit down with clients, we share not only our experience, but the methodology that we use internally at IBM, leveraging things like design thinking, DevOps, Agile, and how you implement that in the context of data science and AI. >> I've got a question, so Monsanto, obviously completely different business than IBM-- >> Yeah.
>> But when we talk about digital transformation and then talk about the difference between a business and a digital business, it comes down to the data. And you've seen a lot of examples where you see companies traversing industries which never used to happen before. You know, Apple getting into music, there are many, many examples, and the theory is, well, it's 'cause it's data. So when you think about your experiences of a completely different industry bringing now the expertise to IBM, were there similarities that you're able to draw upon, or was it a completely different experience? >> No, I think there's tons of similarities which is, which is part of why I was excited about this and I think IBM was excited to have me. >> Because the chances for success were quite high in your mind? >> Yeah, yeah, because the chance for success were quite high, and also, you know, if you think about it there's on the, how you implement, how you execute, the differences are really cultural more than they're anything to do with the business, right? So it's, the whole role of a Chief Data Officer, or Chief Digital Officer, or a Chief Analytics Officer, is to drive fundamental change in the business, right? So it's how do you manage that cultural change, how do you build bridges, how do you make people, how do you make people a little uncomfortable, but at the same time get them excited about how to leverage things like data, and analytics, and AI, to change how they do business. And really this concept of a digital transformation is about moving away from traditional products and services, more towards outcome-based services and not selling things, but selling, as a Service, right? And it's the same whether it's IBM, you know, moving away from fully transactional to Cloud and subscription-based offerings. 
Or it's a bank reimagining how they interact with their customers, or it's an oil and gas company, or it's a company like Monsanto really thinking about how do we provide outcomes. >> But how do you make sure that every, as a Service, is not a snowflake and it can scale so that you can actually, you know, make it a business? >> So underneath the, as a Service, is a few things. One is, data, one is, machine learning and AI, the other is really understanding your customer, right, because truly digital companies do everything through the eyes of their customer and so every company has many, many versions of their customer until they go through an exercise of creating a single version, right, a customer or a Client 360, if you will, and we went through that exercise at IBM. And those are all very consistent things, right? They're all pieces that kind of happen the same way in every company regardless of the industry and then you get into understanding what the desires of your customer are to do business with you differently. >> So you were talking before about the Chief Digital Officer, a Chief Data Officer, Chief Analytics Officer, as a change agent making people feel a little bit uncomfortable, explore that a little bit, what's that, asking them questions that intuitively they know they need to have the answer to, but they don't through data? What did you mean by that? >> Yeah so here's the conversations that usually happen, right? You go and you talk to your peers in the organization and you start having conversations with them about what decisions are they trying to make, right? And you're the Chief Data Officer, you're responsible for that, and inevitably the conversation goes something like this, and I'm going to paraphrase. Give me the data I need to support my preconceived notions. >> (laughing) Yeah. >> Right? >> Right. >> And that's what they want to (voice covers voice). >> Here's the answer, give me the data that-- >> That's right.
So I want a Dashboard that helps me support this. And the uncomfortableness comes in a couple of things in that. It's getting them to let go of that and allow the data to provide some inkling of things that they didn't know were going on, that's one piece. The other is, then you start leveraging machine learning, or AI, to actually help start driving some decisions, so limiting the scope from infinity down to two or three things and surfacing those two or three things and telling people in your business your choices are one of these three things, right? That starts to make people feel uncomfortable and really is a challenge for that cultural change getting people used to trusting the machine, or in some instances even, trusting the machine to make the decision for you, or part of the decision for you. >> That's got to be one of the biggest cultural challenges because you've got somebody who's, let's say they run a big business, it's a profitable business, it's the engine of cashflow at the company, and you're saying, well, that's not what the data says. And you're, say okay, here's a future path-- >> Yeah. >> For success, but it's going to be disruptive, there's going to be a change and I can see people not wanting to go there. >> Yeah, and if you look at, to the point about, even businesses that are making the most money, or parts of a business that are making the most money, if you look at what the business journals say you start leveraging data and AI, you get double-digit increases in your productivity, in your, you know, in differentiation from your competitors. That happens inside of businesses too. So the conversation even with the most profitable parts of the business, or highly, contributing the most revenue is really what we could do better, right? 
You could get better margins on this revenue you're driving, you could, you know, that's the whole point is to get better leveraging data and AI to increase your margins, increase your revenue, all through data and AI. And then things like moving to, as a Service, from single point to transaction, that's a whole different business model and that leads from once every two or three or five years, getting revenue, to you get revenue every month, right? That's highly profitable for companies because you don't have to go in and send your sales force in every time to sell something, they buy something once, and they continue to pay as long as you keep 'em happy. >> But I can see that scaring people because if the incentives don't shift to go from a, you know, pay all up front, right, there's so many parts of the organization that have to align with that in order for that culture to actually occur. So can you give some examples of how you've, I mean obviously you ran through that at IBM, you saw-- >> Yeah. >> I'm sure a lot of that, got a lot of learnings and then took that to clients. Maybe some examples of client successes that you've had, or even not so successes that you've learned from. >> Yeah, so in terms of client success, I think many of our clients are just beginning this journey, certainly the ones I work with are beginning their journey so it's hard for me to say, client X has successfully done this. But I can certainly talk about how we've gone in, and some of the use cases we've done-- >> Great. >> With certain clients to think about how they transformed their business. So maybe the biggest bang for the buck one is in the oil and gas industry. So ExxonMobil was on stage with me at Think, talking about-- >> Great. >> Some of the work that we've done with them in their upstream business, right? So every time they drop a well it costs them not thousands of dollars, but hundreds of millions of dollars.
And in the oil and gas industry you're talking massive data, right, tens or hundreds of petabytes of data that constantly changes. And no one in that industry really had a data platform that could handle this dynamically. And it takes them months to get, to even start to be able to make a decision. So they really want us to help them figure out, well, how do we build a data platform on this massive scale that enables us to be able to make decisions more rapidly? And so the aim was really to cut this down from 90 days to less than a month. And through leveraging some of our tools, as well as some open-source technology, and teaching them new ways of working, we were able to lay down this foundation. Now this is before, we haven't even started thinking about helping them with AI, oil and gas industry has been doing this type of thing for decades, but they really were struggling with this platform. So that's a big success where, at least for the pilot, which was a small subset of their fields, we were able to help them reduce that timeframe by a lot to be able to start making a decision. >> So an example of a decision might be where to drill next? >> That's exactly the decision they're trying to make. >> Because for years, in that industry, it was boop, oh, no oil, boop, oh, no oil. >> Yeah, well. >> And they got more sophisticated, they started to use data, but I think what you're saying is, the time it took for that analysis was quite long. >> So the time it took to even overlay things like seismic data, topography data, what's happened in wells, and core as they've drilled around that, was really protracted just to pull the data together, right? And then once they got the data together there were some really, really smart people looking at it going, well, my experience says here, and it was driven by the data, but it was not driven by an algorithm. >> A little bit of art. >> True, a lot of art, right, and it still is. 
So now they want some AI, or some machine learning, to help guide those geophysicists to help determine where, based on the data, they should be dropping wells. And these are hundred million and billion dollar decisions they're making so it's really about how do we help them. >> And that's just one example, I mean-- >> Yeah. >> Every industry has it's own use cases, or-- >> Yeah, and so that's on the front end, right, about the data foundation, and then if you go to a company that was really advanced in leveraging analytics, or machine learning, JPMorgan Chase, in their, they have a division, and also they were on stage with me at, Think, that they had, basically everything is driven by a model, so they give traders a series of models and they make decisions. And now they need to monitor those models, those hundreds of models they have for misuse of those models, right? And so they needed to build a series of models to manage, to monitor their models. >> Right. >> And this was a tremendous deep-learning use case and they had just bought a power AI box from us so they wanted to start leveraging GPUs. And we really helped them figure out how do you navigate and what's the difference between building a model leveraging GPUs, compared to CPUs? How do you use it to accelerate the output, and again, this was really a cost-avoidance play because if people misuse these models they can get in a lot of trouble. But they also need to make these decisions very quickly because a trader goes to make a trade they need to make a decision, was this used properly or not before that trade is kicked off and milliseconds make a difference in the stock market so they needed a model. And one of the things about, you know, when you start leveraging GPUs and deep learning is sometimes you need these GPUs to do training and sometimes you need 'em to do training and scoring. 
And this was a case where you need to also build a pipeline that can leverage the GPUs for scoring as well which is actually quite complicated and not as straight forward as you might think. In near real time, in real time. >> Pretty close to real time. >> You can't get much more real time then those things, potentially to stop a trade before it occurs to protect the firm. >> Yeah. >> Right, or RELug it. >> Yeah, and don't quote, I think this is right, I think they actually don't do trades until it's confirmed and so-- >> Right. >> Or that's the desire as to not (voice covers voice). >> Well, and then now you're in a competitive situation where, you know. >> Yeah, I mean people put these trading floors as close to the stock exchange as they can-- >> Physically. >> Physically to (voice covers voice)-- >> To the speed of light right? >> Right, so every millisecond counts. >> Yeah, read Flash Boys-- >> Right, yeah. >> So, what's the biggest challenge you're finding, both at IBM and in your clients, in terms of operationalizing AI. Is it technology? Is it culture? Is it process? Is it-- >> Yeah, so culture is always hard, but I think as we start getting to really think about integrating AI and data into our operations, right? As you look at what software development did with this whole concept of DevOps, right, and really rapidly iterating, but getting things into a production-ready pipeline, looking at continuous integration, continuous development, what does that mean for data and AI? And these concept of DataOps and AIOps, right? And I think DataOps is very similar to DevOps in that things don't change that rapidly, right? You build your data pipeline, you build your data assets, you integrate them. They may change on the weeks, or months timeframe, but they're not changing on the hours, or days timeframe. 
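A rough sketch of the pre-trade shape Seth describes: score the trade synchronously against a misuse model, and block it if either the score or the scoring latency itself is out of bounds. The threshold, the features, and the stand-in model below are all hypothetical illustrations, not JPMorgan Chase's or IBM's actual system.

```python
import time

# Hypothetical pre-trade model-misuse check with a hard latency budget.
# The "model" is a stand-in weighted sum; the point is the control flow:
# score synchronously, fail closed if scoring is too slow or too risky.

MISUSE_THRESHOLD = 0.8   # illustrative cutoff
LATENCY_BUDGET_S = 0.005  # 5 ms, illustrative only

def misuse_score(features):
    """Stand-in for a trained model: weighted sum clamped to [0, 1]."""
    weights = {"size_vs_limit": 0.6, "param_drift": 0.4}
    raw = sum(weights[k] * features.get(k, 0.0) for k in weights)
    return max(0.0, min(1.0, raw))

def pretrade_check(features):
    """Return ALLOW or BLOCK before the trade is released."""
    start = time.perf_counter()
    score = misuse_score(features)
    elapsed = time.perf_counter() - start
    if elapsed > LATENCY_BUDGET_S:
        return "BLOCK"  # fail closed if we cannot score in time
    return "BLOCK" if score >= MISUSE_THRESHOLD else "ALLOW"

print(pretrade_check({"size_vs_limit": 0.2, "param_drift": 0.1}))  # ALLOW
print(pretrade_check({"size_vs_limit": 1.0, "param_drift": 1.0}))  # BLOCK
```

The fail-closed branch is the design choice worth noting: in a confirm-before-trade flow, a slow score is treated the same as a bad score.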
As you get into some of these AI models some of them need to be retrained within a day, right, because the data changes, they fall out of parameters, or the parameters are very narrow and you need to keep 'em in there, what does that mean? How do you integrate this into your CI/CD pipeline? How do you know when you need to do regression testing on the whole thing again? Does your data science and AI pipeline even allow for you to integrate into your current CI/CD pipeline? So this is actually an IBM-wide effort that my team is leading to start thinking about, how do we incorporate what we're doing into people's CI/CD pipeline so we can enable AIOps, if you will, or MLOps, and really IBM is the only company that's positioned to do that for so many reasons. One is, we're the only one with an end-to-end toolchain. So we do everything from data, feature development, feature engineering, generating models, selecting models, whether it's AutoAI, or hand coding or visual modeling, into things like trust and transparency. And so we're the only one with that entire toolchain. Secondly, we've got IBM Research, we've got decades of industry experience, we've got our IBM Services Organization, all of us have been tackling this with large enterprises so we're uniquely positioned to really be able to tackle this in a very enterprise-grade manner.
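One piece of the MLOps loop Seth describes, noticing that a model has "fallen out of parameters" and needs retraining, can be sketched as a simple drift check against the distribution the model was trained on. The z-score rule and thresholds here are illustrative assumptions, not IBM's implementation.

```python
import statistics

# Minimal sketch of a data-drift trigger: compare the mean of live data
# against the training baseline and flag a retrain when it drifts more
# than z_limit standard errors away.

def drift_check(train_sample, live_sample, z_limit=3.0):
    """Return the drift z-score and whether a retrain should be triggered."""
    mu = statistics.fmean(train_sample)
    sd = statistics.stdev(train_sample)
    se = sd / (len(live_sample) ** 0.5) or 1e-9  # guard against zero spread
    z = abs(statistics.fmean(live_sample) - mu) / se
    return {"z": z, "retrain": z > z_limit}

baseline = [10.0, 11.0, 9.5, 10.5, 10.2, 9.8, 10.1, 10.4]  # training data
steady = [10.0, 10.3, 9.9, 10.2]    # live data still in parameters
shifted = [14.0, 14.5, 13.8, 14.2]  # live data that has drifted

print(drift_check(baseline, steady)["retrain"])   # False
print(drift_check(baseline, shifted)["retrain"])  # True
```

In a CI/CD pipeline this kind of check would run on a schedule, and a True result would kick off the retrain-and-regression-test stage rather than a human investigation.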
>> Yeah, so I've been with IBM for 2 1/2 years, and this is my eighth CDO Summit, so I've been to many more of these than I've been at IBM. And I went to these religiously before I joined IBM really for two reasons. One, there's no sales pitch, right, it's not a trade show. The second is it's the only place where I get the opportunity to listen to my peers and really have open and candid conversations about the challenges they're facing and how they're addressing them, really giving me insights into what other industries are doing and being able to benchmark me and my organization against the leading edge of what's going on in this space. >> I love it and that's why I love coming to these events. It's practitioners talking to practitioners. Seth Dobrin, thanks so much for coming to theCUBE. >> Yeah, thanks always, Dave. >> Always a pleasure. All right, keep it right there everybody, we'll be right back right after this short break. You're watching theCUBE, live from San Francisco. Be right back.

Published Date : Jun 24 2019

SUMMARY :

Seth Dobrin, Vice President and Chief Data Officer of the IBM Analytics Group, discusses bringing lessons from IBM's internal transformation, and his earlier work at Monsanto, to clients: the cultural challenge of getting people to trust data and machines over preconceived notions, the shift from selling products to as-a-Service outcomes, use cases from ExxonMobil's upstream data platform and JPMorgan Chase's models that monitor models, and the emerging disciplines of DataOps and MLOps for integrating data science into CI/CD pipelines.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
IBM | ORGANIZATION | 0.99+
Seth Dobrin | PERSON | 0.99+
San Francisco | LOCATION | 0.99+
Seth | PERSON | 0.99+
JPMorgan Chase | ORGANIZATION | 0.99+
Monsanto | ORGANIZATION | 0.99+
90 days | QUANTITY | 0.99+
two | QUANTITY | 0.99+
six clients | QUANTITY | 0.99+
Dave | PERSON | 0.99+
hundred million | QUANTITY | 0.99+
tens | QUANTITY | 0.99+
Apple | ORGANIZATION | 0.99+
one piece | QUANTITY | 0.99+
ExxonMobile | ORGANIZATION | 0.99+
IBM Analytics Group | ORGANIZATION | 0.99+
San Francisco | LOCATION | 0.99+
San Francisco, California | LOCATION | 0.99+
less than a month | QUANTITY | 0.99+
2 1/2 years | QUANTITY | 0.99+
three | QUANTITY | 0.99+
one example | QUANTITY | 0.99+
today | DATE | 0.99+
thousands of dollars | QUANTITY | 0.99+
one | QUANTITY | 0.99+
five years | QUANTITY | 0.98+
One | QUANTITY | 0.98+
second | QUANTITY | 0.98+
two reasons | QUANTITY | 0.98+
hundreds of petabytes | QUANTITY | 0.97+
hundreds of millions of dollars | QUANTITY | 0.97+
hundreds of models | QUANTITY | 0.97+
10th anniversary | QUANTITY | 0.97+
IBM Chief Data Officer Summit | EVENT | 0.97+
three things | QUANTITY | 0.96+
single point | QUANTITY | 0.96+
decades | QUANTITY | 0.95+
billion dollar | QUANTITY | 0.95+
Flash Boys | TITLE | 0.95+
single version | QUANTITY | 0.95+
Secondly | QUANTITY | 0.94+
both | QUANTITY | 0.92+
IBM Services Organization | ORGANIZATION | 0.9+
IBM Chief Data Officer Summit | EVENT | 0.9+
first year | QUANTITY | 0.89+
once | QUANTITY | 0.87+
IBM CDO Summit 2019 | EVENT | 0.83+
DataOps | TITLE | 0.72+
years | QUANTITY | 0.72+
Vice President | PERSON | 0.69+
Think | ORGANIZATION | 0.69+
every millisecond | QUANTITY | 0.68+
DevOps | TITLE | 0.68+
once every | QUANTITY | 0.67+
double- | QUANTITY | 0.62+
eighth CEO | QUANTITY | 0.62+
Chief Data Officer | PERSON | 0.6+
UBE | ORGANIZATION | 0.59+
360 | COMMERCIAL_ITEM | 0.58+
RELug | ORGANIZATION | 0.56+

Jay Limburn, IBM & Julie Lockner, IBM | IBM Think 2019


 

>> Live from San Francisco, it's theCUBE! Covering IBM Think 2019. Brought to you by IBM. >> Welcome back, live here in San Francisco, it's theCUBE's coverage of IBM Think 2019. I'm John Furrier--Stu Miniman. Stu, four days, we're on our fourth day, the sun's shining, they've shut down Howard Street here at IBM. Big event for IBM, in San Francisco, not Las Vegas. Lot of great cloud action, lot of great AI data developers. Great story, good to see you again. Our next two guests, Julie Lockner, Director, Offering Management, Portfolio Operations at IBM, Data+AI, great to see you. >> Thank you, it's great to see you too, thank you. >> And Jay Limburn, Director of Offering Management, IBM Data+AI, thanks for coming on. >> Hey guys, great to be here. >> So, we've chatted many times at events, the role of data. So, we're religious about data, data flows through our blood, but IBM has put it all together now. All the reorgs are over, everyone's kind of, the table is set for IBM. The data path is clear, it's part of applications. It's feeding the apps. AI's the key workload inside the application. This is now a fully set-up group, give us the update, what's the focus? >> Yeah, it's really exciting because, if you think about it, before, we were called IBM Analytics, and that really is only a part of what we do. Now that we're Data+AI, that means that not only are we responsible for delivering data assets, and technology that supports those data assets to our customers, but infusing AI, not only in the technologies that we have, but also helping them build applications so they can fuse AI into their business processes. >> It's pretty broad, I mean, data's very much a broad swath of things. Analytics, you know, wrangling data, setting things up, cataloging them. Take me through how you guys set this up. How do you present it to the marketplace? How are clients engaged with it? Because it's pretty broad. But it could be, it needs to be specific. 
Take us through the methodology. >> So, you probably heard a lot of people today talk about the ladder to AI, right? This is IBM's view of how we explain our client's journey towards AI. It really starts at the bottom rung of the ladder, where we've got the collection of information. Collect your data. Once you've collected your data, you move up to the next rung, which is the Organize. And this is really where all the governance stuff comes in. This is how we can provide a view across that data, understand that data, provide trust to that data, and then serve that up to the consumers of that information, so they can actually use that in AI. That's where all the data science capabilities come in, allowing people to actually be able to consume that information. >> So, the bottom set is just really all the hard and heavy lifting that data scientists actually don't want to do. >> And writing algorithms, the collecting, the ingesting of data from any source, that's the bottom? And then, tell me about that next layer up, from the collection-- >> So, Collect is the physical assets or the collection of the data that you're going to be using for AI. If you don't get that foundation right, it doesn't really make sense. You have to have the data first. The piece in the middle that Jay was referring to, that's called Organize, our whole divisions are actually organized around these ladders to AI, so, Collect, Organize, Analyze, Infuse. On the Organize side, as Jay was mentioning, it's all about inventorying the data assets, knowing what data you have, then providing data quality rules, governance, compliance-type offerings, that allow organizations to not just know your data, trust your data, but then make it available so you can use your data, and the users are those data scientists, they're the analytics teams, they're the operation organizations that need to be able to build their solutions on top of trusted data. >> So, where does the Catalog fit in? 
Which level does that come into? >> Yeah, so, think of the Data Catalog as the DNS for data, all right? It's the way in which you can provide a full view of all of your information. Whether it's structured information, unstructured information, data you've got on PRAM and data you've got in a cloud somewhere. >> That's in the Organize layer, right? >> That's all in the Organize layer. So, if you can collect that information, you can then provide capabilities that allow you to understand the quality of that data, know where that data's come from, and then, finally, if you serve that up inside a compelling, business-friendly experience, so that a data scientist can go to one place, quickly make a decision on if that's the right data for them, and allow them to go and be productive by building a data science model, then we're really able to move the needle on making those data science organizations efficient, allowing us to build better models to transform their business. >> Yeah, and a big part of that is, if you think about what makes Amazon successful, it's because they know where all their products are, from the vendor, to when it shows up on the doorstep. What the Catalog provides is really the similar capability of, I would call it inventory management of your data assets, where we know where the data came from, its source--in that Collect layer-- who's transformed it, who's accessed it, if they're even allowed to see it, so, data privacy policies are part of that, and then being able to just serve up that data to those users. Being able to see that whole end-to-end lineage is a key point, critical point of the ladder to AI. Especially when you start to think about things like bias detection, which is a big part of the Analyze layer. >> But one of the things we've been digging into on theCUBE is, is data the next flywheel of innovation? 
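Jay's "DNS for data" analogy can be sketched as a tiny catalog that resolves a logical dataset name to its physical location plus the metadata a data scientist needs to decide whether to trust it. Every name and field below is invented for illustration; a real catalog product would hold far richer lineage and policy metadata.

```python
# Hypothetical sketch of a data catalog as "DNS for data": one place that
# maps a logical dataset name to where it physically lives, who owns it,
# and whether it can be trusted for self-service use.

CATALOG = {
    "sales.transactions": {
        "location": "db2://prod-warehouse/sales.txn",
        "owner": "finance",
        "quality_score": 0.92,
        "contains_pii": False,
    },
    "crm.customers": {
        "location": "s3://data-lake/crm/customers/",
        "owner": "marketing",
        "quality_score": 0.75,
        "contains_pii": True,
    },
}

def resolve(name):
    """Resolve a logical dataset name to its physical location, like DNS."""
    entry = CATALOG.get(name)
    if entry is None:
        raise KeyError(f"unknown dataset: {name}")
    return entry["location"]

def trusted_datasets(min_quality=0.9):
    """Self-service view: datasets that clear a quality bar."""
    return sorted(name for name, entry in CATALOG.items()
                  if entry["quality_score"] >= min_quality)

print(resolve("sales.transactions"))  # db2://prod-warehouse/sales.txn
print(trusted_datasets())             # ['sales.transactions']
```

The payoff Jay describes falls out of the lookup: a data scientist asks the catalog, not IT, whether a dataset exists, where it lives, and whether it is fit for use.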
You know, it used to be I just had my information, many years ago we started talking about, "Okay, I need to be able to access all that other information." We hear things like 80% of the data out there isn't really searchable today. So, how do you see data, data gravity, all those pieces, as the next flywheel of innovation? >> Yeah, I think it's key. I mean, we've talked a lot about how, you can't do AI without information architecture. And it's absolutely true. And getting that view of that data in a single location, so it is like the DNS of the internet. So you know exactly where to search, you can get hold of that data, and then you've got tools that give you self-service access to actually get hold of the data without any need of support from IT to get access to it. It's really a key-- >> Yeah, but to the point you were just asking about, data gravity? I mean, being able to do this where the data resides. So, for example, we have a lot of our customers that are mergers and acquisitions. Some teams have a lot of data assets that are on-premises, others have large data lakes in AWS or Azure. How do you inventory those assets and really have a view of what you have available across that landscape? Part of what we've been focusing on this year is making our technology work across all of those clouds. And having a single view of your assets but knowing where it resides. >> So, Julie, this environment is a bit more complicated than the old data warehousing, or even what we were looking at with big data and Hadoop and all those pieces. >> Isn't that the truth? >> Help explain why we're actually going to be able to get the information, leverage and drive new business value out of data today, when we've struggled so many times in the past. 
>> Well, I think the biggest thing that's changed is the adoption of DevOps, and when I say adoption of DevOps and things like containerization and Docker containers, Kubernetes, the ability to provision data assets very quickly, no matter where they are, build these very quick value-producing applications based on AI, Artificial Intelligence APIs, is what's allowing us to take advantage of this multi-cloud landscape. If you didn't have that DevOps foundation, you'd still be building ETL jobs in data warehouses, and that was 20 years ago. Today, it's much more about these microservices-based architecture, building up these AI-- >> Well, that's the key point, and the "Fuse" part of the stack, I think, or ladder. Stack? Ladder? >> Ladder. (laughs) >> Ladder to success! Is key, because you're seeing the applications that have data native into the app, where it has to have certain characteristics, whether it's a realtime healthcare app, or retail app, and we had the retail folks on earlier, it's like, oh my god, this now has to be addressable very fast, so, the old fenced-off data warehouse-- "Hey, give me that data!"--pull it over. You need a sub-second latency, or milliseconds. So, this is now a requirement. >> That's right. >> So, how are people getting there? What are some use cases? >> Sure. I'll start with the healthcare 'cause you brought that up. One of the big use cases for technology that we provide is really around taking information that might be realtime, or batch data, and providing the ability to analyze that data very quickly in realtime to the point where you can predict when someone might potentially have a cardiac arrest. And yesterday's keynote that Rob Thomas presented, a demonstration that showed the ability to take data from a wearable device, combine it with data that's sitting in an Amazon... MySQL database, be able to predict who is the most at-risk of having a potential cardiac arrest! >> That's me! 
>> And then present that to a call center of cardiologists. So, this company that we work with, iCure, really took that entire stack, Collect, Organize, Analyze, Infuse, and built an application in a matter of six weeks. Now, that's the most compelling part. We were able to build the solution, inventory their data assets, tie it to the healthcare industry model, and predict when someone might potentially-- >> Do you have that demo on you? The device? >> Of course I do. I know, I know. So, here it is, this is called a BraveHeart Life Sensor. And essentially, it's a Bluetooth device. I know! If you put it on! (laughs) >> If I put it on, it'll track... Biometric? It'll start capturing information about your heart, ECG, and on Valentine's Day, right? My heart to yours, happy Valentine's Day to my husband, of course. The ability to be able to capture all this data here on the device, stream it to an AI engine that can then immediately classify whether or not someone has an anomaly in their ECG signal. You couldn't do that without having a complete ladder to AI capability. >> So, realtime telemetry from the heart. So, I see timing's important if you're about to have a heart attack. >> Yeah. >> Pretty important. >> And that's a great example of, you mentioned the speed. It's all about being able to capture that data in whatever form it's coming in, understand what that data is, know if you can trust that data, and then put it in the hands of the individuals that can do something valuable with the analysis from that data. >> Yeah, you have to be able to trust it. Especially-- >> So, you brought up earlier bias in data. So, I want to bring that up in context of this. This is just one example of wearables, Fitbits, all kinds of things happening. >> New sources of tech, yeah. >> In healthcare, retail, all kinds of edge, realtime, one issue is bias of data. And the other one's privacy, because now you have a new kind of data source going into the cloud.
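The streaming ECG classification described in this demo is a production system whose internals aren't public; purely as a rough illustration of the underlying idea (flag readings that deviate sharply from the recent signal), here is a minimal rolling z-score detector. The window size and threshold are arbitrary assumptions, not anything from the iCure or IBM pipeline.

```python
import math
from collections import deque

def detect_anomalies(samples, window=50, z_threshold=4.0):
    """Flag sample indices that deviate sharply from the recent rolling window."""
    recent = deque(maxlen=window)
    anomalies = []
    for i, x in enumerate(samples):
        if len(recent) == window:
            mean = sum(recent) / window
            var = sum((v - mean) ** 2 for v in recent) / window
            std = var ** 0.5 or 1e-9  # avoid dividing by zero on flat signals
            if abs(x - mean) / std > z_threshold:
                anomalies.append(i)
        recent.append(x)
    return anomalies

# A quiet sine wave standing in for a normal ECG trace, with one injected spike.
signal = [0.1 * math.sin(i / 5.0) for i in range(200)]
signal[150] = 5.0
print(detect_anomalies(signal))  # the injected spike at index 150 is flagged
```

A real classifier would be a trained model rather than a threshold, but the shape of the problem is the same: a continuous stream in, a realtime decision out.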
And then, so, this fits into what part of the ladder? So, the ladder needs a secure piece. >> Tell me about that. >> Yeah, it does. So, that really falls into that Organize piece of that ladder, the governance aspects around it. If you're going to make data available for self-service, you've got to still make sure that that data's protected, and that you're not going to go and break any kind of regulatory law around that data. So, we actually can use technology now to understand what that data is, whether it contains sensitive information, credit card numbers, and expose that information out to those consumers, yet still masking the key elements that should be protected. And that's really important, because data science is a hugely inefficient business. Data scientists are spending too much time looking for information. And worse than that, they actually don't have all the information available that they need, because certain information needs to be protected. But what we can do now is expose information that wasn't previously available, but protect just the key parts of that information, so we're still ensuring it's safe. >> That's a really key point. It's the classic iceberg, right? What you see: "Oh, data science is going to "change the game of our business!" And then when they realize what's underneath the water, it's like, all this set-up, incompatible data, dirty data, data cleaning, and then all of a sudden it just doesn't work, right? This is the reality. Are you guys seeing this? Do you see that? >> Yeah, absolutely. I think we're only just really at the beginning of a crest of a wave, here. I think organizations know they want to get to AI, the ladder to AI really helps explain and it helps to understand how they can get there. And we're able then to solve that through our technology, and help them get there and drive those efficiencies that they need. 
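The masking idea Jay describes (expose the record for self-service, protect just the sensitive parts) can be sketched in a few lines. This is a hypothetical illustration, not IBM's masking engine: the card-number pattern, the keep-last-four rule, and the field names are all invented for the example.

```python
import re

# Rough stand-in for a learned classifier: 13-16 digit runs with optional
# separators look like payment card numbers.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")

def mask_sensitive(record, sensitive_keys):
    """Return a copy of the record with sensitive values partially masked."""
    masked = {}
    for key, value in record.items():
        text = str(value)
        if key in sensitive_keys:
            # Keep the last four characters so the value stays recognizable.
            masked[key] = "*" * max(len(text) - 4, 0) + text[-4:]
        else:
            # Scrub card-like numbers hiding in free-form fields too.
            masked[key] = CARD_PATTERN.sub("[CARD MASKED]", text)
    return masked

record = {
    "patient_id": "P-1938-22",
    "notes": "paid with 4111 1111 1111 1111",
    "city": "Boston",
}
print(mask_sensitive(record, sensitive_keys={"patient_id"}))
```

The point of the sketch is the asymmetry Jay calls out: the data scientist still gets the row, but the governed fields arrive already protected.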
>> And just to add to that, I mean, now that there's more data assets available, you can't manually classify, tag and inventory all that data, determine whether or not it contains sensitive data. And that's where infusing machine learning into our products has really allowed our customers to automate the process. I mentioned, the only way that we were able to deploy this application in six weeks, is because we used a lot of the embedded machine learning to identify the patient data that was considered sensitive, tag it as patient data, and then, when the data scientists were actually building the models in that same environment, it was masked. So, they knew that they had access to the data, but they weren't allowed to see it. It's perfectly--especially with HIMSS' conference this week as well! You were talking about this there. >> Great use case with healthcare. >> Love to hear you speak about the ecosystem being built around this. Everything, open APIs, I'm guessing? >> Oh, yeah. What kind of partners are-- >> Jay, talk a little bit-- >> Yeah, so, one of the key things we're doing is ensuring that we're able to keep this stuff open. We don't want to curate a proprietary system. We're already big supporters of open source, as you know, in IBM. One of the things that we're heavily-invested in is our open metadata strategy. Open metadata is part of the open source ODPi Foundation. Project Egeria defines a standard for common metadata interchange. And what that means is that, any of these metadata systems that adopt this standard can freely share and exchange metadata across that landscape, so that wherever your data is, whichever systems it's stored in, wherever that metadata is harvested, it can play part of that network and share that metadata across those systems. >> I'd like to get your thoughts on something, Julie. You've been on the analyst side, you're now at IBM. Jay, if you can weigh in on this too, that'd be great. 
We, here, we see all the trends and go to all the events and one of the things that's popping up that's clear within the IBM ecosystem because you guys have a lot of business customers, is that a new kind of business app developer's coming in. And we've seen data science highlight the citizen data scientist, so if data is code, part of the application, and all the ladder stuff kind of falls into place, that means we're going to see new kinds of applications. So, how are you guys looking at, this is kind of a, not like the cloud-native, hardcore DevOps developer. It's the person that says, "Hey, I can innovate "a business model." I see a business model innovation that's not so much about building technology, it's about using insight and a unique... Formula or algorithm, to tweak something. That's not a lot of programming involved. 'Cause with Cloud and Cloud Private, all these back end systems, that's an ecosystem partner opportunity for you guys, but it's not your classic ISV. So, there's a new breed of business apps that we see coming, your thoughts on this? >> Yeah, it's almost like taking business process optimization as a discipline, and turning it into micro-applications. You want to be able to leverage data that's available and accessible, be able to insert that particular Artificial Intelligence machine learning algorithm to optimize that business process, and then get out of the way. Because if you try to reinvent your entire business process, culture typically gets in the way of some of these things. >> I thought, as an application value, 'cause there's value creation here, right? >> Absolutely. >> You were talking about, so, is this a new kind of genre of developer, or-- >> It really is, I mean... If you take the citizen data scientist, an example that you mentioned earlier. It's really about lowering the entry point to that technology. 
How can you allow individuals with lower levels of skills to actually get in and be productive and create something valuable? It shouldn't be just a practice that's held away for the hardcore developer anymore. It's about lowering the entry point with the set of tools. One of the things we have in Watson Studio, for example, our data science platform, is just that. It's about providing wizards and walkthroughs to allow people to develop productive use models very easily, without needing hardcore coding skills. >> Yeah, I also think, though, that, in order for these value-added applications to be built, the data has to be business-ready. That's how you accelerate these application development life cycles. That's how you get the new class of application developers productive, is making sure that they start with a business-ready foundation. >> So, how are you guys going to go after this new market? What's the marketing strategy? Again, this is like, forward-pioneering kind of things happening. What's the strategy, how are you going to enable this, what's the plan? >> Well, there's two parts of it. One is, when Jay was mentioning the Open Metadata Repository Services, our key strategy is embedding Catalog everywhere and anywhere we can. We believe that having that open metadata exchange allows us to open up access to metadata across these applications. So, really, that's first and foremost, is making sure that we can catalog and inventory data assets that might not necessarily be in the IBM Cloud, or in IBM products. That's really the first step. >> Absolutely. The second step, I would say, is really taking all of our capabilities, making them, from the ground up, microservices-enabled, delivering them through Docker containers and making sure that they can port across whatever cloud deployment model our customers want to be able to execute on. 
And being able to optimize the runtime engines, whether it's data integration, data movement, data virtualization, based on data gravity, that you had mentioned-- >> So, something like a whole new developer program opportunity to bring to the market. >> Absolutely. I mean, there is, I think there is a huge opportunity for, from an education perspective, to help our customers build these applications. But it starts with understanding the data assets, understanding what they can do with it, and using self-service-type tools that Jay was referring to. >> And all of that underpinned with the trust. If you don't trust your data, the data scientist is not going to know whether or not they're using the right thing. >> So, the ladder's great. Great way for people to figure out where they are, it's like looking in the mirror, on the organization. How early is this? What inning are we in? How do you guys see the progression? How far along are we? Obviously, you have some data, examples, some people are doing it end-to-end. What's the maturity look like? What's the uptake? >> Go ahead, Jay. >> So, I think we're at the beginning of a crest of a wave. As I say, there's been a lot of discussion so far, even if you compare this year's conference to last year's. A lot of the discussion last year was, "What's possible with AI?" This year's conference is much more about, "What are we doing with AI?" And I think we're now getting to the point where people can actually start to be productive and really start to change their business through that. >> Yeah and, just to add to that, I mean, the ladder to AI was introduced last year, and it has gained so much adoption in the marketplace and our customers, they're actually organizing their business that way. So, the Collect divisions are the database teams, are now expanding to Hadoop and Cloudera, and Hortonworks and Mongo. 
They're organizing their data governance teams around the Organize pillar, where they're doing things like data integration, data replication. So, I feel like the maturity of this ladder to AI is really enabling our customers to achieve it much faster than-- >> I was talking to Dave Vellante about this, and we're seeing that, you know, we've been covering IBM since, it's the 10th year of theCUBE, all ten years. It's been, watching the progression. The past couple of years has been setting the table, everyone seems to be pumping, it makes sense, everything's hanging together, it's in one group. Data's not one, "This group, that group," it's all, Data, AI, all Analytics, all Watson. Smart, and the ladder just allows you to understand where a customer is, and then-- >> Well, and also, we mentioned the emphasis on open source. It allows our customers to take an inventory of, what do they have, internally, with IBM assets, externally, open source, so that they can actually start to architect their information architecture, using the same kind of analogy. >> And an opportunity for developers too, great. Julie, thanks for coming on. Jay, appreciate it. >> Thank you so much for the opportunity, happy Valentine's Day! Happy Valentine's Day, we're theCUBE. I'm John Furrier, Stu Miniman here, live in San Francisco at the Moscone Center, and the whole street's shut down, Howard Street. Huge event, 30,000 people, we'll be back with more Day Four coverage after this short break.
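The "catalog everywhere, but know where it resides" strategy that runs through this conversation can be pictured as a registry that records each asset's location and tags without ever moving the data. This is a toy sketch with invented fields, not the Egeria or Watson Knowledge Catalog API:

```python
from dataclasses import dataclass, field

@dataclass
class Asset:
    name: str
    location: str            # where the data stays: "on-prem", "aws", "azure", ...
    tags: set = field(default_factory=set)

class Catalog:
    """Inventories assets across clouds; stores metadata only, never the data."""
    def __init__(self):
        self._assets = {}

    def register(self, asset):
        self._assets[asset.name] = asset

    def find(self, tag):
        """Self-service discovery: which assets carry this tag, and where they live."""
        return {a.name: a.location for a in self._assets.values() if tag in a.tags}

catalog = Catalog()
catalog.register(Asset("claims_2018", "on-prem", {"healthcare", "pii"}))
catalog.register(Asset("wearable_stream", "aws", {"healthcare"}))
print(catalog.find("healthcare"))  # both assets, each with its residence
```

The key design choice mirrors the interview: discovery answers "where does it live" rather than "give me a copy," which is what keeps the catalog honest across clouds.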

Published Date : Feb 14 2019



Rob Thomas, IBM | IBM Innovation Day 2018


 

(digital music) >> From Yorktown Heights, New York It's theCUBE! Covering IBM Cloud Innovation Day. Brought to you by IBM. >> Hi, it's Wikibon's Peter Burris again. We're broadcasting on The Cube from IBM Innovation Day at the Thomas J Watson Research Laboratory in Yorktown Heights, New York. Have a number of great conversations, and we got a great one right now. Rob Thomas, who's the General Manager of IBM Analytics, welcome back to theCUBE. >> Thanks Peter, great to see you. Thanks for coming out here to the woods. >> Oh, well it's not that bad. I actually live not to far from here. Interesting Rob, I was driving up the Taconic Parkway and I realized I hadn't been on it in 40 years, so. >> Is that right? (laugh) >> Very exciting. So Rob let's talk IBM analytics and some of the changes that are taking place. Specifically, how are customers thinking about achieving their AI outcomes. What's that ladder look like? >> Yeah. We call it the AI ladder. Which is basically all the steps that a client has to take to get to get to an AI future, is the best way I would describe it. From how you collect data, to how you organize your data. How you analyze your data, start to put machine learning into motion. How you infuse your data, meaning you can take any insights, infuse it into other applications. Those are the basic building blocks of this laddered AI. 81 percent of clients that start to do something with AI, they realize their first issue is a data issue. They can't find the data, they don't have the data. The AI ladder's about taking care of the data problem so you can focus on where the value is, the AI pieces. >> So, AI is a pretty broad, hairy topic today. What are customers learning about AI? What kind of experience are they gaining? How is it sharpening their thoughts and their pencils, as they think about what kind of outcomes they want to achieve? >> You know, its... For some reason, it's a bit of a mystical topic, but to me AI is actually quite simple. 
I'd like to say AI is not magic. Some people think it's a magical black box. You just, you know, put a few inputs in, you sit around and magic happens. It's not that, it's real work, it's real computer science. It's about, you know, how do I build models? Put models into production? Most models, when they go into production, are not that good, so how do I continually train and retrain those models? Then the AI aspect is about how do I bring human features to that? How do I integrate that with natural language, or with speech recognition, or with image recognition? So, when you get under the covers, it's actually not that mystical. It's about basic building blocks that help you start to achieve business outcomes. >> It's got to be very practical, otherwise the business has a hard time ultimately adopting it, but you mentioned a number of different... I especially like the 'add the human features' to it of the natural language. It also suggests that the skill set of AI starts to evolve as companies mature up this ladder. How is that starting to change? >> That's still one of the biggest gaps, I would say. Skill sets around the modern languages of data science that lead to AI: Python, R, Scala, as an example of a few. That's still a bit of a gap. Our focus has been how do we make tools that anybody can use. So if you've grown up doing SPSS or SAS, something like that, how do you adopt those skills for the open world of data science? That can make a big difference. On the human features point, we've actually built applications to try to make that piece easy. A great example is with Royal Bank of Scotland, where we've created a solution called Watson Assistant, which is basically how do we arm their call center representatives to be much more intelligent and engaging with clients, predicting what clients may do. Those types of applications package up the human features and the components I talked about, and make it really easy to get AI into production.
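Rob's point that models ship imperfect and improve through continual retraining can be illustrated with the simplest possible online learner: a perceptron that gets retrained each time a fresh batch of labeled data arrives. This is a generic textbook sketch, not IBM tooling; the data and learning rate are invented.

```python
class OnlinePerceptron:
    """A minimal model that can keep retraining as new labeled data arrives."""
    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        s = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if s > 0 else 0

    def train(self, batch):
        """One retraining pass; call again whenever fresh labels come in."""
        for x, y in batch:
            err = y - self.predict(x)
            self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * err

# First deployment is barely trained; each loop stands in for a retraining cycle
# as new labeled data accumulates in production.
model = OnlinePerceptron(n_features=2)
data = [([2.0, 1.0], 1), ([1.5, 2.0], 1), ([-1.0, -2.0], 0), ([-2.0, -0.5], 0)]
for _ in range(10):
    model.train(data)
print([model.predict(x) for x, _ in data])  # converges to [1, 1, 0, 0]
```

The mechanics scale up, but the loop is the same one Rob describes: deploy, measure the error, retrain, redeploy.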
>> Now, many years ago, the genius Turing noted the notion of the Turing machine, where you couldn't tell the difference between a human and a machine from an engagement standpoint. We're actually starting to see that happen in some important ways. You mentioned the call center. >> Yep. >> How are technologies and agency coming together? By that I mean, the rate at which businesses are actually applying AI to act as an agent for them in front of customers? >> I think it's slow. What I encourage clients to do is, you have to do a massive number of experiments. So don't talk to me about the one or two AI projects you're doing, I'm thinking like hundreds. I was with a bank last week in Japan, and their comment was that in the last year they've done a hundred different AI projects. These are not year-long projects with hundreds of people. It's like, let's do a bunch of small experiments. You have to be comfortable that probably half of your experiments are going to fail, and that's okay. The goal is how do you increase your win rate. Do you learn from the ones that work, and from the ones that don't work, so that you can apply those. This, to me, at this stage, is about experimentation. Any enterprise right now has to be thinking in terms of hundreds of experiments, not one, not two, or 'Hey, should we do that project?' Think in terms of hundreds of experiments. You're going to learn a lot when you do that.
It assumes that your data is actually governed. You have a data catalog or that type of capability. If you have those basics in place, once you have a single instantiation of your data, it becomes very easy to train models, and you can find that the more that you feed it, the better the model's going to get, and the better your business outcomes are going to get. That's our whole strategy around IBM Cloud Private for Data. Basically, one environment, a console for all your data, build a model here, train it on all your data, no matter where it is, it's pretty powerful. >> Let me pick up on that 'where it is,' 'cause it's becoming increasingly obvious, at least to us and our clients, that the world is not going to move all the data over to a central location. The data is going to be increasingly distributed closer to the sources, closer to where the action is. How are AI and that notion of increasingly distributed data going to work together for clients? >> So we've just released what's called IBM Data Virtualization this month, and it is a leapfrog in terms of data virtualization technology. So the idea is leave your data wherever it is; it could be in a data center, it could be in a different data center, it could be on an automobile if you're an automobile manufacturer. We can federate data from anywhere, take advantage of processing power on the edge. So we're breaking down that problem, which is the initial analytics problem: before I do this, I've got to bring all my data to one place. It's not a good use of money. It's a lot of time and it's a lot of money. So we're saying leave your data where it is, and we will virtualize your data from wherever it may be. >> That's really cool. What was it called again? >> IBM Data Virtualization, and it's part of IBM Cloud Private for Data. It's a feature in that. >> Excellent, so one last question, Rob.
February's coming up, IBM Think in San Francisco, thirty-plus thousand people. What kind of conversations do you anticipate having with your customers, your partners, as they try to learn, experiment, take away actions that they can take to achieve their outcomes? >> I want to have this AI experimentation discussion. I will be encouraging every client, let's talk about hundreds of experiments, not five. Let's talk about what we can get started on now. Technology's incredibly cheap to get started and do something, and it's all about rate and pace, and trying a bunch of things. That's what I'm going to be encouraging. The clients that you're going to see on stage there are the ones that have adopted this mentality in the last year, and they've got some great successes to show. >> Rob Thomas, general manager of IBM Analytics, thanks again for being on theCUBE. >> Thanks Peter. >> Once again, this is Peter Burris of Wikibon, from IBM Innovation Day, Thomas J Watson Research Center. We'll be back in a moment. (techno beat)
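The data virtualization idea Rob describes above (run the query where the data lives instead of copying everything to one place first) can be pictured with two in-memory SQLite databases standing in for sources in separate data centers. This is a toy illustration of federation, not the IBM Data Virtualization engine; the schema and rows are invented.

```python
import sqlite3

def make_source(rows):
    """Stand-in for a remote data source living in some data center or cloud."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE patients (id TEXT, risk REAL)")
    db.executemany("INSERT INTO patients VALUES (?, ?)", rows)
    return db

on_prem = make_source([("p1", 0.2), ("p2", 0.9)])
aws = make_source([("p3", 0.7)])

def federated_query(sources, sql):
    """Push the same query to every source and merge results; data never moves."""
    out = []
    for db in sources:
        out.extend(db.execute(sql).fetchall())
    return out

high_risk = federated_query([on_prem, aws],
                            "SELECT id, risk FROM patients WHERE risk > 0.5")
print(sorted(high_risk))  # [('p2', 0.9), ('p3', 0.7)]
```

The real product also optimizes where each piece of the query executes; the sketch only shows the contract, one query in, merged results out, with no central copy.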

Published Date : Dec 7 2018



Rob Thomas, IBM | Change the Game: Winning With AI 2018


 

>> [Announcer] Live from Times Square in New York City, it's theCUBE covering IBM's Change the Game: Winning with AI, brought to you by IBM. >> Hello everybody, welcome to theCUBE's special presentation. We're covering IBM's announcements today around AI. IBM, as theCUBE does, runs a number of sessions and programs in conjunction with Strata, which is down at the Javits, and we're here with Rob Thomas, who's the General Manager of IBM Analytics. Long-time Cube alum, Rob, great to see you. >> Dave, great to see you. >> So you guys got a lot going on today. We're here at the Westin Hotel, you've got an analyst event, you've got a partner meeting, you've got an event tonight, Change the Game: Winning with AI, at Terminal 5, check that out, ibm.com/WinWithAI, go register there. But Rob, let's start with what you guys have going on, give us the rundown. >> Yeah, it's a big week for us, and like many others, it's great when you have Strata, a lot of people in town. So, we've structured a week where, today, we're going to spend a lot of time with analysts and our business partners, talking about where we're going with data and AI. This evening, we've got a broadcast, it's called Winning with AI. What's unique about that broadcast is it's all clients. We've got clients on stage doing demonstrations, how they're using IBM technology to get to unique outcomes in their business. So I think it's going to be a pretty unique event, which should be a lot of fun. >> So this place, it looks like a cool event, a venue, Terminal 5, it's just up the street on the west side highway, probably a mile from the Javits Center, so definitely check that out. Alright, let's talk about, Rob, we've known each other for a long time, we've seen the early Hadoop days, you guys were very careful about diving in, you kind of let things settle and watched very carefully, and then came in at the right time.
But we saw the evolution of so-called Big Data go from a phase of really reducing investment, cheaper data warehousing, and what that did is allow people to collect a lot more data, and kind of get ready for this era that we're in now. But maybe you can give us your perspective on the phases, the waves that we've seen of data, and where we are today and where we're going. >> I kind of think of it as a maturity curve. So when I go talk to clients, I say, look, you need to be on a journey towards AI. I think probably nobody disagrees that they need something there, the question is, how do you get there? So you think about the steps, it's about, a lot of people started with, we're going to reduce the cost of our operations, we're going to use data to take out cost, that was kind of the Hadoop thrust, I would say. Then they moved to, well, now we need to see more about our data, we need higher-performance data, BI data warehousing. So, everybody, I would say, has dabbled in those two areas. The next leap forward is self-service analytics, so how do you actually empower everybody in your organization to use and access data? And the next step beyond that is, can I use AI to drive new business models, new levers of growth, for my business? So, I ask clients, pin yourself on this journey, most are, depends on the division or the part of the company, they're at different areas, but as I tell everybody, if you don't know where you are and you don't know where you want to go, you're just going to wind around, so I try to get them to pin down, where are you versus where do you want to go? >> So four phases, basically, the sort of cheap data store, the BI data warehouse modernization, self-service analytics, a big part of that is data science and data science collaboration, you guys have a lot of investments there, and then new business models with AI automation running on top. Where are we today?
Would you say we're kind of in-between BI/DW modernization and on our way to self-service analytics, or what's your sense? >> I'd say most are right in the middle between BI data warehousing and self-service analytics. Self-service analytics is hard, because it requires you, sometimes, to take a couple steps back and look at your data. It's hard to provide self-service if you don't have a data catalog, if you don't have data security, if you haven't gone through the processes around data governance. So, sometimes you have to take one step back to go two steps forward, and that's why I see a lot of people, I'd say, stuck in the middle right now. And the examples that you're going to see tonight as part of the broadcast are clients that have figured out how to break through that wall, and I think that's pretty illustrative of what's possible. >> Okay, so you're saying that, got to maybe take a step back and get the infrastructure right with, let's say a catalog, to give some basic things that they have to do, some x's and o's, you've got the Vince Lombardi played out here, and also, skillsets, I imagine, is a key part of that. So, that's what they've got to do to get prepared, and then, what's next? They start creating new business models, imagining this is where the chief data officer comes in and it's an executive level, what are you seeing clients as part of digital transformation, what's the conversation like with customers? >> The biggest change, the great thing about the times we live in, is technology's become so accessible, you can do things very quickly. We created a team last year called Data Science Elite, and we've hired what we think are some of the best data scientists in the world. Their only job is to go work with clients and help them get to a first success with data science. So, we put a team in.
Normally, one month, two months, normally a team of two or three people, our investment, and we say, let's go build a model, let's get to an outcome, and you can do this incredibly quickly now. I tell clients, when I see somebody that says, we're going to spend six months evaluating and thinking about this, I ask, why would you spend six months thinking about this when you could actually do it in one month? So you just need to get over the edge and go try it. >> So we're going to learn more about the Data Science Elite team. We've got John Thomas coming on today, who is a distinguished engineer at IBM, and he's very much involved in that team, and I think we have a customer who's actually gone through that, so we're going to talk about what their experience was with the Data Science Elite team. Alright, you've got some hard news coming up, you've actually made some news earlier with Hortonworks and Red Hat, I want to talk about that, but you've also got some hard news today. Take us through that.
>> So, just to remind people, you remember ODPI, folks? It was all this kerfuffle about, why do we even need this? Well, what's interesting to me about this triumvirate is, well, first of all, Red Hat and Hortonworks are hardcore open source, and IBM's always been a big supporter of open source. You three got together and you're now proving out the productivity of this relationship for customers. You guys don't talk about this, but Hortonworks noted, on its public earnings call, that the relationship with IBM drove many, many seven-figure deals, which obviously means that customers are getting value out of this, so it's great to see that come to fruition, and it wasn't just a Barney announcement a couple years ago, so congratulations on that. Now, there's this other news that you guys announced this morning, talk about that. >> Yeah, two other things. One is, we announced a relationship with Stack Overflow. 50 million developers go to Stack Overflow a month, it's an amazing environment for developers that are looking to do new things, and we're sponsoring a community around AI. Back to your point before, you said, is there a skills gap in enterprises, there absolutely is, I don't think that's a surprise. Data science, AI developers, not every company has the skills they need, so we're sponsoring a community to help drive the growth of skills in and around data science and AI. So things like Python, R, Scala, these are the languages of data science, and it's a great relationship with us and Stack Overflow to build a community to get things going on skills. >> Okay, and then there was one more. >> Last one's a product announcement. This is one of the most interesting product announcements we've had in quite a while. Imagine this: you write a SQL query, and the traditional approach is, I've got a server, I point it at that server, I get the data, it's pretty limited. We're announcing technology where I write a query, and it can find data anywhere in the world.
I think of it as wide-area SQL. So it can find data on an automotive device, a telematics device, an IoT device, it could be a mobile device, we think of it as SQL the whole world. You write a query, you can find the data anywhere it is, and we take advantage of the processing power on the edge. The biggest problem with IoT is, it's been the old mantra of, go find the data, bring it all back to a centralized warehouse, and that makes it impossible to do in real time. We're enabling real time because we can write a query once, find data anywhere, and this is technology we've had in preview for the last year. We've been working with a lot of clients to prove out use cases to do it, and we're integrating it as a capability inside of IBM Cloud Private for Data. So if you buy IBM Cloud Private for Data, it's there. >> Interesting, so when you've been around as long as I have, long enough to see some of the pendulum swings, and it's clearly a pendulum swing back toward decentralization and the edge, but the key, from what you just described, is that you're sort of redefining the boundary, so I presume it's the edge, any Cloud, or on premises, where you can find that data, is that correct? >> Yeah, so it's multi-Cloud. I mean, look, every organization is going to be multi-Cloud, like 100%, that's going to happen, and that could be private, it could be multiple public Cloud providers, but the key point is, data on the edge is not just limited to what's in those Clouds. It could be anywhere that you're collecting data. And, we're enabling an architecture which performs incredibly well, because you take advantage of processing power on the edge, where you can get data anywhere that it sits. >> Okay, so, then, I'm setting up a Cloud, I'll call it a Cloud architecture, that encompasses the edge, where essentially, there are no boundaries, and you're bringing security. We talked about containers before, we've been talking about Kubernetes all week here at a Big Data show.
And then of course, Cloud, and what's interesting, I think many of the Hadoop distro vendors kind of missed Cloud early on, and now are sort of saying, oh wow, it's a hybrid world and we've got a part to play. You guys obviously made some moves, a couple billion dollar moves, to do some acquisitions and get hardcore into Cloud, so that becomes a critical component. You're not just limiting your scope to the IBM Cloud. You're recognizing that it's a multi-Cloud world, that's what customers want to do. Your comments. >> It's multi-Cloud, and it's not just the IBM Cloud, I think the most predominant Cloud that's emerging is every client's private Cloud. Every client I talk to is building out a containerized architecture. They need their own Cloud, and they need seamless connectivity to any public Cloud that they may be using. This is why you see such a premium being put on things like data ingestion, data curation. It's not popular, it's not exciting, people don't want to talk about it, but the biggest inhibitor, to this AI point, comes back to data curation, data ingestion, because if you're dealing with multiple Clouds, suddenly your data's in a bunch of different spots.
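The wide-area query idea described above, a single query fanned out to many edge sources that each filter locally and return only matching rows, can be sketched roughly as follows. This is a hedged illustration only: the `EdgeSource` class, its method, and the telematics fields are hypothetical stand-ins, not IBM's actual API.

```python
# Illustrative sketch of "wide-area SQL": fan one query out to many
# edge data sources and merge the results centrally. EdgeSource and
# its sample data are made up for illustration.

class EdgeSource:
    """A stand-in for a device-local query engine."""
    def __init__(self, name, rows):
        self.name = name
        self.rows = rows  # list of dicts, e.g. telematics readings

    def query(self, predicate):
        # The edge does its own filtering, so only matching rows
        # travel over the network -- the point of edge processing.
        return [r for r in self.rows if predicate(r)]

def federated_query(sources, predicate):
    """Run the same predicate on every source and merge the results."""
    merged = []
    for src in sources:
        for row in src.query(predicate):
            merged.append({"source": src.name, **row})
    return merged

truck = EdgeSource("truck-17", [{"temp_c": 4.2}, {"temp_c": 9.8}])
plant = EdgeSource("plant-3", [{"temp_c": 2.1}])

hot = federated_query([truck, plant], lambda r: r["temp_c"] > 5.0)
print(hot)  # one matching row, from truck-17
```

The design point is that filtering happens at each source, so only qualifying rows cross the network, which is what makes real-time queries over distributed edge data feasible.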
>> Well, it makes sense, because you've got physics, latency, you've got economics, moving all the data into a public Cloud is expensive and just doesn't make economic sense, and then you've got things like GDPR, which says, well, you have to keep the data, certain laws of the land, if you will, that say, you've got to keep the data in whatever it is, in Germany, or whatever country. So those sort of edicts dictate how you approach managing workloads and what you put where, right? Okay, what's going on with Watson? Give us the update there. >> I get a lot of questions, people trying to peel back the onion of what exactly is it? So, I want to make that super clear here. Watson is a few things, start at the bottom. You need a runtime for models that you've built. So we have a product called Watson Machine Learning, runs anywhere you want, that is the runtime for how you execute models that you've built. Anytime you have a runtime, you need somewhere where you can build models, you need a development environment. That is called Watson Studio. So, we had a product called Data Science Experience, we've evolved that into Watson Studio, connecting in some of those features. So we have Watson Studio, that's the development environment, Watson Machine Learning, that's the runtime. Now you move further up the stack. We have a set of APIs that bring in human features, vision, natural language processing, audio analytics, those types of things. You can integrate those as part of a model that you build. And then on top of that, we've got things like Watson Applications, we've got Watson for call centers, doing customer service and chatbots, and then we've got a lot of clients who've taken pieces of that stack and built their own AI solutions. They've taken some of the APIs, they've taken some of the design time, the studio, they've taken some of the Watson Machine Learning. 
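The split described here between a development environment (Watson Studio) and a runtime (Watson Machine Learning) follows a general pattern: train a model in one place, serialize it as an artifact, and score it in a separate runtime. A minimal generic sketch of that pattern, not the actual Watson API, with made-up numbers:

```python
# A toy version of the "build here, run anywhere" split: fit a simple
# linear model in a "studio" step, serialize it, and score it in a
# separate "runtime" step. Data and model are illustrative only.

import json

# --- development environment ("studio") ---
readings = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # (x, y) pairs
n = len(readings)
mean_x = sum(x for x, _ in readings) / n
mean_y = sum(y for _, y in readings) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in readings) / \
        sum((x - mean_x) ** 2 for x, _ in readings)
model_artifact = json.dumps({"slope": slope,
                             "intercept": mean_y - slope * mean_x})

# --- runtime (conceptually a separate process in production) ---
model = json.loads(model_artifact)

def score(x):
    return model["slope"] * x + model["intercept"]

print(round(score(4.0), 2))  # 8.17
```

The artifact is the only thing that crosses the boundary, which is why the runtime can live anywhere the artifact can be shipped.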
So, it is really a stack of capabilities, and where we're driving the greatest productivity, and this is in a lot of the examples you'll see tonight from clients, is with clients that have bought into this idea of, I need a development environment, I need a runtime, where I can deploy models anywhere. We're getting a lot of momentum on that, and then that raises the question of, well, do I have explainability, do I have trust and transparency, and that's another thing that we're working on. >> Okay, so there's an API-oriented architecture, and exposing all these services makes it very easy for people to consume. Okay, so we've been talking all week at Cube NYC: has Big Data turned into AI, is this old wine in a new bottle? I mean, it's clear, Rob, from the conversation here, there's a lot of substantive innovation, and early adoption, anyway, of some of these innovations, but a lot of potential going forward. Last thoughts? >> What people have to realize is AI is not magic, it's still computer science. So it actually requires some hard work. You need to roll up your sleeves, you need to understand how I get from point A to point B, you need a development environment, you need a runtime. I want people to really think about this, it's not magic. I think for a while, people have gotten the impression that there's some magic button. There's not, but if you put in the time, and it's not a lot of time, you'll see the examples tonight, most of them have been done in one or two months, there's great business value in starting to leverage AI in your business. >> Awesome, alright, so if you're in this city or you're at Strata, go to ibm.com/WinWithAI, register for the event tonight. Rob, we'll see you there, thanks so much for coming back. >> Yeah, it's going to be fun, thanks Dave, great to see you. >> Alright, keep it right there everybody, we'll be back with our next guest right after this short break, you're watching theCUBE.

Published Date : Sep 18 2018

Sreesha Rao, Niagara Bottling & Seth Dobrin, IBM | Change The Game: Winning With AI 2018


 

>> Live, from Times Square, in New York City, it's theCUBE covering IBM's Change the Game: Winning with AI. Brought to you by IBM. >> Welcome back to the Big Apple, everybody. I'm Dave Vellante, and you're watching theCUBE, the leader in live tech coverage, and we're here covering a special presentation of IBM's Change the Game: Winning with AI. IBM's got an analyst event going on here at the Westin today in the theater district. They've got 50-60 analysts here. They've got a partner summit going on, and then tonight, at Terminal 5 on the West Side Highway, they've got a customer event, a lot of customers there. We've talked earlier today about the hard news. Seth Dobrin is here. He's the Chief Data Officer of IBM Analytics, and he's joined by Sreesha Rao, who is the Senior Manager of IT Applications at California-based Niagara Bottling. Gentlemen, welcome to theCUBE. Thanks so much for coming on. >> Thank you, Dave. >> Well, thanks Dave for having us. >> Yes, always a pleasure Seth. We've known each other for a while now. I think we met in the snowstorm in Boston, sparked something a couple years ago. >> Yep. When we were both trapped there. >> Yep, and at that time, we spent a lot of time talking about your internal role as the Chief Data Officer, working closely with Inderpal Bhandari, and what you guys are doing inside of IBM. I want to talk a little bit more about your other half, which is working with clients and the Data Science Elite Team, and we'll get into what you're doing with Niagara Bottling, but let's start there, in terms of that side of your role, give us the update. >> Yeah, like you said, we spent a lot of time talking about how IBM is implementing the CDO role.
While we were doing that internally, I spent quite a bit of time flying around the world, talking to our clients over the last 18 months since I joined IBM, and we found a consistent theme with all the clients, in that they needed help learning how to implement data science, AI, machine learning, whatever you want to call it, in their enterprise. There's a fundamental difference between doing these things at a university or as part of a Kaggle competition and doing them in an enterprise, so we felt really strongly that it was important for the future of IBM that all of our clients become successful at it, because what we don't want is, in two years, for them to go "Oh my God, this whole data science thing was a scam. We haven't made any money from it." And it's not because the data science thing is a scam. It's because the way they're doing it is not conducive to business, and so we set up this team we call the Data Science Elite Team, and what this team does is we sit with clients around a specific use case for 30, 60, 90 days, it's really about 3 or 4 sprints, depending on the material, the client, and how long it takes, and we help them learn through this use case, how to use Python, R, Scala in our platform obviously, because we're here to make money too, to implement these projects in their enterprise. Now, because it's written in completely open-source, if they're not happy with what the product looks like, they can take their toys and go home afterwards. It's on us to prove the value as part of this, but there's a key point here. My team is not measured on sales. They're measured on adoption of AI in the enterprise, and so it creates a different behavior for them. So they're really about "Make the enterprise successful," right, not "Sell this software." >> Yeah, compensation drives behavior. >> Yeah, yeah. >> So, at this point, I ask, "Well, do you have any examples?" so Sreesha, let's turn to you.
(laughing softly) Niagara Bottling -- >> As a matter of fact, Dave, we do. (laughing) >> Yeah, so you're not a bank with a trillion dollars in assets under management. Tell us about Niagara Bottling and your role. >> Well, Niagara Bottling is the biggest private label bottled water manufacturing company in the U.S. We make bottled water for Costco, Walmart, and major national grocery retailers. These are our customers whom we service, and as with all large customers, they're demanding, and we provide bottled water at relatively low cost and high quality. >> Yeah, so I used to have a CIO consultancy. We worked with every CIO up and down the East Coast. We really got into a lot of organizations, and I always observed that it was really the heads of Applications that drove AI, because they were the glue between the business and IT, and that's really where you sit in the organization, right? >> Yes. My role is to support the business and business analytics, as well as some of the distribution technologies and planning technologies at Niagara Bottling. >> So take us through the project if you will. What were the drivers? What were the outcomes you envisioned? And we can kind of go through the case study. >> So the current project that we leveraged IBM's help on was a stretch wrapper project. Each pallet that we produce--- we produce obviously cases of bottled water. These are stacked into pallets and then shrink wrapped or stretch wrapped with a stretch wrapper, and this project is to be able to save money by trying to optimize the amount of stretch wrap that goes around a pallet. We need to be able to maintain the structural stability of the pallet while it's transported from the manufacturing location to our customer's location, where it's unwrapped and then the cases are used. >> And over breakfast we were talking. You guys produce 2833 bottles of water per second. >> Wow. (everyone laughs)
The manufacturing line is a high speed manufacturing line, and we have a lights-out policy where everything runs in an automated fashion with raw materials coming in from one end and the finished goods, pallets of water, going out. It's called pellets to pallets. Pellets of plastic coming in through one end and pallets of water going out through the other end. >> Are you sitting on top of an aquifer? Or are you guys using sort of some other techniques? >> Yes, in fact, we do bore wells and extract water from the aquifer. >> Okay, so the goal was to minimize the amount of material that you used but maintain its stability? Is that right? >> Yes, during transportation, yes. So if we use too much plastic, we're not optimally, I mean, we're wasting material, and cost goes up. We produce almost 16 million pallets of water every single year, so that's a lot of shrink wrap that goes around those, so what we can save in terms of maybe 15-20% of shrink wrap costs will amount to quite a bit. >> So, how does machine learning fit into all of this? >> So, machine learning is way to understand what kind of profile, if we can measure what is happening as we wrap the pallets, whether we are wrapping it too tight or by stretching it, that results in either a conservative way of wrapping the pallets or an aggressive way of wrapping the pallets. >> I.e. too much material, right? >> Too much material is conservative, and aggressive is too little material, and so we can achieve some savings if we were to alternate between the profiles. >> So, too little material means you lose product, right? >> Yes, and there's a risk of breakage, so essentially, while the pallet is being wrapped, if you are stretching it too much there's a breakage, and then it interrupts production, so we want to try and avoid that. We want a continuous production, at the same time, we want the pallet to be stable while saving material costs. 
>> Okay, so you're trying to find that ideal balance, and how much variability is in there? Is it a function of distance and how many touches it has? Maybe you can share that with us. >> Yes, so each pallet takes about 16-18 wraps of the stretch wrapper going around it, and that's how much material is laid out. About 250 grams of plastic that goes on there. So we're trying to optimize the gram weight, which is the amount of plastic that goes around each pallet. >> So it's about predicting how much plastic is enough without having breakage and disrupting your line. So they had labeled data that was, "if we stretch it this much, it breaks. If we don't stretch it this much, it doesn't break," but then it was about predicting what's good enough, avoiding both of those extremes, right? >> Yes. >> So it's a truly predictive and iterative model that we've built with them. >> And, you're obviously injecting data in terms of the trip to the store as well, right? You're taking that into consideration in the model, right? >> Yeah, that's mainly to make sure that the pallets are stable during transportation. >> Right. >> And we've already determined how much containment force is required when you stretch and wrap each pallet. So that's one of the variables that is measured, but the inputs and outputs are-- the input is the amount of material that is being used in terms of gram weight. We are trying to minimize that. So that's what the whole machine learning exercise was. >> And the data comes from where? Is it observation, maybe instrumented? >> Yeah, the instruments. Our stretch-wrapper machines have an Ignition platform, which is a SCADA platform that allows us to measure all of these variables. We would be able to get machine variable information from those machines and then be able to hopefully, one day, automate that process, so the feedback loop that says "On this profile, we've not had any breaks.
We can continue," or if there have been frequent breaks on a certain profile or machine setting, then we can change that dynamically as the product is moving through the manufacturing process. >> Yeah, so think of it as, it's kind of a traditional manufacturing production line optimization and prediction problem, right? It's minimizing waste, right, while maximizing the output and then throughput of the production line. When you optimize a production line, the first step is to predict what's going to go wrong, and then the next step would be prescriptive optimization to say "How do we maximize? Using the constraints that the predictive models give us, how do we maximize the output of the production line?" This is not a unique situation. It's a unique material that we haven't really worked with, but they had some really good data on this material, how it behaves, and that's key, as you know, Dave, and probably most of the people watching this know, labeled data is the hardest part of doing machine learning, and building those features from that labeled data, and they had some great data for us to start with. >> Okay, so you're collecting data at the edge essentially, then you're using that to feed the models, which is running, I don't know, where's it running, your data center? Your cloud? >> Yeah, in our data center, there's an instance of DSX Local. >> Okay. >> That we stood up. Most of the data is running through that. We build the models there. And then our goal is to be able to deploy to the edge where we can complete the loop in terms of the feedback that happens. >> And iterate. (Sreesha nods) >> And DSX Local, is Data Science Experience Local? >> Yes. >> Slash Watson Studio, so they're the same thing. >> Okay now, what role did IBM and the Data Science Elite Team play? You could take us through that. >> So, as we discussed earlier, adopting data science is not that easy. It requires subject-matter expertise.
It requires understanding of data science itself, the tools and techniques, and IBM brought that as a part of the Data Science Elite Team. They brought both the tools and the expertise so that we could get on that journey towards AI. >> And it's not a "do the work for them." It's a "teach them to fish" approach, and so my team sat side by side with the Niagara Bottling team, and we walked them through the process, so it's not a consulting engagement in the traditional sense. It's how do we help them learn how to do it? So it's side by side with their team. Our team sat there and walked them through it. >> For how many weeks? >> We've had about two sprints already, and we're entering the third sprint. It's been about 30-45 days between sprints. >> And you have your own data science team. >> Yes. Our team is coming up to speed using this project. They've been trained, but they needed help from people who have done this, been there, and have handled some of the challenges of modeling and data science. >> So it accelerates that time to --- >> Value. >> Outcome and value, and there is a knowledge transfer component -- >> Yes, absolutely. >> It's occurring now, and I guess it's ongoing, right? >> Yes. The engagement is unique in the sense that IBM's team came to our factory and understood what that process, the stretch-wrap process, looks like, so they had an understanding of the physical process and how it's modeled with the help of the variables, and understood the data science modeling piece as well. Once they know both sides of the equation, they can help put the physical problem and the digital equivalent together, and then be able to correlate why things are happening with the appropriate data that supports the behavior. >> Yeah, and then the constraints are the one use case and up to 90 days, and there's no charge for those two. Like I said, it's paramount that our clients like Niagara know how to do this successfully in their enterprise. >> It's a freebie? >> No, it's no charge.
Free makes it sound too cheap. (everybody laughs) >> But it's part of obviously a broader arrangement with buying hardware and software, or whatever it is. >> Yeah, it's a strategy for us to help make sure our clients are successful, and I want to minimize the activation energy to do that, so there's no charge, and the only requirements from the client are that it's a real use case, that they at least match the resources I put on the ground, and that they sit with us and do things like this and act as a reference and talk about the team and our offerings and their experiences. >> So you've got to have skin in the game obviously, be an IBM customer. There's got to be some commitment for some kind of business relationship. How big was the collective team for each, if you will? >> So IBM had 2-3 data scientists. (Dave takes notes) Niagara matched that, 2-3 analysts. There were some working with the machines who were familiar with the machines, and others who were more familiar with the data acquisition and data modeling. >> So each of these engagements, they cost us about $250,000 all in, so they're quite an investment we're making in our clients. >> I bet. I mean, 2-3 people over many, many weeks of super-geek time. So you're bringing in hardcore data scientists, math whizzes, stats whizzes, data hackers, developer--- >> Data viz people, yeah, the whole stack. >> And the level of skills that Niagara has? >> We've got actual employees who are responsible for production, our manufacturing analysts who help in troubleshooting problems. If there are breakages, they go analyze why that's happening. Now they have data to tell them what to do about it, and that's the whole journey that we are on, in trying to quantify with the help of data, and be able to connect our systems with data, systems and models that help us analyze what happened and why it happened and what to do before it happens. >> Your team must love this because
They're working with rock star data scientists. >> Yes. >> And we've talked about this before. A point that was made here is that it's really important in these projects to have people acting as product owners, if you will, subject matter experts that are on the front line, that do this every day, not just for the subject matter expertise. I'm sure there are executives that understand it, but when you're done with the model, bringing it to the floor and talking to their peers about it, there's no better way to drive this cultural change of adopting these things than having one of your peers that you respect talk about it instead of some guy or lady sitting up in the ivory tower saying "thou shalt." >> Now you don't know the outcome yet. It's still early days, but you've got a model built that you've got confidence in, and then you can iterate that model. What's your expectation for the outcome? >> We're hoping that preliminary results help us get up the learning curve of data science and how to leverage data to be able to make decisions. So that's our idea. There are obviously optimal settings that we can use, but it's going to be a trial-and-error process. And through that, as we collect data, we can understand what settings are optimal and what we should be using in each of the plants. And if the plants decide, hey, they have a subjective preference for one profile versus another, then with the data we are capturing we can measure when they deviate from what we specified. We have a lot of learning coming from the approach that we're taking. You can't control things if you don't measure them first. >> Well, your objectives are to transcend this one project and to do the same thing across. >> And to do the same thing across, yes. >> Essentially pay for it, with a quick return. That's the way to do things these days, right? >> Yes.
>> You've got more narrow, small projects that'll give you a quick hit, and then leverage that expertise across the organization to drive more value. >> Yes. >> Love it. What a great story, guys. Thanks so much for coming to theCUBE and sharing. >> Thank you. >> Congratulations. You must be really excited. >> No, it's a fun project. I appreciate it. >> Thanks for having us, Dave. I appreciate it. >> Pleasure, Seth. Always great talking to you, and keep it right there everybody. You're watching theCUBE. We're live from New York City here at the Westin Hotel. #cubenyc Check out ibm.com/winwithai for Change the Game: Winning with AI tonight. We'll be right back after a short break. (minimal upbeat music)

Published Date : Sep 13 2018

John Thomas, IBM | Change the Game: Winning With AI


 

(upbeat music) >> Live from Times Square in New York City, it's The Cube. Covering IBM's Change the Game: Winning with AI. Brought to you by IBM. >> Hi everybody, welcome back to The Big Apple. My name is Dave Vellante. We're here in the Theater District at The Westin Hotel covering a special Cube event. IBM's got a big event today and tonight, if we can pan here to this pop-up. Change the Game: Winning with AI. So IBM has got an event here at The Westin, The Tide at Terminal 5, which is right up the Westside Highway. Go to IBM.com/winwithAI. Register, you can watch it online, or if you're in the city come down and see us, we'll be there. We have a bunch of customers who will be there. We had Rob Thomas on earlier, he's kind of the host of the event. IBM does these events periodically throughout the year. They gather customers, they put forth some thought leadership, talk about some hard news. So, we're very excited to have John Thomas here, he's a distinguished engineer and Director of IBM Analytics, long time Cube alum, great to see you again, John. >> Same here. Thanks for coming on. >> Great to have you. >> So we just heard a great case study with Niagara Bottling around the Data Science Elite Team, that's something that you've been involved in, and we're going to get into that. But give us the update since we last talked, what have you been up to? >> Sure, sure. So we're living and breathing data science these days. So the Data Science Elite Team, we are a team of practitioners. We actually work collaboratively with clients. And I stress the word collaboratively, because we're not there to just go do some work for a client. We actually sit down, expect the client to put their team to work with our team, and we build AI solutions together. We scope use cases, but also, you know, expose them to expertise, tools, techniques, and do this together, right. And we've been very busy, (laughs) I can tell you that. You know, it has been a lot of travel around the world.
A lot of interest in the program. And engagements that bring us very interesting use cases. You know, use cases that you would expect to see, use cases that are, hmmm, I had not thought of a use case like that. You know, but it's been an interesting journey in the last six, eight months now. >> And these are pretty small, agile teams. >> Sometimes people >> Yes. >> use tiger teams, and they're like two-pizza teams, right? >> Yeah. >> And my understanding is you bring some number of resources, call it two or three data scientists, >> Yes. >> and the customer matches that resource, right? >> Exactly. That's the prerequisite. >> That is the prerequisite, because we're not there to just do the work for the client. We want to do this in a collaborative fashion, right. So, the customer's Data Science Team is learning from us, we are working with them hand in hand to build a solution out. >> And that's got to resonate well with customers.
I think I may have mentioned this on one of our previous discussions, Dave. You know, customer interactions, trying to improve customer interactions is something that cuts across industry, right. Now that can be across different channels. One of the most prominent channels is a call center, I think we have talked about this previously. You know, I hate calling into a call center (laughter) because I don't know >> Yeah, yeah. >> what kind of support I'm going to get. But, what if you could equip the call center agents to provide consistent service to the caller, and handle the calls in the best appropriate way. Reducing costs on the business side, because call handling is expensive. And eventually lead up to, can I even avoid the call, through insights on why the call is coming in in the first place. So this use case cuts across industry. Any enterprise that has got a call center is doing this. So we are looking at, can we apply machine-learning techniques to understand dominant topics in the conversation. Once we understand these with unsupervised techniques, once we understand the dominant topics in the conversation, can we drill into that and understand what are the intents, and does the intent change as the conversation progresses? So you know, I'm calling someone, it starts off with pleasantries, it then goes into weather, how are the kids doing? You know, complain about life in general. But then you get to something of substance, why the person was calling in the first place. And then you may think that is the intent of the conversation, but you find that as the conversation progresses, the intent might actually change. And can you understand that real time? Can you understand the reasons behind the call, so that you could take proactive steps to maybe avoid the call coming in in the first place? This use case, Dave, you know we are seeing so much interest in this use case. Because call centers are a big cost to most enterprises.
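The dominant-topic discovery described here is normally done with unsupervised topic modeling (for example, LDA or clustering over TF-IDF vectors of the transcripts). The toy sketch below only surfaces the most frequent non-stopword term per call; every transcript and the stopword list are invented for illustration.

```python
# Toy illustration of "dominant topic" discovery over call transcripts.
# A real pipeline would use unsupervised topic modeling (e.g., LDA over
# TF-IDF vectors); this sketch just surfaces the most frequent
# non-stopword term per call. All transcripts here are invented.
from collections import Counter

STOPWORDS = {"i", "my", "the", "a", "to", "is", "was", "and", "about", "me", "never"}

def dominant_topic(transcript):
    words = [w for w in transcript.lower().split() if w not in STOPWORDS]
    return Counter(words).most_common(1)[0][0]

calls = [
    "my payment payment is late and i want to discuss payment",
    "the weather was bad and my delivery delivery never arrived",
]
print([dominant_topic(c) for c in calls])  # → ['payment', 'delivery']
```

The real value comes from doing this over millions of calls, so the "topics" emerge from the data rather than from a hand-built keyword list like this one.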
>> Let's double down on that because I want to understand this. So you're basically doing, so every time you call a call center, this call may be recorded, >> (laughter) Yeah. For quality of service. >> Yeah. So you're recording the calls, maybe using NLP to transcribe those calls. >> NLP is just the first step, >> Right. >> so you're absolutely right, when a call comes in there's already call recording systems in place. We're not getting into that space, right. So call recording systems record the voice calls. So often in offline batch mode you can take these millions of calls, pass it through a speech-to-text mechanism, which produces a text equivalent of the voice recordings. Then what we do is we apply unsupervised machine learning, and clustering, and topic-modeling techniques against it to understand what are the dominant topics in this conversation. >> You do kind of an entity extraction of those topics. >> Exactly, exactly, exactly. Then we find what is the most relevant, what are the relevant ones, what is the relevancy of topics in a particular conversation. That's not enough, that is just step two, if you will. Then you have to, we build what is called an intent hierarchy. So at the topmost level this will be, let's say, payments, the call is about payments. But what about payments, right? Is it an intent to make a late payment? Or is the intent to avoid the payment or contest a payment? Or is the intent to structure a different payment mechanism? So can you get down to that level of detail? Then comes a further level of detail, which is the reason that is tied to this intent. What is the reason for a late payment? Is it a job loss or job change? Is it because they are just not happy with the charges that I have coming? What is the reason? And the reason can be pretty complex, right? It may not be in the immediate vicinity of the snippet of conversation itself. So you got to go find out what the reason is and see if you can match it to this particular intent.
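The topic → intent → reason hierarchy can be pictured with a deliberately simplified, rule-based stand-in for the supervised classifiers described here. All of the labels and cue words below are hypothetical, and a production system would train a classifier per level rather than match keywords.

```python
# Toy sketch of the intent hierarchy idea: topic -> intent -> reason.
# A production system would train supervised classifiers per level;
# here simple keyword cues stand in, and all cue words are invented.
INTENT_CUES = {
    "make_late_payment":   {"late", "extension", "extra", "time"},
    "contest_payment":     {"dispute", "wrong", "overcharged"},
    "restructure_payment": {"installments", "plan", "restructure"},
}

REASON_CUES = {
    "job_change":           {"job", "laid", "unemployed"},
    "unhappy_with_charges": {"fees", "expensive", "charges"},
}

def classify(snippet, cues):
    words = set(snippet.lower().split())
    # Pick the label whose cue set overlaps the snippet the most.
    best = max(cues, key=lambda label: len(words & cues[label]))
    return best if words & cues[best] else "unknown"

call = "i was laid off from my job and need extra time to pay late"
print(classify(call, INTENT_CUES), classify(call, REASON_CUES))
# → make_late_payment job_change
```

Note that the reason ("job_change") is recovered from a different part of the snippet than the intent, which mirrors the point that the reason may not sit right next to the intent in the conversation.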
So multiple steps of the journey, and eventually what we want to do is, so we do all of this in an offline batch mode, and we are building a series of classifiers, sets of classifiers. But eventually we want to get this to real time action. So think of this, if you have machine learning models, supervised models that can predict the intent, the reasons, et cetera, you can have them deployed, operationalize them, so that when a call comes in real time, you can screen it in real time, do the speech-to-text, pass it to the supervised models that have been deployed, and the model fires and comes back and says this is the intent, take some action or guide the agent to take some action real time. >> Based on some automated discussion, so tell me what you're calling about, that kind of thing, >> Right. >> Is that right? >> So it's probably even gone past tell me what you're calling about. So it could be the conversation has begun to get into, you know, I'm going through a tough time, my spouse had a job change. You know, that is itself an indicator of some other reasons, and can that be used to prompt the CSR >> Ah, to take some action >> Ah, okay. >> appropriate to the conversation. >> So I'm not talking to a machine, at first, >> No, no. >> I'm talking to a human. >> Still talking to a human. >> And then real time feedback to that human >> Exactly, exactly. >> is a good example of >> Exactly. >> human augmentation. >> Exactly, exactly. >> I wanted to go back to the process a little bit, in terms of the model building. Are there humans involved in calibrating the model? >> There has to be. Yeah, there has to be. So you know, for all the hype in the industry, (laughter) you still need a human. (laughter) You know what it is, is you need expertise to look at what these models produce, right. Because if you think about it, machine learning algorithms don't by themselves have an understanding of the domain.
They are, you know, either statistical or similar in nature, so somebody has to marry the statistical observations with the domain expertise. So humans are definitely involved in the building of these models and the training of these models. >> Okay. >> (inaudible) >> So there you go, you got math, you got stats, you got some coding involved, and you >> Absolutely. >> got humans as the last mile >> Absolutely. >> to really bring that >> Absolutely. >> expertise. And then in terms of operationalizing it, how does that actually get done? What's the tech behind that? >> Ah, yeah. It's a very good question, Dave. You build models, and what good are they if they stay inside your laptop, you know, they don't go anywhere. What you need to do is, I use a phrase, weave these models into your business processes and your applications. So you need a way to deploy these models. The models should be consumable from your business processes. Now it could be a REST API call to a model. In some cases a REST API call is not sufficient, the latency is too high. Maybe you've got to embed that model right into where your application is running. You know, you've got data on a mainframe. A credit card transaction comes in, and the authorization for the credit card is happening in a four millisecond window on the mainframe in, not all, but you know, CICS COBOL code. I don't have the time to make a REST API call outside. I got to have the model execute in context with my CICS COBOL code in that memory space. >> Yeah, right. >> You know, so the operationalizing is deploying, consuming these models, and then beyond that, how do the models behave over time? Because you can have the best programmer, the best data scientist build the absolute best model, which has got great accuracy, great performance today. Two weeks from now, performance is going to go down. >> Hmm. >> How do I monitor that? How do I trigger an alert when it falls below a certain threshold?
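That monitoring question — watch a deployed model's accuracy and trigger something when it drops below a threshold — can be sketched as a sliding window over scored outcomes. The window size and threshold below are arbitrary illustration values, not recommendations.

```python
# Sketch of model-performance monitoring: track accuracy over a sliding
# window of scored predictions and raise a retrain flag below a threshold.
# Window size and threshold here are illustrative, not recommendations.
from collections import deque

class DriftMonitor:
    def __init__(self, window=100, threshold=0.85):
        self.window = deque(maxlen=window)
        self.threshold = threshold

    def record(self, prediction, actual):
        """Log one outcome; return True when retraining should trigger."""
        self.window.append(prediction == actual)
        accuracy = sum(self.window) / len(self.window)
        return len(self.window) == self.window.maxlen and accuracy < self.threshold

monitor = DriftMonitor(window=4, threshold=0.75)
outcomes = [("a", "a"), ("a", "a"), ("b", "a"), ("b", "a")]  # accuracy falls to 0.5
print([monitor.record(p, y) for p, y in outcomes])  # → [False, False, False, True]
```

In practice the trigger would kick off a retraining pipeline with the newly labeled data, which is exactly the "system in place that retrains this model" point that follows.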
And, can I have a system in place that retrains this model with new data as it comes in? >> So you got to understand where the data lives. >> Absolutely. >> You got to understand the physics, >> Yes. >> the latencies involved. >> Yes. >> You got to understand the economics. >> Yes. >> And there's also, probably, in many industries, legal implications. >> Oh yes. Now, the explainability of models. You know, can I prove that there is no bias here? >> Right. >> Now all of these are challenging but, you know, doable things. >> What makes a successful engagement? Obviously you guys are outcome driven, >> Yeah. >> but talk about how you guys measure success. >> So, um, for our team right now it is not about revenue, it's purely about adoption. Does the client, does the customer see the value of what IBM brings to the table? This is not just tools and technology, by the way. It's also expertise, right? >> Hmm. >> So it's this notion of expertise as a service, which is coupled with tools and technology, to build a successful engagement. The way we measure success is, has the client, have we built out the use case in a way that is useful for the business? Two, does the client see value in going further with that? So this is right now what we look at. It's not, you know, yes, of course everybody cares about revenue. But that is not our key metric. Now in order to get there though, what we have found, with a little bit of hard work, yes, is you need different constituents of the customer to come together. It's not just me sending a bunch of awesome Python programmers to the client. >> Yeah, right. >> Now, from the customer's side we need involvement from their Data Science Team. We talk about collaborating with them. We need involvement from their line of business. Because if the line of business doesn't care about the models we've produced, you know, what good are they? >> Hmm. >> And third, people don't usually think about it, we need IT to be part of the discussion.
Not just part of the discussion, part of being a stakeholder. >> Yes, so you've got, so IBM has the chops to actually bring these constituents together. >> Ya. >> I have actually a fair amount of experience in herding cats in large organizations. (laughter) And you know, the customer, they've got skin in the game. This is to me a big differentiator between IBM and certainly some of the other technology suppliers who don't have the depth of services, expertise, and domain expertise. But on the flip side of that, differentiation from many of the SIs who have that level of global expertise, but they don't have the tech piece. >> Right. >> Now they would argue, well, we do anybody's tech. >> Ya. >> But you know, if you've got tech. >> Ya. >> You just got to (laughter) >> Ya. >> Bring those two together. >> Exactly. >> And that really seems to me to be the big differentiator >> Yes, absolutely. >> for IBM. Well John, thanks so much for stopping by theCube and explaining sort of what you've been up to, the Data Science Elite Team, very exciting. Six to nine months in, >> Yes. >> are you declaring success yet? Still too early? >> Uh, well, we're declaring success and we are growing. >> Ya. Growth is good. >> A lot of, lot of attention. >> Alright, great to see you again, John. >> Absolutely, thank you, Dave. Thanks very much. Okay, keep it right there everybody. You're watching theCube. We're here at The Westin in midtown, and we'll be right back after this short break. I'm Dave Vellante. (tech music)

Published Date : Sep 13 2018


Rob Thomas, IBM | Change the Game: Winning With AI


 

>> Live from Times Square in New York City, it's The Cube covering IBM's Change the Game: Winning with AI, brought to you by IBM. >> Hello everybody, welcome to The Cube's special presentation. We're covering IBM's announcements today around AI. IBM, as The Cube does, runs sessions and programs in conjunction with Strata, which is down at the Javits, and we're here with Rob Thomas, who's the General Manager of IBM Analytics. Long time Cube alum, Rob, great to see you. >> Dave, great to see you. >> So you guys got a lot going on today. We're here at the Westin Hotel, you've got an analyst event, you've got a partner meeting, you've got an event tonight, Change the Game: Winning with AI, at Terminal 5, check that out, ibm.com/WinWithAI, go register there. But Rob, let's start with what you guys have going on, give us the run down. >> Yeah, it's a big week for us, and like many others, it's great when you have Strata, a lot of people in town. So, we've structured a week where, today, we're going to spend a lot of time with analysts and our business partners, talking about where we're going with data and AI. This evening, we've got a broadcast, it's called Winning with AI. What's unique about that broadcast is it's all clients. We've got clients on stage doing demonstrations, how they're using IBM technology to get to unique outcomes in their business. So I think it's going to be a pretty unique event, which should be a lot of fun. >> So it looks like a cool event, the venue, Terminal 5, it's just up the street on the west side highway, probably a mile from the Javits Center, so definitely check that out. Alright, let's talk about, Rob, we've known each other for a long time, we've seen the early Hadoop days, you guys were very careful about diving in, you kind of let things settle and watched very carefully, and then came in at the right time.
But we saw the evolution of so-called Big Data go from a phase of really reducing investments, cheaper data warehousing, and what that did is allowed people to collect a lot more data, and kind of get ready for this era that we're in now. But maybe you can give us your perspective on the phases, the waves that we've seen of data, and where we are today and where we're going. >> I kind of think of it as a maturity curve. So when I go talk to clients, I say, look, you need to be on a journey towards AI. I think probably nobody disagrees that they need something there, the question is, how do you get there? So you think about the steps: a lot of people started with, we're going to reduce the cost of our operations, we're going to use data to take out cost, that was kind of the Hadoop thrust, I would say. Then they moved to, well, now we need to see more about our data, we need higher performance data, BI data warehousing. So, everybody, I would say, has dabbled in those two areas. The next leap forward is self-service analytics, so how do you actually empower everybody in your organization to use and access data? And the next step beyond that is, can I use AI to drive new business models, new levers of growth, for my business? So, I ask clients, pin yourself on this journey. Most are, depends on the division or the part of the company, they're at different areas, but as I tell everybody, if you don't know where you are and you don't know where you want to go, you're just going to wind around, so I try to get them to pin down, where are you versus where do you want to go?
Would you say we're kind of in-between BI/DW modernization and on our way to self-service analytics, or what's your sense? >> I'd say most are right in the middle between BI data warehousing and self-service analytics. Self-service analytics is hard, because it requires you, sometimes, to take a couple steps back and look at your data. It's hard to provide self-service if you don't have a data catalog, if you don't have data security, if you haven't gone through the processes around data governance. So, sometimes you have to take one step back to go two steps forward, that's why I see a lot of people, I'd say, stuck in the middle right now. And the examples that you're going to see tonight as part of the broadcast are clients that have figured out how to break through that wall, and I think that's pretty illustrative of what's possible. >> Okay, so you're saying that you've got to maybe take a step back and get the infrastructure right with, let's say, a catalog; there are some basic things that they have to do, some x's and o's, you've got the Vince Lombardi playbook out here, and also, skillsets, I imagine, is a key part of that. So, that's what they've got to do to get prepared, and then, what's next? They start creating new business models, and I imagine this is where the chief data officer comes in, at an executive level. What are you seeing with clients as part of digital transformation, what's the conversation like with customers? >> The biggest change, the great thing about the times we live in, is technology's become so accessible, you can do things very quickly. We created a team last year called Data Science Elite, and we've hired what we think are some of the best data scientists in the world. Their only job is to go work with clients and help them get to a first success with data science. So, we put a team in.
Normally, one month, two months, normally a team of two or three people, our investment, and we say, let's go build a model, let's get to an outcome, and you can do this incredibly quickly now. I tell clients, when I see somebody that says, we're going to spend six months evaluating and thinking about this, I'm like, why would you spend six months thinking about this when you could actually do it in one month? So you just need to get over the edge and go try it. >> So we're going to learn more about the Data Science Elite team. We've got John Thomas coming on today, who is a distinguished engineer at IBM, and he's very much involved in that team, and I think we have a customer who's actually gone through that, so we're going to talk about what their experience was with the Data Science Elite team. Alright, you've got some hard news coming up, you've actually made some news earlier with Hortonworks and Red Hat, I want to talk about that, but you've also got some hard news today. Take us through that. >> Yeah, let's talk about all three. First, Monday we announced the expanded relationship with both Hortonworks and Red Hat. This goes back to one of the core beliefs I talked about: every enterprise is modernizing their data and application estates, I don't think there's any debate about that. We are big believers in Kubernetes and containers as the architecture to drive that modernization. The announcement on Monday was, we're working closer with Red Hat to take all of our data services as part of Cloud Private for Data, which are basically microservices for data, and we're running those on OpenShift, and we're starting to see great customer traction with that. And where does Hortonworks come in? Hadoop has been the outlier on moving to microservices and containers, and we're working with Hortonworks to help them make that move as well. So, it's really about the three of us getting together and helping clients with this modernization journey.
>> So, just to remind people, you remember ODPI, folks? It was all this kerfuffle about, why do we even need this? Well, what's interesting to me about this triumvirate is, well, first of all, Red Hat and Hortonworks are hardcore open source, and IBM's always been a big supporter of open source. You three got together and you're proving now the productivity for customers of this relationship. You guys don't talk about this, but Hortonworks had to, on its public call: the relationship with IBM drove many, many seven-figure deals, which obviously means that customers are getting value out of this, so it's great to see that come to fruition, and it wasn't just a Barney announcement a couple years ago, so congratulations on that. Now, there's this other news that you guys announced this morning, talk about that. >> Yeah, two other things. One is, we announced a relationship with Stack Overflow. 50 million developers go to Stack Overflow a month, it's an amazing environment for developers that are looking to do new things, and we're sponsoring a community around AI. Back to your point before, you asked, is there a skills gap in enterprises? There absolutely is, I don't think that's a surprise. Data science, AI developers, not every company has the skills they need, so we're sponsoring a community to help drive the growth of skills in and around data science and AI. So things like Python, R, Scala, these are the languages of data science, and it's a great relationship with us and Stack Overflow to build a community to get things going on skills. >> Okay, and then there was one more. >> Last one's a product announcement. This is one of the most interesting product announcements we've had in quite a while. Imagine this: you write a SQL query, and the traditional approach is, I've got a server, I point it at that server, I get the data, it's pretty limited. We're announcing technology where I write a query, and it can find data anywhere in the world.
I think of it as wide-area SQL. So it can find data on an automotive device, a telematics device, an IoT device, it could be a mobile device, we think of it as SQL for the whole world. You write a query, you can find the data anywhere it is, and we take advantage of the processing power on the edge. The biggest problem with IoT is, it's been the old mantra of, go find the data, bring it all back to a centralized warehouse, and that makes it impossible to do it in real time. We're enabling real time because we can write a query once and find data anywhere. This is technology we've had in preview for the last year. We've been working with a lot of clients to prove out use cases with it, and we're integrating it as a capability inside of IBM Cloud Private for Data. So if you buy IBM Cloud Private for Data, it's there. >> Interesting, so when you've been around as long as I have, long enough to see some of the pendulum swings, it's clearly a pendulum swing back toward decentralization and the edge, but the key, from what you just described, is that you're sort of redefining the boundary, so I presume it's the edge, any Cloud, or on premises, where you can find that data, is that correct? >> Yeah, so it's multi-Cloud. I mean, look, every organization is going to be multi-Cloud, like 100%, that's going to happen, and that could be private, it could be multiple public Cloud providers, but the key point is, data on the edge is not just limited to what's in those Clouds. It could be anywhere that you're collecting data. And we're enabling an architecture which performs incredibly well, because you take advantage of processing power on the edge, where you can get data anywhere that it sits. >> Okay, so, then, I'm setting up a Cloud, I'll call it a Cloud architecture, that encompasses the edge, where essentially, there are no boundaries, and you're bringing security. We talked about containers before, and we've been talking about Kubernetes all week here at a Big Data show.
And then of course, Cloud, and what's interesting, I think many of the Hadoop distro vendors kind of missed Cloud early on, and now are sort of saying, oh wow, it's a hybrid world and we've got a part to play. You guys obviously made some moves, a couple billion-dollar moves, to do some acquisitions and get hardcore into Cloud, so that becomes a critical component. You're not just limiting your scope to the IBM Cloud. You're recognizing that it's a multi-Cloud world, that's what customers want to do. Your comments. >> It's multi-Cloud, and it's not just the IBM Cloud, I think the most predominant Cloud that's emerging is every client's private Cloud. Every client I talk to is building out a containerized architecture. They need their own Cloud, and they need seamless connectivity to any public Cloud that they may be using. This is why you see such a premium being put on things like data ingestion, data curation. It's not popular, it's not exciting, people don't want to talk about it, but the biggest inhibitor, to this AI point, comes back to data curation and data ingestion, because if you're dealing with multiple Clouds, suddenly your data's in a bunch of different spots. >> Well, so you're basically, and we talked about this a lot on The Cube, you're bringing the Cloud model to the data, wherever the data lives. Is that the right way to think about it? >> I think organizations have spoken; set aside what they say, look at their actions. Their actions say, we don't want to move all of our data to any particular Cloud, we'll move some of our data. We need to give them seamless connectivity so that they can leave their data where they want, we can bring Cloud-Native Architecture to their data, and we could also help move their data to a Cloud-Native architecture if that's what they prefer.
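Rob's earlier description of the announcement, write one query and let it find data wherever it sits, can be sketched in a few lines of plain Python. This is a toy scatter-gather model for illustration only, not IBM's actual implementation; the node names and records are invented:

```python
# Toy sketch of federated query execution: instead of copying all rows
# to a central warehouse, each "node" (edge device, cloud, datacenter)
# evaluates the predicate locally and returns only matching rows.
# Node contents and names are illustrative, not a real IBM API.

NODES = {
    "telematics-device": [{"vin": "A1", "speed": 88}, {"vin": "B2", "speed": 42}],
    "mobile-app":        [{"vin": "C3", "speed": 95}],
    "on-prem-warehouse": [{"vin": "D4", "speed": 30}],
}

def run_on_node(rows, predicate):
    """Simulates push-down: the filter runs where the data lives."""
    return [r for r in rows if predicate(r)]

def federated_query(predicate):
    """Fan the query out to every node and merge the (small) results."""
    results = []
    for name, rows in NODES.items():
        results.extend(run_on_node(rows, predicate))
    return results

speeders = federated_query(lambda r: r["speed"] > 80)
print(sorted(r["vin"] for r in speeders))  # ['A1', 'C3']
```

The point of the sketch is the shape of the traffic: only the rows that satisfy the predicate ever leave a node, which is why real-time queries over edge data become feasible.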
>> Well, it makes sense, because you've got physics, latency, you've got economics, moving all the data into a public Cloud is expensive and just doesn't make economic sense, and then you've got things like GDPR, which says, well, you have to keep the data, certain laws of the land, if you will, that say, you've got to keep the data in whatever it is, in Germany, or whatever country. So those sort of edicts dictate how you approach managing workloads and what you put where, right? Okay, what's going on with Watson? Give us the update there. >> I get a lot of questions, people trying to peel back the onion of what exactly is it? So, I want to make that super clear here. Watson is a few things, start at the bottom. You need a runtime for models that you've built. So we have a product called Watson Machine Learning, runs anywhere you want, that is the runtime for how you execute models that you've built. Anytime you have a runtime, you need somewhere where you can build models, you need a development environment. That is called Watson Studio. So, we had a product called Data Science Experience, we've evolved that into Watson Studio, connecting in some of those features. So we have Watson Studio, that's the development environment, Watson Machine Learning, that's the runtime. Now you move further up the stack. We have a set of APIs that bring in human features, vision, natural language processing, audio analytics, those types of things. You can integrate those as part of a model that you build. And then on top of that, we've got things like Watson Applications, we've got Watson for call centers, doing customer service and chatbots, and then we've got a lot of clients who've taken pieces of that stack and built their own AI solutions. They've taken some of the APIs, they've taken some of the design time, the studio, they've taken some of the Watson Machine Learning. 
So, it is really a stack of capabilities, and where we're driving the greatest productivity, and you'll see this tonight in a lot of the client examples, is clients that have bought into this idea of, I need a development environment, I need a runtime, where I can deploy models anywhere. We're getting a lot of momentum on that, and then that raises the question of, well, do I have explainability, do I have trust and transparency, and that's another thing that we're working on. >> Okay, so there's an API-oriented architecture, exposing all these services, which makes it very easy for people to consume. Okay, so we've been talking all week at Cube NYC: Big Data is now AI, so is this old wine in a new bottle? I mean, it's clear, Rob, from the conversation here, there's a lot of substantive innovation, and early adoption, anyway, of some of these innovations, but a lot of potential going forward. Last thoughts? >> What people have to realize is AI is not magic, it's still computer science. So it actually requires some hard work. You need to roll up your sleeves, you need to understand how I get from point A to point B, you need a development environment, you need a runtime. I want people to really think about this, it's not magic. I think for a while, people have gotten the impression that there's some magic button. There's not, but if you put in the time, and it's not a lot of time, you'll see the examples tonight, most of them have been done in one or two months, there's great business value in starting to leverage AI in your business. >> Awesome, alright, so if you're in the city or you're at Strata, go to ibm.com/WinWithAI, register for the event tonight. Rob, we'll see you there, thanks so much for coming back. >> Yeah, it's going to be fun, thanks Dave, great to see you. >> Alright, keep it right there everybody, we'll be back with our next guest right after this short break, you're watching The Cube.

Published Date : Sep 13 2018



Daniel Hernandez, IBM | Change the Game: Winning With AI 2018


 

>> Live from Times Square in New York City, it's theCUBE, covering IBM's Change the Game, Winning with AI, brought to you by IBM. >> Hi everybody, welcome back to theCUBE's special presentation. We're here at the Westin Hotel in the theater district covering IBM's announcements. They've got an analyst meeting today, a partner event, and a big event tonight. IBM.com/winwithAI, go to that website; if you're in town, register. You can watch the webcast online. You'll see this very cool replay of Vince Lombardi, one of his famous plays. It's kind of a power sweep right, which is a great way to talk about sort of winning, with X's and O's. So anyway, Daniel Hernandez is here, the vice president of IBM Analytics and a longtime CUBE alum. It's great to see you again, thanks for coming on. >> My pleasure Dave. >> So we've talked a number of times. We talked earlier this year. Give us the update on momentum in your business. You guys are doing really well, we see this in the quadrants and the waves, but your perspective. >> Data science and AI, so when we last talked we were just introducing something called IBM Cloud Private for Data. The basic idea is anybody that wants to do data science, data engineering or building apps with data anywhere, we're going to give them a single integrated platform to get that done. It's going to be the most efficient, best way to do those jobs to be done. We introduced it, it's been a resounding success. Been rolling that out with clients, that's been a whole lot of fun. >> So we talked a little bit with Rob Thomas about some of the news that you guys have, but this is really your wheelhouse so I'm going to drill down into each of these. We had Rob Bearden on yesterday on our program and he talked a lot about the IBM, Red Hat and Hortonworks relationship. Certainly they talked about it on their earnings call and there seems to be clear momentum in the marketplace. But give us your perspective on that announcement.
What exactly is it all about? I mean it started kind of back in the ODPI days and it's really evolved into something that now customers are taking advantage of. >> You go back to June last year, we entered into a relationship with Hortonworks where the basic premise was customers care about data, and any data-driven initiative was going to require data science. We had to do a better job bringing these ecosystems, one focused on kind of Hadoop, the other one on classic enterprise analytical and operational data, together. We did that last year. The other element of that was we're going to bring our data science and machine learning tools and runtimes to where the data is, including Hadoop. That's been a resounding success. The next step up is how do we proliferate that single integrated stack everywhere, including private Cloud or preferred Clouds like OpenShift. So there were two elements of the announcement. We did the hybrid Cloud architecture initiative, which is taking the Hadoop data stack and bringing it to containers and Kubernetes. That's a big deal for people that want to run the infrastructure with Cloud characteristics. And the other was we're going to bring that whole stack onto OpenShift. So on IBM's side, with IBM Cloud Private for Data we are driving certification of that entire stack on OpenShift, so any customer that's betting on OpenShift as their Cloud infrastructure can benefit from that and the single integrated data stack. It's a pretty big deal. >> So OpenShift is really interesting because OpenShift was kind of quiet for awhile. It was quiescent if you will. And then containers came on the scene and OpenShift has just exploded. What are your perspectives on that and what's IBM's angle on OpenShift? >> Containers and Kubernetes basically allow you to get Cloud characteristics everywhere.
It used to be locked in to kind of the public Cloud, or to service providers that were offering it as a service, whether PaaS or IaaS, and Docker and Kubernetes are making the same underlying technology that enabled elasticity and pay-as-you-go models available anywhere, including your own data center. So I think it explains why OpenShift, why IBM Cloud Private, why IBM Cloud Private for Data just caught on there. >> I mean the CoreOS move by Red Hat was genius. They picked that up for a song in our view anyway and it's really helped explode that. And in this world, everybody's talking about Kubernetes. I mean we're here at a big data conference all week. It used to be Hadoop World. Everybody's talking about containers, Kubernetes and multi-cloud. Those are kind of the hot trends. I presume you've seen the same thing. >> 100 percent. There's not a single client that I know, and I spend the majority of my time with clients, that is running their workloads in a single stack. And so what do you do? If data is an imperative for you, you better run your data analytic stack wherever you need to, and that means multi-cloud by definition. So you've got a choice. You can say, I can port that workload to every distinct programming model and data stack, or you can have a data stack everywhere, including multi-clouds and OpenShift in this case. >> So thinking about the three companies: Hortonworks, obviously the Hadoop distro specialists, open source, brings that end-to-end sort of data management from, you know, edge or cloud to on-prem. Red Hat doing a lot of the sort of hardcore infrastructure layer. IBM bringing in the analytics and really empowering people to get insights out of data. Is that the right way to think about that triangle? >> 100 percent, and you know, with the Hortonworks and IBM data stacks, we've got our common services, particularly around open metadata, which means wherever your data is, you're going to know about it and you're going to be able to control it.
Privacy, security, data discovery reasons, that's a pretty big deal. >> Yeah, and as the Cloud, well obviously the Cloud, whether it's on-prem or in the public Cloud, expands now to the edge, you've also got this concept of data virtualization. We've talked about this in the past. You guys have made some announcements there. But let's put a double click on that a little bit. What's it all about? >> Data virtualization has been going on for a long time. Its basic intent is to help you access data through whatever tools, no matter where the data is. Traditional approaches of data virtualization are pretty limiting. So they work relatively well when you've got small data sets, but when you've got highly fragmented data, which is the case in virtually every enterprise that exists, a lot of the underlying technology for data virtualization breaks down. Data comes through a single head node, and ultimately that becomes the critical issue. So you can't take advantage of data virtualization technologies, largely because of that, when you've got wide-scale deployments. We've been incubating technology under the project codename Queryplex, a code name that we used internally, and we were working with beta clients on testing it out and validating it technically, and it was pretty clear that this is a game-changing method for data virtualization that allows you to drive the benefits of accessing your data wherever it is, pushing down queries to where the data is and getting the benefits of that through a highly fragmented data landscape. And so what we've done is take that extremely innovative next-generation data virtualization technology, include it in our data platform called IBM Cloud Private for Data, and made it a critical feature inside of that. >> I like that term, Queryplex, it reminds me of the global Sysplex. I go back to the days when actually viewing sort of distributed global systems was very, very challenging and IBM sort of solved that problem.
Okay, so what's the secret sauce though of Queryplex and data virtualization? How does it all work? What's the tech behind it? >> So technically, instead of data coming in and getting funneled through one node, think of your data as a graph of computational data nodes. What Queryplex does is take advantage of that computational mesh to do queries and analytics. So instead of bringing all the data and funneling it through one of the nodes, and depending on the computational horsepower of that node and all the data being able to get to it, it just federates it out. It distributes that workload, so there's some magic behind the scenes, but it's a relatively simple technique. The compute in aggregate is probably going to be higher than whatever you can put into that single node. >> And how do customers access these services? How long does it take? >> It would look like a standard query interface to them. So this is all magic behind the scenes. >> Okay, and they get this capability as part of what? >> IBM Cloud Private for Data. It's going to be a feature, so this project Queryplex is introduced as next-generation data virtualization technology which just becomes a part of IBM Cloud Private for Data. >> Okay, and then the other announcement that we talked to Rob about, I'd like to understand a little bit more behind it. Actually before we get there, can we talk about the business impact of Queryplex and data virtualization? Thinking about it, it dramatically simplifies the processes that I have to go through to get data. But more importantly, it helps me get a handle on my data so I can apply machine intelligence. It seems like the innovation sandwich if you will: data plus AI, and then Cloud models for scale and simplicity, and that's what's going to drive innovation. So talk about the business impact that people are excited about with regard to Queryplex.
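The mesh idea Daniel describes, pushing work out to the nodes instead of funneling every row through one head node, can be sketched as partial aggregation in plain Python. This is a toy model for illustration only, not the actual Queryplex protocol; the node names and values are invented:

```python
# Toy contrast with the centralized head-node approach: with push-down,
# each node ships a tiny (sum, count) pair instead of its raw rows, so
# network traffic and head-node load stay small. Data is illustrative.

FRAGMENTS = {                    # a highly fragmented data landscape
    "node-a": [10.0, 12.0, 11.0],
    "node-b": [50.0, 52.0],
    "node-c": [30.0],
}

def partial_aggregate(values):
    """Runs on the node that owns the data: returns (sum, count)."""
    return sum(values), len(values)

def global_average():
    """Only the small partials cross the network; they combine exactly."""
    partials = [partial_aggregate(v) for v in FRAGMENTS.values()]
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count

print(global_average())  # 27.5
```

Averages, counts, sums, minima and maxima all decompose this way, which is why a scattered computational mesh can, in aggregate, outwork any single node you could buy.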
>> Better economics: in order for you to access your data, you don't have to do ETL in this particular case. So data at rest gets consumed because of this online technology. Two, performance: because of the way this works, you're actually going to get faster response times. Three, you're going to be able to query more data, simply because this technology allows you to access all your data in a fragmented way without having to consolidate it. >> Okay, so it eliminates steps, right, and gets you time to value and gives you a bigger corpus of data that you can then analyze and derive insight from. >> 100 percent. >> Okay, let's talk about Stack Overflow. You know, Rob took us through a little bit of what's going on there, but why Stack Overflow, why are you targeting developers? Talk to me more about that. >> So Stack Overflow, 50 million active developers each month on that community. You're a developer and you want to know something, you have to go to Stack Overflow. You think about data science and AI as disciplines. The idea that that is only the domain of AI experts and data scientists is a very limiting idea. In order for you to actually apply artificial intelligence for whatever your use case is inside of a business, it's going to require multiple individuals working together to get that particular outcome done, including developers. So instead of having a distinct community for AI that's focused on AI and machine learning developers, why not bring the artificial intelligence community to where the developers already are, which is Stack Overflow. So, if you go to AI.stackexchange.com, it's going to be the place for you to go to get all your answers to any question around artificial intelligence, and of course IBM is going to be there in the community helping out.
This is like five or six years ago. He said data is the new development kit and now you guys are essentially targeting developers around AI, obviously a data centric. People trying to put data at the core of the organization. You see that that's a winning strategy. What do you think about that? >> 100 percent, I mean we're the data company instead of IBM, so you're probably asking the wrong guy if you think >> You're biased. (laughing) >> Yeah possibly, but I'm acknowledged. The data over opinions. >> Alright, tell us about tonight what we can expect? I was referencing the Vince Lombardy play here. You know, what's behind that? What are we going to see tonight? >> We were joking a little bit about the old school power eye formation, but that obviously works for your, you're a New England fan aren't you? >> I am actually, if you saw the games this weekend Pat's were in the power eye for quite a bit of the game which I know upset a lot of people. But it works. >> Yeah, maybe we should of used it as a Dallas Cowboy team. But anyways, it's going to be an amazing night. So we're going to have a bunch of clients talking about what they're doing with AI. And so if you're interested in learning what's happening in the industry, kind of perfect event to get it. We're going to do some expert analysis. It will be a little bit of fun breaking down what those customers did to be successful and maybe some tips and tricks that will help you along your way. >> Great, it's right up the street on the west side highway, probably about a mile from the Javis Center people that are at Strata. We've been running programs all week. One of the themes that we talked about, we had an event Tuesday night. We had a bunch of people coming in. There was people from financial services, we had folks from New York State, the city of New York. 
It was a great meetup and a whole conversation got going, and one of the things that we talked about, and I'd love to get your thoughts and kind of know where you're headed here: with big data there was all that talk, and people ask, now that the conversation has moved to AI, is it same wine, new bottle, or is there something substantive here? The consensus was, there's substantive innovation going on. Your thoughts about where that innovation is coming from and what the potential is for clients? >> So if you're going to implement AI for, let's say, customer care for instance, you're going to need three ingredients. You need data, you need algorithms, you need compute. A lot of data that wasn't captured in traditional data systems got captured, anchored by Hadoop and the big data movement. We landed it, we created a data and computational grid for that data. Today, with all the advancements going on in algorithms, particularly in open source, you can now build neural networks, you can do classic machine learning in any language that you want. And bringing those together is exactly the combination that you need to implement any AI system. You already have data and computational grids here. You've got algorithms; bringing them together to solve some problem that matters to a customer is like the natural next step. >> And despite the skills gap, the skill gaps that we talked about, you're seeing a lot of knowledge transfer, a lot of expertise getting out there into the wild. When you follow people like Kirk Borne on Twitter you'll see that he'll post like the 20 different models for deep learning, and people are starting to share that information. And then that skills gap is closing. Maybe not as fast as some people would like, but it seems like the industry is paying attention to this and really driving hard to work toward it, 'cause it's real. >> Yeah, I agree.
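Daniel's three ingredients, data, algorithms, and compute, fit in a few lines of plain Python. This is purely an illustration on made-up numbers: a one-parameter model fit by gradient descent, not any particular IBM offering:

```python
# Data + algorithm + compute, in miniature: fit y = w * x by gradient
# descent on the mean-squared error. Toy numbers; the point is that
# there is no magic, just the three ingredients working together.

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # ingredient 1: data

w = 0.0                                        # model parameter
for _ in range(200):                           # ingredient 3: compute
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= 0.05 * grad                           # ingredient 2: algorithm

print(round(w, 3))  # 2.0
```

The true relationship in the toy data is y = 2x, and two hundred cheap iterations recover it; swap in more data, a richer model, and more compute, and the same loop is the skeleton of modern machine learning.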
You're going to have Seth Dobrin, and I think it's Niagara, one of our clients. What I like about them is, in general there are two skill issues. There's one: where do data science and AI help us solve problems that matter in business? That's really about trying to build a treasure map of potential problems you can solve with a stack. And Seth and Niagara are going to give you a really good basis for the kinds of problems that we can solve. I don't think there's enough of that going on. There's a lot of commentary, communication, and actual work underway on the technical skill problem. You know, how do I actually build these models. But there's not enough in how do I, now that I've solved that problem, how do we marry it to problems that matter? So the skills gap, you know, we're doing our part with our Data Science Elite team, which Seth leads, which is telling a customer, pick a hard problem, give us some data, give us some domain experts. We're going to bring in the AI and ML experts and we're going to see what happens. So the skill problem is very serious, but I don't think most people are having the right conversations about it necessarily. They understand intuitively there's a tech problem, but that tech not linked to a business problem that matters means nothing. >> Yeah, it's not insurmountable, I'm glad you mentioned that. We're going to be talking to Niagara Bottling and how they used the Data Science Elite team as an accelerant, to kind of close that gap. And I'm really interested in the knowledge transfer that occurred, and of course the one thing about IBM and companies like IBM is you get not only technical skills but deep industry expertise as well. Daniel, always great to see you. Love talking about the offerings and going deep. So good luck tonight. We'll see you there, and thanks so much for coming on theCUBE. >> My pleasure. >> Alright, keep it right there everybody. This is Dave Vellante. We'll be back right after this short break.
You're watching theCUBE. (upbeat music)

Published Date : Sep 13 2018



Pandit Prasad, IBM | DataWorks Summit 2018


 

>> From San Jose, in the heart of Silicon Valley, it's theCUBE. Covering DataWorks Summit 2018. Brought to you by Hortonworks. (upbeat music) >> Welcome back to theCUBE's live coverage of DataWorks here in sunny San Jose, California. I'm your host Rebecca Knight along with my co-host James Kobielus. We're joined by Pandit Prasad. He is in analytics projects, strategy, and management at IBM Analytics. Thanks so much for coming on the show. >> Thanks Rebecca, glad to be here. >> So, why don't you just start out by telling our viewers a little bit about what you do, in terms of the Hortonworks relationship and the other parts of your job. >> Sure, as you said I am in Offering Management, which is also known as Product Management for IBM. I manage the big data portfolio from an IBM perspective. I was also working with Hortonworks on developing this relationship, nurturing that relationship, and it's been a year since the Hortonworks partnership. We announced this partnership exactly last year at the same conference. And now it's been a year, so this year has been a journey in aligning the two portfolios together. Right, so Hortonworks had HDP and HDF. IBM also had similar products, so we have, for example, Big SQL; Hortonworks has Hive, so how do Hive and Big SQL align together? IBM has Data Science Experience, so where does that come into the picture on top of HDP? Before this partnership, if you look into the market, it had been: you sell Hadoop, you sell a SQL engine, you sell data science. What this year has given us is more of a solution sell. Now with this partnership we go to the customers and say, here is an end-to-end experience for you. You start with Hadoop, you put more analytics on top of it, you then bring Big SQL for complex queries and federation and visualization stories, and then finally you put data science on top of it, so it gives you a complete end-to-end solution, the end-to-end experience for getting the value out of the data.
>> Now IBM a few years back released the Watson Data Platform for team data science, with DSX, Data Science Experience, as one of the tools for data scientists. Is Watson Data Platform still the core, I call it dev ops for data science and maybe that's the wrong term, that IBM provides to market, or is there sort of a broader dev ops framework within which IBM goes to market with these tools? >> Sure, Watson Data Platform one year ago was more of a cloud platform, and it had many components to it, and now we are bringing a lot of those components on to the (mumbles), and Data Science Experience is one part of it, so Data Science Experience... >> So Watson Analytics as well, for subject matter experts and so forth. >> Yes. And again Watson has a whole suite of business-focused offerings; Data Science Experience is more of a particular aspect of the focus, specifically on data science, and that's now available on prem, and now we are building this on-prem stack, so we have HDP, HDF, Big SQL, Data Science Experience, and we are working towards adding more and more to that portfolio. >> Well you have a broader reference architecture and a stack of solutions, AI on Power and so forth, for more of the deep learning development. In your relationship with Hortonworks, are they reselling more of those tools into their customer base to supplement and extend what they already resell, DSX, or is that outside of the scope of the relationship? >> No, it is all part of the relationship. These three have been the core of what we announced last year, and then there are other solutions. We have the whole governance solution, right; so again it goes back to the partnership: HDP brings with it Atlas. IBM has a whole suite of governance portfolio including the governance catalog. How do you expand the story from being a Hadoop-centric story to an enterprise data lake story, and then now we are taking that to the cloud; that's what Truata is all about.
Rob Thomas came out with a blog yesterday morning talking about Truata. If you look at it, it is nothing but a governed data lake hosted offering, if you want to simplify it. That's one way to look at it; it caters to the GDPR requirements as well. >> For GDPR, for the IBM Hortonworks partnership, is the lead solution for GDPR compliance Hortonworks Data Steward Studio, or is it any number of solutions that IBM already has for data governance and curation, or is it a combination of all of that in terms of what you, as partners, propose to customers for soup-to-nuts GDPR compliance? Give me a sense for... >> It is a combination of all of those, so it has HDP, it has HDF, it has Big SQL, it has Data Science Experience, it has the IBM governance catalog, it has IBM data quality, and it has a bunch of security products, like Guardium, and it has some new IBM proprietary components that are very specific towards data (cough drowns out speaker) and how you deal with the personal data and sensitive personal data as classified by GDPR. I'm supposed to query some high-level information but I'm not allowed to query deep into the personal information, so how do you block those queries, how do you understand those? These are not necessarily part of Data Steward Studio. These are some of the proprietary components that are thrown into the mix by IBM. >> One of the requirements that is not often talked about under GDPR, Ricky of Hortonworks got into it a little bit in his presentation, was the notion that if you are using an EU citizen's PII to drive algorithmic outcomes, they have the right to full transparency into the algorithmic decision paths that were taken. I remember IBM had a tool under the Watson brand that wraps up a narrative of that sort.
Is that something that IBM still, it was called Watson Curator a few years back, is that a solution that IBM still offers? Because I'm getting a sense right now that Hortonworks has a specific solution, not to say that they may not be working on it, that addresses that side of GDPR. Do you know what I'm referring to there? >> I'm not aware of something from the Hortonworks side beyond the Data Steward Studio, which offers basically identification of what some of the... >> Data lineage as opposed to model lineage. It's a subtle distinction. >> It can identify some of the personal information and maybe provide a way to tag it and hence mask it, but the Truata offering is the one that is bringing some new research assets. After the GDPR guidelines became clear, they got into the full details of how to cater to those requirements. These are relatively new proprietary components; they are not even being productized, that's why I am calling them proprietary components, and they are going into this hosting service. >> IBM's got a big portfolio, so I'll understand if you guys are still working out positioning. Rebecca, go ahead. >> I just wanted to ask you about this new era of GDPR. The last Hortonworks conference was before it came into effect and now we're in this new era. How would you say companies are reacting? Are they in the right space for it, in the sense that they really understand the ripple effects and how it's all going to play out? How would you describe your interactions with companies in terms of how they're dealing with these new requirements? >> They are still trying to understand the requirements and interpret them, coming to terms with what they really mean. For example I met with a customer and they are a multi-national company.
They have data centers across different geos and they asked me: I have somebody from Asia trying to query the data, so the query should go to Europe, but the query processing should not happen in Asia; the query processing should all happen in Europe, and only the output of the query should be sent back to Asia. You wouldn't have thought in these terms before the GDPR era. >> Right, exceedingly complicated. >> Decoupling storage from processing enables those kinds of fairly complex scenarios for compliance purposes. >> It's not just about access to data; now you are getting into where the processing happens and where the results are getting displayed, so we are getting... >> Severe penalties for not doing that, so your customers need to keep up. There was an announcement at this show, at DataWorks 2018, of an IBM Hortonworks solution: IBM Hosted Analytics with Hortonworks. I wonder if you could speak a little bit about that, Pandit, in terms of what's provided. It's a subscription service? Could you tell us what subset of IBM's analytics portfolio is hosted for Hortonworks customers? >> Sure, as you said, it is a hosted offering. Initially we are starting off with a base offering with three products: it will have HDP; Big SQL, that is IBM Db2 Big SQL; and DSX, Data Science Experience. Those are the three solutions. Again as I said, it is hosted on IBM Cloud, so customers have a choice of different configurations, whether it be VMs or bare metal. I should say this is probably the only offering, as of today, that offers a bare metal configuration in the cloud. >> It's geared to data scientists and developers, and for machine learning models they'll build the models and train them in IBM Cloud, but in a hosted HDP in IBM Cloud. Is that correct? >> Yeah, I would rephrase that a little bit. There are several different offerings on the cloud today, and we can think about them, as you said, for ad-hoc or ephemeral workloads, also geared towards low cost.
You think about this offering as taking your on-prem data center experience directly onto the cloud. It is geared towards very high performance. The hardware and the software are all configured and optimized for providing high performance, not necessarily for ad-hoc or ephemeral workloads. It is capable of handling massive, sticky workloads; it's not meant for "I turn on this massive computing power for a couple of hours and then switch it off," but rather, I'm going to run these massive workloads as if they are located in my data center. That's number one. It comes with the complete set of HDP. If you think about it, currently in the cloud you have Hive and HBase, the SQL engines, and they are sold separately; security is optional, governance is optional. This comes with the whole enchilada. It has security and governance all baked in. It provides the option to use Big SQL, because once you get on Hadoop, the next experience is: I want to run complex workloads. I want to run federated queries across Hadoop as well as other data stores. How do I handle those? And then it comes with Data Science Experience, also configured for best performance and integrated together. As a part of this partnership, I mentioned earlier that we have progressed towards providing this story of an end-to-end solution. The next step of that is: yeah, I can say that it's an end-to-end solution, but do the products look and feel as if they are one solution? That's what we are getting into, and I can feature some of those integrations. For example Big SQL, an IBM product: we have been working on baking it very closely into HDP. It can be deployed through Ambari; it is integrated with Atlas and Ranger for security. We are improving the integrations with Atlas for governance.
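The federated queries Pandit mentions, one SQL layer spanning Hadoop and other data stores, can be illustrated with a toy sketch. This is not Big SQL's actual mechanism; the dict-backed "stores", table names, and the `federated_join` helper are all invented for illustration of what federation means: fetch rows from two separate backends and join them behind a single query interface.

```python
# Toy sketch of query federation: one query layer joins rows that live in two
# different backends. Big SQL federates real engines (Hive, HBase, relational
# databases) behind standard SQL; the in-memory stores here are stand-ins.

hadoop_store = {  # e.g. rows living in Hive on HDP (invented data)
    "orders": [{"order_id": 1, "cust_id": 10, "amount": 250.0},
               {"order_id": 2, "cust_id": 11, "amount": 75.5}],
}
warehouse_store = {  # e.g. rows living in a relational warehouse (invented data)
    "customers": [{"cust_id": 10, "name": "Acme"},
                  {"cust_id": 11, "name": "Globex"}],
}

def federated_join(left, right, key):
    """Join rows fetched from two separate backends on a shared key."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]

result = federated_join(hadoop_store["orders"], warehouse_store["customers"], "cust_id")
print(result[0]["name"], result[0]["amount"])  # Acme 250.0
```

The point of the sketch is only the shape of the problem: the caller writes one query and never needs to know that `orders` and `customers` live in different systems.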
>> Say you're building a Spark machine learning model inside DSX on HDP, within IH (mumbles), IBM hosting with Hortonworks on HDP 3.0. Can you then containerize that machine learning Spark model and deploy it into an edge scenario? >> Sure. First was Big SQL; the next one is DSX. DSX is integrated with HDP as well. We could run DSX workloads on HDP before, but what we have done now is this: if I want to run a DSX workload, say a Python workload, I need to have the Python libraries on all the nodes I want to deploy to. Suppose you are running a big cluster, a 500-node cluster. I need to have the Python libraries on all 500 nodes and I need to maintain the versioning of them. If I upgrade the versions then I need to go and upgrade and make sure all of them are perfectly aligned. >> In this first version will you be able to build a Spark model and a TensorFlow model and containerize them and deploy them? >> Yes. >> Across a multi-cloud, and orchestrate them with Kubernetes to do all that meshing: is that a capability now, or planned for the future within this portfolio? >> Yeah, we have that capability demonstrated in the booth today, so that is a new integration. We can run what we call a virtual Python environment. DSX can containerize it and run it against data that's enclosed in the HDP cluster. Now we are making use of both the data in the cluster, as well as the infrastructure of the cluster itself, for running the workloads. >> In terms of the layered stack, is it also incorporating the IBM distributed deep-learning technology that you've recently announced? Which I think is highly differentiated, because deep learning is increasingly becoming a set of capabilities that run across a distributed mesh, playing together as if they're one unified application. Is that a capability now in this solution, or will it be in the near future? DDL, distributed deep learning? >> No, we have not yet. >> I know that's on the AI Power platform currently, gotcha.
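The library-versioning headache Pandit describes, keeping identical Python libraries on all 500 nodes, is exactly what the containerized virtual environment removes. A toy sketch of detecting that drift across a cluster; the node names, library names, and versions are invented for illustration, and this is not DSX's mechanism, just the problem it solves:

```python
# Hedged sketch: without containerized environments, every node must carry the
# same library versions. This helper flags libraries whose versions differ
# across nodes, given per-node manifests (all data here is made up).
from collections import defaultdict

def find_version_drift(node_manifests: dict) -> dict:
    """Return {library: {version: [nodes]}} for libraries with mismatched versions."""
    seen = defaultdict(lambda: defaultdict(list))
    for node, libs in node_manifests.items():
        for lib, ver in libs.items():
            seen[lib][ver].append(node)
    return {lib: dict(vers) for lib, vers in seen.items() if len(vers) > 1}

manifests = {
    "node001": {"numpy": "1.14.0", "pandas": "0.23.0"},
    "node002": {"numpy": "1.14.0", "pandas": "0.22.0"},  # drifted
}
print(find_version_drift(manifests))
# {'pandas': {'0.23.0': ['node001'], '0.22.0': ['node002']}}
```

With a containerized environment the manifest travels with the job instead of living on each node, so there is nothing left to drift.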
>> It's what we'll be talking about at next year's conference. >> That's definitely on the roadmap. We are starting with the base configuration of bare metal and VM configurations; the next one, depending on how the customers react to it, is definitely bare metal with GPUs, optimized for TensorFlow workloads. >> Exciting. We'll be tuned in over the coming months and years; I'm sure you guys will have that. >> Pandit, thank you so much for coming on theCUBE. We appreciate it. I'm Rebecca Knight for James Kobielus. We will have more from theCUBE's live coverage of DataWorks, just after this.
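The multi-geo scenario Pandit described earlier, a query issued from Asia whose processing must stay in Europe with only the results traveling back, amounts to data-residency-aware routing. A minimal sketch of that routing decision; the endpoints, catalog entries, and function names are invented for illustration and are not part of any IBM product:

```python
# Hypothetical sketch of GDPR-style residency routing: the dataset catalog
# says where each dataset (and therefore its query processing) must live;
# only the result set is returned to the caller's region.

REGION_ENDPOINTS = {
    "eu": "https://query.eu.example.com",  # processing stays in the EU
    "us": "https://query.us.example.com",
}

# Catalog mapping datasets to the region that owns them (invented entries).
DATA_RESIDENCY = {"customers_eu": "eu", "clickstream_us": "us"}

def route_query(dataset: str, sql: str, caller_region: str) -> dict:
    """Dispatch a query to the region that owns the data; return only results."""
    region = DATA_RESIDENCY[dataset]
    # A real system would now make a remote call to REGION_ENDPOINTS[region];
    # here we just record the routing decision.
    return {
        "processed_in": region,
        "results_delivered_to": caller_region,
        "endpoint": REGION_ENDPOINTS[region],
    }

plan = route_query("customers_eu", "SELECT count(*) FROM customers", caller_region="asia")
print(plan["processed_in"])          # eu
print(plan["results_delivered_to"])  # asia
```

Decoupling storage from processing, as noted in the conversation, is what makes it possible to pin the processing step to one region while serving callers from another.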

Published Date : Jun 19 2018



Daniel Hernandez, IBM | IBM Think 2018


 

>> Narrator: Live from Las Vegas, it's theCUBE, covering IBM Think 2018. Brought to you by IBM. >> We're back at Mandalay Bay in Las Vegas. This is IBM Think 2018. This is day three of theCUBE's wall-to-wall coverage. My name is Dave Vellante, I'm here with Peter Burris. You're watching theCUBE, the leader in live tech coverage. Daniel Hernandez is here. He's the Vice President of IBM Analytics, a CUBE alum. It's great to see you again, Daniel. >> Thanks. >> Dave: Thanks for coming back on. >> Happy to be here. >> Big tech show, consolidating a bunch of shows. You guys kind of used to have your own sort of analytics show, but now you've got all the clients here. How do you like it? Compare and contrast. >> IBM Analytics loves to share, so having all our clients in one place, I actually like it. We're going to work out some of the kinks a little bit, but I think one show where you can have a conversation around artificial intelligence, data, analytics, power systems is beneficial to all of us, actually. >> Well in many respects, the whole industry is munging together. Folks focus more on workloads as opposed to technology or even roles. So having an event like this where folks can talk about what they're trying to do, the workloads they're trying to create, the role that analytics, AI, et cetera is going to play in informing those workloads: not a bad place to get that crosspollination. What do you think? >> Daniel: Totally. You talk to a client, there are so many problems. Problems are a combination of analytics stuff that we have to offer and stuff that our friends in Hybrid Integration have to offer. So for me, logistically, I could say oh, Mike Gilfix, business process automation. Go talk to him. And he's here. That's happened probably at least a dozen times so far in not even two days. >> Alright so I've got to ask about your tagline: making data ready for AI. What does that mean? >> We get excited about amazing tech. Artificial intelligence is amazing technology.
I remember when Watson beat Jeopardy. I was inspired by all the things that I thought it could do to solve problems that matter to me. And if you look over the last many years, virtual assistants and image recognition systems that solve pretty big problems, like catching bad guys, are inspirational pieces of work that were inspired a lot by what we did then. And in business, it's triggered a wave of "artificial intelligence can help me solve business-critical issues." And I will tell you that many clients simply aren't ready to get started. And because they're not ready, they're going to fail. And so our attitude is, through IBM Analytics, we're going to deliver the critical capabilities you need to be ready for AI. And if you don't have that, 100% of your projects will fail. >> But how do you get the business ready to think about data differently? You can do a lot to say the technology you need to do this looks different, but you also need to get the organization to acculturate, to appreciate that their business is going to run differently as a consequence of data and what you do with it. How do you get the business to start making adjustments? >> I think you just said the magic word: the business. Which is to say, in all the conversations I have with my customers, they can't even tell that I'm from analytics, because I'm asking them about the problems. What are you trying to do? How would you measure success? What are the critical issues that you're trying to solve? Are you trying to make money, save money, those kinds of things. And by focusing on that, we can advise them on how we can help. So the data culture that you're describing, I think it's a fact: you become data-aware and understand the power of it by doing. You do that by starting with the problems, developing successes, and then iterating. >> An approach to solving problems. >> Yeah. >> So that's kind of a step zero to getting data ready for AI. >> Right.
But no conversation that leads to success ever starts with "we're going to do AI or machine learning; what problem are we going to solve?" It's always the other way around. And when we do that, our technology then is easily explainable. It's like, okay, you want to build a system for better customer interactions in your call center. Well, what does that mean? You need data about how they have interacted with you, products they have interacted with; you might want predictions that anticipate what their needs are before they tell you. And so we can systematically address them through the capabilities we've got. >> Dave, if I could amplify one thing. It makes the technology easier when you put it in these contexts. I think that's a really crucial, important point. >> It's super simple. All of us have had to have it, if we're in technology. Going the other way around, "my stuff is cool, here's why it's cool, what problems can you solve?" is not helpful for most of our clients. >> I wonder if you could comment on this, Daniel. I feel like the last ten years were about cloud, mobile, social, big data. We seem to be entering an era now of sense, speak, act, optimize, see, learn. This sort of pervasive AI, if you will. Is that a reasonable notion, that we're entering that era, and what do you see clients doing to take advantage of that? What's their mindset like when you talk to them? >> I think the evidence is there. You just have to look around the show and see what's possible, technically. The Watson team has been doing quite a bit of stuff around speech, around image. It's fascinating tech, stuff that feels magical to me. And I know how this stuff works and it still feels kind of fascinating. Now the question is how do you apply that to solve problems.
I think it's only a matter of time before most companies are implementing artificial intelligence systems in business-critical and core parts of their processes, and they're going to get there by starting, by doing what they're already doing now with us, and that is: what problem am I solving? What data do I need to get that done? How do I control and organize that information so I can exploit it? How can I exploit machine learning and deep learning and all these other technologies to then solve that problem? How do I measure success? How do I track that? And just systematically running these experiments. I think that crescendos to a critical mass. >> Let me ask you a question. Because you're a technologist, and you said it's amazing, it's like magic even to you. Imagine non-technologists; imagine what it's like to me. There's a black-box component of AI, and maybe that's okay. I'm just wondering, is that a headwind? Are clients comfortable with that? Suppose you have to describe how you really know it's a cat. I mean, I know a cat when I see it. And the machine can tell me it's a cat, or "not a hot dog," the Silicon Valley reference. (Peter laughs) But to tell me actually how it works, to figure that out, there's a black-box component. Does that scare people? Or are they okay with that? >> You've probably given me too much credit. So I really can't explain how all that just works, but what I can tell you is, I mean, let's take regulated industries like banks and insurance companies that are building machine learning models throughout their enterprise. They've got to explain to a regulator that they are honoring considerations around anti-discrimination; basically, that they're not buying systems that cause them to do things that are against the law, effectively. So what are they doing?
Well, they're using tools like ones from IBM to build these models, to track the process of creating these models, which includes what data they used and how that training was done; to prove that the inputs and outputs are not discriminatory; and to actually go through their own internal general counsel and regulators to get it done. So whether you can explain the model in this particular case doesn't matter. What they're trying to prove is that the effect is not violating the law, which the tool sets and the process around those tool sets allow you to get done today. >> Well, let me build on that, because one of the ways that it does work is that, as Ginni said yesterday, Ginni Rometty said yesterday, there's always going to be a machine plus human component to it. And so the way it typically works is a machine says "I think this is a cat" and a human validates it, or not. The machine still doesn't really know if it's a cat, but coming back to this point, one of the key things that we see anyway, and one of the advantages that IBM likely has, is that today the folks running operational systems, the core of the business, trust their data sources. >> Do they? >> They trust their Db2 database, they trust their Oracle database, they trust the data that's in the applications. >> Dave: So it's the data that's in their data lake? >> I'm not saying they do, but that's the key question. At what point in time, and I think the really important part of your question is, at what point in time do the hardcore people allow AI to provide a critical input that's going to significantly or potentially dramatically change the behavior of the core operational systems? That seems a really crucial point. What kind of feedback do you get from customers as you talk about turning AI from something that has an insight every now and then into becoming, effectively, an element essential to the operation of the business?
>> One of the critical issues in getting machine learning models especially integrated into business-critical processes and workflows is getting those models running where that work is done. So if you look, I mean, when I was here last time we were focused on portfolio simplification and bringing machine learning to where the data is. We brought machine learning to private cloud, we brought it onto Hadoop, we brought it onto the mainframe. I think it is a critical, necessary ingredient that you need to deliver that outcome: bring that technology where the data is. Otherwise it just won't work. Why? As soon as you move, you've got latency. As soon as you move, you've got data quality issues you're going to have to contend with. That's going to exacerbate whatever mistrust you might have. >> Or the stuff's not cheap to move. It's not cheap to ingest. >> Yeah. By the way, the Machine Learning on Z offering that we launched last year in March, April was one of our highest, most successful offerings last year. >> Let's talk about some of the offerings. I mean, at the end of the day you're in the business of selling stuff. You've talked about Machine Learning on Z, X, whatever platform. Cloud Private, I know you've got perspectives on that. Db2 Event Store is something that you're obviously familiar with. SPSS is part of the portfolio. >> 50 years, the anniversary. >> Give us the update on some of these products. >> Making data ready for AI requires a design principled on simplicity. We launched in January three core offerings that help clients benefit from the capabilities we deliver to capture data, to organize and control that data, and to analyze that data. So we delivered a Hybrid Data Management offering, which gives you everything you need to collect data; it's anchored by Db2. We have the Unified Governance and Integration portfolio, which gives you everything you need to organize and control that data, anchored by our Information Server product set.
And we've got our Data Science and Business Analytics portfolio, which is anchored by our Data Science Experience, SPSS, and Cognos Analytics. So clients that want to mix and match those capabilities in support of artificial intelligence systems, or otherwise, can benefit from that easily. We just announced here an even more radical step forward in simplification, which we thought there already was. So say you want to move to the public cloud but can't, or don't want to move to the public cloud for whatever reason. And we think, by the way, that for workloads suited to the public cloud you should try to run as much as you can there, because of the benefits. But if for whatever reason you can't, we need to deliver those benefits behind the firewall where those workloads are. So last year the Hybrid Integration team, led by Denis Kennelly, introduced an IBM Cloud Private offering. It's basically application PaaS behind the firewall. It runs on a Kubernetes environment. You do application buildouts, you do migrations of existing workloads to it. What we did with IBM Cloud Private for Data is deliver the data companion for that. IBM Cloud Private was a runaway success for us. You could imagine the data companion to that just being like, what application doesn't need data? It's peanut butter and jelly for us. >> Last question, oh you had another point? >> It's alright. I wanted to talk about Db2 and SPSS. >> Oh yes, let's go there, yeah. >> Db2 Event Store, I forget if anybody- It has a 100x performance improvement on ingest relative to the current state of the art. You say, why does that matter? If you do analysis or analytics, machine learning, artificial intelligence, you're only as good as whatever data you have captured of your reality. Currently our databases don't allow you to capture everything you would want. So Db2 Event Store, with that ingest, lets you capture more than you could ever imagine you would want.
250 billion events per year is basically what it's rated at. So we think that's a massive improvement in database technology, and it happens to be based on open source, so the programming model is something developers find familiar. SPSS is celebrating its 50th anniversary. It's the number one digital offering inside of IBM. It had 510,000 users trying it out last year. We just renovated the user experience and made it even simpler in Stats. We're doing the same thing in Modeler, and we're bringing SPSS and our Data Science Experience together so that there's one tool chain for data science, end to end, in the private cloud. It's pretty phenomenal stuff. >> Okay great, appreciate you running down the portfolio for us. Last question. It's kind of a, get out your telescope. When you talk to clients, when you think about technology from a technologist's perspective, how far can we take machine intelligence? Think 20-plus years: how far can we take it, and how far should we take it? >> Can they ever really know what a cat is? (chuckles) >> I don't know what the answer to that question is, to be honest. >> Are people asking you that question, in the client base? >> No. >> Are they still figuring out, how do I apply it today? >> Surely they're not asking me, probably because I'm not the smartest guy in the room. They're probably asking some of the smarter guys-- >> Dave: Well, Elon Musk is talking about it. Stephen Hawking was talking about it. >> I think it's so hard to anticipate. I think where we are today is magical and I couldn't have anticipated it seven years ago, to be honest, so I can't imagine. >> It's really hard to predict, isn't it? >> Yeah. I've been wrong on three- to four-year horizons. I can't do 20, realistically. So I'm sorry to disappoint you. >> No, that's okay.
Because it leads to my real last question, which is: what kinds of things can machines do that humans can't? And you don't even have to answer this, but I just want to put it out there to the audience to think about. How are they going to complement each other? How are they going to compete with each other? These are some of the big questions that I think society is asking. And IBM has some answers, but we're going to apply it here, here and here; you guys are clear about augmented intelligence, not replacement. But there are big questions that I think we want to get out there and have people ponder. I don't know if you have a comment. >> I do. I think there are relationships, non-obvious to human beings, between data that's expressing some part of your reality, that a machine through machine learning can see and we can't. Now, what does it mean? Do you take action on it? Is it simply an observation? Is it something that a human being can act on? So I think that combination is something that companies can take advantage of today. Those non-obvious relationships inside of your data, non-obvious insights into your data, are what machines can deliver now. It's how machine learning is being used today. Is it going to be able to reason on what to do about it? Not yet, so you still need human beings in the middle too, especially when you deal with consequential decisions. >> Yeah, but nonetheless, I think the impact on industry is going to be significant. Other questions we ask: are retail stores going to be the exception versus the norm? Will banks lose control of the payment systems? Will cyber be the future of warfare? Et cetera, et cetera. These are really interesting questions that we try to cover on theCUBE, and we appreciate you helping us explore those. Daniel, it's always great to see you. >> Thank you, Dave. Thank you, Peter. >> Alright keep it right there buddy, we'll be back with our next guest right after this short break. (electronic music)
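As an aside on the Db2 Event Store rating Daniel quoted, 250 billion events per year, a back-of-envelope check shows what sustained ingest rate that implies. This is plain arithmetic on the figure as stated in the interview, not a benchmark:

```python
# Sanity-check the quoted rating: 250 billion events per year as a
# sustained per-second ingest rate.

events_per_year = 250e9
seconds_per_year = 365 * 24 * 3600  # 31,536,000
rate = events_per_year / seconds_per_year
print(f"{rate:,.0f} events/second sustained")  # 7,927 events/second sustained
```

Roughly eight thousand events per second, every second, around the clock, which gives some scale to the "capture everything" claim.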

Published Date : Mar 21 2018



Rob Thomas, IBM | Think 2018


 

>> Announcer: Live from Las Vegas. It's the Cube. Covering IBM Think 2018, brought to you by IBM. >> Hello everyone, I'm John Furrier. We are here in the Cube at IBM Think 2018. Great conversations here in the Mandalay Bay in Las Vegas for IBM Think, which is six shows wrapped into one, all combined into a big tent event. Good call by IBM, great branding. Our next guest is Rob Thomas. Cube alumni, general manager of IBM Analytics. Great to see you. >> John, great to see you. Thanks for being here. >> We love having you on, Cube alumni many times. I mean, you've seen the journey. I can remember when I talked to you, it was almost five, four or five years ago. Data, Hadoop, big data analytics, data lakes evolved significantly now where Jenny's major keynote speech has data at the center of the value proposition. I mean, we've said that before. >> Yes, we have. >> The data is the center of the value proposition. >> Every company is finally waking up. >> And then I had coined the term "the innovation sandwich." Blockchain on one side of the data, and you got AI on the other side, it's actually software. This is super important with multi-cloud. You've got multiple perspectives. You've got regions all around the world, GDPR, which everyone's been talking about, you guys have been doing lately, but the bigger question is: the technical stacks are changing. 30 years of stacks evolving, technology under the hood is changing, but the business models are also changing. This puts data as the number one conversation. That's your division. Your keynote here, what are you guys talking about? Are you hitting that note as well? >> So, number one is, think of this ladder to AI. We've talked about that before. Every client's on a journey towards AI, and there's a set of building blocks everybody needs to get there. We used the phrase once before, "There's no AI without IA," meaning if you want to get to that end point, you have to have the right information architecture.
We're going to focus a lot on that. We've got a new product we've released called IBM Cloud Private for Data, which takes all of the assembly out of the data process. A really elegant solution to see all your enterprise data. That's going to be the focus for me this week. >> I want to get into that, but I also heard Scott, your VP of marketing now, talk about bad data can cripple you. So, I want to explain what that actually means. Because it's always been dirty data, it's been kind of a data science word, data warehouse word, clean data, you know, data cleanliness, but if you're going to use AI as a real strategic thing, you need high quality data. >> You do. >> John: Your thoughts? >> Think backwards from the shiny object, 'cause everybody loves the shiny object, which is some type of AI outcome, customer centricity, making you feel like a celebrity. There's two things that have to happen before that, or really three. One is you need some type of inferencing, a model layer where you're actually automating a lot of the predictive process. Before that, you need to actually understand what the data is. That's the data governance, the data integration. And before that, you need to actually have access to the data, meaning know where it's stored. Without those things, you just have a shiny object and not necessarily an outcome. That's why these building blocks are fundamental. And the clients, they get to this point, and they're the ones who try to jump to the shiny object and they don't have the data to support that. >> And then you've got companies going on digital transformation, which is basically saying all their data legacy, trying to modernize it. The modern companies like Uber, and we saw the first fatality of an Uber car this week, again, points out the reality that realtime is realtime, and the importance of having data, whether it's sensing data. We're not, it's coming there, you can start to see it happening. Realtime data is key. 
That means data mobility is critical, and you mentioned private, public. Storing the data and moving the data around, having data intelligence, is the most important thing. Realtime data in motion, intelligence, you know, where are we? Is that a setback with the Uber incident? Is it a step forward, is it learning? What's your view of the data quality of movement in realtime? >> I think data ingestion is one of the least talked about topics that is one of the most important. With IBM Cloud Private for Data, we can ingest 250 billion events a day. Let me give you some context for that. 2016, the entire credit card industry, everywhere in the world, did 250 billion transactions. So what credit cards do in a year, we can do in a day. Biggest stock trading day ever on the New York Stock Exchange, what got done in that entire day, we can do in the first 40 minutes of trading. But that value there is, how fast can you bring data in to be analyzed, and can you do a decent bit of that pre-processing, or analytics, on the way in? That's how you start to solve some of the problems that you're describing, because it's instant. >> John: Yeah. >> And it's unsurpassed amounts of data. >> So ingestion's a key part of the value chain, if you will, on data management. The new kind of data management. Ingesting it, understanding context, then is that where AI kicks in? Where does the AI kick in? Because the ingestion speaks to the information architecture, IA. >> Rob: Yes. >> Now I got to put AI on top of that data, so is the data different? Talk about the dynamic between, okay I'm ingesting data for the sake of ingesting, where does the AI connect? >> So you got the data, yep. So you've got the data, AI starts where you're saying, all right, now we want to automate this. We're going to build models, we're going to use the data that we've got in here to train those models. As we get more data, the models are going to get better.
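Rob's "more data, better models" loop can be sketched in a few lines. This is purely illustrative, not IBM code: a single-weight model updated by stochastic gradient descent as examples stream in, where a longer stream yields a closer fit.

```python
def train(stream, lr=0.05):
    """One SGD pass over a stream of (x, y) examples; the model is y_hat = w * x."""
    w = 0.0
    for x, y in stream:
        err = w * x - y
        w -= lr * err * x  # nudge the weight against the observed error
    return w

# Examples drawn from a true relationship y = 2x, arriving as a stream.
data = [(x, 2.0 * x) for x in [1, 2, 3]] * 40  # 120 streamed examples
w_short = train(data[:6])   # trained on a short stream
w_long = train(data)        # trained on the full stream
print(abs(w_short - 2.0) > abs(w_long - 2.0))  # True: more data, closer fit
```

The same shape holds for real model training: each arriving batch refines the parameters, which is why continuous ingestion and model quality are tied together.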
Now we're going to connect it to how humans want to interact. Maybe it's natural language processing, maybe it's visualizing data. That's the whole lineage of how somebody gets toward this AI idea. >> What are some of the conversations you're having with customers, and how have they changed? And give some color, I mean, only a few years ago we're talking about data lakes. >> Right. >> Okay, what is the conversation now, and give some context of how far that conversation has gone down the road toward advancement. >> I think we're going from data lakes to an idea of a fluid data layer, which is all your data assets managed as a single system, even if they sit in different architectures. Because there's no one, we all know this. We've been around this industry forever. There's no one way to support or manage data that's going to support every use case. So this idea of a fluid data layer becomes critical for every organization. That's one big change. Other big change is containers. What we're doing with Cloud Private for Data is based on Kubernetes, that's how people want to consume applications, but nobody's really solved that for data. I think we're solving that for data. >> Let's dig into that. It was one of my topics I wanted to drill down on. Containers have been great for moving workloads around, certainly Kubernetes has been a great orchestration tool. How does that fit for data? I'm just putting a container on data sets? Who's addressing the envelope of that container? How is that addressable? I mean, how does it work? >> Let me give you an analogy. So you go back to the year 1955. There are no standards in any shipping port around the world. Everybody is literally building their own containers, building their own ships, building their own trucks. It's incredibly expensive and takes forever to get cargo to move from one place to the next. 1956, a guy named Malcom McLean, he invents the first intermodal shipping container, patents it. It becomes the standard.
So now, every port, every container looks identical. What's the benefit? Sure, it made for more flexibility. Saved a lot of money, 90% of the cost came out of shipping a container. But the biggest thing is it changed commerce. So, you look at GDP at that time, it took off. All because of the standardization around a form factor that made it accessible to everybody. Now, let's put that in the IT world. We got containers for the application world. Made it much easier to deploy, a standard, again. >> Yeah, and program around. >> More cost-effective, more-- Yep, exactly. What's the cargo in IT? It's data. Data is the cargo, that's what's sitting inside the container. Now you have to say, how do we actually take the same concepts that we did for applications, make that available for data so that my data can fit anywhere? That's what we're doing. >> How does that work and what's the impact to the customer? Is it IBM software that you're doing? Is it Kubernetes open source software? Just tie that together for me. >> So IBM Cloud Private is our Kubernetes distribution, with some different pieces we put on it. When you add the Cloud Private for Data, it's got a Spark Engine, like everything we do it's based on open source to start with. And then we have an experience for a data scientist, an experience for a data analyst. It's your view to your enterprise data. You'll love the UI when you see it. First, above the fold, all my machine learning models in the organization, what's working, what's not working. Below the fold, what's my data? Structured or unstructured? Sensitive, non-sensitive? I click it on, I can see all of my data. Hadoop, Cloud-A, Cloud-B, Cloud-C, on-premise system. You get a view to all of your data. >> So is the purpose to move the data around? >> No, the purpose is actually the exact opposite. Leave the data in place, but be able to treat it as a single data environment.
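That "leave the data in place" idea, one query surface over stores that never physically merge, can be sketched outside any IBM product with SQLite's ATTACH. The database files and table names below are invented for illustration: two independent stores stand in for data living on separate systems, and a single connection queries across both without copying anything.

```python
import os
import sqlite3
import tempfile

# Two independent databases stand in for data "left in place"
# on separate systems (say, on-premise orders and a cloud CRM).
d = tempfile.mkdtemp()
sales_db = os.path.join(d, "sales.db")
crm_db = os.path.join(d, "crm.db")

con = sqlite3.connect(sales_db)
con.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 99.0), (2, 25.0), (1, 10.0)])
con.commit()
con.close()

con = sqlite3.connect(crm_db)
con.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
con.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])
con.commit()
con.close()

# Federation in miniature: one connection, one SQL surface,
# while each store stays exactly where it is.
con = sqlite3.connect(sales_db)
con.execute("ATTACH DATABASE ? AS crm", (crm_db,))
rows = con.execute(
    """SELECT c.name, SUM(o.amount)
       FROM orders o JOIN crm.customers c ON c.id = o.customer_id
       GROUP BY c.name ORDER BY c.name"""
).fetchall()
print(rows)  # [('Acme', 109.0), ('Globex', 25.0)]
con.close()
```

A production federation layer adds query planning, pushdown, and security on top, but the contract is the same: the join happens logically, not by relocating the cargo.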
We're doing a lot of work with Federation, our SQL technology which historically, as we all know, Federation hasn't really performed. We have it performing. >> Okay, so I'm just, in the use case in my head, so I store the data on my private, secure, comfortable, feeling good about it, but I have a public cloud app. How does that work? Is it a replica of the data? Is it just the container that makes it addressable? How does that move across? >> So, click a button, move the data. If you want it to be a replica, click a button and say "replicate." If you want to just move it, just click a button and move it. It's literally that easy. >> And so the customers can choose where to put the data. >> Yes. >> Can they do a public version of this, or only private? >> Both, it connects to public as well. >> Okay, so that was Jenny's mention, okay cool. What's the most exciting thing for you this week going on in your world? Obviously, center of the value proposition, and Jenny used your lines so I'm sure you fed her some good sound bites there, because she was basically taking your pitch as the headline for the keynote. Is that the highlight, or is it customer activity? >> I think the exciting thing, and Jenny did talk about it, is connecting data to AI. I'd say many clients have kind of thought of those as two different topics. We do that in three ways. We say common machine learning fabric. You can build a model in Watson, you can deploy it where your enterprise data is or vice versa. We do that with the metadata. You create business or technical metadata on-premise, you can push that to Watson or vice versa. And like we just talked about, we make the data movement incredibly easy. So we're uniting these two worlds of data and AI that have tended to be different parts of an organization in many clients. We're uniting that, I think that's pretty interesting.
>> All right, so final question, I've got to ask the tough one, which is, okay, Rob I love it, but I'm really not paying attention to the data because I've got my hands full in my IT transformation and we're making critical decisions on cloud globally, I've got multiple regions to deal with, I got different issues outside in each digital nation, but I'm going to get the data after. What's in it for me, your whole pitch? I'm dealing with cloud right now, so why should I be cross-connecting with the cloud decision and the cloud conversations that relate to the benefit of what you're doing? >> If you're not paying attention to the data, you're not going to be around. So your cloud decisions are kind of worthless, because you're not going to be around if you're not paying attention to the data. >> So I can make a bad cloud decision if I don't factor in what? >> I believe you have to think about your data strategy. Look, every organization is going to be multi-cloud, but you have to have a single data strategy regardless of what your cloud strategy is. You've got to think about all those building blocks I talked about. Manage data, collect data, govern data, analyze data. That has to be one strategy regardless of cloud. If you're not thinking about that, you're in trouble. >> Or making sure that I have Kubernetes? Is that a good decision? >> That is a great decision. >> (laughs) >> Makes it really easy, seamless to deploy applications, to deploy data, to move it around clouds. Makes it really easy. >> And what's the business model for containers? Kind of shifts to being a commodity? >> I think over time, yes, but there's so much to do around containers because containers, again, go back to the analogy. It's just the crate. >> John: Makes things easy. >> It's not the cargo, it's not the ship. It's just the crate, it's one piece. >> Yeah, and there's no, a lot of choice there, too. Clients can do whatever they want. >> Yeah. >> All right, we love Kubernetes. 
We'll be at KubeCon in Copenhagen next month, so keep a lookout there for us. This is Rob Thomas, here inside the Cube, here at IBM Think, breaking down all the action in the data science world, data world. It's the center of the value proposition. Main story here at IBM Think is data at the center of the value proposition for the modern enterprise. I'm John Furrier inside the Cube. We'll be back with more after this short break. (light electronic music)

Published Date : Mar 21 2018


Scott Hebner, IBM | IBM Think 2018


 

>> Announcer: Live, from Las Vegas, It's theCUBE, covering IBM Think 2018. Brought to you by IBM. >> We're back at IBM Think 2018 from Mandalay Bay in Las Vegas. My name is Dave Vellante. I'm here with Peter Burris, my co-host. You're watching theCUBE, the leader in live tech coverage. Scott Hebner's here as the Vice President of Marketing for IBM Analytics. Scott, welcome back, good to see you. >> Thank you, glad to be back again. >> So you heard Jenny this morning, a very inspiring speech. I love her talks. She's really good in front of an audience and one-on-one. What were your takeaways, specifically as it relates to your group? >> Well I think the theme of this whole conference is a lot of these technologies over the years that have been purchased separately and are thought of as separate, quote-unquote, segments, are really all starting to fuse together. They're becoming different facets of the same challenge that a large majority of our clients have. And that is really this evolution towards a more AI-based set of business models, right? There's a stack of things that need to be done to make that successful. You've got to move to the cloud for the agility of it, the economics of it. You got to get more value out of your data, and make your data ready for AI. Then you can start to more effectively train your AI models and allow them to continue to learn and everything. So it all really comes together, and I thought that's what she was framing, of what IBM's trying to do uniquely. >> Yeah, and I think it came across that way. Obviously, this conference is about bringing together all the separate... And your organization is evolving. I mean, when you think about IBM... Go back, Peter, to even the Gerstner days, and he said, "No, we're not going to split up "into a million companies. "We're going to have one face to the customer." And then, obviously, IBM was very successful there. You now had some major changes in the marketplace and you're responding to those.
Yeah, and I think that's exactly right. We're being very customer-driven. One of the great advantages of IBM is that we have so many customers, right? A mix of new ones, a mix of ones we've had for a long time. We have so many people that engage. If you think about the size of IBM and how many are engaged with customers every single day at all levels, from the very most technical to the people that manage relationships, we learn a lot collectively. With all the new technologies, particularly around digital, net promoter score, all these things, we learn a lot about what they're trying to do. And that's what's driving us to fuse these strategies together into a more holistic one. And that's what you heard this morning from Jenny. >> So, I also really enjoyed what I heard this morning from Jenny. It reminded me, though, of one of those television shows where people bring in their old family artifacts, and then people price them. I imagine enterprises today literally looking at their data, the 80% that nobody has visibility to, and finding Grandpa's letter from Abraham Lincoln. >> Yeah. >> And using and discovering that this is a source of value that they've never envisioned before. Is that kind of the mentality, the conversation, that we're having today? >> No, that's exactly right. A large, large majority of CEOs have declared their data to be a strategic asset, but only about 10% of them believe their company treats it that way. And it leads to the statistic that you just referenced, which is 80% of data is either unanalyzed, untrusted, or inaccessible. So they're sitting on a gold mine of data, right? It's not just empirical customer records, but it's increasingly IOT and sensor data. It's behavioral data. There's a gold mine there. Step one is how do you take advantage of that and get more value out of it, right? Just in today's world, right? And then it really becomes fundamental to being successful with artificial intelligence.
You have to have an information architecture. We kind of say if there's no IA, there's no AI. You have to have that information architecture to be successful, and that's really where we're focused on at this conference today, is getting that data ready for AI. >> So getting the data ready for AI, there's a lot that goes into that. But when you consider the notion of data as an asset, and what we heard from Jenny this morning, it seems as though, in many respects, there's kind of two models happening in the industry. You can see if I got this right. Companies that make money off of your data and companies that aren't going to make money off of your data. >> Right. >> Would you agree... I mean, is that kind of how the split is starting to happen in the industry right now? >> Yeah, no, I think that's right. I mean, I think a large majority of our clients are using their data within their firewall to operate their businesses better, better understand their customers. >> No, I learned something different. Yeah, sorry, I apologize. Companies that are going to make money off their customers' data-- >> Yes. >> And companies that are not going to make money off their customers' data. >> Yeah. >> Right? >> Yeah. What I'm saying is... No, I get the question. Different companies have different business models with what they're going to do with their data. Some see it as an asset to run their business more effectively. Others see it as a direct asset that they sell and resell and resell, right? What I'm saying is the majority of the customers we deal with are looking at their data as an asset to run their business better. >> And that's the basis for the argument that the incumbency, that we're entering back into the era of the incumbency because of all these rich assets that aren't currently being utilized. Is that right? >> That's right. >> Great. >> It all starts with the fact that the data is fragmented everywhere. Business partner networks across different databases.
Step one is to make that data simple and accessible. But once you do that, that's not the end of it because you need to make sure that the data that people are using is trusted. You have to have that trusted analytics foundation. So you got to integrate it, replicate it, catalog it, cleanse it, manage its lifecycle. You need to have one version of the truth, right, that everyone works off of, which is a major problem, by the way. It's the whole notion of governance and that falls into other categories like privacy and all the compliance challenges that customers have. Then from there, you have that foundation where you can start to drive more insights out of it through things like machine learning and pattern recognition. As you start to build those skills around data science, it starts to get you really ready for that next step on that ladder to AI. That's where a lot of these customers are figuring out how do I get on this roadmap to AI. And 85% or so say they're going to get there in the next five years. There's a great study from MIT Sloan that came out last year of 3,000 customers and was very clear. The difference between the pioneers that are having success, and those that aren't, is the pioneers have figured out how to make their data ready for AI. It all really starts there. That's really what we're focused on here at the show. >> Let's talk about that incumbent theme. It was part of Jenny's talk this morning. >> Scott: Yup. >> And you're right, the incumbents, their data exists in silos, even though they're maybe data companies, like a bank. >> Scott: Yeah. >> They're organized, perhaps, around their products. Or a manufacturer might be organized around the bottling plant, as you say. Whereas those companies that are AI driven have data at their core. So it's a challenge for the incumbent. >> Huge. >> How are you helping them close that gap, that AI gap, if you will? 
Right, and that's exactly what I was just saying before, is that the data is incredibly dynamic and growing at exponential rates. Not only through what you just mentioned, but there's acquisitions. There's different business partners that evolve through your networks, your client data, things of that nature. >> Dave: And data sources, yeah. >> Data sources are changing. And then you get into the technical layer of all different types of data, from images to empirical data. And then you get into different databases. It becomes a very heterogeneous mess. Step one is to make it simple and accessible. And doing that through big data and being able to view through a single layer all the data as it changes, right? Because if you don't have access to your data, then what are you going to be training your AI algorithms on? And again, from there, you've got to govern it in a way that it's trusted data. This is a huge challenge for customers, because they get different versions of data that tell them different things. Which is the single version of the truth? It's kind of like if you've ever been on a... When you get on a treadmill, your watch says this many steps, your phone says another number of steps, the treadmill says a third number of steps. You're like, how many steps did I really take? They have that challenge every day. When you get that foundation and information architecture together, then you're ready for AI. What this MIT Sloan study showed was that bad data is paralyzing to AI. No matter how sophisticated your algorithmic AI capabilities are, bad data is simply paralyzing. So that's really where it needs to start. To circle back to your point about 80% of data, untrusted, unanalyzed, and inaccessible, that's got to be step one on that ladder to AI. >> So how are we going to use ML, machine learning AI, to help us get our data ready for machine learning AI?
Well, that's exactly what we're doing in the IBM portfolio of data and analytics products, is we have this theme called Machine Learning Everywhere. So it actually is in almost every part of our platform. Hybrid data management uses machine learning to help do a much quicker assessment of how you bring data together and analyze it and things of that nature. We use it in the governance. In fact, we have a technology prototype that we've been working with some customers on, that will do the work for GDPR, the European Compliance Guidelines, in probably a few days to a week versus months and months and months. 'Cause we will go in and do all the entity associations for all your data. Help you organize it in a way that you can actually manage what to do with the compliance. And then, obviously, machine learning is fundamental to just business analytics in general, right, pattern recognition. The traditional analytics tools will help you understand the data as it's presented, based on what you are trying to get out of it. Often, you don't know what you're trying to get out of it. Machine learning gives that data science method of actually uncovering patterns, which you can't really see. >> Peter: Creating models. >> Yeah, creating models and then you add the neural networks to it in deep learning. It's really literally a ladder that you're building that when you get to AI, you're going to be a lot more successful because you've built that trusted foundation underneath it. And I think Jenny was touching on that to some degree this morning. That's what we're majoring on, is that that data is really the key element of AI. >> Scott, who are the roles that you see developing this information architecture, getting ready for AI? CDO, CIO, Chief Digital Officer, where do they all fit? >> Yeah, I think it leads under the CDO.
And actually both CDOs, the chief digital officer and the chief data officer, and their collection of data engineers, data stewards, things of that nature. 'Cause, again, you got to start by getting that information architecture in place. It also involves sort of a new generation of data developers that are building cloud-based data intensive applications, particularly of event-based data, which is a little bit different than customer data, from sensors and all that, where you need those massive ingest speeds. It's those data-driven applications from the cloud that are really starting to incorporate machine learning. So they become really key. Then from there, if you think of it as a collaborative lifecycle, you get into the data scientists that are applying analytics. They're applying a more sophisticated version of mathematical programming and data science. Then there's a new, sort of subset of them, which are the AI developers. It's really from the data engineer right through business analysts. There's a lifecycle of people that are part of that team. They all have to work off a common platform, a common set of trusted data, to be successful. 'Cause you can no longer segment it. >> Is your strategy to build tooling that allows all of those roles to collaborate, maybe not the chief digital and chief data officer, but the data engineer, the data quality engineer, the application developer, the data scientist, right. Is that correct? >> That is absolutely correct and the CDOs. Actually, what we're announcing at the show is a new offer called IBM Cloud Private For Data. >> Dave: Right.
It will provision in minutes a pre-assembled, customized experience for you, based on what your role is. So if you're the CDO and you're the data scientist, and I'm the data engineer, we're all going to have a different set of requirements of what we want to get out of the data and what we're looking to do. It will pre-provision that for you very, very quickly. And you're all working off a common platform. It's collaborative in nature, with dynamic dashboards so you can see what's going on. It's really taking the building blocks that you need to move up that ladder and integrating to microservices in to a cloud platform that is just lightning fast in terms of, not only its ingest speeds of data but, more importantly, the ability to provision new users. So it's a major step forward in making it so much easier, so much more simple to get more out of your data and to get your data ready for AI. >> So, last question. You have this giant portfolio. We just finished our Big Data report. You guys, IBM, came up number one. Well, that was services, but still, you got a lot of software in there as well. >> Scott: Yes, we do. >> You've been working hard to pull those pieces together so the clients, it's simplify data. >> Scott: Yup. >> Okay, here's where are are, 2018, where do you want to take this thing? >> Well, I think, again, I think step one is this unified experiences. Because, again, we were kind of majoring on this conversation about the desegmentation of how people work in a business, what technology, what data they use. 'Cause with AI, it really does need to come together, right? So we're trying to do the same thing for the users, which is provision-based, almost on-demand, what you need based on what you're looking to do. And I think what's going to change as we go through time is it becomes more and more machine learning based, pattern recognition. It's more automated and customized and personalized, based on what you're trying to do. 
That's going to allow businesses to move at a much more rapid pace. And, again, I think the overriding theme when you look over a five year horizon is, is your data ready for AI? And that's where we're moving this whole thing. It's about the data. It's about the people and their skills. And it's the ability to move quickly. That's where the linkage with cloud comes in. >> Getting to pervasive AI, but you got to get your data house in order first. >> You got it. >> Scott Hebner, thanks very much for coming on theCUBE. >> Thank you. >> Great to see you again. >> Great meeting you. >> All right, keep right there everybody. We'll be back with our next guest. You're watching theCUBE at IBM Think 2018.
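An aside on the role-based provisioning Hebner describes above, where each persona (CDO, data scientist, data engineer) gets a pre-assembled environment on a common platform: the idea can be sketched in a few lines. This is a hypothetical illustration only; the role names and component lists below are invented and are not the product's actual catalog.

```python
# Hypothetical sketch of role-based provisioning: each persona maps to a
# pre-assembled set of platform components, so a new user gets a working
# environment without manual assembly. Names are invented for illustration.
ROLE_PROFILES = {
    "data_engineer": ["ingest", "data_catalog", "pipeline_runtime"],
    "data_scientist": ["notebooks", "data_catalog", "model_training"],
    "cdo": ["governance_dashboard", "data_catalog", "quality_metrics"],
}

def provision(role):
    """Return the pre-assembled environment to deploy for a given role."""
    if role not in ROLE_PROFILES:
        raise ValueError(f"unknown role: {role}")
    # Every persona works off the same common platform (note the shared
    # catalog component), with role-specific services layered on top.
    return {"role": role, "components": ROLE_PROFILES[role]}

print(provision("data_scientist"))
# {'role': 'data_scientist', 'components': ['notebooks', 'data_catalog', 'model_training']}
```

The shared `data_catalog` entry is the point of the sketch: different roles get different tooling, but all of it sits on one common, trusted data layer.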

Published Date : Mar 20 2018

Ranjana Young, Northern Trust | IBM Think 2018


 

>> Announcer: Live from Las Vegas, it's The Cube, covering IBM Think 2018, brought to you by IBM. >> Welcome back to The Cube. We are live in sunny Las Vegas at the inaugural IBM Think 2018 event. I'm Lisa Martin with Dave Vellante. Dave, this weather has got to beat Boston hands down, right? >> It was beautiful yesterday, about 15 degrees in Boston, snowy. >> So you thawed out since you've gotten here? >> I took the snowshoes out, actually. Life makes lemons. >> Exactly, and we have another cold-weather guest who's probably thawing out as well, Ranjana Young, the senior vice president of Enterprise Data Services from Northern Trust, welcome. >> Thank you, thanks for having me. >> We're excited to chat with you. You have a role at Northern Trust, and your mission is all around data, with five core competencies, including data governance and stewardship, data quality, master data management, enterprise integration with data platforms. Tell us a little bit about your role, how long you've been doing that, and really what this focus on data is enabling for Northern Trust. >> Sure, I want to talk first about our mission as you had mentioned. I think it was critical to establish a broad mission for Northern Trust. We wanted to make sure that we were establishing an enterprise data program that enabled our customer needs and overall our customer experience, but also truly helped support our regulatory needs that we had, and it was critical to establish those two as the main goals, not just one or the other. And then the role, I call myself a change agent because establishing capabilities that you talked about, it is difficult to do, with a lot of legacy that we have. The firm has been in existence for 128 years. To establish a data-driven culture was very different.
I think we were known to provide good business solutions, but a lot with the gut, given that we were good at it, but how do you make sure that you change that culture and have relationship managers and others really think differently and use data to provide those solutions to our clients. >> I remember when I met Inderpal Bhandari, I'm sure you know him, and he said that he has a framework for a data leader, and he said there are five things a data leader has to do to get started, and three are in parallel, or sorry, three are linear, two are in parallel. I don't know if you've heard this rap, but I'd like to sort of explore them and see how yours compare generally. He said you start with understanding how the organization monetizes data, not directly, maybe selling data, but how it contributes, and then the next one was sort of data access and then data quality. Those are the sort of sequential activities, and then the parallel ones were form relationships with a line of business and then re-skill. So those are his five. How did you approach it, what was different, what was similar, what were some of the challenges that you had in doing that? >> Sure. If I had to think about kind of, to correlate some of the components of the strategy, skills is an important thing. When I started establishing the team three years ago, it was critical that we had to bring some of the core skills within the firm because they had the business capabilities, they understood the systems, they understood kind of the skeletons that were in the closets and knew the culture and also embraced the challenges and still could find solutions. And then you had to bring external folks that really had the capability to drive that change, had the master data management skills to really support and set up an account domain and a party domain, a reference data domain, especially an asset domain, et cetera. So we had to look at kind of a conglomerate of individuals to do that.
And then if you look at kind of where the starting point was in terms of really establishing the program, we were going through a transformation to really re-platform a lot of our legacy, whether it was our valuation system or our cash platform, others, and data was a thread throughout all of those programs, so it was critical to establish and think and take bite-sized chunks, it was important to think about, okay, throughout all the programs, what is the important data that we could kind of understand, so we focused quite a bit on initially looking at critical data and looking at critical data from a master data perspective, so asset data, which is very critical to the work that we do on the institutional side. As you know, we're an asset management and asset servicing company. Data is an asset for us, we enrich the data. We provide services around that today, and have been, and so embedding data governance through that process was important, and also our clients were really looking for the enriched data but also were looking for clean information but also were looking for where did that data come from? Where's the definition of this data? So kind of giving them that external catalog of here's the data, but here's the enriched data and here's the metrics for data quality around it, and then here's the definitions for it. So to some extent, that drove change because customers were looking for it, and a lot of the capabilities that were foundational to the firm, we're starting to externalize, especially the metadata catalog, et cetera. >> So if I could play that back, so you started the team, all right, you said, okay, I need to build a team. I think I heard that, and then the data quality, and then presumably, okay, who has access to this data? Is that about right?
>> So I started with the mission to say, we have to do this for both arms, the left arm being our customer experience and making sure that we change the way we're doing our work there, or enhance the work so that our customer experience was better, and then obviously the regulatory, make sure that we meet the regulatory needs. So for that, we needed five core competencies. We knew that we had to establish a role of the steward, a role of the custodian, so the team started to become very critical then, and then we knew that we had some gaps in our master data management capability, a complete gap in having integrated data platforms. I know we've talked a little bit about how ING established a whole strategy and architecture. I totally relate to how we had to do the same. Each silo did their own particular thing. The management did their own thing. >> David: By data. >> The institutional side did their own thing. Asset management was, I would say, a lot more mature. So I would say if you were to think about it, it's establishing the mission and establishing the team. >> And then, just one last follow-up. The services that you're providing, data services, those are delivered through your organization, the IT organization, what's the practice? >> We have a partnership, a very collaborative partnership where we work together. The technology team does all the build for the work, we work collaboratively to kind of build a strategy of what solutions need to be first versus later, given the client priorities and our institutional side, our business unit priorities, so that's a collaborative effort, working together. >> So speaking of collaboration, you mentioned earlier that it was really key to have both the veterans within Northern Trust, with their expertise, who, as you said, know where the skeletons are buried, as well as maybe that external, you might say fresher, perspective. You also talked about, we chatted before we went live, about governance.
Seems like what you guys have done is kind of flipped governance from being viewed as potentially an inhibitor to really empowering, being an empowering capability. Can you tell us how you've leveraged data governance to empower a data-driven culture within a business that is 128, I think, years old, you said? >> Yes, that's right. So, for us, I think that while we were establishing the program, it was very critical to understand kind of the challenges on the institutional side first because they had the maximum number of challenges with data. Again, because we're an asset servicing company, data is an asset, we enrich that information and provide that information, but what was happening was it was taking us so much longer to provide these solutions to our clients, so we've embedded, now, the data governance framework as a part of that solution, and our clients are seeing the value, so if you look at one of the customers that we're working with, we actually have externalized our catalog where they understand now what data they're receiving, and you're speaking the same language, and that was not the case before. But again, as I said, if we didn't do the foundational work of cataloging the information, understanding what the data is, where the data is, what the data assets are, we just couldn't have done that, so it's really paying off because of that. >> How has that affected your ability to be prepared for GDPR, which obviously went into effect last year, the fines go into effect in May of this year? What was the relationship there? >> So we have worked very, very closely with our chief privacy officer, and we've really done a phenomenal job of identifying where our highly sensitive data assets are.
We're in the process of cataloging all of them through the unified governance framework that we've established, so we leverage IBM's IGC and IA to do all that work, and the lineage all the way to the authentic source, which is something the regulators definitely are looking for, so are we fully, completely done yet? No, so we're in that journey, and with unstructured data, we're looking at discovery tools to kind of provide that. We have a solution that's a little manual at this point, but we hope to kind of make more progress on that side. >> I got to ask you, so around 17%, the data suggests, of the IT technology industry is women, but I was at an IBM, it was a Data Divas breakfast that I crashed, I snuck in, one of the few guys there.
>> Oh, very cool. And there was a stat that around 30% of data leaders are women, I don't know, it was a sort of a small sample, who knows? Sounded a little high. Somebody said it's because it's a thankless job and women have to take it on, so thoughts on women in tech, women in this role, perspectives. >> So I am excited to meet a few here at the conference. That statistic is pretty high that you're stating. I don't see that. >> David: It's outside that. >> In the industry, I do find myself sometimes as a lone warrior, at least in the industry forums, but I think it's growing. I think especially women in technology, women in leadership on the line of business side is growing, and Northern Trust, I'm very proud to say, is big around diversity and providing opportunities to women, so from that perspective, I think I'm excited that women are taking interest in data, yes, it is a very hard job, so I think, I feel like we are organized, we get a lot done at the same time, so I think it's really helped. >> Other than it's the right thing to do, are there other sort of business dimensions? Is it Mars versus Venus? Are there sort of enrichments that a woman leader brings to the equation, or is it just because it's the right thing to do?
>> I've seen the tenacity women have. No offense to anyone, I think they have a higher tenacity, to be persistent. >> I don't take offense. >> To be methodical, to be methodical, and also to have the hard discussions in a very factual way sometimes, but also in, yes, this is the right thing to do, but are there ways we could make this change happen in a systematic, bite-size chunk way. Sometimes I think those courageous conversations help a lot more than the others, and I think, to me, I would say tenacity, tenacity.
>> I love that word. I have to say, that's a word that's oftentimes associated with males. A lot of times a tenacious woman, it's a different adjective, right? It's a term, I don't know, Lisa, what your experience has been, so that's good, a good choice of words in my view. >> I've heard pushy before, and I think what they really meant >> David: There you go, okay. >> Is persistence. (laughs) >> That's right. >> A man is tenacious, a woman is pushy. You hear that a lot. >> Right, I think it's persistence. So last question for you. Here we are at the inaugural IBM Think 2018. You guys are an IBM Analytics Global Elite Partner. Can you talk to us a little bit about that strategic partnership and what it means for Northern Trust? >> This partnership has really helped us tremendously in the last three years while we were putting the strategy into action, while operationalizing data governance, while operationalizing a lot of the capabilities we thought we would have but really kind of bringing that to life. We're also really excited because a lot of the feedback that we've provided has gone into kind of redoing some of the products within IBM, so we've definitely partnered and done a lot of testing on some of the beta versions, and it's also helped us, I think, sometimes it's been like a marriage.
We've had hard times getting through certain hurdles, but it really has paid off, and I think the other thing is we've really operationalized governance to the core at Northern Trust. I think IBM is also seeing value in sharing our story with others because others have started the journey but may have taken certain different approaches to making that happen, so all in all, I think that the unified governance framework has really helped us, and I think we really love the partnership. >> As a client, what's on their to-do list? What's on IBM's to-do list for you? >> So I think one of the things that we've been talking about quite a bit is we have a new CIO, and he's really interested in the cloud strategy, I know you've been talking about that. Again, we're a bank, so due to regulation there's strategies in terms of private versus public cloud. That's one conversation we'll definitely want to take further. We want more integrated tooling within the unified governance platform. That's something that's been a topic that we've discussed quite a bit with them. AI, machine learning, robotics is huge for us, so how do we leverage Watson much more? We've done a few POCs, how do we really operationalize and make sure that that's something that we do more of, so I think I would say those three. >> So sounds like a very symbiotic relationship. >> Ranjana: It is. >> Slash marriage that you have. Ranjana, we want to thank you for joining us and sharing how really kind of you're exhibiting the term change agent in a tenacious way. >> Okay, thank you. >> I feel like I want to say I'm flanked between two data divas, you don't take offense at that, do you? >> No, not at all. It's a compliment. >> You crashed an event. I'm seeing a new >> I like that. >> Twitter handle come up here. We want to thank you so much again for stopping by and sharing. Congrats on your success, and we hope you have a great time here. Enjoy the sunshine! Maybe bring some back to Chicago.
>> Will do, will do, yeah. Thanks again, very much. >> And for Dave Vellante, I'm Lisa Martin. We want to encourage you to check out thecube.net to watch all of the videos that we have done so far and will be doing at IBM Think 2018, and of course all of the shows that we do. Also, head over to siliconangle.com. That's our media site where you're going to find, pretty much in near real time, synopses and stories on not just what we're doing here but everything around the globe. Again, for Dave Vellante, I'm Lisa Martin, live from IBM Think 2018 in Vegas. We'll be right back after a short break with our next guest.
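An aside on the catalog idea Young describes in this interview, publishing the enriched data alongside metrics for data quality around it: a minimal version of such a metric is column completeness, sketched below. The field names and records are invented for illustration; real asset-servicing data-quality tooling measures far more (validity, timeliness, lineage).

```python
# Minimal sketch of a per-column data-quality metric published alongside
# a catalog entry: completeness, the share of records where a field is
# present and non-empty. Field names and records are illustrative only.
def completeness(records, field):
    """Fraction of records where `field` is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)

assets = [
    {"cusip": "037833100", "issuer": "ACME"},
    {"cusip": "", "issuer": "Globex"},
    {"cusip": "594918104", "issuer": None},
]

metrics = {f: round(completeness(assets, f), 2) for f in ("cusip", "issuer")}
print(metrics)
# {'cusip': 0.67, 'issuer': 0.67}
```

Publishing a number like this next to each catalog entry is what lets a client see not just the data and its definition, but how trustworthy each field currently is.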

Published Date : Mar 19 2018


Seth Dobrin, IBM | Big Data SV 2018


 

>> Announcer: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to theCUBE's continuing coverage of our own event, Big Data SV. I'm Lisa Martin, with my cohost Dave Vellante. We're in downtown San Jose at this really cool place, Forager Eatery. Come by, check us out. We're here tomorrow as well. We're joined by, next, one of our CUBE alumni, Seth Dobrin, the Vice President and Chief Data Officer at IBM Analytics. Hey, Seth, welcome back to theCUBE. >> Hey, thanks for having me again. Always fun being with you guys. >> Good to see you, Seth. >> Good to see you. >> Yeah, so last time you were chatting with Dave and company was back in the fall at the Chief Data Officers Summit. What's kind of new with you in IBM Analytics since then? >> Yeah, so the Chief Data Officers Summit, I was talking with one of the data governance people from TD Bank and we spent a lot of time talking about governance. Still doing a lot with governance, especially with GDPR coming up. But really started to ramp up my team to focus on data science, machine learning. How do you do data science in the enterprise? How is it different from doing a Kaggle competition, or someone getting their PhD or Masters in Data Science? >> Just quickly, who is your team composed of in IBM Analytics? >> So IBM Analytics represents, think of it as our software umbrella, so it's everything that's not pure cloud or Watson or services. So it's all of our software franchise. >> But in terms of roles and responsibilities, data scientists, analysts. What's the mixture of-- >> Yeah. So on my team I have a small group of people that do governance, and so they're really managing our GDPR readiness inside of IBM in our business unit. And then the rest of my team is really focused on this data science space.
And so this is set up from the perspective of we have machine-learning engineers, we have predictive-analytics engineers, we have data engineers, and we have data journalists. And that's really focused on helping IBM and other companies do data science in the enterprise. >> So what's the dynamic amongst those roles that you just mentioned? Is it really a team sport? I mean, initially it was the data scientist on a pedestal. Have you been able to attack that problem? >> So I know a total of two people that can do that all themselves. So I think it absolutely is a team sport. And it really takes a data engineer or someone with deep expertise in there, that also understands machine learning, to really build out the data assets, engineer the features appropriately, provide access to the model, and ultimately to what you're going to deploy, right? Because the way you do it as a research project or an activity is different than using it in real life, right? And so you need to make sure the data pipes are there. And when I look for people, I actually look for a differentiation between machine-learning engineers and optimization. I don't even post for data scientists because then you get a lot of data scientists, right? People who aren't really data scientists, and so if you're specific and ask for machine-learning engineers or decision optimization, OR-type people, you really get a whole different crowd in. But the interplay is really important because for most machine-learning use cases you want to be able to give information about what you should do next. What's the next best action? And to do that, you need decision optimization. >> So in the early days of when we, I mean, data science has been around forever, right? We always hear that. But in the, sort of, more modern use of the term, you never heard much about machine learning. It was more like stats, math, some programming, data hacking, creativity. And then now, machine learning sounds fundamental.
Is that a new skillset that the data scientists had to learn? Did they get them from other parts of the organization? >> I mean, when we talk about math and stats, what we call machine learning today is what we've been doing with statistics for years, right? I mean, a lot of the same things we apply in what we call machine learning today I did during my PhD 20 years ago, right? It was just with a different perspective. And you applied those types of, they were more static, right? So I would build a model to predict something, and it was only for that. It really didn't apply beyond that, so it was very static. Now, when we're talking about machine learning, I want to understand Dave, right? And I want to be able to predict Dave's behavior in the future, and learn how you're changing your behavior over time, right? So one of the things that a lot of people don't realize, especially senior executives, is that machine learning creates a self-fulfilling prophecy. You're going to drive a behavior so your data is going to change, right? So your model needs to change. And so that's really the difference between what you think of as stats and what we think of as machine learning today. So what we were looking for years ago is all the same, we just described it a little differently. >> So how fine is the line between a statistician and a data scientist? >> I think any good statistician can really become a data scientist. There's some issues around data engineering and things like that but if it's a team sport, I think any really good, pure mathematician or statistician could certainly become a data scientist. Or machine-learning engineer. Sorry. >> I'm interested in it from a skillset standpoint. You were saying how you're advertising to bring on these roles. I was at the Women in Data Science Conference with theCUBE just a couple of days ago, and we hear so much excitement about the role of data scientists. It's so horizontal.
People have the opportunity to make an impact in policy change, healthcare, etc. So the hard skills, the soft skills, mathematician, what are some of the other elements that you would look for, or that companies, enterprises that need to learn how to embrace data science, should look for? Someone that's not just a mathematician but someone that has communication skills, collaboration, empathy, what are some of those, openness, to not lead data down a certain, what do you see as the right mix there of a data scientist? >> Yeah, so I think that's a really good point, right? It's not just the hard skills. When my team goes out, because part of what we do is we go out and sit with clients and teach them our philosophy on how you should integrate data science in the enterprise. A good part of that is sitting down and understanding the use case. And working with people to tease out, how do you get to this ultimate use case, because any problem worth solving is not one model, any use case is not one model, it's many models. How do you work with the people in the business to understand, okay, what's the most important thing for us to deliver first? And it's almost a negotiation, right? Talking them back. Okay, we can't solve the whole problem. We need to break it down into discrete pieces. Even when we break it down into discrete pieces, there's going to be a series of sprints to deliver that. Right? And so having these soft skills to be able to tease that out in a way, and really help people understand that their way of thinking about this may or may not be right. And doing that in a way that's not offensive. And there's a lot of really smart people that can say that, but they can come across as being offensive, so those soft skills are really important. >> I'm going to talk about GDPR in the time we have remaining. We talked about it in the past, the clock's ticking, in May the fines go into effect.
The relationship between data science, machine learning, GDPR, is it going to help us solve this problem? This is a nightmare for people. And many organizations aren't ready. Your thoughts. >> Yeah, so I think there's some aspects that we've talked about before. How important it's going to be to apply machine learning to your data to get ready for GDPR. But I think there's some aspects that we haven't talked about before here, and that's around what impact does GDPR have on being able to do data science, and being able to implement data science. So one of the aspects of the GDPR is this concept of consent, right? So it really requires consent to be understandable and very explicit. And it allows people to be able to retract that consent at any time. And so what does that mean when you build a model that's trained on someone's data? If you haven't anonymized it properly, do I have to rebuild the model without their data? And then it also brings up some points around explainability. So you need to be able to explain your decision, how you used analytics, how you got to that decision, to someone if they request it. To an auditor if they request it. Traditional machine learning, that's not too much of a problem. You can look at the features and say these features, this contributed 20%, this contributed 50%. But as you get into things like deep learning, this concept of explainable AI or XAI becomes really, really important. And there were some talks earlier today at Strata about how you apply machine learning, traditional machine learning, to interpret your deep learning or black box AI. So that's really going to be important, those two things, in terms of how they affect data science. >> Well, you mentioned the black box. I mean, do you think we'll ever resolve the black box challenge? Or is it really that people are just going to be comfortable that what happens inside the box, how you got to that decision is okay? >> So I'm inherently both cynical and optimistic.
(chuckles) But I think there's a lot of things we looked at five years ago and we said there's no way we'll ever be able to do them that we can do today. And so while I don't know how we're going to get to be able to explain this black box through XAI, I'm fairly confident that in five years, this won't even be a conversation anymore. >> Yeah, I kind of agree. I mean, somebody said to me the other day, well, it's really hard to explain how you know it's a dog. >> Seth: Right (chuckles). But you know it's a dog. >> But you know it's a dog. And so, we'll get over this. >> Yeah. >> I love that you just brought up dogs as we're ending. That's my favorite thing in the world, thank you. Yes, you knew that. Well, Seth, I wish we had more time, and thanks so much for stopping by theCUBE and sharing some of your insights. Look forward to the next update in the next few months from you. >> Yeah, thanks for having me. Good seeing you again. >> Pleasure. >> Nice meeting you. >> Likewise. We want to thank you for watching theCUBE live from our event Big Data SV down the street from the Strata Data Conference. I'm Lisa Martin, for Dave Vellante. Thanks for watching, stick around, we'll be right back after a short break.
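A closing aside on the two GDPR mechanics Dobrin raises in this interview: retracting consent can force retraining a model without that person's data, and a decision may need to be explained as per-feature contributions. Both can be sketched minimally with a toy model. Everything below, the "model", the weights, the names, is invented for illustration; it is not how any particular product implements this.

```python
# Hedged sketch of two GDPR mechanics: (1) drop a user's rows when
# consent is withdrawn and refit; (2) explain a linear score as
# per-feature contributions. All data and names are illustrative.

def fit_mean_model(rows):
    """'Train' a trivial model: the per-feature mean across rows."""
    n = len(rows)
    return {k: sum(r[k] for r in rows) / n for k in rows[0]}

def withdraw_consent(rows, owners, user):
    """Remove every row belonging to `user`, then refit the model."""
    kept = [r for r, o in zip(rows, owners) if o != user]
    return fit_mean_model(kept)

def explain(weights, features):
    """Per-feature contribution of a linear score: weight * value."""
    return {k: weights[k] * features[k] for k in weights}

rows = [{"income": 40.0, "tenure": 2.0},
        {"income": 80.0, "tenure": 6.0},
        {"income": 60.0, "tenure": 4.0}]
owners = ["alice", "bob", "carol"]

# bob retracts consent: the model is rebuilt from alice + carol only.
model = withdraw_consent(rows, owners, "bob")
print(model)  # {'income': 50.0, 'tenure': 3.0}

# Explaining one scored record as contributions, in the spirit of
# "this feature contributed 20%, this contributed 50%".
weights = {"income": 0.5, "tenure": 2.0}
print(explain(weights, rows[0]))  # {'income': 20.0, 'tenure': 4.0}
```

The point of the sketch is structural: if training data is keyed to its owner, consent withdrawal is a filter plus a refit, and a linear model's decision decomposes cleanly into contributions, which is exactly what gets hard once the model is a deep network.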

Published Date : Mar 8 2018


Rob Thomas, IBM | Machine Learning Everywhere 2018


 

>> Announcer: Live from New York, it's theCUBE, covering Machine Learning Everywhere: Build Your Ladder to AI, brought to you by IBM. >> Welcome back to New York City. theCUBE continues our coverage here at IBM's event, Machine Learning Everywhere: Build Your Ladder to AI. And with us now is Rob Thomas, who is the vice president of, or general manager, rather, of IBM Analytics. Sorry about that, Rob. Good to have you with us this morning. Good to see you, sir. >> Great to see you, John. Dave, great to see you as well. >> Great to see you. >> Well let's just talk about the event first. Great lineup of guests. We're looking forward to visiting with several of them here on theCUBE today. But let's talk about, first off, the general theme of what you're trying to communicate and where you sit in terms of that ladder to success in the AI world. >> So, maybe start by stepping back to, we saw you guys a few times last year. Once in Munich, I recall, another one in New York, and the theme of both of those events was data science renaissance. We started to see data science picking up steam in organizations. We also talked about machine learning. The great news is that, in that timeframe, machine learning has really become a real thing in terms of actually being implemented into organizations, and changing how companies run. And that's what today is about, is basically showcasing a bunch of examples, not only from our clients, but also from within IBM, how we're using machine learning to run our own business. And the thing I always remind clients when I talk to them is, machine learning is not going to replace managers, but managers that use machine learning will replace managers that do not. And what you see today is a bunch of examples of how that's true, because it gives you superpowers. If you've automated a lot of the insight, data collection, decision making, it makes you a more powerful manager, and that's going to change a lot of enterprises.
>> It seems like a no-brainer, right? I mean, or a must-have. >> I think there's a, there's always that, sometimes there's a fear factor. There is a culture piece that holds people back. We're trying to make it really simple in terms of how we talk about the day, and the examples that we show, to get people comfortable, to kind of take a step onto that ladder back to the company. >> It's conceptually a no-brainer, but it's a challenge. You wrote a blog and it was really interesting. It was, one of the clients said to you, "I'm so glad I'm not in the technology industry." And you went, "Uh, hello?" (laughs) "I've got news for you, you are in the technology industry." So a lot of customers that I talk to feel like, meh, you know, in our industry, it's really not getting disrupted. That's kind of taxis and retail. We're in banking and, you know, but, digital is disrupting every industry and every industry is going to have to adopt ML, AI, whatever you want to call it. Can traditional companies close that gap? What's your take? >> I think they can, but, I'll go back to the word I used before, it starts with culture. Am I accepting that I'm a technology company, even if traditionally I've made tractors, as an example? Or if traditionally I've just been you know, selling shirts and shoes, have I embraced the role, my role as a technology company? Because if you set that culture from the top, everything else flows from there. It can't be, IT is something that we do on the side. It has to be a culture of, it's fundamental to what we do as a company. There was an MIT study that said, data-driven cultures drive productivity gains of six to 10 percent better than their competition. You can't, that stuff compounds, too. So if your competitors are doing that and you're not, not only do you fall behind in the short term but you fall woefully behind in the medium term. And so, I think companies are starting to get there but it takes a constant push to get them focused on that. 
>> So if you're a tractor company, you've got human expertise around making tractors and messaging and marketing tractors, and then data is kind of there, sort of a bolt-on, because everybody's got to be data-driven, but if you look at the top companies by market cap, you know, we were talking about it earlier, data is foundational. It's at their core, so, that seems to me to be the hard part, Rob, I'd like you to comment in terms of that cultural shift. How do you go from sort of data in silos and, you know, not having cloud economics and, that are fundamental, to having that dynamic, and how does IBM help? >> You know, I think, to give companies credit, I think most organizations have developed some type of data practice or discipline over the last, call it five years. But most of that's historical, meaning, yeah, we'll take snapshots of history. We'll use that to guide decision making. You fast-forward to what we're talking about today, just so we're on the same page, machine learning is about, you build a model, you train a model with data, and then as new data flows in, your model is constantly updating. So your ability to make decisions improves over time. That's very different from, we're doing historical reporting on data. And so I think it's encouraging that companies have kind of embraced that data discipline in the last five years, but what we're talking about today is a big next step, and we're trying to break it down into what I call the building blocks. So, back to the point on an AI ladder, what I mean by an AI ladder is, you can't do AI without machine learning. You can't do machine learning without analytics. You can't do analytics without the right data architecture. So those become the building blocks of how you get towards a future of AI. And so what I encourage companies to do is, if you're not ready for that AI leading edge use case, that's okay, but you can be preparing for that future now. That's what the building blocks are about.
>> You know, I think we're, I know we're ahead of, you know, Jeremiah Owyang on a little bit later, but I was reading something that he had written about gut and instinct, from the C-Suite, and how, that's how companies were run, right? You had your CEO, your president, they made decisions based on their guts or their instincts. And now, you've got this whole new objective tool out there that's gold, and it's kind of taking some of the gut and instinct out of it, in a way, and maybe there are people who still can't quite grasp that, that maybe their guts and their instincts, you know, what their gut tells them, you know, is one thing, but there's pretty objective data that might indicate something else. >> Moneyball for business. >> A little bit of a clash, I mean, is there a little bit of a clash in that respect? >> I think you'd be surprised by how much decision making is still pure opinion. I mean, I see that everywhere. But we're heading more towards what you described for sure. One of the clients talking here today, AMC Networks, I think it's a great example of a company that you wouldn't think of as a technology company, primarily a content producer, they make great shows, but they've kind of gone that extra step to say, we can integrate data sources from third parties, our own data about viewer habits, we can do that to change our relationship with advertisers. Like, that's a company that's really embraced this idea of being a technology company, and you can see it in their results, and so, results are not a coincidence in this world anymore. It's about a practice applied to data, leveraging machine learning, on a path towards AI. If companies are doing that, they're going to be successful. >> And we're going to have the tally from AMC on, but so there's a situation where they have embraced it, that they've dealt with that culture, and data has become foundational. Now, I'm interested as to what their journey looked like. What are you seeing with clients?
How they break this down, the silos of data that have been built up over decades. >> I think, so they get almost like a maturity curve. You've got, and the rule I talk about is 40-40-20, where 40% of organizations are really using data just to optimize costs right now. That's okay, but that's on the lower end of the maturity curve. 40% are saying, all right, I'm starting to get into data science. I'm starting to think about how I extend to new products, new services, using data. And then 20% are on the leading edge. And that's where I'd put AMC Networks, by the way, because they've done unique things with integrating data sets and building models so that they've automated a lot of what used to be painstakingly long processes, internal processes to do it. So you've got this 40-40-20 of organizations in terms of their maturity on this. If you're not on that curve right now, you have a problem. But I'd say most are somewhere on that curve. If you're in the first 40% and you're, right now data for you is just about optimizing cost, you're going to be behind. If you're not right now, you're going to be behind in the next year, that's a problem. So I'd kind of encourage people to think about what it takes to be in the next 40%. Ultimately you want to be in the 20% that's actually leading this transformation. >> So change it to 40-20-40. That's where you want it to go, right? You want to flip that paradigm. >> I want to ask you a question. You've done a lot of M and A in the past. You spent a lot of time in Silicon Valley and Silicon Valley obviously very, very disruptive, you know, cultures and organizations and it's always been a sort of technology disruption. It seems like there's a ... another disruption going on, not just horizontal technologies, you know, cloud or mobile or social, whatever it is, but within industries. Some industries, as we've been talking, radically disrupted. Retail, taxis, certainly advertising, et cetera et cetera. 
Some have not yet, the client that you talked to. Do you see, technology companies generally, Silicon Valley companies specifically, as being able to pull off a sort of disruption of not only technologies but also industries and where does IBM play there? You've made a sort of, Ginni in particular has made a deal about, hey, we're not going to compete with our customers. So talking about this sort of dual disruption agenda, one on the technology side, one within industries that Apple's getting into financial services and, you know, Amazon getting into grocery, what's your take on that and where does IBM fit in that world? >> So, I mean, IBM has been in Silicon Valley for a long time, I would say probably longer than 99.9% of the companies in Silicon Valley, so, we've got a big lab there. We do a lot of innovation out of there. So love it, I mean, the culture of the valley is great for the world because it's all about being the challenger, it's about innovation, and that's tremendous. >> No fear. >> Yeah, absolutely. So, look, we work with a lot of different partners, some who are, you know, purely based in the valley. I think they challenge us. We can learn from them, and that's great. I think the one, the one misnomer that I see right now, is there's a undertone that innovation is happening in Silicon Valley and only in Silicon Valley. And I think that's a myth. Give you an example, we just, in December, we released something called Event Store which is basically our stab at reinventing the database business that's been pretty much the same for the last 30 to 40 years. And we're now ingesting millions of rows of data a second. We're doing it in a Parquet format using a Spark engine. Like, this is an amazing innovation that will change how any type of IOT use case can manage data. Now ... people don't think of IBM when they think about innovations like that because it's not the only thing we talk about. 
We don't have, the IBM website isn't dedicated to that single product because IBM is a much bigger company than that. But we're innovating like crazy. A lot of that is out of what we're doing in Silicon Valley and our labs around the world and so, I'm very optimistic on what we're doing in terms of innovation. >> Yeah, in fact, I think, rephrase my question. I was, you know, you're right. I mean people think of IBM as getting disrupted. I wasn't posing it, I think of you as a disruptor. I know that may sound weird to some people but in the sense that you guys made some huge bets with things like Watson on solving some of the biggest, world's problems. And so I see you as disrupting sort of, maybe yourselves. Okay, frame that. But I don't see IBM as saying, okay, we are going to now disrupt healthcare, disrupt financial services, rather we are going to help our, like some of your comp... I don't know if you'd call them competitors. Amazon, as they say, getting into content and buying grocery, you know, food stores. You guys seems to have a different philosophy. That's what I'm trying to get to is, we're going to disrupt ourselves, okay, fine. But we're not going to go hard into healthcare, hard into financial services, other than selling technology and services to those organizations, does that make sense? >> Yeah, I mean, look, our mission is to make our clients ... better at what they do. That's our mission, we want to be essential in terms of their journey to be successful in their industry. So frankly, I love it every time I see an announcement about Amazon entering another vertical space, because all of those companies just became my clients. Because they're not going to work with Amazon when they're competing with them head to head, day in, day out, so I love that. 
So us working with these companies to make them better through things like Watson Health, what we're doing in healthcare, it's about making companies who have built their business in healthcare, more effective at how they perform, how they drive results, revenue, ROI for their investors. That's what we do, that's what IBM has always done. >> Yeah, so it's an interesting discussion. I mean, I tend to agree. I think Silicon Valley maybe should focus on those technology disruptions. I think that they'll have a hard time pulling off that dual disruption and maybe if you broadly define Silicon Valley as Seattle and so forth, but, but it seems like that formula has worked for decades, and will continue to work. Other thoughts on sort of the progression of ML, how it gets into organizations. You know, where you see this going, again, I was saying earlier, the parlance is changing. Big data is kind of, you know, mm. Okay, Hadoop, well, that's fine. We seem to be entering this new world that's pervasive, it's embedded, it's intelligent, it's autonomous, it's self-healing, it's all these things that, you know, we aspire to. We're now back in the early innings. We're late innings of big data, that's kind of ... But early innings of this new era, what are your thoughts on that? >> You know, I'd say the biggest restriction right now I see, we talked before about somehow, sometimes companies don't have the desire, so we have to help create the desire, create the culture to go do this. Even for the companies that have a burning desire, the issue quickly becomes a skill gap. And so we're doing a lot to try to help bridge that skill gap. Let's take data science as an example. There's two worlds of data science that I would describe. There's clickers, and there's coders. Clickers want to do drag and drop. They will use traditional tools like SPSS, which we're modernizing, that's great. We want to support them if that's how they want to work and build models and deploy models. 
There's also this world of coders. These are people who want to do all their data science in ML, and Python, and Scala, and R, like, that's what they want to do. And so we're supporting them through things like Data Science Experience, which is built on Apache Jupyter. It's all open source tooling, it's designed for coders. The reason I think that's important, it goes back to the point on skill sets. There is a skill gap in most companies. So if you walk in and you say, this is the only way to do this thing, you've kind of excluded half the companies, because they say, I can't play in that world. So we are intentionally going after a strategy that says, there's a segmentation in skill types. In places there's a gap, we can help you fill that gap. That's how we're thinking about them. >> And who does that bode well for? If you say that you were trying to close a gap, does that bode well for, we talked about the Millennial crowd coming in and so they, you know, do they have a different approach or different mental outlook on this, or is it the mid-range employee, you know, who is open minded, I mean, but, who is the net sweet spot, you think, that says, oh, this is a great opportunity right now? >> So just take data science as an example. The clicker coder comment I made, I would put the clicker audience as mostly people that are 20 years into their career. They've been around a while. The coder audience is all the Millennials. It's all the new audience. I think the greatest beneficiary is the people that find themselves kind of stuck in the middle, which is they're kind of interested in this ... >> They straddle both sides of the line, yeah? >> But they've got the skill set and the desire to do some of the new tooling and new approaches. So I think this kind of creates an opportunity for that group in the middle to say, you know, what am I going to adopt as a platform for how I go forward and how I provide leadership in my company?
>> So your advice, then, as you're talking to your clients, I mean you're also talking to their workforce. In a sense, then, your advice to them is, you know, join, jump in the wave, right? You've got your, you can't straddle, you've got to go. >> And you've got to experiment, you've got to try things. Ultimately, organizations are going to gravitate to things that they like using in terms of an approach or a methodology or a tool. But that comes with experimentation, so people need to get out there and try something. >> Maybe we could talk about developers a little bit. We were talking to Dinesh earlier and you guys of course have focused on data scientists, data engineers, obviously developers. And Dinesh was saying, look, many, if not most, of the 10 million Java developers out there, they're not, like, focused around the data. That's really the data scientist's job. But then, my colleague John Furrier says, hey, data is the new development kit. You know, somebody said recently, you know, Andreessen's comment, "software is eating the world." Well, data is eating software. So if Furrier is right and that comment is right, it seems like developers increasingly have to become more data aware, fundamentally. Blockchain developers clearly are more data focused. What's your take on the developer community, where they fit into this whole AI, machine learning space? >> I was just in Las Vegas yesterday and I did a session with a bunch of our business partners. ISVs, so software companies, mostly a developer audience, and the discussion I had with them was around, you're doing, you're building great products, you're building great applications. But your product is only as good as the data and the intelligence that you embed in your product. Because you're still putting too much of a burden on the user, as opposed to having everything happen magically, if you will. 
So that discussion was around, how do you embed data, embed AI, into your products and do that at the forefront versus, you deliver a product and the client has to say, all right, now I need to get my data out of this application and move it somewhere else so I can do the data science that I want to do. That's what I see happening with developers. It's kind of ... getting them to think about data as opposed to just thinking about the application development framework, because that's where most of them tend to focus. >> Mm, right. >> Well, we've talked about, well, earlier on about the governance, so just curious, with Madhu, which I'll, we'll have that interview in just a little bit here. I'm kind of curious about your take on that, is that it's a little kinder, gentler, friendlier than maybe some might look at it nowadays because of some organization that it causes, within your group and some value that's being derived from that, that more efficiency, more contextual information that's, you know, more relevant, whatever. When you talk to your clients about meeting rules, regs, GDPR, all these things, how do you get them to see that it's not a black veil of doom and gloom but it really is, really more of an opportunity for them to cash in? >> You know, my favorite question to ask when I go visit clients is I say, I say, just show of hands, how many people have all the data they need to do their job? To date, nobody has ever raised their hand. >> Not too many hands up. >> The reason I phrased it that way is, that's fundamentally a governance challenge. And so, when you think about governance, I think everybody immediately thinks about compliance, GDPR, types of things you mentioned, and that's great. But there's two use cases for governance. One is compliance, the other one is self service analytics. 
Because if you've done data governance, then you can make your data available to everybody in the organization because you know you've got the right rules, the right permissions set up. That will change how people do their jobs and I think sometimes governance gets painted into a compliance corner, when organizations need to think about it as, this is about making data accessible to my entire workforce. That's a big change. I don't think anybody has that today. Except for the clients that we're working with, where I think we've made good strides in that. >> What's your sort of number one, two, and three, or pick one, advice for those companies that as you blogged about, don't realize yet that they're in the software business and the technology business? For them to close the ... machine intelligence, machine learning, AI gap, where should they start? >> I do think it can be basic steps. And the reason I say that is, if you go to a company that hasn't really viewed themselves as a technology company, and you start talking about machine intelligence, AI, like, everybody like, runs away scared, like it's not interesting. So I bring it back to building blocks. For a client to be great in data, and to become a technology company, you really need three platforms for how you think about data. You need a platform for how you manage your data, so think of it as data management. You need a platform for unified governance and integration, and you need a platform for data science and business analytics. And to some extent, I don't care where you start, but you've got to start with one of those. And if you do that, you know, you'll start to create a flywheel of momentum where you'll get some small successes. Then you can go in the other area, and so I just encourage everybody, start down that path. Pick one of the three. Or you may already have something going in one of them, so then pick one where you don't have something going. 
Just start down the path, because, those building blocks, once you have those in place, you'll be able to scale AI and ML in the future in your organization. But without that, you're going to always be limited to kind of a use case at a time. >> Yeah, and I would add, this is, you talked about it a couple times today, is that cultural aspect, that realization that in order to be data driven, you know, buzzword, you have to embrace that and drive that through the culture. Right? >> That starts at the top, right? Which is, it's not, you know, it's not normal to have a culture of, we're going to experiment, we're going to try things, half of them may not work. And so, it starts at the top in terms of how you set the tone and set that culture. >> IBM Think, we're less than a month away. CUBE is going to be there, very excited about that. First time that you guys have done Think. You've consolidated all your big, big events. What can we expect from you guys? >> I think it's going to be an amazing show. To your point, we thought about this for a while, consolidating to a single IBM event. There's no question just based on the response and the enrollment we have so far, that was the right answer. We'll have people from all over the world. A bunch of clients, we've got some great announcements that will come out that week. And for clients that are thinking about coming, honestly the best thing about it is all the education and training. We basically build a curriculum, and think of it as a curriculum around, how do we make our clients more effective at competing with the Amazons of the world, back to the other point. And so I think we build a great curriculum and it will be a great week. >> Well, if I've heard anything today, it's about, don't be afraid to dive in at the deep end, just dive, right? Get after it and, looking forward to the rest of the day. Rob, thank you for joining us here and we'll see you in about a month! >> Sounds great. >> Right around the corner. 
All right, Rob Thomas joining us here, the GM of IBM Analytics. Back with more here on theCUBE. (upbeat music)
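Rob's working definition of machine learning — train a model, then keep updating it as new data flows in — can be sketched with a tiny online learner. This is an illustrative toy under invented data and parameters, not any IBM product API: each new observation triggers one stochastic-gradient step, so predictions improve as the stream continues.

```python
# Toy online learner: the model updates with every new observation,
# rather than being refit on periodic historical snapshots.
# Data stream, learning rate, and target function are all invented.

def make_online_model(lr=0.1):
    state = {"w": 0.0, "b": 0.0}

    def predict(x):
        return state["w"] * x + state["b"]

    def update(x, y):
        # One stochastic-gradient step on squared error per observation.
        err = predict(x) - y
        state["w"] -= lr * err * x
        state["b"] -= lr * err

    return predict, update

predict, update = make_online_model()

# New data flows in (here, noiseless samples of y = 2x);
# the model's decisions improve over time.
for x, y in [(1, 2), (2, 4), (3, 6)] * 50:
    update(x, y)

print(round(predict(4), 1))  # converges toward 8
```

Contrast this with the "historical reporting" pattern Rob mentions, where a model would only be rebuilt from batch snapshots instead of learning continuously.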

Published Date : Feb 27 2018

Data Science for All: It's a Whole New Game


 

>> There's a movement that's sweeping across businesses everywhere here in this country and around the world. And it's all about data. Today businesses are being inundated with data. To the tune of over two and a half million gigabytes that'll be generated in the next 60 seconds alone. What do you do with all that data? To extract insights you typically turn to a data scientist. But not necessarily anymore. At least not exclusively. Today the ability to extract value from data is becoming a shared mission. A team effort that spans the organization, extending far more widely than ever before. Today, data science is being democratized. >> Data Science for All: It's a Whole New Game. >> Welcome everyone, I'm Katie Linendoll. I'm a technology expert and writer and I love reporting on all things tech. My fascination with tech started very young. I began coding when I was 12. Received my networking certs by 18 and a degree in IT and new media from Rochester Institute of Technology. So as you can tell, technology has always been a true passion of mine. Having grown up in the digital age, I love having a career that keeps me at the forefront of science and technology innovations. I spend equal time in the field being hands on as I do on my laptop conducting in depth research. Whether I'm diving underwater with NASA astronauts, witnessing the new ways in which mobile technology can help rebuild the Philippines' economy in the wake of super typhoons, or sharing a first look at the newest iPhones on The Today Show, yesterday, I'm always on the hunt for the latest and greatest tech stories. And that's what brought me here. I'll be your host for the next hour as we explore the new phenomenon that is taking businesses around the world by storm: data science continues to become democratized, extending beyond the domain of the data scientist, and there's also a mandate for all of us to become data literate now that data science for all drives our AI culture.
And we're going to be able to take to the streets and go behind the scenes as we uncover the factors that are fueling this phenomenon and giving rise to a movement that is reshaping how businesses leverage data. And putting organizations on the road to AI. So coming up, I'll be doing interviews with data scientists. We'll see real world demos and take a look at how IBM is changing the game with an open data science platform. We'll also be joined by legendary statistician Nate Silver, founder and editor-in-chief of FiveThirtyEight, who will shed light on how a data-driven mindset is changing everything from business to our culture. We also have a few people who are joining us in our studio, so thank you guys for joining us. Come on, I can do better than that, right? Live studio audience, the fun stuff. And for all of you during the program, I want to remind you to join the conversation on social media using the hashtag DSforAll, it's data science for all. Share your thoughts on what data science and AI mean to you and your business. And, let's dive into a whole new game of data science. Now I'd like to welcome my co-host, General Manager of IBM Analytics, Rob Thomas. >> Hello, Katie. >> Come on, guys. >> Yeah, seriously. >> No one's allowed to be quiet during this show, okay? >> Right. >> Or, I'll start calling people out. So Rob, thank you so much. I think you know this conversation; we're calling it a data explosion happening right now. And it's nothing new. And when you and I chatted about it, you've been talking about this for years. You have to ask, is this old news at this point? >> Yeah, I mean, well first of all, the data explosion is not coming, it's here. And everybody's in the middle of it right now. What is different is the economics have changed. And the scale and complexity of the data that organizations are having to deal with has changed. And to this day, 80% of the data in the world still sits behind corporate firewalls. So, that's becoming a problem.
It's becoming unmanageable. IT struggles to manage it. The business can't get everything they need. Consumers can't consume it when they want. So we have a challenge here. >> It's challenging in a world of unmanageable, crazy complexity. If I'm sitting here as an IT manager of my business, I'm probably thinking to myself, this is incredibly frustrating. How in the world am I going to get control of all this data? And probably not just me thinking it. Many individuals here as well. >> Yeah, indeed. Everybody's thinking about how am I going to put data to work in my organization in a way I haven't done before. Look, you've got to have the right expertise, the right tools. The other thing that's happening in the market right now is clients are dealing with multi-cloud environments. So data behind the firewall in private cloud, multiple public clouds. And they have to find a way. How am I going to pull meaning out of this data? And that brings us to data science and AI. That's how you get there. >> I understand the data science part but I think we're all starting to hear more about AI. And it's incredible that this buzzword is happening. How do businesses adapt to this AI growth and boom and trend that's happening in this world right now? >> Well, let me define it this way. Data science is a discipline. And machine learning is one technique. And then AI puts machine learning into practice and applies it to the business. So this is really about getting your business where it needs to go. And to get to an AI future, you have to lay a data foundation today. I love the phrase, "there's no AI without IA." That means you're not going to get to AI unless you have the right information architecture to start with. >> Can you elaborate though in terms of how businesses can really adopt AI and get started? >> Look, I think there's four things you have to do if you're serious about AI. One is you need a strategy for data acquisition.
Two is you need a modern data architecture. Three is you need pervasive automation. And four is you've got to expand job roles in the organization. >> Data acquisition. The first pillar you just discussed. Can we start there and explain why it's so critical in this process? >> Yeah, so let's think about how data acquisition has evolved through the years. 15 years ago, data acquisition was about how do I get data in and out of my ERP system? And that was pretty much solved. Then the mobile revolution happens. And suddenly you've got structured and unstructured data. More than you've ever dealt with. And now you get to where we are today. You're talking terabytes, petabytes of data. >> [Katie] Yottabytes, I heard that word the other day. >> I heard that too. >> Didn't even know what it meant. >> You know how many zeros that is? >> I thought we were in Star Wars. >> Yeah, I think it's a lot of zeros. >> Yodabytes, it's new. >> So, it's becoming more and more complex in terms of how you acquire data. So that's the new data landscape that every client is dealing with. And if you don't have a strategy for how you acquire that and manage it, you're not going to get to that AI future. >> So a natural segue, if you are one of these businesses, how do you build for the data landscape? >> Yeah, so the question I always hear from customers is we need to evolve our data architecture to be ready for AI. And the way I think about that is it's really about moving from static data repositories to more of a fluid data layer. >> And we continue with the architecture. New data architecture is an interesting buzzword to hear. But it's also one of the four pillars. So if you could dive in there. >> Yeah, I mean it's a new twist on what I would call some core data science concepts. For example, you have to leverage tools with a modern, centralized data warehouse. But your data warehouse can't be stagnant, limited to just what's right there.
So you need a way to federate data across different environments. You need to be able to bring your analytics to the data because it's most efficient that way. And ultimately, it's about building an optimized data platform that is designed for data science and AI. Which means it has to be a lot more flexible than what clients have had in the past. >> All right. So we've laid out what you need for driving automation. But where does the machine learning kick in? >> Machine learning is what gives you the ability to automate tasks. And I think about machine learning. It's about predicting and automating. And this will really change the roles of data professionals and IT professionals. For example, a data scientist cannot possibly know every algorithm or every model that they could use. So we can automate the process of algorithm selection. Another example is things like automated data matching. Or metadata creation. Some of these things may not be exciting but they're hugely practical. And so when you think about the real use cases that are driving return on investment today, it's things like that. It's automating the mundane tasks. >> Let's go ahead and come back to something that you mentioned earlier because it's fascinating to be talking about this AI journey, but also significant is the new job roles. And what are those other participants in the analytics pipeline? >> Yeah I think we're just at the start of this idea of new job roles. We have data scientists. We have data engineers. Now you see machine learning engineers. Application developers. What's really happening is that data scientists are no longer allowed to work in their own silo. And so the new job roles is about how does everybody have data first in their mind? And then they're using tools to automate data science, to automate building machine learning into applications. So roles are going to change dramatically in organizations. 
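Rob's example of automating algorithm selection can be sketched concretely: evaluate a few candidate models with cross-validation and keep whichever scores best. The toy dataset and the two candidate classifiers below are illustrative stand-ins, not IBM's actual implementation.

```python
# A minimal sketch of automated algorithm selection, using only the standard
# library: score each candidate classifier with k-fold cross-validation and
# keep the best one. The toy data and models are purely illustrative.
import random
from collections import Counter

random.seed(0)
# Toy dataset: one feature per point; label is 1 when x > 0.5, with ~10% noise.
data = [(random.random(),) for _ in range(200)]
labels = [1 if x[0] > 0.5 and random.random() > 0.1 else 0 for x in data]

def majority(train_X, train_y, test_X):
    """Baseline: always predict the most common training label."""
    mode = Counter(train_y).most_common(1)[0][0]
    return [mode] * len(test_X)

def one_nn(train_X, train_y, test_X):
    """1-nearest-neighbour on the single feature."""
    preds = []
    for x in test_X:
        i = min(range(len(train_X)), key=lambda j: abs(train_X[j][0] - x[0]))
        preds.append(train_y[i])
    return preds

def cross_val(model, X, y, k=5):
    """Mean accuracy of `model` over k contiguous folds."""
    fold = len(X) // k
    accs = []
    for f in range(k):
        lo, hi = f * fold, (f + 1) * fold
        tr_X, tr_y = X[:lo] + X[hi:], y[:lo] + y[hi:]
        preds = model(tr_X, tr_y, X[lo:hi])
        accs.append(sum(p == t for p, t in zip(preds, y[lo:hi])) / fold)
    return sum(accs) / k

# The "automation": try every candidate, keep the cross-validation winner.
candidates = {"majority_class": majority, "one_nearest_neighbour": one_nn}
scores = {name: cross_val(m, data, labels) for name, m in candidates.items()}
best = max(scores, key=scores.get)
```

In practice the candidate list would hold real learning algorithms and the loop would also search hyperparameters, but the shape of the automation is the same.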
>> I think that's confusing though, because we have several organizations asking: is this a highly specialized role, just for data scientists? Or is it applicable to everybody across the board? >> Yeah, and that's the big question, right? Cause everybody's thinking how will this apply? Do I want this to be just a small set of people in the organization that will do this? But, our view is data science has to be for everybody. It's about bringing data science to everybody as a shared mission across the organization. Everybody in the company has to be data literate. And participate in this journey. >> So overall, group effort, has to be a common goal, and we all need to be data literate across the board. >> Absolutely. >> Done deal. But at the end of the day, it's kind of not an easy task. >> It's not. It's not easy but it's maybe not as big of a shift as you would think. Because you have to put data in the hands of people that can do something with it. So, it's very basic. Give access to data. Data's often locked up in a lot of organizations today. Give people the right tools. Embrace the idea of choice or diversity in terms of those tools. That gets you started on this path. >> It's interesting to hear you say essentially you need to train everyone though across the board when it comes to data literacy. And I think people that are coming into the workforce don't necessarily have a background or a degree in data science. So how do you manage? >> Yeah, so in many cases that's true. I will tell you some universities are doing amazing work here. One example, University of California, Berkeley. They offer a course for all majors. So no matter what you're majoring in, you have a course on foundations of data science. How do you bring data science to every role? So it's starting to happen. We at IBM provide data science courses through CognitiveClass.ai. It's for everybody. It's free.
And look, if you want to get your hands on code and just dive right in, you go to datascience.ibm.com. The key point is this though. It's more about attitude than it is aptitude. I think anybody can figure this out. But it's about the attitude to say we're putting data first and we're going to figure out how to make this real in our organization. >> I also have to give a shout out to my alma mater, because I have heard that there is an MS offering in data analytics. And they are always on the forefront of new technologies and new majors and on trend. And I've heard that the placement behind those jobs, for people graduating with the MS, is high. >> I'm sure it's very high. >> So go Tigers. All right, tangential. Let me get back to something else you touched on earlier, because you mentioned that a number of customers ask you how in the world do I get started with AI? It's an overwhelming question. Where do you even begin? What do you tell them? >> Yeah, well things are moving really fast. But the good thing is most organizations I see, they're already on the path, even if they don't know it. They might have a BI practice in place. They've got data warehouses. They've got data lakes. Let me give you an example. AMC Networks. They produce a lot of the shows that I'm sure you watch, Katie. >> [Katie] Yes, Breaking Bad, Walking Dead, any fans? >> [Rob] Yeah, we've got a few. >> [Katie] Well you taught me something I didn't even know. Because it's amazing how we have all these different industries, but yet media in itself is impacted too. And this is a good example. >> Absolutely. So, AMC Networks, think about it. They've got ads to place. They want to track viewer behavior. What do people like? What do they dislike? So they have to optimize every aspect of their business, from marketing campaigns to promotions to scheduling to ads.
And their goal was to transform data into business insights and really take the burden off of their IT team, which was heavily burdened by obviously a huge increase in data. So their VP of BI took the approach of using machine learning to process large volumes of data. They used a platform that was designed for AI and data processing. It's the IBM Integrated Analytics System, where it's a data warehouse with data science tools built in. It has in-memory data processing. And just like that, they were ready for AI. And they're already seeing that impact in their business. >> Do you think a movement of that nature kind of presses other media conglomerates and organizations to say we need to be doing this too? >> I think it's inevitable for everybody: you're either going to be leading, or you'll be playing catch up. And so, as we talk to clients we think about how do you start down this path now, even if you have to iterate over time? Because otherwise you're going to wake up and you're going to be behind. >> One thing worth noting is we've talked about bringing analytics to the data. It's analytics to the data first, not the other way around. >> Right. So, look. We as a practice, we say you want to bring analytics to where the data sits. Because it's a lot more efficient that way. It gets you better outcomes in terms of how you train models, and we think that leads to better outcomes. Other organizations will say, "Hey move the data around." And everything becomes a big data movement exercise. But once an organization has started down this path, they're starting to get predictions, they want to do it where it's really easy. And that means analytics applied right where the data sits. >> And worth talking about the role of the data scientist in all of this. It's been called the hot job of the decade. And the Harvard Business Review even dubbed it the sexiest job of the 21st century. >> Yes. >> I want to see this on the cover of Vogue.
Like I want to see the first data scientist. Female preferred, on the cover of Vogue. That would be amazing. >> Perhaps you can. >> People agree. So what changes for them? We talk about data science for all, so is it data science for everyone? And how does it change everything? >> Well, I think of it this way. AI gives software superpowers. It really does. It changes the nature of software. And at the center of that are data scientists. So, a data scientist has a set of powers that they've never had before in any organization. And that's why it's a hot profession. Now, on one hand, this has been around for a while. We've had actuaries. We've had statisticians that have really transformed industries. But there are a few things that are new now. We have new tools. New languages. Broader recognition of this need. And while it's important to recognize this critical skill set, you can't just limit it to a few people. This is about scaling it across the organization. And truly making it accessible to all. >> So then do we need more data scientists? Or is this something you train, like you said, across the board? >> Well, I think you want to do a little bit of both. We want more. But, we can also train more and make the ones we have more productive. The way I think about it is there's kind of two markets here. And we call it clickers and coders. >> [Katie] I like that. That's good. >> So, let's talk about what that means. So clickers are basically somebody that wants to use tools. Create models visually. It's drag and drop. Something that's very intuitive. Those are the clickers. Nothing wrong with that. It's been valuable for years. There's a new crop of data scientists. They want to code. They want to build with the latest open source tools. They want to write in Python or R. These are the coders. And both approaches are viable. Both approaches are critical.
Organizations have to have a way to meet the needs of both of those types. And there aren't a lot of things available today that do that. >> Well let's keep going on that. Because I hear you talking about the data scientist's role and how it's critical to success, but with the new tools, data science and analytics skills can extend beyond the domain of just the data scientist. >> That's right. So look, we're unifying coders and clickers into a single platform, which we call IBM Data Science Experience. And as the demand for data science expertise grows, so does the need for these kinds of tools. To bring them into the same environment. And my view is if you have the right platform, it enables the organization to collaborate. And suddenly you've changed the nature of data science from an individual sport to a team sport. >> So as somebody whose background is in IT, the question really is, is this an additional piece of what IT needs to do in 2017 and beyond? Or is it just another line item in the budget? >> So I'm afraid that some people might view it that way. As just another line item. But, I would challenge that and say data science is going to reinvent IT. It's going to change the nature of IT. And every organization needs to think about what are the skills that are critical? How do we engage a broader team to do this? Because once they get there, this is the chance to reinvent how they're performing IT. >> [Katie] Challenging or not? >> Look it's all a big challenge. Think about everything IT organizations have been through. Some of them were late to things like mobile, but then they caught up. Some were late to cloud, but then they caught up. I would just urge people, don't be late to data science. Use this as your chance to reinvent IT. Start with this notion of clickers and coders. This is a seminal moment. Much like mobile and cloud were. So don't be late. >> And I think it's critical because it could be so costly to wait.
And Rob and I were even chatting earlier how data analytics is just moving into all different kinds of industries. And I can tell you, even personally being affected by how important the analysis is, in working in pediatric cancer for the last seven years. I personally bring virtual reality headsets to pediatric cancer hospitals across the country. And it's great. And it's working phenomenally. And the kids are amazed. And the staff is amazed. But phase two of this project is putting little sensors in the hardware that gather metrics like breathing and heart rate, to show that we have data. Proof that we can hand over to the hospitals to continue making this program a success. So just in-- >> That's a great example. >> An interesting example. >> Saving lives? >> Yes. >> That's also applying a lot of what we talked about. >> Exciting stuff in the world of data science. >> Yes. Look, I'd just add this is an existential moment for every organization. Because what you do in this area is probably going to define how competitive you are going forward. And think about if you don't do something. What if one of your competitors goes and creates an application that's more engaging with clients? So my recommendation is start small. Experiment. Learn. Iterate on projects. Define the business outcomes. Then scale up. It's very doable. But you've got to take the first step. >> First step always critical. And now we're going to get to the fun hands-on part of our story. Because in just a moment we're going to take a closer look at what data science can deliver. And where organizations are trying to get to. All right. Thank you, Rob, and now we've been joined by Siva Anne, who is going to help us navigate this demo. First, welcome Siva. Give him a big round of applause. Yeah. All right, Rob, break down what we're going to be looking at. You take over this demo. >> All right. So this is going to be pretty interesting. So Siva is going to take us through.
So he's going to play the role of a financial adviser who wants to better serve clients through recommendations. And I'm going to really illustrate three things. One is how do you federate data from multiple data sources? Inside the firewall, outside the firewall. How do you apply machine learning to predict and to automate? And then how do you move analytics closer to your data? So, what you're seeing here is a custom application for an investment firm. So, Siva, our financial adviser, welcome. So you can see at the top, we've got market data. We pulled that from an external source. And then we've got Siva's calendar in the middle. He's got clients on the right side. So page down, what else do you see down there, Siva? >> [Siva] I can see the recent market news. And in here I can see that JP Morgan is calling for a US dollar rebound in the second half of the year. And I have an upcoming meeting with Leo Rakes. I can get-- >> [Rob] So let's go in there. Why don't you click on Leo Rakes. So, you're sitting at your desk, you're deciding how you're going to spend the day. You know you have a meeting with Leo. So you click on it. You immediately see, all right, so what do we know about him? We've got data governance implemented. So we know his age, we know his degree. We can see he's not that aggressive of a trader. Only six trades in the last few years. But then where it gets interesting is you go to the bottom. You start to see predicted industry affinity. Where did that come from? How do we have that? >> [Siva] So these green lines and red arrows here indicate the trending affinity of Leo Rakes for particular industry stocks. What we've done here is we've built machine learning models using the customer's demographic data, his stock portfolios, and browsing behavior to predict his affinity for a particular industry. >> [Rob] Interesting. So, I like to think of this, we call it celebrity experiences.
So how do you treat every customer like they're a celebrity? So to some extent, we're reading his mind. Because without asking him, we know that he's going to have an affinity for auto stocks. So we go down. Now we look at his portfolio. You can see okay, he's got some different holdings. He's got Amazon, Google, Apple, and then he's got RACE, which is the ticker for Ferrari. You can see that's done incredibly well. And so, as a financial adviser, you look at this and you say, all right, we know he loves auto stocks. Ferrari's done very well. Let's create a hedge. Like what kind of security would interest him as a hedge against his position for Ferrari? Could we go figure that out? >> [Siva] Yes. Given I know that he's got an affinity for auto stocks, and I also see that Ferrari has got some tremendous gains, I want to lock in these gains by hedging. And I want to do that by picking an auto stock which has got negative correlation with Ferrari. >> [Rob] So this is where we get to the idea of in-database analytics. Cause you start clicking that and immediately we're getting instant answers of what's happening. So what did we find here? We're going to compare Ferrari and Honda. >> [Siva] I'm going to compare Ferrari with Honda. And what I see here instantly is that Honda has got a negative correlation with Ferrari, which makes it a perfect mix for his stock portfolio. Given he has an affinity for auto stocks and it correlates negatively with Ferrari. >> [Rob] These are very powerful tools in the hands of a financial adviser. You think about it. As a financial adviser, you wouldn't think about federating data, machine learning, pretty powerful. >> [Siva] Yes. So what we have seen here is that using the common SQL engine, we've been able to federate queries across multiple data sources: Db2 Warehouse in the cloud, IBM's Integrated Analytics System, and a Hortonworks-powered Hadoop platform.
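The hedge check Siva runs against Ferrari and Honda comes down to computing the Pearson correlation of two daily-return series. Here is a minimal sketch, with invented price series standing in for real quotes (none of these numbers are actual market data):

```python
# A minimal sketch of the correlation check behind the hedge: compute the
# Pearson correlation of two daily-return series. The price series below are
# invented for illustration; they are not real Ferrari or Honda quotes.
import math

def returns(prices):
    """Simple daily returns from a price series."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

stock_a = [100, 102, 105, 103, 108, 110]   # trending up
stock_b = [50, 49, 47, 48, 45, 44]         # trending down

corr = pearson(returns(stock_a), returns(stock_b))
# A strongly negative correlation suggests stock_b could hedge stock_a.
```

A correlation near -1 is what makes the second stock a candidate hedge; the demo runs the same computation in-database, at scale, across millions of stock pairs.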
We've been able to use machine learning to derive innovative insights about his stock affinities. And drive the machine learning into the appliance, closer to where the data resides, to deliver high-performance analytics. >> [Rob] At scale? >> [Siva] We're able to run millions of these correlations across stocks, currency, other factors. And even score hundreds of customers for their affinities on a daily basis. >> That's great. Siva, thank you for playing the role of financial adviser. So I just want to recap briefly. Cause this is really powerful technology that's really simple. So we federated, we aggregated multiple data sources from all over the web and internal systems. And public cloud systems. Machine learning models were built that predicted Leo's affinity for a certain industry. In this case, automotive. And then you see when you deploy analytics next to your data, even a financial adviser, just with the click of a button, is getting instant answers so they can go be more productive in their next meeting. This whole idea of celebrity experiences for your customer, that's available for everybody, if you take advantage of these types of capabilities. Katie, I'll hand it back to you. >> Good stuff. Thank you, Rob. Thank you, Siva. Powerful demonstration on what we've been talking about all afternoon. And thank you again to Siva for helping us navigate. Should we give him one more round of applause? We're going to be back in just a moment to look at how we operationalize all of this data. But first, here's a message from me. If you're a part of a line of business, your main fear is disruption. You know data is the new gold that can create huge amounts of value. So does your competition. And they may be beating you to it. You're convinced there are new business models and revenue sources hidden in all the data. You just need to figure out how to leverage it. But with the scarcity of data scientists, you really can't rely solely on them.
You may need more people throughout the organization that have the ability to extract value from data. And as a data science leader or data scientist, you have a lot of the same concerns. You spend way too much time looking for, prepping, and interpreting data and waiting for models to train. You know you need to operationalize the work you do to provide business value faster. What you want is an easier way to do data prep. And rapidly build models that can be easily deployed, monitored and automatically updated. So whether you're a data scientist, data science leader, or in a line of business, what's the solution? What'll it take to transform the way you work? That's what we're going to explore next. All right, now it's time to delve deeper into the nuts and bolts. The nitty-gritty of operationalizing data science and creating a data-driven culture. How do you actually do that? Well that's what these experts are here to share with us. I'm joined by Nir Kaldero, who's head of data science at Galvanize, which is an education and training organization. Tricia Wang, who is co-founder of Sudden Compass, a consultancy that helps companies understand people with data. And last, but certainly not least, Michael Li, founder and CEO of Data Incubator, which is a data science training company. All right guys. Shall we get right to it? >> All right. >> So, a data explosion is happening right now. And we are seeing it across the board. I just shared an example of how it's impacting my philanthropic work in pediatric cancer. But you guys each have so many unique roles in your business life. How are you seeing it just blow up in your fields? Nir, your thoughts? >> Yeah, for example, at Galvanize we train many Fortune 500 companies. And just the demand from companies that want us to help them go through this digital transformation is mind-blowing. A data point by itself. >> Okay.
Well, what we're seeing is that data science, as a theme, is actually for everyone now. But what's happening is that it's meeting non-technical people. And what we're seeing is that when non-technical people are implementing these tools, or coming at these tools without a baseline of data literacy, they're oftentimes using them in ways that distance themselves from the customer. Because they're implementing data science tools without a clear purpose, without a clear problem. And so what we do at Sudden Compass is we work with companies to help them embrace and understand the complexity of their customers. Because oftentimes they are misusing data science to try and flatten their understanding of the customer. As if you can just do more traditional marketing, where you're putting people into boxes. And I think the whole ROI of data is that you can now understand people's relationships at a much more complex level and at a greater scale than before. But we have to do this with basic data literacy. And this has to involve technical and non-technical people. >> Well you can have all the data in the world, and I think it speaks to, if you're not doing the proper movement with it, forget it. It means nothing at the same time. >> No, absolutely. I mean, I think that when you look at the huge explosion in data, there comes with it a huge explosion in data experts. Right, we call them data scientists, data analysts. And sometimes they're people who are very, very talented, like the people here. But sometimes you have people who are maybe re-branding themselves, right? Trying to move up their title one notch to try to attract that higher salary. And I think that that's one of the things that customers are coming to us for, right? They're saying, hey look, there are a lot of people that call themselves data scientists, but we can't really distinguish.
So, we have sort of run a fellowship where we help companies hire from a really talented group of folks, who are also truly data scientists and who know all those kinds of really important data science tools. And we also help companies internally. Fortune 500 companies who are looking to grow that data science practice that they have. And we help clients like McKinsey, BCG, Bain, train up their customers, also their clients, also their workers to be more data talented and to build up their data science capabilities. >> And Nir, this is something you work with a lot. A lot of Fortune 500 companies. And when we were speaking earlier, you were saying many of these companies can be in a panic. >> Yeah. >> Explain that. >> Yeah, so you know, not all Fortune 500 companies are fully data driven. And we know that the winners in this fourth industrial revolution, which I like to call the machine intelligence revolution, will be companies who navigate and transform their organization to unlock the power of data science and machine learning. And the companies that are not like that, that don't utilize data science and predictive power well, will pretty much get shredded. So they are in a panic. >> Tricia, companies have to deal with data behind the firewall and in the new multi-cloud world. How do organizations start to become data driven right to the core? >> I think the most urgent question companies should be asking to become data driven is how do I bring the complex reality that our customers are experiencing on the ground into the corporate office, and into the data models? That question is critical because that's how you actually prevent any big data disasters. And that's how you leverage big data. Because when your data models are really far from your human models, that's when you're going to do things that just aren't going to feel right. That's when Tesco had their terrible big data disaster that they're still recovering from.
And so that's why I think it's really important to understand that when you implement big data, you have to further embrace thick data. The qualitative, the emotional stuff that is difficult to quantify. But then comes the difficult art and science that I think is the next level of data science. Which is getting non-technical and technical people together to ask, how do we find those unknown nuggets of insights that are difficult to quantify? Then, how do we do the next step of figuring out how to mathematically scale those insights into a data model that actually is reflective of human understanding? And then we can start making decisions at scale. But you have to have that first. >> That's absolutely right. And I think that when we think about what it means to be a data scientist, right? I always think about it in these sort of three pillars. You have the math side. You have to have that kind of stats, hardcore machine learning background. You have the programming side. You don't work with small amounts of data. You work with large amounts of data. You've got to be able to type the code to make those computers run. But then the last part is that human element. You have to have the domain expertise. You have to understand what it is that you're actually analyzing. What's the business proposition? And how are the clients, how are the users actually interacting with the system? That human element that you were talking about. And I think having somebody who understands all of those, and not just in isolation, but is able to marry that understanding across those different topics, that's what makes a data scientist.
They don't involve the designers, the consumer insight people, the salespeople. The people who spend time with the customers day in and day out. Somehow they're left out of the room. They're consulted, but they're not a stakeholder. >> Can I actually >> Yeah, yeah please. >> Can I actually give a quick example? So for example, we at Galvanize train the executives and the managers. And then the technical people, the data scientists and the analysts. But in order to actually see all of the ROI behind the data, you also have to have a creative, fluid conversation between non technical and technical people. And this is a major trend now. And there's a major gap. And we need to increase awareness and kind of create a new environment where technical people also talk seamlessly with non technical ones. >> [Tricia] We call-- >> That's one of the things that we see a lot. Is one of the trends in-- >> A major trend. >> data science training is it's not just for the data science technical experts. It's not just for one type of person. So a lot of the training we do is sort of data engineers. People who are more on the software engineering side learning more about the stats and math. And then people who are sort of traditionally on the stat side learning more about the engineering. And then managers and people who are data analysts learning about both. >> Michael, I think you said something that was of interest too because I think we can look at IBM Watson as an example. And working in healthcare. The human component. Because often times we talk about machine learning and AI and data, and you get worried that you still need that human component. Especially in the world of healthcare. And I think that's a very strong point when it comes to the data analysis side. Is there any particular example you can speak to of that?
>> So I think that there was this really excellent paper a while ago talking about all the neural net stuff and models trained on textual data. So looking at sort of different corpuses. And they found that these models were highly, highly sexist. They would read these corpuses and it's not because neural nets themselves are sexist. It's because they're reading the things that we write. And it turns out that we write kind of sexist things. And they would sort of find all these patterns in there that were latent, that had a lot of things that maybe we would cringe at if we saw them. And I think that's one of the really important aspects of the human element, right? It's being able to come in and sort of say like, okay, I know what the biases of the system are, I know what the biases of the tools are. I need to figure out how to use that to make the tools, make the world a better place. And another area where this comes up all the time is lending, right? So the federal government has said, and we have a lot of clients in the financial services space, so they're constantly under these kind of rules that they can't make discriminatory lending practices based on a whole set of protected categories. Race, sex, gender, things like that. But it's very easy when you train a model on credit scores to pick that up. And then to have a model that's inadvertently sexist or racist. And that's where you need the human element to come back in and say okay, look, the classic example would be zip code. You're using zip code as a variable. But when you look at it, zip code is actually highly correlated with race. And you can't do that. So you may inadvertently, by sort of following the math and being a little naive about the problem, introduce something really horrible into a model. And that's where you need a human element to sort of step in and say, okay, hold on. Slow things down. This isn't the right way to go.
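The zip code example Michael gives is a pattern you can screen for mechanically before a model ships. A minimal sketch in Python: flag any candidate feature whose correlation with a protected attribute is suspiciously high, so a human reviews it. The column names, synthetic data, and threshold here are all illustrative assumptions, not a regulatory standard.

```python
import numpy as np
import pandas as pd

def flag_proxy_features(df, protected_col, candidate_cols, threshold=0.5):
    """Flag model features whose correlation with a protected attribute
    exceeds a threshold, so a human can review them before they reach
    a lending model."""
    flags = {}
    protected = df[protected_col].astype(float)
    for col in candidate_cols:
        r = np.corrcoef(df[col].astype(float), protected)[0, 1]
        if abs(r) >= threshold:
            flags[col] = round(float(r), 3)
    return flags

# Synthetic illustration: "zip_income_rank" is built to track the
# protected attribute, while "years_employed" is independent noise.
rng = np.random.default_rng(0)
protected = rng.integers(0, 2, size=1000)
df = pd.DataFrame({
    "protected": protected,
    "zip_income_rank": protected * 10 + rng.normal(0, 1, 1000),
    "years_employed": rng.normal(10, 3, 1000),
})
print(flag_proxy_features(df, "protected",
                          ["zip_income_rank", "years_employed"]))
```

A flagged feature is not automatically disallowed, but as the panel says, this is exactly where the human element steps in to decide.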
>> And the people who have -- >> I feel like, I can feel her ready to respond. >> Yes, I'm ready. >> She's like let me have at it. >> And here it is. The people who are really great at providing that human intelligence are social scientists. We are trained to look for bias and to understand bias in data. Whether it's quantitative or qualitative. And I really think that we're going to have less of these kind of problems if we had more integrated teams. If it was a mandate from leadership to say no data science team should be without a social scientist, ethnographer, or qualitative researcher of some kind, to be able to help see these biases. >> The talent piece is actually the most crucial-- >> Yeah. >> one here. If you look at how to enable machine intelligence in an organization, there are three pillars that I have in my head, which are the culture, the talent, and the technology infrastructure. And I believe, and I saw in working very closely with Fortune 100 and 200 companies, that the talent piece is actually the most important, the most crucial, and the hardest to get. >> [Tricia] I totally agree. >> It's absolutely true. Yeah, no I mean I think that's sort of like how we came up with our business model. Companies were basically saying hey, I can't hire data scientists. And so we have a fellowship where we get 2,000 applicants each quarter. We take the top 2% and then we sort of train them up. And we work with hiring companies who then want to hire from that population. And so we're sort of helping them solve that problem. And the other half of it is really around training. Cause with a lot of industries, especially if you're sort of in a more regulated industry, there's a lot of nuances to what you're doing. And the fastest way to develop that data science or AI talent may not necessarily be to hire folks who are coming out of a PhD program.
It may be to take folks internally who have a lot of that domain knowledge that you have and get them trained up on those data science techniques. So we've had large insurance companies come to us and say hey look, we hire three or four folks from you a quarter. That doesn't move the needle for us. What we really need is to take the thousand actuaries and statisticians that we have and get all of them trained up to become data scientists and become data literate in this new open source world. >> [Katie] Go ahead. >> All right, ladies first. >> Go ahead. >> Are you sure? >> No please, fight first. >> Go ahead. >> Go ahead Nir. >> So this is actually a trend that we have been seeing in the past year or so, that companies start to look at how to upskill and look for talent within the organization. So they can actually move them to become more literate and navigate 'em from analyst to data scientist. And from data scientist to machine learner. So this is actually a trend that has been happening already for a year or so. >> Yeah, but I also find that after they've gone through that training in getting people skilled up in data science, the next problem that I get is executives coming to say we've invested in all of this. We're still not moving the needle. We've already invested in the right tools. We've gotten the right skills. We have enough scale of people who have these skills. Why are we not moving the needle? And what I explain to them is look, you're still making decisions in the same way. And you're still not involving enough of the non technical people. Especially from marketing. The CMOs are now much more responsible for driving growth in their companies. But often times it's so hard to change the old way of marketing, which is still very segmentation based. You know, demographic variable based. And we're trying to move people to say no, you have to understand the complexity of customers and not put them in boxes.
>> And I think underlying a lot of this discussion is this question of culture, right? >> Yes. >> Absolutely. >> How do you build a data driven culture? And I think that culture question, one of the ways it comes up quite often, especially in large, Fortune 500 enterprises, is that they're not very comfortable with, for example, open source architecture. Open source tools. And there is some sort of residual bias that that's somehow dangerous. A security vulnerability. And I think that that's part of the cultural challenge that they often have in terms of how do I build a more data driven organization? Well a lot of the talent really wants to use these kind of tools. And I mean, just to give you an example, we are partnering with one of the major cloud providers to sort of help make open source tools more user friendly on their platform. So trying to help them attract the best technologists to use their platform, because they want and they understand the value of having that kind of open source technology work seamlessly on their platforms. So I think that just sort of goes to show you how important open source is in this movement. And how much large companies and Fortune 500 companies and a lot of the ones we work with have to embrace that. >> Yeah, and I'm seeing it in our work. Even when we're working with Fortune 500 companies, is that they've already gone through the first phase of data science work. Where, as I explained, it was all about the tools and getting the right tools and architecture in place. And then companies started moving into getting the right skill set in place. Getting the right talent. And what you're talking about with culture is really where I think we're talking about the third phase of data science, which is looking at communication of these technical frameworks so that we can get non technical people really comfortable in the same room with data scientists.
That is going to be the phase, that's really where I see the pain point. And that's why at Sudden Compass, we're really dedicated to working with each other to figure out how do we solve this problem now? >> And I think that communication between the technical stakeholders and management and leadership. That's a very critical piece of this. You can't have a successful data science organization without that. >> Absolutely. >> And I think that actually some of the most popular trainings we've had recently are from managers and executives who are looking to say, how do I become more data savvy? How do I figure out what is this data science thing and how do I communicate with my data scientists? >> You guys made this way too easy. I was just going to get some popcorn and watch it play out. >> Nir, last 30 seconds. I want to leave you with an opportunity to, anything you want to add to this conversation? >> I think one thing to conclude is to say that for companies that are not data driven, it's about time to hit refresh and figure out how to transition the organization to become data driven. To become agile and nimble so they can actually seize the opportunities of this important industrial revolution. Otherwise, unfortunately, they will have a hard time surviving. >> [Katie] All agreed? >> [Tricia] Absolutely, you're right. >> Michael, Trish, Nir, thank you so much. Fascinating discussion. And thank you guys again for joining us. We will be right back with another great demo. Right after this. >> Thank you Katie. >> Once again, thank you for an excellent discussion. Weren't they great, guys? And thank you to everyone who's tuning in on the live webcast. As you can hear, we have an amazing studio audience here. And we're going to keep things moving. I'm now joined by Daniel Hernandez and Siva Anne. And we're going to turn our attention to how you can deliver on what they're talking about using data science experience to do data science faster. >> Thank you Katie.
Siva and I are going to spend the next 10 minutes showing you how you can deliver on what they were saying using the IBM Data Science Experience to do data science faster. We'll demonstrate through new features we introduced this week how teams can work together more effectively across the entire analytics life cycle. How you can take advantage of any and all data no matter where it is and what it is. How you can use your favorite tools from open source. And finally how you can build models anywhere and deploy them close to where your data is. Remember the financial adviser app Rob showed you? To build an app like that, we needed a team of data scientists, developers, data engineers, and IT staff to collaborate. We do this in the Data Science Experience through a concept we call projects. When I create a new project, I can now use the new Github integration feature. We're doing for data science what we've been doing for developers for years. Distributed teams can work together on analytics projects. And take advantage of Github's version management and change management features. This is a huge deal. Let's explore the project we created for the financial adviser app. As you can see, our data engineer Joane, our developer Rob, and others are collaborating on this project. Joane got things started by bringing together the trusted data sources we need to build the app. Taking a closer look at the data, we see that our customer and profile data is stored on our recently announced IBM Integrated Analytics System, which runs safely behind our firewall. We also needed macro economic data, which she was able to find at the Federal Reserve. And she stored it in our Db2 Warehouse on Cloud. And finally, she selected stock news data from NASDAQ.com and landed that in a Hadoop cluster, which happens to be powered by Hortonworks.
We added a new feature to the Data Science Experience so that when it's installed with Hortonworks, it automatically inherits the security and governance controls within the cluster, so your data is always secure and safe. Now we want to show you the news data we stored in the Hortonworks cluster. This is the main administrative console. It's powered by an open source project called Ambari. And here's the news data. It's in parquet files stored in HDFS, which happens to be a distributed file system. To get the data from NASDAQ into our cluster, we used IBM's BigIntegrate and BigQuality to create automatic data pipelines that acquire, cleanse, and ingest that news data. Once the data's available, we use IBM's Big SQL to query that data using SQL statements that are much like the ones we would use for any relational data, including the data that we have in the Integrated Analytics System and Db2 Warehouse on Cloud. This and the federation capabilities that Big SQL offers dramatically simplify data acquisition. Now we want to show you how we support a brand new tool that we're excited about. Since we launched last summer, the Data Science Experience has supported Jupyter and R for data analysis and visualization. In this week's update, we deeply integrated another great open source project called Apache Zeppelin. It's known for having great visualization support, advanced collaboration features, and is growing in popularity amongst the data science community. This is an example of Apache Zeppelin and the notebook we created through it to explore some of our data. Notice how wonderful and easy the data visualizations are. Now we want to walk you through the Jupyter notebook we created to explore our customer preference for stocks. We use notebooks to understand and explore data. To identify the features that have some predictive power. Ultimately, we're trying to assess what is driving customer stock preference.
Here we did the analysis to identify the attributes of customers that are likely to purchase auto stocks. We used this understanding to build our machine learning model. For building machine learning models, we've always had tools integrated into the Data Science Experience. But sometimes you need to use tools you already invested in. Like our very own SPSS as well as SAS. Through a new import feature, you can easily import models created with those tools. This helps you avoid vendor lock-in, and simplifies the development, training, deployment, and management of all your models. To build the models we used in the app, we could have coded, but we prefer a visual experience. We used our customer profile data in the Integrated Analytics System. Used the Auto Data Preparation to cleanse our data. Chose the binary classification algorithms. Let the Data Science Experience evaluate between logistic regression and gradient boosted tree. It's doing the heavy work for us. As you can see here, the Data Science Experience generated performance metrics that show us that the gradient boosted tree is the best performing algorithm for the data we gave it. Once we save this model, it's automatically deployed and available for developers to use. Any application developer can take this endpoint and consume it like they would any other API inside of the apps they build. We've made training and creating machine learning models super simple. But what about the operations? A lot of companies are struggling to ensure their model performance remains high over time. In our financial adviser app, we know that customer data changes constantly, so we need to always monitor model performance and ensure that our models are retrained as necessary. This is a dashboard that shows the performance of our models and lets our teams monitor and retrain those models so that they're always performing to our standards.
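The model bake-off Daniel describes, logistic regression versus a gradient boosted tree with the platform doing the heavy lifting, can be sketched with open source scikit-learn. This is an illustration on synthetic stand-in data, not the Data Science Experience's internal implementation:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the customer profile data in the demo.
X, y = make_classification(n_samples=2000, n_features=20,
                           n_informative=8, random_state=42)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "gradient_boosted_tree": GradientBoostingClassifier(random_state=42),
}

# Score each candidate with cross-validated AUC and keep the best,
# mirroring the automated evaluation shown on stage.
scores = {name: cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
          for name, model in candidates.items()}
best = max(scores, key=scores.get)
print(best, {k: round(v, 3) for k, v in scores.items()})
```

Which algorithm wins depends on the data; the point is that the comparison is automated and reported as performance metrics rather than decided by hand.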
So far we've been showing you the Data Science Experience available behind the firewall that we're using to build and train models. Through a new publish feature, you can build models and deploy them anywhere. In another environment, private, public, or anywhere else, with just a few clicks. So here we're publishing our model to the Watson machine learning service. It happens to be in the IBM cloud. And also deeply integrated with our Data Science Experience. After publishing and switching to the Watson machine learning service, you can see that our stock affinity model that we just published is there and ready for use. So this is incredibly important. I just want to say it again. The Data Science Experience allows you to train models behind your own firewall, take advantage of your proprietary and sensitive data, and then deploy those models wherever you want with ease. So to summarize what we just showed you. First, IBM's Data Science Experience supports all teams. You saw how our data engineer populated our project with trusted data sets. Our data scientists developed, trained, and tested a machine learning model. Our developers used APIs to integrate machine learning into their apps. And how IT can use our Integrated Model Management dashboard to monitor and manage model performance. Second, we support all data. On premises, in the cloud, structured, unstructured, inside of your firewall, and outside of it. We help you bring analytics and governance to where your data is. Third, we support all tools. The data science tools that you depend on are readily available and deeply integrated. This includes capabilities from great partners like Hortonworks. And powerful tools like our very own IBM SPSS. And fourth, and finally, we support all deployments. You can build your models anywhere, and deploy them right next to where your data is. Whether that's in the public cloud, private cloud, or even on the world's most reliable transaction platform, IBM z.
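The train-behind-the-firewall, deploy-anywhere idea rests on exporting a fitted model from the training environment and loading it wherever scoring happens. A minimal sketch using scikit-learn and Python's pickle; the actual publish mechanics of the Watson machine learning service are not shown here, only the general serialization pattern:

```python
import pickle
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Train behind the firewall on sensitive data...
X, y = make_classification(n_samples=500, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

# ...export the fitted model as bytes (in practice a file, an
# object store, or a model repository)...
payload = pickle.dumps(model)

# ...and load it in the scoring environment, which never needs
# access to the raw training data.
scorer = pickle.loads(payload)
assert (scorer.predict(X) == model.predict(X)).all()
```

The scoring side sees only the trained weights, which is what lets the sensitive training data stay behind the firewall.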
So see for yourself. Go to the Data Science Experience website, take us for a spin. And if you happen to be ready right now, our recently created Data Science Elite Team can help you get started and run experiments alongside you at no charge. Thank you very much. >> Thank you very much Daniel. It seems like a great time to get started. And thanks to Siva for taking us through it. Rob and I will be back in just a moment to add some perspective right after this. All right, once again joined by Rob Thomas. And Rob, obviously we got a lot of information here. >> Yes, we've covered a lot of ground. >> This is intense. You've got to break it down for me cause I think we should zoom out and see the big picture. What can better data science deliver to a business? Why is this so important? I mean we've heard it through and through. >> Yeah, well, I heard it a couple times. But it starts with businesses have to embrace a data driven culture. And it is a change. And we need to make data accessible with the right tools in a collaborative culture, because we've got diverse skill sets in every organization. But data driven companies succeed when data science tools are in the hands of everyone. And I think that's a new thought. I think most companies think just get your data scientists some tools, you'll be fine. This is about tools in the hands of everyone. I think the panel did a great job of describing how we get to data science for all. Building a data culture, making it a part of your everyday operations, and the highlights of what Daniel just showed us, that's some pretty cool features for how organizations can get to this. You can see how IBM's Data Science Experience supports all teams. You saw data analysts, data scientists, application developers, IT staff, all working together. Second, you saw how we support all tools. And your choice of tools. So the most popular data science libraries integrated into one platform.
And we saw some new capabilities that help companies avoid lock-in, where you can import existing models created from specialist tools like SPSS or others. And then deploy them and manage them inside of Data Science Experience. That's pretty interesting. And lastly, you see we continue to build on this best of open tools. Partnering with companies like H2O, Hortonworks, and others. Third, you can see how you use all data no matter where it lives. That's a key challenge every organization's going to face. Private, public, federating all data sources. We announced new integration with the Hortonworks data platform where we deploy machine learning models where your data resides. That's been a key theme. Analytics where the data is. And lastly, supporting all types of deployments. Deploy them in your Hadoop cluster. Deploy them in your Integrated Analytic System. Or deploy them in z, just to name a few. A lot of different options here. But look, don't believe anything I say. Go try it for yourself. Data Science Experience, anybody can use it. Go to datascience.ibm.com and look, if you want to start right now, we just created a team that we call Data Science Elite. These are the best data scientists in the world that will come sit down with you and co-create solutions, models, and prove out a proof of concept. >> Good stuff. Thank you Rob. So you might be asking what does an organization look like that embraces data science for all? And how could it transform your role? I'm going to head back to the office and check it out. Let's start with the perspective of the line of business. What's changed? Well, now you're starting to explore new business models. You've uncovered opportunities for new revenue sources and all that hidden data. And being disrupted is no longer keeping you up at night. As a data science leader, you're beginning to collaborate with a line of business to better understand and translate the objectives into the models that are being built. 
Your data scientists are also starting to collaborate with the less technical team members and analysts who are working closest to the business problem. And as a data scientist, you stop feeling like you're falling behind. Open source tools are keeping you current. You're also starting to operationalize the work that you do. And you get to do more of what you love. Explore data, build models, put your models into production, and create business impact. All in all, it's not a bad scenario. Thanks. All right. We are back and coming up next, oh this is a special time right now. Cause we got a great guest speaker. New York Magazine called him the spreadsheet psychic and number crunching prodigy who went from correctly forecasting baseball games to correctly forecasting presidential elections. He even invented a proprietary algorithm called PECOTA for predicting future performance by baseball players and teams. And his New York Times bestselling book, The Signal and the Noise, was named by Amazon.com as the number one best non-fiction book of 2012. He's currently the Editor in Chief of the award winning website, FiveThirtyEight and appears on ESPN as an on air commentator. Big round of applause. My pleasure to welcome Nate Silver. >> Thank you. We met backstage. >> Yes. >> It feels weird to re-shake your hand, but you know, for the audience. >> I had to give the intense firm grip. >> Definitely. >> The ninja grip. So you and I have crossed paths kind of digitally in the past, which is really interesting. I started my career at ESPN. And I started as a production assistant, then later back on air for sports technology. And I go to you to talk about sports because-- >> Yeah. >> Wow, has ESPN upped their game in terms of understanding the importance of data and analytics. And what it brings. Not just to MLB, but across the board. >> No, it's really infused into the way they present the broadcast. You'll have win probability on the bottom line.
And they'll incorporate FiveThirtyEight metrics into how they cover college football for example. So, ESPN ... Sports is maybe the perfect, if you're a data scientist, like the perfect kind of test case. And the reason being that sports consists of problems that have rules. And have structure. And when problems have rules and structure, then it's a lot easier to work with. So it's a great way to kind of improve your skills as a data scientist. Of course, there are also important real world problems that are more open ended, and those present different types of challenges. But it's such a natural fit. The teams. Think about the teams playing the World Series tonight. The Dodgers and the Astros are both like very data driven, especially Houston. Golden State Warriors, the NBA Champions, extremely data driven. New England Patriots, relative to an NFL team, it's shifted a little bit, the NFL bar is lower. But the Patriots are certainly very analytical in how they make decisions. So, you can't talk about sports without talking about analytics. >> And I was going to save the baseball question for later. Cause we are moments away from game seven. >> Yeah. >> Is everyone else watching game seven? It's been an incredible series. Probably one of the best of all time. >> Yeah, I mean-- >> You have a prediction here? >> You can mention that too. So I don't have a prediction. FiveThirtyEight has the Dodgers with a 60% chance of winning. >> [Katie] LA Fans. >> So you have two teams that are about equal. But the Dodgers pitching staff is in better shape at the moment. The end of a seven game series. And they're at home. >> But the statistics behind the two teams is pretty incredible. >> Yeah. It's like the first World Series in I think 56 years or something where you have two 100 win teams facing one another. There has been a lot of parity in baseball for a lot of years. Not that many offensive overall juggernauts.
But this year, and last year with the Cubs and the Indians too really. But this year, you have really spectacular teams in the World Series. It kind of is a showcase of modern baseball. Lots of home runs. Lots of strikeouts. >> [Katie] Lots of extra innings. >> Lots of extra innings. Good defense. Lots of pitching changes. So if you love the modern baseball game, it's been about the best example that you've had. If you like a little bit more contact, and fewer strikeouts, maybe not so much. But it's been a spectacular and very exciting World Series. >> It's amazing to talk. MLB is huge with analysis. I mean, hands down. But across the board, if you can provide a few examples. Because there's so many teams in front offices putting such a heavy intensity on the analysis side. And where the teams are going. And if you could provide any specific examples of teams that have really blown your mind. Especially over the last year or two. Because every year it gets more exciting if you will. >> I mean, so a big thing in baseball is defensive shifts. So if you watch tonight, you'll probably see a couple of plays where if you're used to watching baseball, a guy makes really solid contact. And there's a fielder there that you don't think should be there. But that's really very data driven, where you analyze where does this guy hit the ball. That part's not so hard. But also there's game theory involved. Because you have to adjust for the fact that he knows where you're positioning the defenders. He's trying therefore to make adjustments to his own swing, and so that's been a major innovation in how baseball is played. You know, how bullpens are used too. Where teams have realized that actually having a guy, across all sports pretty much, realizing the importance of rest. And of fatigue. And that you can be the best pitcher in the world, but guess what? After four or five innings, you're probably not as good as a guy who has a fresh arm necessarily.
So I mean, it really is like, these are not subtle things anymore. It's not just oh, on base percentage is valuable. It really affects kind of every strategic decision in baseball. The NBA, if you watch an NBA game tonight, see how many three point shots are taken. That's in part because of data. And teams realizing hey, three points is worth more than two, and once you're more than about five feet from the basket, the shooting percentage gets really flat. And so it's revolutionary, right? Like teams will shoot almost half their shots from the three point range nowadays. Larry Bird, who wound up being one of the greatest three point shooters of all time, took only eight three pointers his first year in the NBA. It's quite noticeable if you watch baseball or basketball in particular. >> Not to focus too much on sports. One final question. In terms of Major League Soccer, and now in the NFL, we're having the analysis and having wearables where they can now showcase, if they wanted to on screen, heart rate and breathing and how much exertion. How much data is too much data? And when does it ruin the sport? >> So, I don't think, I mean, again, it goes sport by sport a little bit. I think in basketball you actually have a more exciting game. I think the game is more open now. You have more three pointers. You have guys getting higher assist totals. But you know, I don't know. I'm not one of those people who thinks look, if you love baseball or basketball, and you go in to work for the Astros, the Yankees or the Knicks, they probably need some help, right? You really have to be passionate about that sport. Because it's all based on what questions am I asking? As I'm a fan or I guess an employee of the team. Or a player watching the game. And there isn't really any substitute I don't think for the insight and intuition that a curious human has to kind of ask the right questions.
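The three-point logic Nate describes is a plain expected-value calculation. The shooting percentages below are illustrative round numbers, not league averages:

```python
def expected_points(points, make_probability):
    """Expected value of a shot: points scored times the
    probability of making the shot."""
    return points * make_probability

# An illustrative long two versus an illustrative three: the three
# produces more points per attempt despite the lower make rate.
long_two = expected_points(2, 0.40)
three = expected_points(3, 0.35)
print(long_two, three)
```

Once the make probability "gets really flat" beyond a few feet, the extra point dominates the math, which is why modern offenses trade long twos for threes.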
So we can talk at great length about what tools do you then apply when you have those questions, but that still comes from people. I don't think machine learning could help with what questions do I want to ask of the data. It might help you get the answers. >> If you have a mid-fielder in a soccer game though, not exerting, only 80%, and you're seeing that on a screen as a fan, and you're saying could that person get fired at the end of the day? One day, with the data? >> So we found that actually, in soccer in particular, some of the better players are actually more still. So Leo Messi, maybe the best player in the world, doesn't move as much as other soccer players do. And the reason being that A) he kind of knows how to position himself in the first place. B) he realizes that if you make a run and you're out of position, that's quite fatiguing. And particularly soccer, like basketball, is a sport where it's incredibly fatiguing. And so, sometimes the guys who conserve their energy, that kind of old school mentality that you have to hustle at every moment, that is not helpful to the team if you're hustling on an irrelevant play. And therefore, on a critical play, can't get back on defense, for example. >> Sports, but also data is moving exponentially, as we're just speaking about today. Tech, healthcare, every different industry. Is there any particular one that's a favorite of yours to cover? And I imagine they're all different as well. >> I mean, I do like sports. We cover a lot of politics too. Which is different. I mean in politics I think people aren't intuitively as data driven as they might be in sports for example. It's impressive to follow the breakthroughs in artificial intelligence. It started out just as kind of playing games and playing chess and poker and Go and things like that. But you really have seen a lot of breakthroughs in the last couple of years. But yeah, it's kind of infused into everything really.
>> You're known for your work in politics though. Especially presidential campaigns. >> Yeah. >> This year, in particular. Was it insanely challenging? What was the most notable thing that came out of any of your predictions? >> I mean, in some ways, looking at the polling was the easiest lens to look at it. So I think there's kind of a myth that last year's result was a big shock and it wasn't really. If you did the modeling in the right way, then you realized that number one, polls have a margin of error. And so when a candidate has a three point lead, that's not particularly safe. Number two, the outcome between different states is correlated. Meaning that it's not that much of a surprise that Clinton lost Wisconsin and Michigan and Pennsylvania and Ohio. You know I'm from Michigan. Have friends from all those states. Kind of the same types of people in those states. Those outcomes are all correlated. So what people thought was a big upset for the polls I think was an example of how data science done carefully and correctly where you understand probabilities, understand correlations. Our model gave Trump a 30% chance of winning. Other models gave him a 1% chance. And so that was interesting in that it showed that number one, that modeling strategies and skill do matter quite a lot. When you have someone saying 30% versus 1%. I mean, that's a very very big spread. And number two, that these aren't like solved problems necessarily. Although again, the problem with elections is that you only have one election every four years. So I can be very confident that I have a better model. Even one year of data doesn't really prove very much. Even five or 10 years doesn't really prove very much. And so, being aware of the limitations to some extent intrinsically in elections when you only get one kind of new training example every four years, there's not really any way around that. There are ways to be more robust to sparse data environments.
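The correlated-error argument above can be sketched with a toy Monte Carlo simulation. This is an illustrative sketch only, not FiveThirtyEight's actual model; the three point lead, the size of the polling error, and the correlation weight are all assumed numbers chosen for illustration.

```python
import random

# Hypothetical illustration: three states where the leading candidate is up
# 3 points in the polls. If polling errors were independent across states,
# losing all three would be very unlikely. If the errors are correlated
# (the same kinds of voters are missed everywhere), one bad assumption can
# flip every state at once -- which is why a careful model gives the
# trailing candidate a much bigger chance than naive models do.

random.seed(42)
TRIALS = 100_000
LEAD = 3.0     # assumed polling lead, in points
ERR_SD = 4.0   # assumed polling-error scale, in points

def upset_probability(shared_weight):
    """Fraction of simulations where the trailing candidate sweeps all 3 states.

    shared_weight: how much of each state's polling error comes from one
    shared national error versus state-specific noise (0 = independent
    errors, 1 = perfectly correlated errors).
    """
    upsets = 0
    for _ in range(TRIALS):
        national = random.gauss(0, ERR_SD)  # shared national polling error
        margins = [
            LEAD + shared_weight * national
                 + (1 - shared_weight) * random.gauss(0, ERR_SD)
            for _ in range(3)
        ]
        if all(m < 0 for m in margins):  # trailing candidate wins every state
            upsets += 1
    return upsets / TRIALS

print(f"independent errors : {upset_probability(0.0):.1%}")
print(f"correlated errors  : {upset_probability(0.8):.1%}")
```

With independent errors the sweep probability stays near the "1% chance" end of the spread; letting most of the error be shared across states pushes it up by an order of magnitude, mirroring the 30% versus 1% gap described above.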
But if you're identifying different types of business problems to solve, figuring out what's a solvable problem where I can add value with data science is a really key part of what you're doing. >> You're such a leader in this space. In data and analysis. It would be interesting to kind of peek back the curtain, understand how you operate but also how large is your team? How you're putting together information. How quickly you're putting it out. Cause I think in this right now world where everybody wants things instantly-- >> Yeah. >> There's also, you want to be first too in the world of journalism. But you don't want to be inaccurate because that's your credibility. >> We talked about this before, right? I think on average, speed is a little bit overrated in journalism. >> [Katie] I think it's a big problem in journalism. >> Yeah. >> Especially in the tech world. You have to be first. You have to be first. And it's just pumping out, pumping out. And there's got to be more time spent on stories if I can speak subjectively. >> Yeah, for sure. But at the same time, we are reacting to the news. And so we have people that come in, we hire most of our people actually from journalism. >> [Katie] How many people do you have on your team? >> About 35. But, if you get someone who comes in from an academic track for example, they might be surprised at how fast journalism is. That even though we might be slower than the average website, the fact that there's a tragic event in New York, are there things we have to say about that? A candidate drops out of the presidential race, are things we have to say about that. In periods ranging from minutes to days as opposed to kind of weeks to months to years in the academic world. The corporate world moves faster. What is a little different about journalism is that you are expected to have more precision where people notice when you make a mistake. In corporations, you have maybe less transparency. 
If you make 10 investments and seven of them turn out well, then you'll get a lot of profit from that, right? In journalism, it's a little different. If you make kind of ten predictions or say ten things, and seven of them are very accurate and three of them aren't, you'll still get criticized a lot for the three. Just because that's kind of the way that journalism is. And so the kind of combination of needing, not having that much tolerance for mistakes, but also needing to be fast. That is tricky. And I criticize other journalists sometimes including for not being data driven enough, but the best excuse any journalist has, this is happening really fast and it's my job to kind of figure out in real time what's going on and provide useful information to the readers. And that's really difficult. Especially in a world where literally, I'll probably get off the stage and check my phone and who knows what President Trump will have tweeted or what things will have happened. But it really is a kind of 24/7. >> Well because it's 24/7 with FiveThirtyEight, one of the most well known sites for data, are you feeling micromanagey on your people? Because you do have to hit this balance. You can't have something come out four or five days later. >> Yeah, I'm not -- >> Are you overseeing everything? >> I'm not by nature a micromanager. And so you try to hire well. You try and let people make mistakes. And the flip side of this is that if a news organization never had any mistakes, never had any corrections, that's wrong, right? You have to have some tolerance for error because you are trying to decide things in real time. And figure things out. I think transparency's a big part of that. Say here's what we think, and here's why we think it. If we have a model to say it's not just the final number, here's a lot of detail about how that's calculated. In some cases we release the code and the raw data. Sometimes we don't because there's a proprietary advantage.
But quite often we're saying we want you to trust us and it's so important that you trust us, here's the model. Go play around with it yourself. Here's the data. And that's also I think an important value. >> That speaks to open source. And your perspective on that in general. >> Yeah, I mean, look, I'm a big fan of open source. I worry that I think sometimes the trends are a little bit away from open source. But by the way, one thing that happens when you share your data or you share your thinking at least in lieu of the data, and you can definitely do both is that readers will catch embarrassing mistakes that you made. By the way, even having open sourceness within your team, I mean we have editors and copy editors who often save you from really embarrassing mistakes. And by the way, it's not necessarily people who have a training in data science. I would guess that of our 35 people, maybe only five to 10 have a kind of formal background in what you would call data science. >> [Katie] I think that speaks to the theme here. >> Yeah. >> [Katie] That everybody's kind of got to be data literate. >> But yeah, it is like you have a good intuition. You have a good BS detector basically. And you have a good intuition for hey, this looks a little bit out of line to me. And sometimes that can be based on domain knowledge, right? We have one of our copy editors, she's a big college football fan. And we had an algorithm we released that tries to predict what the human being selection committee will do, and she was like, why is LSU rated so high? Cause I know that LSU sucks this year. And we looked at it, and she was right. There was a bug where it had forgotten to account for their last game where they lost to Troy or something and so -- >> That also speaks to the human element as well. >> It does. 
In general as a rule, if you're designing a kind of regression based model, it's different in machine learning where you have more, when you kind of build in the tolerance for error. But if you're trying to do something more precise, then so much of it is just debugging. It's saying that looks wrong to me. And I'm going to investigate that. And sometimes it's not wrong. Sometimes your model actually has an insight that you didn't have yourself. But fairly often, it is. And I think kind of what you learn is like, hey if there's something that bothers me, I want to go investigate that now and debug that now. Because the last thing you want is where all of a sudden, the answer you're putting out there in the world hinges on a mistake that you made. Cause you never know if you have so to speak, 1,000 lines of code and they all perform something differently. You never know when you get in a weird edge case where this one decision you made winds up being the difference between your having a good forecast and a bad one. In a defensible position and an indefensible one. So we definitely are quite diligent and careful. But it's also kind of knowing like, hey, where is an approximation good enough and where do I need more precision? Cause you could also drive yourself crazy in the other direction where you know, it doesn't matter if the answer is 91.2 versus 90. And so you can kind of go 91.2, three, four and it's like kind of A) false precision and B) not a good use of your time. So that's where I do still spend a lot of time is thinking about which problems are "solvable" or approachable with data and which ones aren't. And when they're not by the way, you're still allowed to report on them. We are a news organization so we do traditional reporting as well. And then kind of figuring out when do you need precision versus when is being pointed in the right direction good enough?
>> I would love to get inside your brain and see how you operate on just like an everyday walking to Walgreens movement. It's like oh, if I cross the street in .2-- >> It's not, I mean-- >> Is it like maddening in there? >> No, not really. I mean, I'm like-- >> This is an honest question. >> If I'm looking for airfares, I'm a little more careful. But no, part of it's like you don't want to waste time on unimportant decisions, right? I will sometimes, if I can't decide what to eat at a restaurant, I'll flip a coin. If the chicken and the pasta both sound really good-- >> That's not high tech Nate. We want better. >> But that's the point, right? It's like both the chicken and the pasta are going to be really darn good, right? So I'm not going to waste my time trying to figure it out. I'm just going to have an arbitrary way to decide. >> Seriously though, in business, how organizations in the last three to five years have just evolved with this data boom. How are you seeing it from a consultant point of view? Do you think it's an exciting time? Do you think it's a you must act now time? >> I mean, we do know that you definitely see a lot of talent among the younger generation now. So FiveThirtyEight has been at ESPN for four years now. And man, the quality of the interns we get has improved so much in four years. The quality of the kind of young hires that we make straight out of college has improved so much in four years. So you definitely do see a younger generation for which this is just part of their bloodstream and part of their DNA. And also, particular fields that we're interested in. So we're interested in people who have both a data and a journalism background. We're interested in people who have a visualization and a coding background. A lot of what we do is very much interactive graphics and so forth. And so we do see those skill sets coming into play a lot more.
And so the kind of shortage of talent that had I think frankly been a problem for a long time, I'm optimistic based on the young people in our office, it's a little anecdotal but you can tell that there are so many more programs that are kind of teaching students the right set of skills that maybe weren't taught as much a few years ago. >> But when you're seeing these big organizations, ESPN as a perfect example, moving more towards data and analytics than ever before. >> Yeah. >> You would say that's obviously true. >> Oh for sure. >> If you're not moving that direction, you're going to fall behind quickly. >> Yeah and the thing is, if you read my book or I guess people have a copy of the book. In some ways it's saying hey, there are a lot of ways to screw up when you're using data. And we've built bad models. We've had models that were bad and got good results. Good models that got bad results and everything else. But the point is that the reason to be out in front of the problem is so you give yourself more runway to make errors and mistakes. And to learn kind of what works and what doesn't and which people to put on the problem. I sometimes do worry that a company says oh we need data. And everyone kind of agrees on that now. We need data science. Then they have some big test case. And they have a failure. And they maybe have a failure because they didn't know really how to use it well enough. But learning from that and iterating on that. And so by the time that you're on the third generation of kind of a problem that you're trying to solve, and you're watching everyone else make the mistake that you made five years ago, I mean, that's really powerful. And that means that getting invested in it now, getting invested both in technology and the human capital side, is important. >> Final question for you as we run out of time. 2018 and beyond, what is your biggest project in terms of data gathering that you're working on? >> There's a midterm election coming up.
That's a big thing for us. We're also doing a lot of work with NBA data. So for four years now, the NBA has been collecting player tracking data. So they have 3D cameras in every arena. So they can actually kind of quantify, for example, how fast a fast break is. Or literally where a player is and where the ball is. For every NBA game now for the past four or five years. And there hasn't really been an overall metric of player value that's taken advantage of that. The teams do it. But in the NBA, the teams are a little bit ahead of journalists and analysts. So we're trying to have a really truly next generation stat. It's a lot of data. Sometimes I now oversee things more than I do them myself. And so you're parsing through many, many, many lines of code. But yeah, so we hope to have that out at some point in the next few months. >> Anything you've personally been passionate about that you've wanted to work on and kind of solve? >> I mean, the NBA thing, I am a pretty big basketball fan. >> You can do better than that. Come on, I want something real personal that you're like I got to crunch the numbers. >> You know, we tried to figure out where the best burrito in America was a few years ago. >> I'm going to end it there. >> Okay. >> Nate, thank you so much for joining us. It's been an absolute pleasure. Thank you. >> Cool, thank you. >> I thought we were going to chat World Series, you know. Burritos, important. I want to thank everybody here in our audience. Let's give him a big round of applause. >> [Nate] Thank you everyone. >> Perfect way to end the day. And for a replay of today's program, just head on over to ibm.com/dsforall. I'm Katie Linendoll. And this has been Data Science for All: It's a Whole New Game. Hi guys, I just want to quickly let you know as you're exiting. A few heads up. Downstairs right now there's going to be a meet and greet with Nate.
And we're going to be doing that with clients and customers who are interested. So I would recommend before the game starts, and you lose Nate, head on downstairs. And also the gallery is open until eight p.m. with demos and activations. And tomorrow, make sure to come back too. Because we have exciting stuff. I'll be joining you as your host. And we're kicking off at nine a.m. So bye everybody, thank you so much. >> [Announcer] Ladies and gentlemen, thank you for attending this evening's webcast. If you are not attending our Cloud and Cognitive Summit tomorrow, we ask that you recycle your name badge at the registration desk. Thank you. Also, please note there are two exits on the back of the room on either side of the room. Have a good evening. Ladies and gentlemen, the meet and greet will be on stage. Thank you.

Published Date: Nov 1, 2017



Seth Dobrin & Jennifer Gibbs | IBM CDO Strategy Summit 2017


 

>> Live from Boston, Massachusetts. It's The Cube! Covering IBM Chief Data Officer's Summit. Brought to you by IBM. (techno music) >> Welcome back to The Cube's live coverage of the IBM CDO Strategy Summit here in Boston, Massachusetts. I'm your host Rebecca Knight along with my co-host Dave Vellante. We're joined by Jennifer Gibbs, the VP Enterprise Data Management of TD Bank, and Seth Dobrin who is VP and Chief Data Officer of IBM Analytics. Thanks for joining us Seth and Jennifer. >> Thanks for having us. >> Thank you. >> So Jennifer, I want to start with you. Can you tell our viewers a little about TD Bank, America's Most Convenient Bank? Based, of course, in Toronto. (laughs). >> Go figure. (laughs) >> So tell us a little bit about your business. >> So TD is a, um, very old bank, headquartered in Toronto. We do have, ah, a lot of business as well in the U.S. Through acquisition we've built quite a big business on the Eastern seaboard of the United States. We've got about 85 thousand employees and we're servicing 42 lines of business when it comes to our Data Management and our Analytics programs, bank wide.
But we heard today that it used to be the Data Warehouse that was king, but now Process is king. Can you unpack that a little bit? What does that mean? >> So, you know, to make value of data, it's more than just having it in one place, right? It's what you do with the data, how you ingest the data, how you make it available for other uses. And so it's really, you know, data is not for the sake of data. Data is not a digital dropping of applications, right? The whole purpose of having and collecting data is to use it to generate new value for the company. And that new value could be cost savings, it could be a cost avoidance, or it could be net new revenue. Um, and so, to do that right, you need processes. And the processes are everything from business processes, to technical processes, to implementation processes. And so it's the whole, you need all of it. >> And so Jennifer, I don't know if you've seen kind of a similar evolution from data warehouse to data everywhere, I'm sure you have. >> Yeah. >> But the data quality problem was hard enough when you had this sort of central master data management approach. How are you dealing with it? Is there less of a single version of the truth now than there ever was, and how do you deal with the data quality challenge? >> I think it's important to scope out the work effort in a way that you can get the business moving in the right direction without overwhelming and focusing on the areas that are most important to the bank. So, we've identified and scoped out what we call critical data. So each line of business has to identify what's critical to them. It relates very strongly to what Seth said around what are your core business processes and what data are you leveraging to provide value to the bank. So, um, data quality for us is about a consistent approach, to ensure the most critical elements of data that are used for business processes are where they need to be from a quality perspective.
>> You can go down a huge rabbit hole with data quality too, right? >> Yeah. >> Data quality is about what's good enough, and defining, you know. >> Right. >> Mm-hmm (affirmative) >> It's not, I liked your, someone, I think you said, it's not about data quality, it's about, you know it's, you got to understand what good enough is, and it's really about, you know, what is the state of the data and under, it's really about understanding the data, right? Than it is perfection. There are some cases, especially in banking, where you need perfection, but there's tons of cases where you don't. And you shouldn't spend a lot of resources on something that's not value added. And I think it's important to do, even things like, data quality, around a specific use case so that you do it right. >> And what you were saying too, is that it's good enough but then that, that standard is changing too, all the time. >> Yeah and that changes over time and it's, you know, if you drive it by use case and not just have this boil the ocean kind of approach where all data needs to be perfect. And all data will never be perfect. And back to your question about processes, usually, a data quality issue, is not a data issue, it's a process issue. You get bad data quality because a process is broken or it's not working for a business or it's changed and no one's documented it so there's a work around, right? And so that's really where your data quality issues come from. Um, and I think that's important to remember. >> Yeah, and I think also coming out of the data quality efforts that we're making, to your point, is it central wise or is it cross business? It's really driving important conversations around who's the producer of this data, who's the consumer of this data? What does data quality mean to you?
So it's really generating a lot of conversation across lines of business so that we can start talking about data in more of a shared way versus more of a business by business point of view. So those conversations are important by-products I would say of the individual data quality efforts that we're doing across the bank. >> Well, and of course, you're in a regulated business so you can have the big hammer of hey, we've got regulations, so if somebody spins up a Hadoop Cluster in some line of business you can reel 'em in, presumably, more easily, maybe not always. Seth, you operate in an unregulated business. You consult with clients that are in unregulated businesses, is that a bigger challenge for you to reel in? >> So, I think, um, I think that's changing. >> Mm-hmm (affirmative) >> You know, there's new regulations coming out in Europe that basically have global impact, right? This whole GDPR thing. It's not just if you're based in Europe. It's if you have a subject in Europe and that's an employee, a contractor, a customer. And so everyone is subject to regulations now, whether they like it or not. And, in fact, there was some level of regulation even in the U.S., which is kind of the wild, wild, west when it comes to regulations. But I think, um, you should, even doing it because of regulation is not the right answer. I mean it's a great stick to hold up. It's great to be able to go to your board and say, "Hey if we don't do this, we need to spend this money 'cause it's going to cost us, in the case of GDPR, four percent of our revenue per instance." Yikes, right? But really it's about what's the value and how do you use that information to drive value. A lot of these regulations are about lineage, right? Understanding where your data came from, how it's being processed, who's doing what with it. A lot of it is around quality, right? >> Yep. >> And so these are all good things, even if you're not in a regulated industry. 
And they help you build a better connection with your customer, right? I think lots of people are scared of GDPR. I think it's a really good thing because it forces companies to build a personal relationship with each of their clients. Because you need to get consent to do things with their data, very explicitly. No more of these 30 pages, two point font, you know ... >> Click a box. >> Click a box. >> Yeah. >> It's, I am going to use your data for X. Are you okay with that? Yes or no. >> So I'm interested from, to hear from both of you, what are you hearing from customers on this? Because this is such a sensitive topic and, in particularly, financial data, which is so private. What are you, what are you hearing from customers on this? >> Um, I think customers are, um, are, especially us in our industry, and us as a bank. Our relationship with our customer is top priority and so maintaining that trust and confidence is always a top priority. So whenever we leverage data or look for use cases to leverage data, making sure that that trust will not be compromised is critically important. So finding that balance between innovating with data while also maintaining that trust and frankly being very transparent with customers around what we're using it for, why we're using it, and what value it brings to them, is something that we're focused on with, with all of our data initiatives. >> So, big part of your job is understanding how data can affect and contribute to the monetization, you know, of your businesses. Um, at the simplest level, two ways, cut costs, increase revenue. Where do you each see the emphasis? I'm sure both, but is there a greater emphasis on cutting costs 'cause you're both established, you know, businesses, with hundreds of thousands, well in your case, 85 thousand employees. Where do you see the emphasis? Is it greater on cutting costs or not necessarily? >> I think for us, I don't necessarily separate the two. 
Anything we can do to drive more efficiency within our business processes is going to help us focus our efforts on innovative use of data, innovative ways to interact with our customers, innovative ways to understand more about our customers. So, I see them both as, um, I don't see them mutually exclusive, I see them as contributing to each. >> Mm-hmm (affirmative) >> So our business cases tend to have an efficiency slant to them or a productivity slant to them and that helps us redirect effort to other, other things that provide extra value to our clients. So I'd say it's a mix. >> I mean I think, I think you have to do the cost savings and cost avoidance ones first. Um, you learn a lot about your data when you do that. You learn a lot about the gaps. You learn about how would I even think about bringing external data in to generate that new revenue if I don't understand my own data? How am I going to tie 'em all together? Um, and there's a whole lot of cultural change that needs to happen before you can even start generating revenue from data. And you kind of cut your teeth on that by doing the really, simple cost savings, cost avoidance ones first, right? Inevitably, maybe not in the bank, but inevitably most company's supply chain. Let's go find money we can take out of your supply chain. Most companies, if you take out one percent of the supply chain budget, you're talking a lot of money for the company, right? And so you can generate a lot of money to free up to spend on some of these other things. >> So it's a proof of concept to bring everyone along. >> Well it's a proof of concept but it's also, it's more of a cultural change, right? >> Mm-hmm (affirmative) It's not even, you don't even frame it up as a proof of concept for data or analytics, you just frame it up, we're going to save the company, you know, one percent of our supply chain, right? We're going to save the company a billion dollars. >> Yes. 
>> And then there's gain share there 'cause we're going to put that thing there. >> And then there's a gain share and then other people are like, "Well, how do I do that?". And how do I do that, and how do I do that? And it kind of picks up. >> Mm-hmm (affirmative) But I don't think you can jump just to making new revenue. You got to kind of get there iteratively. >> And it becomes a virtuous circle. >> It becomes a virtuous circle and you kind of change the culture as you do it. But you got to start with, I don't, I don't think they're mutually exclusive, but I think you got to start with the cost avoidance and cost savings. >> Mm-hmm (affirmative) >> Great. Well, Seth, Jennifer thanks so much for coming on The Cube. We've had a great conversation. >> Thanks for having us. >> Thanks. >> Thanks you guys. >> We will have more from the IBM CDO Summit in Boston, Massachusetts, just after this. (techno music)

Published Date : Oct 25 2017



Rob Thomas, IBM | Big Data NYC 2017


 

>> Voiceover: Live from midtown Manhattan, it's theCUBE! Covering Big Data New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. >> Okay, welcome back everyone, live in New York City this is theCUBE's coverage of, eighth year doing Hadoop World now, evolved into Strata Hadoop, now called Strata Data, it's had many incarnations but O'Reilly Media running their event in conjunction with Cloudera, mainly an O'Reilly media show. We do our own show called Big Data NYC here with our community with theCUBE bringing you the best interviews, the best people, entrepreneurs, thought leaders, experts, to get the data and try to project the future and help users find the value in data. My next guest is Rob Thomas, who is the General Manager of IBM Analytics, theCUBE Alumni, been on multiple times successfully executing in the San Francisco Bay area. Great to see you again. >> Yeah John, great to see you, thanks for having me. >> You know IBM has really been interesting through its own transformation and a lot of people will throw IBM in that category but you guys have been transforming okay and the scoreboard has yet to show, in my mind, what's truly happening because if you still look at this industry, we're only eight years into what Hadoop evolved into now as a large data set but the analytics game just seems to be getting started with the cloud now coming over the top, you're starting to see a lot of cloud conversations in the air. Certainly there's a lot of AI washing, you know, AI this, but it's machine learning and deep learning at the heart of it as innovation but a lot more work on the analytics side is coming. You guys are at the center of that. What's the update? What's your view of this analytics market? >> Most enterprises struggle with complexity. That's the number one problem when it comes to analytics. It's not imagination, it's not willpower, in many cases, it's not even investment, it's just complexity. 
We are trying to make data really simple to use and the way I would describe it is we're moving from a world of products to platforms. Today, if you want to go solve a data governance problem you're typically integrating 10, 15 different products. And the burden then is on the client. So, we're trying to make analytics a platform game. And my view is an enterprise has to have three platforms if they're serious about analytics. They need a data manager platform for managing all types of data, public, private cloud. They need unified governance, so governance of all types of data, and they need a data science platform for machine learning. If a client has those three platforms, they will be successful with data. And what I see now is really mixed. We've got 10 products that do that, five products that do this, but it has to be integrated in a platform. >> You as in IBM, or the customer has these tools? >> Yeah, when I go see clients that's what I see is data... >> John: Disparate data log. >> Yeah, they have disparate tools and so we are unifying what we deliver from a product perspective to this platform concept. >> You guys announced an integrated analytic system, got to see my notes here, I want to get into that in a second but interesting you bring up the word platform because you know, platforms have always been kind of reserved for the big supplier but you're talking about customers having a platform, not a supplier delivering a platform per se 'cause this is where the integration thing becomes interesting. We were joking yesterday on theCUBE here, kind of just kind of ad hoc conceptually like the world has turned into a tool shed. I mean everyone has a tool shed or knows someone that has a tool shed where you have the tools in the back and they're rusty. And so, this brings up the tool conversation, there's too many tools out there that try to be platforms. >> Rob: Yes. >> And if you have too many tools, you're not really doing the platform game right. 
And complexity also turns into when you bought a hammer it turned into a lawn mower. Right so, a lot of these companies have been groping and trying to iterate what their tool was into something else it wasn't built for. So, as the industry evolves, that's natural Darwinism if you will, they will fall to the wayside. So talk about that dynamic because you still need tooling >> Rob: Yes. but tooling will be a function of the work as Peter Burris would say, so talk about how does a customer really get that platform out there without sacrificing the tooling that they may have bought or want to get rid of. >> Well, so think about the, in enterprise today, what the data architecture looks like is, I've got this box that has this software on it, use your terms, has these types of tools on it, and it's isolated and if you want a different set of tooling, okay, move that data to this other box where we have the other tooling. So, it's very isolated in terms of how platforms have evolved or technology platforms today. When I talk about an integrated platform, we are big contributors to Kubernetes. We're making that foundational in terms of what we're doing on Private Cloud and Public Cloud is if you move to that model, suddenly what was a bunch of disparate tools are now microservices against a common architecture. And so it totally changes the nature of the data platform in an enterprise. It's a much more fluid data layer. The term I use sometimes is you have data as a service now, available to all your employees. That's totally different than I want to do this project, so step one, make room in the data center, step two, bring in a server. It's a much more flexible approach so that's what I mean when I say platform. >> So operationalizing it is a lot easier than just going down the linear path of provisioning. 
All right, so let's bring up the complexity issue because integrated and unified are two different concepts that kind of mean the same thing depending on how you look at it. When you look at the data integration problem, you've got all this complexity around governance, it's a lot of moving parts of data. How does a customer actually execute without compromising the integrity of their policies that they need to have in place? So in other words, what are the baby steps that someone can take, the customers you guys are dealing with, how do they get into the game, how do they take steps towards the outcome? They might not have the big money to push it all at once, they might want to take a risk management approach. >> I think there's a clear recipe for doing this right and we have experience of doing it well and doing it not so well, so over time we've gotten some, I'd say a pretty good perspective on that. My view is very simple, data governance has to start with a catalog. And the analogy I use is, you have to do for data what libraries do for books. And think about a library, the first thing you do with books, card catalog. You know where, you basically itemize everything, you know exactly where it sits. If you've got multiple copies of the same book, you can distinguish between which one is which. As books get older they go to archives, to microfilm or something like that. That's what you have to do with your data. >> On the front end. >> On the front end. And it starts with a catalog. And the reason I say that is, I see some organizations that start with, hey, let's go start ETL, I'll create a new warehouse, create a new Hadoop environment. That might be the right thing to do but without having a basis of what you have, which is the catalog, that's where I think clients need to start. 
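The card-catalog analogy can be sketched in a few lines of code. This is a hypothetical toy, not IBM's catalog product: the data set names, storage locations, and owner labels are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class CatalogEntry:
    name: str
    location: str      # where the data physically sits
    owner: str
    version: int = 1
    archived: bool = False

class DataCatalog:
    """Toy catalog: itemize every data set and know exactly where it sits."""
    def __init__(self):
        self._entries = {}

    def register(self, name, location, owner):
        # Multiple copies of the same "book" get distinct version numbers.
        versions = self._entries.setdefault(name, [])
        entry = CatalogEntry(name, location, owner, version=len(versions) + 1)
        versions.append(entry)
        return entry

    def lookup(self, name):
        # Serve the latest non-archived copy, like the current card in the catalog.
        live = [e for e in self._entries.get(name, []) if not e.archived]
        return live[-1] if live else None

    def archive(self, name, version):
        # Older copies go to "microfilm": still tracked, no longer served.
        for e in self._entries.get(name, []):
            if e.version == version:
                e.archived = True

catalog = DataCatalog()
catalog.register("customer_accounts", "s3://lake/raw/accounts", "retail-banking")
catalog.register("customer_accounts", "s3://lake/curated/accounts", "retail-banking")
catalog.archive("customer_accounts", 1)
print(catalog.lookup("customer_accounts").location)  # s3://lake/curated/accounts
```

The point of the sketch is the baseline it creates: once every data set is itemized with location, owner, and version, downstream work like ETL or a new warehouse has something authoritative to start from.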
>> Well, I would just add one more level of complexity just to kind of reinforce, first of all I agree with you but here's another example that would reinforce this step. Let's just say you write some machine learning and some algorithms and a new policy from the government comes down. Hey, you know, we're dealing with Bitcoin differently or whatever, some GDPR kind of thing happens where someone gets hacked and a new law comes out. How do you inject that policy? You got to rewrite the code, so I'm thinking that if you do this right, you don't have to do a lot of rewriting of applications; the library or the catalog will handle it. Is that right, am I getting that right? >> That's right 'cause then you have a baseline is what I would describe it as. It's codified in the form of a data model or in the form of an ontology for how you're looking at unstructured data. You have a baseline so then as changes come, you can easily adjust to those changes. Where I see clients struggle is if you don't have that baseline then you're constantly trying to change things on the fly and that makes it really hard to get to this... >> Well, really hard, expensive, they have to rewrite apps. >> Exactly. >> Rewrite algorithms and machine learning things that were built probably by people that maybe left the company, who knows, right? So the consequences are pretty grave, I mean, pretty big. >> Yes. >> Okay, so let's get back to something that you said yesterday. You were on theCUBE yesterday with Hortonworks CEO, Rob Bearden and you were commenting about AI or AI washing. You said quote, "You can't have AI without IA." A play on letters there, sequence of letters which was really an interesting comment, we kind of referenced it pretty much all day yesterday. Information architecture is the IA and AI is the artificial intelligence basically saying if you don't have some sort of architecture AI really can't work. 
Which really means models have to be understood, with the learning machine kind of approach. Expand more on that 'cause that was I think a fundamental thing that we're seeing at the show this week, this in New York is a model for the models. Who trains the machine learning? Machines got to learn somewhere too so there's learning for the learning machines. This is a real complex data problem and a half. If you don't set up the architecture it may not work, explain. >> So, there's two big problems enterprises have today. One is trying to operationalize data science and machine learning at scale, the other one is getting to the cloud but let's focus on the first one for a minute. The reason clients struggle to operationalize this at scale is because they start a data science project and they build a model for one discrete data set. Problem is that only applies to that data set, it doesn't, you can't pick it up and move it somewhere else so this idea of data architecture just to kind of follow through, whether it's the catalog or how you're managing your data across multiple clouds becomes fundamental because ultimately you want to be able to provide machine learning across all your data because machine learning is about predictions and it's hard to do really good predictions on a subset. But that pre-req is the need for an information architecture that comprehends the fact that you're going to build models and you want to train those models. As new data comes in, you want to keep the training process going. And that's the biggest challenge I see clients struggling with. So they'll have success with their first ML project but then the next one becomes progressively harder because now they're trying to use more data and they haven't prepared their architecture for that. 
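The point about models being bound to one discrete data set can be illustrated with a minimal sketch: instead of fitting against a single table, the model declares a feature schema, and any conforming data, including batches arriving later, can keep the training process going. The field names and the toy online-update rule here are invented assumptions for illustration, not IBM's implementation.

```python
# Hypothetical feature schema the model is bound to, instead of one data set.
SCHEMA = ("balance", "tenure_months", "num_products")

def conforms(record, schema=SCHEMA):
    """A record from any source qualifies if it carries the schema's fields."""
    return all(k in record for k in schema)

class RetrainableModel:
    """Toy online linear model: new conforming data keeps training going."""
    def __init__(self, schema=SCHEMA):
        self.schema = schema
        self.weights = dict.fromkeys(schema, 0.0)
        self.seen = 0

    def predict(self, record):
        return sum(self.weights[k] * record[k] for k in self.schema)

    def partial_fit(self, records, labels):
        for record, y in zip(records, labels):
            if not conforms(record, self.schema):
                raise ValueError("record does not match the declared schema")
            self.seen += 1
            step = 1.0 / self.seen          # shrinking learning rate
            error = y - self.predict(record)
            for k in self.schema:
                self.weights[k] += step * error * record[k]

model = RetrainableModel()
# First batch from one data set...
model.partial_fit([{"balance": 1.0, "tenure_months": 1.0, "num_products": 1.0}], [1.0])
# ...and later batches from any other conforming data set keep training going.
model.partial_fit([{"balance": 2.0, "tenure_months": 0.5, "num_products": 1.0}], [0.0])
```

The schema check is the "information architecture" step in miniature: it is what lets the same model code pick up and move to another data set instead of being welded to the first one.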
We've observed and Jim Kobielus and I were talking yesterday, there's too much work still on the data science guy's plate. They're still doing a lot of what I call, sys admin like work, not the right word, but like administrative building and wrangling. They're not doing enough data science and there are enough proof points now to show that data science actually impacts business in whether it's military having data intelligence to execute something, to selling something at the right time, or even for work or play or consumption, all the proof is out there. So why aren't we going faster, why aren't the data scientists more effective, what is it going to take for the data science to have a seamless environment that works for them? They're still doing a lot of wrangling and they're still getting down in the weeds. Is that just the role they have or how does it get easier for them, that's the big catch? >> That's not the role. So they're a victim of their architecture to some extent and that's why they end up spending 80% of their time on data prep, data cleansing, that type of thing. Look, I think we solved that. That's why when we introduced the integrated analytic system this week, that whole idea was get rid of all the data prep that you need because you land the data in one place, machine learning and data science is built into that. So everything that the data scientist struggles with today goes away. We can federate to data on cloud, on any cloud, we can federate to data that's sitting inside Hortonworks so it looks like one system but machine learning is built into it from the start. So we've eliminated the need for all of that data movement, for all that data wrangling 'cause we organized the data, we built the catalog, and we've made it really simple. And so if you go back to the point I made, so one issue is clients can't apply machine learning at scale, the other one is they're struggling to get to the cloud. 
I think we've nailed those problems 'cause now with a click of a button, you can scale this to part of the cloud. >> All right, so how does the customer get their hands on this? Sounds like it's a great tool, you're saying it's leading edge. We'll take a look at it, certainly I'll do a review on it with the team but how do I get it, how do I get a hold of this? What do I do, download it, you guys supply it to me, is it some open source, how do your customers and potential customers engage with this product? >> However they want to but I'll give you some examples. So, we have an analytic system built on Spark, you can bring the whole box into your data center and right away you're ready for data science. That's one way. Somebody like you, you're going to want to go get the containerized version, you go download it on the web and you'll be up and running instantly with a highly performing warehouse integrated with machine learning and data science built on Spark using Apache Jupyter. Any developer can go use that and get value out of it. You can also say I want to run it on my desktop. >> And that's free? >> Yes. >> Okay. >> There's a trial version out there. >> That's the open source, yeah, that's the free version. >> There's also a version on public cloud so if you don't want to download it, you want to run it outside your firewall, you can go run it on IBM cloud on the public cloud so... >> Just your cloud, Amazon? >> No, not today. >> John: Just IBM cloud, okay, I got it. >> So there's a variety of ways that you can go use this and I think what you'll find... >> But you have a premium model that people can get started with, so they'll download it to their data center, is that also free too? >> Yeah, absolutely. >> Okay, so all the base stuff is free. >> We also have a desktop version too so you can download... >> What URL can people look at this? >> Go to datascience.ibm.com, that's the best place to start a data science journey. 
Okay, multi-cloud, Common Cloud is what people are calling it, you guys have Common SQL engine. What is this product, how does it relate to the whole multi-cloud trend? Customers are looking for multiple clouds. >> Yeah, so Common SQL is the idea of integrating data wherever it is, whatever form it's in, ANSI SQL compliant so what you would expect for a SQL query and the type of response you get back, you get that back with Common SQL no matter where the data is. Now when you start thinking multi-cloud you introduce a whole other bunch of factors. Network, latency, all those types of things so what we talked about yesterday with the announcement of Hortonworks Dataplane which is kind of extending the YARN environment across multi-clouds, that's something we can plug in to. So, I think let's be honest, the multi-cloud world is still pretty early. >> John: Oh, really early. >> Our focus is delivery... >> I don't think it really exists actually. >> I think... >> It's multiple clouds but no one's actually moving workloads across all the clouds, I haven't found any. >> Yeah, I think it's hard for latency reasons today. We're trying to deliver an outstanding... >> But people are saying, I mean this is head room I got but people are saying, I'd love to have a preferred future of multi-cloud even though they're kind of getting their own shops in order, retrenching, and re-platforming it but that's not a bad ask. I mean, I'm a user, I want to move from if I don't like IBM's cloud or I got a better service, I can move around here. If Amazon is too expensive I want to move to IBM, you got product differentiation, I might want to be in your cloud. So again, this is the customer's mindset, right. If you have something really compelling on your cloud, do I have to go all in on IBM cloud to run my data? You shouldn't have to, right? >> I agree, yeah I don't think any enterprise will go all in on one cloud. 
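The federation idea behind a common SQL layer, one ANSI SQL query answered over data that physically lives in different places, can be illustrated generically. This sketch uses SQLite's ATTACH as a stand-in; it is not the Common SQL engine itself, and the table names, columns, and paths are invented for illustration.

```python
import os
import sqlite3
import tempfile

# A second "source" (say, a cloud object store), modeled here as a
# separate database file on disk.
cloud_path = os.path.join(tempfile.mkdtemp(), "cloud.db")
cloud = sqlite3.connect(cloud_path)
cloud.execute("CREATE TABLE clicks (order_id INTEGER, n INTEGER)")
cloud.executemany("INSERT INTO clicks VALUES (?, ?)", [(1, 5), (2, 9)])
cloud.commit()
cloud.close()

# The "local" source (say, an on-prem warehouse), in memory here.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 100.0), (2, 250.0)])

# Federate: attach the remote source so one SQL statement spans both,
# and the caller never needs to know where each table physically sits.
con.execute("ATTACH DATABASE ? AS cloud", (cloud_path,))
rows = con.execute("""
    SELECT o.id, o.amount, c.n
    FROM orders AS o JOIN cloud.clicks AS c ON c.order_id = o.id
    ORDER BY o.id
""").fetchall()
print(rows)  # [(1, 100.0, 5), (2, 250.0, 9)]
```

A real federated engine also has to push computation down to where each data set lives, which is exactly where the network and latency factors mentioned above come in; the ATTACH trick only illustrates the single-query surface.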
I think it's delusional for people to think that so you're going to have this world. So the reason when we built IBM Cloud Private we did it on Kubernetes was we said, that can be a substrate if you will, that provides a level of standards across multiple cloud type environments. >> John: And it's got some traction too so it's a good bet there. >> Absolutely. >> Rob, final word, just talk about the personas who you now engage with from IBM's standpoint. I know you have a lot of great developers stuff going on, you've done some great work, you've got a free product out there but you still got to make money, you got to provide value to IBM, who are you selling to, what's the main thing, you've got multiple stakeholders, could you just clarify the stakeholders that you're serving in the marketplace? >> Yeah, I mean, the emerging stakeholder that we speak with more and more than we used to is chief marketing officers who have real budgets for data and data science and trying to change how they're performing their job. That's a major stakeholder, CTOs, CIOs, any C level, >> Chief data officer. >> Chief data officer. You know chief data officers, honestly, it's a mixed bag. Some organizations they're incredibly empowered and they're driving the strategy. Others, they're figureheads and so you got to know how the organizations do it. >> A puppet for the CFO or something. >> Yeah, exactly. >> Or ops. >> A puppet? (chuckles) So, you got to you know. >> Well, they're not really driving it, they're not changing it. It's not like we're mandated to go do something they're maybe governance police or something. >> Yeah, and in some cases that's true. In other cases, they drive the data architecture, the data strategy, and that's somebody that we can engage with right away and help them out so... 
I know you guys do a lot of stuff out in the open, events they can connect with IBM, things going on? >> So we do, so we're doing a big event here in New York on November first and second where we're rolling out a lot of our new data products and cloud products so that's one coming up pretty soon. The biggest thing we've changed this year is there's such a craving from clients for education as we've started doing what we're calling Analytics University where we actually go to clients and we'll spend a day or two days, go really deep on open languages, open source. That's become kind of a new focus for us. >> A lot of re-skilling going on too with the transformation, right? >> Rob: Yes, absolutely. >> All right, Rob Thomas here, General Manager IBM Analytics inside theCUBE. CUBE alumni, breaking it down, giving his perspective. He's got two books out there, The Data Revolution was the first one. >> Big Data Revolution. >> Big Data Revolution and the new one is Every Company is a Tech Company. Love that title which is true, check it out on Amazon. Rob Thomas, Big Data Revolution, first book and then second book is Every Company is a Tech Company. It's theCUBE live from New York. More coverage after the short break. (theCUBE jingle) (theCUBE jingle) (calm soothing music)

Published Date : Oct 2 2017

