Scott Buckles, IBM | Actifio Data Driven 2020
>> Narrator: From around the globe, it's theCUBE, with digital coverage of Actifio Data Driven 2020, brought to you by Actifio.

>> Welcome back. I'm Stuart Miniman, and this is theCUBE's coverage of Actifio Data Driven 2020. We wish everybody could join us in Boston, but instead we're doing it online this year, of course, and really excited. We're going to be digging into the value of data, and how DataOps and data scientists are leveraging data. Joining me on the program is Scott Buckles, the North American Business Executive for Database, Data Science, and DataOps with IBM. Scott, welcome to theCUBE.

>> Thanks Stuart, thanks for having me, great to see you.

>> Let's start with the Actifio-IBM partnership. Anyone that knows Actifio knows that the IBM partnership is really the oldest one that they've had; whether it's hardware or software, those joint solutions go together. So tell us about the partnership here in 2020.

>> Sure. So it's been a fabulous partnership. In the DataOps world, where we are looking to help all of our customers gain efficiency and effectiveness in their data pipeline and get value out of their data, Actifio really complements a lot of the solutions that we have very well. So the folks, from everybody at the top all the way through the engineering team, are a great team to work with. We're very, very fortunate to have them.

>> Any specific examples, or anonymized examples, that you can share about joint (indistinct)?

>> I'm going to stay safe and go on the anonymized side. But we've had a lot of great wins, several significantly large wins, where we've had clients that have been struggling with their different data pipelines. And when I say data pipeline, I mean getting value from understanding their data, to developing models and doing the testing on that, and we can get into this in a minute, but those folks have really needed a solution, and Actifio has stepped in and provided that solution.
We've done that at several of the largest banks in the world, including one that was a very recent merger down in the Southeast, where we were able to bring in the Actifio solution and address the customer's needs around how they were testing and how they were trying to move through that testing cycle. It was a very iterative, very sequential process, and they just weren't doing it fast enough, and Actifio stepped in and helped us deliver it in a much more effective way, in a much more efficient way, especially when you get into a bank, or two banks rather, that are merging and have a lot of work to convert systems into one another and converge data. Not an easy task. And that was one of the best wins that we've had in recent months. And again, going back to the partnership, it was an awesome, awesome opportunity to work with them.

>> Well, Scott, as I teed up at the beginning of the conversation, you've got data science and DataOps. Help us understand how this isn't just a storage solution when you're talking about VDP. How does DataOps fit into this? Talk a little bit about some of the constituents inside your customers that are engaging with the solution.

>> Yeah. So we call it DataOps, and DataOps is both a methodology, which is really trying to combine the best of the way that we've transformed how we develop applications with DevOps and agile development. Going back 20 years ago, everything was a waterfall approach, everything was very slow, and then you had to wait a long time to figure out whether you had success or failure in the application that you had developed, and whether it was the right application. With the advent of DevOps and continuous delivery, the advent of things like agile development methodologies, DataOps is really converging that and applying it to our data pipelines. So when we look at the opportunity ahead of us, with the world exploding with data, we see it all the time.
And it's not just structured data anymore, it's unstructured data; it's how do we take advantage of all the data that we have so that we can make an impact on our business. But oftentimes we're seeing that it's still a very slow process. Data scientists or business analysts are struggling to get the data in the right form so that they can create a model, and then they're having to go through a long process of trying to figure out whether the model they've created in Python or R is an effective model. So DataOps is all about driving more efficiency and more speed in that process, and doing it in a much more effective manner. And we've had a lot of good success. So it's part methodology, which is really cool, applying that to certain use cases within the data science world, and then it's also part of how we build our solutions within IBM, so that we are aligning with that methodology and taking advantage of it, so that we have the AI and machine learning capabilities built in to increase that speed, which is required by our customers. Because data science is great, AI is great, but you still have to have good data underneath, and you have to do it at speed.

>> Well, yeah, Scott, definitely a theme that I heard loud and clear at IBM Think this year; we do a lot of interviews with theCUBE there. It was helping with the tools, helping with the processes, and as you said, helping customers move fast. A big piece of IBM's strategy there is the Cloud Paks. My understanding is you've got an update with regards to VDP and Cloud Pak, so tell us what the new release is here for the show.

>> Yeah. So in our (indistinct) release that's coming up, we will be able to launch VDP directly from Cloud Pak, so that you can take advantage of the Actifio capabilities, which we call Virtual Data Pipeline, straight from within Cloud Pak.
So it's a native integration, and that's the first of many things to come in how we are tying those two capabilities and those two solutions more closely together. So we're excited about it, and we're looking forward to getting it in our customers' hands.

>> All right. And that's the Cloud Pak for Data, if I have that correct, right?

>> That's Cloud Pak for Data, correct, sorry, yes. Absolutely, I should have been more clear.

>> No, it's all right. We've been watching those different solutions that IBM is building out with the Cloud Paks, and of course data, as we said, is so important. Bring us inside a little bit, if you could, the customers. What are the use cases, those problems that you're helping your customers solve with these solutions?

>> Sure. So there are three primary use cases. One is about accelerating the development process: how do you take data from its raw form, which may or may not be usable, and in a lot of cases it's not, and get it to a business-ready state, so that your data scientists, your business, your data models can take advantage of it. That's about speed. The second is about reducing storage costs. As data has grown exponentially, so have storage costs. We've been in the test data management world for a number of years now, and our ability to help customers reduce that storage footprint, which is also tied to the acceleration piece, helping them reduce that cost, is a big part of it. And then the third part is about mitigating risk. With the amount of data security challenges that we've seen, customers are continuously looking for ways to mitigate their exposure to somebody manipulating data, accessing production data and manipulating production data, especially sensitive data. And by virtualizing that data, we almost fully mitigate the risk of somebody, either unintentionally or intentionally, altering that data and exposing a client.
>> Scott, I know IBM is speaking at the Data Driven event. I read through some of the pieces that they're talking about. It looks like it's really what you talked about: accelerating customer outcomes, helping them be more productive. If you could, what are some of the key measurements, the KPIs, that your customers have when they successfully deploy the solution?

>> So when it comes to speed, it's really about how we are reducing the time of that project, right? Are we able to have a material impact on the amount of time that we see clients get through a testing cycle? Are we taking them from months to days, are we taking them from weeks to hours, having that type of material impact? The other piece, on storage costs, is certainly looking at what the future growth is. You're not necessarily going to reduce storage costs, but are you reducing the growth, the speed at which your storage costs are growing? And then the third piece is really looking at how we are minimizing the vulnerabilities that we have. When you go through an audit, internally or externally, around your data, it's understanding the number of exposures and seeing a material impact there, that those vulnerabilities are reduced.

>> Scott, last question I have for you. You talk about making data scientists more efficient and the like. What are you seeing organizationally? Have teams come together, or are they planning together? Who has the enablement to be able to leverage some of the more modern technologies out there?

>> Well, that's a great question. And it varies. I think the organizations that we see that have the most impact are the ones that are most open to bringing their data science as close to the business as possible. The ones that are integrating their data organizations, whether that's the CDO organization or wherever that may sit.
Even if you don't have a CDO, it's that data organization, whoever owns those data scientists, folding them and integrating them into the business so that they're an integral part of it, rather than a standalone organization. I think the ones that weave them into the fabric of the business are the ones that get the most benefit, and the ones we've seen have the most success thus far.

>> Well, Scott, absolutely. We know how important data is, and getting full value out of those data scientists is a critical initiative for customers. Thanks so much for joining us. Great to get the updates.

>> Oh, thank you for having me. Greatly appreciated.

>> Stay tuned for more coverage from Actifio Data Driven 2020. I'm Stuart Miniman, and thank you for watching theCUBE. (upbeat music)
Tony Higham, IBM | IBM Data and AI Forum
>> Live from Miami, Florida, it's theCUBE, covering IBM's Data and AI Forum, brought to you by IBM.

>> We're back in Miami, and you're watching theCUBE's coverage of the IBM Data and AI Forum. Tony Higham is here; he is a distinguished engineer for digital and cloud business analytics at IBM. Tony, first of all, congratulations on being a distinguished engineer. That doesn't happen often. Thank you for coming on theCUBE.

>> Thank you.

>> So your area of focus is on the BI and enterprise performance management space. Um, and if I understand it correctly, a big mission of yours is to try to modernize those, make them self-service, make them cloud-ready. How's that going?

>> It's going really well. I mean, you know, we use things like BI and enterprise performance management; when you really boil it down, it's analysis of data, and what do we do with the data that's useful, that makes a difference in the world, and then it's planning and forecasting and budgeting, which everyone has to do, whether you are, you know, a single household or whether you're an Amazon or a Boeing, which are also some of our clients. So it's interesting that we're going from really enterprise use cases, democratizing it all the way down to a single user on the cloud, credit card swipe, 70 bucks a month.

>> So you used to work for Lotus. But Cognos is one of IBM's largest acquisitions in the software space ever. Steve Mills and his team architected a complete transformation of IBM's business and really got heavily into it. I think it was a $5 billion acquisition.
Don't hold me to that, but a massive one at the time, and it's really paid dividends. Now, when all this sort of 2010s crowd came in and said, oh, Hadoop's going to kill all the traditional BI, the traditional EDW, that didn't happen. These traditional platforms were a fundamental component of people's data strategies. So that created the imperative to modernize, and to make sure that there could be things like self-service and cloud-ready, didn't it?

>> Yeah, that's absolutely true. I mean, the workloads that we run are really sticky workloads, right? When you're doing your reporting, your consolidation, or your planning of your yearly cycle, your budget cycle, on these technologies, you don't rip them out so easily. So yes, of course there's competitive disruption in the space, and of course cloud creates an opportunity for workloads to be run cheaper, without your own IT people. And of course the era of digital software, where I find it myself, I try it myself, I buy it without ever talking to a salesperson, creates a democratization process for these really powerful tools that's never been invented before in that space.

>> Now, when I started in the business a long, long time ago, it was called DSS, decision support systems, and at the time they promised a 360-degree view of the business. That never really happened. You saw a whole new raft of players come in, and then the whole BI and enterprise data warehouse wave was going to deliver on that promise. That kind of didn't happen either. Sarbanes-Oxley brought a big wave of imperative around these systems, because compliance became huge, so that was a real tailwind for it. Then Hadoop was going to solve all these problems; that really didn't happen. And now you've got AI, and it feels like the combination of those systems of record, those data warehouse systems, the traditional business intelligence systems, and all this new emerging tech together are actually going to be a game changer.
I wonder if you could comment on that.

>> Well, so they can be a game changer, but you're touching on a couple of subjects here that are connected, right? Number one is obviously the mass of data, because data has accelerated at a phenomenal pace, and then you're talking about how do I then visualize or use that data in a useful manner. And that really drives the use case for AI, right? Because AI, in and of itself, or augmented intelligence as we talk about it, is almost only useful when it's invisible to the user, because the user needs to feel like it's doing something for them that's super intuitive. A bit like the sort of transition between the electric car and the normal car: that only really happens when the electric car can do what the normal car can do. So with things like, imagine you bring a, you know, Hadoop cluster into a BI solution, and you're looking at that data. Well, if I can correlate, for example, time, profit, and cost, then I can create KPIs automatically, I can create visualizations, I know which ones you like to see from that, or I can give you related ones. I can even automatically create dashboards. I've got the intelligence about the data and the knowledge to know how you might want to visualize it, versus you having to manually construct everything.

>> And AI is also going to, when you bring these disparate data sets together, isn't AI also going to give you an indication of the confidence level in those various data sets? So, for example, you know, your BI data set might be part of the general ledger, you know, of the income statement, and be corporate fact, very high confidence level. Whereas, as you mentioned, some of the unstructured data, maybe not as high a confidence level. How are customers dealing with that and applying that? First of all, is that a sort of accurate premise? And how is that manifesting itself in terms of business?

>> Oh, yeah.
So it is an accurate premise, because in the world of data there's the known knowns and the unknown knowns, right? Known knowns are what you know about your data. What's interesting about really good BI solutions and planning solutions, especially when they're brought together, right, because planning and analysis naturally go hand in hand, from, you know, one user at 70 bucks a month to the enterprise client, is things like: what are your key drivers? So these are going to be the drivers that you know drive your profit. But when you've got massive amounts of data and you've got AI around that, especially if it's AI that's got an ontology around your particular industry, it can start telling you about drivers that you don't know about. And that's really the next step: tell me what the drivers are around things that I don't know. So when I'm exploring the data, I'd like to see a key driver that I never even knew existed.

>> So when I talk to customers, and I've been doing this for a while, one of the concerns, the criticisms, they had of the traditional systems was just that the process is too hard. I got to go to, like, a few guys I could go to, I gotta line up, you know, submit a request; by the time I get it back, I'm on to something else. I want self-serve beyond just reporting. Um, how is AI, and IBM, changing that dynamic? Can you put these tools in the hands of users?

>> Right. So this is about democratizing the cleverness, right? If you're a big, broad organization, you can afford to hire a bunch of people to do that stuff. But if you're a startup or an SMB, and that's where the big market opportunity is for us, you know, abilities like, and we're building this into the software already today, I'll bring in a spreadsheet. A lot of spreadsheets, by definition, are rows and columns, right? Anyone could take a rows-and-columns spreadsheet and turn it into a set of data, because it looks like a database.
But when you've got different tabs and different sets of data that may or may not be obviously relatable to each other, that AI ability to introspect a spreadsheet and turn it, from a planning point of view, into cubes, dimensions, and rules, which turn your spreadsheet into a three-dimensional in-memory cube or a planning application, you know, our ability to go way, way further than you could ever do with that planning process over thousands of people, is all possible now, because we've taken all the hard work, all the heavy lifting, out.

>> So, a three-dimensional in-memory cube, I like the sound of that. So there's a performance implication?

>> Absolutely.

>> And what else? Accessibility, more apps, more users? Is that...
>>You talk about Cloud and analytics, how they've they've come together, what specifically IBM has done to modernize that platform. And I'm interested in what customers are saying. What's the adoption like? >>So So I manage the Global Cloud team. We have night on 1000 clients that are using cloud the cloud implementations of our software growing actually so actually Maur on two and 1/2 1000. If you include the multi tenant version, there's two steps in this process, right when you've got an enterprise software solution, your clients have a certain expectation that your software runs on cloud just the way as it does on premise, which means in practical terms, you have to build a single tenant will manage cloud instance. And that's just the first step, right? Because getting clients to see the value of running the workload on cloud where they don't need people to install it, configure it, update it, troubleshoot it on all that other sort of I t. Stuff that subtracts you from doing running your business value. We duel that for you. But the future really is in multi tenant on how we can get vast, vast scale and also greatly lower costs. But the adoptions been great. Clients love >>it. Can you share any kind of indication? Or is that all confidential or what kind of metrics do you look at it? >>So obviously we look, we look a growth. We look a user adoption, and we look at how busy the service. I mean, let me give you the best way I can give you is a is a number of servers, volume numbers, right. So we have 8000 virtual machines running on soft layer or IBM cloud for our clients business Analytics is actually the largest client for IBM Cloud running those workloads for our clients. So it's, you know, that the adoption has been really super hard on the growth continues. Interestingly enough, I'll give you another factoid. So we just launched last October. Cognos Alex. Multi tenant. So it is truly multi infrastructure. 
You try, you buy, you give your credit card and away you go. And you would think, because we don't have software sellers out there selling it per se, that it might not get adopted as much as when people are out there selling software. Well, in one year it's growing 10% month on month, steadily 10% month on month, and we're at nearly 1,400 users now without huge amounts of effort on our part. So clearly there's market interest in running this software. And they're not one-twosies either: six people planning, some people have 150 people planning on multi-tenant software. So I believe that dedicated is the first step, to grow confidence that my on-premise investments will lift and shift to the cloud, but multi-tenant will take us a lot
For the same power mind you write little less less of the capabilities and 70 bucks for a single user. For all of it, people buy it. So I'm in. >>Tony, thanks so much for coming on. The kid was great to have you. Brilliant. Thank you. Keep it right there, everybody. We'll be back with our next guest. You watching the Cube live from the IBM data and a I form in Miami. We'll be right back.
Alyse Daghelian, IBM | IBM Data and AI Forum
>>Live from Miami, Florida. It's the cube covering IBM's data and AI forum brought to you by IBM. >>We're back in Miami. Welcome everybody. You watching the cube, the leader in live tech coverage. We're here at the IBM data and AI forum. Wow. What a day. 1700 customers. A lot of hands on labs sessions. What used to be the IBM analytics university is sort of morphed into this event. Now you see the buzz is going on. At least the Galean is here. She's the vice president of global sales for IBM data and AI. Welcome to the cube. Thank you for coming on. So this event is buzzing the double from last year almost. And uh, congratulations. >>Well, thank you very much. We have con, uh, lots of countries represented here. We have customers from small to large, every industry represented. And a, it's a, I can see a marked difference in the conversations in just a year around our, how customers want to figure out how to embark on this journey to AI. >>So yeah. So why are they come here? What's the, what's the primary motivation? >>Well, I think one IBM is recognized as the leader in AI and we just came out in the IDC survey as the three time w you know, leader, a recognized leader in AI. And when they come here they know they're going to hear from other clients who have embarked on similar journeys. They know they're going to have access to experts, hands on labs, and we bring our entire IBM team that's focused on data and AI to this event. So it's intimate, it's high skilled, it's high energy and they are learning a ton while they're. >>Yeah, a lot of content and you're educating but you're also trying to inspire people. I mean a raise. I was the hub this morning, he wrote this book, but he's this extreme, extreme, extreme like ultra marathoner. Uh, which I thought was a great talk this morning. And then you did a, I thought a good job of sort of connecting, you know, his talk of anything's possible to now bringing AI into the equation. 
What are you hearing from customers in terms of what they want to make possible, and what's that conversation like in the field?

>> Well, it's interesting, because there is a huge recognition among every client that I talk to, and they all want to understand this, that they have to be transforming their businesses on this journey to AI. So they all recognize that they need to start now. What I find when I talk to clients is that they're all coming in at different entry points. There's a maturity curve. So some are figuring out, you know, how do I move away from just Excel spreadsheets? I'm still running my business on Excel, right? And these are banks, and major ones, that are operating on Excel spreadsheets, and they're looking at niche competitors, you know, digital banks that are entering the scene. And if they don't change the way they operate, they're not going to survive. So a lot of companies are coming in knowing that they're low on the maturity curve, and they better do something to move up that curve pretty fast.

Some are in almost the second turn of the crank, where they've invested in a lot of the AI technologies, they've built data science platforms, and now they're figuring out how do they get that next rev of productivity improvement, how do they come up with that next business idea that's going to give them that competitive advantage. So what I find is every client is embarking on this journey, which is a big difference from where I think we were even a year, 18 months ago, where they were sort of just, okay, this is interesting; now it's, I better do something.

>> Okay, so you're a resource, you know, as the head of global sales for this group. So when you talk to customers that are immature, if I hear you right, they're saying, help us get started, because we're going to fall behind; we're inefficient right now, we're drowning in spreadsheets, our data quality is not where it needs to be. Help. Where do we start? What do you tell them?
>> Well, one, we have a formula that we've proven works with clients. We bring them into our garages, where we do design thinking and architectural workshops, and we figure out a use case, because what we try not to do with our clients is boil the ocean. We want them to have something they can prove success around very quickly: create that minimum viable product, bring it back to the business so the business can see, "Oh, I understand," and then evolve that use case. So we bring technical specialists and our own data scientists to these garage environments, and we work with them on building out that first use case. >> Explain the garage a little bit more. Are those sort of centers of excellence around the world? How do I tap them as a customer? Is it a freebie? Is it for pay? Is it like the Data Science Elite Team? How does it all work? >> Well, there are a number of physical locations, and it's open to all clients. We created these with co-leadership from across the entire IBM company, so our services organization and our cloud and cognitive organization all play a role in the garages. We have a formal structure where a team can engage through a request process. We help them define the use case they want to bring into the garage, we bring them in for a period of time, and we provide the resources, capabilities, and skills, and that's not charged to the client. We're trying to get them started; they'll take that back to their company and then look at follow-on opportunities, and those may work out to be different services opportunities as they move forward. But we're focused on that get-started phase. >> Yeah, I mean, you're a for-profit company, so it's great to have a loss leader, but the line outside the door at the garage must be huge. How are you managing the demand?
Yes, well, we're obviously increasing our capacity around the garages, and we're still making customers aware of them, because it's a commitment on their side; they can't just come in and kick the tires. We ask them to bring their line of business along with their technical teams into the garages, because that's where you get the best product coming out of it: when you know you've got something that's going to solve a business problem, and you have buy-in from both sides. >> I want to ask you about the AI ladder. Rob Thomas has been using this construct for a while, and it didn't just come out of thin air; I'm sure there was a lot of customer input and a lot of debate about what should be on the ladder. When I first heard of the AI ladder, it was data, analytics, ML, and AI, sort of the technology building blocks. It's now become verbs, which I love: collect, organize, analyze, and infuse, which is all about operationalizing and scaling. How is that resonating with customers, and how do they fit into that methodology or framework? >> Well, I'll tell you, I use that framework with every single client, and I describe it as a set of steps, the rungs of the ladder, that every customer has to embark upon. It starts with some very basic principles, and as soon as you start with the very basic principles, every client says, "Of course"; it seems so obvious. First and foremost, you have data as the foundation, right? AI is not created by someone in a back room; the foundation of AI is information and data. Yet every customer struggles with this: data is coming from multiple systems and multiple sources, they can't get to the data fast enough, they're shipping data around the organization, and it's not managed.
And yet they know that in five years, or it could be 12 months, but certainly in the future, the data they think they need today is going to be completely different. So how do you build an architecture that lets them build now, but gives them the agility to grow as the requirements change? You start with that basic discussion, and they're like, "Well, of course." So that's collect. Then you move up the ladder and talk about how you govern that data. How do you know where that data originated? Who is the owner? How do you know what that data means? What system did it come from? Who has access to it? How do you create that set of governed data? And of course, every client recognizes they have that set of issues. So I can keep working my way up the ladder, and every client realizes, "Okay, here's where I am today. What you just painted for me is absolutely what I need to focus on and address. Now help me get from A to B." >> So I'm really interested in this discussion, because it sounds like you're a very disciplined sales leader. You said you use the ladder with virtually every client, and I presume your sales teams use the ladder, so you train your salespeople on how to hold conversations around it. The other observation I'd love your thoughts on is that every step of the ladder has these questions. You're asking customers questions, and I'm sure that catalyzes conversation, and presumably you have solutions for many of the answers. I wonder if you could talk about that. >> Well, let me tell you about the ladder and how we're using it with our sales force, because it was a unifying approach, not just within our own data and AI team but outside of it too. Not only did we explain it to clients this way, but also to the rest of IBM, our business partners, our whole ecosystem.
So it was unifying in that we started every single conversation with it. Our sales enablement, how our sellers talk to their clients, our materials, our use cases, our references, our marketing campaigns: we tied everything to this unified approach, and it's made a huge difference in how we communicate our value to clients and explain this journey to AI in comprehensible steps that everyone can understand and relate to. >> Love it. How is the portfolio evolving to map into that framework, and what can we expect going forward? What can you share with us, at least? >> Well, the other amazing feat, I'll call it, that we've produced around this is that I'll talk to a client, describe these capabilities, and then say: you don't have to do every one of the things I've just described; you can implement what you need, when you need it. Because we've built all of this into a unified platform called Cloud Pak for Data. It's a modern data platform, built on an open infrastructure, Red Hat OpenShift, so you can run it on your own premises as a private cloud or on public clouds, whether that's IBM, Amazon, or Azure. It gives you a platform built on this open, modern infrastructure, with access to all the capabilities I've just described as services, and you decide, completely in the open, which services you deploy and when, growing the platform as you need it. And by the way, if you don't have a Red Hat OpenShift environment set up, we'll package it in a system, roll the system in to you, and give you access to the capabilities in hours. >> How's the Red Hat conversation going? I would imagine a lot of the traditional IBM customers are stoked: IBM just picked up Red Hat, a very innovative company with an open-source mindset. At the same time, I would imagine a lot of Red Hat customers are saying, is IBM really going to let them keep their culture?
How's that conversation going in the field? >> Well, I will tell you, we've been a hundred percent consistent with everything you've heard Ginni and Arvind Krishna say: we are going to maintain their culture and keep them as a separate entity inside of IBM. That has absolutely permeated the entire IBM company. We have a lot to learn from them, as I'm sure they have things to learn from us, but it truly is operating as a real win-win, and I see it in the clients I'm working with. >> If you had to pick one thing from this event that you want customers to remember, what would it be? >> Start now. Because if you don't begin this journey to AI, you will find yourselves fighting new competitors and rising costs; you have to improve productivity. Every client is embarking on this journey to AI. Start now. >> And when you were talking about the maturity model, one of those levels was folks that had already started and wanted to get to the next level. When you go into those clients, do you discern a different sort of attitude? They've started, they're down the path; do they have more of a spring in their step? Are they chomping at the bit to go faster and extend their lead relative to the competition? What's the dynamic like in those accounts? >> That's a great question, because I was with a client this afternoon, a large manufacturer of goods, and they're at exactly that turning point. They did phase one: they implemented Cloud Pak for Data, and they did it just to join up some of their disparate systems. Now, and I barely got a word in because he was so excited, he's saying, "What I'm going to do next is figure out where my factories should go based on where my products are selling."
So he's now looking at how he can change his whole distribution process as a result of getting access to data and analytics he never had before. And I was like, "Okay, well, just tell me how I can help you." And he was like, "No, I'm way ahead of you." >> So this was the big kickoff day; I know yesterday there was some deep-learning hands-on material, plus the big keynotes. We're only here for one day. What are we going to miss? What's happening tomorrow? >> Well, it's a bit of a repeat of today. We'll have another keynote tomorrow from Beth Smith, who runs our Watson business for IBM. We'll have more hands-on labs, and we have a lot of customer presentations where they share their best practices. Lots of fun. >> Where do you want to see this event go? What's next in IBM event land? >> Well, the feedback from last year and this year says we have to do this again next year. It will be bigger; I think this year proves that, since attendance has already doubled, and we'll probably see a similar dynamic. So I fully expect us to be back. Well, maybe not here; we're sort of outgrowing this hotel. But we'll be doing this event again next year. >> AI, machine learning, automation, and I'll throw in cloud: these are the hottest topics going. Alyse, thanks very much for coming on theCUBE. It was great to have you. >> It's great meeting with you. >> And thank you for watching, everybody. That's a wrap from Miami. Go to siliconangle.com to check out all the news; theCUBE.net is where you'll find all these videos; and follow the Twitter handles @theCUBE and @theCUBE365. I'm Dave Vellante. We're out. We'll see you next time.
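The AI ladder described in the interview above (collect, organize, analyze, infuse) can be read as stages of a simple data pipeline. A minimal Python sketch of that reading follows; every function and field name here is purely illustrative, and none of it is an IBM API.

```python
# A toy illustration of the "AI ladder" stages as a data pipeline.
# All names are invented for this sketch; this is not an IBM product API.

def collect(sources):
    """Collect: pull raw records from multiple systems into one place."""
    return [record for source in sources for record in source]

def organize(records):
    """Organize: govern the data; drop records with no known owner
    and tag the rest as governed."""
    return [{**r, "governed": True} for r in records if r.get("owner") is not None]

def analyze(records):
    """Analyze: derive a simple insight (average value per source system)."""
    totals, counts = {}, {}
    for r in records:
        totals[r["system"]] = totals.get(r["system"], 0) + r["value"]
        counts[r["system"]] = counts.get(r["system"], 0) + 1
    return {system: totals[system] / counts[system] for system in totals}

def infuse(insights, threshold):
    """Infuse: operationalize the insight as a decision in a workflow."""
    return {system: ("invest" if avg >= threshold else "review")
            for system, avg in insights.items()}

crm = [{"system": "crm", "owner": "sales", "value": 120}]
erp = [{"system": "erp", "owner": "finance", "value": 80},
       {"system": "erp", "owner": None, "value": 999}]  # ungoverned record

decisions = infuse(analyze(organize(collect([crm, erp]))), threshold=100)
print(decisions)  # {'crm': 'invest', 'erp': 'review'}
```

The point of the sketch is the ordering: governance (organize) sits between raw collection and any analysis, which is exactly the "boring stuff" the ladder front-loads.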
Mike Gilfix, IBM | IBM Data and AI Forum
>> Live from Miami, Florida. It's theCUBE, covering IBM's Data and AI Forum, brought to you by IBM. >> Welcome back to Miami, everybody. This is theCUBE, the leader in live tech coverage, and we're covering the IBM Data and AI Forum. Mike Gilfix is here. He's the Vice President of Digital Business Automation at IBM. Mike, good to see you again. >> Good to see you. >> So here's a question: what's the difference between a business and a digital business? >> A digital business is one that gets digital, software-like scale. A traditional business is very manual, very rote. If you want software-like scale, you need to digitize. >> Okay, that's important. So, follow-up question. I think Benioff said every company is a software company, or a SaaS company. Does every company have to be a digital business, or else they're toast? >> I think it's a competitive pressure. Every business today is looking for more and more leverage to stay ahead of its competition, and they're looking to technology to get it. That's where we come in, because we bring them automation technology they can apply to their business operations to help them get that scale. >> You've got some hard news. Let's get right into it. What did you announce? >> Sure. We're announcing a new critical capability as part of our Cloud Pak for Automation: IBM Automation Digital Workers. The idea is that you can leverage a digital workforce. You can manage them like people, they can work alongside your people, and they can help free your people up to be that much more productive: people can spend their time on creative things and get assistance where they need it, all integrated as part of this digital workforce. >> I've got a lot of questions. So, what's a digital worker? >> Well, it kind of works just like a person does.
It can do the critical tasks people need to do, like sifting through documents to find out what to take action on, helping with decision-making processes, and figuring out when to act and how to prioritize work. And it integrates into people's workflows, so they can offload mundane tasks, or even more complex tasks where it works alongside them and helps them be more productive. >> It sounds a little bit like a software robot. Is it? >> It is a form of software robot, but we've really approached the problem from the human aspect. We've looked at the set of things where people spend their time doing work they're really not good at. For example, and you can probably see this in your own job, we spend tons of time sifting through emails and business documents that we ought to turn into action. It's boring, it's tedious, and we're frankly overwhelmed by it. A digital worker can go through those documents, figure out what to do, and then take action. Simple example: say someone's doing contract analysis. Think about all the time spent going through a contract to figure out what's in it and whether it's a valid contract, and then determining who should get involved when there's an issue, so you can bring the right person to the right job. >> So is this a pre-integrated package, or do I have to roll my own? How do I consume it? >> Well, it comes as part of our Cloud Pak, with a set of tools you can adapt to your job roles. You can describe, for example, what your compliance officer does, the set of tasks they perform in a day, such as checking those contracts. Then you can use that for automation and augmentation, where it integrates into the person's workflow, and you can manage the digital workers just like people.
It'll tell you what work they did. And very importantly, we have an element of business controls, so that you can trust the work that gets turned over, and it determines when you have to stage an intervention and get a human involved to complete some form of task. >> So it still sounds a little bit like RPA, but maybe more focused, more specific to certain use cases or tasks. >> Well, if you look at where RPA is making strides today, it's in data entry and the automation of data input, a lot of back-office work. What it doesn't do really well is, for example, complex decision-making. Consider that compliance officer: checking whether something is compliant requires more than simple decision-making. RPA also isn't excelling today at dealing with unstructured data or integrating directly into workflows. We've approached this problem from the perspective of the job role: tell me about the person, not the point task I want to automate. So it's something that can integrate with RPA and extend RPA, but it really lets you create a digital worker, as in hybrid workforce management. >> Okay, that's starting to make sense, because you're right: RPA basically takes a mundane, well-understood work process and automates it. Sometimes I call it paving the cow path. But to me, the future of RPA is crossing that chasm into the fuzzier areas you're describing: bending into workflows, maybe allowing humans to come into the equation, maybe calling other automations that can act on my behalf. >> That's where I think we partner with RPA vendors. We can supply the brain, if you will, that manages the digital worker, and we can seamlessly integrate it into business processes, many of which actually run on our technology.
And so the marriage of those things is effectively what we've heard clients want, but have struggled to achieve until today. >> It's interesting, because you're not trying to replicate RPA; there are enough vendors out there doing that. You're trying to add value to it, and I'm sure there are other areas where you can add value as well. Are you partnering specifically with RPA vendors? >> We do. We have close partnerships with RPA vendors. One we've worked very closely with is Automation Anywhere, but we interoperate and work with all the top players. >> Okay. So when you think about digital workers, what's the critical issue for customers in enabling, or building, them? >> Well, first, a few things. There's a question of trust. If I'm going to turn work over to this digital worker, how do I know that whatever it comes up with, it's not going to sell, say, inappropriate goods to minors, simply because it hasn't been taught those things? So we put business controls in place that you can specify in natural language, so you understand exactly what your digital worker does, and it knows when to get a human involved. The second component is that people today want these workers integrated into their workflow. They want to know that the right person gets involved at the right time to take action, and we can integrate that seamlessly into the workflow. That way it's not an isolated thing that just runs as automation; it's truly a synergistic collaboration between humans and the digital workers. >> Great. So what is the Cloud Pak for Automation? We've been talking about the Cloud Pak for Data; what does the Cloud Pak for Automation do? >> It's a set of technologies that digitize what you do in a line of business.
All the technologies in it have a direct analogy to what people do in their workplace. It digitizes your workflow, meaning it coordinates the activities. It digitizes the business data and documents around it, including who can see them and what their lifecycle is, and enables collaboration around those documents. It digitizes decision-making and the processing of unstructured data. So really, if you go to someone who works in a line of business, say supplier onboarding, and ask them what they do, they'll probably describe their day in those kinds of elements, and we can digitize that, run it, manage it, and then give you visibility into it. >> How do you go from what's in the domain expert's head to codifying it? Is there a methodology or process? Is it services? Do you have tooling for that? >> Yeah. One of the key ideas behind the technology is that it's low-code, or model-driven. What the thing does is what you see, and that's really important, because you can explain to a non-technical user essentially what the system is doing, so they can check it with you along the way. And we have a methodology we call playbacks. The idea is that as you elicit requirements from your business user, you put them into the technology. At any given point you can click play and step through your solution; your business user can watch it, even if it's incomplete, and say, "Oh yeah, that's what I had in mind," or, "That isn't what I had in mind." It's a very powerful technique for interactive development between business and IT. >> So it's an iterative process, where you capture the user's activity and then play it back to the user, and they say, "Oh, that's close, but make this alteration."
>> And once you've digitized your operations on it, the automation play is that you can integrate things like digital workers, or we can actually use the data from your operations to find ways to scale your workforce. >> Well, IBM is obviously, in addition to a technology company, a world-class services organization, one of the largest and most capable SIs in the world, with global scale and a lot of domain expertise. Pick an industry: health care, manufacturing, financial services, you name it, IBM's got domain expertise there. Are you able to tap that deep domain expertise to drive your business? >> Sure. First, I'm in the software part of IBM, so I support a broad ecosystem, which is inclusive of partners. Specific to IBM Global Services, we actually have an IBM automation practice with expertise in how to apply automation technology to business operations. >> Okay, I love that answer, because basically, I'll translate: you said, "I'm an arms dealer; I'll sell software to my colleagues within IBM, but I love all my partners just as well." A little more benign than that, but nonetheless, the point is that your goal is to scale your software across as many clients as possible, and if they want to use a competitor of IBM Global Services, that's fine with you; obviously they can. So what does that ecosystem look like? As a software company, you've got to develop that ecosystem. >> Yeah, we have a massive business partner ecosystem: everything from the larger firms, of course, as you mentioned, to lots of regional ones. We have a lot of partners that have created vertical solutions around our technology. In fact, that's one of the key ways we go to market: they've developed something specific to, say, accounts payable, or loan processing, or health care claims, and so on.
And so that allows us to extend our reach into niches, and it gives them an opportunity to add value. >> Okay. So I'm going to ask your thoughts on automation in general. We've seen a lot of tech spending, but we haven't seen a productivity boost as a result in the last several years, except, I guess, for the first quarter of 2019, the latest data, where we saw a big uptick. A lot of people think we're on the cusp of a productivity boom. That obviously is your business. What's your point of view? >> Well, we think of our mission in life as bringing digital scale to knowledge workers, and let me explain why that's so important. This industry has been talking about digital transformation for a really long time, and we've actually been quite successful in digitizing. We're not done, but we've digitized a huge portion of our businesses. The side effect of digitization, though, is that we've generated all of this work. You expect that digital business to serve twice as many customers, or be that much more responsive, and people can't keep up. What it's done is fuel this growth of knowledge work: today we're not doing manual things the way they were done before they were digitized; instead, we're doing them in software. So how do we help people keep pace? That's the goal of automation technology. There's this explosion of knowledge work, and organizations far and wide are figuring out how to get productivity in this new era. That's where we come in; we can help them get that productivity. And we really are on the cusp of people using those techniques to reach that next level of productivity.
Um, people don't like to talk about it in the technology community because the sellers especially cause you know, but I think it clearly has to have an impact on jobs. Maybe people don't get fired, but you might hire less people. But that's not really the point I want to make and ask you about. It's the types of jobs that are going to become valuable. We'll shift to these higher value activities. If you're, you know, filling out a form that's going to be less valuable than some of these other more creative, more strategic types of things. What's your point of >>you on that? So, uh, first off, I don't think there's any human that can keep pace with the growth of knowledge work that's getting generated now. So they're going to need help. There's no lack of things to do. So that's kind of my, my, my first thought. I would say my, my second thought in that is, you know, what, if you could use your time differently, I would ask that question to anybody. If you could use your time differently, think about all the value you could go and create. But if you're spending time doing administrivia, is that really the best use of your time? It's clearly not. And so that's where this technology comes into play. The productivity gain is cause you're going to be able to do things that matter the most. Or unleash the creativity of your people. And my experience in working with organizations is exactly that. They leverage automation technology. Now they can do the missions they always wanted to do but never got to in their backlog. >>Yeah. So I guess that my, my take on that, I'd love, I'd love your thoughts. I mean take existing jobs and put a brick wall around them, those existing jobs or are going to change and I think he's going to have a, a negative, you're gonna have job loss, there's no question. And, but then the other jobs are going to be created. My rap on this, people who want to protect the, the past from the future is we basically have 0% unemployment right now. 
Even in a dramatic economic downturn, we have maybe 10% unemployment. So if you're in the 90% of people out there, you're going to be able to get a job. Nobody likes an economic downturn, but the point is that to be competitive as a nation, as a society, you've got to innovate, and automation is part of that innovation. >> Look, if you think about the jobs that people want to do, they're probably not the jobs that are going to be affected by this. That's what I mean by the evolution: people can now spend their time on those higher-value things. People don't want to do those sets of tasks. If you really ask them, think about what they put on a resume; no one puts "I'm a great data-entry expert" on their resume today. They want to talk about their time with clients, relationship management, making a difference for the business. That's the potential. >> Yeah, but there was a time people would put that on their resume: punch-card operator, right? And we're still thriving, we're still around. Thanks so much for coming on theCUBE. It was a great conversation. >> Thank you. Thanks for hosting me. A pleasure. >> All right, you're welcome. Keep it right there, buddy. We'll be back to wrap the IBM Data and AI Forum from Miami. You're watching theCUBE. We're right back.
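The business-controls idea Gilfix describes in this interview, a digital worker that handles routine cases on its own and hands work off to a human when a control fires or its confidence is low, can be sketched in a few lines of Python. This is a toy model under assumed names and thresholds, not IBM's implementation.

```python
# A toy sketch of a digital worker with business controls: it processes
# routine tasks automatically and escalates to a human when a control
# fails or its confidence is too low. All names and thresholds are
# illustrative only.

CONFIDENCE_THRESHOLD = 0.8

def violates_controls(task):
    """A stand-in for a natural-language business control,
    e.g. 'never approve a contract over $1M without human review'."""
    return task["amount"] > 1_000_000

def process(task):
    """Return ('done', result) or ('escalate', reason)."""
    if violates_controls(task):
        return ("escalate", "business control: amount requires human review")
    if task["confidence"] < CONFIDENCE_THRESHOLD:
        return ("escalate", "low confidence in extracted contract terms")
    return ("done", f"approved contract {task['id']}")

tasks = [
    {"id": "C-1", "amount": 50_000, "confidence": 0.95},
    {"id": "C-2", "amount": 2_000_000, "confidence": 0.99},
    {"id": "C-3", "amount": 10_000, "confidence": 0.40},
]

for task in tasks:
    status, detail = process(task)
    print(task["id"], status, "-", detail)
# C-1 done - approved contract C-1
# C-2 escalate - business control: amount requires human review
# C-3 escalate - low confidence in extracted contract terms
```

The two escalation paths mirror the interview: explicit business controls (reduced here to a single predicate) and a confidence gate for fuzzy inputs like contract text.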
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
IBM | ORGANIZATION | 0.99+ |
Mike Gilfix | PERSON | 0.99+ |
Mike | PERSON | 0.99+ |
90% | QUANTITY | 0.99+ |
SAS | ORGANIZATION | 0.99+ |
Benioff | PERSON | 0.99+ |
second thought | QUANTITY | 0.99+ |
first thought | QUANTITY | 0.99+ |
Miami | LOCATION | 0.99+ |
Miami, Florida | LOCATION | 0.99+ |
first quarter of 2019 | DATE | 0.99+ |
twice | QUANTITY | 0.99+ |
second component | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
both | QUANTITY | 0.98+ |
first | QUANTITY | 0.98+ |
one | QUANTITY | 0.98+ |
IBM Data | ORGANIZATION | 0.92+ |
0% unemployment | QUANTITY | 0.89+ |
RPA | TITLE | 0.86+ |
10% unemployment | QUANTITY | 0.81+ |
last several years | DATE | 0.76+ |
Forum | ORGANIZATION | 0.75+ |
IBM data and AI forum | ORGANIZATION | 0.62+ |
RPA | ORGANIZATION | 0.53+ |
and AI | EVENT | 0.4+ |
Matthias Funke, IBM | IBM Data and AI Forum
>> Narrator: Live from Miami, Florida, it's theCUBE, covering IBM's Data and AI Forum, brought to you by IBM.
>> We're back in Miami. You're watching theCUBE, the leader in live tech coverage, and we're covering the IBM Data and AI Forum in the port of Miami. Matthias Funke is here, he's the Director of Offering Management for hybrid data management. Everything data. Matthias, it's great to see you.
>> It's great to be here.
>> We're going to talk database, we're going to talk data warehouse, everything data. The database market 10, 12 years ago was kind of boring, right? And now it's like data's everywhere, databases are exploding. What's your point of view on what's going on in the marketplace?
>> You know, it's funny you call it boring, because I think it's the boring stuff that really matters nowadays in getting people to value with the solutions they want, or the modernization they're seeking to do on their data estates, and the challenge they have in embracing multi-cloud data architectures. To get there, you have to take care of the boring stuff.
>> How real is multi-cloud? I know multi-cloud is real in that everybody has multiple clouds. But is multi-cloud a strategy, or is it a symptom of multi-vendor, and we could have ended up here with shadow IT and everything else?
>> I think it's a reality, and yes, it should be a strategy. But more often than not, clients find themselves exposed to it as a reality, with different lines of business acquiring data estates running in different locations, on different clouds. Then companies have a challenge if they want to bring it all together and actually unlock the value of that data and make it available for analytics or AI solutions. You've got to have a strategy.
>> So IBM is one of the few companies that has both a cloud and an aggressive multi-cloud strategy. Amazon's got Outposts, a little bit here, Microsoft I guess has some stuff, and Oracle has got a little bit here, but generally speaking, IBM has both. You'd love people to come into your cloud, but you recognize not everybody's going to come into your cloud, so you have an aggressive multi-cloud strategy. Why is that? What's the underpinning of that strategy? Is it openness? Is it just market, total available market? Why?
>> First of all, yes, we have a strong portfolio on IBM Cloud, and we think it's the best in terms of integration with other cloud services and the performance you get on the different data services. But we also have a strategy that says we want to be where our clients want to go. Many clients might have committed already, on a strategic level, to a different cloud, whether that's AWS, IBM Cloud, or Azure, and we want to be where clients want to go. Our commitment is to offer them a complete portfolio of data services that support different workloads, complete in terms of both IBM technologies and open source technologies, giving clients choice, and then make those available across that universe of multi-cloud, hybrid cloud, and on-premises in a way that they get a consistent experience. You're familiar with the term divide and conquer, right? I like to talk about it as unify to conquer. Our mission is really a unified experience, and unified access to different capabilities, across multi-cloud architectures.
>> So is that really the brand promise, got to unify across clouds?
>> Absolutely. That's our mission.
>> And what's the experience like today, and what is sort of the optimal outcome that you guys are looking for? Being able to run any database on any cloud anywhere? Describe that.
>> So I think you'd be talking about chapter one and chapter two of the cloud. Chapter one, in my view, was very much about attracting people to the cloud by offering them a set of managed services that take away the management headaches, the infrastructure management aspects. But when you think about chapter two, about how to run mission-critical workloads on a cloud or on premises, you want the ability to integrate data estates that run in different environments, and we think that OpenShift is leveling the playing field by giving clients the ability to abstract away from proprietary cloud infrastructure services and mechanisms. That gives them freedom of action. They can deploy a certain workload in one place and then decide six months later that they are better off moving that workload somewhere else.
>> So OpenShift is the linchpin, that cross-cloud integration, is that right?
>> Correct. And with the rise of the operator, I think you see the industry closing the gap between the value proposition of a fully managed service and what a client-managed, OpenShift-based environment can deliver in terms of automation, simplicity, and value.
>> Let's talk about the database market, what's happening. You've got transactional databases, you've got analytic databases, you've got legacy data warehouses, you've got new, emerging databases that are handling unstructured data. You've got NoSQL, not-only-SQL. Lay out the landscape, and what's IBM's strategy in the database space?
>> So, starting with the Db2 family: about two years ago we introduced the common SQL engine. That gives you a consistent experience, from an application and user perspective, in the way you consume data for different workload types, whether that's transactional data, analytical use cases, big data, or fast data solutions, across different data architectures, with a consistent experience from a management perspective and in the way you interact with it as an application. And not only that, we also make it available on premises or in the cloud, fully managed or, now, OpenShift-based on any cloud. Our commitment right now is very much focused on leveraging OpenShift, and leveraging Cloud Pak for Data as a platform, to deliver all these capabilities, Db2 and open source, with a unified and consistent experience on anybody's cloud, much as we did when we announced that six months ago. Now we're doing the same with data, making it easy for people to access data wherever it resides.
>> Matthias, what's IBM's point of view on the degree of integration that you have to have in that stack, from hardware and software? Some people would argue you have to have the same control plane, same data plane, same hardware, same software, same database on-prem as you have in the cloud. What are your thoughts on the degree of homogeneity that's required to succeed?
>> I think it's certainly something companies strive for, to simplify their data architectures, to unify, consolidate, and reduce the number of data sources they have to deal with. But the reality is that the average enterprise client has 168 different data services they have to incorporate. So to me it's a moving target, and while you want to consolidate, you will never fully get there. Our approach is to give clients choice, the best choice in terms of technologies for the same workload type, whether it's Postgres for transactional workloads or Db2 for transactional workloads, whatever fits the bill, and then at the same time abstract, or unify, on top of that. When you think about operators and OpenShift, for instance, we invest in operators leveraging a consistent framework that provides a homogeneous set of interfaces by which people can deploy and lifecycle-manage a Postgres instance or a Db2 instance. So you need only one skill set to manage all these different data services, and that reduces total cost of ownership, provides more agility, and accelerates time to value for the client.
>> So you're saying that IBM's strategy recognizes the heterogeneity within the client base. Even though you might have a box somewhere in the portfolio, you're not taking a "you need this box only" strategy. The God box. This is the hammer, and every opportunity is a nail.
>> Yeah, we are way beyond that. We are much more open in the way we embrace open source, we bring open source technologies to our enterprise clients, and we invest in the integration of these different technologies so their value can be realized in a much more straightforward fashion. Think about Cloud Pak for Data and the ability to access data and insights in different repositories, IBM or third party, and then make that data accessible through data virtualization, with full governance applied to the data, so that a data scientist can actually get at that data for their work. That is really important.
>> Can you argue that that's less lock-in than, say, the God box approach, or the cloud-only approach?
>> Yeah, absolutely.
>> How so?
>> Because we give you choice to begin with, and it's not only choice in terms of the data services and the different technologies that are available, but also in terms of the location where you deploy these data services and how you run them.
>> Okay. So to me it's all about exit strategies. If I go down a path and the path doesn't work for me, how do I get out?
>> Exactly.
>> Is that a concern of customers, in terms of risk management?
>> Yeah. Look, every customer out there, I daresay, has a data strategy, and every customer needs to make some decisions. But there's only so much information you have today to make that decision, and as you learn more, your decision might change six months down the road. How to preserve that agility as a business, to do course corrections, I think is really important.
>> So, okay, a hypothetical, and this happens every day. You've got a big portfolio company, they've done a lot of M&A, they've got 10 different databases that they're running, they've got different clouds that they're using, they've got different development teams using different tooling, certainly different physical infrastructure, and they really haven't had a strategy to bring it all together. You're hired as the data architect or the CTO of the company, and the CEO says, "Fix this problem. We're not taking advantage of and leveraging our data." Where do you start?
>> So of course, being IBM, I would recommend starting with Cloud Pak for Data as the number one data platform out there, because eventually every company will want to capitalize on the value that the data represents. It's not just about a data layer, it's not just about a database, it's about an integrated solution stack that gets people to do analytics over the data and derive insights from the data. That's number one. But even if it's not the IBM stack, I would always recommend that the client think about a strategy that allows for the flexibility to change course, to move workloads from one location to another, or move data from one technology stack to another. That kind of agility and flexibility translates into the risk mitigation strategies that every client should think about.
>> So Cloud Pak for Data, okay, let's start there. I'm going to install that, or I'm going to access that in the cloud. And then what do I have to do as a customer to take advantage of it? Do I just have to point it at my data stores? What are the prerequisites?
>> So let's say you deploy it on IBM Cloud. You usually are invested already, so you have large data estates residing on premises or already in the cloud. You can pull those datasets in remotely, without really moving the workload or the datasets into the Cloud Pak for Data managed environment, by using technologies like data virtualization, or by using technologies like DataStage and its ETL capabilities to access the data. But you can also, as you modernize and build your next-generation applications, do that within that managed environment with OpenShift. And that's what most people want to do: a digital transformation. They want to modernize their workloads, but they want to leverage the existing investments they have been making over the last decade.
>> Okay. So, but there's a discovery phase, right, where you bring in Cloud Pak for Data to say, okay, what do I have? Yep, go find it. And then it's bringing in the necessary tooling on the development side with things like OpenShift, and then it magically virtualizes my data, is that it?
>> Just on that point, what matters much more going forward for these clients is how they can incorporate different datasets, whether they sit in open source technologies, or Db2, or a third-party vendor's stack I don't want to mention right now. What matters more is: how do I make data accessible? How do I discover a dataset in a way that automatically generates metadata, so I have a business glossary, I have metadata, and I understand the various datasets? That's the vision, the business and technology objective. Watson Knowledge Catalog, which is part of Cloud Pak for Data, is a core component that helps you with that: auto-discovery and metadata generation, basically adding datasets in a way that makes them visible to the data scientists and the ultimate end users. What really matters, and I think what our overall vision is, is the ability to serve the ultimate end user, whether an app developer, a data scientist, or a business analyst, so that they can get their job done without depending on IT.
>> Yeah, so that metadata catalog is part of the secret sauce. That's what allows the system to know what data lives where, how to get to it, and how to join it.
>> It's one of the core elements of that integrated platform and solution stack. What I think is really key here is the effort we spend in integrating these different components so that it looks seamless, it happens in an automated fashion as much as possible, and it delivers on that promise of a self-service experience for the person who sits at the very end of that chain.
>> Right. Matthias, thanks so much for explaining that, and for coming on theCUBE. Great to meet you.
>> All right. Keep it right there, everybody. We'll be back with our next guest right after this short break. You're watching theCUBE from the IBM Data and AI Forum in Miami. We'll be right back.
Daniel G Hernandez & Scott Buckles, IBM | IBM Data and AI Forum
>> Narrator: Live from Miami, Florida, it's theCUBE, covering IBM's Data and AI Forum, brought to you by IBM.
>> Welcome back to Miami, everybody. You're watching theCUBE, the leader in live tech coverage. We're here covering the IBM Data and AI Forum. Scott Buckles is here to my right. He's a business unit executive at IBM. And longtime Cube alum Daniel Hernandez is the Vice President of the Data and AI group. Good to see you guys, thanks for coming on.
>> Thanks for having us.
>> Good to see you.
>> You're very welcome. We're going to talk about DataOps, kind of accelerating the journey to AI around DataOps. But what is DataOps, and how does it fit into AI? Daniel, we'll start with you.
>> There's no AI without data. You've got data science to help you build AI. You've got DevOps to help you build apps. You've got nothing to help you prepare data for AI. DataOps is the equivalent of DevOps, but for delivering AI-ready data.
>> So how are you, Scott, dealing with this topic with customers? Is it resonating? Are they leaning into it, or are they saying, "What?"
>> No, it's absolutely resonating. We have a lot of customers that are doing a lot of good things on the data science side. But trying to get the right data to the right people, and do it fast, is a huge problem. They're finding they're spending too much time prepping data and getting the data into the models, and they're not spending enough time failing fast with some of those models, or getting the models they need into production fast enough. So this absolutely resonates with them, because I think it's been confusing for a long time.
>> So, AI's scary to a lot of people, right? It's a complicated situation. How do you make it less scary?
>> Talk about problems that can be solved with it, basically. You want a better customer experience in your contact center, you want a similarly amazing experience when they're interacting with you on the web. How do you do that? AI is simply a way to get it done, and a way to get it done exceptionally well. So that's how I like to talk about it. I don't start with "here's AI, tell me what problems you can solve." Here are the problems you've got, and where appropriate, here's where AI can help.
>> So what are some of your favorite problems that you guys are solving with customers?
>> Customer and employee care. Basically, any business that does business has customers, and customer and employee care are a huge problem space. Catching bad people, financial crimes investigation, is a huge one. Fraud, KYC and AML, as examples.
>> National security, things like that, right?
>> Yeah.
>> You spend all your time with customers, what else?
>> Well, customer experience is probably the one we're seeing the most. The other is being more efficient: helping businesses solve those problems quicker, faster, trying to find new avenues for revenue, and cutting costs out of their organization, out of their run time. Those are the ones we see the most.
>> So when you say customer experience, chatbots immediately jump into my head. But I know this transcends chatbots. Double-click on customer experience: how are people applying machine intelligence to improve customer experience?
>> Well, when I think of it, I think about calling in to Delta, or whatever your airline may be, and having one bad experience. That customer experience could lead to losing that customer forever. There's an old adage that if you have one bad experience you tell 10 people about it, and if you have a good one you tell one person, or two people. So getting the right data to deliver that experience is where it becomes a challenge, and we've seen instances where organizations are literally trying to find the data on the screen while the customer is on hold. So they're saying, "Can I put you on hold?"
...and they're trying to go out and find it. So being able to automate finding that data, getting it into the right hands, to the right people, at the right time, at a moment's notice, is a great opportunity for AI and machine learning, and that's an example of how we do it.
>> So from a technical standpoint, Daniel, you guys have this IBM Cloud Pak for Data that does this magic data virtualization thing. Let's take the example Scott just gave us and think of an airline. I love my mobile app, and I can do almost everything on it, but there are certain things I can't do, so I have to go to the website. There are certain e-commerce things I can't do in the app, so I have to go to the website. Sometimes I can't order a movie from the app; I have to go to the website and put it on my watch list. So I presume there's some technical debt in each of those platforms, and there's no way to get the data in one place talking to the data in the other. Is that the kind of problem that you're solving?
>> Yes, and in this particular case you're touching on what we mean by customer and employee care everywhere. The interaction you have on your phone should be the same as the interaction, and the kind of response, on the web, which should be the same, if not better, when you're talking to a human being. How do you deliver that exceptional customer and employee care across all channels? The state of the art today is a specific experience for my phone, a specific experience for my website, and a specific, different experience in my contact center. The whole work we're doing around Watson Assistant is for it, as a virtual assistant, to be the nervous system that underpins all channels, and with Cloud Pak for Data, we can deliver it anywhere. You want to run your contact center on IBM Cloud? Great. You want to run it on Amazon, Azure, Google, your own private data center, or anything in between? Great. Cloud Pak for Data is how you get Watson Assistant, the rest of Watson, and our data stack anywhere you want, so you can deliver that same consistent, amazing experience, all channels, anywhere.
>> And I know the tone of my question was somewhat negative, but I'm actually optimistic, and there's a couple of examples I'll give. I remember Bill Belichick saying one time, maybe five or six years ago, "Agh, the weather, they can't ever get the weather right." Actually, they do pretty well with the weather now compared to 10 or 15 years ago. The other is fraud detection. In the last 10 years, fraud detection has become so much better, in terms of just the time it takes to identify fraud and the number of false positives. Even in the last, I'd say, 12 to 18 months, false positives are way down. I think that's machine intelligence, right?
>> If you're using business rules, they're not way down. They're still way up. If you're using more sophisticated techniques that depend upon the operational data to be trained, then they should be way down. But there are still a lot of these systems based on old-school business rules that can't keep up. They're producing alerts that, in many cases, are ignored, and because they're ignored, you're susceptible to bad issues. With AI-based techniques for fraud detection especially, you'd better have good data to train this stuff, which gets back to the whole DataOps thing: training those models with good data, which DataOps can help you get done.
>> And a key part of DataOps is the people and the process. It's not just about automating things and getting the data into the right place. You have to modernize those business processes and have the right skills to be able to do that as well. Otherwise, you're not going to make the progress. You're not going to reap the benefits.
>> Well, that was actually my next question. What about the people and the process? We were talking before, off camera, about RPA and this notion of "paving the cow path." Sometimes you actually have to re-engineer the process, and you might not have the skill set. So it's people and process, and then you layer in the technology. We've always talked about this: technology is always going to change, and smart technologists will figure it out. But the people and the process, that's the hardest part. What are you seeing in the field?
>> We see a lot of customers struggling with the people and process side, for a variety of reasons. The technology seems to be the focus, but when we talk to customers, we spend a lot of time asking, "Well, what needs to change in your business process when this happens? How do those business rules need to change so you don't get those false positives?" Because that's what matters at the end of the day.
>> So, can we go back to the business rules thing? It sounds like business rules are a sort of outdated, policy-based, rigid structure that's enforced no matter what, versus machine intelligence, which can interpret situations on the fly. Can you add some color to that and explain the difference between what you call business-rules-based versus AI-based?
>> So the AI-based ones, in this particular case, use classic statistical machine learning techniques to do something like know who I am. My name is Daniel Hernandez. If you were to Google "Daniel Hernandez," the number one search result is going to be a rapper. There's a rapper who actually just recently came out; he's not even that good, but he's a new one. A statistical machine learning technique would be able to say: all right, given Daniel and the contextual information I know about him, when I look for Daniel Hernandez and I supplement the identity with that contextual information, it means it's one of the six that work at IBM. Right?
>> Not the rapper.
>> Not the rapper.
>> Not the rapper.
>> Exactly.
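To make that distinction concrete, the contextual matching Daniel describes can be sketched in a few lines. This is a hypothetical illustration, not IBM's implementation: all names and features here are made up, and a production system would learn feature weights statistically from operational data rather than counting exact matches.

```python
# Hypothetical sketch of context-based identity resolution -- the people,
# features, and scoring are illustrative only, not IBM's actual method.

def score(candidate: dict, context: dict) -> float:
    """Count how many contextual features a candidate shares with the query."""
    return sum(1.0 for key, value in context.items()
               if candidate.get(key) == value)

def resolve(name: str, candidates: list, context: dict) -> dict:
    """Return the same-named candidate whose features best match the context."""
    matches = [c for c in candidates if c["name"] == name]
    return max(matches, key=lambda c: score(c, context))

people = [
    {"name": "Daniel Hernandez", "occupation": "rapper", "employer": None},
    {"name": "Daniel Hernandez", "occupation": "executive", "employer": "IBM"},
]

# Supplementing the bare name with context picks out the right person.
best = resolve("Daniel Hernandez", people,
               context={"employer": "IBM", "occupation": "executive"})
```

The point of the sketch is the one made in the conversation: matching on the name string alone cannot separate the two Daniels, while even a crude contextual score can, and the quality of that context data is exactly what DataOps is meant to guarantee.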
>> I don't mind being matched with a rapper, but match me with a good rapper.
>> All you've got to do is search "Daniel Hernandez" and "theCUBE" and you'll find him.
>> Ha, right. Bingo. Actually, that's true. So, in any case, the AI-based techniques basically allow you to isolate who I am based on more features that you know about me, so that you get me right. Because if you can't even start there, with whom you are transacting, you're not going to have any hope of detecting fraud. Either that, or you're going to get false positives, because you're going to associate me with someone I'm not, and then it's just going to make me upset, because when you should be transacting with me, you're not, because you're saying I'm someone I'm not.
>> So that ties back to what we were saying before: know your customer, and anti-money laundering. Which, of course, was big, and still is, during the crypto craze. Maybe crypto is not as crazy now, but that was a big deal when you had bitcoin at whatever it was. What are some practical applications for KYC and AML that you're seeing in the field today?
>> What we're applying a lot of in my business is automating the discovery of data and learning about the lineage of that data: where did it come from? This was a problem that was really hard to solve 18 months ago, because it took a lot of manpower to do it, and as soon as you did it once, it was outdated. So we've recently released some capabilities within Watson Knowledge Catalog that really help automate that, so that as the data continues to grow and change, as it always does, rather than having two or three hundred business analysts or data stewards trying to figure that out, machine learning can do it for you.
>> So all the big banks are glomming on to this?
>> Absolutely.
>> So think about any customer onboarding, right? You'd better know who your customer is, and you'd better have provisions around anti-money laundering. Otherwise, there's going to be some very serious downside risk. It's just one example of many, for sure.
>> Let's talk about some of the data challenges, because we've talked a lot about digital business. I've always said the difference between a business and a digital business is how they use data. So what are some of the challenging issues that customers are facing, particularly incumbents? Ginni Rometty used the term a couple of events ago, and it might have even been at World of Watson, "incumbent disruptors," maybe that was the first time, which I thought was a very poignant term. So, what are some of the data challenges that these incumbents are facing, and how is IBM helping solve them?
>> For us, one that we see is just understanding where their data is. There is a lot of dark data out there that they haven't discovered yet. What impact is that having on their analytics, what opportunities aren't they taking advantage of, and what risks are they being exposed to by that being out there? Unstructured data is another big part of it as well. Structured data is sort of the easy answer to solving the data problem...
>> [Daniel Hernandez] But still hard.
>> But still hard. Unstructured data is something that almost feels like an afterthought a lot of times, but the opportunities and risks there are equal, if not greater, to your business.
>> So yeah, you're saying it's an afterthought because a lot of times people are saying, "That's too hard."
>> Scott Buckles: Right.
>> Forget it.
>> Scott Buckles: Right. Right. Absolutely.
>> Because there's gold in them there hills, right?
>> Scott Buckles: Yeah, absolutely.
>> So how does IBM help solve that problem? Is it tooling, is it discovery tooling?
>> Well, yeah, we recently released a product called InstaScan that helps you go discover unstructured data within any cloud environment.
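As a rough illustration of what discovering dark, unstructured data involves, here is a toy scanner. This is a hedged sketch only: it is not InstaScan's actual interface, and the two regular-expression patterns are deliberately simplistic stand-ins for real sensitive-data classifiers.

```python
# Toy unstructured-data scanner: flag text files containing patterns that
# resemble sensitive data. Illustrative only -- not the InstaScan API.
import re
from pathlib import Path

PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def scan_text(text: str) -> set:
    """Return the names of the sensitive-data patterns found in the text."""
    return {name for name, rx in PATTERNS.items() if rx.search(text)}

def scan_tree(root: str) -> dict:
    """Map each text file under root to the pattern names it matched."""
    findings = {}
    for path in Path(root).rglob("*.txt"):
        hits = scan_text(path.read_text(errors="ignore"))
        if hits:
            findings[str(path)] = hits
    return findings
```

Real discovery tooling layers classification, lineage, and governance on top of raw matching, but the shape of the problem is the same: walk everything, and look for what you didn't know you had.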
So, that was released a couple months ago. That's a huge opportunity that we see, where customers can actually go and discover that dark data, discover those risks, and then combine that with some of the capabilities that we have with structured data too, so you have a holistic view of where your data is, and start tying that together. >> If I could add, any company that has any operating history is going to have a pretty complex data environment. Any company that wants to employ AI has a fundamental choice: either I bring my AI to the data, or I bring my data to the AI. Our competitors demand that you bring your data to the AI, which is expensive, hard, often impossible. So, if you have any desire to employ this stuff, you had better take the "I'm going to bring my AI to the data" approach, or be prepared to deal with a multi-year deployment for this stuff. So, that principal difference in how we think about the problem means that we can help our customers apply AI to problem sets that they otherwise couldn't, because they would have to move the data. And in many cases, they're just abandoning projects altogether because of that. >> So, now we're starting to get into data strategy. So, let's talk about data strategy. It starts with, I guess, understanding the value of your data. >> [Daniel Hernandez] Start with understanding what you've got. >> Yeah, what data do I have? What's the value of that data? How do I get to that data? You just mentioned you can't have a strategy that says, "okay, move all the data into some God box." >> Good luck. >> Yeah. That won't work. So, do customers have coherent data strategies? Are they formulating them? Where are we on that maturity curve? >> Absolutely. I think the advent of the CDO role, the Chief Data Officer role, has really helped bring the awareness that you have to have that enterprise data strategy. >> So, that's a sign, if there's a CDO in the house. >> There's someone working on it at the enterprise level, yeah, absolutely.
>> So, it's really their role, the CDO's role, to construct the data strategy. >> Absolutely. And one of the challenges that we see in that, because it is a new role, is, like going back to Daniel's historical operational stuff, right, there are a lot of things you have to sort out within your data strategy: who owns the data, regardless of where it sits within the enterprise, and how are you applying that strategy to those data assets across the business. And that's not an easy challenge. That goes back to the people and process side of it. >> Well, right. I bet you if I asked Jim Cavanaugh what's IBM's data strategy, I bet you he'd have a really coherent answer. But I bet you if I asked Scott Hebner, the CMO of the data and AI group, I'd get a somewhat different answer. And so, there are multiple data strategies, but I guess it's (mumbles) job to make sure that they are coherent and tie in, right? >> Absolutely. >> Am I getting this? >> Absolutely. >> Quick study. >> So, what's IBM's data strategy? (laughs) >> Data is good. >> Data is good. Bring AI to the data. >> Look, I mean, data and AI, that's the name of the business, that's the name of the portfolio that represents our philosophy. No AI without data, and increasingly, not a lot of value in data without AI. We have to help our customers understand this, that's a skill, education, point-of-view problem, and we have to deliver technology that actually works in the wild, in their environment, not as we want them to be, but as they are. Which is often messy. But I think that's our fun. It's the reason we've been here for a while. >> All right, I'll give you guys the last word, we've got to run, but both Scott and Daniel, takeaways from the event today, things that you're excited about, things that you learned. Just give us the bumper sticker. >> For me, you talk about whether people recognize the need for a data strategy in their role.
For me, it's people being pumped about that, being excited about it, recognizing it, and wanting to solve those problems and leverage the capabilities that are out there. >> We've seen a lot of that today. >> Absolutely. And we're at a great time and place where the capabilities and the technologies with machine learning and AI are applicable and real, where they're solving those problems. So, I think that gets everybody excited, which is cool. >> Bring it home, Daniel. >> Excitement, a ton of experimentation with AI, some real issues that are getting in the way of full-scale deployments, a methodology, DataOps, to deal with those real hardcore data problems in the enterprise resonating, and a technology stack, through Cloud Pak for Data, that allows companies to implement that no matter where they want to run, which is what they need, and I'm happy we're able to deliver it to them. >> Great. Great segment, guys. Thanks for coming. >> Awesome. Thank you. >> Data, applying AI to that data, scaling with the cloud, that's the innovation cocktail that we talk about all the time on The Cube. Scaling data your way, this is Dave Vellante and we're in Miami at the AI and Data Forum, brought to you by IBM. We'll be right back right after this short break. (upbeat music)
Scott Hebner, IBM Data & AI | IBM Data and AI Forum
>> Announcer: Live from Miami, Florida, it's theCUBE, covering IBM's Data and AI Forum, brought to you by IBM. >> Welcome back to Miami, Florida. Everybody, you're watching theCUBE, the leader in live tech coverage. We go out to the events and extract the signal from the noise. We're covering the IBM Data and AI Forum. Scott Hebner is here, the CMO, uh, sorry, VP and CMO of IBM Data and AI. Yeah, right, I know. So welcome, welcome back to theCUBE. >> Thanks, it's great to be here. >> Great event. Yeah, I've never attended one of these before. It's sort of an analytics university, 1,700 people, and everybody's like sponges trying to learn more and more and more. >> 60% higher attendance than last year. Awesome. A lot of interest. >> So if we go back a couple of years ago, when we talked about digital transformation, people rolled their eyes. They'd think it's a buzzword. When you talk to customers, though, they're really trying to transform their business, and data is at the center of that. So if you go back to, like, 2016, there was a lot of experimentation going on, kind of throw everything against the wall and see what sticks. It seems, Scott, based on the data that I see, that people are now narrowing their bets on things like AI, automation, machine learning, containers. What are you seeing from customers? >> I think you framed it well. I mean, if you think about it, this digital transformation's been going on for almost 20 years, with the advent of the Internet back around 2000, late nineties. Everybody started doing business transactions on the Internet, and slowly but surely, digital transformation was taking effect, right? And I think clients are now shifting to what we can call digital transformation 2.0. What do the next 20 years look like? And our viewpoint, from our clients, is, if you think about it, it's data that fuels digital transformation. Right? Without data, there is no digital transformation; there is no digital.
It's all data-driven, evidence-based decision making, using data to do things more efficiently and more effectively for your clients and your employees, and so on and so forth. But if you think about it, we've been using data as a way of looking at what has happened in the past or what is happening now, and clients, with digital transformation 2.0, want to shift to a world of predictive data. How do you predict and shape future outcomes, right? And if you think about it, it's AI that's gonna unlock predictive data. That's why we see such an intense focus on AI as really the linchpin of digital transformation 2.0. And of course, all that data needs to be virtualized. It has to sit in a hybrid cloud environment; 94% of clients have multiple clouds. So if AI unlocks the value of the data in predictive ways, the cloud, in a multi-cloud environment, is the platform that it's built upon. That's why you see this enormous shift to AI today in terms of investment priority, along with hybrid multi-cloud. >> So I like this point of view, this digital transformation 2.0, because what's the difference between a business and a digital business? It's how they use data. And IBM's mission, using your group, is to help people better take advantage of data to drive business outcomes. I mean, that's pretty clearly what you guys are doing. To me it's the innovation cocktail: data plus machine intelligence, or AI, and then you scale it with cloud. And so digital transformation 2.0 really involves this predictive component of the equation that you're bringing into it, doesn't it? >> Yeah. When I think of this next phase, there are several things our clients are trying to achieve. One is to predict and shape future outcomes, whether it be inventory, whether it be patient care, whatever it may be. Say, a customer service call.
You want to be able to predict what the call's gonna be about, what the client or the customer has gone through before, what the issue may be, right? So there's this notion of predicting and shaping the outcome. The second is empowering people to do higher-value work. How do you make them better at what they're doing? The superpower of being aided by a machine, or some kind of software, that's gonna help you be better at what you do. And of course, there's this whole notion of automating tasks that people don't want to do, automating experiences in intelligent ways. This all adds up to new business models, right? And that's where AI comes in. That's what AI does, and I do think it's the linchpin. What clients are looking to invest in is this notion that you need one unified platform to build upon for the future. That is, cloud services, data services, and AI services all as one thing. One cloud-native platform that runs on any cloud and completely opens up wherever all your data is. You run your apps wherever you want to run them, secure to the core, and that's what they're looking to invest in. >> So you guys use the sort of tagline, you can't have AI without IA, IA being information architecture. For years on theCUBE we've been talking about bringing the cloud model to your data, so you don't move data around. Now you're talking about bringing machine intelligence to your data, wherever your data lives. So talk about why that's important and what IBM is doing, both conceptually and from a product standpoint, to enable that. >> So the number one issue with AI, and actually the number one issue that sometimes results in failure with AI, is not understanding the data. Some 81% of clients do not understand the data that they're gonna need for their AI models. And if they do understand the data, they don't know how to make it simple and accessible, especially when it's ever-changing, and then they have all the issues of compliance and quality.
And is it a trusted set of data that you're using? And that's what you mentioned: there is no AI without an IA, which is information architecture. So it starts there. Then two, to your point, data is everywhere. There are thousands of sources of data, if not more than that. So how do you normalize all that? Virtualize it, right? And that's where you get into one platform, any cloud, so that you can access the data wherever it sits. Don't spend the money moving things around, with all the complexity of that. And then, finally, the third thing we're looking to do is use AI to build AI, use AI to actually manage the lifecycle of how you incorporate this into your business. And that's what this one platform is gonna do, versus enabling customers to piece together all this stuff. It's just too much. >> So this is what Cloud Pak for Data is and does? >> Yes. >> So when you say AI to build AI, are you talking about picking the functions and automating components, prioritizing how you apply those algorithms? Is that right? >> Yeah. So when we talk about this, there are three big things to really focus on. One is data, and that is this whole notion that you need an information architecture that's ready for an AI, multi-cloud world. It's all about the data in the end, right? Two is about talent, talent being skills. Are you able to acquire the skills you need? So we're trying to help our customers apply AI to actually generate and build AI, to optimize it, so they don't need, you know, as much skill to do it. In other words, democratize the ability to build AI models for your business. And then finally, the data is everywhere. You need to have a completely open environment. That's the run-on-any-cloud notion, and that's why Red Hat OpenShift is such a big component of this. So think of it as clients looking to climb the ladder to AI today.
Modernize their data estates, make the data simple and accessible, create a trusted data foundation, build and scale new models, and infuse AI throughout their business. Cloud Pak for Data is essentially the foundational platform that gives you that ladder to AI, and it is extensible with things that may be important to you, or certain areas of additional capabilities. So Cloud Pak for Data essentially is the platform that I'm referring to here when I say, you know, any cloud, right? >> So I feel like we're on the cusp of this enormous productivity boom. If you look at the data, and if you believe the Bureau of Labor Statistics, productivity in the first quarter went up, though over the long term, productivity numbers, right, you probably can't fully believe in them. I think for Q1 it was like 3%, which is a huge uptick. And I feel like it's much, much higher than the anemic, whatever it was, one and a half, 1.7%. All this AI, all this automation is gonna drive productivity. It's gonna have an impact on organizations. So what's your perspective, your point of view, on the pending productivity boom? Do you believe that premise? How are jobs going to be affected? What are clients seeing in terms of how they're retraining people? What should we expect? >> Yeah, I think AI's gonna give people superpowers. It's gonna make them better at what they do. It's gonna make you as a consumer better at how you choose what to buy. It's gonna make the automobile drive more efficiently, with more information that's relevant to you in the dashboard. It's gonna allow you to call for service on your cable company and have them already know your history, maybe already know why you're calling, and make it a more efficient call. It's gonna make everyone more productive. It's gonna result in higher-quality output, because you're able to predict things, right? You automate things in intelligent ways. So I don't see it as anything that replaces jobs.
It's just gonna make people better at what they do, allow them to focus on higher-value work, and be more efficient when they're making decisions, and that will result in higher productivity per worker, right? >> I mean, we've certainly heard examples today of customers that are doing that, basically, and it's not like they're firing people. They're basically taking away mundane tasks, or things that maybe humans would take so long to do, and then repointing that talent somewhere else. >> To higher value. >> So you're seeing that in your client base? >> Yeah, it's starting to hit today. >> It's gonna be interesting to see whether or not that affects jobs. I mean, we like to say that it won't, and ultimately I think it's gonna create more jobs. There may be some kind of dip where we've got to retrain people; maybe we have to change the way in which we do education, right? Beth Smith and I were talking: reading, writing, arithmetic, and coding. You know, maybe coding is one of the skills that we have to bring in. But ultimately I think it is a positive, and I'm sanguine; I'm an optimist. But you're seeing examples today of people refocusing their talent. What are they focusing that talent on? More strategic things, like what? >> Well, again, I think it's just getting people to be better at what they do by giving them that predictive power, superpowers, to do their job better. It's gonna make people better, not replace them. >> So consumers, we're probably gonna buy more. >> You're gonna buy more, and you're gonna buy the right things more. And the right things are gonna be there for you to buy, the right sales, because everything is gonna be able to better understand patterns of what happens and predict, right? And that's why you're seeing this enormous investment shift among technology companies. What was it? MIT Sloan and the Boston Consulting Group just came out with a study.
I think a couple weeks ago: 92% of companies are looking to expand their investments in AI. Gartner came out with a study of CIOs, and in their top investment areas, artificial intelligence was number one, and data and analytics was number two, which is the information architecture, right? One and two; it's the first time it's been like that. And I think it's for this reason of digital transformation, the predictive notion, the predictive enterprise, if you will, and just helping everyone be more efficient, more productive at what they do. That's really what it's about. It's not so much replacing people; they're thinking of robots and things like that, and that's a small part of what we're talking about. >> Well, even when you talk to people about software robots, they love them, because they don't have to do these mundane tasks, and it dramatically impacts the quality of what they're doing. Again, it frees them up to do other things. >> Good example: LegalMation is one of our clients that we've been working with, and they do case law for business clients. And sometimes it can take weeks, if not a month, to prepare case law documents. They're able to do that in hours now, because artificial intelligence in the background has done a lot of the case law work, finding the right data in the right case law, and helping to populate those documents, so they don't have to do all the research themselves. So what does that do for the lawyer, right? It makes them better at what they do. They can shift to higher-value work than just preparing the document. They can work on more cases, they can spend more time on the subtleties of the case. That's a good example of what we mean here. It's not replacing the lawyer. >> Well, I'm seeing a lot of examples like this in legal fields. Also auditing. I've talked to auditors and asked, do you think AI will be able to cut the auditing bill?
And the answer is, actually, no, because to the point you just made, they're shifting their activities to higher value. They might be charging more for activities that take less time. >> Customer service is another great example. There are so many examples of that. But it used to be, if you called, everyone was treated equal, right? You get onto a call, and sometimes it's very rudimentary things. There's gotta be a way to prioritize the most critical calls, knowing that there's something already wrong and you know why they're calling. And if you can shift your human agents to focus on those, and let AI help with the more rudimentary ones, you're making the clients happier, and those people are doing higher-value work. We could go on forever and ever on different examples across different industries and different businesses of how this is really helping people. And it all comes down to three big words, which is prediction, automation, and optimization. That's what AI is gonna do. And with digital transformation, it just shifts the whole notion of using data for evidence-based decision making, from what's happened in the past and what's happening now, to I'm gonna understand and shape the future. You can do so many things with that. >> It's amazing when you think about it. We've been at this computer industry 50, 60-plus years, and you'd think everything's automated. It's not even close. All this technology has actually created so much more data, so much unstructured data, actually so many more inefficient processes in a lot of ways, that now machine intelligence is beginning to attack in a big way. >> You won't find a survey of businesses where AI is not a top aspiration. The trick is how you turn the aspirations into outcomes, and that's what this ladder to AI is all about.
It's a very prescriptive approach that we've learned from our clients on how to take that journey to AI, and a lot of the things we talked about in this conversation are the real key linchpins, right? You gotta get the data right. You have to trust in the data that's going to be used, and you gotta get the talent and be able to simplify and democratize how you build these models and deploy them. And then ultimately you've got to get trust across your organization. And that means the models have to have explainability; they have to help you understand how they're recommending these things, and then people are gonna buy into it. It's just gonna make them better. It's the whole notion of superpowers. >> Get that down, and then you can scale. And that's really where the business value is. >> And they all want to get there. Now the hard part is, we've got to start doing it, right? It's kind of like the Internet was 20 years ago. They knew they wanted to do business transactions over the Internet and do commerce, but it didn't happen overnight. It wasn't magic. It was a journey. I think we're seeing that movie replaying here. >> Yeah. And in fact, I think in some ways it could even happen faster now, because you have the Internet, because you have clouds. So I'm predicting a very steep S-curve here. >> We'll have to leave it there. Scott, great to see you. Thanks for coming on. >> Any time. >> All right, keep it right there, everybody. We'll be back with our next guest right after this short break. You're watching theCUBE from the IBM Data and AI Forum in Miami. We'll be right back.
John Thomas, IBM Data and AI | IBM Data and AI Forum
(upbeat music) >> Announcer: Live from Miami, Florida, it's theCUBE, covering IBM's Data and AI Forum. Brought to you by IBM. >> We're back in Miami, everybody. You're watching theCUBE, the leader in live tech coverage. We go out to the events and extract the signal from the noise. Covering the IBM Data and AI Forum, John Thomas is here, a many-time CUBE guest. He's not only a distinguished engineer, he's also the chief data scientist for IBM Data and AI. John, great to see you again. >> Great to see you again, Dave. >> I'm always excited to talk to you, because you're hardcore data science. You're working with the customers, and you're kind of where the action is. The watchword today is the end-to-end data science lifecycle. What's behind that? I mean, there's been a lot of experimentation, a lot of tactical things going on. You're talking about an end-to-end lifecycle; explain. >> So Dave, what we are seeing in our client engagements is, actually working with the data, building the models, that part is relatively easy. The tougher part is to make the business understand what the true value of this is. So it's not a science project, right? It is not an academic exercise. So how do you do that? In order for that to happen, these models need to go into production. Well, okay, how do you do that? There is this business of, I've got something in my development environment that needs to move up through QA and staging, and then to production. Well, a lot of different things need to happen as you go through that process. How do you do this? See, this is not a new paradigm. It is a paradigm that exists in the world of application development. You've got to go through a DevOps lifecycle. You've got to go through a continuous integration and continuous delivery mindset. You've got to have the same rigor in data science. Then at the front end of this is, what business problem are you actually solving? Do you have business KPIs for that?
And when the model is actually in production, can you track, can you monitor the performance of the model against the business KPIs that the business cares about? And how do you do this in an end-to-end fashion? And then in there is retraining the model when performance degrades, et cetera, et cetera. But this notion of following a DevOps mindset in the world of data science is absolutely essential. >> Dave: So when you think about DevOps, you think of agile. So help me square this circle, when you think end-to-end data life cycle, you think big, heavy waterfall, but I'm inferring you're not prescribing a waterfall. >> John: No, no, no. >> So how are organizations dealing with that holistic end-to-end view but still doing it in an agile manner? >> Yeah, exactly. So, I always say do not boil the ocean, especially if you're approaching AI use cases. Start with something that is contained, that you can define, and break it into sprints. So taking an agile approach to this. Two, three sprints, and if you're not seeing value in those two, three sprints, go back to the drawing board and see what is it that you're doing wrong. So for each of your sprints, what is the specific success criteria that you care about and the business cares about? Now, as you go through this process, you need a mechanism to look at, okay, well I've got something in development, how do I move the assets? Not just the model, but, what is the set of features that you're working with? What is the data prep pipeline? What are the scripts being used to evaluate the model? All of these things are logical assets surrounding the model. How do you move them from development to staging? How do you do QA against this set of assets? Then how do you do third party approval oversight? How do you do code review? How do you make sure that when you move these assets all of the surrounding mechanisms are being adhered to, compliance requirements, regulatory requirements? And then finally get them to production.
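To make the asset-promotion idea concrete, here is a minimal, hypothetical sketch of a promotion gate: a model "bundle" only advances from dev to staging to production when every logical asset John lists (model, feature list, data-prep pipeline, evaluation script) is present, QA has passed, and a third party has approved. The asset names and stages are invented for illustration; they are not how any IBM product actually models this.

```python
# Hypothetical promotion gate for a model "bundle" of logical assets.
REQUIRED_ASSETS = {"model", "feature_list", "data_prep", "eval_script"}

def promote(bundle, qa_passed, approved_by):
    """Advance a bundle one stage, or raise ValueError if the gate fails."""
    missing = REQUIRED_ASSETS - set(bundle["assets"])
    if missing:
        raise ValueError(f"missing assets: {sorted(missing)}")
    if not qa_passed:
        raise ValueError("QA checks have not passed")
    if not approved_by:
        raise ValueError("third-party approval is required")
    stages = ["dev", "staging", "production"]
    i = stages.index(bundle["stage"])
    bundle["stage"] = stages[min(i + 1, len(stages) - 1)]
    return bundle["stage"]

bundle = {"stage": "dev",
          "assets": {"model", "feature_list", "data_prep", "eval_script"}}
print(promote(bundle, qa_passed=True, approved_by="reviewer"))  # staging
print(promote(bundle, qa_passed=True, approved_by="reviewer"))  # production
```

The point of the sketch is that the gate checks the whole asset set, not just the model binary, which is the rigor John is borrowing from application-development CI/CD.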
So there's a technology aspect of it, obviously. You have a lot of discussion around Kubeflow, MLflow, et cetera, et cetera as technology options. But there is also a mindset that needs to be followed here. >> So once you find a winner, business people want to scale, 'cause they can make more money the more and more times they can replicate that value. And I want to understand this trust and transparency, 'cause when you scale, if you're scaling things that aren't compliant, you're in trouble. But before we get there, I wonder if we can take an example of, pick an industry, or some kind of use case where you've seen this end-to-end life cycle be successful. >> Yeah, across industries. I mean it's not just specific to one industry. But, I'll give you an example. This morning Wunderman Thompson was talking about how they are applying machine learning to a very difficult problem, which is how to improve how they create a first-time buyer list for their clients. But think of the problem here. It's not just about a one-time building of a model. The model needs, okay, you got data, understand what data you're working with, what is the lineage of that data. Once I have that understanding of the data, then I get into feature selection, feature engineering, all the steps that I need in your machine learning cycle. Once I am done with selecting my features, doing my feature engineering, I go into model building. Now, it's a pipeline that is being built. It is not a one-time activity. Once that model, the pipeline, has been vetted, you got to move it from development to your QA environment, from there to your production environment, and so on. And here comes, and this is where it links to the question, the transparency discussion. Well, the model is in production, how do I make sure the model is being fair? How do I make sure that I can explain what is going on? How do I make sure that the model is not unfairly biased?
So all of these are important discussions in the trust and transparency because, you know, people are going to question the outcome of the model. Why did it make a decision? If a campaign was run for an individual, why did you choose him and not somebody else? If it's a credit card fraud detection scenario, why was somebody tagged as fraudulent and not the other person? If a loan application was rejected, why was he rejected and not someone else? You got to explain this. So, this is the notion of explainability, a term that gets overloaded at times. The idea here is you should be able to retrace your steps back to an individual scoring activity and explain an individual transaction. You should be able to play back an individual transaction and say version 15 of my model used these features, these hundred features, for its scoring. This was the incoming payload, this was the outcome, and, if I had changed five of my incoming payload variables out of the 500 I use, or hundred I use, the outcome would have been different. Now you can say, you know what, ethnicity, age, education, gender, these parameters did play a role in the decision, but they were within the fairness bracket. And the fairness bracket is something that you have to define. >> So, if I could play that back. Take fraud detection. So you might have the machine tell you with 90% confidence or greater that this is fraud but it throws back a false positive. When you dig in, you might see well there's some bias included in there. Then what? You would kind of refactor the model? >> A couple of different things. Sometimes a bias is in the data itself and it may be valid bias. And you may not want to change that. Well, that's what the system allows you to do. It tells you, this is the kind of bias that exists in the data already. And you can make a business decision as to whether it is good to retain that bias or to correct it in the data itself.
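The playback John describes, knowing which model version scored a transaction, with which features and payload, and what a changed input would have done, can be sketched in miniature. Everything here (the weights, the threshold, the field names) is invented for illustration; this is not Watson's scoring mechanism, just a toy audit log around a linear rule.

```python
# Toy per-transaction playback: every scoring call records the model
# version, the features used, the payload, and the outcome, so an
# individual decision can be replayed later with altered inputs.
audit_log = []

def score(payload, version="v15", weights=None, threshold=1.0):
    """Score one transaction and log everything needed to replay it."""
    weights = weights or {"amount": 0.002, "foreign": 0.8, "hour": 0.01}
    value = sum(weights[k] * payload[k] for k in weights)
    outcome = "flag" if value >= threshold else "ok"
    audit_log.append({"version": version, "payload": dict(payload),
                      "features": weights, "outcome": outcome})
    return outcome

txn = {"amount": 400, "foreign": 0, "hour": 3}
original = score(txn)                    # 0.83 < 1.0 -> "ok"
replayed = score({**txn, "foreign": 1})  # one changed variable -> "flag"
print(original, replayed)
```

With a log like this, "play back transaction N and show why it flipped" is a lookup plus a re-score, which is the property John is arguing for.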
Now, if the bias is in how the algorithm is processing their data, again, it's a business decision. Should I correct it or not? Sometimes, bias is not a bad thing. (laughs) It's not a bad thing. No, because, you are actually looking at what signal exists in their data. But what you want to make sure is that it's fair. Now what is fair, that is up to the regulatory body, or your business defines it: you know what, age range between 26 and 45, I want to treat them a certain way. If this is a conscious decision that you, as a business, or your industry is making, that's fair game. But if it is, this is what I wanted that model to do for this age range but the model is behaving a different way, I want to catch that. And I want to either fix the bias in the data or in how the algorithm is behaving with the model itself. >> So, you can inject the ethics of the company into the model, but then, and then appropriately and fairly apply that, as long as it doesn't break the law. >> Exactly. (laughs) >> Which is another part of the compliance. >> So, this is not just about compliance. Compliance is a big, big part here. But this is also just answering what your end customer is going to ask. I put in an application for a loan and I was rejected. And, I want an explanation as to why it was rejected, right? >> So you got to be transparent, is your point there. >> Exactly, exactly. And if the business can say, you know what, these are the criteria we used, you fell in this range, and this, in our mind, is a fair range, that is okay. It may not be okay for the end customer but at least you have a valid explanation for why the decision was made by the model. So, it's not some black box making some... >> So the bank might say, well, the decision was made because we don't like the location of the property, we think they're overvalued. It had nothing to do with your credit. >> John: Exactly. >> We just don't want to invest in this, by the way, maybe we advise you don't invest in that either.
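The "fairness bracket" John describes in this exchange, a band the business or regulator defines within which differences across groups are acceptable, can be sketched roughly like this. The 0.8 lower bound echoes the common four-fifths rule; the groups and decisions are made up for the example.

```python
# Toy fairness-bracket check: compare a model's approval rate across
# groups and flag any group whose rate falls outside a defined band.
def approval_rates(decisions):
    """decisions: list of (group, approved) pairs -> approval rate per group."""
    totals, approved = {}, {}
    for group, ok in decisions:
        totals[group] = totals.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + (1 if ok else 0)
    return {g: approved[g] / totals[g] for g in totals}

def outside_bracket(rates, lower=0.8):
    """Groups whose rate is below `lower` times the best group's rate."""
    best = max(rates.values())
    return sorted(g for g, r in rates.items() if r < lower * best)

decisions = ([("26-45", True)] * 8 + [("26-45", False)] * 2 +
             [("46-65", True)] * 5 + [("46-65", False)] * 5)
rates = approval_rates(decisions)
print(rates, outside_bracket(rates))
```

The point is that the bracket itself (the `lower` bound here) is a business or regulatory choice, exactly as John says; the code only detects when a group falls outside it.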
>> Right, right, right. >> So that feedback loop is there. >> This is, being able to find it for each individual transaction, each individual model scoring. What weighed into the decision that was made by the model. This is important. >> So you got to have atomic access to that data? >> John: At the transaction level. >> And then make it transparent. Are organizations, banks, are they actually making it transparent to their consumers, 'cause I know in situations that I'm involved in, it's either okay, go, or no, but we're not going to tell you why. >> Everyone is beginning to look into this space. >> Healthcare is another one, right, where we would love more transparency in healthcare. >> Exactly. So this is happening. This is happening where people are looking at, oh, we can't just do black box decision making, we have to get serious about this. >> And I wonder, John, if a lot of that black box decision making is just easy to not share information. Healthcare, you're worried about HIPAA. Financial services is just so highly regulated so people are afraid to actually be transparent. >> John: Yup. >> But machine intelligence potentially solves that problem? >> So, internally, at least internal to the company, when the decision is made, you need to have a good idea why the decision was made, right. >> Yeah right. >> As to what you use to explain to the end client or to a regulatory body, is up to you. At least internally you need to have clarity on how the decision was arrived at. >> When you were talking about feature selection and feature engineering and model building, how much of that is being done by AI or things like auto AI? >> John: Yup. >> You know, versus humans? >> So, it depends. If it's a relatively straightforward use case, you're dealing with 50, maybe a hundred features. Not a big deal. I mean, a good data scientist can sit down and do that.
But, again, I'm going back to the Wunderman Thompson example from this morning's keynote, they're dealing with 20,000 features. You just, that is, you just can't do this economically at scale with a bunch of data scientists, even if they're super data scientists doing this in a programmatic way. So this is where something like auto AI comes into play and says, you know what, out of this 20,000 plus feature set, I can select, you know, this percentage, maybe a thousand or 2,000 features that are actually relevant. Two, now here comes interesting things. Not just that it has selected 2,000 features out of 20,000, but it says, if I were to take three of these features and two of these features and combine them. Combine them, maybe to do a transpose. Maybe do an inverse of one and multiply it with something else or whatever, right. Take a logarithmic approach to one and then combine it with something else, XOR, whatever, right. Some combination of operations on these features generates a new feature which boosts the signal in your data. Here is the magic, right. So suddenly you've gone from this huge array of features to a small subset and in there you are saying, okay, if I were to combine these features I can now get much better prediction power for my model. And that is very good, and auto AI is very heavily used in the Wunderman example. In scenarios like that where you have very large scale feature selection, feature engineering. >> You guys use this concept of the data ladder, collect, organize, analyze, and infuse. Correct me if I'm wrong, but a lot of data scientists' time is spent collecting, organizing. They want to do more analysis and so ultimately they can infuse. Talk about that analyze portion and how to get there? What kind of progress is the industry generally, and IBM, making to help data scientists? >> So, analyze. Typically you don't jump straight into building machine learning models. The first part is to just do exploratory analysis.
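Before moving on, the feature-combination idea above, operations on existing features that generate a new feature boosting the signal, can be shown at toy scale. This is a hand-rolled stand-in for what auto AI does across 20,000 features; the data and the single product transform are invented for the example.

```python
# Toy feature engineering: generate pairwise products of candidate
# features and keep the ones whose correlation with the target beats
# the best single feature.
import math
from itertools import combinations

def pearson(xs, ys):
    """Plain Pearson correlation, stdlib only."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

# Invented data where neither feature alone explains the target,
# but their product does (target = a * b exactly).
a = [1, 2, 3, 4, 1, 2, 3, 4]
b = [4, 3, 2, 1, 1, 2, 3, 4]
target = [x * y for x, y in zip(a, b)]
features = {"a": a, "b": b}

singles = {name: abs(pearson(vals, target)) for name, vals in features.items()}
combined = {}
for (n1, v1), (n2, v2) in combinations(features.items(), 2):
    combined[f"{n1}*{n2}"] = abs(pearson([x * y for x, y in zip(v1, v2)], target))

best_single = max(singles.values())
winners = {k: v for k, v in combined.items() if v > best_single}
print(winners)  # the engineered feature a*b carries far more signal
```

At Wunderman Thompson's scale the search space of such combinations is enormous, which is why John argues it has to be automated rather than done by hand.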
You know, age-old exploration of your data to understand what is there. I mean people jump into the modeling first, and it's normal, but if you don't understand what your data is telling you, it is foolish to expect magic to happen from your data. So, exploratory analysis, your traditional approaches. You start there. Then you say, in that context I think I can do model building to solve a particular business problem, and then comes the discussion, okay, am I using neural nets or am I using classical mechanisms, am I doing this framework, XGBoost or TensorFlow? All of that is secondary. Once you get through exploratory analysis, looking at framing the business problem as a set of models that can be built, then say what technique do I use now. And auto AI, for example, will help you select the algorithms once you have framed the problem. It says, should I use LightGBM? Should I use something else? Should I use logistic regression? Whatever, right. So algorithm selection is something that can be helped by auto AI. >> John, we're up against the clock. Great to have you. A wonderful discussion. Thanks so much, really appreciate it. >> Absolutely, absolutely. >> Good to see you again. >> Yup, same here. >> All right. Thanks for watching everybody. We'll be right back right after this short break. You're watching theCUBE from the IBM Data and AI Forum in Miami. We'll be right back. (upbeat music)
Beth Smith, IBM Watson | IBM Data and AI Forum
>> Narrator: Live from Miami, Florida. It's theCUBE. Covering IBM's Data and AI Forum. Brought to you by IBM. >> Welcome back to the port of Miami everybody. This is theCUBE, the leader in live tech coverage. We're here covering the IBM Data and AI Forum. Of course, the centerpiece of IBM's AI platform is Watson. Beth Smith is here, she's the GM of IBM Watson. Beth, good to see you again. >> You too. Always good to be with theCUBE. >> So, awesome. Love it. So give us the update on Watson. You know, it's beyond Jeopardy. >> Yeah, yeah. >> Oh, wow. >> That was a long time ago now. (laughs) >> Right, but that's what a lot of people think of, when they think of Watson. What, how should we think about Watson today? >> So first of all, we focus Watson on being ready for business. And then, a lot of people ask me, "So what is it?" And I often describe it as a set of tools, to help you do your own AI and ML. A set of applications that are AI applications, where we have prebuilt it for you around a use case. And there are examples where it gets embedded in a different application or system that may have existed already. In all of those cases, Watson is there, tuned to the business enterprise, to help people operationalize AI. So they can get the full benefit, because at the end of the day it's about those business outcomes. >> Okay, so the tools are for the super geeks, (Beth laughs) who actually want to go in and build the real AI. >> (laughs) That's right, that's right. >> The apps are, okay. It's prebuilt, right? Go ahead and apply it. >> That's right. >> And the embedded is, we don't even know we're using it, right? >> That's right, or you may. Like, QRadar with Watson has an example of using Watson inside of it. Or, OpenPages with Watson. So sometimes you know you're using it. Sometimes you don't. >> So, how's the mix? I mean, in terms of the adoption of Watson? Are there enough like, super techies out there, who are absorbing this stuff? Or is it mostly packaged apps?
Is it a mix? >> So it is a mix, but we know that data science skills are limited. I mean, they're coveted, right? And so those are the geeks, as you say, that are using the tool chain as a part of it. And we see that in a lot of customers and a lot of industries around the world. And then from a packaged app standpoint, the biggest use case of adoption is really around customer care, customer service, customer engagement. That kind of thing. And we see that as well. All around the world, all different industries. Lots of great adoption. Watson Assistant is our flagship in that. >> So, in terms of, if you think about these digital initiatives, we talked about digital transformation, >> Yup. >> Last few years, we kind of started in 2016 in earnest, it's real when you talk to customers. And there was a ton of experimentation going on. It was almost like spaghetti. Throw it against the wall and see what sticks. Are you seeing people starting to place their bets on AI, narrowing their scope, and really driving, you know, specific business value now? >> Beth: Yeah. >> Or is it still kind of all over the place? >> Well, there's a lot of studies that say about 51% or so are still stuck in experimentation. But I would tell you in most of those cases even, they have a nice pilot that's in production, that's doing a part of the business. So, 'cause people understand while they may be interested in the sexiness of the technology, they really want to be able to get the business outcomes. So yes, I would tell 'ya that things have kind of been guided, focused towards the use cases and patterns that are the most common. You know, and we see that. Like I mentioned, customer care. We see it in, how do you help knowledge workers? So you think of all those business documents, and papers and everything that exists. How do you assist those knowledge workers? Whether or not it's an attorney or an engineer, or a mortgage loan advisor.
So you see that kind of use case, and then you see customers that are building their own, focused in on, you know, how do they optimize or automate, or predict something in a particular line of business? >> So you mentioned Watson Assistant. So tell us more about Watson Assistant, and how has that affected adoption? >> So Watson Assistant, as I said, it is our flagship around customer care. And just to give you a little bit of a data point, Watson Assistant now, through our public cloud, SaaS version, converses with 82 million end users a month. So it's great adoption. And this is, this is enabling customers, customers of our customers, to be able to get self-service help in what they're doing. And Watson Assistant, you know, a lot of people want to talk about it being a chat bot. And you can do simple chat bots with it. But it extends to sophisticated assistants as well. 'Cause it shows up to do work. It's there to do a task. It's to help you deal with your bank account, or whatever it is you're trying to do, and whatever company you're interacting with. >> So chat bot is kind of a, (laughs) bit of a pejorative. But you're talking about digital systems, it's like a super chat bot, right? >> Beth: Yeah. I saw a stat the other day that there's going to be, by I don't know, 2025, whatever, there's going to be more money spent on chat bot development, or digital assistants, than there is on mobile development. And I don't know if that's true or not, >> Beth: Mhm, wow. But it's kind of an interesting thing. So what are you seeing there? I mean, again I think chat bots, people think, oh, I got to talk into a bot. But a lot of times you don't know you're, >> Beth: That's right. >> so they're getting, they're getting better. I liken it to fraud detection. You know, 10 years ago fraud detection was like, six months later you'll, >> Right. >> you'll get a call. >> Exactly.
>> And so chat bots are just going to get better and better and better, and now there's this super category that maybe we can define here. >> That's right. >> What is that all about? >> That's right. And actually I would tell you, they kind of, they can become the brain behind something that's happening. So just earlier today I was, I was with a customer and talking about their email CRM system, and Watson Assistant is behind that. So chat bots aren't just about what you may see in a little window. They're really about understanding user intent, guiding the user through what they're trying to either find out or do, and taking the action as a part of it. And that's why we talk about it being more than chat bots. 'Cause it's more than a FAQ interchange. >> Yes, okay. So it's software, >> Beth: Yes. >> that actually does, performs tasks. >> Beth: Yes. >> Probably could call other software, >> Beth: Absolutely. >> to actually take action. >> That's right. >> I mean, I see. We think of this as systems of agency, actually. Making, sort of, >> That's right. >> decisions and then I guess, the third piece of that is, having some kind of human interaction, where appropriate, right? >> That's right. >> What do you see in terms of, you know, infusing humans into the equation? >> So, well a couple of things. So one of the things that Watson Assistant will do, is if it realizes that it's not the expert on whatever it is, then it will pass over to an expert. And think of that expert as a human agent. And while it's doing that, so you may be in the queue, because that human person is tied up, you can continue to do other things with it, while you're waiting to actually talk to the person. So that's a way that the human is in the loop. I would tell you there's also examples of how the agents are being assisted in the background. 
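The handoff Beth describes, answer when confident, pass to a human-agent queue when the assistant is not the expert, can be sketched as a simple router. The intents, confidence values, and canned answers here are all invented; a real assistant would sit behind a trained intent classifier, not a keyword match.

```python
# Hypothetical sketch of confidence-based escalation to a human agent.
from collections import deque

HUMAN_QUEUE = deque()
ANSWERS = {"check_balance": "Your balance is available under Accounts.",
           "reset_password": "I've sent a reset link to your email."}

def handle(utterance, classify, threshold=0.7):
    """Route one user message: answer directly or escalate to a human."""
    intent, confidence = classify(utterance)
    if confidence >= threshold and intent in ANSWERS:
        return ANSWERS[intent]
    HUMAN_QUEUE.append(utterance)  # the user waits here for a live agent
    return "Let me connect you with an agent who can help."

# Stand-in classifier for the example only.
def classify(text):
    if "balance" in text:
        return "check_balance", 0.95
    return "unknown", 0.2

print(handle("What's my balance?", classify))
print(handle("I want to dispute a charge", classify))
print(len(HUMAN_QUEUE))  # 1
```

The design point Beth makes is exactly this split: the assistant keeps handling what it can while the queue holds what it can't, so the human stays in the loop without fielding every message.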
So they have the interaction directly with the user, but Watson Assistant is helping them, be able to get to more information quicker, and narrow in on what the topic is. >> So you guys talk about the AI ladder, >> Beth: Mhm. >> Sort of, Rob talked about that this morning. My first version of the AI ladder was building blocks. It was like data and AI analytics, ML, and then AI on top of that. >> Beth: Yup. >> I said AI. Data and IA. >> Beth: Yup. >> Information Architecture. Now you use verbs. Sort of, to describe it. >> Beth: Yup. Which is actually more powerful. Collect, organize, analyze and infuse. Now infuse is like the Holy Grail, right? 'Cause that's operationalizing and being able to scale AI. >> Beth: That's right. >> What can you tell us about how successful companies are infusing AI, and what is IBM doing to help them? >> So, I'm glad you picked up first of all, that these are verbs and it's about action. And action leads to outcome, which is, I think, critical. And I would also tell you yes, infuse is, you know, the Holy Grail of the whole thing. Because that's about injecting it into business processes, into workflows, into how things are done. So you can then see examples of how attorneys may be able to get through their legal prep process in just a few minutes, versus 10, 15 hours on certain things. You can see conversion rates of, from a sales standpoint, improve significantly. A number of different things. We've also got it as a part of supply chain optimization, understanding a little bit more about both inventory, but also where the goods are along the way. And particularly when you think about a very complicated thing, there could be a lot of different goods in various points of transit. >> You know, I was sort of joking. Not joking, but mentioning Jeopardy at first. 'Cause a lot of people associate Watson with Jeopardy. >> Beth: Right. >> I can't remember the first time I saw that. It had to be the mid part of the last decade. What was it? 
>> Beth: February of 2011. >> 2011, okay. I thought I even saw demos before that. I'm actually sure I did. Like in, back in some lab in IBM. And of course, the potential like, blew your mind. >> Right. >> I suspect you guys didn't even know what you had at the time. You were like, "Okay, we're going to go change the world." And you know, when you drive up and down 101 in Silicon Valley, it's like, "Oh, Watson this, Watson that." You know, you get the consumer guys, doing facial recognition, ad serving. You know, serving up fake news, you know. All kinds of applications. But IBM started to do something different. You're trying to really change business. Did you have any clue as to what you had at the time? And then how much of a challenge you were taking on, and then bring us to where we are now, and what do you see as the potential for the next 10 years? >> So, of course we had a clue. So let me start there. (Dave laughs) But with that, I think the possibilities of it weren't completely understood. There's no question in my mind about that. And what the early days were, were understanding, okay, what is that business application? What's the pattern that's going to come about as a part of it? And I think we made tremendous progress on that along the way. I would tell you now, you mentioned operationalizing stuff, and you know, now it's about, how do we help companies have it more throughout their company? Through different lines of business, how does it tie to various things that are important to us? And so that brings in things like trust, explainability, the ethics of what it's doing. Bias detection and mitigation. And I actually believe a lot of that, and the operationalizing of it within the processes, is where we're going to head going forward. Of course there'll continue to be advancements on the features and the capabilities, but it's going to be about that. >> Alright, I'm going to ask you the "it depends" question.
(Beth laughs) So I know that's your answer, but at the macro level, can machines make better diagnoses than doctors today, and if not, when will they be able to, in your view? >> So I would actually tell you that today they cannot, but what they can do is help the doctor make a better diagnosis than she would have done by herself. And because it comes back to this point of, you know, how the machine can process so much information, and help the expert, in this case the doctor's the expert, it could be an attorney, it could be an engineer, whatever. Help that expert be able to augment the knowledge that he or she has as a part of it. So, and that's where I think it is. And I think that's where it will be for my lifetime. >> So, there's no question in your mind that machines today, AI today, is helping make better diagnoses, it's just within an augmented or attended type of approach. >> Absolutely. >> And I want to talk about Watson Anywhere. >> Beth: Okay, great. >> So we saw some discussion in the keynotes and some demos. My understanding is, you could bring Watson Anywhere, to the data. >> That's right. >> You don't have to move the data around. Why is that important? Give us the update on Watson Anywhere. >> So first of all, this is the biggest requirement I have had since I joined the Watson team, three and a half years ago. It was, please can I have Watson on-prem, can I have Watson in my company data center, etcetera. And you know, we needed to instead really focus in on what these patterns and use cases were, and we needed some help in the platform. And so thanks to Cloud Pak for Data, and the underlying Red Hat OpenShift and container platform, we are now enabled to truly take Watson anywhere. So you can have it on premises, you can have it on the other public clouds, and this is important, because like you said, it's important because of where your data is. But it's also important because the workloads of today and tomorrow are very complex.
And what's on cloud today may be on premises tomorrow, may be in a different cloud. And as that moves around, you also want to protect the investment of what you're doing, as you have Watson customized for what your business needs are. >> Do you think you timed it right? I mean, you kind of did. All this talk about multicloud now. You really didn't hear much about it four or five years ago. For a while I thought you were trying to juice your cloud business. Saying, "You want, if you want Watson, you got to go to the IBM cloud." Was there some of that, or was it really just, "Hey, now the timing's right." Where clients are demanding it, and hybrid and multicloud and on-prem situations? >> Well look, we know that cloud and AI go hand in hand. So there was a lot of positive with that. But it really was this technology point, because had I taken it anywhere three and a half years ago, what would've happened is, every deployment would've been a unique environment, a unique stack. We needed to get to a point that was a modern-day, you know, infrastructure, if you will. And that's what we get now, with a container-based platform. >> So you're able to scale it, such that every instance isn't a snowflake, >> That's right. >> that requires customization. >> That's right. So then I can invest in the enhancements to the actual capabilities it is there to do, not supporting multiple platform instantiations under the covers. >> Well, okay. So you guys are making that transparent to the customer. How much of an engineering challenge is that? Can you share that with us? You got to run on this cloud, on that cloud, or on-prem, forever? >> Well, now because of Cloud Pak for Data, and then what we have with OpenShift and Kubernetes and containers, it becomes, well, you know, there's still some technical work, my engineering team would tell you I'd be lying to say otherwise. But it's simple now, it's straightforward. It's a lot of portability and flexibility.
In the past, it would've been every combination of whatever people were trying to do, and we would not have had the benefit of what that now gives you. >> And what's the technical enabler there? Is it sort of open APIs? Architecture that allows for the interconnectivity? >> So, but inside of Watson? Or the overall platform? >> The overall platform. >> So I would say, at its core, it's what containers bring. >> Okay, really. So it's that, it's that. It's the marriage of your tech, >> Yeah. >> with the container wave. >> That's right. That's right. Which is why the timing was critical now, right? So you go back, yes they existed, but it really hadn't matured to a point of broad adoption. And that's where we are now. >> Yeah, the adoption of containers, Kubernetes, you know, microservices. >> Right, exactly. Now it's on a very steep curve. >> Exactly. >> Alright, give us your last word on the big takeaway from this event. What are you hearing, you know, what are you, some of the things you're most excited about? >> So first of all, that we have all of these clients and partners here, and all the buzz that you see. And that we've gotten. And then the other thing that I would tell you is the great client examples. And what they're bragging on, because they are getting business outcomes. And they're getting better outcomes than they thought they would achieve. >> IBM knows how to throw an event. (Beth laughs) Beth, thanks so much for coming to theCUBE. >> Thank you, good to >> Appreciate it. >> see you again. >> Alright, great to see you. Keep it right there everybody, we'll be back. This is theCUBE live, from the IBM Data Forum in Miami, we'll be right back. (upbeat instrumental music)
Ritika Gunnar, IBM | IBM Data and AI Forum
>> Narrator: Live from Miami, Florida. It's theCUBE, covering IBM's Data and AI Forum, brought to you by IBM. >> Welcome back to downtown Miami, everybody. We're here at the Intercontinental Hotel covering the IBM Data and AI Forum, hashtag DataAIForum. My name is Dave Vellante and you're watching theCUBE, the leader in live tech coverage. Ritika Gunnar is here. She's the vice president of data and AI expert labs and learning at IBM. Ritika, great to have you on. >> Again, always a pleasure to be here, Dave. >> I love interviewing you because you're a woman executive that's had a lot of different roles at IBM. Um, you know, we've talked about the AI ladder. You're climbing the IBM ladder, and so it's awesome to see, and I love this topic. It's a topic that's near and dear to theCUBE's heart, not only women in tech, but women in AI. So great to have you. >> Thank you. >> So what's going on with the women in AI program? We're going to cover that, but let me start with women in tech. It's an age-old problem that we've talked about. Depending on what statistic you look at, 15% to 17% of the industry comprises women. We do a lot of events. You can see it. Um, let's start there. >> Well, obviously the diversity is not yet there, right? So we talk about women in technology, um, and we just don't have the representation that we need to be able to have. Now when it comes to artificial intelligence, I think the statistic is 10 to 15% of the workforce today in AI is female. When you think about things like bias and ethics, having male and female representation be equal is absolutely essential, so that you're creating fair AI, unbiased AI, and you're creating trust and transparency, with a set of capabilities that really have that diversity in backgrounds. >> Well, you work for a company whose chairman and CEO is a woman.
I mean IBM generally, you know, we can see this on theCUBE, because IBM puts women on, and we get a lot of women customers that come on. >> And not just because we're female, because we're capable. >> Yeah, well of course. Right. It's just because you're in roles where you're spokespeople, and it's natural for spokespeople to come on a forum like this. But I have to ask you, as somebody inside of IBM, a company that I would say tests pretty well relative to most: do you feel that way, or do you feel like even a company like IBM has a long way to go? >> Oh, um, I personally don't feel that way, and I've never felt that to be an issue. And if you look at my peers, um, my lead for artificial intelligence, Beth Smith, who, you know, is a female, a lot of my peers under Rob Thomas, all female. So I have not felt that way in terms of the leadership team that I have. Um, but there is a gap that exists, not necessarily within IBM, but in the community as a whole. And I think it goes back to, when you think about data science and artificial intelligence, you want to be able to see yourself in the community. And while there's only 10 to 15% of females in AI today, that's why IBM has created programs such as Women Leaders in AI, which we started in June, because we want strong female leaders to be able to see that there is great representation of very technical, capable females in artificial intelligence that are doing amazing things to transform their organizations and their business models. >> So tell me more about this program. I understand why you started it; it started in June. What does it entail, and what's the evolution of this?
>> So we started it in June, and the idea was to get some strong female leaders in multiple different organizations that are using AI to change their companies and their business models, and really highlight not just the journey that they took, but the types of transformations that they're doing in their organizations. We're going to have one of those events tonight as well, where we have leaders from Harley-Davidson and Miami-Dade County coming to talk about not only what their journey was, but what actually brought them to artificial intelligence and what they're doing. And I think, Dave, the reason that's so important is you want to be able to understand that those journeys are absolutely approachable. They're doable by any females that are out there. >> Talk about inherent bias. Humans are biased, and if you're developing models that are using AI, there's going to be inherent bias in those models. So talk about how to address that, and why is it important for more diversity to be injected into those models? >> Well, I think a great example is if you took the data sets that have existed for the past 50 years, even a decade ago, and you created a model to predict whether or not to give loans to certain candidates, what would you find? All things being equal, more males get these loans than females. The data that exists has inherent bias in it, just from the history we've had. That's not the way we want to do things today. You want to be able to identify that bias and say, all things being equal, it is absolutely important that regardless of whether you are a male or a female, you give that loan to that person if they have all the other qualities that are there. And that's why being able to not only detect these things, but to have diversity in the kinds of backgrounds of the people who are building and deploying this AI, is absolutely critical.
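The loan example above is the textbook fairness check. As a hedged illustration only, not IBM's tooling (IBM's open-source AI Fairness 360 toolkit ships production-grade versions of this kind of metric), here is a minimal sketch of the "all things being equal" comparison described above, using the widely cited four-fifths disparate-impact rule on made-up approval data:

```python
# Illustrative sketch only: a minimal disparate-impact check of the kind
# described above for loan approvals. The data and group labels are made up;
# the 0.8 threshold follows the widely cited "four-fifths rule".

def disparate_impact(outcomes, groups, privileged, favorable=1):
    """Ratio of favorable-outcome rates: unprivileged group / privileged group."""
    def rate(group):
        decisions = [o for o, g in zip(outcomes, groups) if g == group]
        return sum(d == favorable for d in decisions) / len(decisions)
    unprivileged = next(g for g in set(groups) if g != privileged)
    return rate(unprivileged) / rate(privileged)

# Toy historical loan decisions: 1 = approved, 0 = denied
approvals = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]
genders   = ["M", "M", "M", "M", "M", "F", "F", "F", "F", "F"]

ratio = disparate_impact(approvals, genders, privileged="M")
print(f"disparate impact ratio: {ratio:.2f}")
if ratio < 0.8:
    print("potential bias: the unprivileged group is approved far less often")
```

A ratio near 1.0 means both groups are approved at similar rates; in this toy history, men are approved four times as often as women, which is exactly the inherited bias a model trained on that data would learn.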
>> So for the past decade, and certainly in the past few years, there's been a light shined on this topic. I think, you know, we were at the Grace Hopper conference when Satya Nadella stuck his foot in his mouth and said, hey, it's bad karma, you know, if you feel like you're underpaid, to go complain. And the women in the audience were like, dude, no way. And he did the right thing. He said, you know what, you're right, and he backtracked on that. And that was sort of another inflection point. But you talk about the women in AI program. I was at a CDO event one time, where an IBMer had started the data divas breakfast, and I asked, can I go? They go, yeah, you can be the data dude. Um, so you're seeing a lot of initiatives like this. My question is, are they having the impact that you would expect and that you want to have? >> I think they absolutely are. Again, I mean, I'll give you a little bit of a story. Um, you know, people want to be able to relate and see themselves in these female leaders. And so we've seen cases now through our events. Like at IBM, we have a program called Grow, which is really about helping our female technical leaders understand that they can grow, they can be nurtured, and they have development programs to help them accelerate where they need to be on their technical paths. We've absolutely seen a huge impact from that from a technology perspective, in terms of more females staying in technology and wanting to go into those career paths. I'll give you another point of view, Dave, and that is, when you look at where it starts, it starts a lot earlier. So I have a young daughter who, a year, year and a half ago, when I was doing a lot of stuff with Watson, would ask me not only what Watson's doing, but she would say, what does that mean for me, mom?
Like, what's my job going to be? And if you think about the changes in technology and cultural shifts, technology and artificial intelligence are going to impact every job, every industry, every role that there is out there. So much so that I believe her job hasn't been invented yet. And so when you think about what's absolutely critical, not only today's youth but every person out there needs a foundational understanding, not only of the three Rs that you and I know from when we grew up, reading, writing, and arithmetic, but of what it means to code. And you know, having people feel confident, having young females feel confident that they can not only do that, that they can be technical, that they can understand how artificial intelligence is really going to impact society and the world, is absolutely critical. And so these types of programs that shed light on that, that help bridge that confidence, are game changing. >> Well, you've got kids, I've >> got kids, I have daughters, you have daughters. Are they receptive to that? >> So, um, you know, I think they are, but they need to be able to see themselves. So the first time I sent my daughter to a coding camp, she came back and said, not for me, mom. I said, why? Because she's like, all the boys, they're coding in their Minecraft area. Not something I can relate to. You need to be able to relate and see something, develop that passion, and then mix yourself into that diverse background where you can see the diversity of backgrounds. When you don't have that diversity, and when you can't really see how to progress yourself, it becomes a blocker. So as she started going to Girlstart programs, which was something in Austin where young girls coded together, it became something that she's really passionate about, and now she's programming in Python. So that's just an example of, yes, you need to be able to have these types of skills.
It needs to start early, and you need to have the types of programs that help enhance that journey.
And it's not just women, it's people of color, it's people of different economic backgrounds, >>underrepresented minorities. Absolutely. And I think the biggest thing that you can do in an organization is have teams that have that diverse background, whether it be from where they see the underrepresented, where they come from, because those differences in thought are the things that create new ideas that really innovate, that drive, those business transformations that drive the changes in the way that we do things. And so having that difference of opinion, having healthy ways to bring change and to have conflict, absolutely essential for progress to happen. >>So how did you get into the tech business? What was your background? >>So my background was actually, um, a lot in math and science. And both of my parents were engineers. And I have always had this unwavering, um, need to be able to marry business and the technology side and really figure out how you can create the art of the possible. So for me it was actually the creativity piece of it where you could create something from nothing that really drove me to computer science. >>Okay. So, so you're your math, uh, engineer and you ended up in CS, is that right? >>Science. Yeah. >>Okay. So you were coded. Did you ever work as a programmer? >>Absolutely. My, my first years at IBM were all about coding. Um, and so I've always had a career where I've coded and then I've gone to the field and done field work. I've come back and done development and development management, gone back to the field and kind of seen how that was actually working. So personally for me, being able to create and work with clients to understand how they drive value and having that back and forth has been a really delightful part. And the thing that drives me, >>you know, that's actually not an uncommon path for IBM. Ours, predominantly male IBM, or is in the 50 sixties and seventies and even eighties. Who took that path? 
They started out programming. Um, I just think, trying to think of some examples. I know Omar para, who was the CIO of Aetna international, he started out coding at IBM. Joe Tucci was a programmer at IBM. He became CEO of EMC. It was a very common path for people and you took the same path. That's kind of interesting. Why do you think, um, so many women who maybe maybe start in computer science and coding don't continue on that path? And what was it that sort of allowed you to break through that barrier? >>No, I'm not sure why most women don't stay with it. But for me, I think, um, you know, I, I think that every organization today is going to have to be technical in nature. I mean, just think about it for a moment. Technology impacts every part of every type of organization and the kinds of transformation that happens. So being more technical as leaders and really understanding the technology that allows the kinds of innovations and business for informations is absolutely essential to be able to see progress in a lot of what we're doing. So I think that even general CXOs that you see today have to be more technically acute to be able to do their jobs really well and marry those business outcomes with what it fundamentally means to have the right technology backbone. >>Do you think a woman in the white house would make a difference for young people? I mean, part of me says, yeah, of course it would. Then I say, okay, well some examples you can think about Margaret Thatcher in the UK, Angela Merkel, and in Germany it's still largely male dominated cultures, but I dunno, what do you think? Maybe maybe that in the United States would be sort of the, >>I'm not a political expert, so I wouldn't claim to answer that, but I do think more women in technology, leadership role, CXO leadership roles is absolutely what we need. So, you know, politics aside more women in leadership roles. Absolutely. >>Well, it's not politics is gender. 
I mean, I'm independent, Republican, Democrat, conservative, liberal, right? Absolutely. Oh yeah. Well, companies, politics. I mean you certainly see women leaders in a, in Congress and, and the like. Um, okay. Uh, last question. So you've got a program going on here. You have a, you have a panel that you're running. Tell us more about. >>Well this afternoon we'll be continuing that from women leaders in AI and we're going to do a panel with a few of our clients that really have transformed their organizations using data and artificial intelligence and they'll talk about like their backgrounds in history. So what does it actually mean to come from? One of, one of the panelists actually from Miami Dade has always come from a technical background and the other panelists really etched in from a non technical background because she had a passion for data and she had a passion for the technology systems. So we're going to go through, um, how these females actually came through to the journey, where they are right now, what they're actually doing with artificial intelligence in their organizations and what the future holds for them. >>I lied. I said, last question. What is, what is success for you? Cause I, I would love to help you achieve that. That objective isn't, is it some metric? Is it awareness? How do you know it when you see it? >>Well, I think it's a journey. Success is not an endpoint. And so for me, I think the biggest thing I've been able to do at IBM is really help organizations help businesses and people progress what they do with technology. There's nothing more gratifying than like when you can see other organizations and then what they can do, not just with your technology, but what you can bring in terms of expertise to make them successful, what you can do to help shape their culture and really transform. To me, that's probably the most gratifying thing. 
And as long as I can continue to do that, and to get more acknowledgement of what it means to have the right diversity ingredients to do that, that's success. >> Well, Ritika, congratulations on your success. I mean, you've been an inspiration to a number of people. I remember when I first saw you, you were working in a group and you were up on stage, and I said, wow, this person really knows her stuff. And then you've had a variety of different roles, and I'm sure that success is going to continue. So thanks very much for coming on theCUBE. >> You're welcome. >> All right, keep it right there, buddy. We'll be back with our next guest right after this short break. We're here covering the IBM Data and AI Forum from Miami; we'll be right back.
Seth Dobrin, IBM | IBM Data and AI Forum
>> Narrator: Live from Miami, Florida. It's theCUBE, covering IBM's Data and AI Forum, brought to you by IBM. >> Welcome back to the port of Miami, everybody. We're here at the Intercontinental Hotel. You're watching theCUBE, the leader in live tech coverage. Seth Dobrin is here. He's the vice president of data and AI, and the chief data officer of cloud and cognitive software at IBM. Seth, good to see you again. >> Good to see you, Dave, thanks for having me here. >> The Data and AI Forum, hashtag DataAIForum. It's amazing here, 1700 people, everybody with a hands-on appetite for learning. What do you see out in the marketplace? What's new since we last talked? >> Well, I think if you look at some of the things that are really needed in the marketplace, it's really been around filling the skill shortage, and how do you operationalize and industrialize your AI. And so there's been a real need for ways to get more productivity out of your data scientists, not necessarily to replace them, but how do you get more productivity? And a few months ago we released something called AutoAI, which really is probably the only tool out there that automates the end-to-end pipeline, automates 80% of the work on that end-to-end pipeline, but isn't a black box. It actually kicks out code, so your data scientists can then take it, optimize it further, understand it, and really feel more comfortable about it. >> It's AI for AIs. That's >> exactly what it is, AI for AI. >> So how's that work? You're applying machine intelligence to data to make AI more productive, pick algorithms, best fit? >> Yeah, so it does. Basically, you feed it your data and it identifies the features that are important. It does feature engineering for you. It does model selection for you. It does hyperparameter tuning and optimization, and it does deployment, and it also monitors for bias. >> So what does the data scientist do?
>> The data scientist takes the code out the back end. And really, there are some tweaks to the model, things that maybe AutoAI didn't get perfect, um, to really customize it for the business and the needs of the business. The data scientist then can apply it in a way that is unique to their business, and that essentially becomes their IP. It's not generic AI for everybody; it's customized. >> And that's where data scientists complain that they don't have the time to do this wrangling of data. >> Exactly. And it was built as a combination from IBM Research, since there are great assets at IBM Research, plus some Kaggle masters that work here at IBM, who really designed and optimized the algorithm selection and things like that. And then at the keynote today, uh, Wunderman Thompson was up there talking, and this is probably one of the most impactful use cases of AutoAI to date. And it was also, you know, my former team, the data science elite team, was engaged. But Wunderman Thompson had this problem where they had, like, 17,000 features in their data sets, and what they wanted to do was have a custom solution for their customers. And so every time they got a customer, they had to have a data scientist sit down and figure out what the right features were and how to engineer them for this customer. It was an intractable problem for them. You know, the person from Wunderman Thompson who presented today said he'd been trying to solve this problem for eight years. AutoAI, plus the data science elite team, solved it for them in two months, and after that two months, it went right into production. So in this case, AutoAI isn't doing the whole pipeline. It's helping them identify the features and engineer the features that are important, and giving them a head start on the model.
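What Seth describes, automated feature selection, model selection, and hyperparameter tuning that still "kicks out" inspectable code, can be sketched with ordinary open-source tools. This is an illustration of the pattern only, not the AutoAI product or its API; the data is synthetic and the candidate models are assumptions:

```python
# Illustrative sketch only -- not the AutoAI API. It mimics the automation
# described above (feature selection, model selection, hyperparameter tuning)
# with plain scikit-learn, so the resulting "recipe" stays fully inspectable.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a wide data set (think thousands of candidate features)
X, y = make_classification(n_samples=500, n_features=40, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_classif)),   # automated feature selection
    ("model", LogisticRegression(max_iter=1000)),
])

# One search over candidate feature counts, model families, and hyperparameters.
search = GridSearchCV(pipe, [
    {"select__k": [5, 10, 20],
     "model": [LogisticRegression(max_iter=1000)],
     "model__C": [0.1, 1.0, 10.0]},
    {"select__k": [5, 10, 20],
     "model": [DecisionTreeClassifier(random_state=0)],
     "model__max_depth": [3, 5, None]},
], cv=5)
search.fit(X_train, y_train)

print(search.best_params_)   # the inspectable, "kicked out" recipe
print(round(search.best_estimator_.score(X_test, y_test), 3))
```

The point of the sketch is the part Seth emphasizes: the winning pipeline is ordinary code a data scientist can open, tweak, and re-fit for the business, rather than a black box.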
>> What's the, uh, what's the acquisition model for AutoAI? Is it a licensed software product? Is it SaaS? >> It's part of Cloud Pak for Data, and it's available on IBM Cloud. So on IBM Cloud you can use it pay-per-use, so you get a license as part of Watson Studio on IBM Cloud. If you invest in Cloud Pak for Data, it could be a perpetual license or a committed-term license, which is essentially SaaS. >> So it's essentially a feature add-on to Cloud Pak for Data. >> It's part of Cloud Pak for Data, and you're >> saying it can be usage-based. So that's key. >> Consumption-based. Cloud Pak for Data is all consumption-based. >> So people want to use AI for competitive advantage. I said in my open that, you know, we're not marching to the cadence of Moore's Law in this industry anymore; it's a combination of data and then cloud for scale. So people want competitive advantage. You've talked about some things that folks are doing to gain that competitive advantage, but at the same time we heard from Rob Thomas that there's only about 4 to 10% penetration for AI. What are the key blockers that you see, and how are you knocking them
Sometimes in some cos it's a lack of strategy with appropriate measurement, right? So what is your A II strategy, and how are you gonna measure success? And you and I have talked about this on Cuban on Cube before, where it's gotta measure your success in dollars and cents right cost savings, net new revenue. That's really all your CFO is care about. That's how you have to be able to measure and monitor your success. >>Yes. Oh, it's so that's that Last one is probably were where most organizations start. Let's prioritize the use cases of the give us the best bang for the buck, and then business guys probably get really excited and say Okay, let's go. But to up to truly operationalize that you gotta worry about these other things. You know, the compliance issues and you gotta have the skill sets. Yeah, it's a scale. >>And sometimes that's actually the first thing you said is sometimes a mistake. So focusing on the one that's got the most bang for the buck is not necessarily the best place to start for a couple of reasons. So one is you may not have the right data. It may not be available. It may not be governed properly. Number one, number two the business that you're building it for, may not be ready to consume it right. They may not be either bought in or the processes need to change so much or something like that, that it's not gonna get used. And you can build the best a I in the world. If it doesn't get used, it creates zero value, right? And so you really want to focus on for the first couple of projects? What are the one that we can deliver the best value, not Sarah, the most value, but the best value in the shortest amount of time and ensure that it gets into production because especially when you're starting off, if you don't show adoption, people are gonna lose interest. >>What are you >>seeing in terms of experimentation now in the customer base? You know, when you talk to buyers and you talk about, you know, you look at the I T. Spending service. 
People are concerned about tariffs, the trade war, the 2020 election; they're being a little bit cautious. But in the last two or three years there's been a lot of experimentation going on, and a big part of that is AI and machine learning. What are you seeing in terms of that experimentation turning into actual production projects that we can learn from, and maybe do some new experiments? >> Yeah, and I think it depends on how you're doing the experiments. There's what I'd call academic experimentation, where you have data scientists and data science teams that work on cool stuff that may or may not have business value and may or may not be implemented. The business isn't really involved; the teams latch on, they do projects, and I think that's actually bad experimentation if you let it run your program. The good experimentation is when you start by having a strategy. You identify the use cases you want to go after, and you experiment by leveraging agile delivery methodologies. You deliver value in two-week sprints, and you can start delivering value quickly. In the case of Wunderman Thompson again: eight weeks, four sprints, and they got value. That was an experiment, but because it was done with agile methodologies, using good coding practices and good up-front design practices, they were able to take it and put it right into production. If you're doing experimentation where you have to rewrite your code at the end, it's a waste of time. >> To your earlier point, the moon shots can often be too risky, and if you blow it on a moon shot, it can set you back years. So you've got to be careful: pick your spots, pick ones that are maybe representative but lower risk.
Apply agile methodologies, get a quick return, learn, develop those skills, and then build up to the moon shot. >> Or you break that moon shot down into consumable pieces. The moon shot may take you two years to get to, but maybe there are sub-components of it that you can deliver in three to four months, and you start delivering those and work up to the moon shot. >> I always like to ask the dogfooding question, though I know you like to call it sipping your own champagne. What have you done internally? When we first met, it was, I think, a snowy day in Boston at the Spark Summit years ago, and you had made a big career switch, which is obviously working out for you. But you were in part brought in to help IBM become data driven internally. How has that gone? What have you learned? And how are you taking that to customers? >> Yeah, so I was hired three years ago now, believe it was that long ago, to lead our internal transformation. Over the last couple of years I got, I don't want to say distracted, but there were really important business things I needed to focus on, like GDPR and helping our customers get up and running with data science, so I built the Data Science Elite team. As of a couple of months ago, I'm back to being almost entirely focused on our internal transformation. And it's really about making sure that we use data and AI to make appropriate decisions. So now we have an app on our phones that leverages Cognos Analytics, where at any point Ginni Rometty or Rob Thomas or Arvind Krishna can pull up what we call EPM, enterprise performance management, and understand where the business is. What did we do in the third quarter, which just wrapped up? What's the pipeline for the fourth quarter? It's at your fingertips. We're also working on revamping our planning cycle.
So today, planning has been done in Excel. We're leveraging Planning Analytics, a great planning and scenario-planning tool that, with the click of a button, really lets you understand how your business can perform in the future and what needs to happen to get it there. We're also looking across all of Cloud and Cognitive Software, which Data and AI sits in. Within each business unit of Cloud and Cognitive Software, the sales teams do a great job of cross-sell and upsell, but there's a huge opportunity in how we cross-sell and upsell across the five different businesses that live inside Cloud and Cognitive Software: Data and AI, Hybrid Cloud Integration, IBM Cloud, Cognitive Applications, and IBM Security. There's a lot of potential interplay that our customers do across there, and we're providing AI that helps the salespeople understand when they can create more value for our customers. >> It's interesting. This is the tenth year of doing theCUBE, and when we first started, it was the beginning of the big data craze, and a lot of people said, okay, here's the disruption, crossing the chasm, innovator's dilemma, all the old stuff going away, all the new stuff coming in. But you mentioned Cognos on mobile, and the thing we learned is that the key ingredient of a data strategy comprises the existing systems. You don't throw those out. Those are the systems of record, the single version of the truth, if you will, that people trusted. It goes back to trust, and all this other stuff built up around it, which kind of created dissonance. And so it sounds like one of the initiatives you've been working on at IBM is bringing in the new pieces while modernizing the existing ones, so that you've got consistent data sets that people can work with.
>> And one of the capabilities that has really enabled this transformation in the last six months, for us internally and for our clients, inside Cloud Pak for Data, is a capability called IBM data virtualization. We have all these independent sources of truth, so to speak, and then all these other data sources that may or may not be as trusted, and we can bring them together literally with the click of a button. You drop your data sources in, and the AI within data virtualization actually identifies keys across the different sources so you can link your data. You look at it, you check it, and it really enables you to do this at scale. All you need to do is point it at the data, give it the IP address of where the data lives, and it will bring it in and help you connect it. >> So you mentioned variances in data quality, and the consumer of the data has to have trust in that data. Can you use machine intelligence and AI to give you a data confidence meter, if you will? >> Yeah, there are two things we use for data confidence. I call it the dodginess factor: understanding what the dodginess factor of the data is. So we definitely leverage AI. If you have a data dictionary and you have metadata, the AI can understand data quality, and it can also look at what your data stewards do and handle some of the remediation of data quality issues. But in Watson Knowledge Catalog, which again is in Cloud Pak for Data, we also have the ability to vote data up and down. As teams use data internally, if there's a data set that had a high data quality score but wasn't really valuable, it will get voted down. Then when you search for data in the system, it sorts the results, kind of like a search on the Internet, and down-ranks that one depending on how many down-votes it got.
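The key-identification idea described above, where the AI in data virtualization spots which columns link two sources, can be illustrated with a toy sketch. This is not IBM's implementation; it is a minimal, hypothetical illustration of one naive approach (ranking column pairs by distinct-value overlap), with invented table data.

```python
# Toy join-key discovery between two data sources: score every column pair
# by Jaccard overlap of distinct values, and propose the best-scoring pair
# as the join key. (Hypothetical data; the real product is far more
# sophisticated than this.)

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def propose_join_keys(table_a, table_b):
    """Return (score, col_a, col_b) triples ranked by value overlap."""
    scores = []
    for col_a, vals_a in table_a.items():
        for col_b, vals_b in table_b.items():
            scores.append((jaccard(vals_a, vals_b), col_a, col_b))
    return sorted(scores, reverse=True)

patients = {                      # pretend this lives in one database
    "patient_id": [101, 102, 103, 104],
    "age": [34, 58, 41, 29],
}
claims = {                        # pretend this lives in another
    "claim_no": [9001, 9002, 9003],
    "member_id": [102, 103, 104],
}

best = propose_join_keys(patients, claims)[0]
print(best)  # highest-overlap pair links patient_id to member_id
```

Here `patient_id` and `member_id` share three of four distinct values, so they surface as the candidate link even though the column names differ, which is the point of automating this step.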
>> So it's a wisdom-of-the-crowd type of thing. >> It's crowdsourcing combined with AI. >> Has that, in your experience, changed the dynamics of politics within organizations? In other words, we've all been in meetings where somebody puts forth some data, and if the most senior person in the room doesn't like the implication, he or she attacks the data source, and then the meeting's over, and it might not be the best decision for the organization. >> So I think it's maybe not the up-voting and down-voting that does that, but things like the EPM tool I mentioned. There is a single source of truth for our finance data, and it's on the phone of everyone who needs access to it. When you have a conversation about how the company or the division or the business unit is performing financially, it comes from EPM, whether it's in the Cognos app, in a separate dashboard in Cognos, or being fed into an AI that we're building. This is the source of truth. Similarly for product data: our individual product performance comes from there. So the conversations at senior leadership meetings are no longer "your data is different from my data, I don't believe it." You've eliminated that conversation. This is the data; this is the only data. Now you can have a conversation about what's really important. >> An adult conversation: okay, now what are we going to do about it? >> It's not bickering about my data versus your data. >> So what's next for you? You've been pulled in a lot of different places. You started at IBM as an internal transformation change agent, and you got pulled into a lot of customer situations, because the sales guys want to drag you along to help facilitate activity with clients. What's new? What's next for you?
>> So really, I've only been refocused on the internal transformation for a couple of months now. It's about extending the data and AI strategy across IBM's Cloud and Cognitive Software and starting to quickly implement some of these projects. Like I just said, we're starting projects without even having the full prioritized list; intuitively, these are important, so the team is going to start working on them. One of them is an AI project around the cross-sell and upsell across the portfolio that I mentioned. The other one we just talked about: in the senior leadership meeting for Cloud and Cognitive Software, how do we all work from a Cognos dashboard instead of data that's been exported and put into Excel? The challenge with Excel is not that people don't trust the data; it's that if there's a question, you can't drill down. If there's a question about an Excel document or a PowerPoint that's up there, you'll get the answer back next meeting, in a month or two weeks, or you'll have an email conversation about it. If it's presented in a live dashboard, you can drill down and actually answer questions in real time. The value of that is immense, because now you as a leadership team can make a decision at that point and decide what direction you're going to go, based on data. >> I said last time that I have one more question. You're a CDO, but you're a polymath. So my question is, what should people look for in a chief data officer? What are the characteristics and attributes, given your experience? >> That's kind of a loaded question, because there is no single good job description for a chief data officer. I think there's a good solid set of skill sets to define for a chief data officer, and actually, as part of the chief data officer summits that you guys attend, we've been having sessions with chief data officers, defining a curriculum for chief data officers with our clients, so that we can help build the chief data officer of the future. But if you look at qualities: a chief data officer is also a chief disruption officer. It needs to be someone who is really good at driving change, really good at disrupting processes and getting people excited about it. Change is hard; people don't like change. You need someone who can get people excited about change. So that's one thing. And depending on what industry you're in: if you're in financial services or a heavily regulated industry, you want someone who understands governance. That's what Gartner and other analysts call a defensive CDO, very governance focused. Then you also have some CDOs, and I fit into this bucket, who are more offensive CDOs: how do you create value from data? How do you save money? How do you create net new revenue? How do you create new business models leveraging data and AI? And now there's a third type of CDO emerging, which is the CDO not as a cost center but as a P&L: how do you generate revenue for the business directly from your CDO office? >> I like that framework. >> I can't take credit for it. That's Gartner. >> Governance, they call it; we call it defensive and offensive. And the first time I met Inderpal, he said, look, you start with how data affects the monetization of my organization, and that means making money or saving money. Seth, thanks so much for coming on. TheCUBE is great to see you again. >> Thanks for having me again. >> All right, keep it right there, everybody. We'll be back at the IBM Data and AI Forum from Miami. You're watching theCUBE.
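The up-vote/down-vote data ranking described in the interview (an automated quality score adjusted by crowd votes, as in the Watson Knowledge Catalog search behavior Seth describes) can be sketched with a toy ranker. The dataset names, scores, and the vote weight here are all invented for illustration; they are not the product's actual formula.

```python
# Toy sketch of crowd-sourced dataset ranking for catalog search: combine an
# automated quality score with user up/down votes, so a "high quality score
# but not valuable" data set gets down-ranked. All values are hypothetical.

def rank_datasets(datasets):
    """Sort datasets by quality score adjusted by net votes."""
    def score(d):
        net_votes = d["up"] - d["down"]
        return d["quality"] + 0.05 * net_votes   # vote weight is arbitrary
    return sorted(datasets, key=score, reverse=True)

catalog = [
    {"name": "claims_2019", "quality": 0.90, "up": 2,  "down": 40},
    {"name": "claims_2020", "quality": 0.85, "up": 30, "down": 1},
    {"name": "members_dim", "quality": 0.70, "up": 5,  "down": 2},
]

for d in rank_datasets(catalog):
    print(d["name"])
# claims_2019 has the highest raw quality score, but heavy down-votes push
# it to the bottom, mirroring the behavior described in the interview.
```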
Rob Thomas, IBM | IBM Data and AI Forum
>> Narrator: Live from Miami, Florida, it's theCUBE, covering the IBM Data and AI Forum, brought to you by IBM. >> Welcome back to the port of Miami, everybody. You're watching theCUBE, the leader in live tech coverage. We're here covering the IBM Data and AI Forum. Rob Thomas is here. He's the general manager for Data and AI at IBM. Great to see you again. >> Great to see you here in Miami. Beautiful week here on the beach. >> It's nice. This is quite an event. I had thought it was going to be roughly 1,000 people, but it's oversold: more than 1,700 people here. This is a learning event, right? People are here to absorb best practices and technical hands-on presentations. Tell us a little bit more about how this event has evolved. >> It started as a really small training event, like you said, which goes back five years. And what we saw was that people weren't looking for the normal kind of conference. They wanted to be hands-on. They wanted to build something. They wanted to come here and leave with something they didn't have when they arrived. So it started as a small builder conference, and now it somehow continues to grow every year, which we're very thankful for, and we continue to expand the sessions. We've had to add hotels this year, so it's really taken off. >> You and your title have two of the three superpowers, data and AI, and of course cloud is the third superpower, which is also part of IBM's portfolio. But people want to apply those superpowers, and you used that metaphor in your keynote today, to really transform their business. You pointed out that AI is only about 4 to 10% penetrated within organizations, but there's a real appetite to learn, isn't there? >> There is. Let's talk about the superpower for a bit. AI does give employees superpowers, because they can do things now that they couldn't do before. But think about superheroes.
They all have an origin story, somewhere they started. And applying AI in an organization is actually not about doing something completely different; it's about accentuating what you already do, doing something that's in your DNA massively better. So we're encouraging all of our clients this week to use the time to understand what you're great at, what your value proposition is, and then how you use AI to accentuate that. Because your superpower is only going to last if it starts with who you are as a company or as a person. >> Who was your favorite superhero as a kid? >> Let's see. I was kind of into the whole Hall of Justice, Superman, that kind of thing. That was probably my cartoon. >> I was a Batman guy, and the reason I love that is the combination of tech, which kind of reminds me of what's happening here today in the marketplace: people are taking data and AI, applying machine intelligence to that data to create new insights they couldn't have before. But to your point, there's an issue with the quality of data, and there's a skills gap as well. So let's start with the data quality problem. Describe that problem and how you guys are attacking it. >> Your AI is only as good as your data. I'd say that's the fundamental problem in organizations we work with: 80% of projects get slowed down or stopped because the company has a data problem. That's why we introduced this idea of the AI Ladder, which is all of the steps a company has to think about to get to a level of data maturity that supports AI: how they collect their data, organize their data, analyze their data, and ultimately begin to infuse AI into business processes. Every organization needs to climb that ladder, and they're all at different spots. For some it might be: we've got to focus on data organization, a data catalog.
For others, it might be: we've got to do a better job of data collection and data management. That's for every organization to figure out, but you need a methodical approach to how you attack the data problem. >> So I want to ask you about the AI Ladder. You have these verbs that overlay on building blocks. I went back to my notes on the original AI Ladder conversation you introduced a while back: it was data and information architecture at the base, then building on that, analytics, machine learning, AI. And now you've added the verbs: collect, organize, analyze, and infuse. Should we think of this as a maturity model, or as building blocks and verbs that you can apply depending on where you are in that maturity model? >> I would think of it as building blocks and a methodology. You've got to decide: do we focus on our data collection and doing that right? Is that our weakness? Or is it data organization? Or is it the sexy stuff, the AI, the data science stuff? This is just a tool to help organizations organize themselves around what's important. I ask every company I visit: do you have a data strategy? You wouldn't believe the looks you get when you ask that question. You get either "well, she's got one, he's got one, so we've got seven," or "no, we've never had one," or "hey, we just hired a CDO, so we hope to have one." We use the AI Ladder as a tool to encourage companies to think about their data strategy. >> I want to follow up on that data strategy point, because you see a lot of tactical data strategies: we use data for this initiative or that initiative, maybe in sales or marketing, or maybe in R&D. Increasingly, should organizations develop a holistic data strategy, or should they just try to get quick wins? What are you seeing in the marketplace? >> It depends on where you are in your maturity cycle.
I do think it behooves every company to say: we understand where we are, and we understand where we want to go. That can be the high-level data strategy: what are our focus and priorities going to be? Once you understand focus and priorities, the best way to get things into production is through a bunch of small experiments, to your point. So I don't think it's an either-or, but I think it's really valuable to have an overarching data strategy. And I recommend companies think about a hub-and-spokes model for this: have a centralized chief data officer, but your business units each need a chief data officer too. So strategy in one place, execution in another. There's a best practice to going about this. >> In the keynote you asked the question, what is AI? You get that question a lot, and you said it's about predicting, automating, and optimizing. Can we unpack that a little bit? What's behind those three items? >> People overreact to hype on topics like AI, and they think, well, I'm not ready for robots, or I'm not ready for self-driving vehicles. Those may or may not happen; I don't know. But with AI, let's think more basic: can we make better predictions about the business? Every company wants to see the future; they want the proverbial crystal ball. AI helps you make better predictions, if you have the data to do that. It helps you automate tasks, automate the things you don't want to do. There's a lot of work that has to happen every day that nobody really wants to do; you use software to automate that. And it's about optimization: how do you optimize processes to drive greater productivity? So this is not black magic. This is not some far-off thing. We're talking about basics: better predictions, better automation, better optimization. >> Interestingly, you used the term black magic, because a lot of AI is black box, and IBM has always made a point of trying to make AI transparent.
You talk a lot about taking the bias out, or at least understanding when bias makes sense and when it doesn't. Talk about the black box problem and how you're addressing it. >> It starts with one simple idea: AI is not magic. I say that over and over again. This is just computer science. Then you have to look at the components inside the proverbial black box. With Watson, we have a few things. We've got tools for clients that want to build their own AI; think of it as a toolbox where you can choose: do you want a hammer? Do you want a screwdriver? Do you want a nail? You go build your own AI using Watson. We also have applications, end-user applications that put AI into practice, things like Watson Assistant to create a virtual agent for customer service, or Watson Discovery, or things like OpenPages with Watson for governance, risk, and compliance. So Watson is about tools, if you want to build your own applications, and applications, if you want to consume one. And we've also got embedded AI capabilities, so you can pick up Watson and put it inside of any software product in the world. >> You also mentioned that Watson was built with a lot of open source components, which a lot of people might not know. What's behind Watson? >> 85% of the work that happens in Watson today is open source. Most people don't know that. It's Python, it's R, it's deploying into TensorFlow. Where we've focused our efforts is on how to make AI easier to use. So we've introduced AutoAI in Watson Studio. If you're building models in Python, you can use AutoAI to automate things like feature engineering and algorithm selection, the kind of thing that's hard for a lot of data scientists. We're not trying to create our own language; we're using open source, but then we make it better so that a data scientist can do their job better. >> So again, come back to AI adoption.
We talked about three things: quality, trust, and skills. We've covered the data quality piece and the black box challenge. Now let's talk about skills. You mentioned there's a 250,000-person gap in data science skills. How are IBM and its customers approaching closing that gap? >> Think of this in basic economic terms: we have a supply-demand mismatch. Massive demand for data scientists, not enough supply. The way we address that is twofold. One is we've created a team called Data Science Elite. They've done a lot of the work for the clients that were on stage with me; they help a client get to their first big win with AI. It's that simple. We go in for four to six weeks. It's an elite team, and it's not a long project; we're going to get you to a success. The second piece is that the other way to solve a demand-supply mismatch is through automation. I talked about AutoAI, but we also do things like using AI for building data catalogs, metadata creation, and data matching. Making that data prep process automated through AI can also help the supply-demand mismatch. So the way you solve this is: we put skills on the field to help clients, and we do a lot of automation in software. That's how we can help clients navigate this. >> The Data Science Elite team: I love that concept. We first picked up on it a couple of years ago, and it's one of the best freebies in the business. But of course you're doing it with the customers you want deeper relationships with, and I'm sure it leads to follow-on business. What are some of the things you're most proud of from the Data Science Elite team that you can share with us? >> The client stories are amazing. I talked in the keynote about origin stories. Royal Bank of Scotland: automating 40% of their customer service, and now customer satisfaction is going up 20% because they put their customer service reps on the hardest problems.
That's the Data Science Elite team helping them get to a first success; now they scale it out on their own. Wunderman Thompson was on stage, part of WPP, the big advertising agency. They're using AI to comb through customer records, and they're using AutoAI. The Data Science Elite team went in for literally four weeks and gave them the confidence that they could do this on their own once we left. We've got countless examples where this team has gone in for very short periods of time, and clients love to talk about it, because they can't believe what this team did. So we're really excited by it. >> The interesting thing about the RBS example to me, Rob, was that you basically applied AI to remove a lot of the mundane tasks that weren't really driving value for the organization, and RBS was able to shift those skill sets to more strategic areas. We always talk about that, but I love the example. Can you talk a little bit more about where that shift was, what they moved from, what they applied it to, and how it impacted the business? I think it was a 20% improvement in NPS. >> RBS realized the inquiries they had coming in fell into two categories: ones that were really easy and ones that were really hard, and they were spreading those equally among their employees. What you get from that is a lot of unhappy customers. Once they said "we can automate all the easy stuff," they could put all of their people on the hardest things, and customer satisfaction shot through the roof. Now, what does a virtual agent do? Let's decompose that a bit. We have a thing called intent classification as part of Watson Assistant: it's a model that understands customer intent, and it's trained on the data from Royal Bank of Scotland. So this model, after 30 days, is not very good. After 90 days, it's really good.
After 180 days, it's excellent, because at the core of this is understanding the intent of the customers engaging with you. We use natural language processing, and it really becomes a virtual agent, done all in software, and you can only do that with things like AI. >> And what's the role of the human element in that? How does it interact with that virtual agent? Is it an attended agent or unattended? What does that look like? >> It's two pieces. For the easiest stuff, no humans are needed; we just do that in software. For the harder stuff, we've now given the RBS customer service agents superpowers, because they've got Watson Assistant at their fingertips. The hardest thing for a customer service agent is finding the right data to solve a problem. Watson Discovery is embedded in Watson Assistant, so they can basically comb through all the data in the bank to answer a question. So on one hand we're augmenting the humans, and in the other case we're just automating the stuff the humans don't want to do in the first place. >> I'm going to shift gears a little bit and talk about Red Hat and OpenShift, obviously a huge acquisition last year, $34 billion, the next chapter in IBM's strategy. You're doing a couple of things with OpenShift. Watson is now available on OpenShift, so you're bringing Watson to the data; I want to talk about that. And Cloud Pak for Data is also on OpenShift. So what has the Red Hat acquisition done for you? You obviously know a lot about M&A, but now you're in the position of having to take advantage of it, and you are. So give us an update on what you're doing there. >> Look at the cloud market for a moment. You've got around $600 billion of opportunity in traditional IT on premise. You've got another $600 billion that's public clouds and dedicated clouds. And you've got about $400 billion that's private cloud.
So the cloud market is fragmented between public, private, and traditional IT. The opportunity we saw was: if we can help clients integrate across all of those clouds, that's a great opportunity for us. What Red Hat OpenShift is, it's a liberator. It says, write your application once, deploy it anywhere, because you build it on Red Hat OpenShift. Now we've brought Cloud Pak for Data, our data platform, onto Red Hat OpenShift, certified on it, and Watson now runs on Red Hat OpenShift. What that means is you can have the best data platform and the best AI, and you can run it on Google, AWS, Azure, or your own private cloud. You get the best AI, with Watson from IBM, and run it in any of those places. >> The reason that's so powerful is because you're able to bring those capabilities to the data without having to move the data around. Jennifer showed an example of that, or maybe it was the other presenter, showing BERT analyzing the data. And the beauty of that is I don't have to move any data. Talk about the importance of not having to move that data, and I want to understand what the client prerequisite is to really take advantage of it. >> This is one of the greatest inventions out of IBM Research in the last 10 years that hasn't gotten a lot of attention: data virtualization, data federation. Traditional federation's been around forever; the issue is it doesn't perform. Our data virtualization performs 500% faster than anything else in the market. What Jennifer showed in that demo was: I'm training a model, and I'm going to virtualize a data set from Redshift on AWS and an on-premise repository, a MySQL database. We don't have to move the data. We just virtualize those data sets into Cloud Pak for Data, and then we can train the model in one place. This is actually breaking down data silos that exist in every organization. And it's really unique.
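To make the "query it where it lives" idea concrete, here is a minimal sketch of a federated join, using SQLite's ATTACH DATABASE as a stand-in for the Redshift and MySQL sources in the demo described above. The table names and figures are invented for illustration; this shows the pattern of joining across stores without copying data into a third system, not IBM's data virtualization engine.

```python
import sqlite3
import tempfile, os

# Two "silos": an on-prem claims store and a separate warehouse of patient records.
tmp = tempfile.mkdtemp()
claims_path = os.path.join(tmp, "claims.db")
warehouse_path = os.path.join(tmp, "warehouse.db")

with sqlite3.connect(claims_path) as db:
    db.execute("CREATE TABLE claims (patient_id INTEGER, amount REAL)")
    db.executemany("INSERT INTO claims VALUES (?, ?)",
                   [(1, 120.0), (1, 80.0), (2, 300.0)])

with sqlite3.connect(warehouse_path) as db:
    db.execute("CREATE TABLE patients (id INTEGER, region TEXT)")
    db.executemany("INSERT INTO patients VALUES (?, ?)",
                   [(1, "northeast"), (2, "midwest")])

# One session "virtualizes" both stores: nothing is copied into a third system,
# the join runs across the attached sources at query time.
con = sqlite3.connect(claims_path)
con.execute("ATTACH DATABASE ? AS wh", (warehouse_path,))
rows = con.execute("""
    SELECT p.region, SUM(c.amount)
    FROM claims AS c JOIN wh.patients AS p ON p.id = c.patient_id
    GROUP BY p.region ORDER BY p.region
""").fetchall()
print(rows)  # [('midwest', 300.0), ('northeast', 200.0)]
```

The design point is the same one Rob makes: the consuming query sees one logical schema, while each data set stays in the store that owns it.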
>> It was a very cool demo, because what she did is she was pulling data from different data stores and doing joins. It was a healthcare application, really trying to understand where the bias was, peeling the onion, right? There is bias, and sometimes bias is okay; you just have to know whether or not it's actionable. And that was very cool without having to move any of the data. What is the prerequisite for clients? What do they have to do to take advantage of this? >> Start using Cloud Pak for Data. We've got something on the web called Cloud Pak Experiences. Anybody can go try this in less than two minutes; I just say go try it, because Cloud Pak for Data will just insert right onto any public cloud you're running or into your private cloud environment. You just point to the sources, and it will instantly begin to create what we call schema folding: a skinny version of the schema from your source, right in Cloud Pak for Data. This is like instant access to your data. >> It sounds like magic. Okay, last question. One of the big takeaways you want people to leave this event with? >> We are trying to inspire clients to give AI a shot. Adoption is 4 to 10% for what is the largest economic opportunity we will ever see in our lives. That's not an acceptable rate of adoption. So we're encouraging everybody: go try things. Don't do one AI experiment; do a hundred AI experiments in the next year. If you do 100, 50 of them probably won't work. This is where you have to change the culture. The iterative mindset comes into it: be prepared that half of them aren't going to work. But then for the 50 that do work, you double down, then you triple down. Everybody will be successful with AI if they have this iterative mindset. >> And with cloud it's very inexpensive to actually do those experiments. Rob Thomas, thanks so much for coming on theCUBE. Great to see you. Great to see you. All right, keep it right there, everybody. We'll be back with our next guest.
Right after this short break, we'll hear more from Miami at the IBM Data and AI Forum. Right back.
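The easy-versus-hard routing described in the RBS story above can be sketched as a toy intent classifier and router: score an inquiry against keyword sets per intent, automate the easy intents, and hand the hard ones to a human agent along with the detected intent. The intents, keywords, and routing rule here are invented for illustration and are not how Watson Assistant is actually implemented; a production system would train a statistical model on real chat logs.

```python
# Toy intent router in the spirit of the RBS example.
INTENT_KEYWORDS = {
    "check_balance":  {"balance", "account", "how", "much"},
    "reset_password": {"password", "reset", "login", "locked"},
    "dispute_charge": {"dispute", "charge", "fraud", "unauthorized"},
}
EASY_INTENTS = {"check_balance", "reset_password"}  # safe to automate

def classify(text):
    """Return the intent whose keyword set best overlaps the inquiry."""
    words = set(text.lower().split())
    scores = {intent: len(words & kws) for intent, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

def route(text):
    intent = classify(text)
    if intent in EASY_INTENTS:
        return ("bot", intent)    # virtual agent handles it entirely in software
    return ("human", intent)      # agent gets the inquiry, with intent attached

print(route("how much is my account balance"))            # ('bot', 'check_balance')
print(route("I want to dispute an unauthorized charge"))  # ('human', 'dispute_charge')
```

The payoff matches the anecdote: the automated path absorbs the high-volume easy traffic, while humans only ever see the hard cases, pre-labeled with intent.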
SUMMARY :
IBM is data in a I forum brought to you by IBM. We're here covering the IBM data and a I form. Great to see you here in Miami. I mean, people here, they're here to absorb best practice, It started as a really small training event, like you said, which goes back five years. and you use that metaphor in your your keynote today to really transform their business. the time to understand what you're great at, what your value proposition I was kind of into the whole Hall of Justice. quality problem described that problem and how are you guys attacking it? But you need a methodical approach to how you attack the data problem. So I wanna ask you about the Aye aye ladder so you could have these verbs, the verbs overlay So we got seven or you get No, we've never had one. What are you seeing in the marketplace? It depends on where you are in your maturity cycle. the next you ask the question. There's a lot of work that has to happen every day that nobody really wants to do you software to automate that there's Talk about the black box problem and how you're addressing. Aye, aye, to think of it as a tool box you He also mentioned that Watson was built with a lot of of of, of open source components, What we've done, where we focused our efforts, is how do you make a I easier to use? We talked about the data quality piece we talked about the black box, you know, challenge. It's not a long project we're gonna get you do for your success. it with the customers that you want to have deeper relationships with, and I'm sure it leads toe follow on have to talk about it cause they're like, we can't believe what this team did. interesting thing about the RVs example to me, Rob was that you basically applied So what you get is a lot of unhappy customers. What is that like? So for the easiest stuff no humans needed, we just go do that in software for And you are taking advantage of this. 
What that means is you And so the beauty of that is I don't have to move any any data, talk about the importance of not having of the greatest inventions out of IBM research in the last 10 years, that hasn't gotten a lot attention, What is the prerequisite for clients? This is like instant access to your data. One of the big takeaways You want people This is where you have to change the cultural idea. The Cuban great to see you.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Miami | LOCATION | 0.99+ |
Jennifer | PERSON | 0.99+ |
4 | QUANTITY | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Rob Thomas | PERSON | 0.99+ |
20% | QUANTITY | 0.99+ |
Royal Bank of Scotland | ORGANIZATION | 0.99+ |
40% | QUANTITY | 0.99+ |
Python | TITLE | 0.99+ |
IBMs | ORGANIZATION | 0.99+ |
$34 billion | QUANTITY | 0.99+ |
seven | QUANTITY | 0.99+ |
Rob | PERSON | 0.99+ |
Eight | QUANTITY | 0.99+ |
two pieces | QUANTITY | 0.99+ |
python | TITLE | 0.99+ |
two categories | QUANTITY | 0.99+ |
250,000 person | QUANTITY | 0.99+ |
500% | QUANTITY | 0.99+ |
two | QUANTITY | 0.99+ |
four weeks | QUANTITY | 0.99+ |
less than two minutes | QUANTITY | 0.99+ |
Second piece | QUANTITY | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
last year | DATE | 0.99+ |
Miami, Florida | LOCATION | 0.99+ |
ORGANIZATION | 0.99+ | |
Max. | PERSON | 0.99+ |
Roll Bank of Scotland | ORGANIZATION | 0.99+ |
one | QUANTITY | 0.99+ |
next year | DATE | 0.99+ |
One | QUANTITY | 0.99+ |
10% | QUANTITY | 0.99+ |
Data Thio | ORGANIZATION | 0.99+ |
Red | ORGANIZATION | 0.99+ |
6 weeks | QUANTITY | 0.99+ |
52 | QUANTITY | 0.98+ |
600 billion | QUANTITY | 0.98+ |
Watson | TITLE | 0.98+ |
Wonderman Thompson | ORGANIZATION | 0.98+ |
one simple idea | QUANTITY | 0.98+ |
More than 1700 people | QUANTITY | 0.98+ |
today | DATE | 0.98+ |
Batman | PERSON | 0.98+ |
about 400 billion | QUANTITY | 0.97+ |
first | QUANTITY | 0.97+ |
IBM Data | ORGANIZATION | 0.97+ |
100 | QUANTITY | 0.97+ |
this year | DATE | 0.97+ |
around $600 million | QUANTITY | 0.97+ |
this week | DATE | 0.96+ |
third superpower | QUANTITY | 0.96+ |
Burt | PERSON | 0.96+ |
red | ORGANIZATION | 0.96+ |
three things | QUANTITY | 0.96+ |
17 | QUANTITY | 0.95+ |
Hall of Justice | TITLE | 0.94+ |
Superman | PERSON | 0.94+ |
three superpowers | QUANTITY | 0.94+ |
cloudpack | TITLE | 0.94+ |
Azure | ORGANIZATION | 0.94+ |
five years | QUANTITY | 0.93+ |
couple of years ago | DATE | 0.92+ |
80% | QUANTITY | 0.91+ |
1000 people | QUANTITY | 0.9+ |
Keynote Analysis | IBM Data and AI Forum
>> Live from Miami, Florida. It's theCUBE, covering IBM's Data and AI Forum, brought to you by IBM. >> Welcome, everybody, to the port of Miami. My name is Dave Vellante and you're watching theCUBE, the leader in live tech coverage. We go out to the events and extract the signal from the noise, and we're here at the IBM Data and AI Forum. The hashtag is Data AI Forum. This is the IBM event formerly known as the IBM Analytics University. It's a combination of learning and peer networking, and the focus is really on AI and data. There are about 1,700 people here, up from about half of that last year when it was the IBM Analytics University: about 600 customers and a few hundred partners. There's press here, there are analysts, and of course theCUBE is covering this event. We'll be here for one day: 128 sessions, 35 hands-on labs. As I say, a lot of learning, a lot of technical discussions, a lot of best practices. What's happening here? For decades, our industry has marched to the cadence of Moore's Law: the idea that you could double processor performance every 18 months by doubling the number of transistors within the same footprint. That's no longer what's driving innovation in the IT and technology industry today. It's a combination of data, machine intelligence applied to that data, and cloud. We've always talked about all this data that we've collected, and over the past 10 years, with the advent of lower-cost warehousing technologies and file stores like Hadoop, with activity going on at the edge, and with new databases and lower-cost data stores that can handle unstructured as well as structured data, we've amassed this huge amount of data that's growing at a nonlinear rate. The curve is steepening; it's exponential.
So there's all this data, and applying machine intelligence, artificial intelligence with machine learning, to that data is the blending of a new cocktail. And the third leg of that stool is the cloud. Why is the cloud important? It's important for several reasons. One is that's where a lot of the data lives. It's also where agility lives: cloud-native, DevOps, and being able to spin up infrastructure as code really started in the cloud, and it's slowly seeping into on-prem, hybrid, and multi-cloud architectures. But cloud gives you not only that data access and agility, but also scale, global scale. You can test things out very cheaply. You can experiment very cheaply with cloud, data, and AI. And once your POC is set and you know it's going to give you the business value and outcomes you want, you can then scale it globally. That's really what cloud brings. So, this forum here today, with the big keynotes: Rob Thomas kicked it off. Actually, take that back. A gentleman named Ray Zahab, an adventurer and ultramarathoner, kicked it off. This guy one time ran 4,500 miles in 111 days with two ultramarathoner colleagues. They had no days off, they traveled through six countries, they traversed the African continent, and he took two showers in 111 days. His whole message was about the power of human beings and the will of humans to rise above any challenge, with no limits. So that was the tone that was set for this conference. Then Rob Thomas came in and invoked the metaphor of superheroes and superpowers, AI and data of course being two of those three superpowers that I talked about, in addition to cloud. So Rob talked about eliminating the good to find the great. He talked about some of the experiences with Disney's Ward Kimball.
Ward Kimball went to Walt Disney with this amazing animation, and Walt said: I love it. It was so funny, it was so beautiful, it was so amazing. You worked 283 days on this. I'm cutting it out. So Rob talked about cutting out the good to find the great. He also talked about how AI has penetrated only about four to 10% of organizations. Why is that? Why is it so low? He said there are three blockers. One is data, and he specifically was referring to data quality. The second is trust, and the third is skill sets. He then, of course, dovetailed a bunch of IBM products and capabilities into those blockers, those challenges. He talked about two in particular. IBM Cloud Pak for Data is a way to virtualize data across different clouds, on prem and hybrid: basically being able to pull different data stores in, virtualize them, combine and join data, and be able to act on it and apply machine learning and AI to it. And then AutoAI is basically machine intelligence applied to artificial intelligence; in other words, AI for AI. What's an example? How do I choose the right algorithm, the best fit for the use case I'm working on? Let machines do that: they can have models that are trained to actually find the best fit. So we talked about that, and there was a customer panel: Miami-Dade County, Wunderman Thompson, and the Standard Bank of South Africa. These are incumbents that are using machine intelligence and AI to try to supercharge their businesses. We heard a use case from the Royal Bank of Scotland, basically applying AI and driving their net promoter score. So we'll talk some more about that.
And we're going to be here all day today, interviewing executives from IBM, talking about what customers are doing with AI, and getting feedback from the analysts. So this is what we do. Keep it right there, buddy. We're in Miami all day long. This is Dave Vellante. You're watching theCUBE. We'll be right back after this short break.
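The AutoAI idea mentioned in the keynote, letting the machine choose the best-fitting model rather than a human, can be sketched as a tiny model-selection loop: fit several candidate models, score each on held-out data, and keep the winner. The candidate models and data below are invented stand-ins for illustration; a real automated search would also tune hyperparameters and feature pipelines.

```python
# Miniature "let the machine pick the model" loop.
def fit_mean(xs, ys):
    m = sum(ys) / len(ys)
    return lambda x: m               # baseline: always predict the mean

def fit_linear(xs, ys):
    # Ordinary least squares for y = a*x + b, done by hand.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return lambda x: a * x + b

def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Training data roughly follows y = 2x + 1; validation data is held out.
train_x, train_y = [0, 1, 2, 3], [1.0, 3.1, 4.9, 7.0]
val_x, val_y = [4, 5], [9.0, 11.1]

candidates = {"mean": fit_mean, "linear": fit_linear}
scores = {name: mse(fit(train_x, train_y), val_x, val_y)
          for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(best)  # linear -- the search picked the better-fitting candidate
```

Scoring on held-out data rather than the training data is what keeps the selection honest: a model that merely memorized the training points would lose here.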
SUMMARY :
IBM's data and AI forum brought to you by IBM. It's a combination of learning peer network and really the focus is doubling the number of transistors, you know, within, uh, the footprint that's in the cloud and it's sort of seeping to to on prem, slowly and hybrid and multi-cloud, really talking about the power of human beings, uh, and, and the will of humans So Rob talked about cutting out the good to find, and that's the best fit for the use case that I'm using.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Ray Zahab | PERSON | 0.99+ |
Miami | LOCATION | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Rob Thomas | PERSON | 0.99+ |
Dave Olanta | PERSON | 0.99+ |
4,500 miles | QUANTITY | 0.99+ |
35 hands | QUANTITY | 0.99+ |
Stanley | PERSON | 0.99+ |
two | QUANTITY | 0.99+ |
six countries | QUANTITY | 0.99+ |
128 hands | QUANTITY | 0.99+ |
111 days | QUANTITY | 0.99+ |
Walter | PERSON | 0.99+ |
Rob | PERSON | 0.99+ |
Africa | LOCATION | 0.99+ |
Jude | PERSON | 0.99+ |
one day | QUANTITY | 0.99+ |
283 days | QUANTITY | 0.99+ |
third piece | QUANTITY | 0.99+ |
Miami, Florida | LOCATION | 0.99+ |
Wunderman Thompson | ORGANIZATION | 0.99+ |
Royal bank of Scotland | ORGANIZATION | 0.99+ |
One | QUANTITY | 0.99+ |
third | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
second | QUANTITY | 0.98+ |
last year | DATE | 0.98+ |
about 600 customers | QUANTITY | 0.98+ |
third leg | QUANTITY | 0.98+ |
South Africa | LOCATION | 0.97+ |
one time | QUANTITY | 0.97+ |
three things | QUANTITY | 0.96+ |
IBM Data | ORGANIZATION | 0.96+ |
about 1700 people | QUANTITY | 0.96+ |
three superpowers | QUANTITY | 0.96+ |
two ultra marathon | QUANTITY | 0.95+ |
Kimball | PERSON | 0.95+ |
two showers | QUANTITY | 0.94+ |
10% | QUANTITY | 0.94+ |
about four | QUANTITY | 0.88+ |
IBM analytics university | ORGANIZATION | 0.86+ |
Miami Dade County | LOCATION | 0.8+ |
18 months | QUANTITY | 0.78+ |
hundred partners | QUANTITY | 0.76+ |
decades | QUANTITY | 0.74+ |
university | ORGANIZATION | 0.73+ |
ward | PERSON | 0.69+ |
Disney | ORGANIZATION | 0.69+ |
Hadoop | TITLE | 0.67+ |
Moore | PERSON | 0.6+ |
years | DATE | 0.59+ |
Walt | PERSON | 0.58+ |
Disney | PERSON | 0.5+ |
10 | QUANTITY | 0.46+ |
half | QUANTITY | 0.4+ |
past | DATE | 0.39+ |
Survey Data Shows Momentum for IBM Red Hat But Questions Remain
>> From the SiliconANGLE Media office in Boston, Massachusetts, it's theCUBE! (upbeat electronic music) Now, here's your host, Dave Vellante. >> Hi, everybody, this is Dave Vellante, and I want to share with you some recent survey data that speaks to the IBM acquisition of Red Hat, which closed today. It's always really valuable to go out, talk to practitioners, and see what they're doing, and it's a hard thing to do. It's very expensive to get this type of survey data, and a lot of times it's very much out of date. Some of you might remember a company called the InfoPro. Its founder and CEO was Ken Male, and he raised some money from Gideon Gartner, and he had this awesome survey panel. Well, somehow it failed. Friends of mine at ETR, Enterprise Technology Research, have basically created a modern version of the InfoPro. It's the InfoPro on steroids, with a modern interface and data science behind it. They've now been at this for 10 years. They've built a panel of 4,500 users, practitioners that they can go to: a lot of C-level folks, a lot of VP-level people, and some doers down at the engineering level. They periodically survey these folks, and one of the surveys they did back in October was: what do you think of the IBM-Red Hat acquisition? They've then periodically gone out and talked to customers of Red Hat, IBM, or both to get a sense of the sentiment. So given that the acquisition closed today, we wanted to share some of that data with you, and our friends at ETR shared some of their drill-down data with us. First of all, I want to summarize something that they said back in October: "We view this acquisition as less of an attempt by IBM to climb into the cloud game, cloud relevance, but rather a strategic opportunity to reboot IBM's early 1990s IT services business strategy." I couldn't agree with that more.
I've said all along this is a services play, connecting OpenShift from Red Hat into what Ginni Rometty talks about as the 80% of the install base that is still on prem, with the workloads at the backend, mission-critical systems that need to be modernized. That's IBM's opportunity. That's why this is a front-end-loaded cash flow deal, because IBM can immediately start doing business through its services organization and generate cash. ETR went on to say: "Here, IBM could position itself as the de facto IT services partner for Fortune 100 to Global 2000 organizations and their digital transformations. Therefore, in theory, this could reinvigorate the global services business for IBM, and their overlapping customer bases could allow IBM to recapture and accelerate a great deal of service revenues that they have lost over the past few years." Again, I couldn't agree more. It's less about a cloud play. It is definitely about a multi-cloud play, which is how IBM is positioning this, but services de-risk this entire acquisition, in my opinion, even though it's very large: $34 billion. Okay, I'll show you some data, so pull up this slide. What ETR does is they'll go out; this is a survey, taken right after the acquisition, of about 132 Global 2000 practitioners across a bunch of different industries: energy, utilities, financial services, government, healthcare, IT, telco, retail and consumer, so a nice cross section of industries, largely in North America but with a healthy cross section of EMEA and APAC. And again, these are large enterprises. What this slide shows is conditioned responses, which I love; conditioned responses sort of force people to answer which of the following best describes their view. This one says: "Given IBM's intent to acquire Red Hat, do you believe your organization will be more likely to use this new combination, or less likely, in your digital transformation?"
You can see here on the left-hand side, in green, 23% positive, and on the right-hand side, 13% negative. So the data doesn't necessarily support ETR's original conclusions, or my belief that this is all about services momentum, because most IT people are going to wait and see. You can see the fat middle there is 64%, basically saying, "Yeah, we're going to wait and see; this really doesn't change anything." But nonetheless, you see meaningfully more positive sentiment than negative sentiment. The bottom half of this slide asks: "Do you believe that this acquisition makes or will make IBM a legitimate competitor in the cloud wars between AWS and Microsoft Azure?" You can see on the left-hand side it says 45% positive. Very few say, all the way on the left, that IBM is a very legitimate player in the cloud on par with AWS and Azure; I don't believe that's the case. But a majority said IBM is surely better off with Red Hat than without Red Hat in the context of cloud. Again, I would agree with that. While I think this is largely a services play, it's also, as Stu Miniman pointed out in an earlier video with me, a cloud play. And you can see it's still 38% negative on the right-hand side: 15% say absolutely not, IBM is far behind AWS and Azure in cloud. I would tend to agree with that, but IBM is different. They're trying to bring together their entire software portfolio so they have a competitive approach; they're not trying to take Azure and AWS head on. So you see 38% negative, 45% positive. Now, what the survey didn't do is really talk to multi-cloud. This, to me, puts IBM at the forefront of multi-cloud, right in there with VMware. You've got IBM-Red Hat, Google with Anthos, Cisco coming at it from a network perspective, and of course Microsoft leveraging its large estate of software. So maybe next time we can poke at the multi-cloud angle. Now, that survey was done of about 157 people in the Global 2000. Sorry, I apologize.
That was 137. The next chart that I'm going to show you is a sentiment chart that took a pulse periodically, of 157 IT buyers: C-level executives, VPs, and practitioners. What this chart shows, essentially, is the spending intentions for Red Hat over time. The green bars are really about the adoption rates, and you can see they fluctuate; the percentage is on the left-hand side, and time is on the horizontal axis. The red is the replacement: we're going to replace, we're not going to buy. In the middle is that fat middle: we're going to stay flat. The yellow line is essentially what ETR calls market share; it's really an indication of mind share, in my opinion. And then the blue line is the spending-intentions net score. So what does that mean? What that means is they basically set aside the gray, which is staying the same, subtract out the red, which is we're doing less, and add in the green, we're going to do more. So what does this data show? Let's focus on the blue line. You can see it slightly declining, then pretty significantly declining last summer, maybe because people spend less in the summer, and then really dropping coming into the announcement of the acquisition. In October of 2018, IBM announced the $34 billion acquisition of Red Hat. Look at the spike post-announcement: the sentiment went way up. You have a meaningful jump. Now, you see a little dip in the April survey, and again, that might've been just an attenuation of the enthusiasm. July is going on right now, so that's why it's phased out, but we'll come back and check that data later. And then you can see a similar trend with what they call market share, which, to me, is again really mind share and sentiment. You can see the significant uptick in momentum coming out of the announcement. So people are generally pretty enthusiastic.
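The net score arithmetic just described can be sketched in a few lines: the share of respondents planning to spend more, minus the share planning to spend less, with the flat ("gray") responses counting in the denominator but canceling out of the score. The sample responses are invented for illustration; this shows the calculation, not ETR's actual tooling.

```python
# Net score as percentage points: % "more" minus % "less".
def net_score(responses):
    """responses: iterable of 'more' | 'flat' | 'less' survey answers."""
    responses = list(responses)
    n = len(responses)
    more = responses.count("more") / n
    less = responses.count("less") / n
    return round(100 * (more - less))

survey = ["more"] * 40 + ["flat"] * 45 + ["less"] * 15
print(net_score(survey))  # 25  (40% more - 15% less)
```

Note why the flat middle still matters: a 64% wait-and-see block shrinks both the positive and negative shares, so the same raw counts of enthusiasts and detractors produce a more muted score.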
Again, remember, these are customers of IBM, customers of Red Hat, and customers of both. Now, let's see what the practitioners said. Let's go to some of the open-endeds. What I love about ETR is they don't just do the hardcore data; they actually ask people open-ended questions. So let's put this slide up and share some of the drill-down statements that I thought were quite relevant. The first one is right on: "Assuming IBM does not try to increase subscription costs for RHEL" (Red Hat Enterprise Linux), "then its organizational issues over sales and support should go away. This should fix an issue where enterprises were moving away from RHEL to lower-cost alternatives, with significant movement to other vendors. This, plus IBM's purchase of SoftLayer and deployment of Cloud Foundry, will make it harder for Fortune 1000 companies to move away from IBM." So there are a lot of implied things in there. The first thing I want to mention is that IBM has a nasty habit, when it buys companies, particularly software companies, of raising prices. You certainly saw this with SPSS. You saw this with smaller acquisitions like Ustream; Cognos customers complained about it too. IBM buys software companies with large install bases, it's got a lock-in aspect, and it'll raise prices. It works, because financially it's clearly worked for IBM, but it sometimes ticks off customers. So IBM has said it's going to keep Red Hat separate; let's see what it does from a pricing standpoint. The next comment here is kind of interesting: "IBM has been trying hard to transition to a cloud-service model. However, its transition has not been successful, even in the private-cloud domain." So basically these folks are saying something that I've just said: IBM's cloud strategy essentially failed to meet its expectations. That's why it had to go out and spend $34 billion on Red Hat.
While it's certainly transformed IBM in some respects, IBM is still largely a services company, not as competitive in cloud as it would've liked. So this respondent says, "let alone in this fiercely competitive public cloud domain." They're not number one. "One of the reasons, probably the most important one, is IBM itself does not have a cloud OS product. So acquiring Red Hat will give IBM some competitive advantage going forward." Interesting comments. Let's take a look at some of the other ones. I think this is right on, too: "I don't think IBM's goal is to challenge AWS or Azure directly." 100% agree. That's why they got rid of the low-end Intel business: because they're not trying to be in the commodity businesses. They cannot compete with AWS and Azure on the cost structure of cloud infrastructure. No way. "It's more to go after hybrid multi-cloud." Ginni Rometty said today at the announcement, "We're the only hybrid, multi-cloud, open source vendor out there." Now, the third piece of that, open source, I think is less important than competing in hybrid and multi-cloud. Clearly Red Hat gives IBM a better position to do this with CoreOS and CentOS. And so, is it worth $34 billion? This individual thinks it is: a vice president of a financial insurance organization, again, IBM's wheelhouse. So you can hear some of the other comments here: "For customers doing significant business with IBM Global Services teams." Again, outsourcing; it's a 10-plus billion dollar opportunity for IBM to monetize over the next five years, in my opinion. "This acquisition could help IBM drive some of those customers toward a multi-cloud strategy that also includes IBM's cloud."
Yes, it's very much a play that will integrate services, Red Hat Linux, OpenShift, and of course IBM's cloud; sprinkle in a little Watson; throw in some hardware, since IBM has a captive channel, so the storage guys and the server guys can sell their hardware in there if the customer doesn't care. It's a big integrated services play. "Positioning Red Hat, and empowering them across legacy IBM silos, will determine if this works." Again, I couldn't agree more. These are very insightful comments. This is largely a services and integration play. Hybrid cloud and multi-cloud are complex, and IBM loves complexity. IBM's services organization is number one in the industry. Red Hat gives it an ingredient that it didn't have before, other than as a partner; IBM now owns that intellectual property and can really go hard and lean into that services opportunity. Okay, so thanks to our friends at Enterprise Technology Research for sharing that data, and thank you for watching theCUBE. This is Dave Vellante signing off for now. Talk to you soon. (upbeat electronic music)
SUMMARY :
From the SiliconANGLE Media office and it's kind of the percentage on left hand side
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
IBM | ORGANIZATION | 0.99+ |
Ginni Rometty | PERSON | 0.99+ |
October of 2018 | DATE | 0.99+ |
Ken Male | PERSON | 0.99+ |
APAC | ORGANIZATION | 0.99+ |
Cisco | ORGANIZATION | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
Stu Miniman | PERSON | 0.99+ |
Red Hat | ORGANIZATION | 0.99+ |
October | DATE | 0.99+ |
InfoPro | ORGANIZATION | 0.99+ |
AMIA | ORGANIZATION | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
$34 billion | QUANTITY | 0.99+ |
ETR | ORGANIZATION | 0.99+ |
April | DATE | 0.99+ |
45% | QUANTITY | 0.99+ |
10 years | QUANTITY | 0.99+ |
64% | QUANTITY | 0.99+ |
July | DATE | 0.99+ |
38% | QUANTITY | 0.99+ |
Enterprise Technology Research | ORGANIZATION | 0.99+ |
North America | LOCATION | 0.99+ |
4,500 users | QUANTITY | 0.99+ |
13% | QUANTITY | 0.99+ |
80% | QUANTITY | 0.99+ |
34 billion | QUANTITY | 0.99+ |
100% | QUANTITY | 0.99+ |
15% | QUANTITY | 0.99+ |
RHEL | TITLE | 0.99+ |
third piece | QUANTITY | 0.99+ |
ORGANIZATION | 0.99+ | |
Ustream | ORGANIZATION | 0.99+ |
23% | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
both | QUANTITY | 0.99+ |
first thing | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
Charlie Kwon, IBM | Actifio Data Driven 2019
>> from Boston, Massachusetts. It's the queue covering active eo 2019. Data driven you by activity. >> Welcome back to Boston. Everybody watching the Cube, the leader and on the ground tech coverage. My name is David Locke. They still minimus here. John Barrier is also in the house. We're covering the active FIO data driven 19 event. Second year for this conference. It's all about data. It's all about being data driven. Charlie Quanis here. He's the director of data and a I offering management and IBM. Charlie, thanks for coming on The Cube. >> Happy to be here. Thank you. >> So active Theo has had a long history with IBM. Effect with company got started at a time the marketplace took a virtual ization product and allowed them to be be first really and then get heavily into the data virtualization. They since evolved that you guys are doing a lot of partnerships together. We're going to get into that, But talk about your role with an IBM and you know, what is this data and a I offering management thing? >> He absolutely eso data and a I is our business unit within IBN Overall Corporation, our focus and our mission is really about helping our customers drive better business outcomes through data. Leveraging data in the contacts and the pursuit of analytics and artificial intelligence are augmented intelligence. >> So >> a portion of the business that I'm part of his unified governance and integration and you think about data and I as a whole, you could think about it in the context of the latter day. I often times when we talk about data and I we talk about the foundational principles and capabilities that are required to help companies and our customers progress on their journey. They II and it really is about the information architecture that we help them build. That information architectures essentially a foundational prerequisite around that journey to a i. R. Analytics and those layers of the latter day I r. 
Collecting the data and making sure you have it easily accessible to the individuals who need it; organizing the data, which is where the Unified Governance and Integration portfolio comes into play, building trusted, business-ready, high-quality data with governance around it, making sure it's available to be used; the analyze layer, leveraging the data for analytics and AI; and then infuse, leveraging those models across the organization. So within that context of Data and AI, we partnered with Actifio at the end of 2018. >> So before we get into that, I think it started with Rob Thomas, and I want to double-click on what you just said. Rob Thomas is famous for saying there is no AI without IA: no artificial intelligence without information architecture. So, sounds good. You talk about governance, and that's obviously part of it. But what does that mean, no AI without IA? >> So it is really about the fundamental prerequisites, being able to have the underlying infrastructure around the data assets that you have. A fundamental tenet is that data is one of your tremendous assets in any enterprise. A lot of time and effort has been spent, and man-hours invested, in collecting the data and making sure it's available, but at the same time it hasn't been freed up to be deployed and used for downstream purposes, whether those are operational use cases or analytical use cases. The information architecture is really about how you frame your data strategy so that you have that data available to use, and to drive business outcomes later. Those business outcomes may be the result of insights driven out of the data, but the data could also be part of the pipeline that feeds things like application development or test data management, and that's one of the areas where we're working with Actifio.
>> So the information architecture is a framework that you guys essentially publish and communicate to your clients. It doesn't require that you have IBM products plugged in, but of course you can certainly plug IBM products in. If you're smart enough to develop the information architecture, presumably you've got to show where your products fit, and you're going to sell more stuff, but it's not a prerequisite. I could use other tooling if I wanted to go there. The framework is a good... >> A prerequisite; the products themselves, of course, are not, right. But the framework is a good foundational construct around how you can think about it, so that you can progress along that journey. >> Right. You started talking about Actifio and your relationship there. That created the InfoSphere Virtual Data Pipeline, right? Why did you develop that product? Or we'll get into it. >> Sure. It's all part of our overall Unified Governance and Integration portfolio. Like I said, that's the organize layer of the ladder to AI that I was referring to, and it's all about making sure you have clear visibility into the data assets that you have. We always talk about it in terms of know, trust, and use. Know the data assets you have: make sure you understand the data quality and the classification around the data. Trust the data: understand the lineage, understand how it's been touched and transformed, building a catalog around that data. And then use: make sure it's usable to downstream applications and downstream individuals. The Virtual Data Pipeline offering really helps us on that last category, around using and making use of the data assets that you have, putting them directly into the hands of the users of that data, whether they be data scientists and data engineers, or application developers and testers.
So the Virtual Data Pipeline, and the capabilities based on the Actifio Sky virtual appliance, really help build snapshot data and provide the self-service user interface to get it into the hands of application developers and testers, or data engineers and data scientists. >> And why is that important? Is it because they're actually using the same, or substantially similar, data sets across their work streams? Maybe you could explain why it's important. >> Because the speed at which applications are being built and insights are being driven requires a lot more agility, and the ability to self-service into the data that you need. The traditional challenge that we see is, when you think about preparing to build an application or an AI model, building it, deploying it, and managing it, the majority of the time, 80% of the time, is spent up front preparing the data: trying to figure out what data you need, asking for it and waiting for two weeks to two months to try to get access to it, getting it and then realizing, oh, I've got the wrong data, I need to supplement it, I need to do another iteration of the model, going back to try to get more data. That's the area that application developers and data scientists don't necessarily want to be spending their time on. And so we're trying to shrink that timeframe. How do we shrink it? By providing business users, our line-of-business users, data scientists, and application developers, the individuals that are actually using the data, with their own access to it, right? To be able to get that snapshot, that point-in-time access to production data, to be able to then infuse it into their development process, their testing process, or the analytic development process. >> So where does traditional tooling fit in this sort of new world? Because you remember when Hadoop came out.
It was like, oh, the enterprise data warehouse is dead. And then you'd ask customers, what's one of the most important things you're doing in your big data play? And they'd say, oh yeah, we need our EDW, so I can now collect more data at lower cost and keep it longer, all that stuff. But the traditional EDW was still critical. Well, you were just describing, you know, building a cube. You guys own Cognos, obviously one of the biggest acquisitions that IBM has made, and a critical component. You talk about data quality, integration, those things. How does the whole puzzle fit together in this larger mosaic? Help us understand that. >> Sure. Well, one of the fundamental things to understand is that you have to know what you have, right? And the data catalog is a critical component of that data strategy: understanding where your enterprise assets sit. They could be structured information, or they may be unstructured information sitting in file repositories or emails, for example. But understanding what you have, understanding how it's been touched, how it's been used, understanding the requirements and limitations around that data, understanding who the owners of that data are: building that catalog view of your overall enterprise assets is the fundamental starting point from a governance standpoint. And then from there, you can allow access to individuals that are interested in understanding and leveraging the data assets that you may have. One challenge here is that data exists everywhere across the enterprise, right? Silos that may have arisen in one particular department then get merged in with another department, and then you have two organizations that may not even know what the other has. So the challenge is to break down those silos and get clarity and visibility around what assets exist, so that individuals can then leverage that data for whatever uses they may have, whether it be development or testing or analytics.
>> So if I could generalize the problem: too much data, not enough value. And I'll talk about value in terms of things that you guys do, that I'm inferring. Risk reduction. >> Correct. >> Speed to insights, and ultimately lowering costs or increasing revenue. That's kind of what it's all about. >> The way to talk about business outcomes is in terms of increasing revenue, decreasing costs, or reducing risk, right? In terms of governance, those are the three things that you want to unlock for your customers. You don't often think about governance as creating new revenue streams, and we generally don't think about it in terms of reducing costs, but you do think about it, oftentimes, in terms of reducing your risk profile and compliance. But the ability to actually know your data, build trust, and then use that data really does open up different opportunities: to build new applications, new systems of engagement, systems of record, new applications around analytics and AI, that will unlock different ways we can market to customers, sell to customers, and engage our own employees. >> Yes. So the initial entry into the organization, the budget if you will, is around that risk reduction, right? I can understand that: I've got all this data, and I need to make sure I'm managing it according to the edicts of my organization. But, and we'll play skeptic here, are you really seeing value beyond that risk reduction? I mean, it's been the nirvana in the compliance and governance world: not just compliance and governance and, you know, avoiding fines and getting slapped on the wrist, or even something worse, but actually, through data quality initiatives and integration, et cetera, driving other value. Are you actually seeing that? >> Yes, we are, actually. Particularly last year, with the whole onslaught of GDPR in the European Union, and the implications of GDPR here in the U.S. or other parts of the world.
It really was a pervasive topic, and a lot of what we were talking about was specifically compliance: making sure you stay on the right side of the regulation. But at the same time, investing in that data architecture, that information architecture, investing in the governance program, actually allowed our customers to understand the different components that touch the individual. Because it's all about individual rights and individual privacy: understanding what they're buying, understanding what information we're collecting on them, understanding what permissions and consent we have to leverage their information. It really allowed our customers to leverage that information for a different purpose, outside of the whole compliance mindset. Compliance is a difficult nut to crack; there are hard requirements around it, but there are best-effort requirements around it as well. So the driver for us is not necessarily just about compliance, but about what more you can do with that governed data you already have, because you have to meet those compliance requirements anyway: being able to flip the script and talk about business value, business impact, revenue. And that's everything. >> Now, you're only about, what, six months into this part of the partnership, correct? So it's early days, but how's it going, and what can we expect going forward? >> It's going great. We have a terrific partnership with Actifio. VDP, the IBM Virtual Data Pipeline offering, is part of our broader portfolio within Unified Governance, and fits nicely to build out some of the test data management capability that we've already had. The Optim portfolio is part of our capability set; it's really been focused around test data management, building synthetic data, and orchestrating test data management as well. And the Virtual Data Pipeline offering is a nice complement to that, building out a more robust portfolio now.
>> All right, Charlie. Hey, thanks very much for coming on. How's the event been? >> It's been terrific. It's amazing to be surrounded by so many people that are excited about data. We don't get that everywhere. >> Well, we're always excited about data, right, Charlie? Thanks so much. >> Thank you. >> All right, keep it right there, everybody. We'll be back with our next guest. Dave Vellante, John Furrier, and Stu Miniman are in the house. You're watching theCUBE at Actifio Data Driven 2019. Right back.
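Charlie's know-trust-use framing lends itself to a small sketch. The record fields and the gating rule below are purely illustrative assumptions; this is not the schema or API of any IBM governance product.

```python
# A toy catalog entry reflecting the know / trust / use framing above.
# Field names are invented for illustration.

catalog_entry = {
    # Know: what the asset is and how it is classified
    "name": "customer_orders",
    "classification": "contains_pii",
    "quality_score": 0.92,
    # Trust: where it came from and how it has been transformed
    "lineage": ["crm_extract", "dedupe_job", "currency_normalize"],
    # Use: which downstream teams may consume it
    "consumers": ["app_dev_test", "analytics", "data_science"],
}

def usable_by(entry, team):
    """An asset is usable only if it is known, trusted, and permitted."""
    known = entry.get("classification") is not None
    trusted = bool(entry.get("lineage"))  # lineage recorded -> provenance known
    return known and trusted and team in entry.get("consumers", [])

print(usable_by(catalog_entry, "analytics"))   # a permitted consumer
print(usable_by(catalog_entry, "marketing"))   # not on the consumer list
```

The only point of the sketch is the ordering Charlie describes: you cannot meaningfully gate use until the know and trust metadata exists.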
Phil Buckellew, IBM | Actifio Data Driven 2019
>> From Boston, Massachusetts, it's theCUBE! Covering Actifio 2019 Data Driven. Brought to you by Actifio. >> Here we are in Boston, Massachusetts. I'm Stu Miniman, this is theCUBE at the special, at Data Driven '19, Actifio's user event. Happy to bring on a CUBE alum who's a partner of Actifio, Phil Buckellew, who's General Manager of IBM Cloud Object Storage. Phil, thanks for coming back. >> Great, great to be here, Stu. >> All right, so object storage. Why don't you give us first just kind of an encapsulation of kind of the state of your business today. >> Sure, object storage is really an extremely important business for the industry today because really it's a new way of accessing data, it's been around obviously for a decade or so but really, it's increasingly important because it's a way to cost-effectively store a lot of data, to really be able to get access to that data in new and exciting ways, and with the growth in the volume of data, of particularly unstructured data, like 103 zettabytes by 2023 I think I heard from the IDC guys, that really kind of shows how important being able to handle that volume of data really is. >> So Phil, I go back, think about 12 years ago, all the technologists in this space were like, "The future of storage is object," and I was working at one of the big storage companies and I'm like, "Well we've been doing block and file," and there was this big gap out there, and kind of quietly object's taken over the world because underneath a lot of the cloud services there, object's there, so IBM made a big acquisition in this space. Talk about, you know, customers that I talk to, it's not like they come out and say, "Oh jeez, I'm buying object storage, "I'm thinking about object storage." They've got use cases and services that they're using that happen to have object underneath. Is that what you hear from your users? >> Yeah, there's a couple of different buying groups that exist in the object storage market today.
The historic market is really super large volumes. I mean, we're unique in that IBM acquired the Cleversafe company back in 2015, and that technology is one we've expanded upon, and it really is great because it can go to exabyte scale and beyond, and that's really important for certain use cases. So some customers that have high volumes of videos and other unstructured data, that is really a super good fit for those clients. Additionally, clients that really have the need for high resilience, because the other thing that's important is the way that we built our object storage: to be able to have a lot of resiliency, to be able to run across multiple data centers, to be able to use erasure coding to ensure the data's protected. That's really a large part of the value, and because you can do that at scale without having downtime when you upgrade, those are really a lot of core benefits of object storage. >> Right, that resiliency is kind of built into the way we do it, and that was something that was just kind of a mind shift as opposed to, okay, I've got to have this enterprise mindset with an HA configuration and everything with N plus whatever version of it. Object's going to give you some of that built-in. The other thing I always found really interesting is, storing data is okay, there's some value there, but how do I gain leverage out of the data? And there's the metadata underneath that helps. You talk about video, you talk about all these kinds of data there. If I don't understand what I've got and how I'd leverage it, it's not nearly as valuable for me, and that's something, you know, really one of the key topics of this show: how do I become data driven? And that, I have to believe, is something critically important to your customers.
>> Absolutely, and really object storage is the foundation for modern cloud-native data lakes, if you will, because it's cost-effective enough you can drop any kind of data in there and then you can really get value from those assets wherever you are, and wherever you're accessing the data. We've taken the same technology that was the exabyte scale on-premise technology, and we've put it in the IBM public cloud, and so that really allows us to be able to deliver against all kinds of use cases with the data sets that clients want, and there's a lot of great innovation that's happening especially on the cloud side. We've got the ability to query that data, any kind of rectangular data with standard ANSI SQL statements, and that just really allows clients to unlock the potential of those data sets, so really good innovation going on in that space to unlock the value of the data that you put inside of object storage. >> All right, Phil, let's make the connection. Actifio's here, IBM OEM's the solution. So, talk about the partnership and what customers are looking for when they're looking at their IPs. >> Sure, so, quite a ways prior to the partnership our object storage team partnered up with the Actifio team at a large financial services customer that recognized the growth in the volume of the data that they had, that had some unique use cases like cyber resiliency. They get attacked with ransomware attacks, they needed to have a standard way to have those data sets and those databases running in a resilient way against object storage that can still be mounted and used, effectively immediately, in case of ransomware attacks, and so that plus a lot of other traditional backup use cases is what drew the IBM Cloud Object Storage team and the Actifio team together.
And with that we also really began to notice the uptick in clients that wanted to do test data management, that needed DevOps teams to be able to spin up a replica of this database or that database very fast, and, you know, what we found was the combination of the Actifio product, which we've OEM'd as IBM Virtual Data Pipeline, allows us to run those virtual databases extremely cost-effectively backed by object storage, versus needing to make full replicas on really expensive block storage that takes a long time. >> Well yeah, we'd actually done research on this a number of years ago. Copies are great, but how do I leverage that, right? From the developer team it's, I want to have something that mirrors what I have in production, not just some test data, so the more I can replicate that, the better. Phil, please, go ahead. >> There's some really important parts of that whole story, of being able to get that data flow right, to be able to go do point-in-time recoveries of those databases so that the data is accurate, but also being able to mask out that PII or sensitive information, credit card data or others that you really shouldn't be exposing to your testers and DevOps people. Being able to have the kind of-- (Phil laughs) >> Yeah, yeah, shouldn't because, you know, there's laws and lawsuits and security and all these things we have. >> Good, good, absolutely. >> So, Phil, we're talking a lot about data, you've actually got some new data to share with us, a recent survey that was done, should we share some of your data with us? >> Yeah, we did some, we did a, the ESG guys actually worked with us to build out a piece of research that looked at what it would cost to take a 50 terabyte Oracle 12c database and effectively spin up five copies the way you traditionally would, so that different test teams can hammer away against that data set.
And we compared that to running the VDP offering with our Cloud Object Storage solution. You know, distances apart, we had one where the source database is in Dallas and the destination database is in Washington, D.C. over a 10 gigabit link, and we were able to show that you could set up five replicas of the database in like 90 minutes, compared with the two weeks that it would take to do full replication, because you were going against object storage, which runs about 2.3 cents per gigabyte per month, versus block storage fully loaded, which runs about 58 cents per gigabyte per month. The economics just blow it away. And the fact that you could even do queries, because object storage is interesting. Yes, if you need microsecond response times for small queries, you've got to run some of that content on block storage, but for traditional queries, we looked at, like, really big queries that would run against 600 rows, and we were half the time that you would need on traditional block storage. So, for those DevOps use cases where you're doing that test and development, you can have masked data, five different copies, and you can actually point back in time, because really, the Actifio technology is really super in that it can do point-in-time, and it was able to store the right kind of data so the developers can get the most recent, current copies of the data. All in, it was like 80% less than what you would have paid doing it the traditional way. >> Okay, so Phil, you started talking a little bit about some of the cloud pieces, you know, Actifio in the last year launched their first SaaS offering, Actifio GO. How much of these solutions are for the cloud versus on-premises these days?
>> Absolutely, so one of the benefits of using a virtual data approach is being able to leverage cloud economics 'cause a lot of clients they want to do, you know, they want to be able to do the test in dev which has ups and downs and peaks and valleys when you need to use those resources, the cloud is really an ideal way to do those types of workloads. And so, the integration work that we've done with the Actifio team around VDP allows you to replicate or have virtual copies of those databases in the cloud where you want to do your testing, or we can do it in traditional on-prem object storage environments. Really, whatever makes most sense for the client is where we can stand up those environments. >> The other thing I wonder if you could expand on a little bit more, you talked about, like, cloud-native deployment and what's happening there. How does that tie into this discussion? >> Well, obviously modern architectures and ways of Agile, ways of building things, cloud-native with microservices, those are all extremely important, but you've got to be able to access the data, and it's that core data that no matter how much you do with putting Kubernetes around all of your existing applications you've still got to be able to access that core data, often systems record data, which is sitting on these standard databases of record, and so being able to have the VDP technology, be able to replicate those, stand those up like in our public cloud right next to all of our Kubernetes service and all the other technologies, it gives you the kind of full stack that you need to go do that dev in test, or run production workloads if you prefer from a public cloud environment, without having all of the burdens of running the data centers and maintaining things on your own. >> Okay, so Phil, everybody here for this two day event are going to get a nice, you know, jolt of where Actifio fits. You know, lots of orange here at the show. 
Give us the final word of what does it mean with orange and blue coming together. >> Well absolutely, we think this is going to be great for our clients. We've got, you know, tons of interested clients in this space because they see the value of being able to take what Actifio's done, to be able to virtualize that data, combine it with some of the technologies we've got for object storage or even block storage, to be able to serve up those environments in a super cost-effective way, all underlined by one of our core values at IBM, which is really trust and being responsible. And so, we often say that there's no AI, which all of this data leads up to, without information architecture and that's really where we specialize, is providing that governance, all the masking, all of the things that you need to feel confident that the data you've got is in the right hands, being used the right way, to be able to give you maximum advantage for your business, so we're super excited about the partnership. >> Phil, definitely a theme we heard at IBM Think, there is no AI without the IA, so, Phil Buckellew, thanks so much for joining us, sharing all the updates on what IBM is doing here with Actifio. >> Great, great to be here. >> All right, and we'll be back with more coverage here in Boston, Massachusetts at Actifio Data Driven 2019. I'm Stu Miniman and thanks for watching theCUBE. (futuristic music)
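As a back-of-envelope check on the ESG figures Phil cites above (a 50 TB database, five test/dev replicas, roughly 58 cents versus 2.3 cents per gigabyte per month), here is a minimal sketch of the monthly storage line item. The assumption that a single object-stored golden copy backs all five virtual databases is mine, and since the study's 80% figure is an all-in number covering more than raw storage, the storage-only ratio below comes out even higher.

```python
DB_SIZE_GB = 50 * 1024        # 50 TB source database, as in the ESG setup
COPIES = 5                    # five test/dev replicas

BLOCK_PER_GB_MONTH = 0.58     # ~58 cents/GB/month, fully loaded block storage
OBJECT_PER_GB_MONTH = 0.023   # ~2.3 cents/GB/month, object storage

# Traditional approach: five full physical replicas on block storage.
full_replicas = DB_SIZE_GB * COPIES * BLOCK_PER_GB_MONTH

# Virtual approach: one object-stored golden copy backs all five mounts.
virtual_copies = DB_SIZE_GB * OBJECT_PER_GB_MONTH

savings = 1 - virtual_copies / full_replicas
print(f"full replicas:  ${full_replicas:,.0f}/month")
print(f"virtual copies: ${virtual_copies:,.0f}/month")
print(f"storage-only savings: {savings:.0%}")
```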
Seth Dobrin, IBM | Big Data SV 2018
>> Announcer: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to theCUBE's continuing coverage of our own event, Big Data SV. I'm Lisa Martin, with my cohost Dave Vellante. We're in downtown San Jose at this really cool place, Forager Eatery. Come by, check us out. We're here tomorrow as well. We're joined by, next, one of our CUBE alumni, Seth Dobrin, the Vice President and Chief Data Officer at IBM Analytics. Hey, Seth, welcome back to theCUBE. >> Hey, thanks for having me again. Always fun being with you guys. >> Good to see you, Seth. >> Good to see you. >> Yeah, so last time you were chatting with Dave and company was back in the fall, at the Chief Data Officers Summit. What's kind of new with you in IBM Analytics since then? >> Yeah, so at the Chief Data Officers Summit, I was talking with one of the data governance people from TD Bank and we spent a lot of time talking about governance. Still doing a lot with governance, especially with GDPR coming up. But really started to ramp up my team to focus on data science, machine learning. How do you do data science in the enterprise? How is it different from doing a Kaggle competition, or someone getting their PhD or Masters in Data Science? >> Just quickly, who is your team composed of in IBM Analytics? >> So IBM Analytics represents, think of it as our software umbrella, so it's everything that's not pure cloud or Watson or services. So it's all of our software franchise. >> But in terms of roles and responsibilities, data scientists, analysts. What's the mixture of-- >> Yeah. So on my team I have a small group of people that do governance, and so they're really managing our GDPR readiness inside of IBM in our business unit. And then the rest of my team is really focused on this data science space.
And so this is set up from the perspective of we have machine-learning engineers, we have predictive-analytics engineers, we have data engineers, and we have data journalists. And that's really focused on helping IBM and other companies do data science in the enterprise. >> So what's the dynamic amongst those roles that you just mentioned? Is it really a team sport? I mean, initially it was the data scientist on a pedestal. Have you been able to attack that problem? >> So I know a total of two people that can do all that themselves. So I think it absolutely is a team sport. And it really takes a data engineer or someone with deep expertise in there, that also understands machine learning, to really build out the data assets, engineer the features appropriately, provide access to the model, and ultimately to what you're going to deploy, right? Because the way you do it as a research project or an activity is different than using it in real life, right? And so you need to make sure the data pipes are there. And when I look for people, I actually look for a differentiation between machine-learning engineers and optimization. I don't even post for data scientists because then you get a lot of data scientists, right? People who aren't really data scientists, and so if you're specific and ask for machine-learning engineers or decision optimization, OR-type people, you really get a whole different crowd in. But the interplay is really important because in most machine-learning use cases you want to be able to give information about what you should do next. What's the next best action? And to do that, you need decision optimization. >> So in the early days of when we, I mean, data science has been around forever, right? We always hear that. But in the, sort of, more modern use of the term, you never heard much about machine learning. It was more like stats, math, some programming, data hacking, creativity. And then now, machine learning sounds fundamental.
Is that a new skillset that the data scientists had to learn? Did they get them from other parts of the organization? >> I mean, when we talk about math and stats, what we call machine learning today is what we've been doing with statistics for years, right? I mean, a lot of the same things we apply in what we call machine learning today I did during my PhD 20 years ago, right? It was just with a different perspective. And you applied those types of, they were more static, right? So I would build a model to predict something, and it was only for that. It really didn't apply beyond, so it was very static. Now, when we're talking about machine learning, I want to understand Dave, right? And I want to be able to predict Dave's behavior in the future, and learn how you're changing your behavior over time, right? So one of the things that a lot of people don't realize, especially senior executives, is that machine learning creates a self-fulfilling prophecy. You're going to drive a behavior so your data is going to change, right? So your model needs to change. And so that's really the difference between what you think of as stats and what we think of as machine learning today. So what we were doing years ago is all the same; we just described it a little differently. >> So how fine is the line between a statistician and a data scientist? >> I think any good statistician can really become a data scientist. There's some issues around data engineering and things like that, but if it's a team sport, I think any really good, pure mathematician or statistician could certainly become a data scientist. Or machine-learning engineer. Sorry.
People have the opportunity to make an impact in policy change, healthcare, etc. So the hard skills, the soft skills, mathematician, what are some of the other elements that you would look for, or that companies, enterprises that need to learn how to embrace data science, should look for? Someone that's not just a mathematician but someone that has communication skills, collaboration, empathy, openness, to not lead data down a certain path, what do you see as the right mix there of a data scientist? >> Yeah, so I think that's a really good point, right? It's not just the hard skills. When my team goes out, because part of what we do is we go out and sit with clients and teach them our philosophy on how you should integrate data science in the enterprise, a good part of that is sitting down and understanding the use case. And working with people to tease out, how do you get to this ultimate use case, because any problem worth solving is not one model, any use case is not one model, it's many models. How do you work with the people in the business to understand, okay, what's the most important thing for us to deliver first? And it's almost a negotiation, right? Talking them back. Okay, we can't solve the whole problem. We need to break it down into discrete pieces. Even when we break it down into discrete pieces, there's going to be a series of sprints to deliver that. Right? And so having these soft skills to be able to tease that out in a way, and really help people understand that their way of thinking about this may or may not be right. And doing that in a way that's not offensive. And there's a lot of really smart people that can say that, but they can come across as being offensive, so those soft skills are really important. >> I'm going to talk about GDPR in the time we have remaining. We talked about it in the past; the clock's ticking, and in May the fines go into effect.
The relationship between data science, machine learning, GDPR, is it going to help us solve this problem? This is a nightmare for people. And many organizations aren't ready. Your thoughts. >> Yeah, so I think there's some aspects that we've talked about before. How important it's going to be to apply machine learning to your data to get ready for GDPR. But I think there's some aspects that we haven't talked about before here, and that's around what impact GDPR has on being able to do data science, and being able to implement data science. So one of the aspects of the GDPR is this concept of consent, right? So it really requires consent to be understandable and very explicit. And it allows people to retract that consent at any time. And so what does that mean when you build a model that's trained on someone's data? If you haven't anonymized it properly, do I have to rebuild the model without their data? And then it also brings up some points around explainability. So you need to be able to explain your decision, how you used analytics, how you got to that decision, to someone if they request it. To an auditor if they request it. Traditional machine learning, that's not too much of a problem. You can look at the features and say these features, this contributed 20%, this contributed 50%. But as you get into things like deep learning, this concept of explainable AI, or XAI, becomes really, really important. And there were some talks earlier today at Strata about how you apply traditional machine learning to interpret your deep learning or black box AI. So that's really going to be important, those two things, in terms of how they affect data science.
(chuckles) But I think there's a lot of things we looked at five years ago and we said there's no way we'll ever be able to do them, that we can do today. And so while I don't know how we're going to get to be able to explain this black box with XAI, I'm fairly confident that in five years, this won't even be a conversation anymore. >> Yeah, I kind of agree. I mean, somebody said to me the other day, well, it's really hard to explain how you know it's a dog. >> Seth: Right (chuckles). >> But you know it's a dog. And so, we'll get over this. >> Yeah. >> I love that you just brought up dogs as we're ending. That's my favorite thing in the world, thank you. Yes, you knew that. Well, Seth, I wish we had more time, and thanks so much for stopping by theCUBE and sharing some of your insights. Look forward to the next update in the next few months from you. >> Yeah, thanks for having me. Good seeing you again. >> Pleasure. >> Nice meeting you. >> Likewise. We want to thank you for watching theCUBE live from our event Big Data SV down the street from the Strata Data Conference. I'm Lisa Martin, for Dave Vellante. Thanks for watching, stick around, we'll be right back after a short break.
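Seth's earlier point about explainability — that with a traditional model you can look at the features and say "this contributed 20%, this contributed 50%" — can be sketched in a few lines. This is a minimal illustration with synthetic data: the feature names and numbers are invented, and a plain least-squares linear model stands in for whatever model a real team would use.

```python
import numpy as np

# Hypothetical churn-style data: 200 customers, 3 made-up features.
# By construction the target is driven mostly by the first two.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 1.5 * X[:, 0] + 0.8 * X[:, 1] + rng.normal(scale=0.3, size=200)

# Fit a plain linear model by least squares -- the kind of
# "traditional" model whose decisions are easy to explain.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# For one customer, each feature's contribution to the prediction is
# just coefficient * feature value; normalizing the absolute
# contributions gives the "this contributed 20%, this contributed 50%"
# breakdown Seth describes.
customer = X[0]
contrib = coef * customer
share = np.abs(contrib) / np.abs(contrib).sum()
for name, pct in zip(["tenure", "usage", "support_calls"], share):
    print(f"{name}: {pct:.0%} of this prediction")
```

The same trick does not carry over to a deep network, where a prediction is not a simple sum of per-feature terms — which is exactly why the XAI question Seth raises gets hard.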
Data Science: Present and Future | IBM Data Science For All
>> Announcer: Live from New York City it's The Cube, covering IBM data science for all. Brought to you by IBM. (light digital music) >> Welcome back to data science for all. It's a whole new game. And it is a whole new game. >> Dave Vellante, John Walls here. We've got quite a distinguished panel. So it is a new game-- >> Well we're in the game, I'm just happy to be-- (both laugh) Have a swing at the pitch. >> Well let's see what we have here. Five distinguished members of our panel. It'll take me a minute to get through the introductions, but believe me they're worth it. Jennifer Shin joins us. Jennifer's the founder of 8 Path Solutions, a director of data science at Comcast, and part of the faculty at UC Berkeley and NYU. Jennifer, nice to have you with us, we appreciate the time. Joe McKendrick, an analyst and contributor to Forbes and ZDNet. Joe, thank you for being here as well. Another ZDNetter next to him, Dion Hinchcliffe, who is a vice president and principal analyst at Constellation Research and also contributes to ZDNet. Good to see you, sir. To the back row, but that doesn't mean anything about the quality of the participation here, Bob Hayes, with a killer Batman shirt on by the way, which we'll get to explain in just a little bit. He runs Business over Broadway. And Joe Caserta, who is the founder of Caserta Concepts. Welcome to all of you. Thanks for taking the time to be with us. Jennifer, let me just begin with you. Obviously as a practitioner you're very involved in the industry, and you're on the academic side as well. We mentioned Berkeley, NYU, steep experience. So I want you to kind of take your foot in both worlds and tell me about data science. I mean where do we stand now from those two perspectives? How have we evolved to where we are? And how would you describe, I guess, the state of data science? >> Yeah so I think that's a really interesting question. There's a lot of changes happening.
In part because data science has now become much more established, both on the academic side as well as in industry. So now you see some of the bigger problems coming out. People have managed to have data pipelines set up. But now there are these questions about models and accuracy and data integration. So the really cool stuff from the data science standpoint. We get to get really into the details of the data. And I think on the academic side you now see undergraduate programs, not just graduate programs, but undergraduate programs being involved. UC Berkeley just did a big initiative that they're going to offer data science to undergrads. So that's huge news for the university. So I think there's a lot of interest from the academic side to continue data science as a major, as a field. But I think in industry one of the difficulties you're now having is businesses are now asking that question of ROI, right? What do I actually get in return in the initial years? So I think there's a lot of work to be done and just a lot of opportunity. It's great because people now understand data science better, but I think data scientists have to take that seriously and really think about how am I actually getting a return, or adding value to the business?
Then you have to understand it conceptually. You have to be able to model with it, you have to be able to explain it. There's a lot of aspects that you're not going to pick up overnight. So I think part of it is endurance. Like are people going to feel motivated enough and dedicate enough time to it to get very good at that skill set. And also of course, you know in terms of industry, will there be enough interest in the long term that there will be a financial motivation for people to keep staying in the field, right? So I think there's definitely a lot of opportunity. But that's always been there. Like I tell people, I think of myself as a scientist, and data science happens to be my day job. That's just the job title. But if you are a scientist and you work with data you'll always want to work with data. I think that's just an inherent need. It's kind of a compulsion, you just kind of can't help yourself, but dig a little bit deeper, ask the questions, you can't not think about it. So I think that will always exist. Whether or not it's an industry job in the way that we see it today, five years from now, or 10 years from now, I think that's something that's up for debate.
As to the effect that data and data science have had on organizations in terms of fulfilling that vision of a 360 degree view of customers and anticipating customer needs. >> So. Data warehousing, I wouldn't say failed. But I think it was unfinished in order to achieve what we need done today. At the time I think it did a pretty good job. I think it was the only place where we were able to collect data from all these different systems and have it in a single place for analytics. The big difference, I think, between data warehousing and data science is data warehouses were primarily made for consumption by human beings. To be able to have people look through some tool and be able to analyze data manually. That really doesn't work anymore, there's just too much data to do that. So that's why we need to build a science around it so that we can actually have machines doing the analytics for us. And I think that's the biggest stride in the evolution over the past couple of years, that now we're actually able to do that, right? It used to be very, you know, you go back to when data warehouses started, you had to be a deep technologist in order to be able to collect the data, write the programs to clean the data. But now your average casual IT person can do that. Right now I think we're back in data science where you have to be a fairly sophisticated programmer, analyst, scientist, statistician, engineer, in order to do what we need to do, in order to make machines actually understand the data. But I think part of the evolution, we're just at the forefront. We're going to see over the next, not even years, within the next year I think a lot of new innovation where the average person within business, and definitely the average person within IT, will be able to say "What are my sales going to be next year?" as easily as it is to say "What were my sales last year?" Where now it's a big deal.
Right now in order to do that you have to build some algorithms, you have to be a specialist in predictive analytics. And I think, you know, as the tools mature, as people's use of data matures, and as the technology ecosystem for data matures, it's going to be easier and more accessible. >> So it's still too hard. (laughs) That's something-- >> Joe C.: Today it is, yes. >> You've written about and talked about. >> Yeah, no question about it. We see this citizen data scientist. You know we talked about the democratization of data science, but the way we talk about analytics and warehousing and all the tools we had before, they generated a lot of insights and views on the information, but they didn't really give us the science part. And that's, I think, what's missing: the forming of the hypothesis, the closing of the loop. We now have use of this data, but are we changing, are we thinking about it strategically? Are we learning from it and then feeding that back into the process? I think that's the big difference between data science and the analytics side. But, you know, just like Google made search available to everyone, not just people who had highly specialized indexers or crawlers, now we can have tools that make these capabilities available to anyone. You know, going back to what Joe said, I think the key thing is we now have tools that can look at all the data and ask all the questions. 'Cause we can't possibly do it all ourselves. Our organizations are increasingly awash in data. Which is the life blood of our organizations, but we're not using it, you know, this whole concept of dark data. And so I think the concept, or the promise, of opening these tools up for everyone to be able to access those insights and activate them, I think that, you know, that's where it's headed.
We talked a little bit about it earlier, but it plays right into what Dion's talking about. About tools and, I don't want to spoil it, but you go ahead (laughs) and tell me about it. >> Right, so. Batman is a superhero, but he doesn't have any supernatural powers, right? He can't fly on his own, he can't become invisible on his own. But the thing is he has the utility belt and he has these tools he can use to help him solve problems. For example he has the batarang when he's confronted with a building that he wants to get over, right? So he pulls it out and uses that. So as data professionals we have all these tools now that these vendors are making. We have IBM SPSS, we have Data Science Experience, IBM Watson, that these data pros can now use as part of their utility belt to solve problems that they're confronted with. So if you're ever confronted with, like, a churn problem and you have somebody who has access to that data, they can put that into IBM Watson, ask a question, and it'll tell you what's the key driver of churn. So it's not that you have to be superhuman to be a data scientist, but these tools will help you solve certain problems and help your business go forward. >> Joe McKendrick, do you have a comment? >> Does that make the Batmobile the Watson? (everyone laughs) Analogy? >> I was just going to add that, you know, all of the billionaires in the world today and none of them decided to become Batman yet. It's very disappointing. >> Yeah. (Joe laughs) >> Go ahead Joe. >> And I just want to add some thoughts to our discussion about what happened with data warehousing. I think it's important to point out as well that data warehousing, as it existed, was fairly successful, but for larger companies. Data warehousing was a very expensive proposition, and it remains an expensive proposition. Something that's in the domain of the Fortune 500. But today's economy is based on a very entrepreneurial model. The Fortune 500 is out there, of course, though it's ever shifting.
But you have a lot of smaller companies, a lot of people with startups. You have people within divisions of larger companies that want to innovate and not be tied to the corporate balance sheet. They want to be able to innovate and experiment without having to go through the finance department. So there are all these open source tools available. There are cloud resources as well as open source tools. Hadoop of course being a prime example, where you can work with the data and experiment with the data and practice data science at a very low cost.
People didn't use it as readily, didn't teach it in schools. I think maybe 10, 20 years from now, some of the things that we're building today from data science, hopefully more people will understand how to use these tools. They'll have a better understanding of working with data and what that means, and just data literacy, right? Just being able to use these tools and be able to understand what data's saying, and actually what it's not saying. Which is the thing that most people don't think about. But you can also say that data doesn't say anything. There's a lot of noise in it. There's too much noise to be able to say that there is a result. So I think that's the other side of it. So yeah, I guess for me, in terms of a citizen data scientist, I think it's a great idea to have that, right? But at the same time of course everyone kind of emphasized you don't want everyone out there going, "I can be a data scientist without education, "without statistics, without math," without understanding of how to implement the process. I've seen a lot of companies implement the same sort of process from 10, 20 years ago just on Hadoop instead of SQL. Right, and it's very inefficient. And the only difference is that you can build more tables wrong than they could before. (everyone laughs) >> Which is I guess an accomplishment. >> For less. >> And for less, it's cheaper, yeah. >> It is cheaper. >> Otherwise we're like, I'm not a data scientist but I did stay at a Holiday Inn Express last night, right? >> Yeah. (panelists laugh) And there's like a little bit of pride that they used 2,000, you know, they used 2,000 computers to do it. Like a little bit of pride about that, but you know of course maybe not a great way to go. I think 20 years ago we couldn't do that, right? One computer was already an accomplishment, to have that resource.
So I think you have to think about the fact that if you're doing it wrong, you're going to just make that mistake bigger, which is also the other side of working with data. >> Sure, Bob. >> Yeah I have a comment about that. I've never liked the term citizen data scientist, or citizen scientist. I get the point of it, and I think employees within companies can help in the data analytics problem by maybe being a data collector or something. But I would never have just somebody become a scientist based on a few classes he or she takes. It's like saying, "Oh I'm going to be a citizen lawyer," and so you come to me with your legal problems, or a citizen surgeon. Like you need training to be good at something. You can't just be good at something just 'cause you want to be. >> John: Joe you wanted to say something too on that. >> Since we're in New York City I'd like to use the analogy of a real scientist versus a data scientist. So a real scientist requires tools, right? And the tools are not new, like microscopes and a laboratory and a clean room. And these tools have evolved over years and years, and since we're in New York we could walk within a 10 block radius and buy any of those tools. It doesn't make us a scientist because we use those tools. I think with data, you know, making the tools evolve and become easier to use, you know like Bob was saying, it doesn't make you a better data scientist, it just makes the data more accessible. You know we can go buy a microscope, we can go buy Hadoop, we can buy any kind of tool in a data ecosystem, but it doesn't really make you a scientist. I'm very involved in the NYU data science program and the Columbia data science program, like these kids are brilliant. You know these kids are not someone who is, you know, just trying to run a day to day job, you know, in corporate America. I think the people who are running the day to day job in corporate America are going to be the recipients of data science.
Just like people who take drugs, right? As a result of a smart data scientist coming up with a formula that can help people, I think we're going to make it easier to distribute the data that can help people with all the new tools. But it doesn't really make it, you know, the access to the data and tools available doesn't really make you a better data scientist. Without, like Bob was saying, without better training and education. >> So how-- I'm sorry, how do you then, if it's not for everybody, but yet I'm the user at the end of the day at my company and I've got these reams of data before me, how do you make it make better sense to me then? So that's where machine learning comes in, or artificial intelligence and all this stuff. So how at the end of the day, Dion? How do you make it relevant and usable, actionable, to somebody who might not be as practiced as you would like? >> I agree with Joe that many of us will be the recipients of data science. Just like you had to be a computer scientist at one point to develop programs for a computer, now we can get the programs. You don't need to be a computer scientist to get a lot of value out of our IT systems. The same thing's going to happen with data science. There's far more demand for data science than could ever be produced by, you know, having an ivory tower filled with data scientists. Which we need those guys, too, don't get me wrong. But we need to productize it and make it available in packages such that it can be consumed. The outputs and even some of the inputs can be provided by mere mortals, whether that's machine learning or artificial intelligence or bots that go off and run the hypotheses and select the algorithms, maybe with some human help. We have to productize it. There's this concept of data science as a service, which is becoming a thing now. It's, "I need this, I need this capability at scale. "I need it fast and I need it cheap." The commoditization of data science is going to happen.
>> That goes back to what I was saying about the recipient of data science also being machines, right? Because I think the other thing that's happening now in the evolution of data is that, you know, the data is so tightly coupled. Back when you were talking about data warehousing you have all the business transactions, then you take the data out of those systems, you put them in a warehouse for analysis, right? Maybe they'll make a decision to change that system at some point. Now the analytics platform and the business application are very tightly coupled. They become dependent upon one another. So you know, people who are using the applications are now able to take advantage of the insights of data analytics and data science, just through the app. Which never really existed before. >> I have one comment on that. You were talking about how do you get the end user more involved, well like we said earlier, data science is not easy, right? As an end user, I encourage you to take a stats course, just a basic stats course, understanding what a mean is, variability, regression analysis, just basic stuff. So you as an end user can get more, or glean more insight from the reports that you're given, right? If you go to France and don't know French, then people can speak really slowly to you in French, you're not going to get it. You need to understand the language of data to get value from the technology we have available to us.
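Bob's basic-stats checklist — mean, variability, regression — fits in a few lines of Python's standard library. The monthly sales figures below are made up purely for illustration, and the regression is computed by hand so the arithmetic is visible:

```python
import statistics

# Hypothetical monthly sales figures, invented for this example.
sales = [102, 98, 110, 115, 109, 120, 118, 125]
months = list(range(1, len(sales) + 1))

mean_sales = statistics.mean(sales)    # central tendency
stdev_sales = statistics.stdev(sales)  # variability around the mean

# Simple linear regression of sales on month, by hand:
# slope = covariance(month, sales) / variance(month).
mean_month = statistics.mean(months)
cov = sum((m - mean_month) * (s - mean_sales) for m, s in zip(months, sales))
var = sum((m - mean_month) ** 2 for m in months)
slope = cov / var
intercept = mean_sales - slope * mean_month

print(f"mean={mean_sales:.1f}, stdev={stdev_sales:.1f}, trend={slope:+.2f}/month")
```

That positive slope is the "language of data" version of "sales are trending up" — the kind of reading an end user can take from a report once the basic vocabulary is in place.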
And by bringing in my experience I'm actually making the students think about the data differently. The other thing people don't think about is the fact that statisticians typically were expected to do, you know, just basic sorts of tasks, in the sense that their knowledge is specialized, right? But the day-to-day operation was that they ran a test on some data, looked at the results, and interpreted the results based on what they were taught in school. A lot of times they didn't develop the model, they just understood what the tests were saying, especially in the medical field. So when you think about things like population and census, which is when you have every single data point, versus a sample, which is a subset, it's a very different story now that we're collecting data faster than we used to. It used to be that collecting information from everyone, like the census, happened once every 10 years; we built that in. But nowadays, you hear about Facebook, for instance; I think they claimed earlier this year that their data was more accurate than the census data. So now there are these claims being made about which data source is more accurate. And I think the other side of this is that statisticians are now expected to know data in a different way than they were before. So it's not just data science changing as a field; I think the sciences that are using data are also changing their fields as well. >> Dave: So is sampling dead? >> Well no, because-- >> Should it be? (laughs) >> Well, if you're sampling wrong, yes. That's really the question. >> Okay. You know it's been said that the data doesn't lie, people do. Organizations are very political. You know, "lies, damned lies and statistics," Benjamin Disraeli. Are you seeing a change in the way in which organizations are using data in the context of the politics?
So some strong P&L manager, say, gets data and crafts it in a way that he or she can advance their agenda. Or they'll attack a data set that probably should drive them in a different direction, but might be antithetical to their agenda. We talked about democratizing data; are you seeing that reduce the politics inside of organizations? >> So you know, we've always used data to tell stories; at the top level of an organization that's what it's all about. And I still see very much that, no matter how much data science or access to the truth through looking at the numbers there is, storytelling is still the political filter through which all that data passes, right? But with the advent of things like blockchain, more and more corporate records and corporate information are going to end up in these open and shared repositories where there is no alternate truth. It'll still come back to whoever tells the best stories at the end of the day. So I still see that organizations are very political. We are seeing more open data now, though. Open data initiatives are a big thing, both in government and in the private sector. It is having an effect, but it's slow and steady. So that's what I see. >> Um, go ahead. >> I was just going to say as well, ultimately I think data-driven decision making is a great thing. And it's especially useful at the lower tiers of the organization, where you have the routine day-to-day decisions that could be automated through machine learning and deep learning; the algorithms can be improved on a constant basis. On the upper levels, you know, that's why you pay executives the big bucks, to make the strategic decisions. And data can help them, but ultimately data, IT, technology alone will not create new markets, it will not drive new businesses; it's up to human beings to do that. The technology is the tool to help them make those decisions.
But creating businesses, growing businesses, is very much a human activity. And that's something I don't see ever getting replaced. Technology might replace many other parts of the organization, but not that part. >> I tend to be a foolish optimist when it comes to this stuff. >> You do. (laughs) >> I do believe that data will make the world better. I do believe that data doesn't lie, people lie. I'm already seeing trends in all different industries where conventional wisdom is starting to get trumped by analytics. I think it's still up to the human being today to ignore the facts and go with what they think in their gut, and sometimes they win, sometimes they lose. But generally, if they lose, the data will tell them that they should have gone the other way. I think as we start relying more on data and trusting data through artificial intelligence, as we start making our lives a little bit easier, as we start using smart cars for safety before replacement of humans, as we start using data and analytics and data science really as the bumpers instead of the vehicle, eventually we're going to start to trust it as the vehicle itself. And then it's going to make lying a little bit harder. >> Okay, so great, excellent. Optimism, I love it. (John laughs) So I'm going to play devil's advocate here a little bit. There's a couple of elephant-in-the-room topics that I want to explore a little bit. >> Here it comes. >> There was an article today in Wired. It was called "Why AI Is Still Waiting for Its Ethics Transplant." And I will just read a little segment from it. It says, new ethical frameworks for AI need to move beyond individual responsibility to hold powerful industrial, government and military interests accountable as they design and employ AI.
When tech giants build AI products, too often user consent, privacy and transparency are overlooked in favor of frictionless functionality that supports profit-driven business models based on aggregate data profiles. This is from Kate Crawford and Meredith Whittaker, who founded AI Now. And they're calling for, almost, clinical trials on AI, if I could use that analogy. Before you go to market you've got to test the human impact, the social impact. Thoughts? >> And also have the ability for a human to intervene at some point in the process. This goes way back. Is everybody familiar with the name Stanislav Petrov? He's the Soviet officer who, back in 1983, was in a control room, I guess somewhere outside of Moscow, which detected a nuclear missile attack against the Soviet Union coming out of the United States. If this had been an entirely AI-driven process, I think we wouldn't be sitting here right now talking about it. But this gentleman looked at what was going on on the screen, and, I'm sure he was accountable to his authorities in the Soviet Union, he probably got in a lot of trouble for this, but he decided to ignore the signals, ignore the data coming from the Soviet satellites. And as it turned out, of course, he was right. The Soviet satellites were seeing glints of the sun and interpreting those glints as missile launches. And I think that's a great example of why, you know, not every situation means the end of the world, (laughs) though it could have in this case, but there needs to be a human component, a human ability for intervention at some point in the process. >> So, other thoughts. I mean organizations are driving AI hard for profit. The best minds of our generation are trying to figure out how to get people to click on ads; Jeff Hammerbacher is famous for saying it. >> You can use data for a lot of things. With data analytics you can cure cancer.
You can make customers click on more ads. It depends on what your goal is. But there are ethical considerations we need to think about. When we have data that has a racial bias against blacks, giving them higher prison sentences or worse credit scores and so forth, that has an impact on a broad group of people. And as a society we need to address that. And as scientists we need to consider: how are we going to fix that problem? Cathy O'Neil in her book, Weapons of Math Destruction, excellent book, I highly recommend that your listeners read it, talks about these issues: whether algorithms have a widespread impact, whether they adversely impact a protected group, and I forget the last criterion. But we need to really think about these things as a people, as a country. >> So I always think the idea of ethics is interesting. This conversation comes up a lot when I talk to data scientists. As a concept, right, as an idea, yes, you want things to be ethical. The question I always pose to them is, "Well, in the business setting, how are you actually going to do this?" 'Cause I find the most difficult thing working as a data scientist is making the day-to-day decision, when someone says, "I don't like that number," of how you actually get around that, whether that's the right data to be showing someone, whether it's accurate. And say the business decides, "Well, we don't like that number." Many people feel pressured to then change the data, or change what the data shows. So I think we need to educate people to find ways to say what the data is saying, without going past some line where it's a lie, where it's unethical. 'Cause you can also say what the data doesn't say. You don't always have to say what the data does say. You can leave it as, "Here's what we do know, but here's what we don't know." There's a don't-know part that many people omit when they talk about data.
So I think, you know, especially when it comes to things like AI, it's tricky, right? Because I always tell people, I don't know why everyone thinks AI is going to be so amazing. I started in the industry by fixing problems with computers that people didn't realize computers had. For instance, when you have a system there are a lot of bugs; we've all probably submitted bug reports. I mean really, it's nowhere near the point where it's going to start dominating our lives and taking over all the jobs, because frankly it's not that advanced. It's still run by people, still fixed by people, still managed by people. I think with ethics, a lot of it has to do with the regulations, what the laws say. That's really going to be what's involved in terms of what people are willing to do. A lot of businesses want to make money; if there are no rules that say they can't do certain things to make money, then there's no restriction. I think the other thing to think about is that we as consumers, in our everyday lives, shouldn't separate the idea of data as a business from our day-to-day consumer lives. Meaning, yes, I work with data. Incidentally, I also always opt out of my credit card's data collection; you know, when they send you that information, they make you actually mail them, old-school snail mail, a document that says, okay, I don't want to be part of this data collection process. Which I always do. It's a little bit more work, but I go through that step. Now if more people did that, perhaps companies would feel more incentivized to pay you for your data, or give you more control of your data. Or at least, if a company's going to collect information, I'd want there to be certain processes in place to ensure that it doesn't just get sold, right? For instance, if a start-up gets acquired, what happens with the data they have on you? You agreed to give it to the start-up. But what are the rules on that?
So I think we have to really think about the ethics not just from the perspective of someone who's going to implement something, but as consumers, what control we have over our own data. 'Cause that's going to directly impact what businesses can do with our data. >> You know, you mentioned data collection, so slightly on that subject. All these great new capabilities we have coming. We talked about what's going to happen with media in the future, what 5G technology's going to do to mobile and these great bandwidth opportunities, the internet of things and the internet of everywhere, and all these great inputs, right? Do we have an arms race, like, are we keeping up with the capabilities to make sense of all the new data that's going to be coming in? And how do those things square up? Because the potential is fantastic, right? But are we keeping up with the ability to make it make sense and to put it to use, Joe? >> So I think data ingestion and data integration are probably among the biggest challenges, especially as the world is starting to become more dependent on data. You know, because we're dependent on numbers, we've come up with GAAP, generally accepted accounting principles, which can be audited and proven true or false. I think in our lifetime we will see something similar for data: formal checks and balances on the data we use, that can be audited. Getting back to what Dave was saying earlier, I personally would sooner trust a machine that was programmed to do the right thing than trust a politician or some leader that may have their own agenda. And the other thing about machines is that they are auditable. You can look at the code and see exactly what it's doing and how it's doing it. Human beings, not so much. So I think getting to the truth, even if the truth isn't the answer that we want, is a positive thing.
It's something that we can't do today, but once we start relying on machines to do it, we'll be able to get there. >> Yeah, I was just going to add that we live in exponential times, and the challenge is that the way we're structured traditionally as organizations doesn't allow us to absorb advances exponentially; it's linear at best. Everyone talks about change management and how we're going to do digital transformation. Evidence shows that technology's forcing the leaders and the laggards apart. There are a few leading organizations that are eating the world, and they seem to be somehow rolling out new things. I don't know how Amazon rolls out all this stuff: all this artificial intelligence, the IOT devices, Alexa, natural language processing, and that's just a fraction, just the tip of what they're releasing. So it shows that there are some organizations that have found the path. Most of the Fortune 500 from the year 2000 are gone already, right? The disruption is happening. And so we have to find some way to adopt these new capabilities and deploy them effectively, or the writing is on the wall. I've spent a lot of time exploring this topic; how are we going to get there? All of us have a lot of hard work, is the short answer. >> I read that it was predicted there's going to be more data created this year than in the past 5,000 years. >> Forever. (laughs) >> And to mix in another statistic, we're currently analyzing less than 1% of the data. Taking those numbers and hearing what you're all saying, it's like we're not keeping up; it's not even linear. I mean, that gap is just going to grow and grow and grow. How do we close it? >> There's a guy out there named Chris Dancy, he's known as the human cyborg. He has 700 sensors all over his body. And his theory is that data's not new; having access to the data is new.
You know, we've always had a blood pressure, we've always had a sugar level. But we were never able to actually capture it in real time before. So now that we can capture and harness it, we can be smarter about it. So I think being able to use this information is really incredible; this is something that over our lifetime we've never had, and now we can do it. Hence the big explosion in data. But how we use it and have it governed, I think, is the challenge right now. It's kind of cowboys and Indians out there right now. And without proper governance and without rigorous regulation, I think we are going to have some bumps in the road along the way. >> If data is the new oil, the question is how are we actually going to operationalize around it? >> Or find it. Go ahead. >> I will say the other side of it is, if you think about information, we always have the same amount of information, right? What we choose to record, however, is a different story. Now, if you wanted to know things about the Olympics, but you decided to collect information every day for years instead of just the Olympic year, yes, you'd have a lot of data, but did you need all of that data? For that question about the Olympics, you don't need to collect data during years there are no Olympics, right? Unless of course you're making a relative comparison. But I think that's another thing to think about. Just 'cause you collect more data does not mean that data will produce more statistically significant results; it does not mean it'll improve your model. You can be collecting data about your shoe size while trying to get information about your hair. It really does depend on what you're trying to measure, what your goals are, and what the data's going to be used for. If you don't factor the real-world context into it, then yeah, you can collect an infinite amount of data, but you'll never process it, because you have no question to ask and you're not looking to model anything.
There is no universal truth about everything; that just doesn't exist out there. >> I think she's spot on. It comes down to what kind of questions you're trying to ask of your data. You can have one given database that has 100 variables in it, right? And you can ask it five different questions, all valid questions, and that data may have the variables that'll tell you what's the best predictor of churn, or what's the best predictor of cancer treatment outcome. If you can ask the right question of the data you have, then that'll give you some insight. Just data for data's sake, that's just hype. We have a lot of data, but it may not lead to anything if we don't ask it the right questions. >> Joe. >> I agree, but I just want to add one thing. This is where the science in data science comes in. Scientists often will look at data that's already been in existence for years, weather forecasts, weather data, climate change data for example, charts going back centuries if that data is available. And they reformat it, they reconfigure it, they get new uses out of it. And the potential I see with the data we're collecting is that it may not be of use to us today, because we haven't thought of ways to use it, but maybe 10, 20, even 100 years from now someone's going to think of a way to leverage the data, to look at it in new ways and to come up with new ideas. That's just my thought on the science aspect. >> Knowing what you know about data science, why did Facebook miss Russia and the fake news trend? They came out and admitted it: you know, we missed it. Why? Is it because they were focused elsewhere? Could they have solved that problem? (crosstalk) >> It's what you said: are you asking the right questions? If you're not looking for that problem in exactly the way that it occurred, you might not be able to find it. >> I thought the ads were paid in rubles.
Shouldn't that be your first clue (panelists laugh) that something's amiss? >> You know, red flag, so to speak. >> Yes. >> I mean, with Bitcoin maybe they could have hidden it. >> Bob: Right, exactly. >> I would say that what happened last year was actually the end of an age of optimism. I'll bring up the Soviet Union again. (chuckles) It collapsed back in 1990, 1991, and Russia was reborn. And I think there was a general feeling of optimism from the '90s through the 2000s that Russia was now being well integrated into the world economy, as other nations all over the globe, on all continents, were being integrated into the global economy thanks to technology. Technology is lifting entire continents out of poverty and ensuring more connectedness for people. Across Africa, India, Asia, we're seeing economies that are very different than they were 20 years ago, and that extended into Russia as well. Russia is part of the global economy. We're able to communicate as a global network. I think as a result we kind of overlooked the dark side that occurred. >> John: Joe? >> Again, the foolish optimist here. But I think the question shouldn't be how did we miss it; it's, do we have the ability now to catch it? And I think without data science, without machine learning, without being able to train machines to look for patterns that involve or result in corruption, we'd be out of luck. But now we have those tools. And now hopefully, optimistically, by the next election we'll be able to detect these things before they become public. >> It's a loaded question, because my premise was that Facebook had the ability and the tools and the knowledge and the data science expertise if in fact they wanted to solve that problem, but they were focused on other problems, which is, how do I get people to click on ads? >> Right, they had the ability to train the machines, but they were giving the machines the wrong training. >> Looking under the wrong rock.
>> (laughs) That's right. >> It is easy to play armchair quarterback. Another topic I wanted to ask the panel about is IBM Watson. You guys spend time in the Valley, I spend time in the Valley. People in the Valley pooh-pooh Watson: ah, Google, Facebook, Amazon, they've got the best AI. And some of that's fair criticism. Watson's a heavy lift, very services-oriented; you've got to apply it in a very focused way. At the same time, Google's trying to get you to click on ads, as is Facebook; Amazon's trying to get you to buy stuff. IBM's trying to solve cancer. Your thoughts on that juxtaposition of the different AI suppliers, and there may be others. Oh, nobody wants to touch this one, come on. I told you, elephant-in-the-room questions. >> Well, I mean, you're looking at two very different types of organizations. One has really spent decades applying technology to business, and these other companies are primarily into the consumer, right? When we talk about things like IBM Watson, you're looking at a very different type of solution. You used to be able to buy IT, and once you installed it you pretty much could get it to work and store your records or, you know, do whatever it is you needed it to do. But these types of tools, like Watson, actually try to learn your business. And they need to spend time doing that, watching the data and having their models tuned. And so you don't get the results right away. And I think that's been the challenge that organizations like IBM have had. This is a different type of technology solution, one that has to actually learn first before it can provide value. And so you have organizations like IBM that are much better at applying technology to business, and then they have the further hurdle of having to apply these tools that work in very different ways. There's education needed on the side of the buyer too.
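The point about a tool that "has to actually learn first before it can provide value" can be illustrated with a toy sketch: an online model whose estimate is rough at first and only converges as it watches more data. This is plain standard-library Python with invented names and made-up data, not any vendor's API.

```python
# Toy illustration of a system that "learns your business" over time:
# an online estimator that starts out knowing nothing and converges
# only after it has observed enough data.
class OnlineMean:
    """Incrementally tracks the running mean of a stream of observations."""

    def __init__(self):
        self.n = 0
        self.estimate = 0.0

    def observe(self, value):
        # Standard incremental-mean update: nudge the estimate toward
        # each new observation, by less and less as confidence grows.
        self.n += 1
        self.estimate += (value - self.estimate) / self.n
        return self.estimate

model = OnlineMean()
daily_order_values = [120.0, 80.0, 100.0, 95.0, 105.0]  # made-up business data
for v in daily_order_values:
    model.observe(v)
# After one observation the estimate was 120.0; after all five it has
# settled on the true average of the stream, 100.0.
```

The same shape holds for far more sophisticated learners: early outputs are unreliable, and value only appears after the model has been fed and tuned, which is exactly why results from such systems aren't instant.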
>> I'd have to say that I think there are plenty of businesses out there also trying to solve very significant, meaningful problems, you know, with Microsoft AI and Google AI and IBM Watson. I think it's not really the tool that matters, like we were saying earlier. A fool with a tool is still a fool, regardless of who the manufacturer of that tool is. And I think a thoughtful, intelligent, trained, educated data scientist using any of these tools can be equally effective. >> So do you not see core AI competence, and I left out Microsoft, as a strategic advantage for these companies? Is it going to be so ubiquitous and available that virtually anybody can apply it? Or is all the investment in AI R&D going to pay off for these guys? >> Yeah, so I think there are different levels of AI, right? There's AI where you can actually improve the model. I remember when Watson was first out, I was invited by IBM to a private presentation. And my question was, "Okay, so when do I get to access the corpus?" The corpus being the foundation of NLP, natural language processing. It's what you use as almost like a dictionary, how you're actually going to measure things and match them up. And they said, "Oh, you can't." "What do you mean I can't?" It's like, "We do that." "So you're telling me, as a data scientist, you're expecting me to rely on the fact that you did it better than me." I think in the years after that, IBM started opening it up and offering different ways of accessing the corpus and working with that data. But I remember at the first Watson hackathon there were only two corpora available: travel or medicine. There was no other foundational data available.
So I think one of the difficulties was that IBM, being a little bit more on the forefront of it, had the burden of having to develop these systems and learn the hard way that if you don't have the right models, the right data, and the right access, that's going to be a huge limiter. And medical information is an extremely difficult data set to start with, partly because anything that you do find or don't find, the impact is significant. If I'm looking at things like what people clicked on, the impact of using that data wrong is minimal: you might lose some money. If you do that with healthcare data, with medical data, people may die. This is a much more difficult data set to start with. So from a scientific standpoint it's great to have any information about a new technology, a new process. That's the nice thing, that IBM's obviously invested in it and collected information. The difficulty there, though, is that just 'cause you have it, you can't solve everything. And I feel like, as someone who works in technology, in general when you appeal to developers you try not to market. And Watson is very heavily marketed, which tends to turn off people who are more on the technical side, because they don't like it when it's gimmicky, in part because they do the opposite of that: they're always trying to build up the technical components. They don't like it when you're trying to convince them you're selling them something when you could just give them the specs to look at. So it could be something as simple as communication. But I do think it is valuable to have had a company lead on the forefront of that and try, so we can actually learn from what IBM has learned from this process. >> But you're an optimist. (John laughs) All right, good. >> Just one more thought. >> Joe, go ahead first.
>> Joe: I want to see how Alexa or Siri would do on Jeopardy. (panelists laugh) >> All right. Going to go around for a final thought; I'll give you a second. Let's think about your 12-month crystal ball, in terms of either challenges that need to be met in the near term or opportunities you think will be realized, a 12-to-18-month horizon. Bob, you've got the microphone, so I'll let you lead off and we'll just go around. >> I think a big challenge for business, for society, is getting people educated on data and analytics. There's a study that was just released, I think last month, by ServiceNow, I think, or some vendor, or Qlik. They found that only 17% of the employees in Europe have the ability to use data in their job. Think about that. >> 17. >> 17. Less than 20%. So these people don't have the ability to understand or use data intelligently to improve their work performance. That says a lot about the state we're in today. And that's Europe; it's probably a lot worse in the United States. So that's a big challenge, I think: to educate the masses. >> John: Joe. >> I think we probably have a better chance of improving technology than of training people. I think using data needs to be iPhone-easy, which means that a lot of innovation is in the years to come. I do think that the keyboard is going to be a thing of the past for the average user; we are going to start using voice a lot more. I think augmented reality is going to become a real reality, where we can hold our phone in front of an object and it will have an overlay of prices and where it's available. If it's a person, I think within an organization we will see holding a camera up to someone and being able to see what their salary is, what sales they did last year, some key performance indicators. I hope that we are beyond the days of everyone around the world walking around like this, and that we start actually becoming more social as human beings through augmented reality.
I think it has to happen. I think we're going through kind of foolish times at the moment in order to get to the greater good. And I think the greater good is using technology in a very, very smart way. Sorry to contradict, but maybe it's good to counterpoint: I don't think you need to have a PhD in SQL to use data. I think that's 1990. I think as we evolve it's going to become easier for the average person, which means people like the brain trust here need to get smarter and keep innovating. I think the innovation around data is really at the tip of the iceberg; we're going to see a lot more of it in the years to come. >> Dion, why don't you go ahead, then we'll come down the line here. >> Yeah, so I think over that time frame two things are likely to happen. One is somebody's going to crack the consumerization of machine learning and AI, such that it really is available to the masses and we can do much more advanced things than we could before. We see that industries tend to reach an inflection point and then there's an explosion. No one's quite cracked the code on how to really bring this to everyone, but somebody will, and that could happen in that time frame. And then the other thing that I think almost has to happen is that the forces for openness, open data, data sharing, open data initiatives, things like blockchain, are going to run headlong into data protection, data privacy, customer privacy laws and regulations that have to come down and protect us. Because the industry's not doing it, the government is stepping in, and it's going to re-silo a lot of our data. It's going to make it recede and become less accessible, making data science harder for a lot of the most meaningful types of activities. Patient data, for example, is already all locked down. We could do so much more with it, but health start-ups are really constrained about what they can do, 'cause they can't access the data.
We can't even access our own health care records, right? So I think that's the challenge: we have to have that battle next, to be able to go and take the next step. >> Well, I see a lot of data growth coming through IOT, the internet of things. I think that's a big source. And we're going to see a lot of innovation: new types of Ubers or Airbnbs. Uber's so 2013 though, right? We're going to see new companies with new ideas, new innovations, looking at the ways all this big data, or data coming in from the IOT, can be leveraged. There are some examples out there. There's a company, for example, that is outfitting tools with sensors, so industrial sites can track where the tools are at any given time. Constantly losing tools and trying to locate them is an expensive, time-consuming process, as is assessing whether the tool's being applied to the production line, or whether the right tool is at the right torque and so forth. With the sensors implanted in these tools, it's now possible to be more efficient. And there are going to be innovations like that, maybe small start-up-type things or smaller innovations. We're going to see a lot of new ideas and new types of approaches to handling all this data. There are going to be new business ideas. The next Uber, we may be hearing about it a year from now, whatever that may be. And that Uber is going to be applying data, probably IOT-type data, in some new, innovative way. >> Jennifer, final word. >> Yeah, so with data it's interesting, right? For one thing, I think one of the things that's made data more available, and made people more open to the idea, has been start-ups. But what's interesting is that a lot of start-ups have been acquired, and a lot of people at start-ups that got acquired now work at bigger corporations.
The way it was maybe 10 years ago, data wasn't available and open; companies kept it very proprietary, you had to sign NDAs. It was within the last 10 years that open source, all of those initiatives, became much more popular, much more open, an acceptable sort of way to look at data. I think that what I'm kind of interested in seeing is what people do within the corporate environment. Right, 'cause they have resources. They have funding that startups don't have. And they have backing, right? Presumably if you're acquired you went in at a higher title in the corporate structure, whereas if you had started there you probably wouldn't be at that title at that point. So I think you have an opportunity where people who have done innovative things and have proven that they can build really cool stuff can now be in that corporate environment. I think part of it's going to be whether or not they can really adjust to sort of the corporate, you know, the corporate landscape, the politics of it or the bureaucracy. I think every organization has that. Being able to navigate that is a difficult thing, in part 'cause it's a human skill set, it's a people skill, it's a soft skill. It's not the same thing as just being able to code something and sell it. So you know it's going to really come down to people. I think if people can figure out, for instance, what people want to buy, what people think, in general that's where the money comes from. You know, you make money 'cause someone gave you money. So if you can find a way to look at data or even look at technology and understand what people are doing, aren't doing, what they're happy about, unhappy about, there's always opportunity in collecting the data in that way and being able to leverage that. So you build cooler things, and offer things that haven't been thought of yet. So it's a very interesting time, I think, with the corporate resources available, if you can do that.
You know, who knows what we'll have in like a year. >> I'll add one. >> Please. >> The majority of companies in the S&P 500 have a market cap that's greater than their revenue. The reason is 'cause they have IP related to data that's of value. But most of those companies, most companies, the vast majority of companies, don't have any way to measure the value of that data. There's no GAAP accounting standard. So they don't understand the value contribution of their data in terms of how it helps them monetize. Not the data itself necessarily, but how it contributes to the monetization of the company. And I think that's a big gap. If you don't understand the value of the data, that means you don't understand how to refine it, if data is the new oil, and how to protect it and so forth and secure it. So that to me is a big gap that needs to get closed before we can actually say we live in a data-driven world. >> So you're saying I've got an asset, I don't know if it's worth this or this. And they're missing that great opportunity. >> So I devolve to what I know best. >> Great discussion. Really, really enjoyed it, the time has flown by. Joe, if you get that augmented reality thing to work on the salary, point it toward that guy, not this guy, okay? (everyone laughs) It's much more impressive if you point it over there. But Joe thank you, Dion, Joe and Jennifer and Batman. We appreciate it, and Bob Hayes, thanks for being with us. >> Thanks, you guys. >> Really enjoyed >> Great stuff. >> the conversation. >> And a reminder, coming up at the top of the hour, six o'clock Eastern time, IBMgo.com featuring the live keynote, which is being set up just about 50 feet from us right now. Nick Silver is one of the headliners there, John Thomas as well, or rather Rob Thomas. John Thomas we had on earlier on The Cube. But a panel discussion as well coming up at six o'clock on IBMgo.com, six to 7:15. Be sure to join that live stream. That's it from The Cube. We certainly appreciate the time.
Glad to have you along here in New York. And until the next time, take care. (bright digital music)
John Thomas, IBM | IBM Data Science For All
(upbeat music) >> Narrator: Live from New York City, it's the Cube, covering IBM Data Science for All. Brought to you by IBM. >> Welcome back to Data Science for All. It's a whole new game here at IBM's event, a two-day event going on; at 6:00 tonight, the big keynote presentation on IBM.com, so be sure to join the festivities there. You can watch it live stream, all that's happening. Right now, we're live here on the Cube, along with Dave Vellante, I'm John Walls, and we are joined by John Thomas, who is a distinguished engineer and director at IBM. John, thank you for your time, good to see you. >> Same here, John. >> Yeah, pleasure, thanks for being with us here. >> John Thomas: Sure. >> I know, in fact, you just wrote this morning about machine learning, so that's obviously very near and dear to you. Let's talk first off about IBM, >> John Thomas: Sure. >> Not a new concept by any means, but what is new with regard to machine learning in your work? >> Yeah, well, that's a good question, John. Actually, I get that question a lot. Machine learning itself is not new, companies have been doing it for decades, so exactly what is new, right? I actually wrote this in a blog today, this morning. It's really three different things, I call them democratizing machine learning, operationalizing machine learning, and hybrid machine learning, right? And we can talk through each of these if you like. But I would say hybrid machine learning is probably closest to my heart. So let me explain what that is, because it sounds fancy, right? (laughter) >> Right. Just what we need, another hybrid something, right? >> In reality, what it is, is let data gravity decide where your data stays and let your performance requirements, your SLAs, dictate where your machine learning models go, right? So what do I mean by that? You might have sensitive data, customer data, which you want to keep on a certain platform, right?
Instead of moving data off that platform to do machine learning, bring machine learning to that platform, whether that be the mainframe or specialized appliances or Hadoop clusters, you name it, right? Bring machine learning to where the data is. Do the training, the building of the model, where that is, but then have complete flexibility in terms of where you deploy that model. As an example, you might choose to build and train your model on premises, behind the firewall, using very sensitive data, but the model that has been built, you may choose to deploy that into a Cloud environment because you have other applications that need to consume it. That flexibility is what I mean by hybrid. Another example is, especially when you get into some of the more complex machine learning, deep learning domains, you need acceleration, and there is hardware that provides that acceleration, right? For example, GPUs provide acceleration. Well, you need to have the flexibility to train and build the models on hardware that provides that kind of acceleration, but then the model that has been built might go inside of a CICS mainframe transaction for sub-second scoring of a credit card transaction as to whether it's fraudulent or not, right? So there's flexibility off-prem, on-prem, different platforms, this is what I mean by hybrid. >> What is the technical enabler to allow that to happen? Is it just a modern software architecture, microservices, containers, blah, blah, blah? Explain that in more detail. >> Yeah, that's a good question, and it's a couple different things. One is bringing native machine learning to these platforms themselves. So you need native machine learning on the mainframe, in the Cloud, in a Hadoop cluster environment, in an appliance, right? So you need the run times, the libraries, the frameworks running native on those platforms. And that is not easy to do, you know? You've got machine learning running native on z/OS, not even Linux on Z.
It's native to z/OS on the mainframe. >> At the very primitive level you're talking about. >> Yeah. >> So you get the performance you need. >> You have the runtime environments there, and then what you need is a seamless experience across all of these platforms. You need a way to export models, repositories into which you can save models, the same APIs to save models into a different repository and then consume from them there. So it's a bit of engineering that IBM is doing to enable this, right? Native capabilities on the platforms, the same APIs to talk to repositories and consume from the repositories. >> So the other piece of that architecture you're talking about is a lot of tooling that's integrated and native. >> John Thomas: Yes. >> And the tooling, as you know, changes, I feel like daily. There's a new tool out there and everybody gloms onto it, so the architecture has to be able to absorb those. What is the enabler there? >> Yeah, so you actually bring up a very good point. There is a new language, a new framework every day, right? I mean, we all know that, in the world of machine learning, Python and R and Scala. Frameworks like Spark and TensorFlow, they're table stakes now, you know? You have to support all of these, scikit-learn, you name it, right? Obviously, you need a way to support all these frameworks on the platforms you want to enable, right? And then you need an environment which lets you work with the tools of your choice. So you need an environment like a workbench which can allow you to work in the language, the framework that you are the most comfortable with. And that's what we are doing with Data Science Experience. I don't know if you have thought of this, but Data Science Experience is an enterprise ML platform, right, runs in the Cloud, on-prem, on x86 machines, you can have it on a (mumbles) box. The idea here is support for a variety of open languages and frameworks, enabled through a collaborative workbench kind of interface.
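The train-where-the-data-lives, deploy-anywhere flow described here can be sketched in miniature. This is a hedged illustration, not IBM's actual repository API: scikit-learn and joblib stand in for whatever model repository a real deployment would use, and the dataset is synthetic.

```python
import io

import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# "On-prem": train against sensitive data that never leaves the platform.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Export: only the trained model artifact crosses the platform boundary,
# not the training data. A real repository API would replace this buffer.
artifact = io.BytesIO()
joblib.dump(model, artifact)

# "In the cloud": consume the same artifact and serve predictions.
artifact.seek(0)
deployed = joblib.load(artifact)
predictions = deployed.predict(X[:5])
```

The point of the sketch is the boundary: the data stays put, and only the serialized model moves to wherever applications need to consume it.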
>> And the decision to move, whether it's on-prem or in the Cloud, is a function of many things, but let's talk about those. I mean, data volume is one. You can't just move your business into the Cloud. It's not going to work that well. >> It's a journey, yeah. >> It's too expensive. But then there's others, there's governance edicts and security edicts, not that the security in the Cloud is any worse, it might just be different from what your organization requires, and the Cloud supplier might not support that. It's different Clouds, it's location, etc. When you talked about the data staying on-prem, maybe training a model, and then that model moving to the Cloud, so obviously, it's a lighter weight ... It's not as much-- >> Yeah, yeah, yeah, you're not moving the entire data. Right. >> But I have a concern. I wonder if clients ask you about this. Okay, well, it's my data, my data, I'm going to keep it behind my firewall. But that data trained that model, and I'm really worried that that model is now my IP that's going to seep out into the industry. What do you tell a client? >> Yeah, that's a fair point. Obviously, you still need your security mechanisms, your access control mechanisms, your governance control mechanisms. So you need governance whether you are in the Cloud or on-prem. And your encryption mechanisms, your version control mechanisms, your governance mechanisms, all need to be in place, regardless of where you deploy, right? And to your question of how do you decide where the model should go, as I said earlier to John, you know, let data gravity, SLAs, performance, and security requirements dictate where the model should go. >> We're talking so much about concepts, right, and theories that you have. Let's roll up our sleeves and get to the nitty-gritty a little bit here and talk about what are people really doing out there? >> John Thomas: Oh yeah, use cases. >> Yeah, just give us an idea for some of the ...
Kind of the latest and greatest that you're seeing. >> Lots of very interesting use cases out there. So actually, I'm part of what IBM calls a data science elite team. We go out and engage with customers on very interesting use cases, right? And we see a lot of these hybrid discussions happen as well. On one end of the spectrum is understanding customers better. So I call this reading the customer's mind. So can you understand what is in the customer's mind and have an interaction with the client without asking a bunch of questions, right? Can you look at his historical data, his browsing behavior, his purchasing behavior, and have an offer that he will really love? Can you really understand him and give him a celebrity experience? That's one class of use cases, right? Another class of use cases is around improving operations, improving your own internal processes. One example is fraud detection, right? I mean, that is a hot topic these days. So how do you, as the credit card is swiped, right, it's just a few milliseconds as that travels through a network and hits the mainframe, and scoring is done as to whether this should be approved or not. Well, you need to have a prediction of how likely this is to be fraudulent or not within the span of the transaction. Here's another one. I don't know if you call help desks now. I sometimes call them "helpless desks." (laughter) >> Try not to. >> Dave: Hell desks. >> Try not to, helpless desks, but, you know, for pretty much every enterprise that I am talking to, there is a goal to optimize their help desk, their call centers. And call center optimization is good. So as the customer calls in, can you understand the intent of the customer? See, he may start off talking about something, but as the call progresses, the intent might change. Can you understand that? In fact, not just understand, but predict it, and intercept with something that the client will love before the conversation takes a bad turn?
(laughter) >> You must be listening in on my calls. >> Your calls, must be your calls! >> I meander, I go every which way. >> I game the system and just go really mad and go, let me get you an operator. (laughter) Agent, okay. >> You two guys, your data is a special case. >> Dave: Yeah right, this guy's pissed. >> We are red-flagged right off the top. >> We're not even analyzing you. >> Day job, forget about it, you know. What about things, you know, because they're moving so far out to the edge, and now with mobile and that explosion there, and sensor data being what it is, all this is tremendous growth. Tough to manage. >> Dave: It is, it really is. >> I guess, maybe tougher to make sense of it, so how are you helping people make sense of this so they can really filter through and find the data that matters? >> Yeah, there are a lot of things rolled up into that question, right? One is just managing those devices, those endpoints, in multiple thousands, tens of thousands, millions of these devices. How would you manage them? Then, are you doing the processing of the data and applying ML and DL right at the edge, or are you bringing the data back behind the firewall or into the Cloud and then processing it there? If you are doing image recognition in a car, in a self-driving car, can you afford the latency of shipping an image of a pedestrian jumping in front across the Cloud for a deep-learning network to process it and give you an answer - oh, that's a pedestrian? You know, you may not be able to afford that latency. So you may want to do some processing on the edge, so that is another interesting discussion, right? And you need acceleration there as well. Another aspect now is, as you said, separating the signal from the noise, you know. It really comes down to the different industries that we go into, what are the signals that we understand now? Can we build on them and can we re-use them? That is an interesting discussion as well.
But, yeah, you're right. With the world of exploding data that we are in, with all these devices, it's very important to have a systematic approach to managing your data, cataloging it, understanding where to apply ML, where to apply acceleration, governance. All of these things become important. >> I want to ask you about, come back to the use cases for a moment. You talk about celebrity experiences, I put that in sort of a marketing category. Fraud detection's always been one of the favorite big data use cases, help desks, recommendation engines and so forth. Let's start with the fraud detection. About a year ago, first of all, fraud detection in the last six, seven years has been getting immensely better, no question. And it's great. However, the number of false positives, about a year ago, it was too many. We're a small company, but we buy a lot of equipment and lights and cameras and stuff. The number of false positives that I personally got was overwhelming. >> Yeah. >> They've gone down dramatically. >> Yeah. >> In the last 12 months. Is that just a coincidence, happenstance, or is it getting better? >> No, it's not that the bad guys have gone down in number. It's not that at all, no. (laughter) >> Well, that, I know. >> No, I think there is a lot of sophistication in terms of the algorithms that are available now. In terms of ... If you have tens of thousands of features that you're looking at, how do you collapse that space and how do you do that efficiently, right? There are techniques that are evolving in terms of handling that kind of information. In terms of the actual algorithms, there are different types of innovations happening in that space. But I think, perhaps, the most important one is that things that used to take weeks or days to train and test can now be done in days or minutes, right?
The acceleration that comes from GPUs, for example, allows you to test out different algorithms, different models, and say, okay, well, this performs well enough for me to roll it out and try this out, right? It gives you a very quick cycle of innovation. >> The time to value is really compressed. Okay, now let's take one that's not so good. Ad recommendations, the Google ads that pop up. One in a hundred are maybe relevant, if that, right? And they pop up on the screen and they're annoying. I worry that Siri's listening somehow. I talk to my wife about Israel and then next thing I know, I'm getting ads for going to Israel. Is that a coincidence or are they listening? What's happening there? >> I don't know about what Google's doing. I can't comment on that. (laughter) I don't want to comment on that. >> Maybe just from a technology perspective. >> From a technology perspective, this notion of understanding what is in the customer's mind and really getting to a customer segment of one, this is of top interest for many, many organizations. Regardless of which industry you are in, insurance or banking or retail, doesn't matter, right? And it all comes down to the fundamental principles of how efficiently you can do this. Now, can you identify the features that have the most predictive power? This is a level of sophistication in terms of the feature engineering, in terms of collapsing that space of features that I had talked about, and then, how do I actually go do the science of this? How do I do the exploratory analysis? How do I actually build and test my machine learning models quickly? Do the tools allow me to be very productive about this? Or do I spend weeks and weeks coding in lower-level formats? Or do I get help, do I get guided interfaces which guide me through the process, right? And then, the topic of acceleration we talked about, right? These things come together, and then couple that with cognitive APIs.
For example, speech to text, the word (mumbles) have gone down dramatically now. So as you talk on the phone, with a very high accuracy, we can understand what is being talked about. Image recognition, the accuracy has gone up dramatically. You can create custom classifiers for industry-specific topics that you want to identify in pictures. Natural language processing, natural language understanding, all of these have evolved in the last few years. And all these come together. So machine learning's not an island. All these things coming together is what makes these dramatic advancements possible. >> Well, John, if you've figured out anything over the past 20 minutes or so, it's that Dave and I want ads delivered that matter and we want our help desk questions answered right away. (laughter) So if you can help us with that, you're welcome back on the Cube anytime, okay? >> We will try, John. >> That's all we want, that's all we ask. >> You guys, your calls are still being screened. (laughter) >> John Thomas, thank you for joining us, we appreciate that. >> Thank you. >> Our panel discussion coming up at 4:00 Eastern time. Live here on the Cube, we're in New York City. Be back in a bit. (upbeat music)
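The fraud-scoring pattern discussed in this interview, collapsing a wide feature space and then scoring a single transaction within the span of a card swipe, can be sketched roughly as follows. This is an illustrative toy, not IBM's or anyone's production fraud model: the data is synthetic, the feature counts are scaled far down, and PCA plus logistic regression merely stand in for whatever feature-collapsing and scoring techniques a real system would use.

```python
import time

import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 500))               # 500 raw features per transaction (toy scale)
y = (X[:, 0] + 0.5 * X[:, 1] > 1).astype(int)  # synthetic "fraudulent" label

# Collapse the feature space, then fit a classifier on the reduced space.
scorer = make_pipeline(PCA(n_components=20), LogisticRegression(max_iter=1000))
scorer.fit(X, y)

# Score one incoming transaction and time it, since the prediction
# has to fit inside the span of the swipe.
tx = X[:1]
start = time.perf_counter()
fraud_prob = scorer.predict_proba(tx)[0, 1]
elapsed_ms = (time.perf_counter() - start) * 1000.0
```

Training the pipeline is the slow, iterative part that GPU acceleration shortens; the deployed scorer only has to run the reduced-space prediction, which is why per-transaction latency can stay in the millisecond range.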
Tricia Wang, Sudden Compass | IBM Data Science For All
>> Narrator: Live from New York City, it's theCUBE covering IBM Data Science For All brought to you by IBM. >> Welcome back here on theCUBE. We are live in New York continuing our coverage here for Data Science for All where all things happen. Big things are happening. In fact, there's a huge event tonight I'm going to tell you about a little bit later on, but Tricia Wang who is our next guest is a part of that panel discussion that you'll want to tune in for live on ibmgo.com. 6 o'clock, but more on that a little bit later on. Along with Dave Vellante, John Walls here, and Tricia Wang now joins us. A first ever for us. How are you doing? >> Good. >> A global tech ethnographer. >> You said it correctly, yay! >> I learned a long time ago when you're not sure slow down. >> A plus already. >> Slow down and breathe. >> Slow down. >> You did a good job. Want to do it one more time? >> A global tech ethnographer. >> Tricia: Good job. >> Studying ethnography and putting ethnography into practice. How about that? >> Really great. >> That's taking on the challenge stretch. >> Now say it 10 times faster in a row. >> How about when we're done? Also co-founder of Sudden Compass. So first off, let's tell our viewers a little bit about Sudden Compass. Then I want to get into the ethnography and how that relates to tech. So let's go first off about Sudden Compass and the origins there. >> So Sudden Compass, we're a consulting firm based in New York City, and we help our partners embrace and understand the complexity of their customers. So whenever there are, wherever there's data and wherever there's people, we are there to help them make sure that they can understand their customers at the end of the day. And customers are really the most unpredictable, the most unknown, and the most difficult to quantify thing for any business. 
We see a lot of our partners really investing in big data data science tools and they're hiring the most amazing data scientists, but we saw them still struggling to make the right decisions, they still weren't getting their ROI, and they certainly weren't growing their customer base. And what we are helping them do is to say, "Look, you can't just rely only on data science. "You can't put it all into only the tool. "You have to think about how to operationalize that "and build a culture around it "and get the right skillsets in place, "and incorporate what we call the thick data, "which is the stuff that's very difficult to quantify, "the unknown, "and then you can figure out "how to best mathematically scale your data models "when it's actually based on real human behavior, "which is what the practice of ethnography is there to help "is to help you understand what do humans actually do, "what is unquantifiable. "And then once you find out those unquantifiable bits "you then have the art and science of figuring out "how do you scale it into a data model." >> Yeah, see that's what I find fascinating about this is that you've got hard and fast, right, data, objective, black and white, very clear, and then you've got people, you know? We all react differently. We have different influences, and different biases, and prejudices, and all that stuff, aptitudes. So you are meshing this art and science. >> Tricia: Absolutely. >> And what is that telling you then about how best to your clients and how to use data (mumbles)? >> Well, we tell our clients that because people are, there are biases, and people are not objective and there's emotions, that all ends up in the data set. 
To think that your data set, your quantitative data set, is free of biases and has somehow been scrubbed of emotion is a total fallacy and it's something that needs to be corrected, because that means decision makers are making decisions based off of numbers thinking that they're objective when in fact they contain all the biases of the very complexity of the humans that they're serving. So, there is an art and science of making sure that when you capture that complexity ... We're saying, "Don't scrub it away." Traditional marketing wants to say, "Put your customers in boxes. "Put them in segments. "Use demographic variables like education, income. "Then you can just put everyone in a box, "figure out where you want to target, "figure out the right channels, "and you buy against that and you reach them." That's not how it works anymore. Customers now are moving faster than corporations. The new networked customer of today has multiple identities and is better understood in relationship to other people. And we're not saying get rid of the data science. We're saying absolutely have it. You need to have scale. What is thick data going to offer you? Not scale, but it will offer you depth. So, that's why you need to combine both to be able to make effective decisions. >> So, I presume you work with a lot of big consumer brands. Is that a safe assumption? >> Absolutely. >> Okay. So, we work with a lot of big tech brands, like IBM and others, and they tend to move at the speed of the CIO, which tends to be really slow and really risk averse, and they're afraid to over rotate and get out over their skis. What do you tell folks like that? Is that a mistake being so cautious in this digital age? >> Well, I think the new CIO is on the cutting edge. I was just at Constellation Research Annual Conference in Half Moon Bay at-- >> Our friend Ray Wang. >> Yeah, Ray Wang.
And I just spoke about this at their Constellation Connected Enterprise where they had the most, I would have to say the most amazing forward thinking collection of CIOs, CTOs, CDOs all in one room. And the conversation there was like, "We cannot afford to be slow anymore. "We have to be on the edge "of helping our companies break new ground." So, investing in tools is not enough. It is no longer enough to be the buyer, and to just have a relationship with your vendor and assume that they will help you deliver all the understanding. So, CIOs and CTOs need to ensure that their teams are diverse, multi-functional, and that they're totally integrated and embedded into the business. And I don't mean just involve a business analyst as if that's cutting edge. I'm saying, "No, you need to make sure that every team "has qualitative people, "and that they're embedded and working closely together." The problem is we don't teach these skills. We're not graduating data scientists or ethnographers who even want to talk to each other. In fact, each side thinks the other side is useless. We're saying, "No, "we need to be able to have these skills "being taught within companies." And you don't need to hire a PhD data scientist or a PhD ethnographer. What we're saying is that these skills can be taught. We need to teach people to be data literate. You've hired the right experts, you have bought the right tools, but we now need to make sure that we're creating data literacy among decision makers so that we can turn these data into insights and then into action. >> Let's peel that back a little bit. Data literate, you're talking about creativity, visualization, combining different perspectives? Where should the educational focus be? >> The educational focus should be, one, on storytelling. Right now, you cannot just be assuming that you can have a decision maker make a decision based on a number or some long PowerPoint report. We have to teach people how to tell compelling stories with data.
And when I say data I'm talking about it needs the human component and it needs the numbers. And so one of the things that I saw, this is really close to my heart, was when I was at Nokia, and I remember I spent a decade understanding China. I really understood China. And when I finally had the insight where I was like, "Look, after spending 10 years there, "following 100 to 200 families around, "I had the insight back in 2009 that look, "your company is about to go out of business because "people don't want to buy your feature phones anymore. "They're going to want to buy smartphones." But, I only had qualitative data, and I needed to work alongside the business analysts and the data scientists. I needed access to their data sets, but I needed us to play together and to be on a team together so that I could scale my insights into quantitative models. And the problem was that, your question is, "What does that look like?" That looks like sitting on a team, having a mandate to say, "You have to play together, "and be able to tell an effective story "to the management and to leadership." But back then they were saying, "No, "we don't even consider your data set "to be worthwhile to even look at." >> We love our candy bar phone, right? It's a killer. >> Tricia: And we love our numbers. We love our surveys that tell us-- >> Market share was great. >> Market share is great. We've done all of the analysis. >> Forget the Razr. >> Exactly. I'm like, "Look, of course your market share was great, "because your surveys were optimized "for your existing business model." So, big data is great if you want to optimize your supply chain, or in systems that are very contained and quantifiable that's more or less fine. You can get optimization. You can get that one to two to five percent. But if you really want to grow your company and you want to ensure its longevity, you cannot just rely on your quantitative data to tell you how to do that.
You actually need thick data for discovery, because you need to find the unknown. >> One of the things you talk about, your passion, is to understand how human perspectives shape the technology we build and how we use it. >> Tricia: Yes, you're speaking my language. >> Okay, so when you think about the development of the iPhone, it wasn't a bunch of surveys that led Steve Jobs to develop the iPhone. I guess the question is does technology lead and shape human perspectives or do human perspectives shape technology? >> Well, it's a dialectical relationship. It's like does a hamburger ... Does the burger shape the bun or does the bun shape the burger? You would never think of asking someone who loves a hamburger that question, because they both shape each other. >> Okay. (laughing) >> So, it's symbiosis here, totally symbiotic. >> Surprise answer. You weren't expecting that. >> No, but it is kind of ... Okay, so you're saying it's not a chicken and egg, it's both. >> Absolutely. And the best companies are attuned to both. The best companies know that. The most powerful companies of the 21st century are obsessed with their customers and they're going to do a great job at leveraging human models to be scaled into data models, and that gap is going to be very, very narrow. You get big data. We're going to see more AI or ML disasters when their data models are really far from their actual human models. That's how we get disasters like Tesco or Target, or even when Google misidentified black people as gorillas. It's because their model of their data was so far from the understanding of humans. And the best companies of the future are going to know how to close that gap, and that means they will have the thick data and big data closely integrated. >> Who's doing that today? It seems like there are no ethics in AI. People are aggressively pursuing AI for profit and not really thinking about the human impacts and the societal impacts. >> Let's look at IBM. They're doing it.
I would say that some of the most innovative projects are happening at IBM with Watson, where people are using AI to solve meaningful social problems. I don't think that has to be-- >> Like IBM For Social Good. >> Exactly, but it's also, it's not just experimental. I think IBM is doing really great stuff using Watson to understand and identify skin cancer, or looking at the ways that people are using AI to understand eye diseases, things that you can do at scale. But also businesses are figuring out how to use AI for actually doing better things. I think some of the most interesting ... We're going to see more examples of people using AI for solving meaningful social problems and making a profit at the same time. I think one really great example is WorkIt: they're using AI. They're actually working with Watson, who they hired to create their engine where union workers can ask questions of Watson that they may not want to ask or that may be too costly to ask. So you can be like, "If I want to take one day off, "will this affect my contract or my job?" That's a very meaningful social problem that unions are now working on, and I think that's a really great example of how Watson is really pushing the edge to solve meaningful social problems at the same time. >> I worry sometimes that that's like the little device that you put in your car for the insurance company to see how you drive. >> How do you brake? How do you drive? >> Do people trust feeding that data to Watson because they're afraid Big Brother is watching? >> That's why we always have to have human intelligence working with machine intelligence. This idea of AI versus humans is a false binary, and I don't even know why we're engaging in those kinds of questions. We're not, clearly, but there are people who are talking about it as if it's one or the other, and I find it to be a total waste of time.
It's like clearly the best AI systems will be integrated with human intelligence, and we need the human training the data with machine learning systems. >> Alright, I'll play the yeah but. >> You're going to play the what? >> Yeah but! >> Yeah but! (crosstalk) >> That machines are replacing humans in cognitive functions. You walk into an airport and there are kiosks. People are losing jobs. >> Right, no that's real. >> So okay, so that's real. >> That is real. >> You agree with that. >> Job loss is real and job replacement is real. >> And I presume you agree that education is at least part of the answer, and training people differently than-- >> Tricia: Absolutely. >> Just straight reading, writing, and arithmetic, but thoughts on that. >> Well what I mean is that, yes, AI is replacing jobs, but the fact that we're treating AI as some kind of rogue machine that is operating on its own without human guidance, that's not happening, and that's not happening right now, and that's not happening in application. And what is more meaningful to talk about is how do we make sure that humans are more involved with the machines, that we always have a human in the loop, and that they're always making sure that they're training in a way where it's bringing up these ethical questions that are very important that you just raised. >> Right, well, and of course a lot of AI people would say it's about prediction and then automation. So think about some of the brands that you serve, consult with, don't they want the machines to make certain decisions for them so that they can affect an outcome? >> I think that people want machines to surface things that are very difficult for humans to do. So if a machine can efficiently surface "here is a pattern that's going on," then that is very helpful. I think we have companies that are saying, "We can automate your decisions," but when you actually look at what they can automate, it's in very contained, quantifiable systems.
It's in systems around their supply chain or logistics. But, you really do not want your machine automating any decision when it really affects people, in particular your customers. >> Okay, so maybe changing the air pressure somewhere on a widget, that's fine, but not-- >> Right, but you still need someone checking that, because will that air pressure create some unintended consequences later on? There's always some kind of human oversight. >> So I was looking at your website, and I always look for, I'm intrigued by interesting, curious thoughts. >> Tricia: Okay, I have a crazy website. >> No, it's very good, but back in your favorite quotes, "Rather have a question I can't answer "than an answer I can't question." So, how do you bring that kind of there's-no-fear-of-failure attitude to the boardroom, to people who have to make big leaps and big decisions and enter this digital transformative world? >> I think that a lot of companies are so fearful of what's going to happen next, and that fear can oftentimes corner them into asking small questions and acting small, where they're just asking how do we optimize something? That's really essentially what they're asking. "How do we optimize X? "How do we optimize this business?" What they're not really asking are the hard questions, the right questions, the discovery level questions that are very difficult to answer, that no big data set can answer. And those are questions ... The questions about the unknown are the most difficult, but that's where you're going to get growth, because when something is unknown that means you have not either quantified it yet or you haven't found the relationship yet in your data set, and that's your competitive advantage. And that's where the boardroom really needs to set the mandate to say, "Look, I don't want you guys only answering "downstream, company-centric questions like, "'How do we optimize XYZ?'" which is still important to answer.
We're saying you absolutely need to pay attention to that, but you also need to ask upstream, very customer-centric questions. And that's very difficult, because all day you're operating inside a company. You have to then step outside of your shoes and leave the building and see the world from a customer's perspective, or from even a non-existing customer's perspective, which is even more difficult. >> The whole know your customer meme has taken off in a big way right now, but I do feel like the pendulum is swinging. Well, I'm sanguine toward AI. It seems to me that ... It used to be that brands had all the power. They had all the knowledge, they knew the pricing, and the consumers knew nothing. The Internet changed all that. I feel like digital transformation and all this AI is an attempt to create that asymmetry again back in favor of the brand. I see people getting very aggressive toward, certainly you see this with Amazon, Amazon I think knows more about me than I know about myself. Should we be concerned about that, and who protects the consumer, or do maybe the benefits outweigh the risks there? >> I think that's such an important question you're asking and it's totally important. A really great TED talk just went up by Zeynep Tufekci where she talks about how the most brilliant data scientists, the most brilliant minds of our day, are working on ad tech platforms that are now being created to essentially do what Kenyatta Cheese calls advertising terrorism, which is that all of this data is being collected so that advertisers have this information about us that could be used to create the future forms of surveillance. And that's why we need organizations to ask the kind of questions that you did. So two organizations that I think are doing a really great job to look at are Data & Society. Founder is Danah Boyd. Based in New York City. This is where I'm an affiliate.
And they have all these programs that really look at digital privacy, identity, ramifications of all these things we're looking at with AI systems. Really great set of researchers. And then Vint Cerf (mumbles) co-founded People-Centered Internet. And I think this is another organization that we really should be looking at, it's based on the West Coast, where they're also asking similar questions of like instead of just looking at the Internet as a one-to-one model, what is the Internet doing for communities, and how do we make sure we leverage the role of communities to protect what the original founders of the Internet created? >> Right, Danah Boyd, CUBE alum. Shout out to Jeff Hammerbacher, founder of Cloudera, who gave us the line that the greatest minds of my generation are trying to get people to click on ads. He quit Cloudera and is now working at Mount Sinai as an MD, amazing, trying to solve cancer. >> John: A lot of CUBE alums out there. >> Yeah. >> And now we have another one. >> Woo-hoo! >> Tricia, thank you for being with us. >> You're welcome. >> Fascinating stuff. >> Thanks for being on. >> It really is. >> Great questions. >> Nice to really just change the lens a little bit, look through it a different way. Tricia, by the way, is part of a panel tonight with Michael Li and Nir Kaldero, who we had earlier on theCUBE, 6 o'clock to 7:15 live on ibmgo.com. Nate Silver also joining the conversation, so be sure to tune in for that live tonight at 6 o'clock. Back with more of theCUBE though right after this. (techno music)
Nir Kaldero, Galvanize | IBM Data Science For All
>> Announcer: Live from New York City, it's theCUBE, covering IBM Data Science For All, brought to you by IBM. >> Welcome back to Data Science For All. This is IBM's event here on the west side of Manhattan, here on theCUBE. We're live, we'll be here all day, along with Dave Vellante, I'm John Walls. Poor Dave had to put up with all that howling music at this hotel last night, kept him up 'til all hours. >> Lots of fun here in the city. >> Yeah, yeah. >> All the crazies out last night. >> Yeah, but the headphones, they worked for ya. Glad to hear that. >> People are already dressed for Halloween, you know what I mean? >> John: Yes. >> In New York, you know what I mean? >> John: All year. >> All the time. >> John: All year. >> 365. >> Yeah. We have with us now the head of data science, and the VP at Galvanize, Nir Kaldero, and Nir, good to see you, sir. Thanks for being with us. We appreciate the time. >> Well of course, my pleasure. >> Tell us about Galvanize. I know you're heavily involved in education in terms of the tech community, but you've got corporate clients, you've got academic clients. You cover the waterfront, and I know data science is your baby. >> Nir: Right. >> But tell us a little bit about Galvanize and your mission there. >> Sure, so Galvanize is the learning community for technology. We provide the training in data science, data engineering, and also modern software engineering. We recently built a very large, fast growing enterprise corporate training department, where we basically help companies become digital, become nimble, and also very data driven, so they can actually go through this digital transformation, and survive in this fourth industrial revolution. We do it across all layers of the business, from the executives, to managers, to data scientists, and data analysts, and kind of transform and upskill all current skills to be modern, to be digital, so companies can actually go through this transformation.
Hit on one of those items you talked about, data driven. >> Nir: Right. >> It seems like a no-brainer, right? The more information you give me, the more analysis I can apply to it, the more I can put it in my business practice, the more money I make, the more my customers are happy. It's a layup, right? >> Nir: It is. >> What is a data driven organization, then? Do you have to convince people that this is where they need to be today? >> Sometimes I need to convince them, but (laughs) anyway, so let's back up a little bit. We are in the midst of the fourth industrial revolution, and in order to survive in this fourth industrial revolution, companies need to become nimble, as I said, become agile, but most importantly become data driven, so the organization can actually best respond to all the predictions that are coming from these very sophisticated machine intelligence models. If the organization immediately can best respond to all of that, companies will be able to enhance the user experience, get insight about their customers, enhance performance, et cetera, and we know that the winners in this revolution, in this era, will be companies who are very digital, that master the skills of becoming a data driven organization, and you know, we can talk more about the transformation, and what it consists of. Do you want me to? >> John: Sure. >> Can I just ask you a question? This fourth wave, this is what, the cognitive machine wave? Or how would you describe it? >> Some people call it artificial intelligence. I think artificial intelligence is like big data, kind of like a buzzword. I think more appropriately, we should call it the machine intelligence industrial revolution. >> Okay. I've got a lot of questions, but carry on. >> So hitting on that, so you see that as being a major era. >> Nir: It's a game changer. >> If you will, not just a chapter, but a major game changer. >> Nir: Yup. >> Why so? >> So, okay, I'll jump in again.
Machines have always replaced man, people. >> John: The automation, right. >> Nir: To some extent. >> But certain machines have replaced certain human tasks, let's say that. >> Nir: Correct. >> But for the first time in history, in this fourth era, machines are replacing humans in cognitive tasks, and that scares a lot of people, because you look at the United States, the median income of the U.S. worker has dropped since 1999, from $55,000 to $52,000, and a lot of people believe it's sort of the hollowing out of that factor that we just mentioned. Education many believe is the answer. You know, Galvanize is an organization that plays a critical role in helping deal with that problem, does it not? >> So, as Mark Zuckerberg says, there is a real love hate relationship with A.I. People love it on one side, because they're excited about all the opportunities that can come from this utilization of machine intelligence, but many people actually are afraid of it. I read a survey a few weeks ago that says that 36% of the population thinks that A.I. will destroy humanity, and will conquer the world. That's a fact, that's what people think. Do I think it's going to happen? I don't think so. I highly believe that education is one of the pillars that can address this fear of machine intelligence, and you spoke a lot about jobs, I could talk about it forever, but just my belief is that machines can actually replace some of our responsibilities, right? Not necessarily take and replace the entire job. Let's talk about lawyers, right? Lawyers currently spend between 40% and 60% of their time writing contracts, or looking at previous cases. The machine can write a contract in two minutes, or look up millions of data points of previous cases in zero time. Why does a lawyer today need to spend 40% to 60% of the time on that? >> Billable hours, that's why. >> It is, so I don't think the machine will replace the job of the lawyer.
I think in the future, the machine replaces some of the responsibilities, like auditing, or writing contracts, or looking at previous cases. >> Menial labor, if you will. >> Yes, but you know, for example, the machine is not that great right now with negotiation skills. So maybe in the future, the job of the lawyer will be mostly around negotiation skills, rather than writing contracts, et cetera, but yeah, you're absolutely right. There is a big fear in the market right now among executives, among people in the public. I think we should educate people about the true implications of machine intelligence in this fourth industrial revolution and era, and education is definitely one of those. >> Well, one of my favorite stories, when people bring up this topic, is when Garry Kasparov lost to the IBM supercomputer, Deep Blue, or whatever it's called. >> Nir: Yup. >> Instead of giving up, what he said is he started a competition, where he proved that humans and machines could beat the IBM supercomputer. So to this day there's a competition where the best chess player in the world is a combination between humans and machines, and so it's that creativity. >> Nir: Imagination. >> Imagination, right, combinatorial effects of different technologies that education, hopefully, can help keep those either way. >> Look, I'm a big fan of neuroscience. I wish I did my PhD in neuroscience, but we are very, very far away from understanding how our brain works. Now to try to imitate the brain when we don't know how the brain works? We are very far away from being in a place where a machine can actually replicate, and really best respond like a human. We don't know how our brain works yet. So we need to do a lot of research on that before we actually really write a very strong, powerful machine intelligence model that can actually replace us as humans, and outdo us.
We can speak about Jeopardy and Watson, and we can speak about AlphaGo, from Google, which kind of outperformed the world champion. These are very specific tasks, right? Again, like the lawyer, the machines can write beautiful contracts with NLP, machines can look at millions and trillions of data points and figure out what's the conclusion there, right? Or summarize text very fast, but they're not necessarily good at negotiation yet. >> So when you think about a digital business, to us a digital business is a business that uses data to differentiate, and serve customers, and maintain customers. So when you talk about data driven, it strikes me that when everybody's saying digital business, digital transformation, it's about a data transformation, how well they utilize data, and if you look at the bell curve of organizations, most are not. Everybody wants to be data driven, many say they are data driven. >> Right. >> Dave: Would you agree most are not? >> I will agree that most companies say that they are data driven, but actually they're not. I work with a lot of Fortune 500 companies on a daily basis. I meet their executives and functional leaders, and actually see their data, and the business problems that they have. Most of them do tend to say that they are data driven, but truly, just ask them if they put data and decisions in the same place every time they have to make a decision, they don't do it. It's a habit that they don't yet have. Companies need to start investing in building what we call a healthy data culture in order to enable and become data driven. Part of it is democratization of data, right? Currently what I see is lots of organizations actually open the data just to the analysts, or the marketers, people who kind of make decisions, who need to make decisions with data, but not throughout the entire organization. I know I always say that everyone in the organization makes decisions on a daily basis, from the barista, to the CEO, right?
And the entire point of becoming data driven is that data can actually help us make better decisions on a daily basis, so how about democratizing the data to everyone? So everyone, from the barista, to the CEO, can actually make better decisions on a daily basis, and companies don't excel yet in doing it. Not every company is as digital as Amazon. Amazon, I think, is actually one of the most digital companies in the world, if you look at the digital index. Not everyone is Google or Facebook. Most companies want to be there, most companies understand that they will not be able to survive in this era if they do not become data driven, so it's a big problem. We try at Galvanize to address this problem, from executive type of education, where we actually meet with the C-level executives in companies, and actually guide them through how to write their data strategy, how to think about prioritizing data investment, to actual implementation of that, and so far we are highly successful. We were able to make a big transformation in very large, important organizations. So I'm actually very proud of it. >> How long are these eras? Is it a century, or more? >> This fourth industrial? >> Yeah. >> Well it's hard to predict that, and I'm not a machine, or Watson. (laughs) >> But certainly more than 50 years, would you say? Or maybe not, I don't know. >> I actually don't think so. I think it's going to be fast, and we're going to move to the next one pretty soon, one that will be even more, with more intelligence, with more data. >> So the reason I ask is there was an article I saw on LinkedIn, and I haven't had time to read it, but it talked about the Four Horsemen, Amazon, Google, Facebook, and Apple, and it said they will all be out of business in 50 years.
Now, I don't know, I think Apple probably has 50 years of cash flow in the bank, but then they said, the one, the author said, if I had to predict one that would survive, it would be Amazon, to your point, because they are so data driven. The premise, again I didn't read the whole thing, was that some new data driven, digital upstart will disrupt them. >> Yeah, and you know, companies like Amazon, and Alibaba lately, that are kind of in a competition with Amazon about who is becoming more data driven, utilizing more machine intelligence, are the ones that invested in these capabilities many, many years ago. It's not that they started investing in it last year, or five years ago. We speak about 15 and 20 years ago. So companies who were really pioneers, and invested very early on, are the ones predicted to survive in the future, and you know, very much aligned. >> Yeah, I'm going to touch on something. It might be a bridge too far, I don't know, but you talk about, Dave brought it up, about replacing human capital, right? Because of artificial intelligence. >> Nir: Yup. >> Is there a reluctance, perhaps, on behalf of executives to embrace that, because they are concerned about their own place? >> Nir: You should be in the room with me. (laughing) >> You provide data, but you also provide that capability to analyze, and make the best informed decision, and therefore, eliminate the human element of a C-suite executive, that maybe they're not as necessary today, or tomorrow, as they were two years ago. >> So it is absolutely true, and there is a lot of fear in the room, especially when I show them robots, they freak out typically, (John and Dave laugh) but the fact is well known. Leaders who will not embrace these skills, and understanding, and help the organization to become agile, nimble, and data driven, will not survive. They will be replaced. So on the one hand, they're afraid of it.
On the other side, they see that if they do not actually do something, and take an action today, they might be replaced in the future. >> Where should organizations start? Hey, I want to be data driven. Where do I start? >> That's a good question. So data science, machine learning, is a top down initiative. It requires a lot of funding. It requires a change in culture and habits. So it has to start from the top. The journey has to start from the executive, from educating the executive about what is data science, what is machine learning, how to prioritize investments in this field, how to build a data driven culture, right? When we speak about data driven, we mainly speak about the culture aspect here, not specifically about the technical side of it. So it has to come from the top, leaders have to incorporate it in the organization, they have to give authority and power to people, they have to put the funding in first, and then, and this is how it's beautiful, you actually see it trickle down through the organization when they have a very powerful CEO that makes a decision, and moves the organization quickly to become data driven, makes executives look at data every time they make a decision, gets them into the habit. When people look up to executives, they try to do the same, and if my boss is an example for me, someone who is looking at data every time he is making a decision, asks the right questions, knows how to prioritize, sets the right goals for me, this helps me, and helps the organization perform better. >> Follow the leader, right? >> Yup. >> Follow the leader. >> Yup, follow the leader. >> Thanks for being with us. >> Nir: Of course, it's my pleasure. >> Pinned this interesting love hate thing that we have going on. >> We should address that. >> Right, right. That's the next segment, how about that? >> Nir Kaldero from Galvanize joining us here live on The Cube. Back with more from New York in just a bit.
Vikram Murali, IBM | IBM Data Science For All
>> Narrator: Live from New York City, it's theCUBE. Covering IBM Data Science For All. Brought to you by IBM. >> Welcome back to New York here on theCUBE. Along with Dave Vellante, I'm John Walls. We're at Data Science For All, IBM's two day event, and we'll be here all day long, wrapping up again with that panel discussion from four to five here Eastern Time, so be sure to stick around all day here on theCUBE. Joining us now is Vikram Murali, who is a program director at IBM, and Vikram, thanks for joining us here on theCUBE. Good to see you. >> Good to see you too. Thanks for having me. >> You bet. So, among your primary responsibilities, The Data Science Experience. So first off, if you would, share with our viewers a little bit about that. You know, the primary mission. You've had two fairly significant announcements. Updates, if you will, here over the past month or so, so share some information about that too if you would. >> Sure, so my team, we build The Data Science Experience, and our goal is to enable data scientists, in their path, to gain insights into data using data science techniques, machine learning, the latest and greatest open source especially, and to be able to collaborate with fellow data scientists, with data engineers, business analysts, and it's all about freedom. Giving freedom to data scientists to pick the tool of their choice, and program and code in the language of their choice. So that's the mission of Data Science Experience, when we started this. The two releases that you mentioned, we had in the last 45 days. There was one in September and then there was one on October 30th. Both of these releases are very significant in the machine learning space especially. We now support Scikit-Learn, XGBoost, and TensorFlow libraries in Data Science Experience. We have deep integration with Hortonworks Data Platform, which is a hallmark of our partnership with Hortonworks.
Something that we announced back in the summer, and this last release of Data Science Experience, two days back, specifically can do authentication with Knox with Hadoop. So now our Hadoop customers, our Hortonworks Data Platform customers, can leverage all the goodies that we have in Data Science Experience. It's more deeply integrated with our Hadoop based environments. >> A lot of people ask me, "Okay, when IBM announces a product like Data Science Experience... You know, IBM has a lot of products in its portfolio. Are they just sort of cobbling together, you know, existing older products, and putting a skin on them? Or are they developing them from scratch?" How can you help us understand that? >> That's a great question, and I hear that a lot from our customers as well. Data Science Experience started off with a design first methodology. And what I mean by that is we are using IBM Design to lead the charge here along with product and development. And we are actually talking to customers, to data scientists, to data engineers, to enterprises, and we are trying to find out what problems they have in data science today and how we can best address them. So it's not about taking older products and just re-skinning them; Data Science Experience, for example, started off as a brand new product: a completely clean slate with completely new code. Now, IBM has done data science and machine learning for a very long time. We have a lot of assets like SPSS Modeler and Stats, and Decision Optimization. And we are re-investing in those products, and we are investing in such a way, and doing product research in such a way, not to make the old fit with the new, but in a way where it fits into the realm of collaboration. How can data scientists leverage our existing products with open source, and how can we do collaboration? So it's not just re-skinning, it's building from the ground up.
>> So this is really important because you say architecturally it's built from the ground up. Because, you know, given enough time and enough money, you know, smart people, you can make anything work. So the reason why this is important is you mentioned, for instance, TensorFlow. You know that down the road there's going to be some other tooling, some other open source project that's going to take hold, and your customers are going to say, "I want that." You've got to then integrate that, or you have to choose whether or not to. If it's a super heavy lift, you might not be able to do it, or do it in time to hit the market. If you architected your system to be able to accommodate that... Future proof is the term everybody uses, so what have you done? How have you done that? I'm sure API's are involved, but maybe you could add some color. >> Sure. So our Data Science Experience and machine learning... It is a microservices based architecture, so we are completely dockerized, and we use Kubernetes under the covers for container orchestration. And all these are tools that are used in The Valley, across different companies, and also in products across IBM as well. So some of these legacy products that you mentioned, we are actually using some of these newer methodologies to re-architect them, and we are dockerizing them, and the microservice architecture actually helps us address issues that we have today as well as be open to development and to taking newer methodologies and frameworks into consideration that may not exist today. So with the microservices architecture, for example, TensorFlow is something that you brought in. We can just spin up a Docker container just for TensorFlow and attach it to our existing Data Science Experience, and it just works.
Same thing with other frameworks like XGBoost, and Keras, and Scikit-Learn; all these are frameworks and libraries that have come up in open source within the last, I would say, one to three years. Previously, integrating them into our product would have been a nightmare. We would have had to re-architect our product every time something came along, but now with the microservice architecture it is very easy for us to continue with those. >> We were just talking to Daniel Hernandez a little bit about the Hortonworks relationship at a high level. One of the things that I've... I mean, I've been following Hortonworks since day one, when Yahoo kind of spun them out. And know those guys pretty well. And they always make a big deal out of, when they do partnerships, it's deep engineering integration. And so they're very proud of that, so I want to sort of test that a little bit. Can you share with our audience the kind of integrations you've done? What you've brought to the table? What Hortonworks brought to the table? >> Yes, so Data Science Experience today can work side by side with Hortonworks Data Platform, HDP. And we could have actually made that work about two, three months back, but, as part of our partnership that was announced back in June, we set up joint engineering teams. We have multiple touch points every day. We call it co-development, and they have put resources in. We have put resources in, and today, especially with the release that came out on October 30th, Data Science Experience can authenticate using Knox. That I previously mentioned, and that was a direct example of our partnership with Hortonworks. So that is phase one. Phase two and phase three are going to be deeper integration, so we are planning on making Data Science Experience an Ambari management pack. And so for a Hortonworks customer, if you have HDP already installed, you don't have to install DSX separately. It's going to be a management pack. You just spin it up.
And the third phase is going to be... We're going to be using YARN for resource management. YARN is very good a resource management. And for infrastructure as a service for data scientist, we can actually delegate that work to YARN. So, Hortonworks, they are putting resources into YARN, doubling down actually. And they are making changes to YARN where it will act as the resource manager not only for the Hadoop and Spark workloads, but also for Data Science Experience workloads. So that is the level of deep engineering that we are engaged with Hortonworks. >> YARN stands for yet another resource negotiator. There you go for... >> John: Thank you. >> The trivia of the day. (laughing) Okay, so... But of course, Hortonworks are big on committers. And obviously a big committer to YARN. Probably wouldn't have YARN without Hortonworks. So you mentioned that's kind of what they're bringing to the table, and you guys primarily are focused on the integration as well as some other IBM IP? >> That is true as well as the notes piece that I mentioned. We have a notes commenter. We have multiple notes commenters on our side, and that helps us as well. So all the notes is part of the HDP package. We need knowledge on our side to work with Hortonworks developers to make sure that we are contributing and making end roads into Data Science Experience. That way the integration becomes a lot more easier. And from an IBM IP perspective... So Data Science Experience already comes with a lot of packages and libraries that are open source, but IBM research has worked on a lot of these libraries. I'll give you a few examples: Brunel and PixieDust is something that our developers love. These are visualization libraries that were actually cooked up by IBM research and the open sourced. And these are prepackaged into Data Science Experience, so there is IBM IP involved and there are a lot of algorithms, mission learning algorithms, that we put in there. So that comes right out of the package. 
>> And you guys, the development teams, are really both in The Valley? Is that right? Or are you really distributed around the world? >> Yeah, so we are. The Data Science Experience development team is in North America between The Valley and Toronto. The Hortonworks team, they are situated about eight miles from where we are in The Valley, so there's a lot of synergy. We work very closely with them, and that's what we see in the product. >> I mean, what impact does that have? Is it... You know, you hear today, "Oh, yeah. We're a virtual organization. We have people all over the world: Eastern Europe, Brazil." How much of an impact is that? To have people so physically proximate? >> I think it has major impact. I mean IBM is a global organization, so we do have teams around the world, and we work very well. With the invent of IP telephoning, and screen-shares, and so on, yes we work. But it really helps being in the same timezone, especially working with a partner just eight miles or ten miles a way. We have a lot of interaction with them and that really helps. >> Dave: Yeah. Body language? >> Yeah. >> Yeah. You talked about problems. You talked about issues. You know, customers. What are they now? Before it was like, "First off, I want to get more data." Now they've got more data. Is it figuring out what to do with it? Finding it? Having it available? Having it accessible? Making sense of it? I mean what's the barrier right now? >> The barrier, I think for data scientist... The number one barrier continues to be data. There's a lot of data out there. Lot of data being generated, and the data is dirty. It's not clean. So number one problem that data scientist have is how do I get to clean data, and how do I access data. There are so many data repositories, data lakes, and data swamps out there. Data scientist, they don't want to be in the business of finding out how do I access data. 
They want to have instant access to data, and-- >> Well if you would let me interrupt you. >> Yeah? >> You say it's dirty. Give me an example. >> So it's not structured data, so data scientist-- >> John: So unstructured versus structured? >> Unstructured versus structured. And if you look at all the social media feeds that are being generated, the amount of data that is being generated, it's all unstructured data. So we need to clean up the data, and the algorithms need structured data or data in a particular format. And data scientist don't want to spend too much time in cleaning up that data. And access to data, as I mentioned. And that's where Data Science Experience comes in. Out of the box we have so many connectors available. It's very easy for customers to bring in their own connectors as well, and you have instant access to data. And as part of our partnership with Hortonworks, you don't have to bring data into Data Science Experience. The data is becoming so big. You want to leave it where it is. Instead, push analytics down to where it is. And you can do that. We can connect to remote Spark. We can push analytics down through remote Spark. All of that is possible today with Data Science Experience. The second thing that I hear from data scientist is all the open source libraries. Every day there's a new one. It's a boon and a bane as well, and the problem with that is the open source community is very vibrant, and there a lot of data science competitions, mission learning competitions that are helping move this community forward. And it's a good thing. The bad thing is data scientist like to work in silos on their laptop. How do you, from an enterprise perspective... How do you take that, and how do you move it? Scale it to an enterprise level? And that's where Data Science Experience comes in because now we provide all the tools. The tools of your choice: open source or proprietary. You have it in here, and you can easily collaborate. 
You can do all the work that you need with open source packages and libraries, bring your own, as well as collaborate with other data scientists in the enterprise. >> So, you're talking about dirty data. I mean, with Hadoop and no schema on write, right? We kind of knew this problem was coming. So technology sort of got us into this problem. Can technology help us get out of it? I mean, from an architectural standpoint. When you think about dirty data, can you architect things in to help? >> Yes. So, if you look at the machine learning pipeline, the pipeline starts with ingesting data and then cleansing or cleaning that data. And then you go into creating a model, training, picking a classifier, and so on. So we have tools built into Data Science Experience, and we're working on tools that will be coming up and down our roadmap, which will help data scientists do that themselves. I mean, they don't have to be really in-depth coders or developers to do that. Python is very powerful. You can do a lot of data wrangling in Python itself, so we are enabling data scientists to do that within the platform, within Data Science Experience. >> If I look at sort of the demographics of the development teams... We were talking about Hortonworks and you guys collaborating. What are they like? I mean, people picture IBM, you know, like this 100 plus year old company. What's the persona of the developers on your team? >> The persona? I would say we have a very young, agile development team, and by that I mean... So we've had six releases this year in Data Science Experience, just for the on premises side of the product, and the cloud side of the product has got huge delivery. We have releases coming out faster than we can code. And it's not just re-architecting it every time, it's about adding features, giving features that our customers are asking for, and not making them wait for three months, six months, one year.
So our releases are becoming a lot more frequent, and customers are loving it. And that is, in part, because of the team. The team is able to evolve. We are very agile, and we have an awesome team. That's all. It's an amazing team. >> But six releases in... >> Yes. We had the initial release in April, and since then we've had about five revisions of the release where we add a lot more features to our existing releases. A lot more packages, libraries, functionality, and so on. >> So you know what monster you're creating now don't you? I mean, you know? (laughing) >> I know, we are setting expectations. >> You still have two months left in 2017. >> We do. >> These are not mainframe release cycles. >> They are not, and that's the advantage of the microservices architecture. I mean, when a customer upgrades, right? They don't have to bring that entire system down to upgrade. You can target one particular part, one particular microservice. You componentize it, and just upgrade that particular microservice. It's become very simple, so... >> Well some of those microservices aren't so micro. >> Vikram: Yeah. Not yet. So it's a balance. >> You're growing, but yeah. >> It's a balance you have to keep. Making sure that you componentize it in such a way that when you're doing an upgrade, it affects just one small piece of it, and you don't have to take everything down. >> Dave: Right. >> But, yeah, I agree with you. >> Well, it's been a busy year for you. To say the least, and I'm sure 2017-2018 is not going to slow down. So, continued success. >> Vikram: Thank you. >> Wish you well with that. Vikram, thanks for being with us here on theCUBE. >> Thank you. Thanks for having me. >> You bet. >> Back with more of Data Science For All. Here in New York City, IBM. Coming up here on theCUBE right after this. >> Cameraman: You guys are clear. >> John: All right. That was great.
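The ingest, cleanse, and model pipeline described in this segment, and the Python data wrangling it mentions, can be sketched in a few lines of plain Python. This is a toy illustration only, not Data Science Experience code; the records, field names, and the stand-in "model" rule are all invented:

```python
# Toy sketch of the ingest -> cleanse -> model pipeline discussed above.
# Records and field names are invented for illustration.

raw_records = [
    {"name": "  Alice ", "age": "34", "spend": "120.50"},
    {"name": "BOB", "age": "", "spend": "80"},        # missing age: dropped
    {"name": "carol", "age": "29", "spend": "oops"},  # bad number: dropped
    {"name": "Dan", "age": "41", "spend": "210.00"},
]

def cleanse(rec):
    """Normalize one raw record; return None if it cannot be repaired."""
    try:
        return {
            "name": rec["name"].strip().title(),
            "age": int(rec["age"]),
            "spend": float(rec["spend"]),
        }
    except (KeyError, ValueError):
        return None

# Cleanse stage: keep only records that survive normalization.
clean = [r for r in (cleanse(r) for r in raw_records) if r is not None]

# "Model" stage: a trivial threshold rule standing in for a real classifier.
high_value = [r["name"] for r in clean if r["spend"] > 100]

print(clean)       # 2 usable records out of 4
print(high_value)  # ['Alice', 'Dan']
```

The point of the sketch is the shape of the work, not the rules themselves: most of the code is the cleanse step, which matches the observation that wrangling dominates the pipeline.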
Daniel Hernandez, Analytics Offering Management | IBM Data Science For All
>> Announcer: Live from New York City, it's theCUBE. Covering IBM Data Science For All. Brought to you by IBM. >> Welcome to the Big Apple. John Walls and Dave Vellante here on theCUBE; we are live at IBM's Data Science For All. Going to be here throughout the day, with a big panel discussion wrapping up our day. So be sure to stick around all day long on theCUBE for that. Dave, always good to be here in New York, is it not? >> Well you know it's been kind of the data science weeks, months. Last week we were in Boston at an event with the chief data officer conference. All the Boston Datarati were there; bring it all down to New York City, getting hardcore really with data science. So it's from chief data officer to the hardcore data scientists. >> The CDO, hot term right now. Daniel Hernandez now joins us as our first guest here at Data Science For All, who's a VP of IBM Analytics. Good to see you, Daniel; thanks for being with us. >> Pleasure. >> Alright, well give us first off your take; let's just step back high level here. Data science, it's certainly been evolving for decades, if you will. First off, how do you define it today? And then, just from the IBM side of the fence, how do you see it in terms of how businesses should be integrating this into their mindset? >> So the way I describe data science simply to my clients is it's using the scientific method to answer questions or deliver insights. It's kind of that simple. Or answering questions quantitatively. So it's a methodology, it's a discipline, it's not necessarily tools. So that's kind of the way I approach describing what it is. >> Okay, and then from the IBM side of the fence, in terms of how wide of a net are you casting these days? I assume it's as big as you can get your arms out. >> So when you think about any particular problem that's a data science problem, you need certain capabilities. We happen to deliver those capabilities. You need the ability to collect, store, manage, any and all data.
You need the ability to organize that data so you can discover it and protect it. You've got to be able to analyze it. Automate the mundane, explain the past, predict the future. Those are the capabilities you need to do data science. We deliver a portfolio of it, including, on the analyze part of our portfolio, our data science tools that we would declare as such. >> So data science for all is very aspirational, and when you guys made the announcement of the Watson Data Platform last fall, one of the things that you focused on was collaboration between data scientists, data engineers, quality engineers, application development, the whole sort of chain. And you made the point that most of the time that data scientists spend is on wrangling data. You're trying to attack that problem, and you're trying to break down the stovepipes between those roles that I just mentioned. All that has to happen before you can actually have data science for all. I mean, that's just data science for all hardcore data people. Where are we in terms of sort of the progress that your clients have made in that regard? >> So you know, I would say there's two major vectors of progress we've made. So if you want data science for all, you need to be able to address people that know how to code and people that don't know how to code. So if you consider kind of the history of IBM in the data science space, especially in SPSS, which has been around for decades, we're mastering and solving data science problems for non-coders. The Data Science Experience really started with embracing coders. Developers that grew up in open source, that lived and learned Jupyter or Python and were more comfortable there. And integration of these is kind of our focus. So that's one aspect: serving the needs of people that know how to code and don't, in the kind of data science role.
And then for all means supporting an entire analytics life cycle, from collecting the data you need in order to answer the question that you're trying to answer, to organizing that information once you've collected it so you can discover it inside of tools like our own Data Science Experience and SPSS, and then of course the set of tools around exploratory analytics. All integrated so that you can do that end to end life cycle. So where clients are, I think they're getting certainly much more sophisticated in understanding that. You know, most people have approached data science as a tool problem, as a data prep problem. It's a life cycle problem. And that's kind of how we're thinking about it. We're thinking about it in terms of, alright, if our job is answering questions, delivering insights through scientific methods, how do we decompose that problem to a set of things that people need to get the job done, serving the individuals that have to work together. >> And when you think about, go back to the days where sort of the data warehouse was king. Something we talked about in Boston last week; it used to be the data warehouse was king, now it's the process that's much more important. But it was very few people had access to that data, you had the elapsed time of getting answers, and the inflexibility of the systems. Has that changed, and to what degree has it changed? >> I think if you were to go ask anybody in business whether or not they have all the data they need to do their job, they would say no. Why? So we've invested in EDW's, we've invested in Hadoop. In part sometimes, the problem might be, I just don't have the data. Most of the time it is, I have the data, I just don't know where it is.
So there's a pretty significant issue around data discoverability. I might have data in my operational systems, I might have data inside my EDW, but I don't have everything inside my EDW; I've stood up one or more data lakes, and to solve a problem like customer segmentation I have data everywhere, so how do I find it and bring it in? >> That seems like that should be a fundamental consideration, right? If you're going to gather this much more information, make it accessible to people. And if you don't, it's a big flaw, it's a big gap, is it not? >> So yes, and I think part of the reason why is because governance professionals, which I am, you know, I've spent quite a bit of time trying to solve governance-related problems, have been focusing pretty maniacally on the compliance, regulatory, and security-related issues. Like, how do we keep people from going to jail, how do we ensure regulatory compliance with things like e-discovery and records, for instance. And it just so happens the same disciplines that you use there, even though in some cases in lighter-weight implementations, are what you need in order to solve this data discovery problem. So the discourse around governance has historically been about compliance, about regulations, about cost takeout, not analytics. And so a lot of our time, certainly in R&D, is spent trying to solve that data discovery problem, which is: how do I discover data using semantics that I have, which as a regular user is not a physical understanding of my data, and once I find it, how am I assured that what I get is what I should get, so that I'm not subject to compliance-related issues or making the company more vulnerable to a data breach. >> Well, so presumably part of that anyway involves automating classification at the point of creation or use, which actually was a technical challenge for a number of years. Has that challenge been solved in your view?
>> I think machine learning is, and in fact later on today I will be doing some demonstrations of technology which will show how we're making the application of machine learning easy. Inside of everything we do, we're applying machine learning techniques, including to classification problems that help us solve the problem. So it could be that we're automatically harvesting technical metadata. Are there business terms that could be automatically extracted that don't require some data steward to have to know and assert, right? Or can we automatically suggest and still have the steward for a case where I need a canonical data model, so that I don't want the machine to tell me everything, but I want the machine to assist the data curation process. We are not just exploring the application of machine learning to solve that data classification problem, which historically was a manual one; we're embedding it into most of the stuff that we're doing. Often you won't even know that we're doing it behind the scenes. >> So that means that oftentimes, well, the machines ideally are making the decisions as to who gets access to what, and that's helping at least automate that governance, but there's a natural friction that occurs. And I wonder if you can talk about the balance sheet, if you will, between information as an asset and information as a liability. You know, the more restrictions you put on that information, the more it constricts, you know, a business user's ability. So how do you see that shaping up? >> I think it's often a people and process problem, not necessarily a technology problem. I don't think as an industry we've figured it out. Certainly a lot of our clients haven't figured out that balance. I mean, there are plenty of conversations I'll go into where I'll talk to a data science team in the same line of business as a governance team, and what the data science team will tell us is, I'm building my own data catalog because the stuff that the governance guys are doing doesn't help me.
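As an aside, the suggest-and-confirm curation Daniel describes can be pictured as a two-step loop: the machine proposes glossary terms, and a data steward confirms or overrides them. The glossary, column names, and substring matching rule below are all invented for the sketch; a real system would use trained classifiers rather than string matching:

```python
# Toy machine-assisted curation: auto-suggest business glossary terms
# for incoming columns, leaving a human steward with the final say.

GLOSSARY = {
    "cust": "Customer ID",
    "dob": "Date of Birth",
    "amt": "Transaction Amount",
}

def suggest_terms(column_names):
    """Machine step: propose a glossary term for each recognizable column."""
    suggestions = {}
    for col in column_names:
        for token, term in GLOSSARY.items():
            if token in col.lower():
                suggestions[col] = term  # a suggestion only, pending review
    return suggestions

def steward_review(suggestions, overrides):
    """Human step: accept suggestions by default, overriding where needed."""
    return {**suggestions, **overrides}

auto = suggest_terms(["CUST_NO", "TXN_AMT", "OPEN_DT"])
final = steward_review(auto, {"TXN_AMT": "Settlement Amount"})
```

The design point is the one made in the interview: the machine assists curation instead of replacing the steward, so unrecognized columns (here, `OPEN_DT`) simply fall through for manual attention.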
And the reason why it doesn't help me is because they're going through this top-down data curation methodology, and I've got a question, I need to go find the data that's relevant, and I might not know what that is straight away. So the CDO function in a lot of organizations is helping bridge that. So you'll see governance responsibilities line up with the CDO, with analytics. And I think that's gone a long way to bridge that gap. But that conversation I was just mentioning is not unique to one or two customers; a lot of customers are still having it, often customers that either haven't started a CDO practice or are still in the early days of one. >> So about that, because this is a fairly new concept being introduced to the workplace, right, the CDO, as opposed to a CIO or CTO, you know, these other roles. I mean, how do you talk to your clients about trying to broaden their perspective on that, and I guess emphasizing the need for them to consider giving somebody sole responsibility, or primary responsibility, for their data, instead of just lumping it in somewhere else? >> So we happen to have one of the best CDOs inside of our group, which is like a handy tool for me. So if I go into a client and it's purporting to be a data science problem, and it turns out they have a data management issue around data discovery, and they haven't yet figured out how to install the process and people design to solve that particular issue, one of the key things I'll do is bring in our CDO and his delegates to have a conversation with them around what we're doing inside of IBM and what we're seeing in other customers, to help institute that practice inside of their own organization. We have forums like the CDO event in Boston last week, which are designed to, you know, it's not designed to be "here's what IBM can do in technology," it's designed to say "here's how the discipline impacts your business, and here are some best practices you should apply."
So ultimately, if I enter into those conversations and find that there's a need, I typically say, all right, tools are part of the problem but not the only issue; let me bring someone in who can describe the people and process related issues, which you've got to get right in order for, in some cases, the tools that I deliver to matter. >> We had Seth Dobrin on last weekend in Boston, and Inderpal Bhandari as well, and he put forth this enterprise, sort of data blueprint if you will. CDOs are sort of-- >> Daniel: We're using that in IBM by the way. >> Well, this is the thing, it's a really well thought out sort of structure that seems to be trickling down to the divisions. And so it's interesting to hear how you're applying Seth's expertise. I want to ask you about the Hortonworks relationship. You guys made a big deal about that this summer. To me it was a no-brainer. Really, what was the point of IBM having a Hadoop distro, and Hortonworks gets this awesome distribution channel. IBM has always had an affinity for open source, so that made sense there. What's behind that relationship, and how's it going? >> It's going awesome. Perhaps what we didn't say, and we probably should have focused on, is the why-customers-care aspect. There are three main use cases that customers are implementing where they were ready even before the relationship; they were asking IBM and Hortonworks to work together. And so we were coming to the table working together as partners before the deeper collaboration we started in June. The first one was bringing data science to Hadoop. So running data science models, doing data exploration where the data is. And if you were to actually rewind the clock on the IBM side and consider what we did prior, we brought the data science experience and machine learning to Z in February. The highest-value transactional data was there.
The next step was bringing data science to, often for a lot of clients, the second most valuable set of data, which is Hadoop. So that was kind of part one. And then we've continued that by bringing the data science experience to the private cloud. So that's one use case: I've got a lot of data, I need to do data science, I want to do it in residence, I want to take advantage of the compute grid I've already laid down, and I want to take advantage of the performance benefits and the integrated security and governance benefits of having these things co-located. So we're bringing the data science experience and HDP and HDF, which are the Hortonworks distributions, way closer together and optimized for each other. Another component of that is that not all data is going to be in Hadoop, as we were describing. Some of it's in an EDW, and that data science job is going to require data outside of Hadoop, and so we brought Big SQL. It was already supporting Hortonworks, we just optimized the stack, and so the combination of the data science experience and Big SQL allows you to do data science against a broader surface area of data. That's kind of play one. Play two is, I've got an EDW, and for cost or agility reasons I want to augment it, or in some cases I might want to offload some data from it to Hadoop. And so the combination of Hortonworks plus Big SQL and our data integration technologies is a perfect combination there, and we have plenty of clients using that for kind of analytics offloading from the EDW. And then the third piece that we're doing quite a bit of engineering and go-to-market work around is governed data lakes. So I want to enable self-service analytics throughout my enterprise. I want self-service analytics tools available to everyone that has access to them. I want to make data available to them, but I want that data to be governed, so that they can discover what's in the lake, and whatever I give them is what they should have access to.
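The governed self-service idea just described, where "whatever I give them is what they should have access to," can be pictured as an entitlement filter in front of the lake. The roles, dataset names, and paths below are hypothetical, chosen only to make the sketch runnable:

```python
# Illustrative governed access for a self-service data lake: a user only
# discovers, and receives, the data sets their entitlements allow.

ENTITLEMENTS = {
    "analyst": {"sales_agg", "web_logs"},
    "data_scientist": {"sales_agg", "web_logs", "customer_pii"},
}

LAKE = {
    "sales_agg": "s3://lake/sales_agg",
    "web_logs": "s3://lake/web_logs",
    "customer_pii": "s3://lake/customer_pii",
}

def discoverable(role):
    """Return only the data sets this role is entitled to see."""
    allowed = ENTITLEMENTS.get(role, set())
    return {name: path for name, path in LAKE.items() if name in allowed}

analyst_view = discoverable("analyst")
# the analyst never even sees customer_pii in discovery results
```

Filtering at discovery time, rather than at read time alone, is what keeps "self-service" and "governed" from being in tension: users can browse freely precisely because the catalog already reflects policy.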
So those are the kind of three tracks that we're working with Hortonworks on, and all of them are producing stunning results inside of clients. >> And so that involves actually some serious engineering as well-- >> Big time. >> It's not just sort of a Barney deal or just a pure go-to-market-- >> It's certainly more than marketecture; it just works. >> Big picture down the road, then. What challenges do you see on your side of the business for the next 12 months? What are you going to tackle, what's that monster out there where you think, okay, this is our next hurdle to get by? >> I forget if Rob said this before, but you'll hear him say it often, and it's statistically proven: the majority of the data that's available is not available to be Googled; it's behind a firewall. And so we started last year with the Watson Data Platform, creating an integrated data and analytics system. What if customers have data that's on-prem that they want to take advantage of? What if they're not ready for the public cloud? How do we deliver public cloud benefits to them when they want to run that workload behind a firewall? So we're doing a significant amount of engineering, really starting with the work that we did on the data science experience, bringing it behind the firewall but still delivering benefits similar to what you would expect if we were delivering it in the public cloud. A major advancement that IBM made is IBM Cloud Private. I don't know if you guys are familiar with that announcement; we made it, I think, two weeks ago. So it's a (mumbles) foundation on top of which we have microservices, on top of which our stack is going to be made available. So when I think of kind of where the future is, you know, our customers ultimately, we believe, want to run data and analytic workloads in the public cloud. How do we get them there, considering they're not there now, in a stepwise fashion that is sensible economically, project-management-wise, culturally?
Without them having to wait. That's kind of the big picture, the big problem space we're spending considerable time thinking through. >> We've been talking a lot about this on theCUBE in the last several months, or even years: people realize they can't just reform their business and stuff it into the cloud. They have to bring the cloud model to their data, wherever that data exists. If it's in the cloud, great. And the key there is you've got to have a capability and a solution that substantially mimics that public cloud experience. That's kind of what you guys are focused on. >> What I tell clients is, if you're ready for certain workloads, especially greenfield workloads, and the capability exists in a public cloud, you should go there now, because you're going to want to go there eventually anyway. And if not, then a vendor like IBM helps you take advantage of that behind a firewall, often in form factors that are ready to go. The integrated analytics system, I don't know if you're familiar with it, includes our super advanced data warehouse, the data science experience, and our query federation technology powered by Big SQL, all in a form factor that's ready to go. You get started there for data and data science workloads, and that's a major step in the direction of the public cloud. >> Alright, well, Daniel, thank you for the time, we appreciate that. We didn't get to touch at all on baseball, but next time, right? >> Daniel: Go Cubbies. (laughing) >> Sore spot with me, but it's alright, go Cubbies. Alright, Daniel Hernandez from IBM, back with more here from Data Science For All, IBM's event here in Manhattan. Back with more on theCUBE in just a bit. (electronic music)
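A note for readers on the query-federation idea mentioned above: the pitch is that the same ANSI-style SQL text runs unchanged no matter which engine holds the data. SQLite stands in below for whatever engine that might be (warehouse, Hadoop, or cloud); this is an illustration of SQL portability, not of IBM's Big SQL API:

```python
# The query text is plain, portable SQL; only the connection decides
# which engine executes it. SQLite is used here purely as a stand-in.

import sqlite3

ANSI_QUERY = """
    SELECT region, COUNT(*) AS n
    FROM sales
    GROUP BY region
    ORDER BY region
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 10.0), ("east", 5.0), ("west", 7.0)])

rows = conn.execute(ANSI_QUERY).fetchall()
# rows == [("east", 2), ("west", 1)] regardless of which engine ran it
```

In a federated setup, the interesting work happens beneath this line: the engine decides where each table physically lives, while the query author sees one logical schema.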
Rob Thomas, IBM | Big Data NYC 2017
>> Voiceover: Live from midtown Manhattan, it's theCUBE! Covering Big Data New York City 2017. Brought to you by, SiliconANGLE Media and as ecosystems sponsors. >> Okay, welcome back everyone, live in New York City this is theCUBE's coverage of, eighth year doing Hadoop World now, evolved into Strata Hadoop, now called Strata Data, it's had many incarnations but O'Reilly Media running their event in conjunction with Cloudera, mainly an O'Reilly media show. We do our own show called Big Data NYC here with our community with theCUBE bringing you the best interviews, the best people, entrepreneurs, thought leaders, experts, to get the data and try to project the future and help users find the value in data. My next guest is Rob Thomas, who is the General Manager of IBM Analytics, theCUBE Alumni, been on multiple times successfully executing in the San Francisco Bay area. Great to see you again. >> Yeah John, great to see you, thanks for having me. >> You know IBM is really been interesting through its own transformation and a lot of people will throw IBM in that category but you guys have been transforming okay and the scoreboard yet has to yet to show in my mind what's truly happening because if you still look at this industry, we're only eight years into what Hadoop evolved into now as a large data set but the analytics game just seems to be getting started with the cloud now coming over the top, you're starting to see a lot of cloud conversations in the air. Certainly there's a lot of AI washing, you know, AI this, but it's machine learning and deep learning at the heart of it as innovation but a lot more work on the analytics side is coming. You guys are at the center of that. What's the update? What's your view of this analytics market? >> Most enterprises struggle with complexity. That's the number one problem when it comes to analytics. It's not imagination, it's not willpower, in many cases, it's not even investment, it's just complexity. 
We are trying to make data really simple to use and the way I would describe it is we're moving from a world of products to platforms. Today, if you want to go solve a data governance problem you're typically integrating 10, 15 different products. And the burden then is on the client. So, we're trying to make analytics a platform game. And my view is an enterprise has to have three platforms if they're serious about analytics. They need a data manager platform for managing all types of data, public, private cloud. They need unified governance so governance of all types of data and they need a data science platform machine learning. If a client has those three platforms, they will be successful with data. And what I see now is really mixed. We've got 10 products that do that, five products that do this, but it has to be integrated in a platform. >> You as an IBM or the customer has these tools? >> Yeah, when I go see clients that's what I see is data... >> John: Disparate data log. >> Yeah, they have disparate tools and so we are unifying what we deliver from a product perspective to this platform concept. >> You guys announce an integrated analytic system, got to see my notes here, I want to get into that in a second but interesting you bring up the word platform because you know, platforms have always been kind of reserved for the big supplier but you're talking about customers having a platform, not a supplier delivering a platform per se 'cause this is where the integration thing becomes interesting. We were joking yesterday on theCUBE here, kind of just kind of ad hoc conceptually like the world has turned into a tool shed. I mean everyone has a tool shed or knows someone that has a tool shed where you have the tools in the back and they're rusty. And so, this brings up the tool conversation, there's too many tools out there that try to be platforms. >> Rob: Yes. >> And if you have too many tools, you're not really doing the platform game right. 
And complexity also turns into when you bought a hammer it turned into a lawn mower. Right so, a lot of these companies have been groping and trying to iterate what their tool was into something else it wasn't built for. So, as the industry evolves, that's natural Darwinism if you will, they will fall to the wayside. So talk about that dynamic because you still need tooling >> Rob: Yes. but tool will be a function of the work as Peter Burris would say, so talk about how does a customer really get that platform out there without sacrificing the tooling that they may have bought or want to get rid of. >> Well, so think about the, in enterprise today, what the data architecture looks like is, I've got this box that has this software on it, use your terms, has these types of tools on it, and it's isolated and if you want a different set of tooling, okay, move that data to this other box where we have the other tooling. So, it's very isolated in terms of how platforms have evolved or technology platforms today. When I talk about an integrated platform, we are big contributors to Kubernetes. We're making that foundational in terms of what we're doing on Private Cloud and Public Cloud is if you move to that model, suddenly what was a bunch of disparate tools are now microservices against a common architecture. And so it totally changes the nature of the data platform in an enterprise. It's a much more fluid data layer. The term I use sometimes is you have data as a service now, available to all your employees. That's totally different than I want to do this project, so step one, make room in the data center, step two, bring in a server. It's a much more flexible approach so that's what I mean when I say platform. >> So operationalizing it is a lot easier than just going down the linear path of provisioning. 
All right, so let's bring up the complexity issue because integrated and unified are two different concepts that kind of mean the same thing depending on how you look at it. When you look at the data integration problem, you've got all this complexity around governance, it's a lot of moving parts of data. How does a customer actually execute without compromising the integrity of their policies that they need to have in place? So in other words, what are the baby steps that someone can take, the customers take through with what you guys are dealing with them, how do they get into the game, how do they take steps towards the outcome? They might not have the big money to push it all at once, they might want to take a risk of risk management approach. >> I think there's a clear recipe for doing this right and we have experience of doing it well and doing it not so well, so over time we've gotten some, I'd say a pretty good perspective on that. My view is very simple, data governance has to start with a catalog. And the analogy I use is, you have to do for data what libraries do for books. And think about a library, the first thing you do with books, card catalog. You know where, you basically itemize everything, you know exactly where it sits. If you've got multiple copies of the same book, you can distinguish between which one is which. As books get older they go to archives, to microfilm or something like that. That's what you have to do with your data. >> On the front end. >> On the front end. And it starts with a catalog. And that reason I say that is, I see some organizations that start with, hey, let's go start ETL, I'll create a new warehouse, create a new Hadoop environment. That might be the right thing to do but without having a basis of what you have, which is the catalog, that's where I think clients need to start. 
>> Well, I would just add one more level of complexity just to kind of reinforce, first of all I agree with you but here's another example that would reinforce this step. Let's just say you write some machine learning and some algorithms and a new policy from the government comes down. Hey, you know, we're dealing with Bitcoin differently or whatever, some GPRS kind of thing happens where someone gets hacked and a new law comes out. How do you inject that policy? You got to rewrite the code, so I'm thinking that if you do this right, you don't have to do a lot of rewriting of applications to the library or the catalog will handle it. Is that right, am I getting that right? >> That's right 'cause then you have a baseline is what I would describe it as. It's codified in the form of a data model or in the form on ontology for how you're looking at unstructured data. You have a baseline so then as changes come, you can easily adjust to those changes. Where I see clients struggle is if you don't have that baseline then you're constantly trying to change things on the fly and that makes it really hard to get to this... >> Well, really hard, expensive, they have to rewrite apps. >> Exactly. >> Rewrite algorithms and machine learning things that were built probably by people that maybe left the company, who knows, right? So the consequences are pretty grave, I mean, pretty big. >> Yes. >> Okay, so let's back to something that you said yesterday. You were on theCUBE yesterday with Hortonworks CEO, Rob Bearden and you were commenting about AI or AI washing. You said quote, "You can't have AI without IA." A play on letters there, sequence of letters which was really an interesting comment, we kind of referenced it pretty much all day yesterday. Information architecture is the IA and AI is the artificial intelligence basically saying if you don't have some sort of architecture AI really can't work. 
Which really means models have to be understood, with the learning machine kind of approach. Expand more on that 'cause that was I think a fundamental thing that we're seeing at the show this week, this in New York is a model for the models. Who trains the machine learning? Machines got to learn somewhere too so there's learning for the learning machines. This is a real complex data problem and a half. If you don't set up the architecture it may not work, explain. >> So, there's two big problems enterprises have today. One is trying to operationalize data science and machine learning that scale, the other one is getting the cloud but let's focus on the first one for a minute. The reason clients struggle to operationalize this at scale is because they start a data science project and they build a model for one discreet data set. Problem is that only applies to that data set, it doesn't, you can't pick it up and move it somewhere else so this idea of data architecture just to kind of follow through, whether it's the catalog or how you're managing your data across multiple clouds becomes fundamental because ultimately you want to be able to provide machine learning across all your data because machine learning is about predictions and it's hard to do really good predictions on a subset. But that pre-req is the need for an information architecture that comprehends for the fact that you're going to build models and you want to train those models. As new data comes in, you want to keep the training process going. And that's the biggest challenge I see clients struggling with. So they'll have success with their first ML project but then the next one becomes progressively harder because now they're trying to use more data and they haven't prepared their architecture for that. >> Great point. Now, switching to data science. You spoke many times with us on theCUBE about data science, we know you're passionate about you guys doing a lot of work on that. 
We've observed and Jim Kobielus and I were talking yesterday, there's too much work still in the data science guys plate. There's still doing a lot of what I call, sys admin like work, not the right word, but like administrative building and wrangling. They're not doing enough data science and there's enough proof points now to show that data science actually impacts business in whether it's military having data intelligence to execute something, to selling something at the right time, or even for work or play or consume, or we use, all proof is out there. So why aren't we going faster, why aren't the data scientists more effective, what does it going to take for the data science to have a seamless environment that works for them? They're still doing a lot of wrangling and they're still getting down the weeds. Is that just the role they have or how does it get easier for them that's the big catch? >> That's not the role. So they're a victim of their architecture to some extent and that's why they end up spending 80% of their time on data prep, data cleansing, that type of thing. Look, I think we solved that. That's why when we introduced the integrated analytic system this week, that whole idea was get rid of all the data prep that you need because land the data in one place, machine learning and data science is built into that. So everything that the data scientist struggles with today goes away. We can federate to data on cloud, on any cloud, we can federate to data that's sitting inside Hortonworks so it looks like one system but machine learning is built into it from the start. So we've eliminated the need for all of that data movement, for all that data wrangling 'cause we organized the data, we built the catalog, and we've made it really simple. And so if you go back to the point I made, so one issue is clients can't apply machine learning at scale, the other one is they're struggling to get the cloud. 
I think we've nailed those problems 'cause now with a click of a button, you can scale this to part of the cloud. >> All right, so how does the customer get their hands on this? Sounds like it's a great tool, you're saying it's leading edge. We'll take a look at it, certainly I'll do a review on it with the team but how do I get it, how do I get a hold of this? What do I do, download it, you guys supply it to me, is it some open source, how do your customers and potential customers engage with this product? >> However they want to but I'll give you some examples. So, we have an analytic system built on Spark, you can bring the whole box into your data center and right away you're ready for data science. That's one way. Somebody like you, you're going to want to go get the containerized version, you go download it on the web and you'll be up and running instantly with a highly performing warehouse integrated with machine learning and data science built on Spark using Apache Jupyter. Any developer can go use that and get value out of it. You can also say I want to run it on my desktop. >> And that's free? >> Yes. >> Okay. >> There's a trial version out there. >> That's the open source, yeah, that's the free version. >> There's also a version on public cloud so if you don't want to download it, you want to run it outside your firewall, you can go run it on IBM cloud on the public cloud so... >> Just your cloud, Amazon? >> No, not today. >> John: Just IBM cloud, okay, I got it. >> So there's variety of ways that you can go use this and I think what you'll find... >> But you have a premium model that people can get started out so they'll download it to your data center, is that also free too? >> Yeah, absolutely. >> Okay, so all the base stuff is free. >> We also have a desktop version too so you can download... >> What URL can people look at this? >> Go to datascience.ibm.com, that's the best place to start a data science journey. 
>> Okay, multi-cloud; Common Cloud is what people are calling it, and you guys have the Common SQL engine. What is this product, how does it relate to the whole multi-cloud trend? Customers are looking for multiple clouds. >> Yeah, so Common SQL is the idea of integrating data wherever it is, whatever form it's in, ANSI SQL compliant, so what you would expect from a SQL query, and the type of response you get back, you get that back with Common SQL no matter where the data is. Now when you start thinking multi-cloud you introduce a whole other bunch of factors. Network, latency, all those types of things, so what we talked about yesterday with the announcement of Hortonworks Dataplane, which is kind of extending the YARN environment across multiple clouds, that's something we can plug into. So, I think let's be honest, the multi-cloud world is still pretty early. >> John: Oh, really early. >> Our focus is delivery... >> I don't think it really exists actually. >> I think... >> It's multiple clouds but no one's actually moving workloads across all the clouds, I haven't found any. >> Yeah, I think it's hard for latency reasons today. We're trying to deliver an outstanding... >> But people are saying, I mean this is headroom, I get that, but people are saying, I'd love to have a preferred future of multi-cloud, even though they're kind of getting their own shops in order, retrenching, and re-platforming, but that's not a bad ask. I mean, I'm a user; if I don't like IBM's cloud, or I've got a better service, I want to be able to move around. If Amazon is too expensive I want to move to IBM; you've got product differentiation, I might want to be in your cloud. So again, this is the customer's mindset, right. If you have something really compelling on your cloud, do I have to go all in on IBM Cloud to run my data? You shouldn't have to, right? >> I agree, yeah I don't think any enterprise will go all in on one cloud.
I think it's delusional for people to think that, so you're going to have this multi-cloud world. So the reason, when we built IBM Cloud Private, that we did it on Kubernetes, was we said that can be a substrate, if you will, that provides a level of standards across multiple cloud-type environments. >> John: And it's got some traction too so it's a good bet there. >> Absolutely. >> Rob, final word, just talk about the personas you now engage with from IBM's standpoint. I know you have a lot of great developer stuff going on, you've done some great work, you've got a free product out there, but you still got to make money, you got to provide value to IBM. Who are you selling to, what's the main thing? You've got multiple stakeholders; could you just clarify the stakeholders that you're serving in the marketplace? >> Yeah, I mean, the emerging stakeholder that we speak with more and more than we used to is the chief marketing officer, who has real budgets for data and data science and is trying to change how they're performing their job. That's a major stakeholder, CTOs, CIOs, any C level, >> Chief data officer. >> Chief data officer. You know chief data officers, honestly, it's a mixed bag. Some organizations they're incredibly empowered and they're driving the strategy. Others, they're figureheads, and so you got to know how the organizations do it. >> A puppet for the CFO or something. >> Yeah, exactly. >> Our ops. >> A puppet? (chuckles) So, you got to, you know. >> Well, they're not really driving it, they're not changing it. It's not like we're mandated to go do something; they're maybe governance police or something. >> Yeah, and in some cases that's true. In other cases, they drive the data architecture, the data strategy, and that's somebody that we can engage with right away and help them out so... >> Any events you got coming up? Things happening in the marketplace that people might want to participate in?
I know you guys do a lot of stuff out in the open, events they can connect with IBM, things going on? >> So we do, so we're doing a big event here in New York on November first and second, where we're rolling out a lot of our new data products and cloud products, so that's one coming up pretty soon. The biggest thing we've changed this year is there's such a craving from clients for education, and we've started doing what we're calling Analytics University, where we actually go to clients and we'll spend a day or two days, go really deep on open languages, open source. That's become kind of a new focus for us. >> A lot of re-skilling going on too with the transformation, right? >> Rob: Yes, absolutely. >> All right, Rob Thomas here, General Manager IBM Analytics, inside theCUBE. CUBE alumni, breaking it down, giving his perspective. He's got two books out there, The Data Revolution was the first one. >> Big Data Revolution. >> Big Data Revolution, and the new one is Every Company is a Tech Company. Love that title, which is true; check it out on Amazon. Rob Thomas, Big Data Revolution, first book, and then second book is Every Company is a Tech Company. It's theCUBE live from New York. More coverage after the short break. (theCUBE jingle) (theCUBE jingle) (calm soothing music)
Wrap Up | IBM Fast Track Your Data 2017
>> Narrator: Live from Munich, Germany, it's theCUBE, covering IBM, Fast Track Your Data. Brought to you by IBM. >> We're back. This is Dave Vellante with Jim Kobielus, and this is theCUBE, the leader in live tech coverage. We go out to the events. We extract the signal from the noise. We are here covering a special presentation of IBM's Fast Track Your Data, and we're in Munich, Germany. It's been a day-long session. We started this morning with a panel discussion with five senior-level data scientists that Jim and I hosted. Then we did CUBE interviews in the morning. We cut away to the main tent. Kate Silverton did a very choreographed, scripted, but very well done main keynote set of presentations. IBM made a couple of announcements today, and then we finished up theCUBE interviews. Jim and I are here to wrap. We're actually running on IBMgo.com. We're running live. Hilary Mason talking about what she's doing in data science, and also we got a session on GDPR. You got to log in to see those sessions. So go ahead to IBMgo.com, and you'll find those. Hit the schedule and go to the Hilary Mason and GDPR channels, and check that out, but we're going to wrap now. Jim, two main announcements today. I hesitate to call them big announcements. I mean they were you know just kind of ... I think the word you used last night was perfunctory. You know I mean they're okay, but they're not game changing. So what did you mean? >> Well first of all, when you look at ... Though IBM is not calling this a signature event, it's essentially a signature event. They do these every June or so. You know in the past several years, the signature events have had like a one-track theme, whether it be IBM announcing they're investing deeply in Spark, or IBM announcing that they're focusing on investing in R as the core language for data science development.
This year at this event in Munich, it's really a three-track event, in terms of the broad themes, and I mean they're all important tracks, but none of them is like game-changing. Perhaps IBM doesn't intend them to be, it seems like. One of which is obviously Europe. We're holding this in Munich. And a couple of things of importance to European customers, first and foremost GDPR. The deadline next year, in terms of compliance, is approaching. So sound the alarm, as it were. And IBM has rolled out compliance and governance tools, download-and-go, from the information governance catalog and so forth. Now announcing the consortium with Hortonworks to build governance on top of Apache Atlas, but also IBM announcing that they've opened up a DSX center in England and a machine-learning hub here in Germany, to help their European clients, in those countries especially, to get deeper down into data science and machine learning, in terms of developing those applications. That's important for the audience, the regional audience here. The second track, which is also important, and I alluded to it, is governance. In all of its manifestations you need a master catalog of all the assets for building and maintaining and controlling your data applications and your data science applications. The catalog, the consortium, the various offerings IBM has announced are discussed in great detail. They've brought in customers and partners like Northern Trust to talk about the importance of governance, not just as a compliance mandate, but also as a potential strategy for monetizing your data. That's important. Number three is what I call cloud native data applications, and how the state of the art in developing data applications is moving towards containerized and orchestrated environments that involve things like Docker and Kubernetes. The IBM DB2 developer community edition has been in the market for a few years. The latest version they announced today includes Kubernetes support.
It includes support for JSON, so it's geared towards a new generation of cloud and data apps. What I'm getting at ... Those three core themes are Europe, governance, and cloud native data application development. Each of them is individually important, but none of them is a game changer. And one last thing. Data science and machine learning is one of the overarching envelope themes of this event. They've had Hilary Mason. A lot of discussion there. My sense is I was a little bit disappointed, because there weren't any significant new announcements related to IBM evolving their machine learning portfolio into deep learning or artificial intelligence, in an environment where their direct competitors like Microsoft and Google and Amazon are making a huge push in AI, in terms of their investments. There's a bit of a discussion, and Rob Thomas got to it this morning, about DSX working with PowerAI, the IBM platform. I would like to hear more going forward about IBM investments in these areas. So I thought it was an interesting bunch of announcements. I'll backtrack on perfunctory. I'll just say it was good that they had this for a lot of reasons, but like I said, none of these individual announcements is really changing the game. In fact like I said, I think I'm waiting for the fall, to see where IBM goes in terms of doing something that's actually differentiating and innovative. >> Well I think that the event itself is great. You've got a bunch of partners here, a bunch of customers. I mean it's active. IBM knows how to throw a party. They always have. >> And the sessions are really individually awesome. I mean in terms of what you learn. >> The content is very good. I would agree. The two announcements were sort of, you know, DB2, sort of what I call the community edition. Simpler, easier to download. Even Dave can download DB2. I really don't want to download DB2, but I could, and play with it I guess.
You know I'm not a database guy, but those of you out there that are, go check it out. And the other one was the sort of unified data governance. They tried to tie it in. I think they actually did a really good job of tying it into GDPR. We're going to hear over the next, you know, 11 months, just a ton of GDPR readiness fear, uncertainty and doubt from the vendor community, kind of like we heard with Y2K. We'll see what kind of impact GDPR has. I mean it looks like it's the real deal, Jim. I mean it looks like, you know, this 4% of turnover penalty. The penalties are much more onerous than any other sort of, you know, regulation that we've seen in the past, where you could just sort of fluff it off. Say yeah, just pay the fine. I think you're going to see a lot of, well, pay the lawyers to delay this thing and battle it. >> And one of our people in theCUBE that we interviewed said it exactly right. GDPR is like the inverse of Y2K. In Y2K everybody was freaking out, and it was actually nothing when it came down to it. Whereas nobody on the street is really buzzing. I mean the average person is not buzzing about GDPR, but it's hugely important. And like you said, I mean some serious penalties may be in the works for companies that are not complying, companies not just in Europe, but all around the world who do business with European customers. >> Right, okay, so now bring it back to sort of machine learning, deep learning. You basically said to Rob Thomas, I see machine learning here. I don't see a lot of the deep learning stuff quite yet. He said stay tuned. You know you were talking about TensorFlow and things like that. >> Yeah they supported that ... >> Explain. >> So Rob indicated that IBM very much, like with PowerAI and DSX, provides an open framework, or toolkit, for plugging in your preferred open source machine learning or deep learning toolkit.
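That "open framework" idea is, at bottom, a registry-and-dispatch pattern: the platform stays fixed while backends are plugged in by name. To be clear, this is not IBM's actual API; it's a minimal sketch of the pattern, with toy trainer functions standing in for real TensorFlow, Theano, or MXNet backends:

```python
# Minimal plugin registry: each framework registers a trainer under a
# name, and the platform dispatches to whichever backend the user picks.
TRAINERS = {}

def register(name):
    def wrap(fn):
        TRAINERS[name] = fn
        return fn
    return wrap

@register("toy-mean")
def train_mean(data):
    # Stand-in "model": always predicts the mean of the training data.
    mean = sum(data) / len(data)
    return lambda _x: mean

@register("toy-last")
def train_last(data):
    # Stand-in "model": always predicts the last value it saw.
    return lambda _x: data[-1]

def fit(backend, data):
    # Platform code: unchanged no matter which backend is plugged in.
    return TRAINERS[backend](data)

model = fit("toy-mean", [1.0, 2.0, 3.0])
print(model(None))  # 2.0
```

Swapping "toy-mean" for "toy-last", or for a registered deep learning backend, changes nothing in the calling code, which is the property the open-toolkit pitch is selling.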
And there's a growing range of open source deep learning toolkits beyond, you know, TensorFlow, including Theano and MXNet and so forth, that IBM is supporting within the overall DSX framework, but also within the PowerAI framework. In other words they've got those capabilities. They're sort of burying that message under a bushel basket, at least in terms of this event. Also one of the things that ... I said this to Mena Scoyal. Watson Data Platform, which they launched last fall, very important product. Very important platform for collaboration among data science professionals, in terms of the machine learning development pipeline. I wish there was more about the Watson Data Platform here, about where they're taking it, what the customers are doing with it. Like I said a couple of times, I see Watson Data Platform as very much a DevOps tool for the new generation of developers that are building machine learning models directly into their applications. I'd like to see IBM, going forward, turn Watson Data Platform into a true DevOps platform, in terms of continuous integration of machine learning and deep learning and other statistical models. Continuous training, continuous deployment, iteration. I believe that's where they're going, or where they'll probably be going. I'd like to see more. I'm expecting more along those lines going forward. What I just described about DevOps for data science is a big theme that we're focusing on at Wikibon, in terms of where the industry is going. >> Yeah, yeah. And I want to come back to that again, and get an update on what you're doing within your team, and talk about the research. Before we do that, I mean one of the things we talked about on theCUBE, in the early days of Hadoop, is that the ones who were going to make the money in this big data business were the practitioners. You weren't going to see, you know, these multi-hundred billion dollar valuations come out of the Hadoop world. And so far that prediction has held up well.
It's the Airbnbs and the Ubers and the Spotifys and the Facebooks and the Googles, the practitioners who are applying big data, that are crushing it and making all the money. You see Amazon now buying Whole Foods. That, in our view, is a data play. But who's winning here, in either the vendor or the practitioner community? >> Who's winning are the startups with a hot new idea that's disrupting some industry, or set of industries, with machine learning, deep learning, big data, etc. For example, everybody's waiting, with bated breath, for, you know, self-driving vehicles. And as the ecosystem develops, somebody's going to clean up. One or more companies, companies we've probably never heard of, leveraging everything we're describing here today: data science and containerized distributed applications that involve, you know, deep learning for image analysis and sensor analysis and so forth. Putting it all together in some new fabric that changes the way we live on this planet. But as you said, the platforms themselves, whether they be Hadoop or Spark or TensorFlow, whatever, they're open source. You know, and the fact is, by its very nature, open source based solutions, in terms of profit margins on selling those, inexorably migrate to zero. So you're not going to make any money as a tool vendor, or a platform vendor. You got to make money ... If you're going to make money, you make money, for example, from providing an ecosystem within which innovation can happen. >> Okay, we have a few minutes left. Let's talk about the research that you're working on. What's exciting you these days? >> Right, right. So I think a lot of people know I've been around the analyst space for a long, long time. I've joined the SiliconANGLE Wikibon team just recently. I used to work for a very large solution provider, and what I do here for Wikibon is I focus on data science as the core of next-generation application development.
When I say next-generation application development, it's the development of AI, deep learning, and machine learning, and the deployment of those data-driven statistical assets into all manner of applications. And you look at the hot stuff, like chatbots for example. Transforming the experience in e-commerce on mobile devices. Siri and Alexa and so forth. Hugely important. So what we're doing is we're focusing on AI in everything. We're focusing on containerization and the building of AI microservices, and the ecosystem of the pipelines and the tools that allow you to do that. DevOps for data science, distributed training, federated training of statistical models, and so forth. We are also very much focusing on the whole distributed containerized ecosystem, Docker, Kubernetes and so forth. Where that's going, in terms of changing the state of the art, in terms of application development. Focusing on the API economy. All of those things that you need to wrap around the payload of AI to deliver it into every ... >> So you're focused on that intersection between AI and the related topics and the developer. Who is winning in that developer community? Obviously Amazon's winning. You got Microsoft doing a good job there. Google, Apple, who else? I mean how's IBM doing for example? Maybe name some names. Who impresses you in the developer community? But specifically let's start with IBM. How is IBM doing in that space? >> IBM's doing really well. IBM has, for quite a while, been very good about engaging with the new generation of developers, using Spark and R and Hadoop and so forth to build applications rapidly and deploy them rapidly into all manner of applications. So IBM has very much reached out to, in the last several years, the Millennials for whom all of this, these new tools, have been their core repertoire from the very start. And I think in many ways, like today's DB2 developer community edition, it's very much geared to that market.
Saying, you know, to the cloud native application developer: take a second look at DB2. There's a lot in DB2 that you might bring into your next application development initiative, alongside your Spark toolkit and so forth. So IBM has startup envy. They're a big old company, been around more than a hundred years, and they're trying very much to bootstrap and restart their brand in this new context, in the 21st century. I think they're making a good effort at doing it. In terms of community engagement, they have a really good community engagement program all around the world, in terms of hackathons and developer days, you know, meetups here and there. And they get lots of turnout and very loyal customers, and IBM's got the broadest portfolio. >> So you still bleed a little bit of blue. So I got to squeeze it out of you now here. So let me push a little bit on what you're saying. So DB2 is the emphasis here, trying to position DB2 as appealing for developers, but why not some of the other, you know, acquisitions that they've made? I mean you don't hear that much about Cloudant, dashDB, and things of that nature. You would think that that would be more appealing to some of the developer communities than DB2. Or am I mistaken? Is it IBM sort of going after the core, trying to evolve that core, you know, constituency? >> No, they've done a lot of strategic acquisitions, like Cloudant, and they've acquired graph databases and brought them into their platform. IBM has every type of database or file system that you might need for web or social or Internet of Things. And so for all of the development challenges, IBM has got a really high-quality, fit-for-purpose, best-of-breed underlying data platform for it. They've got huge amounts of developers energized all around the world working on this platform. DB2, in the last several years they've taken all of their platforms, their legacy ... That's the wrong word.
All their existing mature platforms, like DB2, and brought them into the IBM Cloud. >> I think legacy is the right word. >> Yeah, yeah. >> These things have been around for 30 years. >> And they're not going away because they're field-proven and ... >> They are evolving. >> And customers have implemented them everywhere. And they're evolving. If you look at how IBM has evolved DB2 in the last several years into ... For example they responded to the challenge from SAP HANA. They brought BLU Acceleration, in-memory technology, into DB2 to make it screamingly fast and so forth. IBM has done a really good job of turning around these product groups, and the product architecture is making them cloud first, and then reaching out to a new generation of cloud application developers. Like I said today, things like DB2 developer community edition, it's just the next chapter in this ongoing saga of IBM turning itself around. Like I said, each of the individual announcements today is like, okay, that's interesting. I'm glad to see IBM showing progress. None of them is individually disruptive. I think last week, though, Hortonworks was disruptive, in the sense that IBM recognized that BigInsights didn't really have a lot of traction in the Hadoop space, not as much as they would have wished. Hortonworks very much does, and IBM has cast its lot to work with HDP; but HDP and Hortonworks recognize they haven't achieved any traction with data scientists, therefore DSX makes sense as part of the Hortonworks portfolio. Likewise, Big SQL makes perfect sense as the SQL front end to HDP. I think the teaming of IBM and Hortonworks is propitious of further things that they'll be doing in the future, not just governance, but really putting together a broader cloud portfolio for the next generation of data scientists doing work in the cloud. >> Do you think Hortonworks is a legitimate acquisition target for IBM? >> Of course they are. >> Why would IBM ...
You know, educate us. Why would IBM want to acquire Hortonworks? What does that give IBM? Open source mojo, obviously. >> Yeah mojo. >> What else? >> Strong loyalty with the Hadoop market, with developers. >> The developer angle would supercharge the developer angle, and maybe make it more relevant outside of some of those legacy systems. Is that it? >> Yeah, but also remember that Hortonworks came from Yahoo, the team that developed much of what became Hadoop. They've got an excellent team, a strategic team. So in many ways, you can look at Hortonworks as one part acqui-hire, if they ever do that, and one part really substantial and growing solution portfolio that in many ways is complementary to IBM. Hortonworks is really deep on the governance of Hadoop. IBM has gone there, but I think Hortonworks is even deeper, in terms of their laser focus. >> Ecosystem expansion, and it actually really wouldn't be that expensive of an acquisition. I mean it's you know north of ... Maybe a billion dollars might get it done. >> Yeah. >> You know, so would you pay a billion dollars for Hortonworks? >> Not out of my own pocket. >> No, I mean if you're IBM. You think that would deliver that kind of value? I mean you know how IBM thinks about acquisitions. They're good at acquisitions. They look at the IRR. They have their formula. They blue-wash the companies and they generally do very well with acquisitions. Do you think Hortonworks would fit that profile, that monetization profile? >> I wouldn't say that Hortonworks, in terms of monetization potential, would match, say, what IBM has achieved by acquiring Netezza. >> Cognos. >> Or SPSS. I mean SPSS has been an extraordinarily successful ... >> Well, the day IBM acquired SPSS they tripled the license fees. As a customer I know, ouch, it worked. It was incredibly successful. >> Well, yeah. Cognos was. Netezza was. And SPSS.
Those three acquisitions in the last ten years have been extraordinarily pivotal and successful for IBM, to build what they now have, which is really the most comprehensive portfolio of fit-for-purpose data platforms. In other words, all those acquisitions prepared IBM to duke it out now with their primary competitors in this new field, which are Microsoft, who's newly resurgent, and Amazon Web Services. In other words, the two Seattle vendors. Seattle has come on strong; in big data in the cloud, Seattle is now almost eclipsing Silicon Valley as the locus of innovation and really of customer adoption in the cloud space. >> Quite amazing. Well, Google still hanging in there. >> Oh yeah. >> Alright, Jim. Really a pleasure working with you today. Thanks so much. Really appreciate it. >> Thanks for bringing me on your team. >> And Munich crew, you guys did a great job. Really well done. Chuck, Alex, Patrick wherever he is, and our great makeup lady. Thanks a lot. Everybody back home. We're out. This is Fast Track Your Data. Go to IBMgo.com for all the replays. Youtube.com/SiliconANGLE for all the shows. TheCUBE.net is where we tell you where theCUBE's going to be. Go to wikibon.com for all the research. Thanks for watching everybody. This is Dave Vellante with Jim Kobielus. We're out.
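The DevOps-for-data-science loop discussed in this segment, continuous training, evaluation, and promotion of models, can be reduced to a tiny sketch. Everything below is invented for illustration (the data, the promotion threshold, the closed-form fit); a real pipeline would swap in platform tooling and genuine CI triggers:

```python
def fit_line(xs, ys):
    # Closed-form least squares for y = a*x + b.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

def mse(model, xs, ys):
    # Mean squared error of the fitted line on a dataset.
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

deployed = None
THRESHOLD = 0.1  # promotion gate, invented for this sketch

# Each arriving batch retrains a candidate; it is "deployed" only if it
# clears the evaluation gate -- the CI/CD shape applied to models.
for batch in ([(0, 0.1), (1, 1.9)], [(0, 0.0), (1, 2.0), (2, 4.0)]):
    xs, ys = zip(*batch)
    candidate = fit_line(xs, ys)
    if mse(candidate, xs, ys) < THRESHOLD:
        deployed = candidate

print(deployed)
```

The retrain-evaluate-promote cycle is the whole point: models are treated like builds, with an automated quality gate deciding what reaches production.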