
Jean Francois Puget, IBM | IBM Machine Learning Launch 2017


 

>> Announcer: Live from New York, it's theCUBE, covering the IBM Machine Learning launch event. Brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. >> Alright, we're back. Jean Francois Puget is here, he's the distinguished engineer for machine learning and optimization at IBM Analytics, a CUBE alum. Good to see you again. >> Yes. >> Thanks very much for coming on, big day for you guys. >> Jean Francois: Indeed. >> It's like giving birth every time you guys deliver one of these products. We saw you a little bit in the analyst meeting, pretty well attended. Give us the highlights from your standpoint. What are the key things that we should be focused on in this announcement? >> For most people, machine learning equals machine learning algorithms. When you look at newspapers, blogs, social media, it's all about algorithms. Our view is that, sure, you need algorithms for machine learning, but you need steps before you run algorithms, and after. Before, you need to get data, to transform it, to make it usable for machine learning. And then you run algorithms. These produce models, and then you need to move your models into a production environment. For instance, you use an algorithm to learn from past credit card transaction fraud. You can learn models, patterns, that correspond to fraud. Then you want to use those models, those patterns, in your payment system. And moving from where you run the algorithm to the operational system is a nightmare today, so our value is to automate what you do before you run algorithms, and then what you do after. That's our differentiator. >> I've had some folks on theCUBE in the past who said, years ago actually, "You know what, algorithms are plentiful." I think it was my friend Avi Mehta who made the statement, "Algorithms are free. It's what you do with them that matters." >> Exactly. I believe that open source has won for machine learning algorithms. The future is with open source, clearly. But it solves only a part of the problem you're facing if you want to put machine learning into action. So, exactly what you said: what you do with the results of the algorithm is key. And open source people don't care much about that, for good reasons. They are focusing on producing the best algorithms. We are focusing on creating value for our customers. It's different. >> In terms of, you mentioned open source a couple of times, in terms of customer choice, what's your philosophy with regard to the various tooling and platforms for open source? How do you go about selecting which to support? >> Machine learning is fascinating. It's overhyped, maybe, but it's also moving very quickly. Every year there is new cool stuff. Five years ago, nobody spoke about deep learning. Now it's everywhere. Who knows what will happen next year? Our take is to support open source, to support the top open source packages. We don't know which one will win in the future. We don't even know if one will be enough for all needs. We believe one size does not fit all, so our take is to support a curated list of major open source packages. We start with Spark ML for many reasons, but we won't stop at Spark ML. >> Okay, I wonder if we can talk use cases. Two of my favorites, well, let's just start with fraud. Fraud detection has become much, much better over the past 10 years certainly, but it's still not perfect. I don't know if perfection is achievable, but there are a lot of false positives. How will machine learning affect that?
Can we expect, as consumers, even better fraud detection in more real time? >> If we think of the full life cycle going from data to value, we will provide a better answer. We still use machine learning algorithms to create models, but a model does not tell you what to do. It will tell you, okay, this incoming credit card transaction has a high probability of being fraud, or this one has a lower probability. But then it's up to the designer of the overall application to make decisions, so what we recommend is to use the machine learning prediction, but not only that, and then use, maybe, (murmuring). For instance, if your machine learning model tells you this is fraud with a high probability, say 90%, and this is a customer you know very well, a 10-year customer, then you can be confident that it's fraud. Then say the next prediction tells you this is a 70% probability, but it's a customer of only one week. In a week, we don't know the customer, so the confidence we can put in the machine learning should be low, and there you will not reject the transaction immediately. Maybe you don't approve it automatically, maybe you send a one-time passcode or route it through another verification system, but you don't reject it outright. Really, the idea is to use machine learning predictions as yet another input for making decisions. You're making decisions informed by what you could learn from your past. But it's not replacing human decision-making. That's our approach at IBM; you don't see IBM speak much about artificial intelligence in general because we don't believe we're here to replace humans. We're here to assist humans, so we say augmented intelligence, or assistance. That's the role we see for machine learning. It will give you additional data so that you make better decisions. >> It's not the concept that you object to, it's the term artificial intelligence. It's really machine intelligence, it's not fake. >> I started my career with a PhD in artificial intelligence, I won't say when, but long enough ago. At that time, there were already promises that we would have Terminator in the next decade, and this and that. And the same happened in the '60s, or just after the '60s. And then there was an AI winter, and we have a risk of another AI winter here because some people are raising expectations that are not substantiated, I believe. I don't think the technology is here to replace human decision-making altogether any time soon, but we can help. We can certainly make some professions more efficient, more productive with machine learning. >> Having said that, there are a lot of cognitive functions that are getting replaced, maybe not by so-called artificial intelligence, but certainly by machines and automation. >> Yes, we're automating a number of things, and maybe we won't need to have people do quality checks and will just have an automated vision system detect defects. Sure, we're automating more and more, but this is not new, it has been going on for centuries. >> Well, the list evolves. So, what can humans do that machines can't, and how would you expect that to change? >> We're moving away from IBM machine learning here, but it is interesting. You know, each time there is a capability that a machine can automate, we basically redefine intelligence to exclude it. That's what I foresee. >> Yeah, well, robots a while ago, Stu, couldn't climb stairs, and now, look at that. >> Do we feel threatened because a robot can climb stairs faster than us?
Not necessarily. >> No, it doesn't bother us, right. Okay, question? >> Yeah, so I guess, bringing it back down to the solution that we're talking about today, if I'm now doing the analytics, the machine learning, on the mainframe, how do we make sure that we don't overrun and blow out all our MIPS? >> We recommend, so, we are not using the mainframe's base compute system. We recommend using zIIPs, so additional processors, so as not to overload it. It's a very important point. We claim, okay, if you do everything on the mainframe, you can learn from operational data. You don't want to disturb it, and "you don't want to disturb" takes on a lot of different meanings. One that you just said: you don't want to slow down your operational processing, because you're going to hurt your business. But you also want to be careful. Say we have a payment system where there is a machine learning model predicting fraud probability as part of the system. You don't want a young, bright data scientist to decide that he has a great idea, a great model, and push his model into production without asking anyone. So you want to control that. That's why we insist we are providing governance, and that includes a lot of things, like keeping track of how models were created and from which data sets, so lineage. We also want to have access control and not allow just anyone to deploy a new model, because we make it easy to deploy. So we want to have role-based access, and only someone with some executive role, well, it depends on the customer, but not everybody can update the production system, and we want to support that. And that's something that differentiates us from open source. Open source developers don't care about governance. It's not their problem, but it is our customers' problem, so this solution will come with all the governance and integrity constraints you can expect from us. >> Can you speak to, the first solution's going to be on z/OS, what does the roadmap look like, and what are some of the challenges of rolling this out to other private cloud solutions? >> We are going to ship IBM Machine Learning for Z this quarter. It starts with Spark ML as the base open source. This is interesting, but it's not all there is for machine learning. So that's how we start. We're going to add more in the future. Last week we announced we will ship Anaconda, which is a major distribution for the Python ecosystem, and it includes a number of machine learning open source packages. We announced it for next quarter. >> I believe the press release said down the road things like TensorFlow and H2O are coming. >> Anaconda is announced for next quarter, so we will leverage it when it's out. Then indeed, we have a roadmap to include major open source, so the major open source packages are the ones from Anaconda (murmuring), mostly. Key deep learning, so TensorFlow and probably one or two additional ones, we're still discussing. One that I'm very keen on is called XGBoost, in one word. People don't speak about it in newspapers, but this is what wins all Kaggle competitions. Kaggle is a machine learning competition site. When I say all, I mean all that are not image recognition competitions. >> Dave: And that was ex-- >> XGBoost, X-G-B-O-O-S-T. >> Dave: XGBoost, okay. >> XGBoost, and it's-- >> Dave: X-ray gamma, right? >> It's really a package. When I say we don't know which package will win, XGBoost was introduced a year ago, or maybe a bit more, but not so long ago, and now, if you have structured data, it is the best choice today.
It's really fast-moving, but so, we will support major deep learning packages and major classical machine learning packages, like the ones from Anaconda or XGBoost. The other thing is that we start with Z. We announced in the analyst session that we will have a Power version and a private cloud, meaning x86, version as well. I can't tell you when because it's not firm, but it will come. >> And in public cloud as well, I guess. You've got components in the public cloud today, like the Watson Data Platform, that you've extracted and put here. >> We have extracted part of the Data Science Experience, so we've extracted notebooks and a graphical tool called ModelBuilder from DSX as part of IBM Machine Learning now, and we're going to add more of DSX as we go. But the goal is to really share code and function across private cloud and public cloud. As Rob Thomas defined it, we want private cloud to offer all the features and functionality of public cloud, except that it runs inside a firewall. We are really developing machine learning and Watson Machine Learning on a common code base. It's an internal open source project. We share code, and then we ship on different platforms. >> I mean, you haven't, just now, used the word hybrid. Every now and then IBM does, but do you see that so-called hybrid use case as viable, or do you see it more as some workloads should run on prem, some should run in the cloud, and maybe they'll never come together? >> Machine learning, you basically have two phases: one is training and the other is scoring. I see people moving training to cloud quite easily, unless there is some regulation about data privacy. Training is a good fit for cloud because usually you need a large computing system but only for a limited time, so elasticity is great. But then deployment: if you want to score a transaction inside a CICS transaction, it has to run beside CICS, not in the cloud. If you want to score data on an IoT gateway, you want to score at the gateway, not in a data center. I would say that may not be what people think of first, but what will really drive the split between public cloud, private, and on prem is where you want to apply your machine learning models, where you want to score. For instance, smart watches are turning into health and fitness measurement systems. You want to score your health data on the watch, not on the internet somewhere. >> Right, and in that CICS example that you gave, you'd essentially be bringing the model to the CICS data, is that right? >> Yes, that's what we do. That's the value of machine learning for Z: if you want to score transactions happening on Z, you need to be running on Z. So it's clear, mainframe people don't want to hear about public cloud, so they will be the last ones moving. They have their reasons, but they like the mainframe because it is really, really secure and private. >> Dave: Public cloud's a dirty word. >> Yes, yes, for Z users. At least that's what I was told, and I could check that with many people. But we know that in general the move is toward public cloud, so we want to help people depending on where they are in their journey to the cloud. >> You've got one of those, too. Jean Francois, thanks very much for coming on theCUBE, it was really a pleasure having you back. >> Thank you. >> You're welcome. Alright, keep it right there, everybody. We'll be back with our next guest. This is theCUBE, we're live from the Waldorf Astoria. IBM's machine learning announcement, we'll be right back. (electronic keyboard music)
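
Puget's fraud example above, where the model's probability is just one input alongside business context such as how long you have known the customer, can be sketched in a few lines. This is a minimal illustration of the decision pattern he describes, not IBM's product logic; the function name, thresholds, and return values are all hypothetical.

```python
def decide_transaction(fraud_probability: float, customer_tenure_days: int) -> str:
    """Combine a model's fraud score with business rules, as described above.

    The model supplies a probability; the decision also weighs how well we
    know the customer. All thresholds here are illustrative only.
    """
    if fraud_probability >= 0.90 and customer_tenure_days >= 10 * 365:
        # Long-standing customer, very high score: confident enough to reject.
        return "reject"
    if fraud_probability >= 0.70 and customer_tenure_days <= 7:
        # One-week-old customer: trust the model less, so don't approve it
        # automatically; ask for a one-time passcode instead.
        return "step_up_verification"
    return "approve"


# The two cases Puget walks through in the interview.
print(decide_transaction(0.90, customer_tenure_days=3650))  # -> reject
print(decide_transaction(0.70, customer_tenure_days=7))     # -> step_up_verification
```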
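He also singles out XGBoost as the current best choice for structured (tabular) data. Below is a hedged sketch of what fitting it looks like with its scikit-learn-style API, on synthetic data rather than anything from the interview; the hyperparameters are illustrative, not tuned.

```python
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic structured data standing in for, say, transaction features.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Gradient-boosted trees on tabular data, the use case highlighted above.
clf = xgb.XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```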

Published Date : Feb 15 2017



Wrap Up - IBM Machine Learning Launch - #IBMML - #theCUBE


 

(jazzy intro music) [Narrator] Live from New York, it's theCUBE! Covering the IBM Machine Learning Launch Event, brought to you by IBM. Now, here are your hosts: Dave Vellante and Stu Miniman. >> Welcome back to New York City, everybody. This is theCUBE, the leader in live tech coverage. We've been covering, all morning, the IBM Machine Learning announcement. Essentially what IBM did is they brought machine learning to the z platform. My co-host, Stu Miniman, and I have been talking to a number of guests, and we're going to do a quick wrap here. You know, Stu, my take is, when we first heard about this, and the world first heard about this, we were like, "Eh, okay, that's nice, that's interesting." But what it underscores is IBM's relentless effort to keep z relevant. We saw it with the early Linux stuff, and we're now seeing it with all the open source and Spark tooling. You're seeing IBM make big positioning efforts to bring analytics and transactions together, and the simple point is, a lot of the world's really important data runs on mainframes. You were just quoting some stats, which were pretty interesting. >> Yeah, I mean, Dave, you know, one of the biggest challenges we know in IT is migrating. Moving from one thing to another is really tough. I love the comment from Barry Baker: well, if I need to change my platform, by the time I've moved it, that whole digital transformation, we've missed that window. It's there. We know how long that takes: months, quarters. I was actually watching Twitter, and it looks like Chris Maddern is here. Chris was the architect of Venmo, which my younger sisters, all the millennials that I know, everybody uses. He's here, and he was like, "Almost all the banks, airlines, and retailers still run on mainframes in 2017, and it's growing. Who knew?" You've got a guy here who's developing really cool apps and who finds this interesting, and that's the angle I've been looking at today, Dave: how do you make it easy for developers to leverage these platforms that are already there? The developers aren't going to need to care whether it's a mainframe or a cloud or x86 underneath. IBM is giving you the options, and as a number of our guests said, they're not looking to solve all the problems here. They're taking this really great new type of application using machine learning and making it available on a platform that so many of their customers already use. >> Right, so we heard a little bit of roadmap here: the ML for z goes GA in Q1, and then, we don't have specific timeframes, but we're going to see the Power platform pick this up. We heard from Jean-Francois Puget that they'll have an x86 version, and then obviously a cloud version. It's unclear what that hybrid cloud will look like. It's a little fuzzy right now, but that's something we're watching. Obviously a lot of the model development and training is going to live in the cloud, but the scoring is going to be done locally; that's how the data scientists like to think about these things. So again, Stu, more mainframe relevance. We've got another cycle coming soon for the mainframe. We're two years into the z13. When IBM has mainframe cycles, it tends to give a little bump to earnings. Now, granted, a smaller and smaller portion of the company's business is mainframe, but still, the mainframe drags a lot of other software with it, so it remains a strategic component. So one of the questions we get a lot is, what's IBM doing in so-called hardware?
Of course, IBM says it's all software, but we know they're still selling boxes, right? So, all the hardware guys: EMC, Dell, IBM, HPE, et cetera. A lot of software content, but it's still a hardware business. So there are really two platforms there: there's z and there's Power. And those are both strategic to IBM. It sold its x86 business because it didn't see it as strategic. They just put Bob Picciano in charge of the Power business, so there are obviously real commitments to those platforms. Will they make a dent in the market share numbers? Unclear. It looks like it's steady as she goes, not a dramatic increase in share. >> Yeah, and Dave, I didn't hear anybody come in here and say, well, this offering means let me dump x86 and go buy a mainframe. That's not the target that I heard here. I would have loved to hear a little bit more about where this fits into the broader IoT strategy. We talked a little bit in the intro, Dave: there are a lot of reasons why data is going to stick at the edge when we look at the numbers. For all the growth of public cloud, the amount of data in public cloud hasn't caught up to the equivalent of what it would be in data centers themselves. What I mean by that is, we usually spend, say, 30% on average on storage inside a data center. If we look at public cloud, it's more around 10%. So, at AWS re:Invent, I talked to a number of the ecosystem partners who are starting to see things like data lakes appear in the cloud. This solution isn't in the data lake family, but it sits with the analytics and everything that's happening with streaming and machine learning. There are large repositories of data and huge volumes of transactions happening on the mainframe, and we're just trying to squint through where all the data lives and the new waves of technology coming in. We heard how this can tie into some of the mobile and streaming activities that aren't on the mainframe, so that it can pull them into the decisions, but there's a broader picture that I'm sure IBM will be able to give in the future. >> Well, normally you would expect a platform that is however many decades old the mainframe is, after the whole mainframe downsizing trend, you would expect there to be a managed decline in that business. I mean, you're seeing it in a lot of places now. We've talked about this with things like Symmetrix, right? You minimize and focus the R&D investments, you try to manage costs, you manage the decline of the business. IBM has almost flipped that. They say, okay, we've got DB2, we're going to continue to invest in that platform. We've got our major subsystems, we're going to enhance the platform with open source technologies. We've got a big enough base that we can continue to mine perpetually. The more interesting thing to me about this announcement is that it underscores how IBM is leveraging its analytics platform. So, we saw the announcement of the Watson Data Platform last September, which was sort of this end-to-end data pipeline with collaboration between different personas, which is quite unique in the marketplace, a lot of differentiation there. Still some services. Last week at Spark Summit, I talked to some of the users and some of the partners of the Watson Data Platform. They said it's great, we love it, it's probably the most robust in the marketplace, but it's still a heavy lift. It still requires a fair amount of services, and IBM's still pushing those services.
So a large portion of IBM is still a services company. So, not surprising there, but as I've said many, many times, the challenge IBM has is to really drive that software business, and simplify the deployment and management of that software for its customers, which is something I think it's working hard on. And the other thing is, you're seeing IBM leverage those analytics platforms into different hardware segments, or hardware/cloud segments, whether it's Bluemix, z, or Power, pushing them out through the organization. IBM still has a stack, like Oracle has a stack, so wherever it can push its own stack, it's going to do that, cuz the margins are better. At the same time, I think it understands very well that it's got to offer open source choice. >> Yeah, absolutely, and that's something we heard loud and clear here, Dave, which is what we expect from IBM: choice of language, choice of framework. When I hear the public cloud guys, it's like, "Oh, well, here's the main focus we have, and maybe we'll have a little bit of choice there." Absolutely, the likes of Google and Amazon are working with open source, but at first blush, once IBM fleshes this out, and as we've said, it's Spark to start with others being added on, IBM could have a broader offering than I expect to see from some of the public cloud guys. We'll see. As you know, Dave, Google's got their cloud event in a couple of weeks in San Francisco. We'll be covering that, and of course with Amazon, you expect their regular cadence of announcements. So, definitely a new front in the cloud wars, as it were, for machine learning. >> Excellent! Alright, Stu, we've got to wrap, cuz we're broadcasting the livestream. We've got to go set up for that. Thanks, I really appreciate you coming down here and co-hosting with me. Good event. >> Always happy to come down to the Big Apple, Dave. >> Alright, good. Alright, thanks for watching, everybody! Check out SiliconAngle.com, you'll get all the news from this event and around the world. Check out SiliconAngle.tv for this and other CUBE activities and where we're going to be next. We've got a big spring coming up as winter ends. And check out Wikibon.com for all the research. Thanks guys, good job today, that's a wrap! We'll see you next time. This is theCUBE, we're out. (jazzy music)
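
Both segments come back to the same split: train where compute is elastic, then move only the fitted model next to the transactions (CICS, an IoT gateway, a watch) for scoring. The sketch below shows that generic pattern with scikit-learn and joblib; it is not IBM's deployment mechanism, and the artifact name is made up.

```python
import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Training side (elastic environment, e.g. a cloud cluster): fit and persist.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)
joblib.dump(model, "fraud_model.joblib")  # hypothetical artifact name

# Scoring side (beside the operational system): load and score locally,
# without calling back out to wherever training happened.
local_model = joblib.load("fraud_model.joblib")
fraud_probability = local_model.predict_proba(X[:1])[0, 1]
print(f"fraud probability: {fraud_probability:.2f}")
```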

Published Date : Feb 15 2017

