
Luis Ceze, OctoML | Amazon re:MARS 2022


 

(upbeat music) >> Welcome back, everyone, to theCUBE's coverage here live on the floor at AWS re:MARS 2022. I'm John Furrier, host for theCUBE. Great event: machine learning, automation, robotics, space, that's MARS. It's part of the re-series of events, re:Invent's the big event at the end of the year, re:Inforce, security, re:MARS, really the intersection of the future of space, industrial, automation, which is very heavily DevOps and machine learning, which is AI. We have Luis Ceze here, who's the CEO and co-founder of OctoML. Welcome to theCUBE. >> Thank you very much for having me on the show, John. >> So we've been following you guys. You guys are a growing startup funded by Madrona Venture Capital, one of your backers. You guys are here at the show. This is, I would say, a small show relative to what it's going to be, but a lot of robotics, a lot of space, a lot of industrial kind of edge, but machine learning is the centerpiece of this trend. You guys are in the middle of it. Tell us your story. >> Absolutely, yeah. So our mission is to make machine learning sustainable and accessible to everyone. I say sustainable because it means we're going to make it faster and more efficient, you know, use less human effort, and accessible to everyone, accessible to as many developers as possible, and also accessible on any device. So, we started from an open source project that began at the University of Washington, where I'm a professor. And several of the co-founders were PhD students there. We started with this open source project called Apache TVM that actually had contributions and collaborations from Amazon and a bunch of other big tech companies. And that allows you to get a machine learning model and run it on any hardware: run on CPUs, GPUs, various accelerators, and so on. It was the kernel of our company, and the project's been around for about six years or so. The company is about three years old. 
And we grew from Apache TVM into a whole platform that essentially supports any model on any hardware, cloud and edge. >> So is the thesis that, when it first started, you wanted to be agnostic on platform? >> Agnostic on hardware, that's right. >> Hardware, hardware. >> Yeah. >> What was it like back then? What kind of hardware were you talking about back then? 'Cause a lot's changed, certainly on the silicon side. >> Luis: Absolutely, yeah. >> So take me through the journey, 'cause I could see the progression. I'm connecting the dots here. >> So once upon a time, yeah, no... (both chuckling) >> I walked in the snow with my bare feet. >> You have to be careful, because if you wake up the professor in me, then you're going to be here for two hours, you know. >> Fast forward. >> The abridged version here is that, clearly, machine learning has shown to actually solve real, interesting, high-value problems. And where machine learning runs in the end, it becomes code that runs on different hardware, right? And when we started Apache TVM, which stands for tensor virtual machine, at that time people were just beginning to use GPUs for machine learning, and we already saw that, with a bunch of machine learning models popping up and CPUs and GPUs starting to be used for machine learning, it was clear that there would be an opportunity to run everywhere. >> And GPUs were coming fast. >> GPUs were coming, and a huge diversity of CPUs, of GPUs and accelerators now, and the ecosystem and the system software that maps models to hardware is still very fragmented today. So hardware vendors have their own specific stacks. So Nvidia has its own software stack, and so does Intel, AMD. And honestly, I mean, I hope I'm not being, you know, too controversial here to say that it kind of looks like the mainframe era. We had tight coupling between hardware and software. You know, if you bought IBM hardware, you had to buy IBM OS and IBM database, IBM applications, it was all tightly coupled. 
And if you wanted to use IBM software, you had to buy IBM hardware. So that's kind of what machine learning systems look like today. If you buy a certain big-name GPU, you've got to use their software. And if you use their software, which is pretty good, you have to buy their GPUs, right? So, you know, we wanted to help peel away the model and the software infrastructure from the hardware to give people choice, the ability to run the models where it best suits them. Right? So that includes picking the best instance in the cloud, that's going to give you the right, you know, cost properties, performance properties, or you might want to run it on the edge. You might run it on an accelerator. >> What year was that roughly, when you were doing this? >> We started that project in 2015, 2016. >> Yeah. So that was pre-conventional wisdom. I think TensorFlow wasn't even around yet. >> Luis: No, it wasn't. >> It was, I'm thinking, like 2017 or so. >> Luis: Right. So that was the beginning of, okay, this is an opportunity. AWS, I don't think they had released some of the Nitro stuff that Hamilton was working on. So, they were already kind of going that way. It's kind of like converging. >> Luis: Yeah. >> The space was happening, exploding. >> Right. And the way that was dealt with, and to this day, you know, to a large extent as well, is by backing machine learning models with a bunch of hardware-specific libraries. And we were some of the first ones to say, like, you know what, let's take a compilation approach: take a model and compile it to very efficient code for that specific hardware. And what underpins all of that is using machine learning for machine learning code optimization. Right? But it was way back when. We can talk about where we are today. >> No, let's fast forward. >> That's the beginning of the open source project. >> But that was a fundamental belief, a worldview there. 
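The two ideas Luis describes here, compiling a model down to hardware-specific code and using measurement-driven search ("machine learning for machine learning code optimization") to pick the best implementation, can be sketched generically. The toy autotuner below times candidate implementations of the same kernel on the target and keeps the fastest. It illustrates the approach only; it is not Apache TVM's actual API, and the candidate kernels are hand-written stand-ins for compiler-generated schedules:

```python
import time

# Candidate implementations of the same dot-product "kernel".
# In a real compiler stack these would be generated schedules
# (tilings, vectorization, etc.); here they are two hand-written
# variants for illustration.
def dot_naive(a, b):
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

def dot_builtin(a, b):
    return sum(x * y for x, y in zip(a, b))

def autotune(candidates, a, b, trials=3):
    """Measure each candidate on the target and return the fastest."""
    best, best_time = None, float("inf")
    for fn in candidates:
        start = time.perf_counter()
        for _ in range(trials):
            fn(a, b)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best, best_time = fn, elapsed
    return best

a = [1.0] * 1000
b = [2.0] * 1000
fastest = autotune([dot_naive, dot_builtin], a, b)
print(fastest(a, b))  # both variants agree on the result: 2000.0
```

Real systems search a far larger space of generated implementations per operator and per target, which is why the winning code differs from one chip to the next.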
I mean, you had a real worldview that was logical when you compare it to the mainframe, but not obvious to the machine learning community. Okay, good call, check. Now let's fast forward, okay. Evolution, we'll go through the speed of the years. More chips are coming, you got GPUs, and seeing what's going on in AWS. Wow! Now it's booming. Now I got unlimited processors, I got silicon on chips, I got it everywhere. >> Yeah. And what's interesting is that the ecosystem got even more complex, in fact. Because now there's a cross product between machine learning models, frameworks like TensorFlow, PyTorch, Keras, and so on, and then hardware targets. So how do you navigate that? What we want here, our vision, is to say: folks should focus on making the machine learning models do what they want to do, that solves a problem of high value to them. Right? So model deployment should be completely automatic. Today, it's very, very manual to a large extent. So once you're serious about deploying a machine learning model, you've got to have a good understanding of where you're going to deploy it, how you're going to deploy it, and then, you know, pick out the right libraries and compilers, and we automated the whole thing in our platform. This is why you see the tagline, the booth is right there, like bringing DevOps agility for machine learning, because our mission is to make that fully transparent. >> Well, first of all, I use that line here, 'cause I'm looking at it here live on camera. People can't see, but I use it on a couple of my interviews, because the word agility is very interesting, because that's kind of the test on any kind of approach these days. Agility could be, and I talked to the robotics guys, just having their product be more agile. 
I talked to Pepsi here just before you came on; they had this large-scale data environment because they built an architecture, but that fostered agility. So again, this is an architectural concept, it's a systems view of agility being the output, and removing dependencies, which I think is what you guys were trying to do. >> Only part of what we do. Right? So agility means a bunch of things. First, you know-- >> Yeah, explain. >> Today it takes a couple months to get a model from, when the model's ready, to production; why not turn that into two hours. Agile, literally, physically agile, in terms of wall-clock time. Right? And then the other thing is give you flexibility to choose where your model should run. So, in our deployment, between the demo and the platform expansion that we announced yesterday, you know, we give the ability of getting your model and, you know, getting it compiled, getting it optimized for any instance in the cloud and automatically moving it around. Today, that's not the case. You have to pick one instance and that's what you do. And then you might auto scale with that one instance. So we give the agility of actually running and scaling the model the way you want, and the way that gives you the right SLAs. >> Yeah, I think Swami was mentioning that, not specifically that use case for you, but that use case generally, that scale being moving things around, making them faster, not having to do that integration work. >> Scale, and run the models where they need to run. Like some day you want to have a large-scale deployment in the cloud. You're going to have models on the edge for various reasons, because the speed of light is limited. We cannot make light faster. So, you know, that's physics you cannot change. There are privacy reasons. You want to keep data local, not send it around, to run the model locally. So anyways, it's about giving the flexibility. >> Let me jump in real quick. 
I want to ask this specific question, because you made me think of something. So we were just having a data mesh conversation. And one of the comments that's come out of a few of these data-as-code conversations is data's the product now. So if you can move data to the edge, which everyone's talking about, you know, why move data if you don't have to, but I can move a machine learning algorithm to the edge. 'Cause it's costly to move data. I can move compute, everyone knows that. But now I can move machine learning to anywhere else and not worry about integrating on the fly. So the model is the code. >> It is the product. >> Yeah. And since you said the model is the code, okay, now we're talking even more here. So machine learning models today are not treated as code, by the way. So they do not have any of the typical properties of code. Whenever you write a piece of code and run it, you don't even think about what CPU it runs on, or what kind of instance it runs on. But with a machine learning model, you do. So what we have done is create this fully transparent, automated way of allowing you to treat your machine learning models as if they were a regular function that you call, and then that function could run anywhere. >> Yeah. >> Right. >> That's why-- >> That's better. >> Bringing DevOps agility-- >> That's better. >> Yeah. And you can use existing-- >> That's better, because I can run it on the Artemis too, in space. >> You could, yeah. >> If they have the hardware. (both laugh) >> And that allows you to run your existing, continue to use your existing DevOps infrastructure and your existing people. >> So I have to ask you, 'cause since you're a professor, this is like a masterclass on theCUBE. Thank you for coming on. Professor. (Luis laughing) I'm a hardware guy. I'm building hardware for Boston Dynamics, Spot, the dog; that's the diversity in hardware, it tends to be purpose-driven. 
I got a spaceship, I'm going to have hardware on there. >> Luis: Right. >> It's generally viewed in the community here, by everyone I talk to, and in other communities, that open source is going to drive all software. That's a check. But the scale and integration is super important. And they're also recognizing that hardware is really about the software. And they even said on stage here: hardware is not about the hardware, it's about the software. So if you believe that to be true, then your model checks all the boxes. Are people getting this? >> I think they're starting to. Here is why, right. A lot of companies that were hardware first, that thought about software too late, aren't making it. Right? There's a large number of hardware companies, AI chip companies, that aren't making it. Probably some of them won't make it, unfortunately, just because they started thinking about software too late. I'm so glad to see a lot of the early, I hope I'm not just tooting our own horn here, but Apache TVM, the infrastructure that we built to map models to different hardware, is very flexible. So we see a lot of emerging chip companies, like SiMa.ai's been doing fantastic work, and they use Apache TVM to map algorithms to their hardware. And there's a bunch of others that are also using Apache TVM. That's because you have, you know, an open infrastructure that keeps up to date with all the machine learning frameworks and models and allows you to extend to the chips that you want. So these companies paying attention that early gives them a much higher fighting chance, I'd say. >> Well, first of all, not only are you backable by the VCs 'cause you have pedigree, you're a professor, you're smart, and you get good recruiting-- >> Luis: I don't know about the smart part. >> And you get good recruiting for PhDs out of the University of Washington, which is not too shabby a computer science department. But they want to make money. The VCs want to make money. >> Right. 
>> So you have to make money. So what's the pitch? What's the business model? >> Yeah. Absolutely. >> Share with us what you're thinking there. >> Yeah. The value of using our solution is shorter time to value for your model, from months to hours. Second, you shrink OpEx, because you don't need a specialized, expensive team. Talk about expensive: expensive engineers who can understand machine learning hardware and software engineering to deploy models. You don't need those teams if you use this automated solution, right? So you reduce that. And also, in the process of actually getting a model and getting it specialized to the hardware, making it hardware-aware, we're talking about a very significant performance improvement that leads to lower cost of deployment in the cloud. We're talking about a very significant reduction in costs in cloud deployment. And also enabling new applications on the edge that weren't possible before. It creates, you know, latent value opportunities. Right? So, that's the high-level value pitch. But how do we make money? Well, we charge for access to the platform. Right? >> Usage. Consumption. >> Yeah, and value based. Yeah, so it's consumption and value based. So it depends on the scale of the deployment. If you're going to deploy a machine learning model at a larger scale, chances are that it produces a lot of value. So then we'll capture some of that value in our pricing scale. >> So, you have a direct sales force then to work those deals. >> Exactly. >> Got it. How many customers do you have? Just curious. >> So, the SaaS platform just launched now. So we started onboarding customers. We've been building this for a while. We have a bunch of, you know, partners that we can talk about openly, like, you know, revenue-generating partners, that's fair to say. We work closely with Qualcomm to enable Snapdragon on TVM and hence our platform. We're close with AMD as well, enabling AMD hardware on the platform. 
We've been working closely with two hyperscaler cloud providers that-- >> I wonder who they are. >> I don't know who they are, right. >> Both start with the letter A. >> And they're both here, right. What is that? >> They both start with the letter A. >> Oh, that's right. >> I won't give it away. (laughing) >> Don't give it away. >> One has three, one has four. (both laugh) >> I'm guessing, by the way. >> Then we have customers; actually, early customers have been using the platform from the beginning, in the consumer electronics space in Japan, you know, self-driving car technology as well. As well as some AI-first companies, whose core value, the core business, comes from AI models. >> So, serious, serious customers. They've got deep tech chops. They're integrating, they see this as a strategic part of their architecture. >> That's what I call AI native, exactly. But now we have several enterprise customers in line that we've been talking to. Of course, because now we launched the platform, now we've started onboarding and exploring how we're going to serve it to these customers. But it's pretty clear that our technology can solve a lot of other pain points right now. And we're going to work with them as early customers to go and refine it. >> So, do you sell to the little guys, like us? Would we be customers if we wanted to be? >> You could, absolutely, yeah. >> What do we have to do? Have machine learning folks on staff? >> So, here's what you're going to have to do. Since you can see the booth, others can't. No, but they can certainly, you can try our demo. >> OctoML. >> And you should look at the transparent AI app that's compiled and optimized with our flow, and deployed and built with our flow. That allows you to take your image and do style transfer. You know, you can get you and a pineapple and see what you look like with a pineapple texture. >> We got a lot of transcript and video data. >> Right. Yeah. Right, exactly. 
So, you can use that. Then there's a very clear-- >> But I could use it. You're not blocking me from using it. It's pretty much democratized. >> You can try the demo, and then you can request access to the platform. >> But you get a lot of more serious, deeper customers. But you can serve anybody, is what you're saying. >> Luis: We can serve anybody, yeah. >> All right, so what's the vision going forward? Let me ask this. When did people start getting the epiphany of removing the machine learning from the hardware? Was it recently, a couple years ago? >> Well, on the research side, we helped start that trend a while ago. I don't need to repeat that. But I think the vision that's important here, that I want the audience to take away, is that there's a lot of progress being made in creating machine learning models. So, there are fantastic tools to deal with training data, and creating the models, and so on. And now there's a bunch of models that can solve real problems there. The question is, how do you very easily integrate that into your intelligent applications? Madrona Venture Group has been very vocal and investing heavily in intelligent applications, both end-user applications as well as enablers. So we're an enabler of that, because it's so easy to use our flow to get a model integrated into your application. Now, any regular software developer can integrate that. And that's just the beginning, right? Because, you know, now we have CI/CD integration to keep your models up to date, to continue to integrate, and then there's more downstream support for other features that you normally have in regular software development. >> I've been thinking about this for a long, long time. And I think this whole code, no one thinks about code. Like, I write code, I'm deploying it. I think this idea of machine learning as code, independent of other dependencies, is really amazing. It's so obvious now that you say it. What are the choices now? 
Let's just say that I buy it, I love it, I'm using it. Now what do I have to do if I want to deploy it? Do I have to pick processors? Are there verified platforms that you support? Is there a short list? Is there every piece of hardware? >> We actually can help you. I hope we're not saying we can do everything in the world here, but we can help you with that. So, here's how. When you have the model in the platform, you can actually see how this model runs on any instance of any cloud, by the way. So we support all three major cloud providers. And then you can make decisions. For example, if you care about latency, your model has to run in, at most, 50 milliseconds, because you're going to have interactivity. And then, beyond that, you don't care if it's faster. All you care about is, is it going to run cheaply enough. So we can help you navigate. And we're also going to make it automatic. >> It's like tire kicking in the dealer showroom. >> Right. >> You can test everything out, you can see the simulation. Are they simulations, or are they real tests? >> Oh, no, we run it all on real hardware. So, as I said, we support any instances of any of the major clouds. We actually run on the cloud. But we also support a select number of edge devices today, like ARMs and Nvidia Jetsons. And we have the OctoML cloud, which is a bunch of racks with a bunch of Raspberry Pis and Nvidia Jetsons, and very soon a bunch of mobile phones there too, that can actually run on the real hardware, and validate it, and test it out, so you can see that your model runs performantly and economically enough in the cloud. And it can run on the edge devices-- >> You're a machine learning as a service. Would that be accurate? >> That's part of it, because we're not doing the machine learning model itself. You come with a model and we make it deployable and make it ready to deploy. So, here's why it's important. Let me try. 
There's a large number of really interesting companies that do API models, as in API as a service. You have an NLP model, you have computer vision models, where you call an API endpoint in the cloud. You send an image and you get a description, for example. But it is using a third party. Now, if you want to have your model on your infrastructure, but have the same convenience as an API, you can use our service. So, today, chances are that, if you have a model that you know you want to deploy, there might not be an API for it; we actually automatically create the API for you. >> Okay, so that's why I get the DevOps agility for machine learning is a better description. 'Cause you're not providing the service. You're providing the service of deploying it, like DevOps infrastructure as code. You're now ML as code. >> It's your model, your API, your infrastructure, but all of the convenience of having it ready to go, fully automatic, hands off. >> 'Cause I think what's interesting about this is that it brings the craftsmanship back to machine learning. 'Cause it's a craft. I mean, let's face it. >> Yeah. I want human brains, which are very precious resources, to focus on building those models that are going to solve business problems. I don't want these very smart human brains figuring out how to plumb this in to actually get it running the right way. This should be automatic. That's why we use machine learning, for machine learning, to solve that. >> Here's an idea for you. We should write a book called The Lean Machine Learning. 'Cause the lean startup was all about DevOps. >> Luis: We'd call it machine leaning. No, that's not going to work. (laughs) >> Remember when iteration was the big mantra? Oh, yeah, iterate. You know, that was from DevOps. >> Yeah, that's right. >> This code allowed for standing up stuff fast, double down, we all know the history, what it turned out. That was a good value for developers. >> I couldn't agree more. 
If you don't mind me building on that point: you know, something we see at OctoML, but we also see at Madrona as well, is that there's a trend towards best of breed for each one of the stages of getting a model deployed. From the data aspect of creating the data, to the model creation aspect, to model deployment, and even model monitoring. Right? We develop integrations with all the major pieces of the ecosystem, such that you can integrate, say, with model monitoring to go and monitor how a model is doing, just like you monitor how code is doing in deployment in the cloud. >> It's evolution. I think it's a great step. And again, I love the analogy to the mainframe. I lived during those days. I remember the monolithic, proprietary days, and then, you know, the OSI model kind of blew it open. But that OSI stack never went full stack; it only stopped at TCP/IP. So, I think the same thing's going on here. You see some scalability around it to try to uncouple it, free it. >> Absolutely. And sustainability and accessibility, to make it run faster and make it run on any device that you want, by any developer. So, that's the tagline. >> Luis Ceze, thanks for coming on. Professor. >> Thank you. >> I didn't know you were a professor. That's great to have you on. It was a masterclass in DevOps agility for machine learning. Thanks for coming on. Appreciate it. >> Thank you very much. Thank you. >> Congratulations, again. All right. OctoML here on theCUBE. Really important: uncoupling the machine learning from the hardware specifically. That's only going to make space faster and safer, and more reliable. And that's where the whole theme of re:MARS is. Let's see how they fit in. I'm John Furrier for theCUBE. Thanks for watching. More coverage after this short break. >> Luis: Thank you. (gentle music)
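The instance-selection step Luis described earlier, meet a hard latency SLA first, then minimize cost, comes down to a filter-then-minimize over benchmark results. The sketch below illustrates that logic; the instance names, latencies, and prices are made up for illustration and are not OctoML's or any cloud's actual data:

```python
# Pick the cheapest instance that satisfies a latency SLA.
# Instance names, latencies, and prices below are hypothetical.
def pick_instance(benchmarks, max_latency_ms):
    """benchmarks: list of (name, latency_ms, cost_per_hour) tuples."""
    eligible = [b for b in benchmarks if b[1] <= max_latency_ms]
    if not eligible:
        raise ValueError("no instance meets the latency SLA")
    # Among instances that meet the SLA, minimize hourly cost.
    return min(eligible, key=lambda b: b[2])

benchmarks = [
    ("gpu.large", 12.0, 3.10),
    ("gpu.small", 38.0, 0.95),
    ("cpu.xlarge", 70.0, 0.40),
]
best = pick_instance(benchmarks, max_latency_ms=50)
print(best[0])  # cheapest instance under the 50 ms bound: gpu.small
```

In practice the benchmark table would come from actually running the compiled model on each candidate instance, which is the measurement service described in the interview.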

Published Date : Jun 24 2022



Hemanth Manda, IBM Cloud Pak


 

(soft electronic music) >> Welcome to this CUBE Virtual Conversation. I'm your host, Rebecca Knight. Today, I'm joined by Hemanth Manda. He is the Executive Director, IBM Data and AI, responsible for Cloud Pak for Data. Thanks so much for coming on the show, Hemanth. >> Thank you, Rebecca. >> So we're talking now about the release of Cloud Pak for Data version 3.5. I want to explore it from a lot of different angles, but do you want to just talk a little bit about why it is unique in the marketplace, in particular in accelerating innovation, reducing costs, and reducing complexity? >> Absolutely, Rebecca. I mean, this is something very unique from an IBM perspective. Frankly speaking, this is unique in the marketplace, because what we are doing is bringing together all of our data and AI capabilities into a single offering, a single platform. And we have continued that; as I said, we made it run on any cloud. So we are giving customers the flexibility. So it's innovation across multiple fronts: it's in consolidation, it's in automation, in infusing collaboration, and also in helping customers basically modernize to the cloud-native world and pick their own cloud, which is what we are seeing in the market today. So I would say this is unique across multiple fronts. >> When we talk about any new platform, one of the big concerns is always around internal skills and maintenance tasks. What changes are you introducing with version 3.5 that help clients be more flexible and streamline their tasks? >> Yeah, it's an interesting question. We are doing a lot of things with respect to 3.5, the latest release. Number one, we are simplifying the management of the platform, making it a lot simpler. We are infusing a lot of automation into it. We are embracing the concept of operators that Red Hat OpenShift has introduced into the market. So simple things such as provisioning, installation, upgrades, scaling it up and down, autopilot management. 
So all of that is taken care of as part of the latest release. Also, what we are doing is making collaboration and user onboarding very easy, to drive self-service and user productivity. So overall, this helps basically reduce the cost for our customers. >> One of the things that's so striking is the speed of the innovation. I mean, you've only been in the marketplace for two and a half years. This is already version 3.5. Can you talk a little bit about the sort of innovation that it takes to do this? >> Absolutely. You're right, we've been in the market for slightly over two and a half years; 3.5 is our ninth release. So frankly speaking, for any company, or even for startups, doing nine releases in 2.5 years is unheard of, and definitely unheard of at IBM. So we are acting and behaving like a startup while leveraging the go-to-market and the reach of IBM. So I would say that we are doing a lot here. And as I said before, we're trying to address the unique needs of the market, the need to modernize to cloud-native architectures and move to the cloud, also while addressing the needs of our existing customers, because there are two things we are trying to focus on here. First of all, make sure that we have a modern platform across the different capabilities in data and AI, that's number one. Number two is also, how do we modernize our existing install base? We have a six-plus billion dollar business for data and AI across significant real estate. We're providing a platform, through Cloud Pak for Data, for that existing install base and existing customers to modernize, too. >> I want to talk about how you are addressing the needs of customers, but I want to delve into something you said earlier, and that is that you are behaving like a startup. How do you make sure that your employees have that kind of mindset, that kind of experimental, innovative, creative, resourceful mindset, particularly at a more mature company like IBM? 
What kinds of skills do you try to instill and cultivate in your team? >> That's a very interesting question, Rebecca. I think there's no single answer, I would say. It starts with listening to the customers, paying detailed attention to what's happening in the market, how competitors are reacting, and looking at the startups themselves. What we did uniquely, that I didn't touch upon earlier, is that we are also building an open ecosystem here, so we position ourselves as an open platform. Yes, there's a lot of IBM unique technology here, but we are also leveraging open source, and we have an ecosystem of 50 plus third party ISVs. By doing that, we are able to drive a lot more innovation, a lot faster, because when you are trying to do everything by yourself, it's a bit challenging. But when you're part of an open ecosystem, infusing open source and third parties, it becomes a lot easier. In terms of culture, I just want to highlight one thing. We are making it a point to emphasize speed over being perfect, progress over perfection. And that, I think, is something net new for IBM, because at IBM, we pride ourselves on quality, scalability, trying to be perfect on day one. We didn't do that in this particular case. Initially, when we launched our offering two and a half years back, we tried to be quick to the market; our time to market was prioritized over being perfect. But that is not the case anymore. We've made sure we are exponentially better, and those things have been addressed over the past two and a half years.
Number one, I want to emphasize that building AI is the initial problem most of our customers are concerned about, but in my opinion, that's 10% of the problem. Actually deploying those AI models, managing them, and governing them at scale for the enterprise is a bigger challenge. So what we have is very unique: we have the end-to-end AI lifecycle, with tools all the way from building to deploying, managing, and governing these models. Second, we are introducing net new capabilities as part of the latest release. We have a new service called WMLA, Watson Machine Learning Accelerator, that addresses the unique challenges of deep learning, managing GPUs, et cetera. We are also making the AutoAI capabilities a lot more robust. And finally, we are introducing a net new concept called Federated Learning that allows you to build AI across distributed datasets, which is very unique; I'm not aware of any other vendor doing this. You can actually have your data distributed across multiple clouds, and you can build an aggregated AI model without actually moving the data that is spread across those clouds. And this concept, in my opinion, is going to get a lot more traction as we move forward. >> One of the things that IBM has always been proud of is the way it partners with ISVs and other vendors. Can you talk about how you work with your partners and foster this ecosystem of third-party capabilities that integrate into the platform? >> Yes, it's always a challenge. For this to be a platform, as I said before, you need to be open and you need to build an ecosystem. So we made that a priority since day one, and we have 53 third party ISVs today. It's a chicken-and-egg problem, Rebecca, because you need to showcase success and make it a priority for your partners to onboard and work with you closely. So we invest, we co-invest with our partners, and we take them to market. We have different models.
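The federated learning idea described above, building an aggregated model across distributed datasets so that only model parameters, never raw rows, leave each cloud, can be sketched in miniature. This is an illustrative toy (a one-parameter linear model with hypothetical helper names), not IBM's implementation:

```python
# Toy sketch of federated learning: each "cloud" trains locally on its
# own data; only the model parameter travels, and a coordinator averages
# the locally trained parameters, weighted by partition size.

def local_update(w, data, lr=0.05):
    # One gradient-descent step for y = w * x, run where the data lives.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad

def federated_round(global_w, partitions):
    # Fan the global model out, train locally, then average the results.
    updates = [(local_update(global_w, part), len(part)) for part in partitions]
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total

# Two clouds, each holding its own slice of data drawn from y = 3x.
cloud_a = [(1.0, 3.0), (2.0, 6.0)]
cloud_b = [(3.0, 9.0), (4.0, 12.0)]

w = 0.0
for _ in range(60):
    w = federated_round(w, [cloud_a, cloud_b])
print(round(w, 2))  # converges toward 3.0
```

The aggregated model reaches the slope it would have learned on the pooled data, yet neither partition's rows ever left its cloud, which is the property being described.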
We have a tactical relationship with some of our third party ISVs, and we also have strategic relationships. We partner with them depending on their ability to partner with us, and we invest to make sure that we are not only integrating them technically, but also integrating with them from a go-to-market perspective. >> I wonder if you can talk a little bit about the current environment that we're in. Of course, we're all living through a global health emergency in the form of the COVID-19 pandemic. So much of the knowledge work is being done from home; it is being done remotely. Teams are working asynchronously over different kinds of digital platforms. How have you seen these changes affect your team at IBM? What new capabilities, collaborations, and skills have you seen your team have to gain, and gain quite quickly, in this environment? >> Absolutely. Historically, IBM had quite a portion of our workforce working remotely, so we are used to this, but not at the scale that the current situation has compelled us to. So we made a lot more investments earlier this year in digital technologies, whether it is Zoom and WebEx or other digital tools that help us coordinate and collaborate effectively. So part of it is technical, right? Part of it is also a cultural shift, and that came all the way from our CEO, in terms of making sure that we have the necessary processes in place to ensure that our employees are not getting burnt out, and that they're being productive and effective. So a combination of, I would say, technical investments plus process and leadership initiatives helped us embrace the changes that we've seen today.
What do you see in the next 12 to 24 months changing in terms of how we have re-imagined the future of work? I know you said this was already the ninth release, and you've only been in the marketplace for not even three years. That's incredible innovation and speed. Talk a little bit about changes you see coming down the pike. >> So I think everything that we have done is going to get amplified and accelerated as we move forward: the shift to cloud, embracing AI, adopting AI into business processes to automate and amplify new business models, collaboration, and, to a certain extent, consolidation of the different offerings into platforms. I obviously see all of this being accelerated, and that acceleration will continue as we move forward. And the real challenge I see with our customers and all the enterprises is, I see them in two buckets. There's one bucket which is resisting change and likes to stick to the old concepts, and there's one bucket of enterprises who are embracing the change, moving forward, and actually accelerating this transformation. I think the second bucket will be successful over the next one to five years. If you're in the other bucket, you're going to miss out, and that is getting amplified and accelerated as we speak. >> So for those in the bucket that are resistant to the change, how do you get them onboard? I mean, this is classic change management that they teach at business schools around the world. What advice would you have for those who are resisting the change? >> So, again, frankly speaking, we at IBM are going through that transition, so I can speak from experience. >> Rebecca: You're drinking the Kool-Aid. >> Yeah, I think one way to address this is to take one step at a time, as opposed to completely revolutionizing the way you do your business.
You can transform your business one step at a time while keeping the end objective as your end goal. And I just want to highlight that with Cloud Pak for Data, that's exactly what we are enabling, because we enable you to actually run anywhere you like. So if most of your systems, your data, your models, and your analytics are on-premise, you can start your journey there while you plan for the future of a public cloud or a managed service. So my advice is pretty simple: you start the journey, but you don't need to do it as a big bang. It can be a gradual transformation, but you need to start the journey today. If you don't, you're going to miss out. >> Baby steps. Hey Hermanth Manda, thank you so much for joining us for this Virtual CUBE Conversation. >> Thank you very much, Rebecca. >> I'm Rebecca Knight, stay tuned for more of theCUBE Virtual. (soft electronic music)

Published Date : Nov 20 2020

Rob Thomas, IBM | Change the Game: Winning With AI 2018


 

>> [Announcer] Live from Times Square in New York City, it's theCUBE covering IBM's Change the Game: Winning with AI, brought to you by IBM. >> Hello everybody, welcome to theCUBE's special presentation. We're covering IBM's announcements today around AI. IBM, as theCUBE does, runs sessions and programs in conjunction with Strata, which is down at the Javits, and we're here with Rob Thomas, who's the General Manager of IBM Analytics. Long time Cube alum, Rob, great to see you. >> Dave, great to see you. >> So you guys got a lot going on today. We're here at the Westin Hotel, you've got an analyst event, you've got a partner meeting, you've got an event tonight, Change the Game: Winning with AI at Terminal 5, check that out, ibm.com/WinWithAI, go register there. But Rob, let's start with what you guys have going on, give us the run down. >> Yeah, it's a big week for us, and like many others, it's great when you have Strata, a lot of people in town. So, we've structured a week where, today, we're going to spend a lot of time with analysts and our business partners, talking about where we're going with data and AI. This evening, we've got a broadcast, it's called Winning with AI. What's unique about that broadcast is it's all clients. We've got clients on stage doing demonstrations, how they're using IBM technology to get to unique outcomes in their business. So I think it's going to be a pretty unique event, which should be a lot of fun. >> So this place, it looks like a cool event, a venue, Terminal 5, it's just up the street on the west side highway, probably a mile from the Javits Center, so definitely check that out. Alright, let's talk about, Rob, we've known each other for a long time, we've seen the early Hadoop days, you guys were very careful about diving in, you kind of let things settle and watched very carefully, and then came in at the right time.
But we saw the evolution of so-called Big Data go from a phase of really reducing investments, cheaper data warehousing, and what that did is allowed people to collect a lot more data, and kind of get ready for this era that we're in now. But maybe you can give us your perspective on the phases, the waves that we've seen of data, and where we are today and where we're going. >> I kind of think of it as a maturity curve. So when I go talk to clients, I say, look, you need to be on a journey towards AI. I think probably nobody disagrees that they need something there, the question is, how do you get there? So you think about the steps, it's about, a lot of people started with, we're going to reduce the cost of our operations, we're going to use data to take out cost, that was kind of the Hadoop thrust, I would say. Then they moved to, well, now we need to see more about our data, we need higher performance data, BI data warehousing. So, everybody, I would say, has dabbled in those two area. The next leap forward is self-service analytics, so how do you actually empower everybody in your organization to use and access data? And the next step beyond that is, can I use AI to drive new business models, new levers of growth, for my business? So, I ask clients, pin yourself on this journey, most are, depends on the division or the part of the company, they're at different areas, but as I tell everybody, if you don't know where you are and you don't know where you want to go, you're just going to wind around, so I try to get them to pin down, where are you versus where do you want to go? >> So four phases, basically, the sort of cheap data store, the BI data warehouse modernization, self-service analytics, a big part of that is data science and data science collaboration, you guys have a lot of investments there, and then new business models with AI automation running on top. Where are we today? 
Would you say we're kind of in-between BI/DW modernization and on our way to self-service analytics, or what's your sense? >> I'd say most are right in the middle between BI data warehousing and self-service analytics. Self-service analytics is hard, because it requires you, sometimes to take a couple steps back, and look at your data. It's hard to provide self-service if you don't have a data catalog, if you don't have data security, if you haven't gone through the processes around data governance. So, sometimes you have to take one step back to go two steps forward, that's why I see a lot of people, I'd say, stuck in the middle right now. And the examples that you're going to see tonight as part of the broadcast are clients that have figured out how to break through that wall, and I think that's pretty illustrative of what's possible. >> Okay, so you're saying that, got to maybe take a step back and get the infrastructure right with, let's say a catalog, to give some basic things that they have to do, some x's and o's, you've got the Vince Lombardi played out here, and also, skillsets, I imagine, is a key part of that. So, that's what they've got to do to get prepared, and then, what's next? They start creating new business models, imagining this is where the cheap data officer comes in and it's an executive level, what are you seeing clients as part of digital transformation, what's the conversation like with customers? >> The biggest change, the great thing about the times we live in, is technology's become so accessible, you can do things very quickly. We created a team last year called Data Science Elite, and we've hired what we think are some of the best data scientists in the world. Their only job is to go work with clients and help them get to a first success with data science. So, we put a team in. 
Normally, one month, two months, normally a team of two or three people, our investment, and we say, let's go build a model, let's get to an outcome, and you can do this incredibly quickly now. I tell clients, I see somebody that says, we're going to spend six months evaluating and thinking about this, I was like, why would you spend six months thinking about this when you could actually do it in one month? So you just need to get over the edge and go try it. >> So we're going to learn more about the Data Science Elite team. We've got John Thomas coming on today, who is a distinguished engineer at IBM, and he's very much involved in that team, and I think we have a customer who's actually gone through that, so we're going to talk about what their experience was with the Data Science Elite team. Alright, you've got some hard news coming up, you've actually made some news earlier with Hortonworks and Red Hat, I want to talk about that, but you've also got some hard news today. Take us through that. >> Yeah, let's talk about all three. First, Monday we announced the expanded relationship with both Hortonworks and Red Hat. This goes back to one of the core beliefs I talked about, every enterprise is modernizing their data and application estates, I don't think there's any debate about that. We are big believers in Kubernetes and containers as the architecture to drive that modernization. The announcement on Monday was, we're working closer with Red Hat to take all of our data services as part of Cloud Private for Data, which are basically microservices for data, and we're running those on OpenShift, and we're starting to see great customer traction with that. And where does Hortonworks come in? Hadoop has been the outlier on moving to microservices and containers; we're working with Hortonworks to help them make that move as well. So, it's really about the three of us getting together and helping clients with this modernization journey.
>> So, just to remind people, you remember ODPI, folks? It was all this kerfuffle about, why do we even need this? Well, what's interesting to me about this triumvirate is, well, first of all, Red Hat and Hortonworks are hardcore open source, and IBM's always been a big supporter of open source. You three got together and you're proving now the productivity for customers of this relationship. You guys don't talk about this, but Hortonworks had to, on its public call, that the relationship with IBM drove many, many seven-figure deals, which, obviously, means that customers are getting value out of this, so it's great to see that come to fruition, and it wasn't just a Barney announcement a couple years ago, so congratulations on that. Now, there's this other news that you guys announced this morning, talk about that. >> Yeah, two other things. One is, we announced a relationship with Stack Overflow. 50 million developers go to Stack Overflow a month, it's an amazing environment for developers that are looking to do new things, and we're sponsoring a community around AI. Back to your point before, you said, is there a skills gap in enterprises, there absolutely is, I don't think that's a surprise. Data science, AI developers, not every company has the skills they need, so we're sponsoring a community to help drive the growth of skills in and around data science and AI. So things like Python, R, Scala, these are the languages of data science, and it's a great relationship with us and Stack Overflow to build a community to get things going on skills. >> Okay, and then there was one more. >> Last one's a product announcement. This is one of the most interesting product announcements we've had in quite a while. Imagine this, you write a SQL query, and the traditional approach is, I've got a server, I point it at that server, I get the data, it's pretty limited. We're announcing technology where I write a query, and it can find data anywhere in the world.
I think of it as wide-area SQL. So it can find data on an automotive device, a telematics device, an IoT device, it could be a mobile device, we think of it as SQL for the whole world. You write a query, you can find the data anywhere it is, and we take advantage of the processing power on the edge. The biggest problem with IoT is, it's been the old mantra of, go find the data, bring it all back to a centralized warehouse, and that makes it impossible to do it real time. We're enabling real time because we can write a query once, find data anywhere; this is technology we've had in preview for the last year. We've been working with a lot of clients to prove out use cases with it, and we're integrating it as a capability inside of IBM Cloud Private for Data. So if you buy IBM Cloud Private for Data, it's there. >> Interesting, so when you've been around as long as I have, long enough to see some of the pendulum swings, and it's clearly a pendulum swing back toward decentralization in the edge, but the key is, from what you just described, is you're sort of redefining the boundary, so I presume it's the edge, any Cloud, or on premises, where you can find that data, is that correct? >> Yeah, so it's multi-Cloud. I mean, look, every organization is going to be multi-Cloud, like 100%, that's going to happen, and that could be private, it could be multiple public Cloud providers, but the key point is, data on the edge is not just limited to what's in those Clouds. It could be anywhere that you're collecting data. And, we're enabling an architecture which performs incredibly well, because you take advantage of processing power on the edge, where you can get data anywhere that it sits. >> Okay, so, then, I'm setting up a Cloud, I'll call it a Cloud architecture, that encompasses the edge, where essentially, there are no boundaries, and you're bringing security. We talked about containers before, we've been talking about Kubernetes all week here at a Big Data show.
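The "write a query once, find data anywhere" idea Rob describes, sometimes called data virtualization or federated query, works by pushing the predicate and a partial aggregate down to each source and moving only the small partial results back. A toy sketch of the pattern (the class and method names are illustrative, not IBM's actual API):

```python
# Toy federated aggregator: each source filters and partially aggregates
# where its data lives; only (sum, count) pairs travel back to be merged.

class EdgeSource:
    def __init__(self, name, rows):
        self.name = name
        self.rows = rows          # data stays on this device

    def partial_sum(self, column, predicate):
        # Executed where the data lives; returns only a tiny summary.
        vals = [r[column] for r in self.rows if predicate(r)]
        return sum(vals), len(vals)

def federated_avg(sources, column, predicate):
    # The "one query": fan out, then merge the partial aggregates.
    partials = [s.partial_sum(column, predicate) for s in sources]
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count if count else None

# e.g. SELECT AVG(temp) FROM telemetry WHERE temp > 20, spanning a car,
# a phone, and a private-cloud table, without centralizing the rows.
car   = EdgeSource("car",   [{"temp": 21.0}, {"temp": 19.0}])
phone = EdgeSource("phone", [{"temp": 25.0}])
cloud = EdgeSource("cloud", [{"temp": 30.0}, {"temp": 18.0}])

avg = federated_avg([car, phone, cloud], "temp", lambda r: r["temp"] > 20)
print(avg)  # (21 + 25 + 30) / 3, roughly 25.33
```

Because only the partial results move, the heavy filtering happens on the edge devices themselves, which is why this approach can be real time where "ship everything to a central warehouse" cannot.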
And then of course, Cloud, and what's interesting, I think many of the Hadoop distro vendors kind of missed Cloud early on, and then now are sort of saying, oh wow, it's a hybrid world and we've got a part, you guys obviously made some moves, a couple billion dollar moves, to do some acquisitions and get hardcore into Cloud, so that becomes a critical component. You're not just limiting your scope to the IBM Cloud. You're recognizing that it's a multi-Cloud world, that's what customers want to do. Your comments. >> It's multi-Cloud, and it's not just the IBM Cloud, I think the most predominant Cloud that's emerging is every client's private Cloud. Every client I talk to is building out a containerized architecture. They need their own Cloud, and they need seamless connectivity to any public Cloud that they may be using. This is why you see such a premium being put on things like data ingestion, data curation. It's not popular, it's not exciting, people don't want to talk about it, but one of the biggest inhibitors, to this AI point, comes back to data curation, data ingestion, because if you're dealing with multiple Clouds, suddenly your data's in a bunch of different spots.
>> Well, it makes sense, because you've got physics, latency, you've got economics, moving all the data into a public Cloud is expensive and just doesn't make economic sense, and then you've got things like GDPR, which says, well, you have to keep the data, certain laws of the land, if you will, that say, you've got to keep the data in whatever it is, in Germany, or whatever country. So those sort of edicts dictate how you approach managing workloads and what you put where, right? Okay, what's going on with Watson? Give us the update there. >> I get a lot of questions, people trying to peel back the onion of what exactly is it? So, I want to make that super clear here. Watson is a few things, start at the bottom. You need a runtime for models that you've built. So we have a product called Watson Machine Learning, runs anywhere you want, that is the runtime for how you execute models that you've built. Anytime you have a runtime, you need somewhere where you can build models, you need a development environment. That is called Watson Studio. So, we had a product called Data Science Experience, we've evolved that into Watson Studio, connecting in some of those features. So we have Watson Studio, that's the development environment, Watson Machine Learning, that's the runtime. Now you move further up the stack. We have a set of APIs that bring in human features, vision, natural language processing, audio analytics, those types of things. You can integrate those as part of a model that you build. And then on top of that, we've got things like Watson Applications, we've got Watson for call centers, doing customer service and chatbots, and then we've got a lot of clients who've taken pieces of that stack and built their own AI solutions. They've taken some of the APIs, they've taken some of the design time, the studio, they've taken some of the Watson Machine Learning. 
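The split Rob describes above, a development environment (Watson Studio) that produces a model and a separate runtime (Watson Machine Learning) that only executes it, can be sketched generically. This is an illustration of the build-time/runtime pattern under assumed names, not the actual Watson API:

```python
import json

# "Studio" side: fit a trivial model and export it as a portable artifact.
def train(data):
    # Least-squares slope for a no-intercept linear model y = slope * x.
    slope = sum(x * y for x, y in data) / sum(x * x for x, _ in data)
    return {"type": "linear", "slope": slope}

artifact = json.dumps(train([(1, 2), (2, 4), (3, 6)]))

# "Runtime" side: knows nothing about training; it only loads the
# exported artifact and scores incoming requests against it.
def score(artifact_json, x):
    model = json.loads(artifact_json)
    return model["slope"] * x

print(score(artifact, 10))  # 20.0
```

Because the only handoff between the two sides is the serialized artifact, the same model can be scored by any runtime that understands the format, which is the point of keeping the development environment and the runtime separate.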
So, it is really a stack of capabilities, and where we're driving the greatest productivity, and this is in a lot of the examples you'll see tonight from clients, is clients that have bought into this idea of, I need a development environment, I need a runtime, where I can deploy models anywhere. We're getting a lot of momentum on that, and then that raises the question of, well, do I have explainability, do I have trust and transparency, and that's another thing that we're working on. >> Okay, so there's an API-oriented architecture, exposing all these services makes it very easy for people to consume. Okay, so we've been asking all week at CUBE NYC, big data and AI, is this old wine, new bottle? I mean, it's clear, Rob, from the conversation here, there's a lot of substantive innovation, and early adoption, anyway, of some of these innovations, but a lot of potential going forward. Last thoughts? >> What people have to realize is AI is not magic, it's still computer science. So it actually requires some hard work. You need to roll up your sleeves, you need to understand how to get from point A to point B, you need a development environment, you need a runtime. I want people to really think about this, it's not magic. I think for a while, people have gotten the impression that there's some magic button. There's not, but if you put in the time, and it's not a lot of time, you'll see the examples tonight, most of them have been done in one or two months, there's great business value in starting to leverage AI in your business. >> Awesome, alright, so if you're in this city or you're at Strata, go to ibm.com/WinWithAI, register for the event tonight. Rob, we'll see you there, thanks so much for coming back. >> Yeah, it's going to be fun, thanks Dave, great to see you. >> Alright, keep it right there everybody, we'll be back with our next guest right after this short break, you're watching theCUBE.

Published Date : Sep 18 2018

Influencer Panel | theCUBE NYC 2018


 

- [Announcer] Live, from New York, it's theCUBE. Covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media, and its ecosystem partners. - Hello everyone, welcome back to CUBE NYC. This is a CUBE special presentation of something that we've done now for the past couple of years. IBM has sponsored an influencer panel on some of the hottest topics in the industry, and of course, there's no hotter topic right now than AI. So, we've got nine of the top influencers in the AI space, and we're in Hell's Kitchen, and it's going to get hot in here. (laughing) And these guys, we're going to cover the gamut. So, first of all, folks, thanks so much for joining us today, really, as John said earlier, we love the collaboration with you all, and we'll definitely see you on social after the fact. I'm Dave Vellante, with my cohost for this session, Peter Burris, and again, thank you to IBM for sponsoring this and organizing this. IBM has a big event down here, in conjunction with Strata, called Change the Game, Winning with AI. We run theCUBE NYC, we've been here all week. So, here's the format. I'm going to kick it off, and then we'll see where it goes. So, I'm going to introduce each of the panelists, and then ask you guys to answer a question, I'm sorry, first, tell us a little bit about yourself, briefly, and then answer one of the following questions. Two big themes that have come up this week. One has been, because this is our ninth year covering what used to be Hadoop World, which kind of morphed into big data. Question is, AI, big data, same wine, new bottle? Or is it really substantive, and driving business value? So, that's one question to ponder. The other one is, you've heard the term, the phrase, data is the new oil. Is data really the new oil? Wonder what you think about that? Okay, so, Chris Penn, let's start with you. Chris is cofounder of Trust Insight, long time CUBE alum, and friend. Thanks for coming on. 
Tell us a little bit about yourself, and then pick one of those questions. - Sure, we're a data science consulting firm. We're an IBM business partner. When it comes to "data is the new oil," I love that expression because it's completely accurate. Crude oil is useless, you have to extract it out of the ground, refine it, and then bring it to distribution. Data is the same way, where you have to have developers and data architects get the data out. You need data scientists and tools, like Watson Studio, to refine it, and then you need to put it into production, and that's where marketing technologists, technologists, business analytics folks, and tools like Watson Machine Learning help bring the data and make it useful. - Okay, great, thank you. Tony Flath is a tech and media consultant, with a focus on cloud and cyber security, welcome. - Thank you. - Tell us a little bit about yourself and your thoughts on one of those questions. - Sure thing, well, thanks so much for having us on this show, really appreciate it. My background is in cloud, cyber security, and certainly in emerging tech with artificial intelligence. Certainly touched it from a cyber security play, how you can use machine learning, machine control, for better controlling security across the gamut. But I'll touch on your question about wine, is it a new bottle, new wine? Where does this come from, from artificial intelligence? And I really see it as a whole new wine that is coming along. When you look at emerging technology, and you look at all the deep learning that's happening, it's going just beyond being able to machine learn and know what's happening, it's making some meaning of that data. And things are being done with that data, from robotics, from automation, from all kinds of different things, where we're at a point in society where data, our technology is getting beyond us. Prior to this, it's always been command and control. You control data from a keyboard. Well, this is passing us. 
So, my passion and perspective on this is, the humanization of it, of IT. How do you ensure that people are in that process, right? - Excellent, and we're going to come back and talk about that. - Thanks so much. - Carla Gentry, @DataNerd? Great to see you live, as opposed to just in the ether on Twitter. Data scientist, and owner of Analytical Solution. Welcome, your thoughts? - Thank you for having us. Mine is, is data the new oil? And I'd like to rephrase that as, data equals human lives. So, with all the other artificial intelligence and everything that's going on, and all the algorithms and models that are being created, we have to think about things being biased, being fair, and understand that this data has impacts on people's lives. - Great. Steve Ardire, my paisan. - Paisan. - AI startup adviser, welcome, thanks for coming to theCUBE. - Thanks Dave. So, my first career was geology, and I view data as the new oil, but AI as the refinery. I've used that many times before. In fact, really, I've moved from just AI to augmented intelligence. So, augmented intelligence is really the way forward. This was a presentation I gave at IBM Think last spring, which has almost 100,000 impressions right now, and the fundamental reason why is machines can attend to vastly more information than humans, but you still need humans in the loop, and we can talk about what they're bringing in terms of common sense reasoning, because big data does the who, what, when, and where, but not the why, and why is really the Holy Grail for causal analysis and reasoning. - Excellent, Bob Hayes, Business Over Broadway, welcome, great to see you again. - Thanks for having me. So, my background is in psychology, industrial psychology, and I'm interested in things like customer experience, data science, machine learning, so forth. And I'll answer the question around big data versus AI. 
And I think there's other terms we could talk about, big data, data science, machine learning, AI. And to me, it's kind of all the same. It's always been about analytics, and getting value from your data, big, small, what have you. And there's subtle differences among those terms. Machine learning is just about making a prediction, and knowing if things are classified correctly. Data science is more about understanding why things work, and understanding maybe the ethics behind it, what variables are predicting that outcome. But still, it's all the same thing, it's all about using data in a way that we can get value from that, as a society, in residences. - Excellent, thank you. Theo Lau, founder of Unconventional Ventures. What's your story? - Yeah, so, my background is driving technology innovation. So, together with my partner, what our work does is we work with organizations to try to help them leverage technology to drive systematic financial wellness. We connect founders, startup founders, with funders, we help them get money in the ecosystem. We also work with them to look at, how do we leverage emerging technology to do something good for the society. So, very much on point to what Bob was saying. So when I look at AI, it is not new, right, it's been around for quite a while. But what's different is the amount of technological power that we have allows us to do so much more than what we were able to do before. And so, what my mantra is, great ideas can come from anywhere in the society, but it's our job to be able to leverage technology to shine a spotlight on people who can use this to do something different, to help seniors in our country to do better in their financial planning. - Okay, so, in your mind, it's not just a same wine, new bottle, it's more substantive than that. - [Theo] It's more substantive, it's a much better bottle. - Karen Lopez, senior project manager for Architect InfoAdvisors, welcome. - Thank you. 
So, I'm DataChick on twitter, and so that kind of tells my focus is that I'm here, I also call myself a data evangelist, and that means I'm there at organizations helping stand up for the data, because to me, that's the proxy for standing up for the people, and the places and the events that that data describes. That means I have a focus on security, data privacy and protection as well. And I'm going to kind of combine your two questions about whether data is the new wine bottle, I think is the combination. Oh, see, now I'm talking about alcohol. (laughing) But anyway, you know, all analogies are imperfect, so whether we say it's the new wine, or, you know, same wine, or whether it's oil, is that the analogy's good for both of them, but unlike oil, the amount of data's just growing like crazy, and the oil, we know at some point, I kind of doubt that we're going to hit peak data where we have not enough data, like we're going to do with oil. But that says to me that, how did we get here with big data, with machine learning and AI? And from my point of view, as someone who's been focused on data for 35 years, we have hit this perfect storm of open source technologies, cloud architectures and cloud services, data innovation, that if we didn't have those, we wouldn't be talking about large machine learning and deep learning-type things. So, because we have all these things coming together at the same time, we're now at explosions of data, which means we also have to protect them, and protect the people from doing harm with data, we need to do data for good things, and all of that. - Great, definite differences, we're not running out of data, data's like the terrible tribbles. (laughing) - Yes, but it's very cuddly, data is. - Yeah, cuddly data. Mark Lynd, founder of Relevant Track? - That's right. - I like the name. What's your story? - Well, thank you, and it actually plays into what my interest is. It's mainly around AI in enterprise operations and cyber security. 
You know, these teams that are in enterprise operations both, it can be sales, marketing, all the way through the organization, as well as cyber security, they're often under-sourced. And they need, what Steve pointed out, they need augmented intelligence, they need to take AI, the big data, all the information they have, and make use of that in a way where they're able to, even though they're under-sourced, make some use and some value for the organization, you know, make better use of the resources they have to grow and support the strategic goals of the organization. And oftentimes, when you get to budgeting, it doesn't really align, you know, you're short people, you're short time, but the data continues to grow, as Karen pointed out. So, when you take those together, using AI to augment, to provide augmented intelligence, to help them get through that data, make real tangible decisions based on information versus just raw data, especially around cyber security, which is a big hit right now, is really a great place to be, and there's a lot of stuff going on, and a lot of exciting stuff in that area. - Great, thank you. Kevin L. Jackson, author and founder of GovCloud. GovCloud, that's big. - Yeah, GovCloud Network. Thank you very much for having me on the show. I've been working on cloud computing, initially in the federal government, with the intelligence community, as they adopted cloud computing for a lot of the nation's major missions. And what has happened is now I'm working a lot with commercial organizations and with the security of that data. And I'm going to sort of, on your questions, piggyback on Karen. There was a time when you would get a couple of bottles of wine, and they would come in, and you would savor that wine, and sip it, and it would take a few days to get through it, and you would enjoy it. The problem now is that you don't get a couple of bottles of wine into your house, you get two or three tankers of data. 
So, it's not that it's a new wine, you're just getting a lot of it. And the infrastructures that you need, before you could have a couple of computers, and a couple of people, now you need cloud, you need automated infrastructures, you need huge capabilities, and artificial intelligence and AI, it's what we can use as the tool on top of these huge infrastructures to drink that, you know. - Fire hose of wine. - Fire hose of wine. (laughs) - Everybody's having a good time. - Everybody's having a great time. (laughs) - Yeah, things are booming right now. Excellent, well, thank you all for those intros. Peter, I want to ask you a question. So, I heard there's some similarities and some definite differences with regard to data being the new oil. You have a perspective on this, and I wonder if you could inject it into the conversation. - Sure, so, the perspective that we take in a lot of conversations, a lot of folks here in theCUBE, what we've learned, and I'll kind of answer both questions a little bit. First off, on the question of data as the new oil, we definitely think that data is the new asset that business is going to be built on, in fact, our perspective is that there really is a difference between business and digital business, and that difference is data as an asset. And if you want to understand digital transformation, you understand the degree to which business is reinstitutionalizing work, reorganizing its people, reestablishing its mission around what you can do with data as an asset. The difference between data and oil is that oil still follows the economics of scarcity. Data is one of those things, you can copy it, you can share it, you can easily corrupt it, you can mess it up, you can do all kinds of awful things with it if you're not careful. 
And it's that core fundamental proposition that as an asset, when we think about cyber security, we think, in many respects, that is the approach to how we can go about privatizing data so that we can predict who's actually going to be able to appropriate returns on it. So, it's a good analogy, but as you said, it's not entirely perfect, but it's not perfect in a really fundamental way. It's not following the laws of scarcity, and that has an enormous effect. - In other words, I could put oil in my car, or I could put oil in my house, but I can't put the same oil in both. - Can't put it in both places. And now, the issue of the wine, I think it's, we think that it is, in fact, it is a new wine, and very simple abstraction, or generalization we come up with is the issue of agency. That analytics has historically not taken on agency, it hasn't acted on behalf of the brand. AI is going to act on behalf of the brand. Now, you're going to need both of them, you can't separate them. - A lot of implications there in terms of bias. - Absolutely. - In terms of privacy. You have a thought, here, Chris? - Well, the scarcity is our compute power, and our ability for us to process it. I mean, it's the same as oil, there's a ton of oil under the ground, right, we can't get to it as efficiently, or without severe environmental consequences to use it. Yeah, when you use it, it's transformed, but our scarcity is compute power, and our ability to use it intelligently. - Or even when you find it. I have data, I can apply it to six different applications, I have oil, I can apply it to one, and that's going to matter in how we think about work. - But one thing I'd like to add, sort of, you're talking about data as an asset. The issue we're having right now is we're trying to learn how to manage that asset. Artificial intelligence is a way of managing that asset, and that's important if you're going to use and leverage big data. 
- Yeah, but see, everybody's talking about the quantity, the quantity, it's not always the quantity. You know, we can have just oodles and oodles of data, but if it's not clean data, if it's not alphanumeric data, which is what's needed for machine learning. So, having lots of data is great, but you have to think about the signal versus the noise. So, sometimes you get so much data, you're looking at over-fitting, sometimes you get so much data, you're looking at biases within the data. So, it's not the amount of data, it's the, now that we have all of this data, making sure that we look at relevant data, to make sure we look at clean data. - One more thought, and we have a lot to cover, I want to get inside your big brain. - I was just thinking about it from a cyber security perspective, one of my customers, they were looking at the data that just comes from the perimeter, your firewalls, routers, all of that, and then not even looking internally, just the perimeter alone, and the amount of data being pulled off of those. And then trying to correlate that data so it makes some type of business sense, or they can determine if there's incidents that may happen, and take a predictive action, or threats that might be there because they haven't taken a certain action prior, it's overwhelming to them. So, having AI now, to be able to go through the logs to look at, and there's so many different types of data that come to those logs, but being able to pull that information, as well as looking at end points, and all that, and people's houses, which are an extension of the network oftentimes, it's an amazing amount of data, and they're only looking at a small portion today because they know, there's not enough resources, there's not enough trained people to do all that work. So, AI is doing a wonderful way of doing that. 
And some of the tools now are starting to mature and be sophisticated enough where they provide that augmented intelligence that Steve talked about earlier. - So, it's complicated. There's infrastructure, there's security, there's a lot of software, there's skills, and on and on. At IBM Think this year, Ginni Rometty talked about, there were a couple of themes, one was augmented intelligence, that was something that was clear. She also talked a lot about privacy, and you own your data, etc. One of the things that struck me was her discussion about incumbent disruptors. So, if you look at the top five companies, roughly, Facebook with fake news has dropped down a little bit, but top five companies in terms of market cap in the US. They're data companies, all right. Apple just hit a trillion, Amazon, Google, etc. How do those incumbents close the gap? Is that concept of incumbent disruptors actually something that is being put into practice? I mean, you guys work with a lot of practitioners. How are they going to close that gap with the data haves, meaning data at the core of their business, versus the data have-nots, it's not that they don't have a lot of data, but it's in silos, it's hard to get to? - Yeah, I got one more thing, so, you know, these companies, and whoever's going to be big next is, you have a digital persona, whether you want it or not. So, if you live in a farm out in the middle of Oklahoma, you still have a digital persona, people are collecting data on you, they're putting profiles of you, and the big companies know about you, and people that first interact with you, they're going to know that you have this digital persona. 
Personal AI, when AI from these companies could be used simply and easily, from a personal deal, to fill in those gaps, and to have a digital persona that supports your family, your growth, both personal and professional growth, and those type of things, there's a lot of applications for AI on a personal, enterprise, even small business, that have not been done yet, but the data is being collected now. So, you talk about the oil, the oil is being built right now, lots, and lots, and lots of it. It's the applications to use that, and turn that into something personally, professionally, educationally, powerful, that's what's missing. But it's coming. - Thank you, so, I'll add to that, and in answer to your question you raised. So, one example we always used in banking is, if you look at the big banks, right, and then you look at from a consumer perspective, and there's a lot of talk about Amazon being a bank. But the thing is, Amazon doesn't need to be a bank, they provide banking services, from a consumer perspective they don't really care if you're a bank or you're not a bank, but what's different between Amazon and some of the banks is that Amazon, like you say, has a lot of data, and they know how to make use of the data to offer something as relevant that consumers want. Whereas banks, they have a lot of data, but they're all silos, right. So, it's not just a matter of whether or not you have the data, it's also, can you actually access it and make something useful out of it so that you can create something that consumers want? Because otherwise, you're just a pipe. - Totally agree, like, when you look at it from a perspective of, there's a lot of terms out there, digital transformation is thrown out so much, right, and go to cloud, and you migrate to cloud, and you're going to take everything over, but really, when you look at it, and you both touched on it, it's the economics. 
You have to look at the data from an economics perspective, and how do you make some kind of way to take this data meaningful to your customers, that's going to work effectively for them, that they're going to drive? So, when you look at the big, big cloud providers, I think the push in things that's going to happen in the next few years is there's just going to be a bigger migration to public cloud. So then, between those, they have to differentiate themselves. Obvious is artificial intelligence, in a way that makes it easy to aggregate data from across platforms, to aggregate data from multi-cloud, effectively. To use that data in a meaningful way that's going to drive, not only better decisions for your business, and better outcomes, but drives opportunities for customers, drives opportunities for employees and how they work. We're at a really interesting point in technology where we get to tell technology what to do. It's going beyond us, it's no longer what we're telling it to do, it's going to go beyond us. So, how we effectively manage that is going to be where we see that data flow, and those big five or big four, really take that to the next level. - Now, one of the things that Ginni Rometty said was, I forget the exact stat, but it was like, 80% of the data is not searchable. Kind of implying that it's sitting somewhere behind a firewall, presumably on somebody's premises. So, it was kind of interesting. You're talking about, certainly, a lot of momentum for public cloud, but at the same time, a lot of data is going to stay where it is. - Yeah, we're assuming that a lot of this data is just sitting there, available and ready, and we look at the desperate, or disparate kind of database situation, where you have 29 databases, and two of them have unique identifiers that tie together, and the rest of them don't. So, there's nothing that you can do with that data. 
So, artificial intelligence is just that, it's artificial intelligence, so, they know, that's machine learning, that's natural language, that's classification, there's a lot of different parts of that that are moving, but we also have to have IT, good data infrastructure, master data management, compliance, there's so many moving parts to this, that it's not just about the data anymore. - I want to ask Steve to chime in here, go ahead. - Yeah, so, we also have to change the mentality that it's not just enterprise data. There's data on the web, the biggest thing is Internet of Things, the amount of sensor data will make the current data look like chump change. So, data is moving faster, okay. And this is where the sophistication of machine learning needs to kick in, going from just mostly supervised-learning today, to unsupervised learning. And in order to really get into, as I said, big data, and credible AI does the who, what, where, when, and how, but not the why. And this is really the Holy Grail to crack, and it's actually under a new moniker, it's called explainable AI, because it moves beyond just correlation into root cause analysis. Once we have that, then you have the means to be able to tap into augmented intelligence, where humans are working with the machines. - Karen, please. - Yeah, so, one of the things, like what Carla was saying, and what a lot of us had said, I like to think of the advent of ML technologies and AI are going to help me as a data architect to love my data better, right? So, that includes protecting it, but also, when you say that 80% of the data is unsearchable, it's not just an access problem, it's that no one knows what it was, what the sovereignty was, what the metadata was, what the quality was, or why there's huge anomalies in it. So, my favorite story about this is, in the 1980s, about, I forget the exact number, but like, 8 million children disappeared out of the US in April, at April 15th. 
And that was when the IRS enacted a rule that, in order to have a dependent, a deduction for a dependent on your tax returns, they had to have a valid social security number, and people who had accidentally miscounted their children and over-claimed them over the years, (laughter) stopped doing that. Well, some days it does feel like you have eight children running around. (laughter) - Agreed. - When that rule came about, literally, and they're not all children, because they're dependents, but literally millions of children disappeared off the face of the earth in April, but if you were doing analytics, or AI and ML, and you don't know that this anomaly happened, I can imagine in a hundred years, someone is saying some catastrophic event happened in April, 1983. (laughter) And what caused that, was it healthcare? Was it a meteor? Was it the clown attacking them? - That's where I was going. - Right. So, those are really important things that I want to use AI and ML to help me, not only document and capture that stuff, but to provide that information to the people, the data scientists and the analysts that are using the data. - Great story, thank you. Bob, you got a thought? You got the mic, go, jump in here. - Well, yeah, I do have a thought, actually. I was thinking about what Karen was talking about. I think it's really important that, not only that we understand AI, and machine learning, and data science, but that the regular folks and companies understand that, at the basic level. Because those are the people who will ask the questions, or who know what questions to ask of the data. And if they don't have the tools, and the knowledge of how to get access to that data, or even how to pose a question, then that data is going to be less valuable, I think, to companies. And the more that everybody knows about data, even people in congress. Remember when Zuckerberg talked about? (laughter) - That was scary. - How do you make money? 
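Karen's vanishing-dependents story is a compact illustration of why anomaly detection needs documented context. A minimal sketch, with invented yearly counts (not real IRS figures), shows how a simple z-score test surfaces the level shift while saying nothing about its cause:

```python
import statistics

# Hypothetical yearly counts of claimed dependents (millions).
# These numbers are invented for illustration, not real IRS data;
# a rule change causes a sudden level shift in the final year.
counts = {
    1980: 77.2, 1981: 77.5, 1982: 77.9, 1983: 78.1,
    1984: 78.4, 1985: 78.6, 1986: 78.9, 1987: 71.0,  # rule-change year
}

def flag_anomalies(series, threshold=2.0):
    """Flag years whose value deviates from the mean by more than
    `threshold` standard deviations."""
    values = list(series.values())
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [year for year, v in series.items()
            if abs(v - mean) / stdev > threshold]

print(flag_anomalies(counts))  # only the shift year stands out
```

The detector tells you the last year is strange; only recorded lineage, the rule change, tells you why, which is exactly the metadata problem Karen describes.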
It's like, we all know this. But, we need to educate the masses on just basic data analytics. - We could have an hour-long panel on that. - Yeah, absolutely. - Peter, you and I were talking about, we had a couple of questions, sort of, how far can we take artificial intelligence? How far should we? You know, so that brings ethics, and bias, into the conversation, why don't you pick it up? - Yeah, so, one of the crucial things that we all are implying is that, at some point in time, AI is going to become a feature of the operations of our homes, our businesses. And as these technologies get more powerful, and they diffuse, and knowledge about how to use them diffuses more broadly, and you put more options into the hands of more people, the question slowly starts to turn from can we do it, to should we do it? And, one of the issues that I introduce is that I think the difference between big data and AI, specifically, is this notion of agency. The AI will act on behalf of, perhaps you, or it will act on behalf of your business. And that conversation is not being had, today. It's being had in arguments between Elon Musk and Mark Zuckerberg, which pretty quickly get pretty boring. (laughing) At the end of the day, the real question is, should this machine, whether in concert with others, or not, be acting on behalf of me, on behalf of my business, or, and when I say on behalf of me, I'm also talking about privacy. Because Facebook is acting on behalf of me, it's not just what's going on in my home. So, the question of, can it be done? A lot of things can be done, and an increasing number of things will be able to be done. We got to start having a conversation about should it be done? - So, humans exhibit tribal behavior, they exhibit bias. The machines are going to pick that up, go ahead, please. - Yeah, one thing that sort of tags onto agency of artificial intelligence. 
Every industry, every business is now about identifying information and data sources, and their appropriate sinks, and learning how to draw value out of connecting the sources with the sinks. Artificial intelligence enables you to identify those sources and sinks, and when it gets agency, it will be able to make decisions on your behalf about what data is good, what data means, and who it should be. - What actions are good. - Well, what actions are good. - And what data was used to make those actions. - Absolutely. - And was that the right data, and is there bias of data? And all the way down, all the turtles down. - So, all this, the data pedigree will be driven by the agency of artificial intelligence, and this is a big issue. - It's really fundamental to understand and educate people on, there are four fundamental types of bias in machine learning. There's intentional bias, "Hey, we're going to make the algorithm generate a certain outcome regardless of what the data says." There's the source of the data itself: if the models are trained on flawed historical data, the model will behave in a flawed way. There's target source, which is, for example, we know that if you pull data from a certain social network, that network itself has an inherent bias. No matter how representative you try to make the data, it's still going to have flaws in it. Or, if you pull healthcare data about, for example, African-Americans from the US healthcare system, because of societal biases, that data will always be flawed. And then there's tool bias, there's limitations to what the tools can do, and so we will intentionally exclude some kinds of data, or not use it because we don't know how to, our tools are not able to, and if we don't teach people what those biases are, they won't know to look for them, you know. 
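Chris's second category, flawed historical data, can be made concrete with a toy sketch (all groups, scores, and decisions below are invented for illustration): a "model" that learns per-group approval thresholds from biased past decisions simply automates the bias it was trained on.

```python
# Toy illustration of source bias: a model trained on flawed
# historical decisions reproduces the flaw. Invented data only.
historical = [
    # (group, score, approved) -- group "b" was historically denied
    # at scores where group "a" was approved.
    ("a", 70, True), ("a", 75, True), ("a", 60, False),
    ("b", 70, False), ("b", 75, False), ("b", 80, True),
]

def learn_thresholds(rows):
    """'Train' a per-group approval threshold: the lowest score that
    was ever approved in each group's history."""
    thresholds = {}
    for group, score, approved in rows:
        if approved:
            thresholds[group] = min(score, thresholds.get(group, score))
    return thresholds

model = learn_thresholds(historical)
print(model)  # {'a': 70, 'b': 80}

# Two identical applicants scoring 75 now get different outcomes:
# the historical bias has been automated.
print(model["a"] <= 75, model["b"] <= 75)  # True False
```

Nothing in the training step is malicious; the skew lives entirely in the source data, which is why Chris distinguishes it from intentional bias.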
- Yeah, it's like, one of the things that we were talking about before, I mean, artificial intelligence is not going to just create itself, it's lines of code, it's input, and it spits out output. So, if it learns from these learning sets, we don't want AI to become another buzzword. We don't want everybody to be an "AI guru" that has no idea what AI is. It takes months, and months, and months for these machines to learn. These learning sets are so very important, because that input is how this machine, think of it as your child, and that's basically the way artificial intelligence is learning, like your child. You're feeding it these learning sets, and then eventually it will make its own decisions. So, we know from some of us having children that you teach them the best that you can, but then later on, when they're doing their own thing, they're really, it's like a little myna bird, they've heard everything that you've said. (laughing) Not only the things that you said to them directly, but the things that you said indirectly. - Well, there are some very good AI researchers that might disagree with that metaphor, exactly. (laughing) But, having said that, what I think is very interesting about this conversation is that this notion of bias, one of the things that fascinates me about where AI goes, are we going to find a situation where tribalism more deeply infects business? Because we know that human beings do not seek out the best information, they seek out information that reinforces their beliefs. And that happens in business today. My line of business versus your line of business, engineering versus sales, that happens today, but it happens at a planning level, and when we start talking about AI, we have to put the appropriate dampers, understand the biases, so that we don't end up with deep tribalism inside of business. Because AI could have the deleterious effect that it actually starts ripping apart organizations. 
- Well, input is data, and then the output is, could be a lot of things. - Could be a lot of things. - And that's where I said data equals human lives. So, we look at the case in New York where the penal system was using this artificial intelligence to make choices on people that were released from prison, and they saw that that was a miserable failure, because people that were released actually re-offended, some committed murder and other things. So, I mean, it's, it's more than what anybody really thinks. It's not just, oh, well, we'll just train the machines, and a couple of weeks later they're good, we never have to touch them again. These things have to be continuously tweaked. So, just because you built an algorithm or a model doesn't mean you're done. You got to go back later, and continue to tweak these models. - Mark, you got the mic. - Yeah, no, I think one thing, we've talked a lot about the data that's collected, but what about the data that's not collected? Incomplete profiles, incomplete datasets, that's a form of bias, and sometimes that's the worst. Because they'll fill that in, right, and then you can get some bias, but there's also a real issue for that around cyber security. Logs are not always complete, things are not always done, and when that happens, people make assumptions based on what they've collected, not what they didn't collect. So, when they're looking at this, and they're using the AI on it, that's only on the data collected, not on what wasn't collected. So, if something is down for a little while, and no data's collected off that, the assumption is, well, it was down, or it was impacted, or there was a breach, or whatever, it could be any of those. 
So, there's still this human need, there's still the need for humans to look at the data and realize that there is bias in there, that we're just looking at what data was collected, and you're going to have to form your own thoughts and assumptions on how to actually use that data before you go make those decisions that can impact lots of people, at a human level, an enterprise's profitability, things like that. And too often, people think of AI as, when it comes out of there, that's the final word. Well, it's not the final word. - Can I ask a question about this? - Please. - Does that mean that we shouldn't act? - It does not. - Okay. - So, where's the fine line? - Yeah, I think. - Going back to this notion of can we do it, or should we do it? Should we act? - Yeah, I think you should do it, but you should use it for what it is. It's augmenting, it's helping you, assisting you to make a valid or good decision. And hopefully it's a better decision than you would've made without it. - I think that's great, and I think your answer's right too, that you have to iterate faster, and faster, and faster, and discover sources of information, or sources of data, that you're not currently using, and that's why this thing starts getting really important. - I think you touch on a really good point about, should you or shouldn't you? You look at Google, and you look at the data that they've been using, and some of that, from a digital twin perspective, is not approved, or not authorized, and even once they've made changes, it's still floating around out there. Do you know where it is? So, there's this dilemma: how do you have a digital twin that you want to have, one that's going to work for you and do things to make your life easier, handle mundane tasks, whatever, but how do you also control it so it doesn't do things you don't want it to do? - Ad-based business models are inherently evil.
(laughing) - Well, there are incentives to appropriate our data, and so, are things like blockchain potentially going to give users the ability to control their data? We'll see. - No, I'm sorry, but that's actually a really important point. The idea of consensus algorithms, whether it's blockchain or not, and blockchain includes game theory on top of that, whether it's Byzantine fault tolerance, or whether it's Paxos, consensus-based algorithms are going to be really, really important parts of this conversation, because the data's going to be more distributed, and you're going to have more elements participating in it. And so, especially in the machine-to-machine world, which is a lot of what we're talking about right here, you may not have blockchain, because there's no need for a sense of incentive, which is what blockchain can help provide. - And there's no middleman. - Well, all right, but the thing that makes blockchain so powerful is that it liberates new classes of applications. But for a lot of the stuff that we're talking about, you can use a very powerful consensus algorithm without having the game-theory side, and do some really amazing things at scale. - So, blockchain, that's a great thing to bring up. I think what's inherently wrong with the way we do things today, and the whole overall design of technology, whether it be on-prem or off-prem, is that both the lock and the key are behind the same wall, whether that wall is in a cloud or behind a firewall. So, really, when there is an audit, or when there is forensics, it always comes down to a sysadmin, and the system administrator will have the finger pointed at them, because it all resides there; you can edit it, you can augment it, or you can do things with it that you can't really determine afterward. Now, take, as an example, blockchain, where you've really got a source of truth.
Now you can have the lock in one place, and the key in another place. So that's certainly going to be interesting to see how that unfolds. - So, it's good that we've hit a lot of buzzwords right now, right? (laughing) AI, and ML, block. - Bingo. - We got the blockchain bingo, yeah, yeah. So, one of the things you also brought up is ethics, and one of the things that I've noticed over the last year or so is that, as I attend briefings or demos, everyone is now claiming that their product is AI or ML-enabled, or blockchain-enabled. And when you try to get answers to the questions, what you really find out is that some things are being pushed as AI because they have if-then statements somewhere in their code, and therefore that's artificial intelligence or machine learning. - [Peter] At least it's not "go-to." (laughing) - Yeah, you're that experienced as well. (laughing) So, this is part of what you try to do as a practitioner, as an analyst, as an influencer: trying to cut through, you know, the hype of it all. And recently, I attended one where they said they use blockchain, and I couldn't figure it out, and it turns out they use GUIDs to identify things, and that's not blockchain, it's an identifier. (laughing) So, one of the ethics things that I think we, as an enterprise community, have to deal with is the over-promising of AI, and ML, and deep learning, and recognition. I don't really consider it a visual recognition service if it just looks for red pixels. I mean, that's not quite the same thing. Yet, this is also making things much harder for your average CIO, or worse, CFO, to understand whether they're getting any value from these technologies. - Old bottle. - Old bottle, right.
- And I wonder if the data companies, like the ones you talked about, the top five, I'm more concerned about their nearly, or actual, $1 trillion valuations having an impact on the ability of other companies to disrupt or enter the field, more so than their data technologies. Again, we're coming to another perfect storm of companies that have data as their asset, even though it's still not on their financial statements, which is another indicator of whether it's really an asset. Do we need to think, in terms of AI, about whose hands it's in? Once one large trillion-dollar company decides that you are not a profitable company, how many other companies are going to buy that data and make that decision about you? - Well, and for the first time in business history, I think this is true, because of digital, because it's data, you're seeing tech companies traverse industries, get into content, or music, or publishing, or groceries, and that's powerful, and that's awfully scary. - If you're a manager, one of the things your ownership is asking you to do is to reduce asset specificities, so that their capital can be applied to more productive uses. Data reduces asset specificities. It brings into question the whole notion of vertical industry. You're absolutely right. But you know, one quick question I've got for you, playing off of this: again, it goes back to this notion of can we do it, and should we do it? I find it interesting, if you look at those top five, all data companies, but they have very different business models, or they can be classified into two different business models. Apple is transactional, Microsoft is transactional, Google is ad-based, Facebook is ad-based, before the fake news stuff. Amazon's kind of playing both sides. - Yeah, they're kind of all on a collision course though, aren't they? - But, well, that's what's going to be interesting.
I think, at some point in time, the "can we do it, should we do it" question is, brands are going to be identified by whether or not they have gone through that process of thinking about, should we do it, and saying no. Apple, for example, is clearly incorporating that into their brand. - Well, Silicon Valley, broadly defined, if I include Seattle, and maybe Armonk, not so much IBM. But they've got a dual disruption agenda: they've always disrupted horizontal tech, and now they're disrupting vertical industries. - I was actually just going to pick up on what she was talking about, we were talking about buzzwords, right. So, one we haven't heard yet is voice. Voice is another big buzzword right now; when you couple that with IoT and AI, here you go, bingo, do I get three points? (laughing) Voice recognition, voice technology, so all of the smart speakers. If you think about the world, there are 7,000 languages being spoken, but yet if you look at Google Home, you look at Siri, you look at any of the devices, I would challenge you, they would have a lot of problems understanding my accent, especially when my British accent creeps out, or they would have trouble understanding seniors, because the way they talk is very different from a typical 25-year-old person living in Silicon Valley, right. So, how do we solve that, especially going forward? Voice technology is going to be much more prominent; we're going to have it in our homes, we're going to have it in the cars, we have it in the kitchen, it does everything, it listens to everything that we are talking about, or not talking about, and records it. And to your point, is it going to start making decisions on our behalf? But then my question is, how much does it actually understand us? - So, just one short story. Siri can't translate a word that I ask it to translate into French, because my phone's set to Canadian English, and that's not supported. So I live in a bilingual French-English country, and it can't translate.
- But what this is really bringing up is if you look at society, and culture, what's legal, what's ethical, changes across the years. What was right 200 years ago is not right now, and what was right 50 years ago is not right now. - It changes across countries. - It changes across countries, it changes across regions. So, what does this mean when our AI has agency? How do we make ethical AI if we don't even know how to manage the change of what's right and what's wrong in human society? - One of the most important questions we have to worry about, right? - Absolutely. - But it also says one more thing, just before we go on. It also says that the issue of economies of scale, in the cloud. - Yes. - Are going to be strongly impacted, not just by how big you can build your data centers, but some of those regulatory issues that are going to influence strongly what constitutes good experience, good law, good acting on my behalf, agency. - And one thing that's underappreciated in the marketplace right now is the impact of data sovereignty, if you get back to data, countries are now recognizing the importance of managing that data, and they're implementing data sovereignty rules. Everyone talks about California issuing a new law that's aligned with GDPR, and you know what that meant. There are 30 other states in the United States alone that are modifying their laws to address this issue. - Steve. - So, um, so, we got a number of years, no matter what Ray Kurzweil says, until we get to artificial general intelligence. - The singularity's not so near? (laughing) - You know that he's changed the date over the last 10 years. - I did know it. - Quite a bit. And I don't even prognosticate where it's going to be. But really, where we're at right now, I keep coming back to, is that's why augmented intelligence is really going to be the new rage, humans working with machines. One of the hot topics, and the reason I chose to speak about it is, is the future of work. 
I don't care if you're a millennial, mid-career, or a baby boomer, people are paranoid. As machines get smarter, if your job is routine cognitive, yes, you have a higher propensity to be automated. So, this really shifts a number of things. A, you have to be a lifelong learner, you've got to learn new skillsets. And the dynamics are changing fast. Now, this is also a great equalizer for emerging startups, and even for SMBs. As the AI improves, they can become more nimble. So back to your point regarding the colossal trillion-dollar companies, wait a second, there's going to be quite a sea change going on right now. And regarding demographics, in 2020, millennials take over as the majority of the workforce; by 2025 it's 75%. - Great news. (laughing) - As a baby boomer, I try my damnedest to stay relevant. - Yeah, surround yourself with millennials is the takeaway there. - Or retire. (laughs) - Not yet. - One thing I think, and this goes back to what Karen was saying: if you want a basic standard to put around this stuff, look at the old ISO 38500 framework. Business strategy, technology strategy. You have risk, compliance, change management, operations, and most importantly, the balance sheet and the financials. AI and, as Tony was saying, digital transformation, if it's of meaning, it belongs on a balance sheet, and should factor into how you value your company. All the cyber security, and all of the compliance, and all of the regulation, this framework covers it, so look it up, and every time you start some kind of new machine learning project, or data science project, ask: have we checked the box on each of the standards within this framework? And if you haven't, maybe slow down and do your homework. - We'll see a day when data is going to be valued on the balance sheet. - It is. - It's already valued as part of the current valuation, but as goodwill. - Certainly market value, as we were just talking about.
- Well, we're talking about all of the companies that have opted in, right. There are tens of thousands of small businesses just in this region alone that are opted out. They're small family businesses, or businesses that really aren't even technology-aware. But data's being collected about them: they're on Yelp, they're being rated, they're being reviewed, and the success of their business is out of their hands. And I think what's really going to be interesting, you look at big data, you look at AI, you look at things like that, blockchain may even have potential for some of that, because of immutability, is when all of those businesses opt in. The technology is cost-prohibitive now for a lot of them, or they just don't want to do it, and they're proudly opted out. In fact, we talked about that last night at dinner. But when they opt in, the company that can do that, that can reach out to them in a way that is economically feasible, and bring them back in, where they control their data, where they control their information, and do it in such a way that it helps them build their business, and it may be a generational business that's been passed on, those kinds of things are going to make a big impact, not only on the cloud, but on the data being stored in the cloud, the AI, the applications that you talked about earlier. And that's where this bias, and some of these other things, are going to have a tremendous impact if they're not dealt with now, at least ethically. - Well, I feel like we just got started, and we're out of time. Time for a couple more comments, and then officially we have to wrap up. - Yeah, I had one thing to say. Really, Henry Ford and the creation of the automobile, back in the early 1900s, changed everything, because now we're no longer stuck in the country, we can get away from our parents, we can date without grandma and grandpa sitting on the porch with us.
(laughing) We can take long trips, so now we've sprawled out, we're not all living in the country anymore, and it changed America. So, AI has those same capabilities; it will automate mundane routine tasks that nobody wanted to do anyway. A lot of that will change things, but it's not going to be any different than the way things changed in the early 1900s. - It's like you were saying, constant reinvention. - I think that's a great point, and let me make one observation on that. Every period of significant industrial change was preceded by a period of formation of new assets that nobody knew what to do with. Whether it was industrial manufacturing, row houses with long shafts tied to a coal-fired engine that drove a bunch of looms; same thing with railroads, and large factories for Henry Ford, before he figured out how to do an information-based notion of mass production. This is the period of asset formation for the next generation of social structures. - Those chip-makers are going to be all over these cars, I mean, you're going to have augmented reality right there on your windshield. - Karen, bring it home. Give us the drop-the-mic moment. (laughing) - No pressure. - Your AV guys are not happy with that. So, I think it all comes down to a people problem, a challenge, let's say that. The whole AI and ML thing, for people, is a legal compliance thing. Enterprises are going to struggle with trying to meet five billion different types of compliance rules around data and its uses, and around enforcement, because ROI is going to mean risk of incarceration as well as return on investment, and we'll have to manage both of those. I think businesses are struggling with a lot of this complexity, and you just opened a whole bunch of questions that we didn't really have solid, "Oh, you can fix it by doing this" answers for.
So, it's important that we think of this new world of data focus, data-driven, everything like that, is that the entire IT and business community needs to realize that focusing on data means we have to change how we do things and how we think about it, but we also have some of the same old challenges there. - Well, I have a feeling we're going to be talking about this for quite some time. What a great way to wrap up CUBE NYC here, our third day of activities down here at 37 Pillars, or Mercantile 37. Thank you all so much for joining us today. - Thank you. - Really, wonderful insights, really appreciate it, now, all this content is going to be available on theCUBE.net. We are exposing our video cloud, and our video search engine, so you'll be able to search our entire corpus of data. I can't wait to start searching and clipping up this session. Again, thank you so much, and thank you for watching. We'll see you next time.

Published Date : Sep 13 2018


Rob Thomas, IBM | Change the Game: Winning With AI


 

>> Live from Times Square in New York City, it's The Cube, covering IBM's Change the Game: Winning with AI, brought to you by IBM. >> Hello everybody, welcome to The Cube's special presentation. We're covering IBM's announcements today around AI. IBM, as The Cube does, runs sessions and programs in conjunction with Strata, which is down at the Javits Center, and we're here with Rob Thomas, who's the General Manager of IBM Analytics. Long time Cube alum, Rob, great to see you. >> Dave, great to see you. >> So you guys have got a lot going on today. We're here at the Westin Hotel, you've got an analyst event, you've got a partner meeting, you've got an event tonight, Change the Game: Winning with AI, at Terminal 5; check that out, ibm.com/WinWithAI, go register there. But Rob, let's start with what you guys have going on, give us the run down. >> Yeah, it's a big week for us, and like many others, it's great when you have Strata and a lot of people in town. So, we've structured a week where, today, we're going to spend a lot of time with analysts and our business partners, talking about where we're going with data and AI. This evening, we've got a broadcast, it's called Winning with AI. What's unique about that broadcast is it's all clients. We've got clients on stage doing demonstrations of how they're using IBM technology to get to unique outcomes in their business. So I think it's going to be a pretty unique event, which should be a lot of fun. >> So it looks like a cool event and venue, Terminal 5, just up the street on the West Side Highway, probably a mile from the Javits Center, so definitely check that out. Alright, let's talk about, Rob, we've known each other for a long time, we've seen the early Hadoop days, you guys were very careful about diving in, you kind of let things settle and watched very carefully, and then came in at the right time. 
But we saw the evolution of so-called Big Data go from a phase of really reducing investments, cheaper data warehousing, and what that did is allowed people to collect a lot more data, and kind of get ready for this era that we're in now. But maybe you can give us your perspective on the phases, the waves that we've seen of data, and where we are today and where we're going. >> I kind of think of it as a maturity curve. So when I go talk to clients, I say, look, you need to be on a journey towards AI. I think probably nobody disagrees that they need something there, the question is, how do you get there? So you think about the steps, it's about, a lot of people started with, we're going to reduce the cost of our operations, we're going to use data to take out cost, that was kind of the Hadoop thrust, I would say. Then they moved to, well, now we need to see more about our data, we need higher performance data, BI data warehousing. So, everybody, I would say, has dabbled in those two area. The next leap forward is self-service analytics, so how do you actually empower everybody in your organization to use and access data? And the next step beyond that is, can I use AI to drive new business models, new levers of growth, for my business? So, I ask clients, pin yourself on this journey, most are, depends on the division or the part of the company, they're at different areas, but as I tell everybody, if you don't know where you are and you don't know where you want to go, you're just going to wind around, so I try to get them to pin down, where are you versus where do you want to go? >> So four phases, basically, the sort of cheap data store, the BI data warehouse modernization, self-service analytics, a big part of that is data science and data science collaboration, you guys have a lot of investments there, and then new business models with AI automation running on top. Where are we today? 
Would you say we're kind of in-between BI/DW modernization and on our way to self-service analytics, or what's your sense? >> I'd say most are right in the middle between BI data warehousing and self-service analytics. Self-service analytics is hard, because it requires you, sometimes to take a couple steps back, and look at your data. It's hard to provide self-service if you don't have a data catalog, if you don't have data security, if you haven't gone through the processes around data governance. So, sometimes you have to take one step back to go two steps forward, that's why I see a lot of people, I'd say, stuck in the middle right now. And the examples that you're going to see tonight as part of the broadcast are clients that have figured out how to break through that wall, and I think that's pretty illustrative of what's possible. >> Okay, so you're saying that, got to maybe take a step back and get the infrastructure right with, let's say a catalog, to give some basic things that they have to do, some x's and o's, you've got the Vince Lombardi played out here, and also, skillsets, I imagine, is a key part of that. So, that's what they've got to do to get prepared, and then, what's next? They start creating new business models, imagining this is where the cheap data officer comes in and it's an executive level, what are you seeing clients as part of digital transformation, what's the conversation like with customers? >> The biggest change, the great thing about the times we live in, is technology's become so accessible, you can do things very quickly. We created a team last year called Data Science Elite, and we've hired what we think are some of the best data scientists in the world. Their only job is to go work with clients and help them get to a first success with data science. So, we put a team in. 
Normally it's one month, two months, and normally a team of two or three people, our investment, and we say, let's go build a model, let's get to an outcome, and you can do this incredibly quickly now. When I see somebody that says, we're going to spend six months evaluating and thinking about this, I'm like, why would you spend six months thinking about it when you could actually do it in one month? So you just need to get over the edge and go try it. >> So we're going to learn more about the Data Science Elite team. We've got John Thomas coming on today, who is a distinguished engineer at IBM, and he's very much involved in that team, and I think we have a customer who's actually gone through that, so we're going to talk about what their experience was with the Data Science Elite team. Alright, you've got some hard news coming up, you've actually made some news earlier with Hortonworks and Red Hat, I want to talk about that, but you've also got some hard news today. Take us through that. >> Yeah, let's talk about all three. First, on Monday we announced the expanded relationship with both Hortonworks and Red Hat. This goes back to one of the core beliefs I talked about: every enterprise is modernizing their data and application estates, and I don't think there's any debate about that. We are big believers in Kubernetes and containers as the architecture to drive that modernization. The announcement on Monday was, we're working closer with Red Hat to take all of our data services as part of Cloud Private for Data, which are basically microservices for data, and we're running those on OpenShift, and we're starting to see great customer traction with that. And where does Hortonworks come in? Hadoop has been the outlier on moving to microservices and containers, and we're working with Hortonworks to help them make that move as well. So, it's really about the three of us getting together and helping clients with this modernization journey. 
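Rob's point about getting to a first model quickly, rather than spending months evaluating, can be illustrated with a minimal sketch. This is not the Data Science Elite team's actual tooling; it's a hypothetical first-pass classifier, built with nothing but the standard library, of the kind a small team might stand up in a sprint to prove value before investing in heavier machinery:

```python
# A hedged sketch: a first-pass classifier a small team might build in a
# sprint to prove value. The toy data and the nearest-centroid rule are
# illustrative stand-ins, not a real production model.

def train_centroids(rows, labels):
    """Compute one mean vector (centroid) per class label."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {lbl: [s / counts[lbl] for s in acc] for lbl, acc in sums.items()}

def predict(centroids, row):
    """Assign the class whose centroid is closest in Euclidean distance."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(centroids, key=lambda lbl: dist(centroids[lbl], row))

# Hypothetical "customer" features: [monthly_spend, support_tickets]
X = [[10.0, 5.0], [12.0, 6.0], [90.0, 0.0], [85.0, 1.0]]
y = ["churn", "churn", "stay", "stay"]

model = train_centroids(X, y)
print(predict(model, [11.0, 5.5]))  # near the churn centroid
print(predict(model, [88.0, 0.5]))  # near the stay centroid
```

The point of the sketch is the workflow, not the algorithm: a model this simple can be built, evaluated, and replaced within days, which is the "get over the edge and go try it" posture described above.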
>> So, just to remind people, you remember ODPI, folks? It was all this kerfuffle about, why do we even need this? Well, what's interesting to me about this triumvirate is, first of all, Red Hat and Hortonworks are hardcore open source, and IBM's always been a big supporter of open source. You three got together and you're now proving the productivity of this relationship for customers. You guys don't talk about this, but Hortonworks acknowledged, on its public call, that the relationship with IBM drove many, many seven-figure deals, which obviously means that customers are getting value out of this, so it's great to see that come to fruition, and it wasn't just a Barney announcement a couple years ago, so congratulations on that. Now, there's this other news that you guys announced this morning, talk about that. >> Yeah, two other things. One is, we announced a relationship with Stack Overflow. 50 million developers go to Stack Overflow a month, it's an amazing environment for developers that are looking to do new things, and we're sponsoring a community around AI. Back to your point before, you asked, is there a skills gap in enterprises? There absolutely is, I don't think that's a surprise. Data science, AI developers, not every company has the skills they need, so we're sponsoring a community to help drive the growth of skills in and around data science and AI. So things like Python, R, Scala, these are the languages of data science, and it's a great relationship between us and Stack Overflow to build a community and get things going on skills. >> Okay, and then there was one more. >> Last one's a product announcement. This is one of the most interesting product announcements we've had in quite a while. Imagine this: you write a SQL query, and the traditional approach is, I've got a server, I point it at that server, I get the data, and it's pretty limited. We're announcing technology where I write a query, and it can find data anywhere in the world. 
I think of it as wide-area SQL. So it can find data on an automotive device, a telematics device, an IoT device, it could be a mobile device; we think of it as SQL for the whole world. You write a query, you can find the data anywhere it is, and we take advantage of the processing power on the edge. The biggest problem with IoT has been the old mantra of, go find the data, bring it all back to a centralized warehouse, and that makes it impossible to do it in real time. We're enabling real time because we can write a query once and find data anywhere. This is technology we've had in preview for the last year. We've been working with a lot of clients to prove out use cases with it, and we're integrating it as a capability inside of IBM Cloud Private for Data. So if you buy IBM Cloud Private for Data, it's there. >> Interesting, so when you've been around as long as I have, long enough to see some of the pendulum swings, it's clearly a pendulum swing back toward decentralization and the edge, but the key, from what you just described, is that you're sort of redefining the boundary, so I presume it's the edge, any Cloud, or on premises, where you can find that data, is that correct? >> Yeah, so it's multi-Cloud. I mean, look, every organization is going to be multi-Cloud, like 100%, that's going to happen, and that could be private, it could be multiple public Cloud providers, but the key point is, data on the edge is not just limited to what's in those Clouds. It could be anywhere that you're collecting data. And we're enabling an architecture which performs incredibly well, because you take advantage of processing power on the edge, where you can get data anywhere that it sits. 
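The "write a query once, find data anywhere" idea can be sketched in miniature. This is not the implementation of the IBM product being described; the sqlite3 in-memory databases below are stand-ins for edge devices or remote stores, and the point is that each node evaluates the predicate locally, so only matching rows travel back to the caller:

```python
# A hedged sketch of federated, "wide-area" SQL: fan the same query out to
# several independent stores and merge only the results. The in-memory
# sqlite3 databases stand in for edge devices; names are illustrative.
import sqlite3

def make_edge_node(readings):
    """Create a stand-in 'edge' store holding local telemetry readings."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE telemetry (device TEXT, temp REAL)")
    conn.executemany("INSERT INTO telemetry VALUES (?, ?)", readings)
    return conn

def federated_query(nodes, sql, params=()):
    """Run the same query on every node. Each node filters locally, so
    only matching rows are transferred, rather than the whole dataset."""
    results = []
    for node in nodes:
        results.extend(node.execute(sql, params).fetchall())
    return results

nodes = [
    make_edge_node([("car-1", 71.0), ("car-2", 98.5)]),
    make_edge_node([("truck-9", 102.3), ("truck-4", 65.0)]),
]
hot = federated_query(
    nodes, "SELECT device, temp FROM telemetry WHERE temp > ?", (95,)
)
print(sorted(hot))  # [('car-2', 98.5), ('truck-9', 102.3)]
```

The design choice mirrors the transcript's argument against the centralized-warehouse mantra: pushing the predicate to where the data sits keeps the network cost proportional to the answer, not to the data.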
And then of course, Cloud, and what's interesting, I think many of the Hadoop distral vendors kind of missed Cloud early on, and then now are sort of saying, oh wow, it's a hybrid world and we've got a part, you guys obviously made some moves, a couple billion dollar moves, to do some acquisitions and get hardcore into Cloud, so that becomes a critical component. You're not just limiting your scope to the IBM Cloud. You're recognizing that it's a multi-Cloud world, that' what customers want to do. Your comments. >> It's multi-Cloud, and it's not just the IBM Cloud, I think the most predominant Cloud that's emerging is every client's private Cloud. Every client I talk to is building out a containerized architecture. They need their own Cloud, and they need seamless connectivity to any public Cloud that they may be using. This is why you see such a premium being put on things like data ingestion, data curation. It's not popular, it's not exciting, people don't want to talk about it, but we're the biggest inhibitors, to this AI point, comes back to data curation, data ingestion, because if you're dealing with multiple Clouds, suddenly your data's in a bunch of different spots. >> Well, so you're basically, and we talked about this a lot on The Cube, you're bringing the Cloud model to the data, wherever the data lives. Is that the right way to think about it? >> I think organizations have spoken, set aside what they say, look at their actions. Their actions say, we don't want to move all of our data to any particular Cloud, we'll move some of our data. We need to give them seamless connectivity so that they can leave their data where they want, we can bring Cloud-Native Architecture to their data, we could also help move their data to a Cloud-Native architecture if that's what they prefer. 
>> Well, it makes sense, because you've got physics, latency, you've got economics, moving all the data into a public Cloud is expensive and just doesn't make economic sense, and then you've got things like GDPR, which says, well, you have to keep the data, certain laws of the land, if you will, that say, you've got to keep the data in whatever it is, in Germany, or whatever country. So those sort of edicts dictate how you approach managing workloads and what you put where, right? Okay, what's going on with Watson? Give us the update there. >> I get a lot of questions, people trying to peel back the onion of what exactly is it? So, I want to make that super clear here. Watson is a few things, start at the bottom. You need a runtime for models that you've built. So we have a product called Watson Machine Learning, runs anywhere you want, that is the runtime for how you execute models that you've built. Anytime you have a runtime, you need somewhere where you can build models, you need a development environment. That is called Watson Studio. So, we had a product called Data Science Experience, we've evolved that into Watson Studio, connecting in some of those features. So we have Watson Studio, that's the development environment, Watson Machine Learning, that's the runtime. Now you move further up the stack. We have a set of APIs that bring in human features, vision, natural language processing, audio analytics, those types of things. You can integrate those as part of a model that you build. And then on top of that, we've got things like Watson Applications, we've got Watson for call centers, doing customer service and chatbots, and then we've got a lot of clients who've taken pieces of that stack and built their own AI solutions. They've taken some of the APIs, they've taken some of the design time, the studio, they've taken some of the Watson Machine Learning. 
So, it is really a stack of capabilities, and where we're driving the greatest productivity, and this is in a lot of the client examples you'll see tonight, is with clients that have bought into this idea of, I need a development environment, I need a runtime, where I can deploy models anywhere. We're getting a lot of momentum on that, and then that raises the question of, well, do I have explainability, do I have trust and transparency, and that's another thing that we're working on. >> Okay, so there's an API-oriented architecture, exposing all these services to make it very easy for people to consume. Okay, so we've been talking all week at Cube NYC about Big Data and AI: is this old wine in a new bottle? I mean, it's clear, Rob, from the conversation here, there's a lot of substantive innovation, and early adoption, anyway, of some of these innovations, but a lot of potential going forward. Last thoughts?

Published Date : Sep 13 2018


Rob Thomas, IBM | Machine Learning Everywhere 2018


 

>> Announcer: Live from New York, it's theCUBE, covering Machine Learning Everywhere: Build Your Ladder to AI, brought to you by IBM. >> Welcome back to New York City. theCUBE continues our coverage here at IBM's event, Machine Learning Everywhere: Build Your Ladder to AI. And with us now is Rob Thomas, who is the vice president of, or general manager, rather, of IBM Analytics. Sorry about that, Rob. Good to have you with us this morning. Good to see you, sir. >> Great to see you, John. Dave, great to see you as well. >> Great to see you. >> Well let's just talk about the event first. Great lineup of guests. We're looking forward to visiting with several of them here on theCUBE today. But let's talk about, first off, the general theme of what you're trying to communicate and where you sit in terms of that ladder to success in the AI world. >> So, maybe start by stepping back to, we saw you guys a few times last year. Once in Munich, I recall, another one in New York, and the theme of both of those events was, data science renaissance. We started to see data science picking up steam in organizations. We also talked about machine learning. The great news is that, in that timeframe, machine learning has really become a real thing in terms of actually being implemented into organizations, and changing how companies run. And that's what today is about, is basically showcasing a bunch of examples, not only from our clients, but also from within IBM, how we're using machine learning to run our own business. And the thing I always remind clients when I talk to them is, machine learning is not going to replace managers, but managers that use machine learning will replace managers that do not. And what you see today is a bunch of examples of how that's true because it gives you superpowers. If you've automated a lot of the insight, data collection, decision making, it makes you a more powerful manager, and that's going to change a lot of enterprises.
>> It seems like a no-brainer, right? I mean, or a must-have. >> I think there's a, there's always that, sometimes there's a fear factor. There is a culture piece that holds people back. We're trying to make it really simple in terms of how we talk about the day, and the examples that we show, to get people comfortable, to kind of take a step onto that ladder back to the company. >> It's conceptually a no-brainer, but it's a challenge. You wrote a blog and it was really interesting. It was, one of the clients said to you, "I'm so glad I'm not in the technology industry." And you went, "Uh, hello?" (laughs) "I've got news for you, you are in the technology industry." So a lot of customers that I talk to feel like, meh, you know, in our industry, it's really not getting disrupted. That's kind of taxis and retail. We're in banking and, you know, but, digital is disrupting every industry and every industry is going to have to adopt ML, AI, whatever you want to call it. Can traditional companies close that gap? What's your take? >> I think they can, but, I'll go back to the word I used before, it starts with culture. Am I accepting that I'm a technology company, even if traditionally I've made tractors, as an example? Or if traditionally I've just been you know, selling shirts and shoes, have I embraced the role, my role as a technology company? Because if you set that culture from the top, everything else flows from there. It can't be, IT is something that we do on the side. It has to be a culture of, it's fundamental to what we do as a company. There was an MIT study that said, data-driven cultures drive productivity gains of six to 10 percent better than their competition. You can't, that stuff compounds, too. So if your competitors are doing that and you're not, not only do you fall behind in the short term but you fall woefully behind in the medium term. And so, I think companies are starting to get there but it takes a constant push to get them focused on that. 
>> So if you're a tractor company, you've got human expertise around making tractors and messaging and marketing tractors, and then, and data is kind of there, sort of a bolt-on, because everybody's got to be data-driven, but if you look at the top companies by market cap, you know, we were talking about it earlier. Data is foundational. It's at their core, so, that seems to me to be the hard part, Rob, I'd like you to comment in terms of that cultural shift. How do you go from sort of data in silos and, you know, not having cloud economics and, that are fundamental, to having that dynamic, and how does IBM help? >> You know, I think, to give companies credit, I think most organizations have developed some type of data practice or discipline over the last, call it five years. But most of that's historical, meaning, yeah, we'll take snapshots of history. We'll use that to guide decision making. You fast-forward to what we're talking about today, just so we're on the same page, machine learning is about, you build a model, you train a model with data, and then as new data flows in, your model is constantly updating. So your ability to make decisions improves over time. That's very different from, we're doing historical reporting on data. And so I think it's encouraging that companies have kind of embraced that data discipline in the last five years, but what we're talking about today is a big next step and what we're trying to break it down to what I call the building blocks, so, back to the point on an AI ladder, what I mean by an AI ladder is, you can't do AI without machine learning. You can't do machine learning without analytics. You can't do analytics without the right data architecture. So those become the building blocks of how you get towards a future of AI. And so what I encourage companies is, if you're not ready for that AI leading edge use case, that's okay, but you can be preparing for that future now. That's what the building blocks are about. 
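The contrast drawn here, a historical report computed once versus a model that keeps updating as new data flows in, can be sketched in a few lines of plain Python. This is a toy illustration with made-up numbers, not IBM's tooling: a one-parameter linear model refined by stochastic gradient descent, one observation at a time.

```python
# A one-parameter model y ≈ w * x, updated per observation with SGD.
w = 0.0    # current model parameter
lr = 0.05  # learning rate

def observe(x, y):
    """Fold one new data point into the model."""
    global w
    error = w * x - y
    w -= lr * error * x  # gradient step on squared error

# Data streaming in; the (made-up) true relationship is y = 2x.
stream = [(1, 2), (2, 4), (3, 6)] * 40
for x, y in stream:
    observe(x, y)

print(round(w, 2))  # approaches 2.0
```

The historical-reporting analogue would fit once over a frozen snapshot; here every new reading immediately sharpens the next prediction, which is the "constantly updating" property described above.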
You know, I know we've got Jeremiah Owyang on a little bit later, but I was reading something that he had written about gut and instinct, from the C-Suite, and how, that's how companies were run, right? You had your CEO, your president, they made decisions based on their guts or their instincts. And now, you've got this whole new objective tool out there that's gold, and it's kind of taking some of the gut and instinct out of it, in a way, and maybe there are people who still can't quite grasp that, that maybe their guts and their instincts, you know, what their gut tells them, you know, is one thing, but there's pretty objective data that might indicate something else. >> Moneyball for business. >> A little bit of a clash, I mean, is there a little bit of a clash in that respect? >> I think you'd be surprised by how much decision making is still pure opinion. I mean, I see that everywhere. But we're heading more towards what you described for sure. One of the clients talking here today, AMC Networks, I think it's a great example of a company that you wouldn't think of as a technology company, primarily a content producer, they make great shows, but they've kind of gone that extra step to say, we can integrate data sources from third parties, our own data about viewer habits, we can do that to change our relationship with advertisers. Like, that's a company that's really embraced this idea of being a technology company, and you can see it in their results, and so, results are not a coincidence in this world anymore. It's about a practice applied to data, leveraging machine learning, on a path towards AI. If companies are doing that, they're going to be successful. >> And we're going to have the tally from AMC on, but so there's a situation where they have embraced it, that they've dealt with that culture, and data has become foundational. Now, I'm interested as to what their journey looked like. What are you seeing with clients?
How they break this down, the silos of data that have been built up over decades. >> I think, so they get almost like a maturity curve. You've got, and the rule I talk about is 40-40-20, where 40% of organizations are really using data just to optimize costs right now. That's okay, but that's on the lower end of the maturity curve. 40% are saying, all right, I'm starting to get into data science. I'm starting to think about how I extend to new products, new services, using data. And then 20% are on the leading edge. And that's where I'd put AMC Networks, by the way, because they've done unique things with integrating data sets and building models so that they've automated a lot of what used to be painstakingly long processes, internal processes to do it. So you've got this 40-40-20 of organizations in terms of their maturity on this. If you're not on that curve right now, you have a problem. But I'd say most are somewhere on that curve. If you're in the first 40% and you're, right now data for you is just about optimizing cost, you're going to be behind. If you're not right now, you're going to be behind in the next year, that's a problem. So I'd kind of encourage people to think about what it takes to be in the next 40%. Ultimately you want to be in the 20% that's actually leading this transformation. >> So change it to 40-20-40. That's where you want it to go, right? You want to flip that paradigm. >> I want to ask you a question. You've done a lot of M and A in the past. You spent a lot of time in Silicon Valley and Silicon Valley obviously very, very disruptive, you know, cultures and organizations and it's always been a sort of technology disruption. It seems like there's a ... another disruption going on, not just horizontal technologies, you know, cloud or mobile or social, whatever it is, but within industries. Some industries, as we've been talking, radically disrupted. Retail, taxis, certainly advertising, et cetera et cetera. 
Some have not yet, the client that you talked to. Do you see, technology companies generally, Silicon Valley companies specifically, as being able to pull off a sort of disruption of not only technologies but also industries and where does IBM play there? You've made a sort of, Ginni in particular has made a deal about, hey, we're not going to compete with our customers. So talking about this sort of dual disruption agenda, one on the technology side, one within industries that Apple's getting into financial services and, you know, Amazon getting into grocery, what's your take on that and where does IBM fit in that world? >> So, I mean, IBM has been in Silicon Valley for a long time, I would say probably longer than 99.9% of the companies in Silicon Valley, so, we've got a big lab there. We do a lot of innovation out of there. So love it, I mean, the culture of the valley is great for the world because it's all about being the challenger, it's about innovation, and that's tremendous. >> No fear. >> Yeah, absolutely. So, look, we work with a lot of different partners, some who are, you know, purely based in the valley. I think they challenge us. We can learn from them, and that's great. I think the one, the one misnomer that I see right now, is there's a undertone that innovation is happening in Silicon Valley and only in Silicon Valley. And I think that's a myth. Give you an example, we just, in December, we released something called Event Store which is basically our stab at reinventing the database business that's been pretty much the same for the last 30 to 40 years. And we're now ingesting millions of rows of data a second. We're doing it in a Parquet format using a Spark engine. Like, this is an amazing innovation that will change how any type of IOT use case can manage data. Now ... people don't think of IBM when they think about innovations like that because it's not the only thing we talk about. 
We don't have, the IBM website isn't dedicated to that single product because IBM is a much bigger company than that. But we're innovating like crazy. A lot of that is out of what we're doing in Silicon Valley and our labs around the world and so, I'm very optimistic on what we're doing in terms of innovation. >> Yeah, in fact, I think, rephrase my question. I was, you know, you're right. I mean people think of IBM as getting disrupted. I wasn't posing it, I think of you as a disruptor. I know that may sound weird to some people but in the sense that you guys made some huge bets with things like Watson on solving some of the biggest, world's problems. And so I see you as disrupting sort of, maybe yourselves. Okay, frame that. But I don't see IBM as saying, okay, we are going to now disrupt healthcare, disrupt financial services, rather we are going to help our, like some of your comp... I don't know if you'd call them competitors. Amazon, as they say, getting into content and buying grocery, you know, food stores. You guys seems to have a different philosophy. That's what I'm trying to get to is, we're going to disrupt ourselves, okay, fine. But we're not going to go hard into healthcare, hard into financial services, other than selling technology and services to those organizations, does that make sense? >> Yeah, I mean, look, our mission is to make our clients ... better at what they do. That's our mission, we want to be essential in terms of their journey to be successful in their industry. So frankly, I love it every time I see an announcement about Amazon entering another vertical space, because all of those companies just became my clients. Because they're not going to work with Amazon when they're competing with them head to head, day in, day out, so I love that. 
So us working with these companies to make them better through things like Watson Health, what we're doing in healthcare, it's about making companies who have built their business in healthcare, more effective at how they perform, how they drive results, revenue, ROI for their investors. That's what we do, that's what IBM has always done. >> Yeah, so it's an interesting discussion. I mean, I tend to agree. I think Silicon Valley maybe should focus on those technology disruptions. I think that they'll have a hard time pulling off that dual disruption and maybe if you broadly define Silicon Valley as Seattle and so forth, but, but it seems like that formula has worked for decades, and will continue to work. Other thoughts on sort of the progression of ML, how it gets into organizations. You know, where you see this going, again, I was saying earlier, the parlance is changing. Big data is kind of, you know, mm. Okay, Hadoop, well, that's fine. We seem to be entering this new world that's pervasive, it's embedded, it's intelligent, it's autonomous, it's self-healing, it's all these things that, you know, we aspire to. We're now back in the early innings. We're late innings of big data, that's kind of ... But early innings of this new era, what are your thoughts on that? >> You know, I'd say the biggest restriction right now I see, we talked before about somehow, sometimes companies don't have the desire, so we have to help create the desire, create the culture to go do this. Even for the companies that have a burning desire, the issue quickly becomes a skill gap. And so we're doing a lot to try to help bridge that skill gap. Let's take data science as an example. There's two worlds of data science that I would describe. There's clickers, and there's coders. Clickers want to do drag and drop. They will use traditional tools like SPSS, which we're modernizing, that's great. We want to support them if that's how they want to work and build models and deploy models. 
There's also this world of coders. This is people that want to do all their data science in ML, and Python, and Scala, and R, like, that's what they want to do. And so we're supporting them through things like Data Science Experience, which is built on Jupyter. It's all open source tooling, it's designed for coders. The reason I think that's important, it goes back to the point on skill sets. There is a skill gap in most companies. So if you walk in and you say, this is the only way to do this thing, you kind of exclude half the companies because they say, I can't play in that world. So we are intentionally going after a strategy that says, there's a segmentation in skill types. In places there's a gap, we can help you fill that gap. That's how we're thinking about them.
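As a small illustration of the "coder" path, the kind of thing typed into a notebook rather than clicked together: a tiny nearest-centroid classifier in plain Python on made-up data. It stands in for whatever real library a team would actually use (scikit-learn, Spark ML, and so on); the labels, points, and workflow here are all hypothetical.

```python
from statistics import mean

# Made-up training data: two classes of 2-D points.
train = {
    "cat": [(1.0, 1.2), (1.1, 0.9), (0.9, 1.0)],
    "dog": [(3.0, 3.1), (2.9, 3.3), (3.2, 2.8)],
}

# "Training" is just computing one centroid per class.
centroids = {
    label: (mean(x for x, _ in pts), mean(y for _, y in pts))
    for label, pts in train.items()
}

def predict(point):
    """Label a new point by its nearest class centroid."""
    def dist2(c):
        return (point[0] - c[0]) ** 2 + (point[1] - c[1]) ** 2
    return min(centroids, key=lambda label: dist2(centroids[label]))

print(predict((1.0, 1.0)))  # cat
print(predict((3.1, 2.9)))  # dog
```

The clicker path would reach the same model through a drag-and-drop canvas; the segmentation Rob describes is about supporting both routes to the same result.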
>> So your advice, then, as you're talking to your clients, I mean you're also talking to their workforce. In a sense, then, your advice to them is, you know, join, jump in the wave, right? You've got your, you can't straddle, you've got to go. >> And you've got to experiment, you've got to try things. Ultimately, organizations are going to gravitate to things that they like using in terms of an approach or a methodology or a tool. But that comes with experimentation, so people need to get out there and try something. >> Maybe we could talk about developers a little bit. We were talking to Dinesh earlier and you guys of course have focused on data scientists, data engineers, obviously developers. And Dinesh was saying, look, many, if not most, of the 10 million Java developers out there, they're not, like, focused around the data. That's really the data scientist's job. But then, my colleague John Furrier says, hey, data is the new development kit. You know, somebody said recently, you know, Andreessen's comment, "software is eating the world." Well, data is eating software. So if Furrier is right and that comment is right, it seems like developers increasingly have to become more data aware, fundamentally. Blockchain developers clearly are more data focused. What's your take on the developer community, where they fit into this whole AI, machine learning space? >> I was just in Las Vegas yesterday and I did a session with a bunch of our business partners. ISVs, so software companies, mostly a developer audience, and the discussion I had with them was around, you're doing, you're building great products, you're building great applications. But your product is only as good as the data and the intelligence that you embed in your product. Because you're still putting too much of a burden on the user, as opposed to having everything happen magically, if you will. 
So that discussion was around, how do you embed data, embed AI, into your products and do that at the forefront versus, you deliver a product and the client has to say, all right, now I need to get my data out of this application and move it somewhere else so I can do the data science that I want to do. That's what I see happening with developers. It's kind of ... getting them to think about data as opposed to just thinking about the application development framework, because that's where most of them tend to focus. >> Mm, right. >> Well, we've talked about, well, earlier on about the governance, so just curious, with Madhu, which I'll, we'll have that interview in just a little bit here. I'm kind of curious about your take on that, is that it's a little kinder, gentler, friendlier than maybe some might look at it nowadays because of some organization that it causes, within your group and some value that's being derived from that, that more efficiency, more contextual information that's, you know, more relevant, whatever. When you talk to your clients about meeting rules, regs, GDPR, all these things, how do you get them to see that it's not a black veil of doom and gloom but it really is, really more of an opportunity for them to cash in? >> You know, my favorite question to ask when I go visit clients is I say, I say, just show of hands, how many people have all the data they need to do their job? To date, nobody has ever raised their hand. >> Not too many hands up. >> The reason I phrased it that way is, that's fundamentally a governance challenge. And so, when you think about governance, I think everybody immediately thinks about compliance, GDPR, types of things you mentioned, and that's great. But there's two use cases for governance. One is compliance, the other one is self service analytics. 
Because if you've done data governance, then you can make your data available to everybody in the organization because you know you've got the right rules, the right permissions set up. That will change how people do their jobs and I think sometimes governance gets painted into a compliance corner, when organizations need to think about it as, this is about making data accessible to my entire workforce. That's a big change. I don't think anybody has that today. Except for the clients that we're working with, where I think we've made good strides in that. >> What's your sort of number one, two, and three, or pick one, advice for those companies that as you blogged about, don't realize yet that they're in the software business and the technology business? For them to close the ... machine intelligence, machine learning, AI gap, where should they start? >> I do think it can be basic steps. And the reason I say that is, if you go to a company that hasn't really viewed themselves as a technology company, and you start talking about machine intelligence, AI, like, everybody like, runs away scared, like it's not interesting. So I bring it back to building blocks. For a client to be great in data, and to become a technology company, you really need three platforms for how you think about data. You need a platform for how you manage your data, so think of it as data management. You need a platform for unified governance and integration, and you need a platform for data science and business analytics. And to some extent, I don't care where you start, but you've got to start with one of those. And if you do that, you know, you'll start to create a flywheel of momentum where you'll get some small successes. Then you can go in the other area, and so I just encourage everybody, start down that path. Pick one of the three. Or you may already have something going in one of them, so then pick one where you don't have something going. 
Just start down the path, because, those building blocks, once you have those in place, you'll be able to scale AI and ML in the future in your organization. But without that, you're going to always be limited to kind of a use case at a time. >> Yeah, and I would add, this is, you talked about it a couple times today, is that cultural aspect, that realization that in order to be data driven, you know, buzzword, you have to embrace that and drive that through the culture. Right? >> That starts at the top, right? Which is, it's not, you know, it's not normal to have a culture of, we're going to experiment, we're going to try things, half of them may not work. And so, it starts at the top in terms of how you set the tone and set that culture. >> IBM Think, we're less than a month away. CUBE is going to be there, very excited about that. First time that you guys have done Think. You've consolidated all your big, big events. What can we expect from you guys? >> I think it's going to be an amazing show. To your point, we thought about this for a while, consolidating to a single IBM event. There's no question just based on the response and the enrollment we have so far, that was the right answer. We'll have people from all over the world. A bunch of clients, we've got some great announcements that will come out that week. And for clients that are thinking about coming, honestly the best thing about it is all the education and training. We basically build a curriculum, and think of it as a curriculum around, how do we make our clients more effective at competing with the Amazons of the world, back to the other point. And so I think we build a great curriculum and it will be a great week. >> Well, if I've heard anything today, it's about, don't be afraid to dive in at the deep end, just dive, right? Get after it and, looking forward to the rest of the day. Rob, thank you for joining us here and we'll see you in about a month! >> Sounds great. >> Right around the corner. 
>> All right, Rob Thomas joining us here from IBM Analytics, the GM at IBM Analytics. Back with more here on theCUBE. (upbeat music)

Published Date : Feb 27 2018


Linton Ward, IBM & Asad Mahmood, IBM - DataWorks Summit 2017


 

>> Narrator: Live from San Jose, in the heart of Silicon Valley, it's theCUBE! Covering Data Works Summit 2017. Brought to you by Hortonworks. >> Welcome back to theCUBE. I'm Lisa Martin with my co-host George Gilbert. We are live on day one of the Data Works Summit in San Jose in the heart of Silicon Valley. Great buzz in the event, I'm sure you can see and hear behind us. We're very excited to be joined by a couple of fellows from IBM. A very longstanding Hortonworks partner that announced a phenomenal suite of four new levels of that partnership today. Please welcome Asad Mahmood, Analytics Cloud Solutions Specialist at IBM, and medical doctor, and Linton Ward, Distinguished Engineer, Power Systems OpenPOWER Solutions from IBM. Welcome guys, great to have you both on the queue for the first time. So, Linton, software has been changing, companies, enterprises all around are really looking for more open solutions, really moving away from proprietary. Talk to us about the OpenPOWER Foundation before we get into the announcements today, what was the genesis of that? >> Okay sure, we recognized the need for innovation beyond a single chip, to build out an ecosystem, an innovation collaboration with our system partners. So, ranging from Google to Mellanox for networking, to Hortonworks for software, we believe that system-level optimization and innovation is what's going to bring the price performance advantage in the future. That traditional seamless scaling doesn't really bring us there by itself but that partnership does. >> So, from today's announcements, a number of announcements that Hortonworks is adopting IBM's data science platforms, so really the theme this morning of the keynote was data science, right, it's the next leg in really transforming an enterprise to be very much data driven and digitalized. We also saw the announcement about Atlas for data governance, what does that mean from your perspective on the engineering side? 
Very exciting, you know. In terms of building out solutions of hardware and software, the ability to really harden the Hortonworks data platform with servers, storage, and networking, I think, is going to bring simplification to on-premises, like people are seeing with the Cloud. And the ability to create the analyst workbench, or the cognitive workbench, using the data science experience to create a pipeline of data flow and analytic flow, I think is going to be very strong for innovation. Around that, most notable for me is the fact that they're all built on open technologies, leveraging communities that universities can pick up and contribute to. I think we're going to see the pace of innovation really pick up. >> And on that front, on pace of innovation, you talked about universities. One of the things I thought was really a great highlight in the customer panel this morning that Raj Verma hosted was you had health care, insurance companies, financial services, there was Duke Energy there, and they all talked about one of the great benefits of open source: that kids in universities have access to the software for free. So from a talent attraction perspective, they're really kind of fostering that next generation who will be able to take this to the next level, which I think is a really important point as we look at data science being kind of the next big driver or transformer. And also, you know, there's not a lot of really skilled data scientists, so how can that change over time? And this is one: the open source community that Hortonworks has been very dedicated to since the beginning, and it's really a great outcome of that.
>> Definitely, I think the ability to take the risk out of a new analytical project is one benefit, and the other benefit is there's a tremendous, not just from young people, a tremendous amount of interest among programmers, developers of all types, to create data science skills, data engineering and data science skills. >> If we leave aside the skills for a moment and focus on the, sort of, the operationalization of the models once they're built, how should we think about a trained model, or, I should break it into two pieces. How should we think about training the models, where the data comes from and who does it? And then, the orchestration and deployment of them, Cloud, Edge Gateway, Edge device, that sort of thing. >> I think it all comes down to exactly what your use case is. You have to identify what use case you're trying to tackle, whether that's applicable to clinical medicine, whether that's applicable to finance, to banking, to retail or transportation, first you have to have that use case in mind, then you can go about training that model, developing that model, and for that you need to have a good, potent, robust data set to allow you to carry out that analysis and whether you want to do exploratory analysis or you want to do predictive analysis, that needs to be very well defined in your training stage. Once you have that model developed, then we have certain services, such as Watson Machine Learning, within data science experience that will allow you to take that model that you just developed, just moments ago, and just deploy that as a restful API that you can then embed into an application and to your solution, and in that solution you can basically use across industry. 
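The train-then-deploy flow described above — build a model, publish it through a service like Watson Machine Learning as a RESTful scoring endpoint, then embed that endpoint in an application — can be sketched in a few lines of Python. This is an illustrative sketch only: the fields/values payload shape is a common convention for batch scoring APIs, and the endpoint URL and token in the comment are hypothetical placeholders, not real Watson Machine Learning values.

```python
import json

def build_scoring_payload(field_names, rows):
    """Shape a batch of input records into a JSON body for a deployed
    model's scoring endpoint (a fields + values layout is a common
    convention for batch scoring APIs)."""
    return {"fields": list(field_names), "values": [list(r) for r in rows]}

payload = build_scoring_payload(
    ["amount", "merchant_id", "hour_of_day"],        # model inputs
    [(129.99, "M-4402", 23), (12.50, "M-0071", 9)],  # two transactions
)

# Once the model is deployed, embedding it in an application is just an
# HTTP POST (SCORING_URL and TOKEN are hypothetical placeholders):
#
#   import requests
#   resp = requests.post(SCORING_URL, json=payload,
#                        headers={"Authorization": f"Bearer {TOKEN}"})
#   predictions = resp.json()["values"]

print(json.dumps(payload, indent=2))
```

The point of the pattern is that the application never touches the model itself, only a stable HTTP contract, so the model can be retrained and redeployed without changing the app.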
Are there some use cases where you have almost like a tiering of models where, you know, there're some that are right at the edge like, you know, a big device like a car and then, you know, there's sort of the fog level which is the, say, cell towers or other buildings nearby and then there's something in the Cloud that's sort of like, master model or an ensemble of models, I don't assume that's like, Evel Knievel would say you know, "Don't try that at home," but sort-of, is the tooling being built to enable that? >> So the tooling is already in existence right now. You can actually go ahead right now and be able to build out prototypes, even full-level, full-range applications right on the Cloud, and you can do that thanks to Data Science Experience, you can do that thanks to IBM Bluemix. You can go ahead and do that type of analysis right there, and not only that, you can allow that analysis to actually guide you along the path from building a model to building a full-range application, and this is all happening on the Cloud level. We can talk more about it happening on an on-premises level, but on the Cloud level specifically, you can have those applications built on the fly, on the Cloud, and have them deployed for web apps, for mobile apps, et cetera. >> One of the things that you talked about is use cases in certain verticals. IBM has been very strong and vertically focused for a very long time, but you kind of almost answered the question that I'd like to maybe explore a little bit more about building these models, training the models, in say, health care or telco and being able to deploy them. Where are the horizontal benefits there that IBM would be able to deliver faster to other industries?
Definitely, I think the main thing is that IBM, first of all, gives you that opportunity, that platform to say that hey, you have a data set, you have a use case, let's give you the tooling, let's give you the methodology to take you from data, to a model, to ultimately that full-range application. And specifically, I've built some applications specific to federal health care, specifically to address clinical medicine and behavioral medicine, and that's allowed me to actually use IBM tools and some open source technologies as well to actually go out and build these applications on the fly as a prototype, to show not only the art of the possible when it comes to these technologies, but also to solve problems, because ultimately, that's what we're trying to accomplish here. We're trying to find real-world solutions to real-world problems. >> Linton, let me re-direct something towards you. A lot of people are talking about how Moore's Law is slowing down or even ending, well, at least in terms of speed of processors. But if you look at not just the CPU but the FPGA or ASIC or the tensor processing unit, which, I assume, is an ASIC, and you have the high-speed interconnects, if we don't look at just, you know, what can you fit on one chip, but you look at, in 3D, what's the density of transistors in a rack or in a data center, is that still growing as fast or faster, and what does it mean for the types of models that we can build? >> That's a great question.
One of the key things that we did with the OpenPOWER Foundation is to open up the interfaces to the chip. So with NVIDIA we have NVLink, which gives us a substantial increase in bandwidth, and we have created something called OpenCAPI, which is a coherent protocol, to get to other types of accelerators. So we believe in hybrid computing in that form, you saw NVIDIA on-stage this morning, and we believe, especially for deep learning, the acceleration provided by GPUs is going to continue to drive substantial growth. It's a very exciting time. >> Would it be fair to say that we're on the same curve, if we look at it, not from the point of view of, you know, what can we fit on a little square, but if we look at what can we fit in a data center or the power available to model things? You know, Jeff Dean at Google said, "If Android users talk into their phones for two to three minutes a day, we need two to three times the data centers we have." Can we grow that price performance faster and enable the sorts of things that we did not expect?
Having a medical degree and working in federal healthcare for IBM, you talked about some of the clinical work that you're doing and the models that you're helping to build. What are some of the mission-critical needs that you're seeing in health care today that are really kind of driving, not just health care organizations to do big data right, but to do data science right? >> Exactly, so I think one of the biggest questions that we get and one of the biggest needs that we get from the healthcare arena is patient-centric solutions. There are a lot of solutions that are hoping to address problems that are being faced by physicians on a day-to-day level, but there are not enough applications that are addressing the concerns, the pain points, that patients are facing on a daily basis. So the applications that I've started building out at IBM are all patient-centric applications that basically put their data, their symptoms, their diagnosis, in their hands alone and allow them to actually find out more or less what's going wrong with their body at any particular time during the day and then find the right healthcare professional or the right doctor that is best suited to treating that condition, treating that diagnosis. So I think that's the big thing that we've seen from the healthcare market right now: the big need that we have, that we're currently addressing with our Cloud analytics technology, which is just becoming more and more advanced and sophisticated and is trending towards some of the other health trends or technology trends that we have currently right now on the market, including the Blockchain, which is tending towards more of a de-centralized focus on these applications. So it's actually putting more of the data in the hands of the consumer, in the hands of the patient, and even in the hands of the doctor. >> Wow, fantastic. Well you guys, thank you so much for joining us on theCUBE.
Congratulations on your first time being on the show, Asad Mahmood and Linton Ward from IBM, we appreciate your time. >> Thank you very much. >> Thank you. >> And for my co-host George Gilbert, I'm Lisa Martin, you're watching theCUBE live on day one of the Data Works Summit from Silicon Valley but stick around, we've got great guests coming up so we'll be right back.

Published Date : Jun 13 2017


Wrap Up - IBM Machine Learning Launch - #IBMML - #theCUBE


 

(jazzy intro music) [Narrator] Live from New York, it's the Cube! Covering the IBM Machine Learning Launch Event, brought to you by IBM. Now, here are your hosts: Dave Vellante and Stu Miniman. >> Welcome back to New York City, everybody. This is theCUBE, the leader in live tech coverage. We've been covering, all morning, the IBM Machine Learning announcement. Essentially what IBM did is they brought Machine Learning to the z platform. My co-host, Stu Miniman, and I have been talking to a number of guests, and we're going to do a quick wrap here. You know, Stu, my take is, when we first heard about this, and the world first heard about this, we were like, "Eh, okay, that's nice, that's interesting." But what it underscores is IBM's relentless effort to continue to keep z relevant. We saw it with the early Linux stuff, we're now seeing it with all the OpenSource and Spark tooling. You're seeing IBM make big positioning efforts to bring analytics and transactions together, and the simple point is, a lot of the world's really important data runs on mainframes. You were just quoting some stats, which were pretty interesting. >> Yeah, I mean, Dave, you know, one of the biggest challenges we know in IT is migrating. Moving from one thing to another is really tough. I love the comment from Barry Baker: well, if I need to change my platform, by the time I've moved it, that whole digital transformation, we've missed that window. It's there. We know how long that takes: months, quarters. I was actually watching Twitter, and it looks like Chris Maddern is here. Chris was the architect of Venmo, which my younger sisters, all the millennials that I know, everybody uses Venmo. He's here, and he was like, "Almost all the banks, airlines, and retailers still run on mainframes in 2017, and it's growing. Who knew?"
You've got a guy here that's developing really cool apps who was finding this interesting, and that's an angle I've been looking at today, Dave: how do you make it easy for developers to leverage these platforms that are already there? The developers aren't going to need to care whether it's a mainframe or a cloud or x86 underneath. IBM is giving you the options, and as a number of our guests said, they're not looking to solve all the problems here. They're taking this really great, new type of application using machine learning and making it available on that platform that so many of their customers already use. >> Right, so we heard a little bit of roadmap here: the ML for z goes GA in Q1, and then we don't have specific timeframes, but we're going to see the Power platform pick this up. We heard from Jean-Francois Puget that they'll have an x86 version, and then obviously a cloud version. It's unclear what that hybrid cloud will look like. It's a little fuzzy right now, but that's something that we're watching. Obviously a lot of the model development and training is going to live in the cloud, but the scoring is going to be done locally; that's how the data scientists like to think about these things. So again, Stu, more mainframe relevance. We've got another cycle coming soon for the mainframe. We're two years into the z13. When IBM has mainframe cycles, it tends to give a little bump to earnings. Now, granted, a smaller and smaller portion of the company's business is mainframe, but still, mainframe drags a lot of other software with it, so it remains a strategic component. So one of the questions we get a lot is what's IBM doing in so-called hardware? Of course, IBM says it's all software, but we know they're still selling boxes, right? So, all the hardware guys, EMC, Dell, IBM, HPE, et cetera. A lot of software content, but it's still a hardware business. So there's really two platforms there: there's the z and there's the Power.
And those are both strategic to IBM. It sold its x86 business because it didn't see it as strategic. They just put Bob Picciano in charge of the Power business, so there's obviously real commitments to those platforms. Will they make a dent in the market share numbers? Unclear. It looks like it's steady as she goes, not dramatic increase in share. >> Yeah, and Dave, I didn't hear anybody come in here and say this offering is going to say, well let me dump x86 and go buy mainframe. That's not the target that I heard here. I would have loved to hear a little bit more as to where this fits into the broader IOT strategy. We talked a little bit on the intro, Dave. There's a lot of reasons why data's going to stick at the edge when we look at the numbers. For the huge growth of public cloud, the amount of data in public cloud hasn't caught up to the equivalent of what it would be in data centers itself. What I mean by that is, we usually spend, say 30% on average for storage costs inside a data center. If we look at public cloud, it's more around 10%. So, at AWS Reinvent, I talked to a number of the ecosystem partners, that started to see things like data lakes starting to appear in the cloud. This solution isn't in the data lake family, but it's with the analytics and everything that's happening with streaming and machine learning. It's large repositories of data and huge transactions of data that are happening in the mainframe, and just trying to squint through where all the data lives, and the new waves of technologies coming in. We heard how this can tie into some of the mobile and streaming activities that aren't on the mainframe, so that it can pull them into the other decisions, but some broader picture that I'm sure IBM will be able to give in the future. >> Well, normally you would expect a platform that is however many decades old the mainframe is, after the whole mainframe downsizing trend, you would expect there would be a managed decline in that business. 
I mean, you're seeing it in a lot of places now. We've talked about this with things like Symmetrix, right? You minimize and focus the R&D investments, you try to manage cost, and you manage the decline of the business. IBM has almost sort of flipped that. They say, okay, we've got DB2, we're going to continue to invest in that platform. We've got our major subsystems, and we're going to enhance the platform with open source technologies. We've got a big enough base that we can continue to mine perpetually. The more interesting thing to me about this announcement is it underscores how IBM is leveraging its analytics platform. So, we saw the announcement of the Watson Data Platform last September, which was sort of this end-to-end data pipeline collaboration engine between different personas, which is quite unique in the marketplace, a lot of differentiation there. Still some services. Last week at Spark Summit, I talked to some of the users and some of the partners of the Watson Data Platform. They said it's great, we love it, it's probably the most robust in the marketplace, but it's still a heavy lift. It still requires a fair amount of services, and IBM's still pushing those services. So a large portion of IBM is still a services company. Not surprising there, but as I've said many, many times, the challenge IBM has is to really drive that software business and simplify the deployment and management of that software for its customers, which is something that I think it's working hard on doing. And the other thing is you're seeing IBM leverage those platforms, those analytics platforms, into different hardware segments, or hardware/cloud segments, whether it's BlueMix, z, or Power, pushing it out through the organization. IBM still has a stack, like Oracle has a stack, so wherever it can push its own stack, it's going to do that, cuz the margins are better. At the same time, I think it understands very well, it's got to have open source choice.
>> Yeah, absolutely, and that's something we heard loud and clear here, Dave, which is what we expect from IBM: choice of language, choice of framework. When I hear the public cloud guys, it's like, "Oh, well here's kind of the main focus we have, "and maybe we'll have a little bit of choice there." Absolutely the likes of Google and Amazon are working with open source, but at least first blush, when I look at things, it looks like once IBM fleshes this out -- and as we've said, it's the Spark to start and others that they're adding on -- but IBM could have a broader offering than I expect to see from some of the public cloud guys. We'll see. As you know, Dave, Google's got their cloud event in a couple of weeks in San Francisco. We'll be covering that, and of course Amazon, you expect their regular cadence of announcements that they'll make. So, definitely a new front in the Cloud Wars as it were, for machine learning. >> Excellent! Alright, Stu, we got to wrap, cuz we're broadcasting the livestream. We got to go set up for that. Thanks, I really appreciate you coming down here and co-hosting with me. Good event. >> Always happy to come down to the Big Apple, Dave. >> Alright, good. Alright, thanks for watching, everybody! So, check out SiliconAngle.com, you'll get all the new from this event and around the world. Check out SiliconAngle.tv for this and other CUBE activities, where we're going to be next. We got a big spring coming up, end of winter, big spring coming in this season. And check out WikiBon.com for all the research. Thanks guys, good job today, that's a wrap! We'll see you next time. This is theCUBE, we're out. (jazzy music)

Published Date : Feb 15 2017


Barry Baker, IBM - IBM Machine Learning Launch - #IBMML - #theCUBE


 

>> [Narrator] Live from New York, it's theCUBE! Covering the IBM Machine Learning Launch Event, brought to you by IBM. Now, here are your hosts: Dave Vellante and Stu Miniman. >> Hi everybody, we're back, this is theCUBE. We're live at the IBM Machine Learning Launch Event. Barry Baker is here, he's the Vice President of Offering Management for z Systems. Welcome to theCUBE, thanks for coming on! >> Well, it's my first time, thanks for having me! >> A CUBE newbie, alright! Let's get right into it! >> [Barry Baker] Go easy! >> So, two years ago, January of 2015, we covered the z13 launch. The big theme there was bringing analytics and transactions together, z13 being the platform for that. Today, we're hearing about machine learning on mainframe. Why machine learning on mainframe, Barry? >> Well, for one, it is all about the data on the platform, and the applications that our clients have on the platform. And it becomes a very natural fit for predictive analytics and what you can get from machine learning. So whether you're trying to do churn analysis or fraud detection at the moment of the transaction, it becomes a very natural place for us to inject what is pretty advanced capability from a machine learning perspective into the mainframe environment. We're not trying to solve all analytics problems on the mainframe, we're not trying to become a data lake, but for the applications and the data that reside on the platform, we believe it's a prime use case that our clients are waiting to adopt. >> Okay, so help me think through the use case of I have all this transaction data on the mainframe. Not trying to be a data lake, but I've got this data lake elsewhere, that might be useful for some of the activity I want to do. How do I do that? I'm presuming I'm not extracting my sensitive transaction data and shipping it into the data lake. So, how am I getting access to some of that social data or other data? 
>> Yeah, and we just saw an example in the demo pad before, whereby the bulk of the data you want to perform scoring on, and also the machine learning on to build your models, is resident on the mainframe, but there does exist data out there. In the example we just saw, it was social data. So the demo that was done was how you can take and use IBM Bluemix and get at key pieces of social data. Not a whole mass of the volume of unstructured data that lives out there. It's not about bringing that to the platform and doing machine learning on it. It's about actually taking a subset of that data, a filtered subset that makes sense to be married with the bigger data set that sits on the platform. And so that's how we envision it. We provide a number of ways to do that through the IBM Machine Learning offering, where you can marry data sources from different places. But really, the bulk of the data needs to be on z and on the platform for it to make sense to have this workload running there. >> Okay. One of the big themes, of course, that IBM puts forth is platform modernization, application modernization. I think it kind of started with Linux on z? Maybe there were other examples, but that was a big one. I don't know what the percentage is, but a meaningful percentage of workloads running on z are Linux-based, correct? >> Yeah, so, the way I would view it is it's still today that the majority of workload on the platform is z/OS based, but Linux is one of our fastest growing workloads on the platform. And it is about how do you marry and bring other capabilities and other applications closer to the systems of record that is sitting there on z/OS. >> So, last week, at AnacondaCON, you announced Anaconda on z, certainly Spark, a lot of talk on Spark. Give us the update on the sort of tooling. >> We recognized a few years back that Spark was going to be key to our platform longer-term. So, contrary to what people have seen from z in the past, we jumped on it fast. 
We view it as an enabling technology, an enabling piece of infrastructure that allows for analytics solutions to be built and brought to market really rapidly. And the machine learning announcement today is proof of that. In a matter of months, we've been able to take the cloud-based IBM Watson Machine Learning offering and have a big chunk of it run on the mainframe, because of the investment we made in Spark a year and a half, two years ago. We continue to invest in Spark; we're at the 2.0.2 level. The announcement last week around Anaconda is, again, how do we continue to bring the right infrastructure, from an analytics perspective, onto the platform. And you'll see later, maybe in the session, that the roadmap for ML isn't just based on Spark. The roadmap for ML also requires us to go after and provide new runtimes and new languages on the platform, like Python and Anaconda in particular. So, it's a coordinated strategy where we're laying the foundation on the infrastructure side to enable the solutions from the analytics unit. >> Barry, when I hear about streaming, it reminds me of the general discussion we've been having with customers about digital transformation. How does mainframe fit into that digital mandate that you hear from customers? >> That's a great, great question. From our perspective, we've come out of the woods of many of our discussions with clients being about, "I need to move off the platform," and rather, "I need to actually leverage this platform, because with the time it's going to take me to move off this platform, by the time I do that, digital's going to overwash me and I'm going to be gone." So the very first step that our clients take, and some of our leading clients take, on the platform for digital transformation, is moving toward standard RESTful APIs, taking z/OS Connect Enterprise Edition, putting that in front of their core, mission-critical applications and data stores, and enabling those assets to be exposed externally.
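The pattern described above — z/OS Connect putting a standard RESTful face in front of core applications so callers never see what's underneath — can be illustrated with a toy dispatcher. Everything here is invented for illustration: the route table and the "legacy" Python functions stand in for configuration-driven mappings to CICS, IMS, or DB2 assets; z/OS Connect itself is configured, not written in Python.

```python
# Stand-ins for existing backend transactions (in reality these would
# be CICS/IMS programs or DB2 queries, not Python functions).
def legacy_get_balance(acct):
    return {"account": acct, "balance": 1042.17}

def legacy_get_history(acct):
    return {"account": acct, "transactions": ["dep 500.00", "wd 120.50"]}

# The API layer: stable, RESTful-looking routes in front of the legacy
# assets, so callers depend only on the HTTP contract.
ROUTES = {
    ("GET", "/accounts/{id}/balance"): legacy_get_balance,
    ("GET", "/accounts/{id}/history"): legacy_get_history,
}

def handle(method, template, acct_id):
    """Dispatch a (method, path-template) pair to its backend asset."""
    handler = ROUTES.get((method, template))
    if handler is None:
        return {"status": 404}
    return {"status": 200, "body": handler(acct_id)}

print(handle("GET", "/accounts/{id}/balance", "0071"))
```

Because the mobile or web app only ever sees the route, the backend behind it can be progressively modernized without breaking any caller.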
And what's happening is those clients then build out new, engaging mobile and web apps that come directly back to the mainframe at those high-value assets. But in addition, what that is driving is a whole other set of interaction patterns that we're actually able to see on the mainframe in how they're being used. So, opening up the API channel is the first step our clients are taking. Next is how do they take the 200 billion lines of COBOL code that is out there in the wild, running on these systems, and how do they modernize it over time? And we have some leading clients that are doing very tight integration whereby they have a COBOL application, and as they want to make changes to it, we give them the ability to make those changes, but do it in Java, or do it in another, more modern language, tightly integrated with the COBOL runtime. So, we call that progressive modernization. It's not about coming in, replacing the whole app, and rewriting that thing. That's one next step on the journey, and then as the clients start to do that, they start to really need to lay down a continuous integration, continuous delivery tool chain, building a whole dev ops end-to-end flow. That's the path that our clients are on for really getting much faster and getting more productivity out of their development side of things. And in turn, the platform is now becoming a platform that they can deliver results on, just like they could on any other platform. >> That's big, because a lot of customers used to complain, well, I can't get COBOL skills, and IBM's answer was often, well, we've got 'em, you can outsource it to us. And that's not always the preferred approach, so glad to hear you're addressing that. On the dev ops discussion, a lot of times dev ops is about breaking stuff, whereas the mainframe workload's all about not breaking stuff, so waterfall, more traditional methodologies are still appropriate.
Can you help us understand how customers are dealing with that sort of schism? >> Yeah, I think some people would come at dev ops and say it's just about moving fast, breaking some eggs, cleaning up the mess, and moving forward from there, but from our perspective, that's not it, right? That can't be it for our customers, because the criticality of these systems will not allow it. So our dev ops model is not so much about move fast and break some eggs; it's about move fast in smaller increments, establishing a clear chain and a clear pipeline with automated test suites getting executed and run at each phase of the pipeline before you move to production. Our approach is not to compromise on quality as you move towards dev ops, and we have, internally, our major subsystems, right? So, CICS, IMS, DB2. They're all on their own journey to deliver and move towards continuous integration and dev ops internally. So, we're eating our own... We're dogfooding this here, right? We're building our own teams around this, and we're not seeing a decline in quality. In fact, as we start to really move testing to the left, as they call it, shift-left testing, right? Earlier in the cycle you regression test. We are seeing better quality come because of that effort. >> You put forth this vision, as I said, at the top of this segment, this vision of bringing data and analytics and transactions together. That was the z13 announcement. But the reality is, a lot of customers would have their mainframe, and then they'd have, you know, some other data warehouse, and some InfiniBand pipe to that data warehouse was their approximation of real time. So, the vision that you put forth was to consolidate that. Has that happened? Are you starting to do that? What are they doing with the data warehouse? >> So, we're starting to see it. I mean, frankly, we have clients that struggle with that model, right?
And that's precisely why we have a very strong point of view that says, if this is data that you're going to get value from, from an analytics perspective, and you can use it on the platform, moving it off the platform is going to create a number of challenges for you. And we've seen it firsthand. We've seen companies that ETL the data off the platform. They end up with 9, 10, 12 copies of the data. As soon as you do that, the data is old, it's stale, and so any insights you derive are then going to be potentially old and stale as well. The other side of it is, our customers are in the industries that are heavy users of the mainframe: finance, banking, healthcare. These are heavily regulated industries that are getting more regulated, and they're under more pressure to ensure governance and meet the various regulatory requirements. As soon as you start to move that data off the platform, your problem just got that much harder. So, we are seeing a shift in approaches, and it's going to take some time for clients to get past this, right? Because enterprise data warehouse is a pretty big market and there are a lot of them out there, but we're confident that for specific use cases, it makes a great deal of sense to leave the data where it is, bring the analytics as close to that data as possible, and leverage the insight right there at the point of impact as opposed to pushing it off. >> How about the economics? I have certainly talked to customers that understand that, for a lot of the work they're doing, doing it on the z platform is more cost effective than, maybe, trying to manage a bunch of bespoke x86 boxes, no question. But at the end of the day, there's still that CAPEX. What is IBM doing to help customers absorb the costs and bring together, more aggressively, analytic and transaction data?
>> Yeah, so, in agreement, 100%. I think we can create the best technology in the world, but if we don't close on the financials, it's not going to go anywhere, it's not going to move. So, from an analytics perspective, starting at the ground level with Spark, even underneath the Spark layer, there are things we've done in the hardware to accelerate performance, and so that's one layer. Then you move into Spark. Well, Spark is running on our Java, our JDK, and it takes advantage of being moved off to the zIIP offload processors. Those processors alone are lower cost than general-purpose processors. We then have additionally thought this through in terms of working with clients and seeing that, you know, a typical use case for running Spark on the platform requires three or four zIIPs and then a hundred, two hundred gig of additional memory. We've come at that as, let's do a bundled offer that comes in and says, for that workload, we're going to come in with a different price point for you. The other side of it is, we've been delivering, over the last couple of years, ways to isolate workload from a software license cost perspective, right. 'Cause the other knock that people will raise is, as I add new workload, it impacts all the rest of my software. Well, no. There are multiple paths forward for you to isolate that workload, add new workload to the platform, and not have it impact your existing MLC charges, so we continue to evolve that and make it easier to do, but that's something we're very focused on.
So there's zCAP and there are other pricing mechanisms that we can take advantage of to help you. The way I simply say it is, for new workload, we need to enable the pricing to be supportive of growth, right, not protective, and so we are very focused on how we do this in the right way, so that clients can adopt it, take advantage of the capabilities, and also do it in a cost-effective way. >> And what about security? That's another big theme that you guys have put forth. What's new there? >> Yeah, so we have a lot underway from the security perspective. I'm going to say stay tuned, more to come there, but there's a heavy investment, again, going back to what our clients are struggling with and what we hear day in and day out, around how do I do encryption pervasively across the platform for all of the data being managed by the system, how do I do that with ease, and how do I do that without having to drive changes at the application layer or drive operational changes. How do I enable these systems to get that much more secure, with ease and at low cost. >> Right, because in an ideal world you'd encrypt everything, but there's a cost to doing that. There are some downstream nuances with things like compression. >> Yup. >> And so forth. Okay, so more to come there. We'll stay tuned. >> More to come. >> Alright, we'll give you the final word. Big day for you guys, so congratulations on the announcement. You've got a bunch of customers who're comin' in very shortly. >> We're extremely excited to be here. We think that the combination of IBM Systems working with the IBM Analytics team to put forward an offering that pulls key aspects of Watson and delivers it on the mainframe is something that will get noticed and actually solve some real challenges, so we're excited. >> Great. Barry, thanks very much for coming to theCUBE, appreciate it. >> Thanks for having me.
Thanks for going easy on me. >> You're welcome. Keep it right there. We'll be back with our next guest, right after this short break. (techno music)
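The pattern Baker keeps returning to in this segment — score each transaction in place, next to the live system-of-record data, instead of against a stale, ETL'd copy — can be sketched like this. The model, thresholds, and data are hypothetical stand-ins, not IBM's scoring logic:

```python
# Minimal sketch of scoring "at the point of impact": the model runs inline
# in the transaction path, against live account history, pre-commit --
# rather than against a nightly copy that has already gone stale.

def score(txn, history):
    """A stand-in model: flag amounts far outside the account's history."""
    avg = sum(history) / len(history) if history else 0.0
    return 0.9 if txn["amount"] > 10 * max(avg, 1.0) else 0.1

ledger = {"acct-7": [25.0, 40.0, 31.0]}  # live system-of-record history

def process(txn):
    risk = score(txn, ledger.get(txn["acct"], []))  # scored in place
    if risk >= 0.5:
        return "review"
    ledger[txn["acct"]].append(txn["amount"])       # commit
    return "approved"

print(process({"acct": "acct-7", "amount": 30.0}))    # approved
print(process({"acct": "acct-7", "amount": 5000.0}))  # review
```

Because the history the model sees is the one the transaction system is writing to, the second transaction is judged against data that includes the first — no copy, no lag.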

Published Date : Feb 15 2017

Jean Francois Puget, IBM | IBM Machine Learning Launch 2017


 

>> Announcer: Live from New York, it's theCUBE, covering the IBM machine learning launch event. Brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. >> Alright, we're back. Jean Francois Puget is here, he's the distinguished engineer for machine learning and optimization at IBM Analytics, CUBE alum. Good to see you again. >> Yes. >> Thanks very much for coming on, big day for you guys. >> Jean Francois: Indeed. >> It's like giving birth every time you guys launch one of these products. We saw you a little bit in the analyst meeting, pretty well attended. Give us the highlights from your standpoint. What are the key things that we should be focused on in this announcement? >> For most people, machine learning equals machine learning algorithms. When you look at newspapers or blogs or social media, it's all about algorithms. Our view is that, sure, you need algorithms for machine learning, but you need steps before you run algorithms, and after. Before, you need to get data and transform it to make it usable for machine learning. Then, you run algorithms. These produce models, and then you need to move your models into a production environment. For instance, you use an algorithm to learn from past credit card transaction fraud. You can learn models, patterns, that correspond to fraud. Then, you want to use those models, those patterns, in your payment system. And moving from where you run the algorithm to the operational system is a nightmare today, so our value is to automate what you do before you run algorithms, and then what you do after. That's our differentiator. >> I've had some folks in theCUBE, years ago actually, who said, "You know what, algorithms are plentiful." I remember my friend Avi Mehta made the statement, "Algorithms are free. It's what you do with them that matters." >> Exactly, and I believe it's clear that open source has won for machine learning algorithms.
Now the future is with open source, clearly. But it solves only a part of the problem you're facing if you want to put machine learning into action. So, exactly what you said: what you do with the results of the algorithm is key. And open source people don't care much about it, for good reasons. They are focusing on producing the best algorithm. We are focusing on creating value for our customers. It's different. >> You mentioned open source a couple of times. In terms of customer choice, what's your philosophy with regard to the various tooling and platforms for open source? How do you go about selecting which to support? >> Machine learning is fascinating. It's overhyped, maybe, but it's also moving very quickly. Every year there is new cool stuff. Five years ago, nobody spoke about deep learning. Now it's everywhere. Who knows what will happen next year? Our take is to support open source, to support the top open source packages. We don't know which one will win in the future. We don't even know if one will be enough for all needs. We believe one size does not fit all, so our take is to support a curated list of major open source. We start with Spark ML for many reasons, but we won't stop at Spark ML. >> Okay, I wonder if we can talk use cases. Two of my favorite, well, let's just start with fraud. Fraud detection has become much, much better over the past 10 years, certainly, but it's still not perfect. I don't know if perfection is achievable, but there are a lot of false positives. How will machine learning affect that? Can we expect as consumers even better fraud detection in more real time? >> If we think of the full life cycle going from data to value, we will provide a better answer. We still use machine learning algorithms to create models, but a model does not tell you what to do. It will tell you, okay, this credit card transaction coming in has a high probability to be fraud. Or this one has a lower priority, uh, probability.
But then it's up to the designer of the overall application to make decisions, so what we recommend is to use machine learning predictions, but not only those, and then use, maybe, (murmuring). For instance, if your machine learning model tells you this is a fraud with a high probability, say 90%, and this is a customer you know very well, a 10-year customer, then you can be confident that it's a fraud. Then if the next prediction tells you this is a 70% probability, but it's a customer of only one week, in a week we don't get to know the customer, so the confidence we can put in the machine learning should be low, and there you will not reject the transaction immediately. Maybe you don't approve it automatically, maybe you send a one-time passcode, or you route it through a separate verification system, but you don't reject it outright. Really, the idea is to use machine learning predictions as yet another input for making decisions. You're making decisions informed by what you could learn from your past. But it's not replacing human decision-making. That's our approach at IBM. You don't see IBM speak much about artificial intelligence in general because we don't believe we're here to replace humans. We're here to assist humans, so we say augmented intelligence, or assistance. That's the role we see for machine learning. It will give you additional data so that you make better decisions. >> It's not the concept that you object to, it's the term artificial intelligence. It's really machine intelligence, it's not fake. >> I started my career with a PhD in artificial intelligence, I won't say when, but long enough ago. At that time, there were already promises that we'd have Terminator in the next decade, and this and that. The same happened in the '60s, or just after the '60s. And then there was an AI winter, and we have a risk here of another AI winter, because some people are making claims that are not substantiated, I believe.
I don't think the technology's here for us to replace human decision-making altogether any time soon, but we can help. We can certainly make people more proficient, more efficient, more productive with machine learning. >> Having said that, there are a lot of cognitive functions that are getting replaced, maybe not by so-called artificial intelligence, but certainly by machines and automation. >> Yes, so we're automating a number of things, and maybe we won't need to have people do quality checks when an automated vision system can detect defects. Sure, we're automating more and more, but this is not new; it has been going on for centuries. >> Well, the list evolved. So, what can humans do that machines can't, and how would you expect that to change? >> We're moving away from IBM machine learning, but it is interesting. You know, each time there is a capability that a machine can automate, we basically redefine intelligence to exclude it. That's what I foresee. >> Yeah, well, robots a while ago, Stu, couldn't climb stairs, and now, look at that. >> Do we feel threatened because a robot can climb a stair faster than us? Not necessarily. >> No, it doesn't bother us, right. Okay, question? >> Yeah, so I guess, bringing it back down to the solution that we're talking about today, if I'm now doing the analytics, the machine learning, on the mainframe, how do we make sure that we don't overrun and blow out all our MIPS? >> We recommend not using the mainframe's base compute capacity. We recommend using zIIPs, the additional specialty processors, so as not to overload the system; it's a very important point. We claim, okay, if you do everything on the mainframe, you can learn from operational data. You don't want to disturb, and "you don't want to disturb" takes a lot of different meanings. One that you just said: you don't want to slow down your operational processing, because you're going to hurt your business. But you also want to be careful.
Say we have a payment system where there is a machine learning model predicting fraud probability as a part of the system. You don't want a young, bright data scientist to decide that he has a great idea, a great model, and push his model into production without asking anyone. So you want to control that. That's why we insist we are providing governance, which includes a lot of things, like keeping track of how models were created and from which data sets, so lineage. We also want to have access control and not allow just anyone to deploy a new model because we make it easy to deploy, so we want to have role-based access, and only someone with some executive, well, it depends on the customer, but not everybody can update the production system, and we want to support that. And that's something that differentiates us from open source. Open source developers, they don't care about governance. It's not their problem, but it is our customers' problem, so this solution will come with all the governance and integrity constraints you can expect from us. >> Can you speak to, the first solution's going to be on z/OS, what does the roadmap look like, and what are some of the challenges of rolling this out to other private cloud solutions? >> We are going to ship IBM Machine Learning for z this quarter. It starts with Spark ML as a base open source. This is interesting, but it's not all there is for machine learning. So that's how we start; we're going to add more in the future. Last week we announced we will ship Anaconda, which is a major distribution for the Python ecosystem, and it includes a number of machine learning open source packages. We announced it for next quarter. >> I believe the press release said down the road things like TensorFlow are coming, H2O. >> Anaconda will come next quarter, so we will leverage this when it's out.
Then indeed, we have a roadmap to include the major open source, so the major open source are the ones from Anaconda (murmuring), mostly. Key deep learning packages, so TensorFlow and probably one or two additional ones, we're still discussing. One that I'm very keen on is called XGBoost, in one word. People don't speak about it in newspapers, but this is what wins all Kaggle competitions. Kaggle is a machine learning competition site. When I say all, I mean all that are not image recognition competitions. >> Dave: And that was ex-- >> XGBoost, X-G-B-O-O-S-T. >> Dave: XGBoost, okay. >> XGBoost, and it's-- >> Dave: X-ray gamma, right? >> It's really a package. When I say we don't know which package will win, XGBoost was introduced a year ago, or maybe a bit more, not so long ago, and now, if you have structured data, it is the best choice today. It's a really fast-moving space, but so, we will support the major deep learning packages and the major classical machine learning packages, like the ones from Anaconda or XGBoost. The other thing is we start with z. We announced in the analyst session that we will have a Power version and a private cloud, meaning x86, version as well. I can't tell you when because it's not firm, but it will come. >> And in public cloud as well, I guess. You've got components in the public cloud today, like the Watson Data Platform, that you've extracted and put here. >> We have extracted part of the Data Science Experience, so we've extracted notebooks and a graphical tool called ModelBuilder from DSX as part of IBM Machine Learning now, and we're going to add more of DSX as we go. But the goal is to really share code and function across private cloud and public cloud. As Rob Thomas defined it, we want private cloud to offer all the features and functionality of public cloud, except that it runs inside a firewall. We are really developing Machine Learning and Watson Machine Learning on a common code base. It's an internal open source project.
We share code, and then we ship on different platforms. >> I mean, you haven't, just now, used the word hybrid. Every now and then IBM does, but do you see that so-called hybrid use case as viable, or do you see it more as, some workloads should run on prem, some should run in the cloud, and maybe they'll never come together? >> Machine learning basically has two phases: one is training and the other is scoring. I see people moving training to cloud quite easily, unless there is some regulation about data privacy. Training is a good fit for cloud because usually you need a large computing system, but only for a limited time, so elasticity's great. But then deployment: if you want to score a transaction inside a CICS transaction, it has to run beside CICS, not in the cloud. If you want to score data on an IoT gateway, you want to score on the gateway, not in a data center. That may not be what people think of first, but what will really drive the split between public cloud, private, and on prem is where you want to apply your machine learning models, where you want to score. For instance, smart watches, they're essentially health and fitness measurement systems. You want to score your health data on the watch, not on the internet somewhere. >> Right, and in that CICS example that you gave, you'd essentially be bringing the model to the CICS data, is that right? >> Yes, that's what we do. That's the value of Machine Learning for z: if you want to score transactions happening on z, you need to be running on z. So it's clear, mainframe people don't want to hear about public cloud, so they will be the last ones moving. They have their reasons, but they like the mainframe because it is really, really secure and private.
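Puget's Kaggle aside centers on XGBoost, a gradient-boosted tree package. The core idea behind gradient boosting — fit each new weak learner to the residuals of the ensemble built so far — can be sketched with one-split "stumps" on toy 1-D data. This is an illustration of the principle only, not XGBoost's actual algorithm (no regularization, second-order gradients, or tree-growing tricks):

```python
# Boosted decision stumps for 1-D regression: each round fits the best
# single-threshold split to the current residuals, then adds a damped
# copy of it to the ensemble.

def fit_stump(xs, residuals):
    """Find the single threshold split minimizing squared error."""
    best = None
    for t in xs:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - (lm if x <= t else rm)) ** 2
                  for x, r in zip(xs, residuals))
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x, t=t, lm=lm, rm=rm: lm if x <= t else rm

def boost(xs, ys, rounds=20, lr=0.5):
    pred = [0.0] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, pred)]  # what's left to explain
        s = fit_stump(xs, residuals)
        stumps.append(s)
        pred = [p + lr * s(x) for p, x in zip(pred, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 5.1, 4.8, 5.3]  # a step function plus noise
model = boost(xs, ys)
err = sum((model(x) - y) ** 2 for x, y in zip(xs, ys))
print(round(err, 3))  # small after 20 rounds
```

Each round can only shrink the training error, because the stump is a least-squares fit to whatever the ensemble has not yet explained; XGBoost industrializes exactly this loop.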
>> You've got one of those, too. Jean Francois, thanks very much for coming on theCUBE, it was really a pleasure having you back. >> Thank you. >> You're welcome. Alright, keep it right there, everybody. We'll be back with our next guest. This is theCUBE, we're live from the Waldorf Astoria. IBM's machine learning announcement, be right back. (electronic keyboard music)
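The augmented-intelligence pattern Puget describes — treating the model's fraud probability as one input alongside business context such as customer tenure — might look like the following sketch. The thresholds and the step-up action are illustrative, not IBM's actual rules:

```python
# The model scores; the application decides. Confidence in a high score
# is tempered by how much history the business has on the customer.

def decide(fraud_probability, customer_tenure_days):
    """Return 'approve', 'step-up' (e.g. one-time passcode), or 'reject'."""
    # With a long-tenured customer, a confident model can be trusted outright.
    if fraud_probability >= 0.9 and customer_tenure_days >= 365:
        return "reject"
    # A week-old account gives the model little history to learn from, so
    # even a high score only triggers extra verification, not rejection.
    if fraud_probability >= 0.7:
        return "step-up"
    return "approve"

print(decide(0.90, customer_tenure_days=3650))  # reject
print(decide(0.70, customer_tenure_days=7))     # step-up
print(decide(0.10, customer_tenure_days=7))     # approve
```

The prediction is "yet another input": the same 90% score leads to different actions depending on what else the business knows.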

Bryan Smith, Rocket Software - IBM Machine Learning Launch - #IBMML - #theCUBE


 

>> Announcer: Live from New York, it's theCUBE, covering the IBM Machine Learning Launch Event, brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. >> Welcome back to New York City, everybody. We're here at the Waldorf Astoria covering the IBM Machine Learning Launch Event, bringing machine learning to the IBM Z. Bryan Smith is here, he's the vice president of R&D and the CTO of Rocket Software, powering the path to digital transformation. Bryan, welcome to theCUBE, thanks for coming on. >> Thanks for having me. >> So, Rocket Software, Waltham, Mass. based, close to where we are, but a lot of people don't know about Rocket, so pretty large company, give us the background. >> It's been around for, this'll be our 27th year. Private company, we've been a partner of IBM's for the last 23 years. Almost all of that is in the mainframe space, or we focused on the mainframe space, I'll say. We have 1,300 employees, we call ourselves Rocketeers. It's spread around the world. We're really an R&D focused company. More than half the company is engineering, and it's spread across the world on every continent and most major countries. >> You're esstenially OEM-ing your tools as it were. Is that right, no direct sales force? >> About half, there are different lenses to look at this, but about half of our go-to-market is through IBM with IBM-labeled, IBM-branded products. We've always been, for the side of products, we've always been the R&D behind the products. The partnership, though, has really grown. It's more than just an R&D partnership now, now we're doing co-marketing, we're even doing some joint selling to serve IBM mainframe customers. The partnership has really grown over these last 23 years from just being the guys who write the code to doing much more. >> Okay, so how do you fit in this announcement. Machine learning on Z, where does Rocket fit? >> Part of the announcement today is a very important piece of technology that we developed. 
We call it data virtualization. Data virtualization is really about enabling customers to open their mainframe and allow the data to be used in ways it was never designed to be used. You might have data structures that were designed 10, 20, even 30 years ago for a very specific application, but today they want to use them in a very different way, and so the traditional path is to take that data and copy it, to ETL it someplace else, so they can get some new use out of it or build some new application. What data virtualization allows you to do is to leave that data in place but access it using APIs that developers want to use today. They want to use JSON access, for example, or they want to use SQL access. But they want to be able to do things like join across IMS, DB2, and VSAM, all with a single query, using an SQL statement. We can do that across relational databases and non-relational databases. It gets us out of this mode of having to copy data into some other data store through this ETL process; we access the data in place. We call it moving the applications or the analytics to the data, versus moving the data to the analytics or to the applications.
I love the visual that presents in my mind that you have these different masses, these different planets if you will, and the biggest, massivest planet in that solar system really is the data, and so, it's pulling the smaller satellites if you will into this planet or this star by way of gravity because data is, data's a new currency, data is what the companies are running on. We're helping in this announcement with being able to unlock and open up all mainframe data sources, even some non-mainframe data sources, and using things like Spark that's running on the platform, that's running on z/OS to access that data directly without having to write any special programming or any special code to get to all their data. >> And the preferred place to run all that data is on the mainframe obviously if you're a mainframe customer. One of the questions I guess people have is, okay, I get that, it's the transaction data that I'm getting access to, but if I'm bringing transaction and analytic data together a lot of times that analytic data might be in social media, it might be somewhere else not on the mainframe. How do envision customers dealing with that? Do you have tooling them to do that? >> We do, so this data virtualization solution that I'm talking about is one that is mainframe resident, but it can also access other data sources. It can access DB2 on Linux Windows, it can access Informix, it can access Cloudant, it can access Hadoop through IBM's BigInsights. Other feeds like Twitter, like other social media, it can pull that in. The case where you'd want to do that is where you're trying to take that data and integrate it with a massive amount of mainframe data. It's going to be much more highly performant by pulling this other small amount of data into, next to that core business data. >> I get the performance and I get the security of the mainframe, I like those two things, but what about the economics? >> Couple of things. 
One, IBM when they ported Spark to z/OS, they did it the right way. They leveraged the architecture, it wasn't just a simple port of recompiling a bunch of open source code from Apache, it was rewriting it to be highly performant on the Z architecture, taking advantage of specialty engines. We've done the same with the data virtualization component that goes along with that Spark on z/OS offering that also leverages the architecture. We actually have different binaries that we load depending on which architecture of the machine that we're running on, whether it be a z9, an EC12, or the big granddaddy of a z13. >> Bryan, can you speak to the developers? I think about, you're talking about all this mobile and Spark and everything like that. There's got to be certain developers that are like, "Oh my gosh, there's mainframe stuff. "I don't know anything about that." How do you help bridge that gap between where it lives and the tools that they're using? >> The best example is talking about embracing this API economy. And so, developers really don't care where the stuff is at, they just want it to be easy to get to. They don't have to code up some specific interface or language to get to different types of data, right? IBM's done a great job with the z/OS Connect in opening up the mainframe to the API economy with ReSTful interfaces, and so with z/OS Connect combined with Rocket data virtualization, you can come through that z/OS Connect same path using all those same ReSTful interfaces pushing those APIs out to tools like Swagger, which the developers want to use, and not only can you get to the applications through z/OS Connect, but we're a service provider to z/OS Connect allowing them to also get to every piece of data using those same ReSTful APIs. >> If I heard you correctly, the developer doesn't even need to worry about the fact that it's on a mainframe or speak mainframe or anything like that, right? >> The goal is that they never do.
That they simply see in their tool-set, again like Swagger, that they have data as well as different services that they can invoke using these very straightforward, simple ReSTful APIs. >> Can you speak to the customers you've talked to? You know, there's certain people out in the industry, I've had this conversation for a few years at IBM shows is there's some part of the market that are like, oh, well, the mainframe is this dusty old box sitting in a corner with nothing new, and my experience has been the containers and cool streaming and everything like that, oh well, you know, mainframe did virtualization and Linux and all these things really early, decades ago and is keeping up with a lot of these trends with these new type of technologies. What do you find in the customers that, how much are they driving forward on new technologies, looking for that new technology and being able to leverage the assets that they have? >> You asked a lot of questions there. The types of customers certainly financial and insurance are the big two, but that doesn't mean that we're limited and not going after retail and helping governments and manufacturing customers as well. What I find is talking with them that there's the folks who get it and the folks who don't, and the folks who get it are the ones who are saying, "Well, I want to be able "to embrace these new technologies," and they're taking things like open source, they're looking at Spark, for example, they're looking at Anaconda. Last week, we just announced at the Anaconda Conference, we stepped on stage with Continuum, IBM, and we, Rocket, stood up there talking about this partnership that we formed to create this ecosystem because the development world changes very, very rapidly. For a while, all the rage was JDBC, or all the rage was component broker, and so today it's Spark and Anaconda are really in the forefront of developers' minds. 
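As a hedged illustration of the developer experience described above, the sketch below builds a plain ReSTful URL and handles an ordinary JSON payload. The host, path, query parameters, and response fields are all hypothetical; this is not the actual z/OS Connect interface, only the shape of what a developer would see.

```python
import json
from urllib.parse import urlencode

# Hypothetical sketch of the developer's view through a ReST gateway such as
# z/OS Connect: an ordinary URL plus JSON, with no IMS/DB2/VSAM specifics
# leaking through. Host, path, and field names are invented for illustration.
base = "https://zosconnect.example.com/accounts"
url = f"{base}?{urlencode({'acctId': 1001, 'fields': 'balance,status'})}"

# A response like this is all the developer handles -- plain JSON.
sample_response = '{"acctId": 1001, "balance": 2500.0, "status": "OPEN"}'
account = json.loads(sample_response)
print(url)
print(account["balance"])  # 2500.0
```

Whether the answer ultimately comes from VSAM or DB2 is invisible at this layer, which is the point Smith is making.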
We're constantly moving to keep up with developers because that's where the action's happening. Again, they don't care where the data is housed as long as you can open that up. We've been playing with this concept that came up from some research firm called two-speed IT where you have maybe your core business that has been running for years, and it's designed to really be slow-moving, very high quality, it keeps everything running today, but they want to embrace some of their new technologies, they want to be able to roll out a brand-new app, and they want to be able to update that multiple times a week. And so, this two-speed IT says, you're kind of breaking 'em off into two separate teams. You don't have to take your existing infrastructure team and say, "You must embrace every Agile "and every DevOps type of methodology." What we're seeing customers be successful with is this two-speed IT where you can fracture these two, and now you need to create some nice integration between those two teams, so things like data virtualization really help with that. It opens up and allows the development teams to very quickly access those assets on the mainframe in this case while allowing those developers to very quickly crank out an application where quality is not that important, where being very quick to respond and doing lots of AB testing with customers is really critical. >> Waterfall still has its place. As a company that predominantly, or maybe even exclusively is involved in mainframe, I'm struck by, it must've been 2008, 2009, Paul Maritz comes in and he says VMWare, our vision is to build the software mainframe. And of course the world said, "Ah, that's, mainframe's dead," we've been hearing that forever. In many respects, I credit VMWare, they built sort of a form of software mainframe, but now you hear a lot of talk, Stu, about going back to bare metal. You don't hear that talk on the mainframe.
Everything's virtualized, right, so it's kind of interesting to see, and IBM uses the language of private cloud. The mainframe's, we're joking, the original private cloud. My question is, your strategy as a company has been always focused on the mainframe and going forward I presume it's going to continue to do that. What's your outlook for that platform? >> We're not exclusively the mainframe, by the way. We're not, we have a good mix. >> Okay, I'm overstating that, then. It's half and half or whatever. You don't talk about it, 'cause you're a private company. >> Maybe a little more than half is mainframe-focused. >> Dave: Significant. >> It is significant. >> You've got a large proportion of the company on mainframe, z/OS. >> So we're bullish on the mainframe. We continue to invest more every year. We invest, we increase our investment every year, and so in a software company, your investment is primarily people. We increase that by double digits every year. We have license revenue increases in the double digits every year. I don't know many other mainframe-based software companies that have that. But I think that comes back to the partnership that we have with IBM because we are more than just a technology partner. We work on strategic projects with IBM. IBM will oftentimes stand up and say Rocket is a strategic partner that works with us on solving hard customer issues every day. We're bullish, we're investing more all the time. We're not backing away, we're not decreasing our interest or our bets on the mainframe. If anything, we're increasing them at a faster rate than we have in the past 10 years.
At the Anaconda Conference last week, I was coming up with an analogy for these folks. It's just a bunch of data scientists, right, and during most of the breaks and the receptions, they were just asking questions, "Well, what is a mainframe? "I didn't know that we still had 'em, "and what do they do?" So it was fun to educate them on that. But I was trying to show them an analogy with data warehousing where, say that in the mid-'90s it was perfectly acceptable to have a separate data warehouse separate from your transaction system. You would copy all this data over into the data warehouse. That was the model, right, and then slowly it became more important that the analytics or the BI against that data warehouse was looking at more real time data. So then it became more efficiencies and how do we replicate this faster, and how do we get closer to, not looking at week-old data but day-old data? And so, I explained that to them and said the days of being able to do analytics against old data that's copied are going away. ETL, we're also bullish to say that ETL is dead. ETL's future is very bleak. There's no place for it. It had its time, but now it's done because with data virtualization you can access that data in place. I was telling these folks as they're talking about, these data scientists, as they're talking about how they look at their models, their first step is always ETL. And so I told them this story, I said ETL is dead, and they just look at me kind of strange. >> Dave: Now the first step is load. >> Yes, there you go, right, load it in there. But having access from these platforms directly to that data, you don't have to worry about any type of a delay. >> What you described, though, is still common architecture where you've got, let's say, a Z mainframe, it's got an InfiniBand pipe to some exit data warehouse or something like that, and so, IBM's vision was, okay, we can collapse that, we can simplify that, consolidate it. 
SAP with HANA has a similar vision, we can do that. I'm sure Oracle's got their vision. What gives you confidence in IBM's approach and legs going forward? >> Probably due to the advances that we see in z/OS itself where handling mixed workloads, which it's just been doing for many of the 50 years that it's been around, being able to prioritize different workloads, not only just at the CPU dispatching, but also at the memory usage, also at the IO, all the way down through the channel to the actual device. You don't see other operating systems that have that level of granularity for managing mixed workloads. >> In the security component, that's what to me is unique about this so-called private cloud, and I say, I was using that software mainframe example from VMWare in the past, and it got a good portion of the way there, but it couldn't get that last mile, which is, any workload, any application with the performance and security that you would expect. It's just never quite got there. I don't know if the pendulum is swinging, I don't know if that's the accurate way to say it, but it's certainly stabilized, wouldn't you say? >> There's certainly new eyes being opened every day to saying, wait a minute, I could do something different here. Muscle memory doesn't have to guide me in doing business the way I have been doing it before, and that's this muscle memory I'm talking about of this ETL piece. >> Right, well, and a large number of workloads in mainframe are running Linux, right, you got Anaconda, Spark, all these modern tools. The question you asked about developers was right on. If it's independent or transparent to developers, then who cares, that's the key. That's the key lever this day and age is the developer community. You know it well. >> That's right. Give 'em what they want. They're the customers, they're the infrastructure that's being built. 
>> Bryan, we'll give you the last word, bumper sticker on the event, Rocket Software, your partnership, whatever you choose. >> We're excited to be here, it's an exciting day to talk about machine learning on z/OS. I say we're bullish on the mainframe, we are, we're especially bullish on z/OS, and that's what this event today is all about. That's where the data is, that's where we need the analytics running, that's where we need the machine learning running, that's where we need to get the developers to access the data live. >> Excellent, Bryan, thanks very much for coming to theCUBE. >> Bryan: Thank you. >> And keep it right there, everybody. We'll be back with our next guest. This is theCUBE, we're live from New York City. Be right back. (electronic keyboard music)

Published Date : Feb 15 2017

Steven Astorino, IBM - IBM Machine Learning Launch - #IBMML - #theCUBE


 

>> Announcer: Live from New York, it's the CUBE. Covering the IBM Machine Learning Launch Event. Brought to you by IBM. Now here are your hosts Dave Vellante and Stu Miniman. >> Welcome back to New York City, everybody, this is The CUBE, the leader in live tech coverage. We're here at the IBM Machine Learning Launch Event, bringing machine learning to the Z platform. Steve Astorino is here, he's the VP for Development for the IBM Private Cloud Analytics Platform. Steve, good to see you, thanks for coming on. >> Hi how are you? >> Good thanks, how you doing? >> Good, good. >> Down from Toronto. So this is your baby. >> It is >> This product right? >> It is. So you developed this thing in the labs and now you point it at platforms. So talk about, sort of, what's new here today specifically. >> So today we're launching and announcing our machine learning, our IBM machine learning product. It's really a new solution that allows, obviously, machine learning to be automated and for data scientists and line of business, business analysts to work together and create models to be able to apply machine learning, do predictions and build new business models in the end. To provide better services for their customers. >> So how is it different than what we knew as Watson machine learning? Is it the same product pointed at Z or is it different? >> It's a great question. So Watson is our cloud solution, it's our cloud brand, so we're building something on private cloud for the private cloud customers and enterprises. Same product built for private cloud as opposed to public cloud. Think of it more as a branding and Watson is sort of a bigger solution set in the cloud. >> So it's your product, your baby, what's so great about it? How does it compare with what else is in the marketplace? Why should we get excited about this product? >> Actually, a bunch of things.
It's great from many angles, what we're trying to do, obviously it's based on open source, it's an open platform just like what we've been talking about with the other products that we've been launching over the last six months to a year. It's based on Spark, you know we're bringing in all the open source technology, to your fingertips. As well as we're integrating with IBM's top-notch research and capabilities that we're driving in-house, integrating them together and being able to provide one experience to be able to do machine learning. That's at a very high level, also if you think about it there's three things that we're calling out, there's freedom, basically being able to choose what tools you want to use, what environments you want to use, what language you want to use, whether it's Python, Scala, R, right? There's productivity. So we really enable and make it simple to be productive and build these machine learning models and then an application developer can leverage and use within their application. The other one is trust. IBM is very well known for its enterprise level capabilities, whether it's governance, whether it's trust of the data, how to manage the data, but also more importantly, we're creating something called The Feedback Loop which allows the models to stay current and the data scientists, the administrators, know when these models, for example, are degrading. To make sure it's giving you the right outcome. >> OK, so you mention it's built on Spark. When I think about the efforts to build a data pipeline I think I've got to ingest the data, I've got to explore, I've got to process it and clean it up and then I've got to ultimately serve whomever, the business. >> Right, Right. >> What pieces of that does Spark unify and simplify? >> So we leverage Spark to be able to do the analytics, obviously. When you're building a model you have, one, your choice of tooling that you want to use, whether it's programmatic or not.
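The Feedback Loop just described, a scheduled evaluation that flags a degrading model for retraining, might be sketched like this. The threshold, the stand-in model, and the sample data are all invented for illustration and say nothing about the product's internals.

```python
# Hedged sketch of a model feedback loop: re-score the deployed model on
# fresh labeled outcomes and flag it for retraining when accuracy drops
# below a floor. Threshold, model, and data are illustrative only.
ACCURACY_FLOOR = 0.8

def evaluate(model, recent):
    """Fraction of recent labeled outcomes the model predicts correctly."""
    return sum(model(x) == y for x, y in recent) / len(recent)

def needs_retraining(model, recent, floor=ACCURACY_FLOOR):
    return evaluate(model, recent) < floor

def deployed(balance):
    # Stand-in for the deployed model: predict churn (1) on low balances.
    return 1 if balance < 500 else 0

# Fresh outcomes have drifted: some high-balance customers now churn too.
recent = [(50, 1), (3000, 1), (20, 1), (900, 1), (10, 1)]
print(needs_retraining(deployed, recent))  # True (accuracy 0.6 < 0.8)
```

In the product this evaluation would be scheduled automatically; the sketch only shows the degradation test itself.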
That's one of the value propositions we're bringing forward. But then we create these models, we train them, we evaluate them, we leverage Spark for that. Then obviously, we're trying to bring the models where the data is. So one of the key value propositions is we operationalize these models very simply and quickly. Just at a click of a button you can say hey, deploy this model now, and we deploy it right where the data is; in this case we're launching it on mainframe first. So Spark on the mainframe, we're deploying the model there and you can score the model directly in Spark on the mainframe. That's a huge value add, you get better performance. >> Right, okay, just in terms of differentiation from the competition, you're the only company I think, providing machine learning on Z, so. >> Definitely, definitely. >> That's pretty easy, but in terms of the capabilities that you have, how are you different from the competition? When you talk to clients and they say well what about this vendor or that vendor, how do you respond? >> So let me talk about one of the research technologies that we're launching as part of this called CADS, Cognitive Assistant for Data Scientists. This is a feature where essentially, it takes the complexity out of building a model where you tell it, or you give it the algorithms you want to work with and the CADS assistant basically returns which one is the best, which one performs the best. Now, all of a sudden you have the best model to use without having to go and spend, potentially weeks, on figuring out which one that is. So that's a huge value proposition. >> So automating the choice of the algorithm, an algorithm to choose the algorithm. What have you found in terms of its level of accuracy in terms of the best fit? >> Actually it works really well.
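The idea behind CADS, hand over several candidate algorithms and get back the best performer on held-out data, can be sketched in a few lines. The candidate "algorithms" here are deliberately trivial rules and bear no resemblance to CADS internals; the data is invented for illustration.

```python
# Toy sketch of automated algorithm selection: score each candidate on a
# holdout set and return the best. Candidates and data are illustrative only.
holdout = [({"balance": 50.0}, 1), ({"balance": 3000.0}, 0),
           ({"balance": 20.0}, 1), ({"balance": 900.0}, 0),
           ({"balance": 10.0}, 1), ({"balance": 4000.0}, 0)]

def always_churn(features):      # naive baseline: predict churn for everyone
    return 1

def low_balance_rule(features):  # predict churn when balance is low
    return 1 if features["balance"] < 500.0 else 0

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

candidates = {"always_churn": always_churn, "low_balance_rule": low_balance_rule}
scores = {name: accuracy(m, holdout) for name, m in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores[best])  # low_balance_rule 1.0
```

The value proposition is the loop itself: the data scientist supplies candidates and gets back the best fit without hand-evaluating each one.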
And in fact we have a live demo that we'll be doing today, where it shows CADS coming back with a 90% accurate model in terms of the data that we're feeding it and the outcome it will give you in terms of what model to use. It works really well. >> Choosing an algorithm is not like choosing a programming language, right, there's bias if I like Scala or R or whatever, Java, Python, okay fine, I've got skill sets associated with that. Algorithm choice is one that's more scientific, I guess? >> It is more scientific, it's based on the algorithm, the statistical algorithm and the selection of the algorithm or the model itself is a huge deal because that's where you're going to drive your business. If you're offering a new service that's where you're providing that solution from, so it has to be the right algorithm, the right model, so that you can build that more efficiently. >> What are you seeing as the big barriers to customers adopting machine learning? >> I think everybody, I mean it's the hottest thing around right now, everybody wants machine learning, it's great, it's a huge buzz. The hardest thing is they know they want it, but don't really know how to apply it into their own environment, or they think they don't have the right skills. So, that's actually one of the things that we're going after, to be able to enable them to do that. We're for example working on building different industry-based examples to showcase here's how you would use it in your environment. So last year when we did the Watson data platform we did a retail example, now today we're doing a finance example, a churn example with customers potentially churning and leaving a bank. So we're looking at all those different scenarios, and then also we're creating hubs, locations we're launching today also, announcing today, actually Dinesh will be doing that.
There is a hub in Silicon Valley which allows customers to come in and work with us, and we help them figure out how they can leverage machine learning. It is a great way to interact with our customers and be able to do that. >> So Steve, nirvana is, and you gave that example, the retail example in September, when you launched Watson Data Platform, the nirvana in this world is you can use data, and maybe put in an offer, or save a patient's life or effect an outcome in real time. So the retail example was just that. If I recall, you were making an offer real-time, it was very fast, live demo, it wasn't just a fakey. The example on churn, is the outcome is to affect that customer's decisions so that they don't leave? Is that? >> Yes, pretty much. Essentially what we are looking at is, we're using live data, we're using social media data, bringing in Twitter sentiment about a particular individual, for example, and trying to predict if this customer, if this user is happy with the service that they are getting or not. So for example, people will go and socialize, oh I went to this bank and I hated this experience, or they really got me upset or whatever. Bringing that data from Twitter, so open data, and merging it with the bank's data, banks have a lot of data they can leverage and monetize. And then making an assessment using machine learning to predict is this customer going to leave me or not? What probability do they have that they are going to leave me or not based on the machine learning model. The example or scenario we are using now, if we think they are going to leave us, we're going to make special offers to them. It's a way to enhance your service for those customers. So that they don't leave you. >> So operationalizing that would be a call center that has some kind of dashboard that says red, green, yellow, boom, here's an offer that you should make, and that's done in near real time. In fact, real time is before you lose the customer.
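A hedged sketch of the churn-scoring flow just described: a logistic-style score combining a bank feature with a social-sentiment feature, with an offer triggered above a threshold. The weights, feature names, and threshold below are invented for illustration; a real model would be trained, not hand-coded.

```python
import math

# Illustrative in-transaction churn scoring: combine a sentiment signal with
# a bank signal in a logistic-style score. All weights and names are made up.
WEIGHTS = {"bias": -1.0, "neg_sentiment": 2.0, "low_balance": 1.5}

def churn_probability(neg_sentiment, low_balance):
    z = (WEIGHTS["bias"]
         + WEIGHTS["neg_sentiment"] * neg_sentiment
         + WEIGHTS["low_balance"] * low_balance)
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing to [0, 1]

def score_transaction(txn):
    p = churn_probability(txn["neg_sentiment"], txn["low_balance"])
    return "make_offer" if p > 0.5 else "no_action"

happy = {"neg_sentiment": 0.0, "low_balance": 0}
angry = {"neg_sentiment": 1.0, "low_balance": 1}
print(score_transaction(happy))  # no_action
print(score_transaction(angry))  # make_offer
```

The "make_offer" branch is where the call-center dashboard or special offer would be driven, scored as each transaction arrives.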
That's as good a definition as anything else. >> But it's actually real-time, and when we call it the scoring of the data, so as the data transaction is coming in, you can actually make that assessment in real time, it's called in-transaction scoring where you can make that right on the fly and be able to determine is this customer at risk or not. And then be able to make smarter decisions to that service you are providing on whether you want to offer something better. >> So is the primary use case for this those streams, those areas I'm getting, you know, whether it be, you mentioned Twitter data, maybe IoT, you're getting, can we point machine learning at just archives of data and things written historically or is it mostly the streams? >> It's both of course, and machine learning is based on historical data, right, and that's how the models are built. The more accurate or more data you have on historical data, the more accurate that you picked the right model and you'll get the better prediction of what's going to happen next time. So it's exactly, it's both. >> How are you helping customers with that initial fit? My understanding is how big of a data set do you need, do I have enough to really model what I have, how do you help customers work through that?
>> So you've worked on both the Watson Services in the public cloud and now this private cloud, is there any differentiation or do you see significant use case different between those two or is it just kind of where the data lives and we're going to do similar activities there. >> So it is similar. At the end of the day, we're trying to provide similar products on both public cloud and private cloud. But for this specific case, we're launching it on mainframe that's a different angle at this. But we know that's where the biggest banks, the insurance companies, the biggest retailers in the world are, and that's where the biggest transactions are running and we really want to help them leverage machine learning and get their services to the next level. I think it's going to be a huge differentiator for them. >> Steve, you gave an example before of Twitter sentiment data. How would that fit in to this announcement. So I've got this ML on Z and I what API into the twitter data? How does that sort of all get adjusted and consolidated? >> So we allow hooks to be able to access data from different sources, bring in data. That is part of the ingest process. Then once you have that data there into data frames into the machine learning product, now you're feeding into a statistical algorithm to figure out what the best prediction is going to be, and the best model's going to be. >> I have a slide that you guys are sharing on the data scientist workflow. It starts with ingestion, selection, preparation, generation, transform, model. It's a complex set of tasks, and typically historically, at least in the last fIve or six years, different tools to de each of those. And not just different tools, multiples of different tools. That you had to cobble together. If I understand it correctly the Watson Data Platform was designed to really consolidate that and simplify that, provide collaboration tools for different personas, so my question is this. 
Because you were involved in that product as well. And I was excited about it when I saw it, I talked to people about it, sometimes I hear the criticism of, well, IBM just took a bunch of legacy products, threw them together, threw an abstraction layer on top and is now going to wrap a bunch of services around it. Is that true? >> Absolutely not. Actually, you may have heard a while back IBM had made a big shift into a design-first methodology. So we started with the Watson Data Platform, the Data Science Experience, they started with a design-first approach. We looked at this, we said what do we want the experience to be, for which persona do we want to target. Then we understood what we wanted the experience to be and then we leveraged the IBM analytics portfolio to be able to feed in and provide and integrate those services together to fit into that experience. So, it's not a dumping ground for, I'll take this product, it's part of Watson Data Platform, not at all the case. It was the design first, and then integrate for that experience. >> OK, but there are some so-called legacy products in there, but you're saying you picked the ones that were relevant and then was there additional design done? >> There was a lot of work involved to take them from a traditional product, to be able to componentize, create a microservice architecture, I mean the whole works, to be able to redesign it and fit into this new experience. >> So microservices architecture, runs on cloud, I think it only runs on cloud today right? >> Correct, correct. >> OK, maybe roadmap without getting too specific. What should we be paying attention to in the future? >> Right now we're doing our first release. Definitely we want to target any platform behind the firewall. So we don't have specific dates, but now we started with machine learning on a mainframe and we want to be able to target the other platforms behind the firewall and the private cloud environment.
Definitely we should be looking at that. Our goal is, I talked about the feedback loop a little bit, so that is, essentially, once you deploy the model we actually look at that model; you can schedule an evaluation, automatically, within the machine learning product, to be able to say this model is still good enough. And if it's not, we automatically flag it, and we look at the retraining process and redeployment process to make sure you always have the most up-to-date model. So this is truly machine learning, where it requires very little to no intervention from a human. We're going to continue down that path and continue that automation in providing those capabilities, so there's a bigger roadmap; there's a lot of things we're looking at. >> Our big data analyst George Gilbert has talked about how you had batch and you had interactive, and now the sort of emergent workload is this continuous, streaming data. How do you see the adoption? First of all, is it a valid assertion that there is a new class of workload, and then how do you see that adoption occurring? Is it going to be a dominant force over the next 10 years? >> Yeah, I think so. Like I said, there is a huge buzz around machine learning in general and artificial intelligence, deep learning, all of these terms you hear about. I think as users and customers get more comfortable with understanding how they're going to leverage this in their enterprise, this real-time streaming of data and being able to do analytics on the fly and machine learning on the fly is a big deal, and it will really help them be more competitive in their own space with the services we're providing. >> OK Steve, thanks very much for coming on The CUBE. We'll give you the last word. The event, very intimate event, a lot of customers coming in very shortly here in just a couple of hours. Give us the bumper sticker.
>> All of that's very exciting. We're very excited; this is a big deal for us. Whenever IBM does a signature moment it's a big deal for us, and we've got something cool to talk about, so we're very excited about that. Lots of clients are coming, and there's an entire session this afternoon, which will be live streamed as well. So it's great; I think we have a differentiating product, and we're already getting that feedback from our customers. >> Well congratulations, I love the cadence that you're on. We saw some announcements in September, we're here in February, and I expect we're going to see more innovation coming out of your labs in Toronto and across IBM, so thank you very much for coming on The CUBE. >> Thank you. >> You're welcome. OK, keep it right there everybody, we'll be back with our next guest right after this short break. This is The CUBE; we're live from New York City. (energetic music)
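The automated evaluate-and-retrain feedback loop Steve describes (schedule an evaluation of a deployed model, flag it if it is no longer good enough, retrain and redeploy) can be sketched in a few lines. This is a toy illustration with assumed names and thresholds, not the actual IBM product logic:

```python
ACCURACY_FLOOR = 0.80  # hypothetical "still good enough" threshold

def evaluate(model, labeled_batch):
    """Fraction of recent labeled records the model predicts correctly."""
    correct = sum(1 for x, y in labeled_batch if model(x) == y)
    return correct / len(labeled_batch)

def retrain(labeled_batch):
    """Toy 'retraining': learn the majority label from the fresh data."""
    labels = [y for _, y in labeled_batch]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority

def feedback_cycle(model, labeled_batch):
    """One scheduled evaluation: keep the model if it still meets the bar,
    otherwise flag it, retrain on the fresh data, and redeploy."""
    if evaluate(model, labeled_batch) >= ACCURACY_FLOOR:
        return model, False               # still good enough
    return retrain(labeled_batch), True   # degraded: retrained replacement

# A stale model that always predicts 0, evaluated on data that is now mostly 1s.
stale = lambda x: 0
batch = [(i, 1) for i in range(9)] + [(9, 0)]
model, retrained = feedback_cycle(stale, batch)
```

In this run the stale model scores 0.1, falls below the floor, and is replaced; the retrained model then scores 0.9 on the same batch. A real deployment would evaluate on held-out labeled data and retrain with an actual learning algorithm, but the control flow is the same.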

Published Date : Feb 15 2017



James Kobielus, IBM - IBM Machine Learning Launch - #IBMML - #theCUBE


 

>> [Announcer] Live from New York, it's the Cube. Covering the IBM Machine Learning Launch Event. Brought to you by IBM. Now here are your hosts Dave Vellante and Stu Miniman. >> Welcome back to New York City everybody, this is the CUBE. We're here live at the IBM Machine Learning Launch Event. Bringing analytics and transactions together on Z, extending an announcement that IBM made a couple years ago, sort of laid out that vision, and now bringing machine learning to the mainframe platform. We're here with Jim Kobielus. Jim is the Director of IBM's Community Engagement for Data Science and a longtime CUBE alum and friend. Great to see you again James. >> Great to always be back here with you. Wonderful folks from the CUBE. You ask really great questions and >> Well thank you. >> I'm prepared to answer. >> So we saw you last week at Spark Summit, so back to back, you know, continuous streaming, machine learning, give us the lay of the land from your perspective of machine learning. >> Yeah well machine learning very much is at the heart of what modern application developers build, and that's really the core secret sauce in many of the most disruptive applications. So machine learning has become the core of, of course, what data scientists do day in and day out, or what they're asked to do, which is to build, essentially, artificial neural networks that can process big data and find patterns that couldn't normally be found using other approaches. And then as Dinesh and Rob indicated, a lot of it's for regression analysis and classification and the other core things that data scientists have been doing for a long time, but machine learning has come into its own because of the potential for great automation of this function of finding patterns and correlations within data sets. So today at the IBM Machine Learning Launch Event, and we've already announced it, IBM Machine Learning for z/OS takes that automation promise to the next step.
And so we're real excited and there'll be more details today in the main event. >> One of the most funs I had, most fun I had last year, most fun interviews I had last year was with you, when we interviewed, I think it was 10 data scientists, rock star data scientists, and Dinesh had a quote, he said, "Machine learning is 20% fun, 80% elbow grease." And data scientists sort of echoed that last year. We spent 80% of our time wrangling data. >> [Jim] Yeah. >> It gets kind of tedious. You guys have made announcements to address that, is the needle moving? >> To some degree the needle's moving. Greater automation of data sourcing and preparation and cleansing is ongoing. Machine learning is being used for that function as well. But nonetheless there is still a lot of need in the data science, sort of, pipeline for a lot of manual effort. So if you look at the core of what machine learning is all about, it's supervised learning involves humans, meaning data scientists, to train their algorithms with data and so that involves finding the right data and then of course doing the feature engineering which is a very human and creative process. And then to be training the data and iterating through models to improve the fit of the machine learning algorithms to the data. In many ways there's still a lot of manual functions that need expertise of data scientists to do it right. There's a lot of ways to do machine learning wrong you know there's a lot of, as it were, tricks of the trade you have to learn just through trial and error. A lot of things like the new generation of things like generative adversarial models ride on machine learning or deep learning in this case, a multilayered, and they're not easy to get going and get working effectively the first time around. 
I mean with the first run of your training data set, so that's just an example of how, the fact is, there's a lot of functions that can't be fully automated yet in the whole machine learning process, but a great many can in fact, especially data preparation and transformation. It's being automated to a great degree, so that data scientists can focus on the more creative work that involves subject matter expertise, and really also application development and working with larger teams of coders and subject matter experts and others, to be able to take the machine learning algorithms that have been proved out, have been trained, and to apply them to all manner of applications to deliver some disruptive business value. >> James, can you expand for us a little bit on this democratization? Before, it was not just data but now the machine learning, the analytics, you know, when we put these massive capabilities in the broader hands of the business analysts, the business people themselves, what are you seeing your customers, what can they do now that they couldn't do before? Why is this such an exciting period of time for the leveraging of data analytics? >> I don't know that it's really an issue of now versus before. Machine learning has been around for a number of years. It's artificial neural networks at the very heart, and that got going actually in many ways in the late 50s, and it steadily improved in terms of sophistication and so forth. But what's going on now is that machine learning tools have become commercialized and refined to a greater degree, and now they're in a form in the cloud, like with IBM Machine Learning for the private cloud on z/OS, or Watson Machine Learning for the Bluemix public cloud. They're at a level of consumability that they've never been at before. With a software as a service offering you just, you pay for it, it's available to you. If you're a data scientist you begin doing work right away to build applications, derive quick value.
So in other words, the time to value on a machine learning project continues to shorten and shorten, due to the consumability, the packaging of these capabilities into cloud offerings and into other tools that are prebuilt to deliver success. That's what's fundamentally different now, and it's just an ongoing process. You sort of see the recent parallels with the business intelligence market. 10 years ago BI, which was reporting and OLAP and so forth, was only for the, what we now call data scientists or the technical experts in that area. But in the last 10 years we've seen the business intelligence community and the industry, including IBM's tools, move toward more self service, interactive visualization, visual design, BI and predictive analytics, you know, through our Cognos and SPSS portfolios. A similar dynamic is coming into the progress of machine learning, the democratization, to use your term, the more self service model wherein everybody potentially will be able to do machine learning, to build machine learning and deep learning models, without a whole lot of university training. That day is coming and it's coming fairly rapidly. It's just a matter of the maturation of this technology in the marketplace. >> So I want to ask you, you're right, in the 1950s it was artificial neural networks or AI, sort of was invented I guess, the concept, and then in the late 70s and early 80s it was heavily hyped. It kind of died in the late 80s or in the 90s; you never heard about it even in the early 2000s. Why now, why is it here now? Is it because IBM's putting so much muscle behind it? Is it because we have Siri? What is it that has enabled that? >> Well I wish that IBM putting muscle behind a technology could launch anything to success. And we've done a lot of things in that regard.
But the thing is, if you look back at the historical progress of AI, I mean, it's older than me and you in terms of when it got going, in the middle 50s, as a passion or a focus of computer scientists. What we had for most of the last half century is AI or expert systems that were built on what was essentially programming, right, declarative rules defining how AI systems could process data under various scenarios. That didn't prove scalable. It didn't prove agile enough to learn on the fly from the statistical patterns within the data that you're trying to process. For face recognition and voice recognition, pattern recognition, you need statistical analysis, you need something along the lines of an artificial neural network that doesn't have to be pre-programmed. That's what's new now, at the turn of this century: AI has become predominantly focused not so much on declarative rules, the expert systems of old, but on statistical analysis, artificial neural networks that learn from the data. See, in the long historical sweep of computing, we have three eras of computing. The first era, before the second world war, was all electromechanical computing devices. IBM's start, of course, like everybody's, was in that era. The business logic was burned into the hardware, as it were. The second era, from the second world war really to the present day, is all about software, programming; it's COBOL, Fortran, C, Java, where the business logic has to be developed, coded, by a cadre of programmers. Since the turn of this millennium, and really since the turn of this decade, it's all moved towards the third era, which is the cognitive era, where you're learning the business rules automatically from the data itself, and that involves machine learning at its very heart.
So most of what has been commercialized, and most of what is being deployed in the real world as working, successful AI, is all built on artificial neural networks and cognitive computing in the way that I laid out. Where, you still need human beings in the equation; it can't be completely automated. There's things like unsupervised learning that take the automation of machine learning to a greater extent, but still, the bulk of machine learning is supervised learning, where you have training data sets and you need experts, data scientists, to manage that whole process. Over time supervised learning is evolving in terms of who's going to label the training data sets, especially when you have so much data flooding in from the internet of things and social media and so forth. A lot of that is being outsourced to crowdsourcing environments in terms of the ongoing labeling of data for machine learning projects of all sorts. That trend will continue apace. So less and less of the actual labeling of the data for machine learning will need to be manually coded by data scientists or data engineers. >> So the more data the better. See, I would argue, in the enablement pie. You're going to disagree with that, which is good. Let's have a discussion [Jim Laughs]. In the enablement pie, I would say the profundity of Hadoop was two things. One is I can leave data where it is and bring code to data. >> [Jim] Yeah. >> 5 megabytes of code to a petabyte of data. But the second was the dramatic reduction in the cost to store more data, hence my statement of the more data the better, but you're saying, meh, maybe not. Certainly for compliance and other things you might not want to have data lying around. >> Well it's an open issue. How much data do you actually need to find the patterns of interest to you, the correlations of interest to you? A sampling of your data set, a 10% sample or whatever, in most cases might be sufficient to find the correlations you're looking for.
But if you're looking for some deeply hidden, rare nuances, in terms of anomalies or outliers or whatever within your data set, you may only find those if you have a petabyte of data on the population of interest. But if you're just looking for broad historical trends, and to do predictions against broad trends, you may not need anywhere near that amount. I mean, if it's a large data set, you may only need a five to 10% sample. >> So I love this conversation because people have been on the CUBE, Abi Metter for example, who said, "Dave, sampling is dead." Now a statistician said that's BS, no way. Of course it's not dead. >> Storage isn't free, first of all, so you can't necessarily save and process all the data. Compute power isn't free yet, memory isn't free yet, and so forth, so there's lots... >> You're working on that though. >> Yeah sure, it's asymptotically all moving towards zero. But the bottom line is, the underlying resources, including the expertise of your data scientists, are not for free; these are human beings who need to make a living. So you've got to do a lot of things. A, automate functions on the data science side so that these experts can radically improve their productivity. Which is why the announcement today of IBM Machine Learning is so important: it enables greater automation in the creation and the training and deployment of machine learning models. It is, as Rob Thomas indicated, very much a multiplier of the productivity of your data science teams, the capability we offer. So that's the core value. Because our customers live and die increasingly by machine learning models. And the data science teams themselves are highly inelastic, in the sense that you can't find highly skilled people that easily at an affordable price if you're a business. And you've got to make the most of the team that you have and help them develop their machine learning muscle.
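Jim's point about sampling can be made concrete: a 10% sample estimates a broad trend (here, the mean) quite well, while rare anomalies are likely to be missed entirely. A small sketch on synthetic data; the sizes and values are illustrative assumptions, not from the interview:

```python
import random

random.seed(42)  # fixed seed so the run is repeatable

# Population: 100,000 "normal" readings plus 5 rare, extreme anomalies.
population = [random.gauss(100.0, 10.0) for _ in range(100_000)]
for i in random.sample(range(len(population)), 5):
    population[i] = 10_000.0  # extreme outlier

sample = random.sample(population, 10_000)  # a 10% sample

def mean(xs):
    return sum(xs) / len(xs)

# Broad trend: the sample mean tracks the population mean closely.
# Rare events: each anomaly has only a 10% chance of landing in the
# sample, so most (often all) of the 5 anomalies are missed.
anomalies_in_sample = sum(1 for x in sample if x == 10_000.0)
```

The sample mean lands within a few units of the population mean, so a trend analysis would be fine on the sample; an anomaly hunt, as Jim says, may need the full petabyte.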
>> Okay, I want to ask you to weigh in on one of Stu's favorite topics, which is man versus machine. >> Humans versus mechanisms. Actually humans versus bots, let's, okay go ahead. >> Okay so, you know, a lot of discussions about, machines have always replaced humans for jobs, but for the first time it's really beginning to replace cognitive functions. >> [Jim] Yeah. >> What does that mean for jobs, for skill sets? The greatest, I love the comment, the greatest chess player in the world is not a machine. It's humans and machines. But what do you see in terms of the skill set shift when you talk to your data science colleagues in these communities that you're building? Is that the right way to think about it, that it's the creativity of humans and machines that will drive innovation going forward? >> I think it's symbiotic. If you take Watson, of course, that's a star case of a cognitive, AI-driven machine in the cloud. We use Watson all the time, of course, in IBM. I use it all the time in my job, for example. Just to give an example of one knowledge worker and how he happens to use AI and machine learning: Watson is an awesome search engine, through multi-structured data types and in real time, enabling you to ask a sequence of very detailed questions, and Watson is a relevance ranking engine, all that stuff. What I've found is it's helped me as a knowledge worker to be far more efficient in doing my upfront research for anything that I might be working on. You see, I write blogs and I speak and I put together slide decks that I present and so forth. So if you look at knowledge workers in general, AI, as driving far more powerful search capabilities in the cloud, helps us to eliminate a lot of the grunt work that normally was attendant upon doing deep research into, like, a knowledge corpus that may be preexisting.
And that way we can then ask more questions and more intelligent questions and really work through our quest for answers far more rapidly and entertain and rule out more options when we're trying to develop a strategy. Because we have all the data at our fingertips and we've got this expert resource increasingly in a conversational back and forth that's working on our behalf predictively to find what we need. So if you look at that, everybody who's a knowledge worker which is really the bulk now of the economy, can be far more productive cause you have this high performance virtual assistant in the cloud. I don't know that it's really going, AI or deep learning or machine learning, is really going to eliminate a lot of those jobs. It'll just make us far smarter and more efficient doing what we do. That's, I don't want to belittle, I don't want to minimize the potential for some structural dislocation in some fields. >> Well it's interesting because as an example, you're like the, you're already productive, now you become this hyper-productive individual, but you're also very creative and can pick and choose different toolings and so I think people like you it's huge opportunities. If you're a person who used to put up billboards maybe it's time for retraining. >> Yeah well maybe you know a lot of the people like the research assistants and so forth who would support someone like me and most knowledge worker organizations, maybe those people might be displaced cause we would have less need for them. In the same way that one of my very first jobs out of college before I got into my career, I was a file clerk in a court in Detroit, it's like you know, a totally manual job, and there was no automation or anything. You know that most of those functions, I haven't revisited that court in recent years, I'm sure are automated because you have this thing called computers, especially PCs and LANs and so forth that came along since then. 
So a fair amount of those kinds of feather bedding jobs have gone away and in any number of bureaucracies due to automation and machine learning is all about automation. So who knows where we'll all end up. >> Alright well we got to go but I wanted to ask you about... >> [Jim] I love unions by the way. >> And you got to meet a lot of lawyers I'm sure. >> Okay cool. >> So I got to ask you about your community of data scientists that you're building. You've been early on in that. It's been a persona that you've really tried to cultivate and collaborate with. So give us an update there. What's your, what's the latest, what's your effort like these days? >> Yeah, well, what we're doing is, I'm on a team now that's managing and bringing together all of our program for community engagement programs for really for across portfolio not just data scientists. That involves meet ups and hack-a-thons and developer days and user groups and so forth. These are really important professional forums for our customers, our developers, our partners, to get together and share their expertise and provide guidance to each other. And these are very very important for these people to become very good at, to help them, get better at what they do, help them stay up to speed on the latest technologies. Like deep learning, machine learning and so forth. 
So we take it very seriously at IBM that communities are really where customers can realize value and grow their human capital ongoing so we're making significant investments in growing those efforts and bringing them together in a unified way and making it easier for like developers and IT administrators to find the right forums, the right events, the right content, within IBM channels and so forth, to help them do their jobs effectively and machine learning is at the heart, not just of data science, but other professions within the IT and business analytics universe, relying more heavily now on machine learning and understanding the tools of the trade to be effective in their jobs. So we're bringing, we're educating our communities on machine learning, why it's so critically important to the future of IT. >> Well your content machine is great content so congratulations on not only kicking that off but continuing it. Thanks Jim for coming on the CUBE. It's good to see you. >> Thanks for having me. >> You're welcome. Alright keep it right there everybody, we'll be back with our next guest. The CUBE, we're live from the Waldorf-Astoria in New York City at the IBM Machine Learning Launch Event right back. (techno music)
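The supervised learning cycle Jim described earlier (train against labeled data, then iterate through models to improve the fit of the algorithm to the data) reduces to a very small sketch. This toy example fits y ≈ w·x by gradient descent; it is purely illustrative and stands in for any real training algorithm:

```python
# Labeled training examples generated by the "true" rule y = 3 * x.
data = [(float(x), 3.0 * x) for x in range(1, 11)]

def loss(w):
    """Mean squared error: how badly the current model fits the data."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def train(w=0.0, lr=0.004, epochs=50):
    """Iterate: nudge w against the gradient of the loss, tracking the fit."""
    history = []
    for _ in range(epochs):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
        history.append(loss(w))
    return w, history

w, history = train()
```

After 50 iterations the learned weight sits essentially at 3.0 and the loss has fallen by many orders of magnitude; the "elbow grease" Dinesh mentions is everything around this loop, not the loop itself.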

Published Date : Feb 15 2017



Dinesh Nirmal, IBM - IBM Machine Learning Launch - #IBMML - #theCUBE


 

>> [Announcer] Live from New York, it's theCube, covering the IBM Machine Learning Launch Event brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. >> Welcome back to the Waldorf Astoria, everybody. This is theCube, the worldwide leader in live tech coverage. We're covering the IBM Machine Learning announcement. IBM bringing machine learning to its Z mainframe, its private cloud. Dinesh Nirmal is here. He's the Vice President of Analytics at IBM and a Cube alum. Dinesh, good to see you again. >> Good to see you, Dave. >> So let's talk about ML. So we went through the big data, the data lake, the data swamp, all this stuff with Hadoop. And now we're talking about machine learning and deep learning and AI and cognitive. Is it same wine, new bottle? Or is it an evolution of data and analytics? >> Good. So, Dave, let's talk about machine learning. Right. When I look at machine learning, there's three pillars. The first one is the product. I mean, you got to have a product, right. And you got to have a differentiated set of functions and features available for customers to build models. For example, Canvas. I mean, those are table stakes. You got to have a set of algorithms available. So that's the product piece. >> [Dave] Uh huh. >> But then there's the process, the process of taking that model that you built in a notebook and being able to operationalize it. Meaning able to deploy it. That is, you know, I was talking to one of the customers today, and he was saying, "Machine learning is 20% fun and 80% elbow grease." Because that operationalizing of that model is not easy. Although they make it sound very simple, it's not. So if you take a banking, an enterprise banking example, right? You build a model in the notebook. Some data scientist builds it. Now you have to take that and put it into your infrastructure or production environment, which has been there for decades. So you could have third party software that you cannot change.
You could have a set of rigid rules that already is there. You could have applications that were written in the 70's and 80's that nobody wants to touch. How do you all of a sudden take the model and infuse it in there? It's not easy. And so that is a tremendous amount of work. >> [Dave] Okay. >> The third pillar is the people, or the expertise, or the experience, the skills that need to come through, right. So the product is one. The process of operationalizing and getting it into your production environment is another piece. And then the people is the third one. So when I look at machine learning, right, those are three key pillars that you need to have to have a successful, you know, experience of machine learning. >> Okay, let's unpack that a little bit. Let's start with the differentiation. You mentioned Canvas, but talk about IBM specifically. >> [Dinesh] Right. What's so great about IBM? What's the differentiation? >> Right, exactly. Really good point. So we have been in the predictive side for a very long time, right. I mean, it's not like we are coming into ML or AI or cognitive yesterday. We have been in that space for a very long time. We have SPSS predictive analytics available. So even if you look from all three pillars, what we are doing is, from a product perspective, we are bringing in the product where we are giving a choice or a flexibility to use the language you want. So there are customers who only want to use R. They are religious R users. They don't want to hear about anything else. There are customers who want to use Python, you know. They don't want to use anything else. So how do we give that choice of languages to our customers, to say use any language you want. Or execution engines, right? Some folks want to use Spark as the execution engine. Some folks want to use R or Python, so we give that choice. Then you talked about Canvas.
There are folks who want to use the GUI portion of the Canvas, or a modeler, to build models, or there are, you know, techie guys who want to use the notebook. So how do you give that choice? So it becomes kind of like a freedom or a flexibility or a choice that we provide, so that's the product piece, right? We do that. Then the other piece is productivity. So one of the customers, the CTO of (mumbles) TV's going to come on stage with me during the main session, talk about how collaboration helped from an IBM machine learning perspective, because their data scientists are sitting in New York City, and our data scientists who are working with them are sitting in San Jose, California. And they were collaborating in real time using notebooks in our ML projects, where they can see, in real time, what changes their data scientists are making. They can Slack messages between each other. And that collaborative piece is what really helped us. So collaboration is one, right, from a productivity piece. We introduced something called the Feedback Loop, by which your model can get retrained. So today, you deploy a model. Its score could get degraded over time. Then you have to take it off-line and re-train, right? What we have done is, we introduced the Feedback Loop, so when you deploy your model, we give you two endpoints. The first endpoint is, basically, a URI for you to plug into your application, so when you, you know, run your application you're able to call the scoring API. The second endpoint is this feedback endpoint, where you can choose to re-train the model. If you want it every three hours, if you want it to be every six hours, you can do that. So we bring that flexibility, we bring that productivity into it. Then, the management of the models, right? How do we make sure that once you develop the model, you deploy the model. There's a life cycle involved there. How do you make sure that we enable, give you the tools to manage the model?
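The two-endpoint pattern Dinesh describes (a scoring URI the application calls at runtime, plus a feedback endpoint that collects ground truth and triggers retraining on a schedule) can be sketched as a toy in-process object. All names here are illustrative assumptions, not IBM's actual API, and a count-based trigger stands in for the timed retraining schedule:

```python
class Deployment:
    """Toy stand-in for a deployed model with a scoring endpoint and a
    feedback endpoint. (Illustrative only; not the real product API.)"""

    def __init__(self, model, retrain_every=3):
        self.model = model
        self.retrain_every = retrain_every  # stands in for "every N hours"
        self.pending = []                   # ground truth awaiting retraining

    def score(self, features):
        """The 'scoring URI': applications call this at runtime."""
        return self.model(features)

    def feedback(self, features, actual):
        """The 'feedback endpoint': collect outcomes; retrain on schedule."""
        self.pending.append((features, actual))
        if len(self.pending) >= self.retrain_every:
            self.model = self._retrain(self.pending)
            self.pending = []

    @staticmethod
    def _retrain(examples):
        # Toy retraining: predict the mean observed outcome.
        avg = sum(y for _, y in examples) / len(examples)
        return lambda features: avg

dep = Deployment(model=lambda f: 0.0, retrain_every=3)
before = dep.score({"amount": 100})  # stale model always answers 0.0
for f, y in [({"a": 1}, 4.0), ({"a": 2}, 6.0), ({"a": 3}, 8.0)]:
    dep.feedback(f, y)               # third call triggers retraining
after = dep.score({"amount": 100})   # retrained model answers 6.0
```

The point of the pattern is that the application only ever talks to the two endpoints; retraining and redeployment happen behind them without touching the calling code.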
So when you talk about differentiation, right? We are bringing differentiation on all three pillars. From a product perspective, with all the things I mentioned. From a deployment perspective. How do we make sure we have different choices of deployment, whether it's streaming, whether it's real-time, whether it's batch. You can do deployment, right? The Feedback Loop is another one. Once you've deployed, how do we keep re-training it? And the last piece I talked about is the expertise or the people, right? So we are today announcing IBM Machine Learning Hub, which will become one place where our customers can go, ask questions, get education sessions, get training, right? Work together to build models. I'll give you an example, that although we are announcing the IBM Machine Learning Hub today, we have been working with America First Credit Union for the last month or so. They approached us and said, you know, their underwriting takes a long time. All the knowledge is embedded in 15 to 20 human beings. And they want to make sure a machine should be able to absorb that knowledge and make that decision in minutes. Today it takes hours or days. >> [Dave] So, Stu, before you jump in, so I got, put the portfolio. You know, you mentioned SPSS, expertise, choice. The collaboration, which I think you really stressed at the announcement last fall. The management of the models, so you can continuously improve it.
When I think about private cloud environments, private clouds are operationally faster, but it's usually not minutes or hours. It's usually more like months to deploy projects, which is still better than, you know, kind of, I think, before big data, it was, you know, oh, okay, 18 months to see if it works, and let's bring that down to, you know, a couple of months. Can you walk us through what happens when, you know, a customer today says, "Great, I love this approach. "How long does it take?" You know, what's kind of the project life cycle of this? And how long will it take them to play around and pull some of these levers before they're, you know, getting productivity out of it? >> Right. So, really good questions, Stu. So let me back up one step. So, in private cloud, we have a new initiative called Download and Go, where our goal is to have our desktop products be able to install on your personal desktop in less than five clicks, in less than fifteen minutes. That's the goal. So the other day, you know, the team told me it's ready. That the first product is ready where you can go less than five clicks, fifteen minutes. I said the real test is I'm going to bring my son, who's five years old. Can he install it, and if he can install it, you know, we are good. And he did it. And I have a video to prove it, you know. So after the show, I will show you. And that's, when you talk about, you know, the private cloud side, or the on-premise side, it has been a long project cycle. What we want is like you should be able to take our product, install it, and get the experience in minutes. That's the goal. And when you talk about private cloud and public cloud, another differentiating factor is that now you get the strength of IBM public cloud combined with the private cloud, so you could, you know, train your model in public cloud, and score on private cloud. You have the same experience. Not many folks, not many competitors can offer that, right?
So that's another... >> [Stu] So if I get that right. If I as a customer have played around with the machine learning in Bluemix, I'm going to have a similar look, feel, API. >> Exactly the same, so what you have in Bluemix, right? I mean, so you have the Watson in Bluemix, which, you know, has deep learning, machine learning--all those capabilities. What we have done is, like, we have extracted the core capabilities of Watson on private cloud, and it's IBM Machine Learning. But the experience is the same. >> I want to talk about this notion of operationalizing analytics. And it ties, to me anyway, it ties into transformation. You mentioned going from Notebook to actually being able to embed analytics in the workflow of the business. Can you double click on that a little bit, and maybe give some examples of how that has helped companies transform? >> Right. So when I talk about operationalizing, when you look at machine learning, right? You have all the way from data, which is the most critical piece, to building or deploying the model. A lot of times, data itself is not clean. I'll give you an example, right. So >> (mumbles) >> Yeah. And when we are working with an insurance company, for example, the data that comes in. For example, if you just take gender, a lot of times the values are null. So we have to build another model to figure out if it's male or female, right? So in this case, for example, we have to say somebody has done a prostate exam. Obviously, he's a male. You know, we figured that. Or has a gynecology exam. It's a female. So we have to, you know, there's a lot of work just to get that data cleansed. So that's where I mentioned it's, you know, machine learning is 20% fun, 80% elbow grease because it's a lot of grease there that you need to make sure that you cleanse the data. Get that right. That's the shaping piece of it. Then comes building the model, right.
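The gender-imputation example above amounts to rule-based cleansing before any modeling happens. A minimal sketch, with made-up field and procedure names standing in for real claims data:

```python
# Illustrative rule-based cleansing in the spirit of Dinesh's example:
# infer a missing gender field from procedure history. The record
# layout and procedure names are hypothetical, not a real schema.

records = [
    {"id": 1, "gender": None, "procedures": ["prostate exam"]},
    {"id": 2, "gender": None, "procedures": ["gynecology exam"]},
    {"id": 3, "gender": "F",  "procedures": ["blood panel"]},
]

MALE_ONLY = {"prostate exam"}
FEMALE_ONLY = {"gynecology exam"}

def impute_gender(record):
    if record["gender"] is not None:
        return record["gender"]          # trust existing values
    procs = set(record["procedures"])
    if procs & MALE_ONLY:
        return "M"
    if procs & FEMALE_ONLY:
        return "F"
    return None                          # still unknown; needs a model

for r in records:
    r["gender"] = impute_gender(r)

print([r["gender"] for r in records])    # ['M', 'F', 'F']
```

Records the rules can't resolve are exactly where the "build another model" step he mentions would take over.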
And then, once you build the model on that data comes the operationalization of that model, which in itself is huge because how do you make sure that you infuse that model into your current infrastructure, which is where a lot of skill set, a lot of experience, and a lot of knowledge comes in, because you want to make sure, unless you are a start-up, right? You already have applications and programs and third-party vendor applications that have been running for years, or decades, for that matter. So, yeah, so that's operationalization's a huge piece. Cleansing of the data is a huge piece. Getting the model right is another piece. >> And simplifying the whole process. I think about, I got to ingest the data. I've now got to, you know, play with it, explore. I've got to process it. And I've got to serve it to some, you know, some business need or application. And typically, those are separate processes, separate tools, maybe different personas that are doing that. Am I correct that your announcement in the Fall addressed that workflow? How is it being, you know, deployed and adopted in the field? How is it, again back to transformation, are you seeing that people are actually transforming their analytics processes and ultimately creating outcomes that they expect? >> Huge. So good point. We announced Data Science Experience in the Fall. And the customers who are going to speak with us today on stage are the customers who have been using that. So, for example, if you take AFCU, America First Credit Union, they worked with us. In two weeks, you know, talk about transformation, we were able to absorb the knowledge of their underwriters. You know, what (mumbles) is in. Build that, get the features. And we were able to build a model in two weeks. And the model is predicting with 90% accuracy. That's what early tests are showing. >> [Dave] And you say that was in a couple of weeks. You were, you developed that model. >> Yeah, yeah, right.
So when we talk about transformation, right? We couldn't have done that a few years ago. We have transformed where the different personas can collaborate with each other, and that's the collaboration piece I talked about. Real time. Be able to build a model, and put it to the test to see what kind of benefits they're getting. >> And you've obviously got edge cases where people get really sophisticated, but, you know, we were sort of talking off camera, and you know like the 80/20 rule, or maybe it's the 90/10. You say most use cases can be, you know, solved with regression and classification. Can you talk about that a little more? >> So, so when we talk about machine learning, right? To me, I would say 90% of it is regression or classification. I mean there are edge cases of clustering and all those things. But linear regression or a classification can solve most of our customers' problems, right? So whether it's fraud detection. Or whether it's underwriting the loan. Or whether you're trying to determine the sentiment analysis. I mean, you can kind of classify or do regression on it. So I would say that 90% of the cases can be covered, but like I said, most of the work is not just about picking the right algorithm, it's also about cleansing the data. Picking the algorithm, then comes building the model. Then comes deployment or operationalizing the model. So there's a step process that's involved, and each step involves some amount of work. So if I could make one more point on the technology and the transformation we have done. So even with picking the right algorithm, we automated, so you as a data scientist don't need to, you know, come in and figure out if I have 50 classifiers and each classifier has four parameters. That's 200 different combinations. Even if you take one hour on each combination, that's 200 hours, or nine days, that it takes you to pick the right combination.
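Enumerating that grid shows why manual search doesn't scale and what an automated assistant walks through instead. The toy scoring function below is a stand-in for actually training and cross-validating each combination, which is where the hour-per-combination cost comes from:

```python
# Sketch of the search space Dinesh describes: 50 classifiers, each
# with four parameter settings, multiplies out to 200 combinations.
# toy_score stands in for real cross-validated accuracy.

from itertools import product

classifiers = [f"clf_{i}" for i in range(50)]   # hypothetical classifier names
param_settings = ["p1", "p2", "p3", "p4"]       # four settings per classifier

grid = list(product(classifiers, param_settings))
print(len(grid))                                # 200

def toy_score(clf, param):
    # stand-in for training and evaluating one combination
    return (hash((clf, param)) % 1000) / 1000

best = max(grid, key=lambda combo: toy_score(*combo))
print(best in grid)                             # True
```

An assistant like the one he goes on to describe automates exactly this loop, trading the nine days of sequential evaluation for a guided search.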
What we have done is like in IBM Machine Learning we have something called cognitive assistance for data science, which will help you pick the right combination in minutes instead of days. >> So I can see how regression scales, and in the example you gave of classification, I can see how that scales. If you've got a, you know, fixed classification or maybe 200 parameters, or whatever it is, that scales, what happens, how are people dealing with, sort of automating that classification as things change, as they, some kind of new disease or pattern pops up. How do they address that at scale? >> Good point. So as the data changes, the model needs to change, right? Because everything that model knows is based on the training data. Now, if the data has changed, the symptoms of cancer or any disease have changed, obviously, you have to retrain that model. And that's where I talk about the, where the feedback loop comes in, where we will automatically retrain the model based on the new data that's coming in. So you, as an end user, for example, don't need to worry about it because we will take care of that piece also. We will automate that, also. >> Okay, good. And you've got a session this afternoon with you said two clients, right? AFCU and Kaden dot TV, and you're on, let's see, at 2:55. >> Right. >> So you folks watching the live stream, check that out. I'll give you the last word, you know, what shall we expect to hear there. Show a little leg on your discussion this afternoon. >> Right. So, obviously, I'm going to talk about the differentiating factors, what we are delivering in IBM Machine Learning, right? And I covered some of it. There's going to be much more. We are going to focus on how we are making freedom or flexibility available. How we are going to deliver productivity gains, right, for our data scientists and developers. We are going to talk about trust, you know, the trust of data that we are bringing in.
Then I'm going to bring the customers in and talk about their experience, right? We are delivering a product, but we already have customers using it, so I want them to come on stage and share the experiences of, you know, it's one thing you hear about that from us, but it's another thing that customers come and talk about it. So, and the last but not least is we are going to announce our first release of IBM Machine Learning on Z because if you look at 90% of the transactional data, today, it runs through Z, so they don't have to off-load the data to do analytics on it. We will make machine learning available, so you can do training and scoring right there on Z for your real time analytics, so. >> Right. Extending that theme that we talked about earlier, Stu, bringing analytics and transactions together, which is a big theme of the Z 13 announcement two years ago. Now you're seeing, you know, machine learning coming on Z. The live stream starts at 2 o'clock. Silicon Angle dot com had an article up on the site this morning from Maria Doucher on the IBM announcement, so check that out. Dinesh, thanks very much for coming back on theCube. Really appreciate it, and good luck today. >> Thank you. >> All right. Keep it right there, buddy. We'll be back with our next guest. This is theCube. We're live from the Waldorf Astoria for the IBM Machine Learning Event announcement. Right back.

Published Date : Feb 15 2017


Rob Thomas, IBM | IBM Machine Learning Launch



>> Narrator: Live from New York, it's theCUBE. Covering the IBM Machine Learning Launch Event. Brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. >> Welcome back to New York City, everybody this is theCUBE, we're here at the IBM Machine Learning Launch Event, Rob Thomas is here, he's the general manager of the IBM analytics group. Rob, good to see you again. >> Dave, great to see you, thanks for being here. >> Yeah it's our pleasure. So two years ago, IBM announced the Z platform, and the big theme was bringing analytics and transactions together. You guys are sort of extending that today, bringing machine learning. So the news just hit three minutes ago. >> Rob: Yep. >> Take us through what you announced. >> This is a big day for us. The announcement is we are going to bring machine learning to private Clouds, and my observation is this, you look at the world today, over 90% of the data in the world cannot be googled. Why is that? It's because it's behind corporate firewalls. And as we've worked with clients over the last few years, sometimes they don't want to move their most sensitive data to the public Cloud yet, and so what we've done is we've taken the machine learning from IBM Watson, we've extracted that, and we're enabling that on private Clouds, and we're telling clients you can get the power of machine learning across any type of data, whether it's data in a warehouse, a database, unstructured content, email, you name it we're bringing machine learning everywhere. To your point, we were thinking about, so where do we start? And we said, well, what is the world's most valuable data? It's the data on the mainframe. It's the transactional data that runs the retailers of the world, the banks of the world, insurance companies, airlines of the world, and so we said we're going to start there because we can show clients how they can use machine learning to unlock value in their most valuable data. 
>> And which, you say private Cloud, of course, we're talking about the original private Cloud, >> Rob: Yeah. >> Which is the mainframe, right? >> Rob: Exactly. >> And I presume that you'll extend that to other platforms over time is that right? >> Yeah, I mean, we're going to think about every place that data is managed behind a firewall, we want to enable machine learning as an ingredient. And so this is the first step, and we're going to be delivering every quarter starting next quarter, bringing it to other platforms, other repositories, because once clients get a taste of the idea of automating analytics with machine learning, what we call continuous intelligence, it changes the way they do analytics. And, so, demand will be off the charts here. >> So it's essentially Watson ML extracted and placed on Z, is that right? And describe how people are going to be using this and who's going to be using it. >> Sure, so Watson on the Cloud today is IBM's Cloud platform for artificial intelligence, cognitive computing, augmented intelligence. A component of that is machine learning. So we're bringing that as IBM machine learning which will run today on the mainframe, and then in the future, other platforms. Now let's talk about what it does. What it is, it's a single-place unified model management, so you can manage all your models from one place. And we've got really interesting technology that we pulled out of IBM research, called CADS, which stands for the Cognitive Assistance for Data Scientist. And the idea behind CADS is, you don't have to know which algorithm to choose, we're going to choose the algorithm for you. You build your model, we'll decide based on all the algorithms available on open-source what you built for yourself, what IBM's provided, what's the best way to run it, and our focus here is, it's about productivity of data science and data scientists. 
No company has as many data scientists as they want, and so we've got to make the ones they do have vastly more productive, and so with technology like CADS, we're helping them do their job more efficiently and better. >> Yeah, CADS, we've talked about this in theCUBE before, it's like an algorithm to choose an algorithm, and makes the best fit. >> Rob: Yeah. >> Okay. And you guys addressed some of the collaboration issues at your Watson data platform announcement last October, so talk about the personas who are asking you to give me access to mainframe data, and give me, to tooling that actually resides on this private Cloud. >> It's definitely a data science persona, but we see, I'd say, an emerging market where it's more the business analyst type that is saying I'd really like to get at that data, but I haven't been able to do that easily in the past. So giving them a single pane of glass if you will, with some light data science experience, where they can manage their models, using CADS to actually make it more productive. And then we have something called a feedback loop that's built into it, which is you build a model running on Z, as you get new data in, these are the largest transactional systems in the world so there's data coming in every second. As you get new data in, that model is constantly updating. The model is learning from the data that's coming in, and it's becoming smarter. That's the whole idea behind machine learning in the first place. And that's what we've been able to enable here. Now, you and I have talked through the years, Dave, about IBM's investment in Spark. This is one of the first, I would say, world-class applications of Spark. We announced Spark on the mainframe last year, what we're bringing with IBM machine learning is leveraging Spark as an execution engine on the mainframe, and so I see this as Spark is finally coming into the mainstream, when you talk about Spark accessing the world's greatest transactional data. 
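The feedback loop Rob describes, where the model keeps getting smarter as transactional data arrives, is the online-learning pattern. A minimal stand-in, with a running mean in place of a real model, just to show the incremental-update shape:

```python
# Toy continuous-intelligence model in the spirit of Rob's feedback
# loop: the estimate updates incrementally with each new observation
# instead of being retrained offline in batch. A real deployment
# would update model weights, not a mean; this is illustrative only.

class OnlineMean:
    def __init__(self):
        self.n = 0
        self.estimate = 0.0

    def update(self, value):
        self.n += 1
        self.estimate += (value - self.estimate) / self.n  # incremental mean

    def predict(self):
        return self.estimate

model = OnlineMean()
for txn_amount in [10.0, 20.0, 30.0]:    # new transactions streaming in
    model.update(txn_amount)
print(model.predict())                   # 20.0
```

The same shape applies whether the update is a running statistic or a Spark ML model refit on freshly arrived records: no offline retraining step interrupts scoring.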
>> Rob, I wonder if you can help our audience kind of squint through a compare and contrast, public Cloud versus what you're offering today, 'cause one thing, public Cloud adding new services, machine learning seemed like one of those areas that we would add, like IBM had done with a machine learning platform. Streaming, absolutely, you hear mobile streaming applications happened in the public Cloud. Is cost similar in private Cloud? Can I get all the services? How will IBM and your customer base keep up with that pace of innovation that we've seen from IBM and others in the public Cloud, on prem? >> Yeah, so, look, my view is it's not an either or. Because when you look at this valuable data, clients want to do some of it in public Cloud, they want to keep a lot of it in the system that they built on prem. So our job is, how do we actually bridge that gap? So I see machine learning like we've talked about becoming much more of a hybrid capability over time because the data they want to move to the Cloud, they should do that. The economics are great. The data, doing it on private Cloud, actually the economics are tremendous as well. And so we're delivering an elastic infrastructure on private Cloud as well that can scale like the public Cloud. So to me it's not either or, it's about what everybody wants as Cloud features. They want the elasticity, they want a creatable interface, they want the economics of Cloud, and our job is to deliver that in both places. Whether it's on the public Cloud, which we're doing, or on the private Cloud. >> Yeah, one of the thought exercises I've gone through is if you follow the data, and follow the applications, it's going to show you where customers are going to do things.
If you look at IOT, if you look at healthcare, there's lots of uses that it's going to be on prem, it's going to be on the edge. I got to interview Walmart a couple of years ago at the IBM Edge show, and they leveraged Z globally to use their sales, their enablement, and obviously they're not going to use AWS as their platform. What's the trends, what do you hear from their customers, how much of the data, are there reasons why it needs to stay at the edge? It's not just compliance and governance, but it's just because that's where the data is, and I think you were saying there's just so much data on the Z series itself compared to in other environments. >> Yeah, and it's not just the mainframe, right? Let's be honest, there's just massive amounts of data that still sits behind corporate firewalls. And while I believe the end destination is a lot of that will be on public Cloud, what do you do now? Because you can't wait until that future arrives. And so the place, the biggest change I've seen in the market in the last year is clients are building private Clouds. It's not traditional on-premise deployments, it's, they're building an elastic infrastructure behind their firewall, you see it a lot in heavily-regulated industries, so financial services where they're dealing with things like GDPR, any type of retailer who's dealing with things like PCI compliance. Heavy-regulated industries are saying, we want to move there, but we got challenges to solve right now. And so, our mission is, we want to make data simple and accessible, wherever it is, on private Cloud or public Cloud, and help clients on that journey. >> Okay, so carrying through on that, so you're now unlocking access to mainframe data, great, if I have, say, a retail example, and I've got some data science, I'm building some models, I'm accessing the mainframe data, if I have data that's elsewhere in the Cloud, how specifically with regard to this announcement will a practitioner execute on that?
Yeah, so, one is you could decide one place that you want to land your data and have it be resident, so you could do that. We have scenarios where clients are using Data Science Experience on the Cloud, but they're actually leaving the data behind the firewalls. So we don't require them to move the data, so our model is one of flexibility in terms of how they want to manage their data assets. Which I think is unique in terms of IBM's approach to that. Others in the market say, if you want to use our tools, you have to move your data to our Cloud, some of them even say as you click through the terms, now we own your data, now we own your insights. That's not our approach. Our view is it's your data, if you want to run the applications in the Cloud, leave the data where it is, that's fine. If you want to move both to the Cloud, that's fine. If you wanted to leave both on private Cloud, that's fine. We have capabilities like Big SQL where we can actually federate data across public and private Clouds, so we're trying to provide choice and flexibility when it comes to this. >> And, Rob, in the context of this announcement, that would be, that example you gave, would be done through APIs that allow me access to that Cloud data, is that right? >> Yeah, exactly, yes. >> Dave: Okay. >> So last year we announced something called Data Connect, which is basically, think of it as a bus between private and public Cloud. You can leverage Data Connect to seamlessly and easily move data. It's very high-speed, it uses our Aspera technology under the covers, so you can do that. >> Dave: A recent acquisition. >> Rob, IBM's been very active in open source engagement, in trying to help the industry sort out some of the challenges out there. Where do you see the state of the machine learning frameworks? Google of course has TensorFlow, we've seen Amazon pushing MXNet, is IBM supporting all of them, are there certain horses that you have strong feelings for?
What are your customers telling you? >> I believe in openness and choice. So with IBM machine learning you can choose your language, you can use Scala, you can use Java, you can use Python, more to come. You can choose your framework. We're starting with Spark ML because that's where we have our competency and that's where we see a lot of client desire. But I'm open to clients using other frameworks over time as well, so we'll start to bring that in. I think the IT industry always wants to kind of put people into a box. This is the model you should use. That's not our approach. Our approach is, you can use the language, you can use the framework that you want, and through things like IBM machine learning, we give you the ability to tap this data that is your most valuable data. >> Yeah, the box today has just become this mosaic and you have to provide access to all the pieces of that mosaic. One of the things that practitioners tell us is they struggle sometimes, and I wonder if you could weigh in on this, to invest either in improving the model or capturing more data and they have limited budget, and they said, okay. And I've had people tell me, no, you're way better off getting more data in, I've had people say, no no, now with machine learning we can advance the models. What are you seeing there, what are you advising customers in that regard? >> So, computes become relatively cheap, which is good. Data acquisitions become relatively cheap. So my view is, go full speed ahead on both of those. The value comes from the right algorithms and the right models. That's where the value is. And so I encourage clients, even think about maybe you separate your teams. And you have one that's focused on data acquisition and how you do that, and another team that's focused on model development, algorithm development. Because otherwise, if you give somebody both jobs, they both get done halfway, typically. 
And the value is from the right models, the right algorithms, so that's where we stress the focus. >> And models to date have been okay, but there's a lot of room for improvement. Like the two examples I like to use are retargeting, ad retargeting, which, as we all know as consumers is not great. You buy something and then you get targeted for another week. And then fraud detection, which is actually, for the last ten years, quite good, but there's still a lot of false positives. Where do you see IBM machine learning taking that practical use case in terms of improving those models? >> Yeah, so why are there false positives? The issue typically comes down to the quality of data, and the amount of data that you have that's why. Let me give an example. So one of the clients that's going to be talking at our event this afternoon is Argus who's focused on the healthcare space. >> Dave: Yeah, we're going to have him on here as well. >> Excellent, so Argus is basically, they collect data across payers, they're focused on healthcare, payers, providers, pharmacy benefit managers, and their whole mission is how do we cost-effectively serve different scenarios or different diseases, in this case diabetes, and how do we make sure we're getting the right care at the right time? So they've got all that data on the mainframe, they're constantly getting new data in, it could be about blood sugar levels, it could be about glucose, it could be about changes in blood pressure. Their models will get smarter over time because they built them with IBM machine learning so that what's cost-effective today may not be the most effective or cost-effective solution tomorrow. But we're giving them that continuous intelligence as data comes in to do that. That is the value of machine learning. 
I think sometimes people miss that point, they think it's just about making the data scientists' job easier, that productivity is part of it, but it's really about the veracity of the data and that you're constantly updating your models. >> And the patient outcome there, I read through some of the notes earlier, is if I can essentially opt in to allow the system to adjudicate the medication or the claim, and if I do so, I can get that instantaneously or in near real-time as opposed to having to wait weeks and phone calls and haggling. Is that right, did I get that right? >> That's right, and look, there's two dimensions. It's the cost of treatment, so you want to optimize that, and then it's the effectiveness. And which one's more important? Well, they're both actually critically important. And so what we're doing with Argus is building, helping them build models where they deploy this so that they're optimizing both of those. >> Right, and in the case, again, back to the personas, that would be, and you guys stressed this at your announcement last October, it's the data scientist, it's the data engineer, it's the, I guess even the application developer, right? Involved in that type of collaboration. >> My hope would be over time, when I talked about we view machine learning as an ingredient across everywhere that data is, is you want to embed machine learning into any applications that are built. And at that point you no longer need a data scientist per se, for that case, you can just have the app developer that's incorporating that. Whereas another tough challenge like the one we discussed, that's where you need data scientists. So think about, you need to divide and conquer the machine learning problem, where the data scientist can play, the business analyst can play, the app developers can play, the data engineers can play, and that's what we're enabling. >> And how does streaming fit in?
We talked earlier about this sort of batch, interactive, and now you have this continuous sort of workload. How does streaming fit? >> So we use streaming in a few ways. One is very high-speed data ingest, it's a good way to get data into the Cloud. We also can do analytics on the fly. So a lot of our use cases around streaming are where we actually build analytical models into the streaming engine, so that you're doing analytics on the fly. So I view that as a different side of the same coin. It's kind of based on your use case, how fast you're ingesting data. If you constantly have data coming in and you need, you know, sub-millisecond response times, you need something like a streaming engine to do that. >> And it's actually consolidating that data pipeline, is what you described, which is big in terms of simplifying the complexity, this mosaic of Hadoop, for example, and that's a big value proposition of Spark. Alright, we'll give you the last word, you've got an audience outside waiting, big announcement today; final thoughts. >> You know, we've talked about machine learning for a long time. I'll give you an analogy. In 1896, Charles Brady King was the first person to drive an automobile down the street in Detroit. It was 20 years later before Henry Ford actually turned it from a novelty into mass appeal. So it was like a 20-year incubation period before you could actually automate it, make it more cost-effective, make it simpler and easier. I feel like we're kind of in the same thing here, where the data era in my mind began around the turn of the century. Companies came onto the internet, started to collect a lot more data. It's taken us a while to get to the point where we could actually make this really easy and do it at scale. And people have been wanting to do machine learning for years. It starts today. So we're excited about that.
>> Yeah, and we saw the same thing with the steam engine, it was decades before it actually was perfected, and now the timeframe in our industry is compressed to years, sometimes months. >> Rob: Exactly. >> Alright, Rob, thanks very much for coming on theCUBE. Good luck with the announcement today. >> Thank you. >> Good to see you again. >> Thank you guys. >> Alright, keep it right there, everybody. We'll be right back with our next guest, we're live from the Waldorf Astoria, the IBM Machine Learning Launch Event. Be right back. [electronic music]
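Rob's closing point in the interview above, models that keep getting smarter as new data arrives rather than being trained once and left to go stale, can be sketched as a tiny online-learning loop. This is a hypothetical, self-contained illustration in plain Python (a logistic-regression model updated one record at a time), not IBM's actual implementation:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class OnlineModel:
    """Tiny logistic-regression model updated one example at a time,
    so its predictions keep improving as new records stream in."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return sigmoid(z)

    def update(self, x, y):
        # One stochastic-gradient step on log loss: each new labeled
        # record nudges the model, so it adapts as the data shifts.
        err = self.predict_proba(x) - y
        self.b -= self.lr * err
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]

random.seed(0)
model = OnlineModel(n_features=2)
# Synthetic stream: the label is 1 when x0 + x1 > 1 (a stand-in for
# "flag this claim/transaction"), 0 otherwise.
for _ in range(5000):
    x = [random.random(), random.random()]
    model.update(x, 1.0 if x[0] + x[1] > 1.0 else 0.0)

print(model.predict_proba([0.9, 0.9]) > 0.5)  # True: well above the boundary
print(model.predict_proba([0.1, 0.1]) < 0.5)  # True: well below it
```

The same loop is what "continuous intelligence" amounts to in production, with `partial_fit`-style APIs or streaming ML frameworks playing the role of this hand-rolled update step.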

Published Date : Feb 15 2017



Kickoff - IBM Machine Learning Launch - #IBMML - #theCUBE


 

>> Narrator: Live from New York, it's The Cube covering the IBM Machine Learning Launch Event brought to you by IBM. Here are your hosts, Dave Vellante and Stu Miniman. >> Good morning everybody, welcome to the Waldorf Astoria. Stu Miniman and I are here in New York City, the Big Apple, for IBM's Machine Learning Event #IBMML. We're fresh off Spark Summit, Stu, where we had The Cube, this by the way is The Cube, the worldwide leader in live tech coverage. We were at Spark Summit last week, George Gilbert and I, watching the evolution of so-called big data. Let me frame, Stu, where we're at and bring you into the conversation. The early days of big data were all about offloading the data warehouse and reducing the cost of the data warehouse. I often joke that the ROI of big data is reduction on investment, right? There's these big, expensive data warehouses. It was quite successful in that regard. What then happened is we started to throw all this data into the data warehouse. People would joke it became a data swamp, and you had a lot of tooling to try to clean the data warehouse and a lot of transforming and loading and the ETL vendors started to participate there in a bigger way. Then you saw the extension of these data pipelines to try to more with that data. The Cloud guys have now entered in a big way. We're now entering the Cognitive Era, as IBM likes to refer to it. Others talk about AI and machine learning and deep learning, and that's really the big topic here today. What we can tell you, that the news goes out at 9:00am this morning, and it was well known that IBM's bringing machine learning to its mainframe, z mainframe. Two years ago, Stu, IBM announced the z13, which was really designed to bring analytic and transaction processing together on a single platform. Clearly IBM is extending the useful life of the mainframe by bringing things like Spark, certainly what it did with Linux and now machine learning into z. 
I want to talk about Cloud, the importance of Cloud, and how that has really taken over the world of big data. Virtually every customer you talk to now is doing work on the Cloud. It's interesting to see now IBM unlocking its transaction base, its mission-critical data, to this machine learning world. What are you seeing around Cloud and big data? >> We've been digging into this big data space since before it was called big data. One of the early things that really got me interested and excited about it is that, from the infrastructure standpoint, storage has always been one of those costs that we had to bear, and with the massive amounts of data, the digital explosion we talked about, keeping all that information, or managing all that information, was a huge challenge. Big data was really that bit flip. How do we take all that information and make it an opportunity? How do we get new revenue streams? Dave, IBM has been at the center of this, looking at the higher-level pieces of not just storing data, but leveraging it. Obviously huge in analytics, lots of focus on everything from Hadoop and Spark and newer technologies, but digging in to how they can leverage up the stack, which is where IBM has done a lot of acquisitions in that space, and they want to make sure that they have a strong position both in Cloud, which was renamed, SoftLayer is now IBM Bluemix, with a lot of services including a machine learning service that leverages the Watson technology, and of course on-prem, where they've got the z and Power solutions that you and I have covered for many years at the IBM Edge show.
We're familiar with the world of Batch and then some mini computer brought in the world of interactive, so we're familiar with those types of workloads. Now we're talking about a new emergent workload which is continuous. Continuous apps where you're streaming data in, what Spark is all about. The models that data scientists are building can constantly be improved. The key is automation, right? Being able to automate that whole process, and being able to collaborate between the data scientist, the data quality engineers, even the application developers that's something that IBM really tried to address in its last big announcement in this area of which was in October of last year the Watson data platform, what they called at the time the DataWorks. So really trying to bring together those different personas in a way that they can collaborate together and improve models on a continuous basis. The use cases that you often hear in big data and certainly initially in machine learning are things like fraud detection. Obviously ad serving has been a big data application for quite some time. In financial services, identifying good targets, identifying risk. What I'm seeing, Stu, is that the phase that we're in now of this so-called big data and analytics world, and now bringing in machine learning and deep learning, is to really improve on some of those use cases. For example, fraud's gotten much, much better. Ten years ago, let's say, it took many, many months, if you ever detected fraud. Now you get it in seconds, or sometimes minutes, but you also get a lot of false positives. Oops, sorry, the transaction didn't go through. Did you do this transaction? Yes, I did. Oh, sorry, you're going to have to redo it because it didn't go through. It's very frustrating for a lot of users. That will get better and better and better. We've all experienced retargeting from ads, and we know how crappy they are. That will continue to get better. 
The big question that people have, and it goes back to Jeff Hammerbacher's line that the best minds of my generation are trying to get people to click on ads, is when will we see big data really start to affect our lives in different ways, like patient outcomes? We're going to hear some of that today from folks in health care and pharma. Again, these are the things that people are waiting for. The other piece is, of course, IT. What are you seeing, in terms of IT, in the whole data flow? >> Yes, a big question we have, Dave, is where's the data? And therefore, where does it make sense to be able to do that processing? In big data we talked about how you've got massive amounts of data, so can we move the processing to that data? With IoT, as our CTO talked about the day before, there's going to be massive amounts of data at the edge, and I don't have the time or the bandwidth or the need necessarily to pull that back to some kind of central repository. I want to be able to work on it there. Therefore there's going to be a lot of data worked at the edge. Peter Levine did a whole video talking about how, "Oh, Public Cloud is dead, it's all going to the edge." A little bit hyperbolic, but we understand the statement: there are plenty of use cases for both Public Cloud and for the edge. In fact we see Google pushing big on machine learning with TensorFlow, one of those machine learning frameworks out there that we expect a lot of people to be working on. Amazon is putting effort into the MXNet framework, which is once again an open-source effort. One of the things I'm looking at in the space, where I think IBM can provide some leadership, is which frameworks are going to become popular across multiple scenarios. How many winners can there be for these frameworks? We already have multiple programming languages, multiple Clouds. How much of it is just API compatibility?
How much work is there, and where are the repositories of data going to be, and where does it make sense to do that predictive analytics, that advanced processing? >> You bring up a good point. Last year, last October, at Big Data NYC, we had a special segment with a data scientist panel. It was great. We had some rockstar data scientists on there like Dez Blanchfield and Joe Caserta, and a number of others. They echoed what you always hear when you talk to data scientists: "We spend 80% of our time messing with the data, trying to clean the data, figuring out the data quality, and precious little time on the models, proving the models, and actually getting outcomes from those models." So things like Spark have simplified that whole process and unified a lot of the tooling around so-called big data. We're seeing Spark adoption increase. George Gilbert, in part one and part two of the big data forecast from Wikibon last week, showed that we're still not on the steep part of the S-curve in terms of Spark adoption. Granted, we're talking about streaming as well, included in that forecast, but it's forecasting that increasingly those applications are going to become more and more important. It brings you back to what IBM's trying to do: bring machine learning into this critical transaction data. Again, to me, it's an extension of the vision that they put forth two years ago, bringing analytic and transaction data together, actually processing within that Private Cloud complex, which is essentially what this mainframe is. It's the original Private Cloud, right? You were saying off-camera, it's the original converged infrastructure. It's the original Private Cloud. >> The mainframe's still here, lots of Linux on it. We've covered it for many years; you want your cool Linux, Docker, containerized machine learning stuff, I can do that on the z series.
>> You want Python and Spark and R and Java, and all the popular programming languages. It makes sense. It's not like a huge growth platform, it's kind of flat, down, up in the product cycle, but it's alive and well, and a lot of companies run their businesses on the z. We're going to be unpacking that all day. Some of the questions we have are: what about Cloud? Where does it fit? What about Hybrid Cloud? What are the specifics of this announcement? Where does it fit? Will it be extended? Where does it come from? How does it relate to other products within the IBM portfolio? And very importantly, how are customers going to be applying these capabilities to create business value? That's something that we'll be looking at with a number of the folks on today. >> Dave, another thing: it reminds me of two years ago, when you and I did an event with the MIT Sloan school on The Second Machine Age with Andy McAfee and Erik Brynjolfsson, talking about, as machines can help with some of these analytics, some of this advanced technology, what happens to the people? Talk about health care; it's doctors plus machines most of the time. As these two professors say, it's racing with the machines. What is the impact on people? What's the impact on jobs? And productivity going forward; really interesting hot space. They talk about everything from autonomous vehicles, advanced health care and the like. This is right at the core of where the next generation of the economy and jobs are going to go.
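Dave's earlier point about fraud detection getting fast but still throwing false positives is, at bottom, a thresholding trade-off: lower the alert threshold and you catch more fraud but flag more legitimate transactions. A toy sketch of that trade-off, with hypothetical scores and labels (not any vendor's model):

```python
def confusion_rates(scores, labels, threshold):
    """False-positive and false-negative rates at a given alert threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp / labels.count(0), fn / labels.count(1)

# Hypothetical model scores: legitimate transactions (label 0) mostly score
# low, fraud (label 1) mostly high -- the overlap is what causes trouble.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.2, 0.7, 0.8, 0.9, 0.5]
labels = [0,   0,   0,   0,   0,   0,   1,   1,   1,   1]

for t in (0.3, 0.5, 0.7):
    fpr, fnr = confusion_rates(scores, labels, t)
    print(f"threshold={t}: false-positive rate={fpr:.2f}, "
          f"false-negative rate={fnr:.2f}")
```

Better models (more data, better features, continuous retraining) shrink the overlap between the two score distributions, which is what actually reduces false positives at any threshold you pick.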

Published Date : Feb 15 2017



Joel Horwitz, IBM & David Richards, WANdisco - Hadoop Summit 2016 San Jose - #theCUBE


 

>> Narrator: From San Jose, California, in the heart of Silicon Valley, it's theCUBE. Covering Hadoop Summit 2016. Brought to you by Hortonworks. Here's your host, John Furrier. >> Welcome back everyone. We are here live in Silicon Valley at Hadoop Summit 2016, actually San Jose. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. Our next guest, David Richards, CEO of WANdisco. And Joel Horwitz, strategy and business development, IBM Analytics. Guys, welcome back to theCUBE. Good to see you guys. >> Thank you for having us. >> It's great to be here, John. >> Give us the update on WANdisco. What's the relationship with IBM and WANdisco? 'Cause, you know, I can just almost see it, but I'm not going to predict. Just tell us. >> Okay, so, I think the last time we were on theCUBE, I was sitting with Re-ti-co, who works very closely with Joel. And we began to talk about how our partnership was evolving. And of course, we were negotiating an OEM deal back then, so we really couldn't talk about it very much. But this week, I'm delighted to say that we announced, I think it's called IBM Big Replicate? >> Joel: Big Replicate, yeah. We have a big everything, and Replicate's the latest addition. >> So it's going really well. It's OEM'd into IBM's analytics, big data products, and cloud products. >> Yeah, I'm smiling and smirking because we've had so many conversations, David, on theCUBE with you, following your business through the bumpy road, or the wild seas, of big data. And it's been a really interesting tossing and turning of the industry. I mean, Joel, we've talked about it too. The innovation around Hadoop, and then the massive slowdown and realization that cloud is now on top of it. The consumerization of the enterprise created a little shift in the value proposition, and then a massive rush to build enterprise grade, right? And you guys had that enterprise grade piece of it. IBM, certainly you're enterprise grade.
You have enterprise everywhere. But the ecosystem had to evolve really fast. What happened? Share with the audience this shift. >> So, it's a classic product adoption lifecycle, and the buying audience has changed over that time continuum. In the very early days when we first started talking at these events, when we were talking about Hadoop, we all really cared about whether it was Pig and Hive. >> You once had a distribution. That's a throwback. Today's Thursday, we'll do that tomorrow. >> And the buying audience has changed, and consequently, the companies involved in the ecosystem have changed. So where we once used to really care about all of those different components, we don't really care about the machinations below the application layer anymore. Some people do, yes, but by and large, we don't. And that's why cloud, for example, is so successful, because you press a button and it's there. And that, I think, is where the market is going, very, very quickly. So it makes perfect sense for a company like WANdisco, who've got 20, 30, 40, 50 sales people, to move to a company like IBM that has 4 or 5,000 people selling our analytics products. >> Yeah, and so this is an OEM deal. Let's just get that news on the table. So, you're an OEM. IBM's going to OEM their product and brand it IBM Big Replication? >> Yeah, it's part of our Big Insights portfolio. We've done a great job at growing this product line over the last few years, with last year talking about how we decoupled all the value-adds from the core distribution. So I'm happy to say that we're both part of the ODPi; it's an ODPi-certified distribution. That's the Hadoop that we offer today for free. But then we've been adding not just in terms of the data management capabilities, but the partnership here that we're announcing with WANdisco, and how we branded it as Big Replicate, is squarely aimed at the data management market today. But where we're headed, as David points out, is really much bigger, right?
We're talking about support for not only distributed storage and data, but we're also talking about a hybrid offering that will get you to the cloud faster. So not only does Big Replicate work with HDFS, it also works with the Swift object store, which, as you know, is kind of the underlying storage for our cloud offering. So what we're hoping to see from this great partnership is, as you see around you, Hadoop is a great market, but there's a lot more here when you talk about managing data that you need to consider. And I think hybrid is becoming a lot larger of a story than simply distributing your processing and your storage. It's becoming a lot more about, okay, how do you offset different regions? How do you think through the fact that there are multiple, I think there's this idea that there's one Hadoop cluster in an enterprise. I think that's factually wrong. I think what we're observing is that there are actually people who are spinning up, you know, multiple Hadoop distributions at the line of business, for maybe a campaign, or for maybe doing fraud detection, or maybe doing log files, whatever. And managing all those clusters: they'll have Cloudera. They'll have Hortonworks. They'll have IBM. They'll have all of these different distributions that they're having to deal with. And what we're offering is sanity. It's like, give me sanity for how I can actually replicate that data. >> I love the name Big Replicate, fantastic. Big Insights, Big Replicate. And so go to market, you guys are going to have a bigger sales force. It's a nice pop for you guys. I mean, it's a good deal. >> We were just talking before we came on air about the deal flow coming through, this potential deal flow, which has been off the charts. I mean, obviously, when you turn on the tap, you suddenly enable thousands and thousands of sales people to start selling your products. I mean, IBM are doing a great job.
And I think IBM are in a unique position where they own both cloud and on-prem. There are very few companies that own both the on-prem-- >> They're going to need to have that connection for the companies that are going hybrid. So hybrid cloud becomes interesting right now. >> Well, actually, there's a theory, and we were just discussing this, that says the value of data lies in analytics, not in the data itself. It lies in being able to pull out information from that data. Most CIOs-- >> If you can get the data. >> If you can get the data. Let's assume that you've got the data. So then it becomes a question of, >> That's a big assumption. Yes, it is. (laughs) I just had Nancy Handling on about metadata. No, that's an issue. People have data they store that they can't do anything with. >> Exactly. And that's part of the problem, because what you actually have to have is CPU, slash, processing power for an unknown amount of data at any one moment in time. Now, that sounds like an elastic use case, and you can't do elastic on-prem. You can only do elastic in cloud. That means that virtually every distribution will have to be a hybrid distribution. IBM realized this years ago and began to build this hybrid infrastructure. We're going to help them to move data, completely consistent data, between on-prem and cloud, so when you query things in the cloud, you get exactly the same results, and the correct results. >> And also the stability on that. There are so many potential issues, as we've discussed in the past. It sounds simple and logical, but to do it enterprise grade is pretty complex. And so it just gives a nice, stable, enterprise grade component. >> I mean, the volumes of data that we're talking about here are just off the charts. >> Give me a use case of a customer that you guys are working with, or has there been any go-to-market activity, or an ideal scenario that you guys see as a use case for this partnership?
>> We're already seeing a whole bunch of things come through. >> What's the number one pattern that bubbles up to the top? Use case-wise. >> As Joel pointed out, he doesn't believe that any one company just has one version of Hadoop behind their firewall. They have multiple vendors. >> 100% agree with that. >> So how do you create one single cluster from all of those? >> John: That's one problem you solved. >> That's of course a very large problem. The second problem that we're seeing in spades is I have to move data to cloud to run analytics applications against it. That's huge. That requires completely guaranteed consistent data between on-prem and cloud. And I think those two use cases alone account for pretty much every single company. >> I think there's even a third here. I think the third is, frankly, that there are a lot of inefficiencies in managing just HDFS and how many times you have to actually copy data. If I look across, I think the standard right now is having like three copies. And actually, working with Big Replicate and WANdisco, you can have more assurances and make fewer copies across the cluster, and across multiple clusters. If you think about it, you have three copies of the data sitting in this cluster. Likely, analysts have dragged a bunch of the same data into other clusters, so that's another multiple of three. So there's an amount of waste in terms of the same data living across your enterprise. So I think there's a huge cost-savings component to this as well. >> Does this involve anything with Project Atlas at all? You guys are working with, >> Not yet, no. >> That project? It's interesting. We're seeing a lot of opening up of the data, but all they're doing is creating versions of it. And so then it becomes version control of the data. Do you see a master or a centralization of data? Actually, not centralized; pull all the data into one spot, but why replicate it? Do you see that going on?
I guess I'm not following the trend here. I can't see the mega trend going on. >> It's cloud. >> What's the big trend? >> The big trend is I need an elastic infrastructure. I can't build an elastic infrastructure on-premise. It doesn't make economic sense to build massive redundancy maybe three or four times the infrastructure I need on premise when I'm only going to use it maybe 10, 20% of the time. So the mega trend is cloud provides me with a completely economic, elastic infrastructure. In order to take advantage of that, I have to be able to move data, transactional data, data that changes all the time, into that cloud infrastructure and query it. That's the mega trend. It's as simple as that. >> So moving data around at the right time? >> And that's transaction. Anybody can say okay, press pause. Move the data, press play. >> So if I understand this correctly, and just, sorry, I'm a little slow. End of the day today. So instead of staging the data, you're moving data via the analytics engines. Is that what you're getting at? >> You use data that's being transformed. >> I think you're accessing data differently. I think today with Hadoop, you're accessing it maybe through like Flume or through Oozy, where you're building all these data pipelines that you have to manage. And I think that's obnoxious. I think really what you want is to use something like Apache Spark. Obviously, we've made a large investment in that earlier, actually, last year. To me, what I think I'm seeing is people who have very specific use cases. So, they want to do analysis for a particular campaign, and so they may just pull a bunch of data into memory from across their data environment. And that may be on the cloud. It may be from a third-party. It may be from a transactional system. It may be from anywhere. And that may be done in Hadoop. It may not, frankly. 
>> Yeah, this is the great point, and again, one of the themes on the show is, this is a question that's kind of been talked about in the hallways. And I'd love to hear your thoughts on this. Is there are some people saying that there's really no traction for Hadoop in the cloud. And that customers are saying, you know, it's not about just Hadoop in the cloud. I'm going to put in S3 or object store. >> You're right. I think-- >> Yeah, I'm right as in what? >> Every single-- >> There's no traction for Hadoop in the cloud? >> I'll tell you what customers tell us. Customers look at what they actually need from storage, and they compare whatever it is, Hadoop or any on-premise proprietor storage array and then look at what S3 and Swift and so on offer to them. And if you do a side-by-side comparison, there isn't really a difference between those two things. So I would argue that it's a fact that functionally, storage in cloud gives you all the functionality that any customer would need. And therefore, the relevance of Hadoop in cloud probably isn't there. >> I would add to that. So it really depends on how you define Hadoop. If you define Hadoop by the storage layer, then I would say for sure. Like HDFS versus an objects store, that's going to be a difficult one to find some sort of benefit there. But if you look at Hadoop, like I was talking to my friend Blake from Netflix, and I was asking him so I hear you guys are kind of like replatforming on Spark now. And he was basically telling me, well, sort of. I mean, they've invested a lot in Pig and Hive. So if you think it now about Hadoop as this broader ecosystem which you brought up Atlas, we talk about Ranger and Knox and all the stuff that keeps coming out, there's a lot of people who are still invested in the peripheral ecosystem around Hadoop as that central point. My argument would be that I think there's still going to be a place for distributed computing kind of projects. 
And now whether those will continue to interface through YARN and then down to HDFS, or whether that'll be YARN on, say, an object store or something, and those projects will persist on their own. To me that's kind of more of how I think about the larger discussion around Hadoop. I think people have made a lot of investments in terms of that ecosystem around Hadoop, and that's something that they're going to have to think through. >> Yeah. And Hadoop wasn't really designed for cloud. It was designed for commodity servers, deployment with ease and at low cost. It wasn't designed for cloud-based applications. Storage in the cloud was designed for the cloud. Right, that's S3. That's what Swift and so on were designed specifically to do, and they fulfill most of those functions. But Joel's right, there will be companies that continue to use-- >> What's my whole argument? My whole argument is that why would you want to use Hadoop in the cloud when you can just do that? >> Correct. >> There are object stores out there. There's plenty of great storage opportunities in the cloud. They're mostly shoe-horning Hadoop, and I think that's, anyway. >> There are two classes of customers. There were customers that were born in the cloud, and they're not going to suddenly say, oh you know what, we need to build our own server infrastructure behind our own firewall 'cause they were born in the cloud. >> I'm going to ask you guys this question. You can choose to answer or not. Joel may not want to answer it 'cause he's from IBM and gets his wrist slapped. This is a question I got on DM. Hadoop ecosystem consolidation question. People are mailing in the questions. Now, keep sending me your questions if you don't want your name on it. Hold on, Hadoop ecosystem. When will this start to happen? What is holding back the M&A? >> So, that's a great question.
First of all, consolidation happens when you sort of reach that tipping point or leveling off, that inflection point where the market levels off, and we've reached market saturation. So there's no more market to go after. And the big guys like IBM and so on come in-- >> Or there was never a market to begin with. (laughs) >> I don't think that's the case, but yes, I see the point. Now, what's stopping that from happening today, and you're a naughty boy by the way for asking this question, is a lot of these companies are still very well funded. So while they still have cash on the balance sheet, of course, it's very, very hard for that to take place. >> You picked up my next question. But that's a good point. The VCs held back in 2009 after the crash of 2008. Sequoia's memo, you know, "R.I.P. Good Times." They stopped funding companies. Companies are getting funded, continually getting funding. Joel. >> So I don't think you can look at this market as like an isolated market, like there's the Hadoop market and then there's a Spark market, and then even there's like an AI or cognitive market. I actually think this is all the same market. Machine learning would not be possible if you didn't have Hadoop, right? Well, I wouldn't say that. It wouldn't have had the resurgence that it has had. Mahout was one of the first machine learning libraries that caught fire, from Ted Dunning and others. And that kind of brought it back to life. And then Spark, I mean if you talk to-- >> John: I wouldn't say it created it. Incubated. >> Incubated, right. >> And created that Renaissance-like experience. >> Yeah, deep learning. Some of those machine learning algorithms require you to have a distributed kind of framework to work in. And so I would argue that it's less of a consolidation, but it's more of an evolution of people going okay, there's distributed computing.
Do I need to do that on-premises in this Hadoop ecosystem, or can I do that in the cloud, or in a growing Spark ecosystem? But I would argue there's other things happening. >> I would agree with you. I love both areas. My snarky comment, "there was never a market to begin with," what I'm saying there is that the monetization of commanding the hill that everyone's fighting for was just one of many hills in a bigger field of hills. And so, you could be in a cul-de-sac of being your own champion with no paying customers. >> What you have-- >> John: Or a free open-source product. >> Unlike the dotcom era, where most of those companies were in the public markets and you could actually see proper valuations, most of the companies, the unicorns now, most are not public. So the valuations are really difficult to see, and the valuation metrics are hard to come by. There are only a few of those companies that are in the public market. >> The cash story's right on. I think to Joel's point, it's easy to pivot in a market that's big and growing. Just 'cause you're in the wrong corner of the market, pivoting or vectoring into the value is easier now than it was 10 years ago. Because, one, if you have a unicorn situation, you have cash in the bank. So they're flush with cash. Your runway's so far out, you can still do your thing. If you're a startup, you can get time to value pretty quickly with the cloud. So again, I still think it's very healthy. In my opinion, I kind of think you guys have good analysis on that point. >> I think we're going to see some really cool stuff happen working together, and especially from what I'm seeing from IBM, in the fact that in the IT crowd, there is a behavioral change happening that Hadoop opened the door to, and we're starting to see more and more IT professionals walk through. In the sense that Hadoop has opened the door to not thinking of data as a liability, but actually thinking about data differently, as an asset.
And I think this is where this market does have an opportunity to continue to grow, as long as we don't get carried away with trying to solve all of the old problems that we solved for in on-premises data management. Like if we do that, then we're just, then there will be a consolidation. >> Metadata is a huge issue. I think that's going to be a big deal. And on the M&A, my feeling on the M&A is that you've got to buy something of value, so you either have revenue, which means customers, and/or intellectual property. So, in a market of open source, it comes back down to the valuation question. If you're IBM or Oracle or HP, they can pivot too. And they can be agile. Now, slower agile, but you know, they can literally throw some engineers at it. So if there's no customers and no IP, they can replicate, >> Exactly. >> That product. >> And we're seeing IBM do that. >> They don't know what they're buying. My whole point is if there's nothing to buy. >> I think it depends on, ultimately it depends on where we see people deriving value, and clearly with WANdisco, there's a huge amount of value that we're seeing our customers derive. So I think it comes down to that, and there is a lot of IP there, and there's a lot of IP in a lot of these companies. I think it's just a matter of widening their view, and I think WANdisco was probably the earliest to do this, frankly, to recognize that for them to succeed, it couldn't just be about Hadoop. It actually had to expand to talk about cloud and talk about other data environments, right?
Getting 4,000, 5,000-- >> You get a great partner in IBM. They know the enterprise, great stuff. This is theCUBE bringing you all the action here at Hadoop Summit. IBM's OEM deal with WANdisco, all happening right here on theCUBE. We'll be back with more live coverage after this short break.

Published Date: Jul 1 2016

