Joel Horwitz, IBM | IBM CDO Summit Spring 2018

(techno music) >> Announcer: Live, from downtown San Francisco, it's theCUBE. Covering IBM Chief Data Officer Strategy Summit 2018. Brought to you by IBM. >> Welcome back to San Francisco everybody, this is theCUBE, the leader in live tech coverage. We're here at the Parc 55 in San Francisco covering the IBM CDO Strategy Summit. I'm here with Joel Horwitz who's the Vice President of Digital Partnerships & Offerings at IBM. Good to see you again Joel. >> Thanks, great to be here, thanks for having me. >> So I was just, you're very welcome- It was just, let's see, was it last month, at Think? >> Yeah, it's hard to keep track, right. >> And we were talking about your new role- >> It's been a busy year. >> the importance of partnerships. One of the things I want to, well let's talk about your role, but I really want to get into, it's innovation. And we talked about this at Think, because it's so critical, in my opinion anyway, that you can attract partnerships, innovation partnerships, startups, established companies, et cetera. >> Joel: Yeah. >> To really help drive that innovation, it takes a team of people, IBM can't do it on its own. >> Yeah, I mean look, IBM is the leader in innovation, as we all know. We're the market leader for patents, that we put out each year, and how you get that technology in the hands of the real innovators, the developers, the longtail ISVs, our partners out there, that's the challenging part at times, and so what we've been up to is really looking at how we make it easier for partners to partner with IBM. How we make it easier for developers to work with IBM. So we have a number of areas that we've been adding, so for example, we've added a whole IBM Code portal, so if you go to developer.ibm.com/code you can actually see hundreds of code patterns that we've created to help really any client, any partner, get started using IBM's technology, and to innovate. 
>> Yeah, and that's critical, I mean you're right, because to me innovation is a combination of invention, which is what you guys do really, and then it's adoption, which is what your customers are all about. You come from the data science world. We're here at the Chief Data Officer Summit, what's the intersection between data science and CDOs? What are you seeing there? >> Yeah, so when I was here last, it was about two years ago in 2015, actually, maybe three years ago, man, time flies when you're having fun. >> Dave: Yeah, the Spark Summit- >> Yeah Spark Technology Center and the Spark Summit, and we were here, I was here at the Chief Data Officer Summit. And it was great, and at that time, I think a lot of the conversation was really not that different than what I'm seeing today. Which is, how do you manage all of your data assets? I think a big part of doing good data science, which is my kind of background, is really having a good understanding of what your data governance is, what your data catalog is, so, you know we introduced the Watson Studio at Think, and actually, what's nice about that, is it brings a lot of this together. So if you look in the market, in the data market, today, you know we used to segment it by a few things, like data gravity, data movement, data science, and data governance. And those are kind of the four themes that I continue to see. And so outside of IBM, I would contend that those are relatively separate kind of tools that are disconnected, in fact Dinesh Nirmal, who's our engineer on the analytic side, Head of Development there, he wrote a great blog just recently, about how you can have some great machine learning, you have some great data, but if you can't operationalize that, then really you can't put it to use. And so it's funny to me because we've been focused on this challenge, and IBM is making the right steps, in my, I'm obviously biased, but we're making some great strides toward unifying the, this tool chain. 
Which is data management, to data science, to operationalizing, you know, machine learning. So that's what we're starting to see with Watson Studio. >> Well, I always push Dinesh on this and like okay, you've got a collection of tools, but are you bringing those together? And he flat-out says no, we developed this, a lot of this from scratch. Yes, we bring in the best of the knowledge that we have there, but we're not trying to just cobble together a bunch of disparate tools with a UI layer. >> Right, right. >> It's really a fundamental foundation that you're trying to build. >> Well, what's really interesting about that, that piece, is that yeah, I think a lot of folks have cobbled together a UI layer, so we formed a partnership, coming back to the partnership view, with a company called Lightbend, who's based here in San Francisco, as well as in Europe, and the reason why we did that, wasn't just because of the fact that Reactive development, if you're not familiar with Reactive, it's essentially Scala, Akka, Play, this whole framework, that basically allows developers to write once, and it kind of scales up with demand. In fact, Verizon actually used our platform with Lightbend to launch the iPhone 10. And they show dramatic improvements. Now what's exciting about Lightbend, is the fact that application developers are developing with Reactive, but if you turn around, you'll also now be able to operationalize models with Reactive as well. Because it's basically a single platform to move between these two worlds. So what we've continued to see is data science kind of separate from the application world. Really kind of, AI and cloud as different universes. The reality is that for any enterprise, or any company, to really innovate, you have to find a way to bring those two worlds together, to get the most use out of it. >> Furrier always says "Data is the new development kit". He said this I think five or six years ago, and it's barely becoming true.
You guys have tried to make an attempt, and have done a pretty good job, of trying to bring those worlds together in a single platform, what do you call it? The Watson Data Platform? >> Yeah, Watson Data Platform, now Watson Studio, and I think the other, so one side of it is, us trying to, not really trying, but us actually bringing together these disparate systems. I mean we are kind of a systems company, we're IT. But not only that, but bringing our trained algorithms, and our trained models to the developers. So for example, we also did a partnership with Unity, at the end of last year, that's now just reaching some pretty good growth, in terms of bringing the Watson SDK to game developers on the Unity platform. So again, it's this idea of bringing the game developer, the application developer, in closer contact with these trained models, and these trained algorithms. And that's where you're seeing incredible things happen. So for example, Star Trek Bridge Crew, which I don't know how many Trekkies we have here at the CDO Summit. >> A few over here probably. >> Yeah, a couple? They're using our SDK in Unity, to basically allow a gamer to use voice commands through the headset, through a VR headset, to talk to other players in the virtual game. So we're going to see more, I can't really disclose too much what we're doing there, but there's some cool stuff coming out of that partnership. >> Real immersive experience driving a lot of data. Now you're part of the Digital Business Group. I like the term digital business, because we talk about it all the time. Digital business, what's the difference between a digital business and a business? What's the, how they use data. >> Joel: Yeah. >> You're a data person, what does that mean? That you're part of the Digital Business Group? Is that an internal facing thing? An external facing thing? Both? >> It's really both. 
So our Chief Digital Officer, Bob Lord, he has a presentation that he'll give, where he starts out, and he goes, when I tell people I'm the Chief Digital Officer they usually think I just manage the website. You know, if I tell people I'm a Chief Data Officer, it means I manage our data, in governance over here. The reality is that I think these Chief Digital Officer, Chief Data Officer, they're really responsible for business transformation. And so, if you actually look at what we're doing, I think on both sides is we're using data, we're using marketing technology, martech, like Optimizely, like Segment, like some of these great partners of ours, to really look at how we can quickly A/B test, get user feedback, to look at how we actually test different offerings and market. And so really what we're doing is we're setting up a testing platform, to bring not only our traditional offers to market, like DB2, Mainframe, et cetera, but also bring new offers to market, like blockchain, and quantum, and others, and actually figure out how we get better product-market fit. What actually, one thing, one story that comes to mind, is if you've seen the movie Hidden Figures- >> Oh yeah. >> There's this scene where Kevin Costner, I know this is going to look not great for IBM, but I'm going to say it anyways, which is Kevin Costner has like a sledgehammer, and he's like trying to break down the wall to get the mainframe in the room. That's what it feels like sometimes, 'cause we create the best technology, but we forget sometimes about the last mile. You know like, we got to break down the wall. >> Where am I going to put it? >> You know, to get it in the room! So, honestly I think that's a lot of what we're doing. We're bridging that last mile, between these different audiences. So between developers, between ISVs, between commercial buyers. 
Like how do we actually make this technology, not just accessible to large enterprise, which are our main clients, but also to the other ecosystems, and other audiences out there. >> Well so that's interesting Joel, because as a potential partner of IBM, they want, obviously your go-to-market, your massive company, and great distribution channel. But at the same time, you want more than that. You know you want to have a closer, IBM always focuses on partnerships that have intrinsic value. So you talked about offerings, you talked about quantum, blockchain, off-camera talking about cloud containers. >> Joel: Yeah. >> I'd say cloud and containers may be a little closer than those others, but those others are going to take a lot of market development. So what are the offerings that you guys are bringing? How do they get into the hands of your partners? >> I mean, the commonality with all of these, all the emerging offerings, if you ask me, is the distributed nature of the offering. So if you look at blockchain, it's a distributed ledger. It's a distributed transaction chain that's secure. If you look at data, really and we can hark back to say, Hadoop, right before object storage, it's distributed storage, so it's not just storing on your hard drive locally, it's storing on a distributed network of servers that are all over the world and data centers. If you look at cloud, and containers, what you're really doing is not running your application on an individual server that can go down. You're using containers because you want to distribute that application over a large network of servers, so that if one server goes down, you're not going to be hosed. And so I think the fundamental shift that you're seeing is this distributed nature, which in essence is cloud. So I think cloud is just kind of a synonym, in my opinion, for distributed nature of our business. 
>> That's interesting and that brings up, you're right, cloud and Big Data/Hadoop, we don't talk about Hadoop much anymore, but it kind of got it all started, with that notion of leave the data where it is. And it's the same thing with cloud. You can't just stuff your business into the public cloud. You got to bring the cloud to your data. >> Joel: That's right. >> But that brings up a whole new set of challenges, which obviously, you're in a position just to help solve. Performance, latency, physics come into play. >> Physics is a rough one. It's kind of hard to avoid that one. >> I hear your best people are working on it though. Some other partnerships that you want to sort of, elucidate. >> Yeah, no, I mean we have some really great, so I think the key kind of partnership, I would say area, that I would allude to is, one of the things, and you kind of referenced this, is a lot of our partners, big or small, want to work with our top clients. So they want to work with our top banking clients. They want, 'cause these are, if you look at for example, Maersk and what we're doing with them around blockchain, and frankly, talk about innovation, they're innovating containers for real, not virtual containers- >> And that's a joint venture right? >> Yeah, it is, and so it's exciting because, what we're bringing to market is, I also lead our startup programs, called the Global Entrepreneurship Program, and so what I'm focused on doing, and you'll probably see more to come this quarter, is how do we actually bridge that end-to-end? How do you, if you're a startup or a small business, ultimately reach that kind of global business partner level? And so kind of bridging that, that end-to-end. So we're starting to bring out a number of different incentives for partners, like co-marketing, so I'll help startups when they're early, figure out product-market fit.
We'll give you free credits to use our innovative technology, and we'll also bring you into a number of clients, to basically help you not burn all of your cash on creating your own marketing channel. God knows I did that when I was at a start-up. So I think we're doing a lot to kind of bridge that end-to-end, and help any partner kind of come in, and then grow with IBM. I think that's where we're headed. >> I think that's a critical part of your job. Because I mean, obviously IBM is known for its Global 2000, big enterprise presence, but startups, again, fuel that innovation fire. So being able to attract them, which you're proving you can, providing whatever it is, access, early access to cloud services, or like you say, these other offerings that you're producing, in addition to that go-to-market, 'cause it's funny, we always talk about how efficient, capital efficient, software is, but then you have these companies raising hundreds of millions of dollars, why? Because they got to do promotion, marketing, sales, you know, go-to-market. >> Yeah, it's really expensive. I mean, you look at most startups, like their biggest ticket item is usually marketing and sales. And building channels, and so yeah, if you're, you know we're talking to a number of partners who want to work with us because of the fact that, it's not just like, the direct kind of channel, it's also, as you kind of mentioned, there's other challenges that you have to overcome when you're working with a larger company. for example, security is a big one, GDPR compliance now, is a big one, and just making sure that things don't fall over, is a big one. And so a lot of partners work with us because ultimately, a number of the decision makers in these larger enterprises are going, well, I trust IBM, and if IBM says you're good, then I believe you. And so that's where we're kind of starting to pull partners in, and pull an ecosystem towards us. 
Because of the fact that we can take them through that level of certification. So we have a number of free online courses. So if you go to partners, excuse me, ibm.com/partners/learn there's a number of blockchain courses that you can learn today, and will actually give you a digital certificate, that's actually certified on our own blockchain, which we're actually a first of a kind to do that, which I think is pretty slick, and it's accredited at some of the universities. So I think that's where people are looking to IBM, and other leaders in this industry, is to help them become experts in their, in this technology, and especially in this emerging technology. >> I love that blockchain actually, because it's such a growing, and interesting, and innovative field. But it needs players like IBM, that can bring credibility, enterprise-grade, whether it's security, or just, as I say, credibility. 'Cause you know, this is, so much of negative connotations associated with blockchain and crypto, but companies like IBM coming to the table, enterprise companies, and building that ecosystem out is in my view, crucial. >> Yeah, no, it takes a village. I mean, there's a lot of folks, I mean that's a big reason why I came to IBM, three, four years ago, was because when I was in start-up land, I used to work for H2O, I worked for Alpine Data Labs, Datameer, back in the Hadoop days, and what I realized was that, it's an opportunity cost. So you can't really drive true global innovation, transformation, in some of these bigger companies because there's only so much that you can really kind of bite off. And so you know at IBM it's been a really rewarding experience because we have done things like for example, we partnered with Girls Who Code, Treehouse, Udacity. So there's a number of early educators that we've partnered with, to bring code to, to bring technology to, that frankly, would never have access to some of this stuff.
Some of this technology, if we didn't form these alliances, and if we didn't join these partnerships. So I'm very excited about the future of IBM, and I'm very excited about the future of what our partners are doing with IBM, because, geez, you know the cloud, and everything that we're doing to make this accessible, is bar none, I mean, it's great. >> I can tell you're excited. You know, spring in your step. Always a lot of energy Joel, really appreciate you coming onto theCUBE. >> Joel: My pleasure. >> Great to see you again. >> Yeah, thanks Dave. >> You're welcome. Alright keep it right there, everybody. We'll be back. We're at the IBM CDO Strategy Summit in San Francisco. You're watching theCUBE. (techno music) (touch-tone phone beeps)

Published Date : May 2 2018


ENTITIES

Entity	Category	Confidence
Joel	PERSON	0.99+
Joel Horwitz	PERSON	0.99+
Europe	LOCATION	0.99+
IBM	ORGANIZATION	0.99+
Kevin Costner	PERSON	0.99+
Dave	PERSON	0.99+
Dinesh Nirmal	PERSON	0.99+
Alpine Data Labs	ORGANIZATION	0.99+
Lightbend	ORGANIZATION	0.99+
Verizon	ORGANIZATION	0.99+
San Francisco	LOCATION	0.99+
Hidden Figures	TITLE	0.99+
Bob Lord	PERSON	0.99+
Both	QUANTITY	0.99+
Maersk	ORGANIZATION	0.99+
both	QUANTITY	0.99+
iPhone 10	COMMERCIAL_ITEM	0.99+
2015	DATE	0.99+
Datameer	ORGANIZATION	0.99+
both sides	QUANTITY	0.99+
one story	QUANTITY	0.99+
Think	ORGANIZATION	0.99+
five	DATE	0.99+
hundreds	QUANTITY	0.99+
Treehouse	ORGANIZATION	0.99+
three years ago	DATE	0.99+
developer.ibm.com/code	OTHER	0.99+
Unity	ORGANIZATION	0.98+
two worlds	QUANTITY	0.98+
Reactive	ORGANIZATION	0.98+
GDPR	TITLE	0.98+
one side	QUANTITY	0.98+
Digital Business Group	ORGANIZATION	0.98+
today	DATE	0.98+
Udacity	ORGANIZATION	0.98+
ibm.com/partners/learn	OTHER	0.98+
last month	DATE	0.98+
Watson Studio	ORGANIZATION	0.98+
each year	QUANTITY	0.97+
three	DATE	0.97+
single platform	QUANTITY	0.97+
Girls Who Code	ORGANIZATION	0.97+
Parc 55	LOCATION	0.97+
one thing	QUANTITY	0.97+
four themes	QUANTITY	0.97+
Spark Technology Center	ORGANIZATION	0.97+
six years ago	DATE	0.97+
H2O	ORGANIZATION	0.97+
four years ago	DATE	0.97+
martech	ORGANIZATION	0.97+
Unity	TITLE	0.96+
hundreds of millions of dollars	QUANTITY	0.94+
Watson Studio	TITLE	0.94+
Dinesh	PERSON	0.93+
one server	QUANTITY	0.93+

Manish Goyal, IBM - IBM Fast Track Your Data 2017

>> Announcer: Live from Munich, Germany, it's theCUBE. Covering IBM, Fast Track Your Data, brought to you by IBM. >> We're back in Munich, Germany, this is Fast Track Your Data and this is theCUBE, the leader in live tech coverage, we go out to the events. We extract a signal from the noise. My name is Dave Vellante and I'm here with my co-host Jim Kobielus. We just came off of the main stage. IBM had a very choreographed, really beautiful, Kate Silverton was there of BBC fame talking to various folks within the IBM community. IBM executives, practitioners, and quite a main stage production Jim. IBM always knows how to do it right. Manish Goyal here, he's the Director of Product Management for the Watson Data Platform. Something we covered on theCUBE extensively, that announcement last year in New York City. Manish welcome to theCUBE. >> Thank you for having me. >> Dave: So this is, it really was your signature moment back in last fall at Strata in New York City. We covered that, big announcement, lot of customers there. You guys demonstrated sort of the next generation of platform that you guys are announcing. >> Manish: That's right. >> So take us, bring us up to date. How's it going, where are we at, and what are you guys doing here? >> So, again thank you for having me. >> Dave: You're welcome. >> Let me take a minute to just let all the viewers know what the Watson Data Platform is. So the Watson Data Platform is our cloud analytics platform, and it's really three things. It's a set of composable data services, for ingest, analyze, process. It's a set of tailor-made experiences for the different personas. Whether you are a data engineer, a business analyst, data scientist, or the steward. And connecting all of these, both of these, is a data fabric, which is really the secret sauce.
And think of this as being the governance layer that ensures that everything that we're doing, that everything that is being done by any of these personas is working on trusted data, and that the insights that are being generated can be trusted by the risk folks, the business folks, as they put the analytics into production. >> Dave: So just to review for our audience, there are a number of components to the Watson Data Platform. >> That's right, yep. >> Dave: There's the governance components you mentioned, there's the visualization, there's analytics. Now, many people criticized Watson Data Platform, they said oh it's just IBM putting a bunch of disparate products together, some acquisitions and then wrapping some services around it. When we talked to you guys in October, you said no, no, that's not the case. But can you affirm that? >> That is exactly right, that is not the case. It's not just us putting stuff together and calling it a new name, and think oh that's the platform, just a set of disparate services. That is absolutely not, and that's why I was emphasizing this common data fabric, right. I've got a couple of, let me sort of dive a little bit deeper into it. >> Sure, great. >> Manish: So the biggest problem that customers and data users in general complain about is, extremely hard to find data, right. The tools that they're working with are all siloed. So even if, you know, you and I are working on, you know our analytics projects, very hard for me to share what I'm working on with you, the environment that I am running on with you, et cetera. And this... The third piece is, a real issue with is the data that I'm working with trusted? Like can I actually believe that this is the best data that I can use, so that when I put something into production when I create my machine learning models I put them into my production environment. The risk guys are going to be fine with it, I'm going to be fine with it, I see the results that I'm getting.
And so, getting this data fabric which is addressing these issues. One, it's addressing it first and foremost with a data catalog, a governance layer. So that it's very clear, irrespective, whether you're a data engineer, business analyst, data scientist or the data steward, from the CDO's office, you're all working off the same version of the truth, right. >> Jim: Manish is that something a DevOps platform, is it like DevOps for data science or for machine learning development or is it... How would you describe... Does that make sense? The automated release pipeline that's-- >> Manish: In a way yes. >> With the governance baked in? >> Yes, in a way that's one way to describe it. So that's one aspect right? Making sure that you're working with the trusted data, making it very easy to find the data, so that's sort of the governance aspect. The second piece that sort of really makes this a platform is that you're working off the same notion of a workspace, we call it a project. So, you may start out as a data engineer being asked yourself, take all these different data sources that are coming in and create and publish a data set that can be consumed for dashboarding, for data analysis whatever. And you're working on that in a project, now if you have a data science team that needs to be working on the same thing, you can just invite them to the same project. So they're working on the same thing, similarly to a business analyst, et cetera. And all of these results, and when we talk about governance it's not about just data sets, it's all analytical products. So it is the model that you're creating are being put back into the catalog and governed. Data flows-- >> It's model governance. >> Jim: Model governance, it's model governance? >> Exactly. >> And aiding governance. >> Manish: So it's a huge problem that customers have. 
I was just talking to a large insurance company yesterday, and their question was, "What are you doing to make sure that I don't have to spend an enormous amount of time that I have to with the risk group, before I can put a model into production." Because they want complete lineage all the way back, saying "Okay you created this model, you're going to put it into production, whether it's for allowing credit card insurance, whatever your product is that you're selling. How do you make sure that there's no bias in the model that is created, can you show me the data set on which you trained it? And then when you re-trained it can you show me that data set?" So in case they're audited, that there's a complete way to go back from the production model all the way back to the data set that was created. And which goes even further back from all the different data sources. Where it was cleansed, et cetera, the ETL, where it was published, and then picked up by data science team. So all of these things, putting it together with this data fabric. Governance being a huge, huge portion of that that goes across everything that we're doing. Giving these tailor-made experiences for the different business personas, oh sorry, the data personas, and just making it extremely simple for generating insights that can be trusted. So that is what we are trying to do with the Watson Data Platform. As, since last fall when we announced it, we have had a huge update on our data science experience, you heard a lot about that in the presentation this morning. As well as, all of our other cloud data services and the governance put forth.
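(Editor's sketch, not part of the interview.) The lineage chain described above, from a production model back to its training data set, the cleansed data, and the original sources, can be pictured as a small graph walk. The Python below is only an illustration under stated assumptions: the `Asset` class, the `lineage` function, and every asset name are hypothetical and do not represent the Watson Data Platform catalog or any IBM API.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """One governed item in a hypothetical catalog."""
    name: str
    kind: str                  # e.g. "source", "dataset", "model"
    derived_from: tuple = ()   # upstream Asset objects this one was built from


def lineage(asset):
    """Walk upstream breadth-first and return ancestor names, nearest first."""
    seen = set()
    ordered = []
    queue = deque(asset.derived_from)
    while queue:
        parent = queue.popleft()
        if parent.name in seen:
            continue           # an asset reachable by two paths is listed once
        seen.add(parent.name)
        ordered.append(parent.name)
        queue.extend(parent.derived_from)
    return ordered


# Hypothetical assets: two raw sources, a cleansed data set (the ETL output),
# a training set published for the data science team, and the trained model.
claims_db = Asset("claims_db", "source")
crm_export = Asset("crm_export", "source")
cleansed = Asset("cleansed_claims", "dataset", (claims_db, crm_export))
training_v1 = Asset("training_set_v1", "dataset", (cleansed,))
model_v1 = Asset("risk_model_v1", "model", (training_v1,))

print(lineage(model_v1))
```

A retrained model would simply be another `Asset` pointing at its own training set, so an audit can ask the same question of each model version.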
Now part of the reason it went nowhere was because it was early days, but also there wasn't the analytics solution underneath it. But a lot of people questioned, "Well do we really need to collaborate across those personas?" Again maybe they were immature at the time. So convince me that there's a need for that and that this is actually getting used in the world. >> There was an example, probably you've always seen the Venn diagram for a data scientist, right? With all the different skills that they need, they are a unicorn, and there are no unicorns. It's extremely hard for our customers, in fact just finding really good data scientists is extremely hard. It's a very limited supply of that talent. So that's one thing right. So you can't find enough of these folks to scale out the level of analytics that is needed, if you want to use data for a competitive advantage. So that's one aspect right, of talent being a huge issue. The second aspect of it is you really do need specialized skill in data engineering. You don't want your PhD data scientist spending 60% of their time finding and cleansing data. You have folks who really do that well and you want to enable them to work closely with the data science team. And you really do need business analysts who are the key to sort of understanding the business problem that needs to be solved, because that's where you always want to start any analytics product. What is it that you're trying to improve, or reduce cost on, or whatever your problem is that you're addressing. And so you really need, it is a team sport. You can't just do it without. Now if it is a team sport, how are these folks going to collaborate, right? And that is why, in all of our interactions with our customers and their data science teams.
They absolutely love the collaboration features that we have put in, and we have put in a lot of effort in data science experience and the same collaboration features are actually going to extend across the portfolio of these experiences on the data platform. >> And the whole notion of personas is so fundamental to Watson Data Platform. And I'm wondering, is IBM evolving the range and variety of personas for which you're providing these experiences? And what I mean by that is, examples, we see more and more data science application development projects focusing on for example, chat bots. That involves human conversation, you need a bit more, possibly a persona, a computational linguist. Or cognitive IoT, like Watson, you know IoT, that's sensors, that's hardware devices maybe hardware engineers, hardware engineering experiences. You see what I'm getting at is that data science centric projects are increasingly moving from the totally virtual world, to being very much embedding in the physical world and the world of human guided, machine learning guided conversation. What are your thoughts about evolving the personas mix? >> So application, application developers, or the persona I actually missed when I was talking about this before, it's absolutely central because almost anything that the data science team is doing is going to create, at the end of the day, sort of create models. But the hope is that it's going to put into production system. And that job typically is the role of an application developer. Now, Jim you mentioned sort of, there's a lot of emphasis these days on conversational chat bots. And again, at the end of the day with data science projects you are in many ways, trying to improve the experience that you're giving your customers. Or personalizing the experience that you're giving your customers. A celebrity experience that Rob talked about this morning. 
And there are other personas involved in that sense, so to get a chat bot right, I mean there is data that you can obviously harvest and use to create that flow and intelligence in a chat bot. But there are elements where you do need a subject matter expert to curate that. To make sure that it doesn't seem robotic, that it does feel genuine. And so there is a role for a subject matter expert, we sort of collaborate with a business analyst role, or persona. But yes, all of these roles play an important part in sort of putting together the entire package. It just feels seamless, and that's why I sort of come back to saying that it is a team sport and if you do not enable the teams to work closely together, and enhance their productivity, you can't go after all the data that's being generated and all the opportunity that data is presenting. And the prize is to gain a competitive advantage. >> Dave: One of the things Manish, you demonstrated last fall was this sort of, it was sort of a recommendation engine and very personalized. And it was quite a nice demo and it wasn't a fake demo from what I understood, it was real data. Can you share with us in the time we have remaining, just some of your favorite examples of how people are applying the Watson Data Platform and affecting business? >> Manish: Sure yeah so, I'll tell you a couple of examples. So I was actually in London earlier this week, meeting with a customer and they are using DSX, our Data Science Experience, with a couple of utility companies. One is a water company, water utility company. And the problem that they're trying to solve is, they're supplying water in a hilly area and they want to optimize the power that they use to power the pumps to pump out water. Because it can be very expensive if the pumps are running all the time, et cetera. 
And so they're using Data Science Experience to optimize when, how, and how long the pumps need to run to ensure that the customers are happy with the level of water supply that they're getting and the force that they're getting it with. While the utility company is optimizing the expense in actually powering these things. So that's just a recent example that comes to mind. There are others, there's a huge logistics and transportation company that's using Data Science Experience to optimize the refrigeration of the storage units that are going all across the globe for transporting sort of food and other articles like that. How they can optimize the temperature of the goods that they're transporting, again to make sure that there's absolutely the minimum amount of wastage that occurs in the transportation process. But at the same time optimizing the cost that they incur, because all of that sort of shows up in the end product that you and I buy from retailers. >> Dave: And is there instrumentation in the field involved in that? Is that kind of a semi-IoT example? >> Absolutely, right, so in this case, actually both of these cases, in one case there are smart meters that are throwing out data every 15 minutes. In the other example of the logistics one, it is data that is almost streaming coming in. So in one case you can use batch processing, even though it's coming in at 15-minute intervals, to predict out what you want to do. In the other case it's streaming data, which you want to analyze as it streams. >> Excellent, alright well exciting times here for you and your group. >> Absolutely. >> Dave: Congratulations on getting the product out and getting it adopted. >> Thank you. >> Glad to see that. And thanks for coming on theCUBE. >> Manish: Thank you. Thanks for having me. >> Alright! >> Dave: Keep it right there everybody. Jim and I will be back, we're live from Munich, Germany, unscripted, bringing theCUBE to you. Bringing Fast Track Your Data. 
We'll be right back. (techno music)
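The batch-versus-streaming distinction Manish draws for the two customers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the readings, the window size, and the names `batch_average` and `StreamingAverage` are hypothetical and are not taken from Data Science Experience or from either customer's system.

```python
from collections import deque
from statistics import mean


def batch_average(readings):
    """Batch style: collect all readings for a period, then summarize once."""
    return mean(readings)


class StreamingAverage:
    """Streaming style: update a rolling average as each reading arrives."""

    def __init__(self, window):
        # Keep only the most recent `window` readings.
        self.values = deque(maxlen=window)

    def update(self, reading):
        self.values.append(reading)
        return mean(self.values)


# Hypothetical smart-meter readings, one per 15-minute interval.
meter_readings = [10.0, 12.0, 14.0, 16.0]

# Batch: one answer after the full hour of data has landed.
hourly = batch_average(meter_readings)

# Streaming: a fresh answer after every reading.
stream = StreamingAverage(window=2)
rolling = [stream.update(r) for r in meter_readings]

print(hourly)
print(rolling)
```

The batch function only produces a result once the interval's data is complete, which suits the 15-minute meter case; the streaming class produces a result on every arrival, which is closer to what the refrigeration example would need.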

Published Date : Jun 24 2017


ENTITIES

Entity	Category	Confidence
Jim Kobielus	PERSON	0.99+
Jim	PERSON	0.99+
Kate Silverton	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Dave	PERSON	0.99+
London	LOCATION	0.99+
IBM	ORGANIZATION	0.99+
Munich	LOCATION	0.99+
Rob	PERSON	0.99+
Manish	PERSON	0.99+
October	DATE	0.99+
third piece	QUANTITY	0.99+
both	QUANTITY	0.99+
yesterday	DATE	0.99+
New York City	LOCATION	0.99+
60%	QUANTITY	0.99+
Manish Goyal	PERSON	0.99+
last year	DATE	0.99+
last fall	DATE	0.99+
15 minute	QUANTITY	0.99+
second piece	QUANTITY	0.99+
Pivotal	ORGANIZATION	0.99+
one aspect	QUANTITY	0.99+
second aspect	QUANTITY	0.99+
one case	QUANTITY	0.99+
Munich, Germany	LOCATION	0.98+
2017	DATE	0.98+
earlier this week	DATE	0.98+
several years ago	DATE	0.98+
BBC Fame	ORGANIZATION	0.96+
DevOps	TITLE	0.96+
one thing	QUANTITY	0.96+
three things	QUANTITY	0.96+
first	QUANTITY	0.96+
One	QUANTITY	0.94+
this morning	DATE	0.93+
few years ago	DATE	0.93+
Strata	ORGANIZATION	0.92+
Germany	LOCATION	0.9+
one way	QUANTITY	0.9+
Watson Data Platform	ORGANIZATION	0.83+
Watson Data Platform	TITLE	0.81+
DSX	ORGANIZATION	0.8+
every 15 minutes	QUANTITY	0.76+
Chorus	ORGANIZATION	0.75+
theCUBE	ORGANIZATION	0.73+
Data Platform	TITLE	0.68+
CDO	ORGANIZATION	0.62+
Watson	TITLE	0.58+
couple	QUANTITY	0.55+
Watson	ORGANIZATION	0.49+

Rob Thomas, IBM Analytics | IBM Fast Track Your Data 2017

>> Announcer: Live from Munich, Germany, it's theCUBE. Covering IBM: Fast Track Your Data. Brought to you by IBM. >> Welcome, everybody, to Munich, Germany. This is Fast Track Your Data brought to you by IBM, and this is theCUBE, the leader in live tech coverage. We go out to the events, we extract the signal from the noise. My name is Dave Vellante, and I'm here with my co-host Jim Kobielus. Rob Thomas is here, he's the General Manager of IBM Analytics, and longtime CUBE guest, good to see you again, Rob. >> Hey, great to see you. Thanks for being here. >> Dave: You're welcome, thanks for having us. So we're talking about, we missed each other last week at the Hortonworks DataWorks Summit, but you came on theCUBE, you guys had the big announcement there. You're sort of getting out, doing a Hadoop distribution, right? TheCUBE gave up our Hadoop distributions several years ago so. It's good that you joined us. But, um, that's tongue-in-cheek. Talk about what's going on with Hortonworks. You guys are now going to be partnering with them essentially to replace BigInsights, you're going to continue to service those customers. But there's more than that. What's that announcement all about? >> We're really excited about that announcement, that relationship, just to kind of recap for those that didn't see it last week. We are making a huge partnership with Hortonworks, where we're bringing data science and machine learning to the Hadoop community. So IBM will be adopting HDP as our distribution, and that's what we will drive into the market from a Hadoop perspective. Hortonworks is adopting IBM Data Science Experience and IBM machine learning to be a core part of their Hadoop platform. And I'd say this is a recognition. One is, companies should do what they do best. We think we're great at data science and machine learning. Hortonworks is the best at Hadoop. Combine those two things, it'll be great for clients. 
And, we also talked about extending that to things like Big SQL, where they're partnering with us on Big SQL, around modernizing data environments. And then third, which relates a little bit to what we're here in Munich talking about, is governance, where we're partnering closely with them around unified governance, Apache Atlas, advancing Atlas in the enterprise. And so, it's a lot of dimensions to the relationship, but I can tell you since I was on theCUBE a week ago with Rob Bearden, client response has been amazing. Rob and I have done a number of client visits together, and clients see the value of unlocking insights in their Hadoop data, and they love this, which is great. >> Now, I mean, the Hadoop distro, I mean early on you got into that business, just, you had to do it. You had to be relevant, you want to be part of the community, and a number of folks did that. But it's really sort of best left to a few guys who want to do that, and Apache open source is really, I think, the way to go there. Let's talk about Munich. You guys chose this venue. There's a lot of talk about GDPR, you've got some announcements around unified governance, but why Munich? >> So, there's something interesting that I see happening in the market. So first of all, you look at the last five years. There's only 10 companies in the world that have outperformed the S&P 500, in each of those five years. And we started digging into who those companies are and what they do. They are all applying data science and machine learning at scale to drive their business. And so, something's happening in the market. That's what leaders are doing. And I look at what's happening in Europe, and I say, I don't see the European market being that aggressive yet around data science, machine learning, how you apply data for competitive advantage, so we wanted to come do this in Munich. And it's a bit of a wake-up call, almost, to say hey, this is what's happening. 
We want to encourage clients across Europe to think about how do they start to do something now. >> Yeah, of course, GDPR is also a hook. The European Union and you guys have made some talk about that, you've got some keynotes today, and some breakout sessions that are discussing that, but talk about the two announcements that you guys made. There's one on DB2, there's another one around unified governance, what do those mean for clients? >> Yeah, sure, so first of all on GDPR, it's interesting to me, it's kind of the inverse of Y2K, which is there's very little hype, but there's huge ramifications. And Y2K was kind of the opposite. So look, it's coming, May 2018, clients have to be GDPR-compliant. And there's a misconception in the market that that only impacts companies in Europe. It actually impacts any company that does any type of business in Europe. So, it impacts everybody. So we are announcing a platform for unified governance that makes sure clients are GDPR-compliant. We've integrated software technology across analytics, IBM security, some of the assets from the Promontory acquisition that IBM did last year, and we are delivering the only platform for unified governance. And that's what clients need to be GDPR-compliant. The second piece is data has to become a lot simpler. As you think about my comment, who's leading the market today? Data's hard, and so we're trying to make data dramatically simpler. And so for example, with DB2, what we're announcing is you can download and get started using DB2 in 15 minutes or less, and anybody can do it. Even you can do it, Dave, which is amazing. >> Dave: (laughs) >> For the first time ever, you can-- >> We'll test that, Rob. >> Let's go test that. I would love to see you do it, because I guarantee you can. Even my son can do it. I had my son do it this weekend before I came here, because I wanted to see how simple it was. 
So that announcement is really about bringing, or introducing a new era of simplicity to data and analytics. We call it Download And Go. We started with SPSS, we did that back in March. Now we're bringing Download And Go to DB2, and to our governance catalog. So the idea is make data really simple for enterprises. >> You had a community edition previous to this, correct? There was-- >> Rob: We did, but it wasn't this easy. >> Wasn't this simple, okay. >> Not anybody could do it, and I want to make it so anybody can do it. >> Is simplicity, the rate of simplicity, the only differentiator of the latest edition, or I believe you have Kubernetes support now with this new addition, can you describe what that involves? >> Yeah, sure, so there's two main things that are new functionally-wise, Jim, to your point. So one is, look, we're big supporters of Kubernetes. And as we are helping clients build out private clouds, the best answer for that in our mind is Kubernetes, and so when we released Data Science Experience for Private Cloud earlier this quarter, that was on Kubernetes, extending that now to other parts of the portfolio. The other thing we're doing with DB2 is we're extending JSON support for DB2. So think of it as, you're working in a relational environment, now just through SQL you can integrate with non-relational environments, JSON, documents, any type of no-SQL environment. So we're finally bringing to fruition this idea of a data fabric, which is I can access all my data from a single interface, and that's pretty powerful for clients. >> Yeah, more cloud data development. Rob, I wonder if you can, we can go back to the machine learning, one of the core focuses of this particular event and the announcements you're making. Back in the fall, IBM made an announcement of Watson machine learning, for IBM Cloud, and World of Watson. In February, you made an announcement of IBM machine learning for the z platform. 
What are the machine learning announcements at this particular event, and can you sort of connect the dots in terms of where you're going, in terms of what sort of innovations are you driving into your machine learning portfolio going forward? >> I have a fundamental belief that machine learning is best when it's brought to the data. So, we started with, like you said, Watson machine learning on IBM Cloud, and then we said well, what's the next big corpus of data in the world? That's an easy answer, it's the mainframe, that's where all the world's transactional data sits, so we did that. Last week with the Hortonworks announcement, we said we're bringing machine learning to Hadoop, so we've kind of covered all the landscape of where data is. Now, the next step is about how do we bring a community into this? And the way that you do that is we don't dictate a language, we don't dictate a framework. So if you want to work with IBM on machine learning, or in Data Science Experience, you choose your language. Python, great. Scala or Java, you pick whatever language you want. You pick whatever machine learning framework you want, we're not trying to dictate that because there's different preferences in the market, so what we're really talking about here this week in Munich is this idea of an open platform for data science and machine learning. And we think that is going to bring a lot of people to the table. >> And with open, one thing, with open platform in mind, one thing to me that is conspicuously missing from the announcement today, correct me if I'm wrong, is any indication that you're bringing support for the deep learning frameworks like TensorFlow into this overall machine learning environment. Am I wrong? I know you have Power AI. Is there a piece of Power AI in these announcements today? >> So, stay tuned on that. We are, it takes some time to do that right, and we are doing that. 
But we want to optimize so that you can do machine learning with GPU acceleration on Power AI, so stay tuned on that one. But we are supporting multiple frameworks, so if you want to use TensorFlow, that's great. If you want to use Caffe, that's great. If you want to use Theano, that's great. That is our approach here. We're going to allow you to decide what's the best framework for you. >> So as you look forward, maybe it's a question for you, Jim, but Rob I'd love you to chime in. What does that mean for businesses? I mean, is it just more automation, more capabilities as you evolve that timeline, without divulging any sort of secrets? What do you think, Jim? Or do you want me to ask-- >> What do I think, what do I think you're doing? >> No, you ask about deep learning, like, okay, that's, I don't see that, Rob says okay, stay tuned. What does it mean for a business, that, if like-- >> Yeah. >> If I'm planning my roadmap, what does that mean for me in terms of how I should think about the capabilities going forward? >> Yeah, well what it means for a business, first of all, is what they're going, they're using deep learning for, is doing things like video analytics, and speech analytics and more of the challenges involving convolution of neural networks to do pattern recognition on complex data objects for things like connected cars, and so forth. Those are the kind of things that can be done with deep learning. >> Okay. And so, Rob, you're talking about here in Europe how the uptick in some of the data orientation has been a little bit slower, so I presume from your standpoint you don't want to over-rotate, to some of these things. But what do you think, I mean, it sounds like there is difference between certainly Europe and those top 10 companies in the S&P, outperforming the S&P 500. What's the barrier, is it just an understanding of how to take advantage of data, is it cultural, what's your sense of this? 
>> So, to some extent, data science is easy, data culture is really hard. And so I do think that culture's a big piece of it. And the reason we're kind of starting with a focus on machine learning, simplistic view, machine learning is a general-purpose framework. And so it invites a lot of experimentation, a lot of engagement, we're trying to make it easier for people to on-board. As you get to things like deep learning as Jim's describing, that's where the market's going, there's no question. Those tend to be very domain-specific, vertical-type use cases and to some extent, what I see clients struggle with, they say well, I don't know what my use case is. So we're saying, look, okay, start with the basics. A general purpose framework, do some tests, do some iteration, do some experiments, and once you find out what's hunting and what's working, then you can go to a deep learning type of approach. And so I think you'll see an evolution towards that over time, it's not either-or. It's more of a question of sequencing. >> One of the things we've talked to you about on theCUBE in the past, you and others, is that IBM obviously is a big services business. This big data is complicated, but great for services, but one of the challenges that IBM and other companies have had is how do you take that service expertise, codify it to software and scale it at large volumes and make it adoptable? I thought the Watson data platform announcement last fall, I think at the time you called it Data Works, and then so the name evolved, was really a strong attempt to do that, to package a lot of expertise that you guys had developed over the years, maybe even some different software modules, but bring them together in a scalable software package. So is that the right interpretation, how's that going, what's the uptake been like? >> So, it's going incredibly well. 
What's interesting to me is what everybody remembers from that announcement is the Watson Data Platform, which is a decomposable framework for doing these types of use cases on the IBM cloud. But there was another piece of that announcement that is just as critical, which is we introduced something called the Data First method. And that is the recipe book to say to a client, so given where you are, how do you get to this future on the cloud? And that's the part that people, clients, struggle with, is how do I get from step to step? So with Data First, we said, well look. There's different approaches to this. You can start with governance, you can start with data science, you can start with data management, you can start with visualization, there's different entry points. You figure out the right one for you, and then we help clients through that. And we've made Data First method available to all of our business partners so they can go do that. We work closely with our own consulting business on that, GBS. But that to me is actually the thing from that event that has had, I'd say, the biggest impact on the market, is just helping clients map out an approach, a methodology, to getting on this journey. >> So that was a catalyst, so this is not a sequential process, you can start, you can enter, like you said, wherever you want, and then pick up the other pieces from a maturity model standpoint? >> Exactly, because everybody is at a different place in their own life cycle, and so we want to make that flexible. >> I have a question about the clients, the customers' use of Watson Data Platform in a DevOps context. So, are more of your customers looking to use Watson Data Platform to automate more of the stages of the machine learning development and the training and deployment pipeline, and do you see, IBM, do you see yourself taking the platform and evolving it into a more full-fledged automated data science release pipelining tool? Or am I misunderstanding that? 
>> Rob: No, I think that-- >> Your strategy. >> Rob: You got it right, I would just, I would expand a little bit. So, one is it's a very flexible way to manage data. When you look at the Watson Data Platform, we've got relational stores, we've got column stores, we've got in-memory stores, we've got the whole suite of open-source databases under the Compose.io umbrella, we've got Cloudant. So we've delivered a very flexible data layer. Now, in terms of how you apply data science, we say, again, choose your model, choose your language, choose your framework, that's up to you, and we allow clients, many clients start by building models on their private cloud, then we say you can deploy those into the Watson Data Platform, so therefore then they're running on the data that you have as part of that data fabric. So, we're continuing to deliver a very fluid data layer which then you can apply data science, apply machine learning there, and there's a lot of data moving into the Watson Data Platform because clients see that flexibility. >> All right, Rob, we're out of time, but I want to kind of set up the day. We're doing CUBE interviews all morning here, and then we cut over to the main tent. You can get all of this on IBMgo.com, you'll see the schedule. Rob, you've got, you're kicking off a session. We've got Hilary Mason, we've got a breakout session on GDPR, maybe set up the main tent for us. >> Yeah, main tent's going to be exciting. We're going to debunk a lot of misconceptions about data and about what's happening. Marc Altshuller has got a great segment on what he calls the death of correlations, so we've got some pretty engaging stuff. Hilary's got a great piece that she was talking to me about this morning. It's going to be interesting. We think it's going to provoke some thought and ultimately provoke action, and that's the intent of this week. >> Excellent, well Rob, thanks again for coming to theCUBE. It's always a pleasure to see you. 
>> Rob: Thanks, guys, great to see you. >> You're welcome; all right, keep it right there, buddy. We'll be back with our next guest. This is theCUBE, we're live from Munich, Fast Track Your Data, right back. (upbeat electronic music)

Published Date : Jun 22 2017


ENTITIES

Entity	Category	Confidence
Jim Kobielus	PERSON	0.99+
Dave Vellante	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Jim	PERSON	0.99+
Europe	LOCATION	0.99+
Rob	PERSON	0.99+
Marc Altshuller	PERSON	0.99+
Hilary	PERSON	0.99+
Hilary Mason	PERSON	0.99+
Rob Bearden	PERSON	0.99+
February	DATE	0.99+
Dave	PERSON	0.99+
Hortonworks	ORGANIZATION	0.99+
Rob Thomas	PERSON	0.99+
May 2018	DATE	0.99+
March	DATE	0.99+
Munich	LOCATION	0.99+
Scala	TITLE	0.99+
Apache	ORGANIZATION	0.99+
second piece	QUANTITY	0.99+
Last week	DATE	0.99+
Java	TITLE	0.99+
last year	DATE	0.99+
two announcements	QUANTITY	0.99+
10 companies	QUANTITY	0.99+
GDPR	TITLE	0.99+
Python	TITLE	0.99+
DB2	TITLE	0.99+
15 minutes	QUANTITY	0.99+
last week	DATE	0.99+
IBM Analytics	ORGANIZATION	0.99+
European Union	ORGANIZATION	0.99+
five years	QUANTITY	0.99+
JSON	TITLE	0.99+
Watson Data Platform	TITLE	0.99+
third	QUANTITY	0.99+
One	QUANTITY	0.99+
this week	DATE	0.98+
today	DATE	0.98+
a week ago	DATE	0.98+
two things	QUANTITY	0.98+
SQL	TITLE	0.98+
last fall	DATE	0.98+
2017	DATE	0.98+
Munich, Germany	LOCATION	0.98+
each	QUANTITY	0.98+
Y2K	ORGANIZATION	0.98+

Derek Schoettle & Adam Kocoloski, IBM - IBM Interconnect 2017 - #ibminterconnect - #theCUBE

>> Narrator: Live from Las Vegas! It's the Cube covering Interconnect 2017, brought to you by IBM. >> Okay, welcome back everyone. We are live in Las Vegas at IBM Interconnect 2017, IBM's cloud and now data show. I'm John Furrier with my co-host Dave Vellante. This is the Cube. Our next guest is Derek Schoettle, the general manager of Watson Data Platform, and Adam Kocoloski who's the CTO of the Watson Data Platform. Guys, welcome to the Cube. Good to see you again Derek. Great to see you, welcome Adam! >> Thanks, John. >> So, obviously the data was a big part of the theme. You saw Chris Moody from Twitter up there, obviously, they have a ton of data. I like to joke about they have a really active user right now in the President of the United States. >> Daily State of the Union, I think, was the one take away. >> Daily State of the Union. But this is the conversation that's happening in all over IT, and enterprise, and cloud, both public and enterprise, is the data conversation in context to cloud. Super relevant right now, and there's architecturals at play, it's app, it impacts app developers, it impacts architectures. And that's the Holy Grail, the so-called app data layer or cloud data layer. What's your vision, guys, on this? Derek, I'll start with you, your vision on this data opportunity. How does IBM approach it? And what's different from, or could be different from the competitors? >> Yeah, I know, one, it's an exciting time. We were just chatting about before we went live is, there's so much change taking place in and around data, right? It used to be it's the natural currency, it's everything everyone is talking about. The reality is, it's changing business models, right? It introduces a whole new set of discussions when you introduce cloud, self-service and open source. 
So, when we step back and think about how we can differentiate, how we can make IBM's offer to clients and the broader market interesting, is shift to a platform strategy where it says, we have instead of discrete composable services that act independent of one another that are not, I'll say, self-aware, shift into a platform where you have common governance, you have common management, and you have really a collaborative by design approach where data is at the epicenter. Data is what starts every conversation whether you're on the app dev side, whether you are a data scientist, someone who's, you know, at the edge of discovery. And cloud's what's enabling that, self-service is what's enabling that and operationalize is what we do. I mean, we spend our days thinking about and then operationalizing feature, function, and then performance for a lot of different workloads. 'Cause it used to be, I think the, I was at Vertica, right? So that was the introduction of volume, variety, and velocity, right? Now, with the introduction of AI and cognitive, it's really about taking any and all and rationalizing it. And any and all meaning sitting within your corporate structure, as well as what's more broadly in the internet, out available within social media, right? That to me is the shift that's taking place. It's all companies are realizing they made a lot of investments, they have a lot of data, and they're not taking advantage of it. And we see that the big shift is... People are saying data scientist, what we think about is the merging of data and science. You think of science as cognitive and AI, right? That's a small population that really understands and can take advantage of. You have a whole big market that's out there in traditional data and analytics. Our platform is about merging those two. It's really about merging those experiences so everyone takes advantage of the benefits of data and science. 
>> What's the conversations that you are having, Derek, with customers? Because I think that's, there's a lot of bells going off into the CXO or even practitioners when you hear about machine learning, you hear AI, cognitive, autonomous vehicles, sensor networks. Obviously that's, the alarms are going off, like, I'd better get my act together. So, how do they pull that off? How do your customers pull off making that happen? Because now you got to bring in to be cloud ready, you have all these decoupled component parts. >> Yeah. >> John: You got to operate them in the cloud and you got to kind of have an on-prem component that's hybrid. What are the conversations that you are having with customers in how they're pulling this off? >> Yeah so, I'll cover the first piece, and I know Adam is spending certainly this week and a lot of time as well with clients on this topic. You know, the first part of the discussion is do you believe that the cloud can help you? Most folks are saying, "Yes, we believe it can help". Second piece is, how do I take advantage of emerging technologies that are moving at a rate and pace that perhaps my skills, my existing IT architecture, and my business model can't fully kind of, grasp, if not take advantage of? So, what we've introduced is a methodology, a data first method, which literally is a, it sounds simple, but at the end of the day, it is a common, uniform, agile way for us as IBM to engage with partners and clients that literally starts with the discovery workshop that says how does data inform your business? It's not static reporting anymore, it's what is the data that's sitting within your organization? You heard it from James at PlayFab. Data is changing the way people build in games today, thinking about how to enrich games, so on and so forth. Data First Method is what we've introduced, so you'll see going forward, IBM will sell Data First, we will engage Data First. 
So, any conversation with someone who says, "How do I take advantage of AI, or machine learning, or data science experience?" Well, let's step back for a second and talk about data. 'Cause 30 years ago, 20, that's how every conversation started. You get on a whiteboard, you design a schema, you talk about the relationships. That's how it started, and we're kind of cycling back to that, right? We got to put data first. >> So, Adam, the geeks are always arguing speeds, "I got a Hadoop cluster here, I got this over here." I mean, there's a lot of variety and diversity in terms of how people can manage either databases, and middleware or what not, right? So, how do you see the data first? How does it play out architecturally? And how does that play out for the solution? >> I think one of the big advantages we have in the world of the cloud platform is this opportunity to, on the one hand, use more a broader variety of composable services, but also be able to take different parts of the business that were historically a little bit more separated from one another and bring them together. So you look at a Hadoop-flavored data lake on premises. It's a good area to do discovery, a good area to do exploration. But what clients really care about time and time again, a common refrain is the operationalization of the analytics, of the machine learning models. How do I take this insight that my data science team has discovered, and have it really influence a business process or incorporate it into an application? And in the on-premises architecture, that's often times quite a challenge. In the world of the cloud platform and the Watson data platform, we have an opportunity to be a little bit closer to things like the world of Kubernetes which are really ideally suited for deploying and scaling microservices and APIs in a cloud-native, fault-tolerant, reliable fashion, right? 
So, you're seeing us take that menu of composable services in the cloud platform, and treat the data platform as one such composition. An opinionated way to put together this menu of services specifically to help data professionals collaborate, and drive the business forward. >> So, when you guys announced the Watson Data Platform, I think you called it Data Works, then changed the name, about five, maybe six months ago you messaged that 80% of, you know, data professionals' time is spent wrangling data, not enough time doing the fun stuff. And the premise was you coming up with a platform for collaboration that sort of integrates those different roles as well as, as you pointed out just now, allows you to operationalize analytics. Okay, so we're five months in, six months in, what kind of proof points do you have? Have you seen it? I mean, some people were skeptical saying, "Okay, well, it's IBM, they've put a nice wrapper on this thing, pulling in some different legacy components, and you know, nice name." Okay, so, what do you say to that? And what evidence do you have that what you said is going to come true is actually coming true? >> You're going to do tech and I can do customer? >> Yeah, go for customer first. >> Yeah, so what we've seen is if you think about why we ended up at a platform. So, if you roll the tape back to when Cloudant got acquired in 2014, the journey that we were on was everyone was building rich applications, they wanted to be smarter, they wanted to understand what that exhaust was coming off. >> Right. >> Derek: And they wanted to add different ingredients to it. So, instead of a do-it-yourself kit that is a bunch of proprietary interoperability issues that's a ton of expense and inefficiency, and can't take advantage of the cloud, we decided, in very much of then our path towards, let's build a platform that allows you to easily ingest, govern, curate, and then, I'll say present and deploy. 
So, starting in actually June, and this started first with Spark. We made a huge bet on Spark 'cause we believed that to be kind of the operational operating system, if you will, for an analytic fabric. So, it started in Spark. Then, when we announced the Watson Data Platform in October it was, here's how we're going to take our heritage around governance, our heritage around traditional structured, non-structured data repositories, and here's how we're going to take visualization and distribution of data. So, that then next went into how we bring it to market? That's Data First. So, we've been working with large insurance companies, large financial services companies, retailers, gaming companies, and the net that we see is three things. First is, yes everyone agrees the platform is the right place to go. It's where do we get started? How do I take my existing investment and take advantage of this platform? And that, invariably, is I'm going to build a net new application whether it be Watson Conversations, so that runs into Watson Data Platform. We want to ingest data, but we want that data to be resident on-prem, we want it to be native to the cloud, and so we're going to work through the architectural change to adopt that. Another great example is we want to start with just an analytic application because we are already hosting with you a mobile app. Well, we're going to run it into your analytic fabric using dashDB, and dashDB works with Watson Analytics and we're going to build an application that's resident. The really creative and compelling piece here, back to your comment on IBM is, it's really hard to buy things from this company historically. 
Buying things from IBM is not easy, so we built a platform, we built the methodology to help you understand how to take advantage of it, and now we have a subscription, the Bluemix subscription, in which you can come in and draw down those services, be it an object store, be it a SQL data store, be it the visualization layer. >> John: Composability, basically. >> Yeah, but in a common governed framework. The big takeaway is, and I'll pass to Adam, governance and security and operationalizing the platform is what we can bring to bear. 'Cause we're bringing Open Source, we're bringing proprietary technologies, but if it's done independent, it doesn't really deliver on the promise of a platform. >> I will say that architecturally, that's incredibly liberating to know that there is this one common mind model. >> It's also highly requested by customers. That's what they want. >> Derek: That's what they want. It's the path to get there that I think is, we're at that intersection right now, it's crossing the chasm. >> John: So, what's liberating? Give us good-- >> Oh, just the fact that you know that if there's a common access control layer under the hood, if there's a common governance layer under the hood, that you don't have to compromise and come up with an alternative proposition for taking some capability, maybe deploying a model to a scoring engine. You can have the one purpose-built scoring engine and know that I can call that in on demand from discovery phase to go to production and I don't have to sort of engage in another separate mind conversation or separate entitlement conversation or a separate enabling conversation. This catalog is allowing it to work together. >> That to me from a team sport perspective is that the steps you have to take. So, think of ETL. ETL really in a modern real time, like getting away from batch and go into real time, that's just flow. 
So, the skill set and the ownership of the infrastructure associated with that is evolved, especially in cloud where that's just a dynamic where it's going to be a team deciding here's the data I want, here's how I want to enrich it, here's how I want to govern and curate it. >> It's a team sport. I love that. We were just at the Strata Hadoop. We had our big data SV event and the collision between batch and real time, they are not mutually exclusive and some people just made bets on batch and forgot real time. And they have real time people who don't do batch. So, you kind of see that coming together. >> Adam: Conversion. >> So, the question, Adam, for you is that, with the world kind of moving in that direction, how do you rationalize so the customer who's saying, "Hey, I'm cloud native but I also have a hybrid here and I want to be cloud native purely on this net new applications." So, there's a conversation happening. I call it the dev ops of data which is like data ops. Hey, I'm a programmer. I just want data as code. I just don't want to get in the weeds of setting up a data warehouse, and prepping an ETL, all that batch stuff that someone else does. I'm writing some software. I want data native to my app, but I don't want to go in and do the wrangling. I don't want to go out. I just want stuff to magically work. How do you tackle that premise? >> I mean, I think the dev ops of data piece is certainly a topic we're going to be hearing a lot more about over the next coming six months, in a year. I think the reason for that is precisely because this earlier topic of operationalization. You've got lots of people building up, budding data science teams and so on. And the first thing they're going to do is be working in the discovery area. They won't be in the world of pushing things to production. 
When they do, it's going to become more important that the folks who truly understand the details of the algorithm are close enough to the deployed assets, so that they can understand how this model is behaving over time. So that they can understand new data quality issues that might have cropped up and get close to that without obviously sort of breaking the separation of duties that are important for a production system. So, I think, that is one part of the data ops conversation that hasn't yet been worked out. It's going to be a real opportunity for folks who-- >> That's an emerging area. You agree, right? >> It's a cultural shift too. I mean that is a re-thinking of, because most companies keep data in steel pipes. They're highly regulated. Their rules, the personalities that own them so to speak. The proposition that we've been on and every client asks for is how do I create a common fabric that gives access to people, that is governed and curated so you can always give a shopping experience. People that work with data do not want to talk about and say this: "How long does it take to stand up a server? When can I get the data stood up in the staging area so I can actually access it?" That's over. >> It's interesting, we're doing some Wikibon research on this, and this is the point where people look at value extraction of the data so they tend to, it's kind of like if you're a hammer, everything looks like a nail. So if you're in IT, it's infrastructure. If you are on the business line, it's the apps. So, you're seeing the shift where apps is value creating the value, but the infrastructure is more elastic, more composable so it's enablement by itself so that's interesting. So, your thoughts on that, guys? Where is that value of the data coming from most, right now? Is it the apps? Is the infrastructure still evolving? The hybrid not-- >> We think there's a value model here. 
There is certainly elements of the data pipeline that are purely operational, reporting base and things like that, which drive value on their own. But we also recognize that it's new uses of data and new business processes that are primarily driven by applications, driven by conversational interfaces, driven by these sort of emerging paradigms. And one of our goals in the data platform is to ensure that clients can move along that curve more aggressively. >> How are people getting started with the Watson Data Platform? Do they go jumping all in? Is there a community edition, you can try it before you buy it kind of thing? >> Yeah, so you're signing up in Bluemix. You have access to a set of services around the platform. You have a 30-day window where you can try everything included within it, and then at some point you got to commit to a credit card or you got to commit a 12-month term agreement. I think in parallel, we see a lot of other companies that end up blasting in size challenge for IBM. We have a lot of clients. We have got a lot of clients that we are working with today in traditional architects and infrastructure, helping them through a methodology, helping them with the right skills. That is a more traditional, hey, come in and try an analytic workload on the platform. We'll give the skills. We'll help do the enablement and then we're off and running. I think the big difference is whether or not clients are paying for and they are willing to pay for it. 'Cause we are helping them get to this new model. We're helping them get to the platform, and I think the big thing we're working through is how do we get to velocity? I think when you look at these workloads that are happening. The reason they're happening is now data is not just in some dark corner. With AI, the machine learning is always on. So, there's a lot of different ways in which you can unleash that, that then, how do you take advantage of it? And that is a cultural shift. 
It's re-thinking business models, it's re-thinking how you got skills deployed which is incredibly exciting for us, and I think the market in general. I think back to how AI is cast in many cases as the robots are going to rule the world. There's a lot of good that can come from exposing vast amounts of data to AI and to frameworks where you can get a lot of value out of it. From how to better position products to how to, better design of medicines to fulfillment chains in countries that need help. >> So, guys, in the last minute that we have I want you to take a minute to either together or one of you guys talk about how IBM is helping solve what seems to be the number one question we get on the Cube where I get asked, hey, how do you help me build a hybrid architecture. I have more data-rich workloads coming on board now. Either I have some heavy data rich workloads that are run on-prem, I got more cloud action coming, I got IOT and I'm investing in data science. So, how do you guys specifically help me build a hybrid cloud architecture that's going to fuel and support data-rich workloads and propel my data science operation. >> Yeah, so, I'll take the basics for me. It is the Data First method. It is dashDB, which is an extensible on-prem hybrid in the cloud so that the common analytic fabric. There's Data Connect, which is our ability to move data batch continuous into different end states in the cloud, and then there's data science experience. So data science experience is our offering that brings together community, it brings together content, it brings together various tooling for the data scientist or data engineers. And I think the other piece of this is, we have something called solutions assurance. 
So we're literally designing patterns that we stand up in our own environments that reflect what we see on-premises and what we see workloads going into the cloud with, and stamping that as hybrid architectures that are repeatable, and we remove risk, the operational risk. But the reality is (mumbles) is, clients have to make sacrifices in getting to the cloud. You have to deprecate, you have to rethink. And that's where some of the smoothing of those rough edges come into the discipline of us saying, here's a supported architecture, here's the destination that you're going to, and we're going to have to work together to get there. Which is the fun part, I mean, that's what we're all in this for, is getting the outcomes. >> I think the key is not to pretend that these environments are completely identical to one another. There are things that the public cloud is uniquely well suited for. So let's make sure that those kinds of use cases are really nailed there, right? And then there are other cases where you're dealing with mainframe systems running critical business processes, and you want to be able to infuse that process with some analytics. So you have to look at the use case. Maybe it's training a machine learning model in the cloud, being able to export that model and run it-- >> So use proven solutions and be prepared to be handling new ones coming onboard. Alright, Derek Schoettle, general manager, and Adam Kocoloski, the CTO, the leaders at IBM Watson Data Group, IBM Watson Platform. This is The Cube, back with more live coverage after this short break.

Published Date : Mar 21 2017

SUMMARY :

brought to you by IBM. Good to see you again Derek. So, obviously the data was a big part of the theme. Daily State of the Union, is the data conversation in context to cloud. and the broader market interesting, What's the conversations that you are having, What are the conversations that you are having Data is changing the way people build in games today, And how does that play out for the solution? and the Watson data platform, And the premise was you in 2014, the journey that we were on was kind of the operational operating system, if you will, it doesn't really deliver on the promise of a platform. to know that there is this one common mind model. That's what they want. It's the path to get there that I think is, Oh, just the fact that you know that is that the steps you have to take. and the collision between batch and real time, So, the question, Adam, for you is that, of the algorithm are close enough to the deployed assets, You agree, right? Their rules, the personalities that own them so to speak. Is it the apps? And one of our goals in the data platform is to ensure and to frameworks where you can get So, guys, in the last minute that we have You have to deprecate, you have to rethink. in the cloud, being able to export that model and Adam Kocoloski, the CTO,

ENTITIES

Entity	Category	Confidence
Derek	PERSON	0.99+
Derek Schoettle	PERSON	0.99+
Adam	PERSON	0.99+
Adam Kocoloski	PERSON	0.99+
Dave Vellante	PERSON	0.99+
John	PERSON	0.99+
IBM	ORGANIZATION	0.99+
James	PERSON	0.99+
2014	DATE	0.99+
30-day	QUANTITY	0.99+
John Furrier	PERSON	0.99+
12-month	QUANTITY	0.99+
October	DATE	0.99+
Chris Moody	PERSON	0.99+
Derek Shoettle	PERSON	0.99+
80%	QUANTITY	0.99+
first piece	QUANTITY	0.99+
PlayFab	ORGANIZATION	0.99+
Las Vegas	LOCATION	0.99+
Second piece	QUANTITY	0.99+
five months	QUANTITY	0.99+
first part	QUANTITY	0.99+
June	DATE	0.99+
IBM Watson Data Group	ORGANIZATION	0.99+
one	QUANTITY	0.99+
Watson Data Platform	ORGANIZATION	0.99+
two	QUANTITY	0.99+
Cloudant	ORGANIZATION	0.99+
three things	QUANTITY	0.98+
Vertica	ORGANIZATION	0.98+
six months ago	DATE	0.98+
First	QUANTITY	0.98+
first method	QUANTITY	0.98+
both	QUANTITY	0.98+
six months	QUANTITY	0.97+
first	QUANTITY	0.97+
Watson Data Platform	TITLE	0.97+
Watson Conversations	TITLE	0.96+
this week	DATE	0.96+
30 years ago	DATE	0.96+
one part	QUANTITY	0.96+
today	DATE	0.95+
Wikibon	ORGANIZATION	0.95+
Daily State of the Union	TITLE	0.94+
Watson Analytics	TITLE	0.94+
Cube	COMMERCIAL_ITEM	0.94+
Spark	TITLE	0.92+
Interconnect 2017	EVENT	0.92+
Bluemix	ORGANIZATION	0.89+
dashDB	TITLE	0.89+
IBM Interconnect 2017	EVENT	0.87+
one purpose	QUANTITY	0.86+
#ibminterconnect	EVENT	0.85+
20	QUANTITY	0.83+
Strata Hadoop	LOCATION	0.82+
first thing	QUANTITY	0.81+
one common mind model	QUANTITY	0.79+
second	QUANTITY	0.76+
Twitter	ORGANIZATION	0.76+
President of the United States	PERSON	0.72+
Watson	TITLE	0.71+
IMB Watson	ORGANIZATION	0.71+
about five	DATE	0.7+

Wrap Up - IBM Machine Learning Launch - #IBMML - #theCUBE

(jazzy intro music) [Narrator] Live from New York, it's the Cube! Covering the IBM Machine Learning Launch Event, brought to you by IBM. Now, here are your hosts: Dave Vellante and Stu Miniman. >> Welcome back to New York City, everybody. This is theCUBE, the leader in live tech coverage. We've been covering, all morning, the IBM Machine Learning announcement. Essentially what IBM did is they brought Machine Learning to the z platform. My co-host and I, Stu Miniman, have been talking to a number of guests, and we're going to do a quick wrap here. You know, Stu, my take is, when we first heard about this, and the world first heard about this, we were like, "Eh, okay, that's nice, that's interesting." But what it underscores is IBM's relentless effort to continue to keep z relevant. We saw it with the early Linux stuff, we're now seeing it with all the open source and Spark tooling. You're seeing IBM make big positioning efforts to bring analytics and transactions together, and the simple point is, a lot of the world's really important data runs on mainframes. You were just quoting some stats, which were pretty interesting. >> Yeah, I mean, Dave, you know, one of the biggest challenges we know in IT is migrating. Moving from one thing to another is really tough. I love the comment from Barry Baker. Well, if I need to change my platform, by the time I've moved it, that whole digital transformation, we've missed that window. It's there. We know how long that takes: months, quarters. I was actually watching Twitter, and it looks like Chris Maddern is here. Chris was the architect of Venmo, which my younger sisters, all the millennials that I know, everybody uses Venmo. He's here, and he was like, "Almost all the banks, airlines, and retailers still run on mainframes in 2017, and it's growing. Who knew?" 
You've got a guy here that's developing really cool apps that was finding this interesting, and that's an angle I've been looking at today, Dave, is how do you make it easy for developers to leverage these platforms that are already there? The developers aren't going to need to care whether it's a mainframe or a cloud or x86 underneath. IBM is giving you the options, and as a number of our guests said, they're not looking to solve all the problems here. Here's taking this really great, new type of application using Machine Learning and making it available on that platform that so many of their customers already use. >> Right, so we heard a little bit of roadmap here: the ML for z goes GA in Q1, and then we don't have specific timeframes, but we're going to see Power platform pick this up. We heard from Jean-Francois Puget that they'll have an x86 version, and then obviously a cloud version. It's unclear what that hybrid cloud will look like. It's a little fuzzy right now, but that's something that we're watching. Obviously a lot of the model development and training is going to live in the cloud, but the scoring is going to be done locally is how the data scientists like to think about these things. So again, Stu, more mainframe relevance. We've got another cycle coming soon for the mainframe. We're two years into the z13. When IBM has mainframe cycles, it tends to give a little bump to earnings. Now, granted, a smaller and smaller portion of the company's business is mainframe, but still, mainframe drags a lot of other software with it, so it remains a strategic component. So one of the questions we get a lot is what's IBM doing in so-called hardware? Of course, IBM says it's all software, but we know they're still selling boxes, right? So, all the hardware guys, EMC, Dell, IBM, HPE, et cetera. A lot of software content, but it's still a hardware business. So there's really two platforms there: there's the z and there's the Power. 
And those are both strategic to IBM. It sold its x86 business because it didn't see it as strategic. They just put Bob Picciano in charge of the Power business, so there's obviously real commitments to those platforms. Will they make a dent in the market share numbers? Unclear. It looks like it's steady as she goes, not dramatic increase in share. >> Yeah, and Dave, I didn't hear anybody come in here and say this offering is going to say, well let me dump x86 and go buy mainframe. That's not the target that I heard here. I would have loved to hear a little bit more as to where this fits into the broader IOT strategy. We talked a little bit on the intro, Dave. There's a lot of reasons why data's going to stick at the edge when we look at the numbers. For the huge growth of public cloud, the amount of data in public cloud hasn't caught up to the equivalent of what it would be in data centers itself. What I mean by that is, we usually spend, say 30% on average for storage costs inside a data center. If we look at public cloud, it's more around 10%. So, at AWS Reinvent, I talked to a number of the ecosystem partners, that started to see things like data lakes starting to appear in the cloud. This solution isn't in the data lake family, but it's with the analytics and everything that's happening with streaming and machine learning. It's large repositories of data and huge transactions of data that are happening in the mainframe, and just trying to squint through where all the data lives, and the new waves of technologies coming in. We heard how this can tie into some of the mobile and streaming activities that aren't on the mainframe, so that it can pull them into the other decisions, but some broader picture that I'm sure IBM will be able to give in the future. >> Well, normally you would expect a platform that is however many decades old the mainframe is, after the whole mainframe downsizing trend, you would expect there would be a managed decline in that business. 
I mean, you're seeing it in a lot of places now. We've talked about this, with things like Symmetrics, right? You minimize and focus the R&D investments, and you try to manage cost, you manage the decline of the business. IBM has almost sort of flipped that. They say, okay, we've got DB2, we're going to continue to invest in that platform. We've got our major subsystems, we're going to enhance the platform with Open Source technologies. We've got a big enough base that we can continue to mine perpetually. The more interesting thing to me about this announcement is it underscores how IBM is leveraging its analytics platform. So, we saw the announcement of the Watson Data Platform last September, which was sort of this end-to-end data pipeline collaboration between different persona engine, which is quite unique in the marketplace, a lot of differentiation there. Still some services. Last week at Spark Summit, I talked to some of the users and some of the partners of the Watson Data Platform. They said it's great, we love it, it's probably the most robust in the marketplace, but it's still a heavy lift. It still requires a fair amount of services, and IBM's still pushing those services. So IBM still has a large portion of the company still a services company. So, not surprising there, but as I've said many many times, the challenge IBM has is to really drive that software business, simplify the deployment and management of that software for its customers, which is something that I think it's working hard on doing. And the other thing is you're seeing IBM leverage those platforms, those analytics platforms, into different hardware segments, or hardware/cloud segments, whether it's BlueMix, z, Power, so, pushing it out through the organization. IBM still has a stack, like Oracle has a stack, so wherever it can push its own stack, it's going to do that, cuz the margins are better. At the same time, I think it understands very well, it's got to have open source choice. 
>> Yeah, absolutely, and that's something we heard loud and clear here, Dave, which is what we expect from IBM: choice of language, choice of framework. When I hear the public cloud guys, it's like, "Oh, well here's kind of the main focus we have, and maybe we'll have a little bit of choice there." Absolutely the likes of Google and Amazon are working with open source, but at least first blush, when I look at things, it looks like once IBM fleshes this out -- and as we've said, it's the Spark to start and others that they're adding on -- but IBM could have a broader offering than I expect to see from some of the public cloud guys. We'll see. As you know, Dave, Google's got their cloud event in a couple of weeks in San Francisco. We'll be covering that, and of course Amazon, you expect their regular cadence of announcements that they'll make. So, definitely a new front in the Cloud Wars as it were, for machine learning. >> Excellent! Alright, Stu, we got to wrap, cuz we're broadcasting the livestream. We got to go set up for that. Thanks, I really appreciate you coming down here and co-hosting with me. Good event. >> Always happy to come down to the Big Apple, Dave. >> Alright, good. Alright, thanks for watching, everybody! So, check out SiliconAngle.com, you'll get all the news from this event and around the world. Check out SiliconAngle.tv for this and other CUBE activities, where we're going to be next. We got a big spring coming up, end of winter, big spring coming in this season. And check out WikiBon.com for all the research. Thanks guys, good job today, that's a wrap! We'll see you next time. This is theCUBE, we're out. (jazzy music)

Published Date : Feb 15 2017

SUMMARY :

New York, it's the Cube! a lot of the world's really important data the biggest challenges we Obviously a lot of the model a number of the ecosystem partners, the challenge IBM has is to really kind of the main focus we have, We got to go set up for that. down to the Big Apple, Dave. and around the world.

ENTITIES

Entity	Category	Confidence
IBM	ORGANIZATION	0.99+
Chris	PERSON	0.99+
Dave	PERSON	0.99+
Barry Baker	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Chris Maddern	PERSON	0.99+
2017	DATE	0.99+
Bob Picciano	PERSON	0.99+
Google	ORGANIZATION	0.99+
Dell	ORGANIZATION	0.99+
Stu Miniman	PERSON	0.99+
San Francisco	LOCATION	0.99+
Stu	PERSON	0.99+
New York City	LOCATION	0.99+
Last week	DATE	0.99+
New York	LOCATION	0.99+
Oracle	ORGANIZATION	0.99+
one	QUANTITY	0.99+
30%	QUANTITY	0.99+
two platforms	QUANTITY	0.99+
two years	QUANTITY	0.99+
Linux	TITLE	0.99+
Alrig	PERSON	0.99+
last September	DATE	0.99+
Jean-Francois Puget	PERSON	0.99+
first	QUANTITY	0.99+
both	QUANTITY	0.98+
today	DATE	0.98+
Watson Data Platform	TITLE	0.98+
Venmo	ORGANIZATION	0.97+
Spark Summit	EVENT	0.97+
Q1	DATE	0.96+
Big Apple	LOCATION	0.96+
EMC	ORGANIZATION	0.95+
HPE	ORGANIZATION	0.95+
BlueMix	TITLE	0.94+
Spark	TITLE	0.91+
WikiBon.com	ORGANIZATION	0.9+
IBM Machine Learning Launch	EVENT	0.89+
one thing	QUANTITY	0.86+
AWS Reinvent	ORGANIZATION	0.82+
around 10%	QUANTITY	0.8+
x86	COMMERCIAL_ITEM	0.78+
SiliconAngle.tv	ORGANIZATION	0.77+
#IBMML	TITLE	0.76+
z13	COMMERCIAL_ITEM	0.74+
end	DATE	0.71+
Machine Learning	TITLE	0.65+
x86	TITLE	0.62+
CUBE	ORGANIZATION	0.56+
OpenSource	TITLE	0.56+
Twitter	TITLE	0.54+
Learning	TITLE	0.5+
decades	QUANTITY	0.48+
Symmetrics	TITLE	0.46+
SiliconAngle.com	ORGANIZATION	0.43+
theCUBE	ORGANIZATION	0.41+
Wars	TITLE	0.35+

Jean Francois Puget, IBM | IBM Machine Learning Launch 2017

>> Announcer: Live from New York, it's theCUBE, covering the IBM machine learning launch event. Brought to you by IBM. Now, here are your hosts, Dave Vellante and Stu Miniman. >> Alright, we're back. Jean Francois Puget is here, he's the distinguished engineer for machine learning and optimization at IBM analytics, CUBE alum. Good to see you again. >> Yes. >> Thanks very much for coming on, big day for you guys. >> Jean Francois: Indeed. >> It's like giving birth every time you guys give one of these products. We saw you a little bit in the analyst meeting, pretty well attended. Give us the highlights from your standpoint. What are the key things that we should be focused on in this announcement? >> For most people, machine learning equals machine learning algorithms. Algorithms, when you look at newspapers or blogs, social media, it's all about algorithms. Our view that, sure, you need algorithms for machine learning, but you need steps before you run algorithms, and after. So before, you need to get data, to transform it, to make it usable for machine learning. And then, you run algorithms. These produce models, and then, you need to move your models into a production environment. For instance, you use an algorithm to learn from past credit card transaction fraud. You can learn models, patterns, that correspond to fraud. Then, you want to use those models, those patterns, in your payment system. And moving from where you run the algorithm to the operation system is a nightmare today, so our value is to automate what you do before you run algorithms, and then what you do after. That's our differentiator. >> I've had some folks in theCUBE in the past have said years ago, actually, said, "You know what, algorithms are plentiful." I think he made the statement, I remember my friend Avi Mehta, "Algorithms are free. "It's what you do with them that matters." >> Exactly, that's, I believe in autonomy that open source won for machine learning algorithms. 
Now the future is with open source, clearly. But it solves only a part of the problem you're facing if you want to action machine learning. So, exactly what you said. What you do with the results of the algorithm is key. And open source people don't care much about it, for good reasons. They are focusing on producing the best algorithm. We are focusing on creating value for our customers. It's different. >> In terms of, you mentioned open source a couple times, in terms of customer choice, what's your philosophy with regard to the various tooling and platforms for open source, how do you go about selecting which to support? >> Machine learning is fascinating. It's overhyped, maybe, but it's also moving very quickly. Every year there is some new cool stuff. Five years ago, nobody spoke about deep learning. Now it's everywhere. Who knows what will happen next year? Our take is to support open source, to support the top open source packages. We don't know which one will win in the future. We don't know even if one will be enough for all needs. We believe one size does not fit all, so our take is to support a curated list of major open source packages. We start with Spark ML for many reasons, but we won't stop at Spark ML. >> Okay, I wonder if we can talk use cases. Two of my favorite, well, let's just start with fraud. Fraud detection has become much, much better over the past certainly 10 years, but it's still not perfect. I don't know if perfection is achievable, but there are a lot of false positives. How will machine learning affect that? Can we expect as consumers even better fraud detection in more real time? >> If we think of the full life cycle going from data to value, we will provide a better answer. We still use machine learning algorithms to create models, but a model does not tell you what to do. It will tell you, okay, for this credit card transaction coming, it has a high probability to be fraud. Or this one has a lower probability.
But then it's up to the designer of the overall application to make decisions, so what we recommend is to use machine learning predictions, but not only those, and then use, maybe, (murmuring). For instance, if your machine learning model tells you this is a fraud with a high probability, say 90%, and this is a customer you know very well, it's a 10-year customer you know very well, then you can be confident that it's a fraud. Then suppose the next prediction tells you this is 70% probability, but the customer has been with you for only one week. In a week, we don't know the customer, so the confidence we can get in machine learning should be low, and there you will not reject the transaction immediately. Maybe you don't approve it automatically, maybe you will send a one-time passcode, or you enter a serve vendor system, but you don't reject it outright. Really, the idea is to use machine learning predictions as yet another input for making decisions. You're making decisions informed by what you could learn from your past. But it's not replacing human decision-making. Our approach with IBM, you don't see IBM speak much about artificial intelligence in general because we don't believe we're here to replace humans. We're here to assist humans, so we say, augmented intelligence or assistance. That's the role we see for machine learning. It will give you additional data so that you make better decisions. >> It's not the concept that you object to, it's the term artificial intelligence. It's really machine intelligence, it's not fake. >> I started my career as a PhD in artificial intelligence, I won't say when, but long enough ago. At that time, there were already promises that we would have Terminator in the next decade and this and that. And the same happened in the '60s, or it was after the '60s. And then, there was an AI winter, and we have a risk here to have an AI winter because some people are just raising red flags that are not substantiated, I believe.
I don't think the technology is here that we can replace human decision-making altogether any time soon, but we can help. We can certainly make some proficient, more efficient, more productive with machine learning. >> Having said that, there are a lot of cognitive functions that are getting replaced, maybe not by so-called artificial intelligence, but certainly by machines and automation. >> Yes, so we're automating a number of things, and maybe we won't need to have people do quality checks and can just have an automated vision system detect defects. Sure, so we're automating more and more, but this is not new, it has been going on for centuries. >> Well, the list evolved. So, what can humans do that machines can't, and how would you expect that to change? >> We're moving away from IBM machine learning, but it is interesting. You know, each time there is a capability that a machine will automate, we basically redefine intelligence to exclude it, so you know. That's what I foresee. >> Yeah, well, robots a while ago, Stu, couldn't climb stairs, and now, look at that. >> Do we feel threatened because a robot can climb a stair faster than us? Not necessarily. >> No, it doesn't bother us, right. Okay, question? >> Yeah, so I guess, bringing it back down to the solution that we're talking about today, if I now am doing the analytics, the machine learning on the mainframe, how do we make sure that we don't overrun and blow out all our MIPS? >> We recommend, so we are not using the mainframe base compute system. We recommend using zIIPs, so additional cores, to not overload, so it's a very important point. We claim, okay, if you do everything on the mainframe, you can learn from operational data. You don't want to disturb, and "you don't want to disturb" takes a lot of different meanings. One is what you just said, you don't want to slow down your operational processing because you're going to hurt your business. But you also want to be careful.
Say we have a payment system where there is a machine learning model predicting fraud probability, a part of the system. You don't want a young bright data scientist to decide that he had a great idea, a great model, and he wants to push his model into production without asking anyone. So you want to control that. That's why we insist, we are providing governance that includes a lot of things like keeping track of how models were created and from which data sets, so lineage. We also want to have access control and not allow just anyone to deploy a new model because we make it easy to deploy, so we want to have role-based access and only someone with some executive, well, it depends on the customer, but not everybody can update the production system, and we want to support that. And that's something that differentiates us from open source. Open source developers, they don't care about governance. It's not their problem, but it is our customers' problem, so this solution will come with all the governance and integrity constraints you can expect from us. >> Can you speak to, first solution's going to be on z/OS, what's the roadmap look like and what are some of those challenges of rolling this out to other private cloud solutions? >> We are going to ship this quarter IBM machine learning for Z. It starts with Spark ML as a base open source. This is interesting, but it's not all there is for machine learning. So that's how we start. We're going to add more in the future. Last week we announced we will ship Anaconda, which is a major distribution for the Python ecosystem, and it includes a number of machine learning open source packages. We announced it for next quarter. >> I believe in the press release it said down the road things like TensorFlow are coming, H2O. >> But Anaconda was announced for next quarter, so we will leverage this when it's out.
Then indeed, we have a roadmap to include major open source, so the major open source packages are the ones from Anaconda (murmuring), mostly. Key deep learning, so TensorFlow and probably one or two additional, we're still discussing. One that I'm very keen on, it's called XGBoost in one word. People don't speak about it in newspapers, but this is what wins all Kaggle competitions. Kaggle is a machine learning competition site. When I say all, all that are not image recognition competitions. >> Dave: And that was ex-- >> XGBoost, X-G-B-O-O-S-T. >> Dave: XGBoost, okay. >> XGBoost, and it's-- >> Dave: X-ray gamma, right? >> It's really a package. When I say we don't know which package will win, XGBoost was introduced a year ago also, or maybe a bit more, but not so long ago, and now, if you have structured data, it is the best choice today. It's really fast-moving, but so, we will support major deep learning packages and major classical learning packages like the ones from Anaconda or XGBoost. The other thing: we start with Z. We announced in the analyst session that we will have a POWER version and a private cloud, meaning x86, version as well. I can't tell you when because it's not firm, but it will come. >> And in public cloud as well, I guess we'll, you've got components in the public cloud today like the Watson Data Platform that you've extracted and put here. >> We have extracted part of the Data Science Experience, so we've extracted notebooks and a graphical tool called ModelBuilder from DSX as part of IBM machine learning now, and we're going to add more of DSX as we go. But the goal is to really share code and function across private cloud and public cloud. As Rob Thomas defined it, we want with private cloud to offer all the features and functionality of public cloud, except that it would run inside a firewall. We are really developing machine learning and Watson machine learning on a common code base. It's an internal open source project.
We share code, and then, we ship on different platforms. >> I mean, you haven't, just now, used the word hybrid. Every now and then IBM does, but do you see that so-called hybrid use case as viable, or do you see it more, some workloads should run on prem, some should run in the cloud, and maybe they'll never come together? >> Machine learning, you basically have two phases, one is training and the other is scoring. I see people moving training to cloud quite easily, unless there is some regulation about data privacy. But training is a good fit for cloud because usually you need a large computing system but only for a limited time, so elasticity's great. But then deployment, if you want to score a transaction in a CICS transaction, it has to run beside CICS, not in the cloud. If you want to score data on an IoT gateway, you want to score on the gateway, not in a data center. I would say that may not be what people think of first, but what will really drive the split between public cloud, private, and on prem is where you want to apply your machine learning models, where you want to score. For instance, smart watches, they are switching to gear to fit measurement system. You want to score your health data on the watch, not in the internet somewhere. >> Right, and in that CICS example that you gave, you'd essentially be bringing the model to the CICS data, is that right? >> Yes, that's what we do. The value of machine learning for Z is that if you want to score transactions happening on Z, you need to be running on Z. So it's clear, mainframe people, they don't want to hear about public cloud, so they will be the last ones moving. They have their reasons, but they like mainframe because it's really, really secure and private. >> Dave: Public cloud's a dirty word. >> Yes, yes, for Z users. At least that's what I was told, and I could check with many people. But we know that in general the move is to public cloud, so we want to help people, depending on their journey to the cloud.
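Editor's sketch: the split Puget describes, training in one place and scoring next to the data, might look roughly like this. It is a minimal illustration under stated assumptions, not IBM Machine Learning for z/OS code: the toy features, the JSON export format, and the function names `train`, `export_model`, and `score` are all hypothetical.

```python
import json
import math

def train(rows, labels, epochs=200, lr=0.1):
    """Training phase: fit a tiny logistic regression with stochastic gradient descent.
    In Puget's terms this is the part that could run on elastic cloud compute."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            z = bias + sum(w * v for w, v in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y  # gradient of the log loss with respect to z
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return {"weights": weights, "bias": bias}

def export_model(model):
    """Serialize the trained parameters so they can be shipped to where the data lives."""
    return json.dumps(model)

def score(model_json, x):
    """Scoring phase: runs beside the transaction (gateway, watch, mainframe),
    needing only the exported parameters, not the training data."""
    model = json.loads(model_json)
    z = model["bias"] + sum(w * v for w, v in zip(model["weights"], x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical features: [scaled amount, is_foreign_merchant]; label 1 means fraud.
rows = [[0.1, 0], [0.2, 0], [0.9, 1], [0.8, 1]]
labels = [0, 0, 1, 1]
shipped = export_model(train(rows, labels))

suspicious = score(shipped, [0.85, 1])
ordinary = score(shipped, [0.15, 0])
```

The design point is that `score` depends only on the exported string, so the model can travel to the data instead of the data travelling to the model.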
>> You've got one of those, too. Jean Francois, thanks very much for coming on theCUBE, it was really a pleasure having you back. >> Thank you. >> You're welcome. Alright, keep it right there, everybody. We'll be back with our next guest. This is theCUBE, we're live from the Waldorf Astoria. IBM's machine learning announcement, be right back. (electronic keyboard music)
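Editor's sketch: the fraud example from earlier in this interview, where the model's probability is one input and customer history is another, can be written as a small decision policy. The thresholds (0.9, 0.7, one year of tenure) and the names `Transaction` and `decide` are assumptions chosen to mirror the 90% / 70% / one-week figures in the conversation, not values from any IBM product.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    fraud_probability: float   # output of some fraud model, between 0 and 1
    customer_tenure_days: int  # how long the customer has been known

def decide(tx, reject_threshold=0.9, review_threshold=0.7, min_tenure_days=365):
    """Combine the model's prediction with what is known about the customer.
    The prediction informs the decision; it does not make it alone."""
    known_customer = tx.customer_tenure_days >= min_tenure_days
    # High probability on a well-known customer: the prediction can be trusted.
    if tx.fraud_probability >= reject_threshold and known_customer:
        return "reject"
    # Elevated probability, or little history to trust the model with:
    # neither approve automatically nor reject outright, e.g. send a one-time passcode.
    if tx.fraud_probability >= review_threshold:
        return "step_up"
    return "approve"

ten_year_customer = decide(Transaction(fraud_probability=0.9, customer_tenure_days=3650))
one_week_customer = decide(Transaction(fraud_probability=0.7, customer_tenure_days=7))
low_risk = decide(Transaction(fraud_probability=0.1, customer_tenure_days=3650))
```

Keeping the policy separate from the model is what lets the application designer, rather than the algorithm, own the final decision.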

Published Date : Feb 15 2017


ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Jean Francois	PERSON	0.99+
IBM	ORGANIZATION	0.99+
10-year	QUANTITY	0.99+
Stu Miniman	PERSON	0.99+
Avi Mehta	PERSON	0.99+
New York	LOCATION	0.99+
Anaconda	ORGANIZATION	0.99+
70%	QUANTITY	0.99+
Jean Francois Puget	PERSON	0.99+
next year	DATE	0.99+
Two	QUANTITY	0.99+
Last week	DATE	0.99+
next quarter	DATE	0.99+
90%	QUANTITY	0.99+
Rob Thomas	PERSON	0.99+
one-time	QUANTITY	0.99+
today	DATE	0.99+
Five years ago	DATE	0.99+
one word	QUANTITY	0.99+
CICS	ORGANIZATION	0.99+
Python	TITLE	0.99+
a year ago	DATE	0.99+
one	QUANTITY	0.99+
two	QUANTITY	0.99+
next decade	DATE	0.98+
one week	QUANTITY	0.98+
first solution	QUANTITY	0.98+
XGBoost	TITLE	0.98+
a week	QUANTITY	0.97+
Spark ML	TITLE	0.97+
'60s	DATE	0.97+
ModelBuilder	TITLE	0.96+
one size	QUANTITY	0.96+
One	QUANTITY	0.95+
first	QUANTITY	0.94+
Watson Data Platform	TITLE	0.93+
each time	QUANTITY	0.93+
Kaggle	ORGANIZATION	0.92+
Stu	PERSON	0.91+
this quarter	DATE	0.91+
DSX	TITLE	0.89+
XGBoost	ORGANIZATION	0.89+
Waldorf Astoria	ORGANIZATION	0.86+
Spark ML.	TITLE	0.85+
z/OS	TITLE	0.82+
years	DATE	0.8+
centuries	QUANTITY	0.75+
10 years	QUANTITY	0.75+
DSX	ORGANIZATION	0.72+
Terminator	TITLE	0.64+
XTC69X	TITLE	0.63+
IBM Machine Learning Launch 2017	EVENT	0.63+
couple times	QUANTITY	0.57+
machine learning	EVENT	0.56+
X	TITLE	0.56+
Watson	TITLE	0.55+
these products	QUANTITY	0.53+
-G-B	COMMERCIAL_ITEM	0.53+
H20	ORGANIZATION	0.52+
TensorFlow	ORGANIZATION	0.5+
theCUBE	ORGANIZATION	0.49+
CUBE	ORGANIZATION	0.37+

Steven Astorino, IBM - IBM Machine Learning Launch - #IBMML - #theCUBE

>> Announcer: Live from New York, it's theCUBE. Covering the IBM Machine Learning Launch Event. Brought to you by IBM. Now here are your hosts Dave Vellante and Stu Miniman. >> Welcome back to New York City everybody, this is theCUBE, the leader in live tech coverage. We're here at the IBM Machine Learning Launch Event, bringing machine learning to the Z platform. Steve Astorino is here, he's the VP for Development for the IBM Private Cloud Analytics Platform. Steve, good to see you, thanks for coming on. >> Hi how are you? >> Good thanks, how you doing? >> Good, good. >> Down from Toronto. So this is your baby. >> It is. >> This product, right? >> It is. >> So you developed this thing in the labs and now you point it at platforms. So talk about, sort of, what's new here today specifically. >> So today we're launching and announcing our machine learning, our IBM machine learning product. It's really a new solution that allows, obviously, machine learning to be automated and for data scientists and line of business, business analysts to work together and create models to be able to apply machine learning, do predictions and build new business models in the end. To provide better services for their customers. >> So how is it different than what we knew as Watson machine learning? Is it the same product pointed at Z or is it different? >> It's a great question. So Watson is our cloud solution, it's our cloud brand, so we're building something on private cloud for the private cloud customers and enterprises. Same product built for private cloud as opposed to public cloud. Think of it more as a branding and Watson is sort of a bigger solution set in the cloud. >> So it's your product, your baby, what's so great about it? How does it compare with what else is in the marketplace? Why should we get excited about this product? >> Actually, a bunch of things.
It's great from many angles. What we're trying to do, obviously it's based on open source, it's an open platform just like what we've been talking about with the other products that we've been launching over the last six months to a year. It's based on Spark, you know we're bringing in all the open source technology, to your fingertips. As well as we're integrating with IBM's top-notch research and capabilities that we're driving in-house, integrating them together and being able to provide one experience to be able to do machine learning. That's at a very high level. Also, if you think about it, there's three things that we're calling out. There's freedom, basically being able to choose what tools you want to use, what environments you want to use, what language you want to use, whether it's Python, Scala, R. Then there's productivity. So we really enable and make it simple to be productive and build these machine learning models, which an application developer can then leverage and use within their application. The other one is trust. IBM is very well known for its enterprise level capabilities, whether it's governance, whether it's trust of the data, how to manage the data, but also more importantly, we're creating something called The Feedback Loop which allows the models to stay current, and the data scientists, the administrators, know when these models, for example, are degrading. To make sure it's giving you the right outcome. >> OK, so you mention it's built on Spark. When I think about the efforts to build a data pipeline I think I've got to ingest the data, I've got to explore, I've got to process it and clean it up and then I've got to ultimately serve whomever, the business. >> Right, Right. >> What pieces of that does Spark unify and simplify? >> So we leverage Spark, obviously, for the analytics. When you're building a model you, one, have your choice of tooling that you want to use, whether it's programmatic or not.
That's one of the value propositions we're bringing forward. But then we create these models, we train them, we evaluate them, we leverage Spark for that. Then obviously, we're trying to bring the models to where the data is. So one of the key value propositions is that we operationalize these models very simply and quickly. Just at the click of a button you can say, hey, deploy this model now, and we deploy it right where the data is. In this case we're launching it on mainframe first. So Spark on the mainframe, we're deploying the model there and you can score the model directly in Spark on the mainframe. That's a huge value add, you get better performance. >> Right, okay, just in terms of differentiation from the competition, you're the only company I think, providing machine learning on Z, so. >> Definitely, definitely. >> That's pretty easy, but in terms of the capabilities that you have, how are you different from the competition? When you talk to clients and they say well what about this vendor or that vendor, how do you respond? >> So let me talk about one of the research technologies that we're launching as part of this called CADS, Cognitive Assistant for Data Scientists. This is a feature where essentially, it takes the complexity out of building a model where you tell it, or you give it the algorithms you want to work with and the CADS assistant basically returns which one is the best, which one performs the best. Now, all of a sudden you have the best model to use without having to go and spend, potentially weeks, on figuring out which one that is. So that's a huge value proposition. >> So automating the choice of the algorithm, an algorithm to choose the algorithm. What have you found in terms of its level of accuracy in terms of the best fit? >> Actually it works really well.
And in fact we have a live demo that we'll be doing today, where it shows CADS coming back with a 90% accurate model in terms of the data that we're feeding it and the outcome it will give you in terms of what model to use. It works really well. >> Choosing an algorithm is not like choosing a programming language, right, where there's bias: if I like Scala or R or whatever, Java, Python, okay fine, I've got skill sets associated with that. Algorithm choice is one that's more scientific, I guess? >> It is more scientific, it's based on the algorithm, the statistical algorithm, and the selection of the algorithm or the model itself is a huge deal because that's where you're going to drive your business. If you're offering a new service that's where you're providing that solution from, so it has to be the right algorithm, the right model, so that you can build that more efficiently. >> What are you seeing as the big barriers to customers adopting machine learning? >> I think everybody, I mean it's the hottest thing around right now, everybody wants machine learning, it's great, it's a huge buzz. The hardest thing is they know they want it, but don't really know how to apply it in their own environment, or they think they don't have the right skills. So that's actually one of the things that we're going after, to be able to enable them to do that. We're for example working on building different industry-based examples to showcase here's how you would use it in your environment. So last year when we did the Watson Data Platform we did a retail example, now today we're doing a finance example, a churn example with customers potentially churning and leaving a bank. So we're looking at all those different scenarios, and then also we're creating hubs, locations we're launching today also, announcing today, actually Dinesh will be doing that.
There is a hub in Silicon Valley where it would allow customers to come in and work with us and we help them figure out how they can leverage machine learning. It is a great way to interact with our customers and be able to do that. >> So Steve, nirvana is, and you gave that example, the retail example in September, when you launched Watson Data Platform, the nirvana in this world is you can use data, and maybe put in an offer, or save a patient's life or affect an outcome in real time. So the retail example was just that. If I recall, you were making an offer in real time, it was very fast, a live demo, it wasn't just a fakey. The example on churn, is the outcome to affect that customer's decisions so that they don't leave? Is that? >> Yes, pretty much. Essentially what we are looking at is, we're using live data, we're using social media data, bringing in Twitter sentiment about a particular individual for example, and trying to predict if this customer, if this user is happy with the service that they are getting or not. So for example, people will go and socialize, oh I went to this bank and I hated this experience, or they really got me upset or whatever. Bringing that data from Twitter, so open data, and merging it with the bank's data, banks have a lot of data they can leverage and monetize. And then making an assessment using machine learning to predict, is this customer going to leave me or not? What probability do they have that they are going to leave me or not based on the machine learning model. The example or scenario we are using now, if we think they are going to leave us, we're going to make special offers to them. It's a way to enhance your service for those customers. So that they don't leave you. >> So operationalizing that would be a call center has some kind of dashboard that says red, green, yellow, boom, here's an offer that you should make, and that's done in near real time. In fact, real time is before you lose the customer.
That's as good a definition as anything else. >> But it's actually real-time, and it's what we call the scoring of the data, so as the data transaction is coming in, you can actually make that assessment in real time. It's called in-transaction scoring, where you can make that right on the fly and be able to determine is this customer at risk or not. And then be able to make smarter decisions about the service you are providing, on whether you want to offer something better. >> So is the primary use case for this those streams, those areas I'm getting you know, whether it be, you mentioned Twitter data, maybe IoT, or can we point machine learning at just archives of data and things written historically, or is it mostly the streams? >> It's both of course, and machine learning is based on historical data, right, and that's how the models are built. The more historical data you have, the more likely it is that you pick the right model, and the better the prediction you'll get of what's going to happen next time. So it's exactly, it's both. >> How are you helping customers with that initial fit? My understanding is how big of a data set do you need, do I have enough to really model where I have, how do you help customers work through that? >> So my opinion is obvious to a certain extent, the more data you have as your sample set, the more accurate your model is going to be. So if we have one that's too small, your prediction is going to be inaccurate. It really depends on the scenario, it depends on how many features or fields you're looking at within your dataset. It depends on many things, and it's variable depending on the scenario, but in general you want to have a good chunk of historical data that you can build expertise on, right.
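Editor's sketch: Astorino describes CADS as taking a set of candidate algorithms and returning the one that performs best. The internals of CADS are not described in this interview, so the following only illustrates the general idea of automated selection on held-out historical data. The two candidate algorithms, the toy churn features, and the name `select_best` are assumptions for the example.

```python
def fit_majority(train_rows):
    """Baseline candidate: always predict the most common label seen in training."""
    labels = [y for _, y in train_rows]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority

def fit_nearest_centroid(train_rows):
    """Candidate: predict the class whose mean feature vector is closest."""
    sums, counts = {}, {}
    for x, y in train_rows:
        acc = sums.setdefault(y, [0.0] * len(x))
        sums[y] = [a + v for a, v in zip(acc, x)]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    def predict(x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(centroids, key=lambda y: sq_dist(centroids[y]))
    return predict

def select_best(candidates, train_rows, holdout_rows):
    """Fit every candidate on the training rows, score each on held-out rows,
    and return the best candidate's name together with all the scores."""
    scores = {}
    for name, fit in candidates.items():
        predict = fit(train_rows)
        correct = sum(1 for x, y in holdout_rows if predict(x) == y)
        scores[name] = correct / len(holdout_rows)
    return max(scores, key=scores.get), scores

# Hypothetical churn data: ([complaints, years_as_customer], churned)
train_rows = [([0, 5], 0), ([1, 6], 0), ([0, 8], 0), ([2, 9], 0),
              ([4, 1], 1), ([5, 2], 1), ([3, 1], 1)]
holdout_rows = [([4, 2], 1), ([1, 7], 0), ([5, 1], 1), ([0, 6], 0)]

candidates = {"majority": fit_majority, "nearest_centroid": fit_nearest_centroid}
best_name, scores = select_best(candidates, train_rows, holdout_rows)
```

Scoring on rows the candidates never saw during fitting is what makes the comparison meaningful; with very little historical data the ranking becomes unreliable, which is the point made in the exchange above.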
>> So you've worked on both the Watson Services in the public cloud and now this private cloud, is there any differentiation or do you see significant use case differences between those two, or is it just kind of where the data lives and we're going to do similar activities there? >> So it is similar. At the end of the day, we're trying to provide similar products on both public cloud and private cloud. But for this specific case, we're launching it on mainframe, that's a different angle on this. But we know that's where the biggest banks, the insurance companies, the biggest retailers in the world are, and that's where the biggest transactions are running and we really want to help them leverage machine learning and get their services to the next level. I think it's going to be a huge differentiator for them. >> Steve, you gave an example before of Twitter sentiment data. How would that fit in to this announcement? So I've got this ML on Z and I, what, API into the Twitter data? How does that sort of all get ingested and consolidated? >> So we allow hooks to be able to access data from different sources, bring in data. That is part of the ingest process. Then once you have that data there in data frames in the machine learning product, now you're feeding it into a statistical algorithm to figure out what the best prediction is going to be, and what the best model's going to be. >> I have a slide that you guys are sharing on the data scientist workflow. It starts with ingestion, selection, preparation, generation, transform, model. It's a complex set of tasks, and typically historically, at least in the last five or six years, different tools to do each of those. And not just different tools, multiples of different tools. That you had to cobble together. If I understand it correctly the Watson Data Platform was designed to really consolidate that and simplify that, provide collaboration tools for different personas, so my question is this.
Because you were involved in that product as well. And I was excited about it when I saw it, I talked to people about it, sometimes I hear the criticism of, well, IBM just took a bunch of legacy products, threw them together, threw an abstraction layer on top and is now going to wrap a bunch of services around it. Is that true? >> Absolutely not. Actually, you may have heard a while back IBM had made a big shift into a design-first methodology. So we started with the Watson Data Platform, the Data Science Experience, they started with a design-first approach. We looked at this, we said what do we want the experience to be, which persona do we want to target. Then we understood what we wanted the experience to be and then we leveraged the IBM analytics portfolio to be able to feed in and provide and integrate those services together to fit into that experience. So, it's not a dumping ground for, I'll take this product, it's part of Watson Data Platform, not at all the case. It was design first, and then integrate for that experience. >> OK, but there are some so-called legacy products in there, but you're saying you picked the ones that were relevant and then was there additional design done? >> There was a lot of work involved to take them from a traditional product, to be able to componentize, create a microservice architecture, I mean the whole works to be able to redesign it and fit into this new experience. >> So microservices architecture, runs on cloud, I think it only runs on cloud today right? >> Correct, correct. >> OK, maybe roadmap without getting too specific. What should we be paying attention to in the future? >> Right now we're doing our first release. Definitely we want to target any platform behind the firewall. So we don't have specific dates, but now we started with machine learning on a mainframe and we want to be able to target the other platforms behind the firewall and the private cloud environment.
Definitely we should be looking at that. Our goal is to make, I talked about the feedback loop a little bit, so that is essentially once you deploy the model we actually look at that model, you could schedule an evaluation, automatically, within the machine learning product. To be able to say, this model is still good enough. And if it's not we automatically flag it, and we look at the retraining process and redeployment process to make sure you always have the most up to date model. So this is truly machine learning where it requires very little to no intervention from a human. We're going to continue down that path and continue that automation in providing those capabilities so there's a bigger roadmap, there's a lot of things we're looking at. >> We've sort of looked at, our big data analyst George Gilbert has talked about, you had batch and you had interactive, now the sort of emergent workload is this continuous, streaming data. How do you see the adoption? First of all, is it a valid assertion? That there is a new class of workload, and then how do you see that adoption occurring? Is it going to be a dominant force over the next 10 years? >> Yeah, I think so. Like I said there is a huge buzz around machine learning in general and artificial intelligence, deep learning, all of these terms you hear about. I think as users and customers get more comfortable with understanding how they're going to leverage this in their enterprise, this real-time streaming of data and being able to do analytics on the fly and machine learning on the fly, it's a big deal and it will really help them be more competitive in their own space with the services we're providing. >> OK Steve, thanks very much for coming on theCUBE. We'll give you the last word. The event, a very intimate event, a lot of customers coming in very shortly here in just a couple of hours. Give us the bumper sticker.
>> All of that's very exciting, we're very excited, this is a big deal for us, that's why whenever IBM does a signature moment it's a big deal for us and we've got something cool to talk about, we're very excited about that. Lots of clients coming, so there's an entire session this afternoon, which will be live streamed as well. So it's great, I think we have a differentiating product and we're already getting that feedback from our customers. >> Well congratulations, I love the cadence that you're on. We saw some announcements in September, we're here in February, I expect we're going to see more innovation coming out of your labs in Toronto, and across IBM, so thank you very much for coming on theCUBE. >> Thank you. >> You're welcome. OK, keep it right there everybody, we'll be back with our next guest right after this short break. This is theCUBE, we're live from New York City. (energetic music)
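Editor's sketch: the feedback loop Astorino describes in this interview (evaluate a deployed model on a schedule, flag it when it is no longer good enough, then retrain and redeploy) might be outlined as follows. The accuracy threshold, the toy churn rule, and the names `evaluate_and_flag`, `feedback_loop`, and `retrain_threshold_model` are assumptions for illustration, not the product's API.

```python
def accuracy(predict, labelled_rows):
    """Fraction of recent, labelled observations the model gets right."""
    correct = sum(1 for x, y in labelled_rows if predict(x) == y)
    return correct / len(labelled_rows)

def evaluate_and_flag(predict, recent_rows, min_accuracy=0.8):
    """One scheduled evaluation: is the deployed model still good enough?"""
    acc = accuracy(predict, recent_rows)
    return {"accuracy": acc, "needs_retraining": acc < min_accuracy}

def retrain_threshold_model(rows):
    """Hypothetical retraining step: place the churn threshold midway
    between the mean complaint counts of the two classes."""
    stayed = [x for x, y in rows if y == 0]
    churned = [x for x, y in rows if y == 1]
    threshold = (sum(stayed) / len(stayed) + sum(churned) / len(churned)) / 2
    return lambda x: 1 if x > threshold else 0

def feedback_loop(predict, retrain, recent_rows, min_accuracy=0.8):
    """Evaluate, and only if the model is flagged, retrain and swap it in."""
    report = evaluate_and_flag(predict, recent_rows, min_accuracy)
    report["retrained"] = report["needs_retraining"]
    if report["needs_retraining"]:
        predict = retrain(recent_rows)
    return predict, report

# Deployed model learned long ago: churn only when complaints exceed 3.
stale_model = lambda complaints: 1 if complaints > 3 else 0

# Recent labelled data (complaints, churned): customers now leave much sooner.
recent_rows = [(0, 0), (1, 0), (2, 1), (3, 1), (4, 1), (1, 0)]

current_model, report = feedback_loop(stale_model, retrain_threshold_model, recent_rows)
```

For brevity the sketch retrains and re-checks on the same recent rows; a real loop would hold back separate data for the post-retraining evaluation.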

Published Date : Feb 15 2017

ENTITIES

Entity	Category	Confidence
Steve	PERSON	0.99+
Dave Vellante	PERSON	0.99+
George Gilbert	PERSON	0.99+
Steve Astorino	PERSON	0.99+
Stu Miniman	PERSON	0.99+
IBM	ORGANIZATION	0.99+
September	DATE	0.99+
Toronto	LOCATION	0.99+
90%	QUANTITY	0.99+
February	DATE	0.99+
Silicon Valley	LOCATION	0.99+
New York City	LOCATION	0.99+
Scala	TITLE	0.99+
New York City	LOCATION	0.99+
last year	DATE	0.99+
New York	LOCATION	0.99+
Python	TITLE	0.99+
Twitter	ORGANIZATION	0.99+
two	QUANTITY	0.99+
today	DATE	0.99+
twitter	ORGANIZATION	0.99+
R	TITLE	0.99+
both	QUANTITY	0.99+
Java	TITLE	0.99+
first release	QUANTITY	0.98+
three things	QUANTITY	0.98+
IBM Machine Learning Launch Event	EVENT	0.97+
one experience	QUANTITY	0.96+
one	QUANTITY	0.96+
Watson Data Platform	TITLE	0.96+
first approach	QUANTITY	0.95+
Watson	TITLE	0.95+
Steve nirvana	PERSON	0.94+
Watson Data Platform	TITLE	0.93+
Spark	TITLE	0.93+
six years	QUANTITY	0.92+
First	QUANTITY	0.91+
Watson Services	ORGANIZATION	0.91+
this afternoon	DATE	0.9+
first	QUANTITY	0.89+
last six months	DATE	0.89+
each	QUANTITY	0.86+
#IBMML	TITLE	0.82+
Astorino	PERSON	0.77+
Dinesh	ORGANIZATION	0.76+
CUBE	ORGANIZATION	0.74+
next 10 years	DATE	0.72+
Private Cloud Analytics Platform	TITLE	0.71+
a year	QUANTITY	0.65+
first design methodology	QUANTITY	0.65+
of clients	QUANTITY	0.62+
Watson	ORGANIZATION	0.55+
Loop	OTHER	0.48+

Robbie Strickland, IBM - Spark Summit East 2017 - #SparkSummit - #theCUBE

>> Announcer: Live from Boston, Massachusetts, this is theCube. Covering Spark Summit East 2017, brought to you by Databricks. Now here are your hosts Dave Vellante and George Gilbert. >> Welcome back to theCube, everybody, we're here in Boston. The Cube is the worldwide leader in live tech coverage. This is Spark Summit, hashtag #SparkSummit. And Robbie Strickland is here. He's the Vice President of Engines & Pipelines, I love that title, for the Watson Data Platform at IBM Analytics, formerly with The Weather Company that was acquired by IBM. Welcome to theCube, good to see you. >> Thank you, good to be here. >> So, my standing tongue-in-cheek line is the industry's changing, Dell buys EMC, IBM buys The Weather Company. >> Robbie: That's right. >> Wow! That sort of says it all, right? But it was kind of a really interesting blockbuster acquisition. Great for the folks at The Weather Company, great for IBM, so give us the update. Where are we at today? >> So, it's been an interesting first year. Actually, we just hit our first anniversary of the acquisition and a lot has changed. Part of my role, new role at IBM, having come from The Weather Company, is a byproduct of the two companies bringing our best analytics work and kind of pulling those together. I don't know if we have some water but that would be great. So, (coughs) excuse me. >> Dave: So, let me chat for a bit. >> Thanks. >> Feel free to clear your throat. So, you were at IBM, the conference at the time was called IBM Insight. It was the day before the acquisition was announced and we had David Kenny on. David Kenny was the CEO of The Weather Company. And I remember we were talking, and I was like, wow, you have such an interesting business model. Off camera, I was like, what do you want to do with this company, you guys are like prime. Are you going public, you going to sell this thing, I know you have an MBA background. And he goes, "Oh, yeah, we're having fun." 
Next day was the announcement that IBM bought The Weather Company. I saw him later and I was like, "Aha!" >> And now he's the leader of the Watson Group. >> That's right. >> Which is part of our, The Weather Company joined The Watson Group. >> And The Cloud and analytics groups have come together in recognition that analytics and The Cloud are peanut butter and jelly. >> Robbie: That's absolutely right. >> And David's running that organization, right? >> That is absolutely right. So, it's been an exciting year, it's been an interesting year, a lot of challenges. But I think where we are now with the Watson Data Platform is a real recognition that the use case where we want to try to make data and analytics and machine learning and operationalizing all of those, that that's not easy for people. And we need to make that easy. And our experience doing that at The Weather Company and all the challenges we ran into have informed the organization, have informed the road map and the technologies that we're using to kind of move forward on that path. >> And The Watson Data Platform was announced in, I believe, October. >> Robbie: That's right. >> You guys had a big announcement in New York City. And you took many sort of components that were viewed as individual discrete functions-- >> Robbie: That's right. >> And brought them together in a single data pipeline. Is that right? >> Robbie: That's right. >> So, maybe describe that a little bit for our audience. >> So, the vision is, you know, one of the things that's missing in the market today is the ability to easily grab data from some source, whether it's a database or a Kafka stream, or some sort of streaming data feed, which is actually something that's often overlooked. Usually you have platforms that are oriented around streaming data, data feeds, or oriented around data at rest, batch data. One of the things that we really wanted to do was sort of combine those two together because we think that's really important. 
So, to be able to easily acquire data at scale, bring it into a platform, orchestrate complex workflows around that, with the objective, of course, of data enrichment. Ultimately, what you want to be able to do is take those raw signals, whatever they are, and turn that into some sort of enriched data for your organization. And so, for example, we may take signals in from a mobile app, things like beacons, usage beacons on a mobile app, and turn that into a recommendation engine so we can feed real time content decisions back into a mobile platform. Well, that's really hard right now. It requires lots of custom development. It requires you to essentially stitch together your pipeline end to end. It might involve a machine learning pipeline that runs a training pipeline. It might involve, it's all batch oriented, so you land your data somewhere, you run this machine learning pipeline maybe in Spark or Hadoop or whatever you've got. And then the results of that get fed back into some data store that gets merged with your online application. And then you need to have a RESTful API or something for your application to consume that and make decisions. So, our objective was to take all of the manual work of standing up those individual pieces and build a platform where that is just, that's what it's designed to do. It's designed to orchestrate those multiple combinations of real time and batch flows. And then with a click of a button and a few configuration options, stand up a RESTful service on top of whatever the results are. You know, either at an interim stage or at the end of the line. >> And you guys gave an example. You actually showed a demo at the announcement. And I think it was a retail example, and you showed a lot of what would traditionally be batch processes, and then real time, a recommendation came up and completed the purchase. The inference was this is an out of the box software solution. >> Robbie: That's right. 
>> And that's really what you're saying you've developed. A lot of people would say, oh, it's IBM, they've cobbled together a bunch of their old products, stuck them together, put an abstraction layer on, and wrapped a bunch of services around it. I'm hearing from you-- >> That's exactly, that's just WebSphere. It's WebSphere repackaged. >> (laughing) Yeah, yeah, yeah. >> No, it's not that. So, one of the things that we're trying to do is, if you look at our cloud strategy, I mean, this is really part and parcel, I mean, the nexus of the cloud strategy is the Watson Data Platform. What we could have done is we could have said let's build a fantastic cloud and compete with Amazon or Google or Microsoft. But what we realized is that there is a certain niche there of people who want to take individual services and compose them together and build an application. Mostly on top of just raw VMs with some additional, you know, let's stitch together something with Lambda or stitch together something with SQS, or whatever it may be. Our objective was to sort of elevate that a bit, not try to compete on that level. And say, how do we bring Enterprise grade capabilities to that space. Enterprise grade data management capabilities end-to-end application development, machine learning as a first class citizen, in a cohesive experience. So that, you know, the collaboration is key. We want to be able to collaborate with business users, data scientists, data engineers, developers, API developers, the consumers of the end results of that, whether they be mobile developers or whatever. One of the things that is sort of key, I think, to the vision is that these roles that we've traditionally looked at. If you look at the way that tool sets are built, they're very targeted to specific roles. The data engineer has a tool, the data scientist has a tool. And what's been the difficult part is the boundaries between those have been very firm and the collaboration has been difficult. 
And so, we draw the personas as a Venn diagram. Because it's very difficult, especially if you look at a smaller company, and even sometimes larger companies, the data engineer is the data scientist. The developer who builds the mobile application is the data scientist. And then in some larger organizations, you have very large teams of data scientists that have these artificial barriers between the data scientist and the data engineer. So, how do we solve both cases? And I think the answer was for us a platform that allows for seamless collaboration where there are not these clean lines between the personas, that the tool sets easily move from one to the other. And if you're one of those hybrid people that works across lines, that the tool feels like it's one tool for you. But if you're two different teams working together, that you can easily hand off. So, that was one of the key objectives we're trying to answer. >> Definitely an innovative component of the announcement, for sure. Go ahead, George. >> So, help us sort of bracket how mature this end-to-end tool suite is in terms of how much of the pipeline it addresses. You know, from the data origin all the way to a trained model and deploying that model. Sort of what's there now, what's left to do. >> So, there are a few things we've brought to market. Probably the most significant is the Data Science Experience. The Data Science Experience is oriented around data science and has, as its sort of central interface, Jupyter Notebooks. Now, as well as, we brought in RStudio, and those sorts of things. The idea there being that we'll start with the collaboration around data scientists. So, data scientists can use their language of choice, collaborate around data sets, save out the results of their work and have it consumed either publicly by some other group of data scientists. But the collaboration among data scientists, that was sort of step one. 
There's a lot of work going on that's sort of ongoing, not ready to bring to market, around how do we simplify machine learning pipelines specifically, how do we bring governance and lineage, and catalog services and those sorts of things. And then the ingest, one of the things we're working on that we have brought to market is our product called Lift which connects, as well. And that's bringing large amounts of data easily into the platform. There are a few components that have sort of been brought to market. dashDB, of course, is a key source of data clouded. So, one of the things that we're working on is some of these existing technologies that actually really play well into the ecosystem, trying to tie them well together. And then add the additional glue pieces. >> And some of your information management and governance components, as well. Now, maybe that is a little bit more legacy but they're proven. And I don't know if the exits and entries into those systems are as open, I don't know, but there's some capabilities there. >> Speaking of openness, that's actually a great point. If you look at the IIG suite, it's a great on-premise suite. And one of the challenges that we've had in sort of past IBM cloud offerings is a lot of what has been the M.O. in the past is take a great on-prem solution and just try to stand it up as a service in the cloud. Which in some cases has been successful, in other cases, less so. One of the things we're trying to look at with this platform is how do we leverage (a) open source. So that whatever you may already be running open source on, on-prem or in some other provider, that it's very easy to move your workloads. So, we want to be able to say, if you've got 10,000 lines of fraud detection code in MapReduce, you don't need to rewrite that in anything. You can just move it. 
And the other thing is where our existing legacy tech doesn't necessarily translate well to the cloud, our first strategy is see if there's any traction around an existing open source project that satisfies that need, and try to see if we can build on that. Where there's not, we go cloud first and we build something that's tailor made to come out. >> So, who's the first one or two customers for this platform? Is it like IBM Global Business Services where they're building the semi-custom industry apps? Or is it the very, very big and sophisticated, like banks and Telcos who are doing the same? Or have you gotten to the point where you can push it out to a much wider audience? >> That's a great question, and it's actually one that is a source of lots of conversation internally for us. If you look at where the data science experience is right now, it's a lot of individual data scientists, you know, small companies, those sorts of things coming together. And a lot of that is because some of the sophistication that we expect for Enterprise customers is not quite there yet. So, we wouldn't expect Enterprise customers to necessarily be onboarded as quickly at the moment. But if we look at sort of the, so I guess there's maybe a medium term answer and a long term answer. I think the long term answer is definitely the Enterprise customers, you know, leveraging IBM's huge entry point into all of those customers today, there's definitely a play to be made there. And one of the things that we're differentiating, we think, over an AWS or Google, is that we're trying to answer that use case in a way that they really aren't even trying to answer it right now. And so, that's one thing. The other is, you know, going beta with a launch customer that's a healthcare provider or a bank where they have all sorts of regulatory requirements, that's more complicated. 
And so, we are looking at, in some cases, we're looking at those banks or healthcare providers and trying to carve off a small niche use case that doesn't actually fall into the category of all those regulatory requirements. So that we can get our feet wet, get the tires kicked, those sorts of things. And in some cases we're looking for less traditional Enterprise customers to try to launch with. So, that's an active area of discussion. And one of the other key ones is The Weather Company. Trying to take The Weather Company workloads and move The Weather Company workloads. >> I want to come back to The Weather Company. When you did that deal, I was talking to one of your executives and he said, "Why do you think we did the deal?" I said, "Well, you've got 1500 data scientists, "you've got all this data, you know, it's the future." He goes, "Yeah, it's also going to be a platform "for IOT for IBM." >> Robbie: That's right. >> And I was like, "Hmmm." I get the IOT piece, how does it become a platform for IBM's IOT strategy? Is that really the case? Is that transpiring and how so? >> It's interesting because that was definitely one of the key tenets behind the acquisition. And what we've been working on so hard over the last year, as I'm sure you know, sometimes boxes and arrows on an architecture diagram and reality are more challenging. >> Dave: (laughing) Don't do that. >> And so, what we've had to do is reconcile a lot of what we built at The Weather Company, existing IBM tech, and the new things that were in flight, and try to figure out how can we fit all those pieces together. And so, it's been complicated but also good. In some cases, it's just people and expertise. And bringing those people and expertise and leaving some of the software behind. And other cases, it's actually bringing software. So, the story is, obviously, where the rubber meets the road, more complicated than what it sounds like in the press release. 
But the reality is we've combined those teams and they are all moving in the same direction together with various bits and pieces from the different teams. >> Okay, so, there's vision and then the road map to execute on that, and it's going to unfold over several years. >> Robbie: That's right. >> Okay, good. Stuff at the event here, I mean, what are you seeing, what's hot, what's going on with Spark? >> I think one of the interesting things with what's going on with Spark right now is a lot of the optimizations, especially things around GPUs and that. And we're pretty excited about that, being a hardware manufacturer, that's something that is interesting to us. We run our own cloud. Where some people may not be able to immediately leverage those capabilities, we're pretty excited about that. And also, we're looking at some of those, you know, taking Spark and running it on Power and those sorts of things to try to leverage the hardware improvements. So, that's one of the things we're doing. >> Alright, we have to leave it there, Robbie. Thanks very much for coming on theCube, really appreciate it. >> Thank you. >> You're welcome. Alright, keep it right there, everybody. We'll be right back with our next guest. This is theCube. We're live from Spark Summit East, hashtag #SparkSummit. Be right back. >> Narrator: Since the dawn of The Cloud, theCube.
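
The pipeline Robbie outlines in this interview (ingest both batch data and a streaming feed, enrich the raw signals into recommendations, and expose the result to an application) can be sketched in miniature as follows. This is only an illustration under stated assumptions, not the Watson Data Platform's API: `RecommendationPipeline`, its `ingest` and `recommend` methods, and the popularity-based ranking are hypothetical stand-ins, and `recommend` plays the role that a RESTful endpoint would play in a real deployment.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

Event = Tuple[str, str]  # (user_id, item_id) usage beacon -- hypothetical shape


class RecommendationPipeline:
    """Toy pipeline: the same ingest path serves batch loads and live events."""

    def __init__(self) -> None:
        self.item_counts: Counter = Counter()   # enriched signal: item popularity
        self.seen: Dict[str, Set[str]] = {}     # items each user already used

    def ingest(self, events: Iterable[Event]) -> None:
        """Fold events into the enriched state. Works for a historical
        batch or for a single event arriving from a stream."""
        for user, item in events:
            self.item_counts[item] += 1
            self.seen.setdefault(user, set()).add(item)

    def recommend(self, user: str, k: int = 2) -> List[str]:
        """Stand-in for the service layer: most popular items the user has
        not seen yet, with ties broken alphabetically for determinism."""
        already = self.seen.get(user, set())
        ranked = sorted(self.item_counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return [item for item, _ in ranked if item not in already][:k]


# Usage: load history in one batch, then apply one streaming event.
pipeline = RecommendationPipeline()
pipeline.ingest([("a", "radar"), ("b", "radar"), ("b", "forecast"), ("c", "alerts")])
pipeline.ingest([("c", "forecast")])
for_a = pipeline.recommend("a")
for_c = pipeline.recommend("c", k=1)
```

The design point being illustrated is that batch and real-time data share one ingest path and one serving call, rather than being stitched together from separate systems.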

Published Date : Feb 9 2017

ENTITIES

Entity	Category	Confidence
David	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
George Gilbert	PERSON	0.99+
George	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Boston	LOCATION	0.99+
The Weather Company	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Robbie	PERSON	0.99+
Dave	PERSON	0.99+
Robbie Strickland	PERSON	0.99+
Watson Group	ORGANIZATION	0.99+
David Kenny	PERSON	0.99+
October	DATE	0.99+
New York City	LOCATION	0.99+
1500 data scientists	QUANTITY	0.99+
two companies	QUANTITY	0.99+
10,000 lines	QUANTITY	0.99+
Dell	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
One	QUANTITY	0.99+
both cases	QUANTITY	0.99+
Boston Massachusetts	LOCATION	0.99+
Spark Summit	EVENT	0.99+
IBM Analytics	ORGANIZATION	0.99+
Spark	TITLE	0.99+
one	QUANTITY	0.99+
ADO	TITLE	0.99+
Lambda	TITLE	0.99+
Telcos	ORGANIZATION	0.99+
The Cloud	ORGANIZATION	0.98+
Spark Summit East 2017	EVENT	0.98+
first strategy	QUANTITY	0.98+
IBM Global Business Services	ORGANIZATION	0.98+
EMC	ORGANIZATION	0.98+
one tool	QUANTITY	0.98+
first anniversary	QUANTITY	0.98+
Databricks	ORGANIZATION	0.98+
last year	DATE	0.98+
today	DATE	0.97+
two customers	QUANTITY	0.97+
single	QUANTITY	0.97+
SQS	TITLE	0.97+
first year	QUANTITY	0.97+
two	QUANTITY	0.96+
two different teams	QUANTITY	0.96+
WebSphere	TITLE	0.96+
#SparkSummit	EVENT	0.95+
Jupyter	ORGANIZATION	0.95+
Watson Data Platform	TITLE	0.94+
Kafka	TITLE	0.94+

Search Results for Watson Data Platform: