Manish Goyal, IBM - IBM Fast Track Your Data 2017

>> Announcer: Live from Munich, Germany, it's theCUBE, covering IBM Fast Track Your Data. Brought to you by IBM.
>> We're back in Munich, Germany. This is Fast Track Your Data and this is theCUBE, the leader in live tech coverage. We go out to the events and extract the signal from the noise. My name is Dave Vellante and I'm here with my co-host Jim Kobielus. We just came off of the main stage. IBM had a very choreographed, really beautiful production. Kate Silverton of BBC fame was there, talking to various folks within the IBM community: IBM executives, practitioners. Quite a main stage production, Jim. IBM always knows how to do it right. Manish Goyal is here. He's the Director of Product Management for the Watson Data Platform, something we covered on theCUBE extensively, that announcement last year in New York City. Manish, welcome to theCUBE.
>> Thank you for having me.
>> Dave: So it really was your signature moment, back last fall at Strata in New York City. We covered that: big announcement, lot of customers there. You guys demonstrated sort of the next generation of platform that you were announcing.
>> Manish: That's right.
>> So bring us up to date. How's it going, where are we at, and what are you guys doing here?
>> So, again, thank you for having me.
>> Dave: You're welcome.
>> Let me take a minute to just let all the viewers know what the Watson Data Platform is. So the Watson Data Platform is our cloud analytics platform, and it's really three things. It's a set of composable data services to ingest, analyze, and process data. It's a set of tailor-made experiences for the different personas, whether you are a data engineer, a business analyst, a data scientist, or the steward. And connecting both of these is a data fabric, which is really the secret sauce.
And think of this as being the governance layer that ensures that everything that is being done by any of these personas is working on trusted data, and that the insights being generated can be trusted by the risk folks and the business folks as they put the analytics into production.
>> Dave: So just to review for our audience, there are a number of components to the Watson Data Platform.
>> That's right, yep.
>> Dave: There are the governance components you mentioned, there's the visualization, there's analytics. Now, many people criticized Watson Data Platform. They said, "Oh, it's just IBM putting a bunch of disparate products together, some acquisitions, and then wrapping some services around it." When we talked to you guys in October, you said no, no, that's not the case. Can you affirm that?
>> That is exactly right, that is not the case. It's not just us putting stuff together, calling it a new name, and thinking, "Oh, that's the platform," just a set of disparate services. It is absolutely not, and that's why I was emphasizing this common data fabric, right. Let me dive a little bit deeper into it.
>> Sure, great.
>> Manish: So the biggest problem that customers and data users in general complain about is that it's extremely hard to find data, right. The tools that they're working with are all siloed. So even if you and I are working on our analytics projects, it's very hard for me to share with you what I'm working on, the environment that I'm running on, et cetera. And the third piece is a real issue: is the data that I'm working with trusted? Can I actually believe that this is the best data I can use, so that when I create my machine learning models and put them into my production environment, the risk guys are going to be fine with it, and I'm going to be fine with the results that I'm getting?
And so this data fabric is addressing these issues. It's addressing them first and foremost with a data catalog, a governance layer, so that it's very clear, irrespective of whether you're a data engineer, business analyst, data scientist, or the data steward from the CDO's office, that you're all working off the same version of the truth, right.
>> Jim: Manish, is that something like a DevOps platform? Is it like DevOps for data science or for machine learning development, or is it... How would you describe... Does that make sense? The automated release pipeline that's--
>> Manish: In a way, yes.
>> With the governance baked in?
>> Yes, that's one way to describe it. So that's one aspect, right? Making sure that you're working with trusted data and making it very easy to find the data, that's the governance aspect. The second piece that really makes this a platform is that you're working off the same notion of a workspace. We call it a project. So you may start out as a data engineer being asked to take all these different data sources that are coming in and create and publish a data set that can be consumed for dashboarding, for data analysis, whatever. And you're working on that in a project. Now if you have a data science team that needs to be working on the same thing, you can just invite them to the same project. So they're working on the same thing, and similarly for a business analyst, et cetera. And when we talk about governance, it's not just about data sets, it's all analytical products. So the models that you're creating are being put back into the catalog and governed. Data flows--
>> It's model governance.
>> Jim: Model governance, it's model governance?
>> Exactly.
>> And aiding governance.
>> Manish: So it's a huge problem that customers have.
I was just talking to a large insurance company yesterday, and their question was, "What are you doing to make sure that I don't have to spend the enormous amount of time that I do with the risk group before I can put a model into production?" Because they want complete lineage all the way back, saying, "Okay, you created this model, you're going to put it into production, whether it's for allowing credit card insurance, whatever your product is that you're selling. How do you make sure that there's no bias in the model that is created? Can you show me the data set on which you trained it? And then when you retrained it, can you show me that data set?" So in case they're audited, there's a complete way to go back from the production model all the way to the data set that was created, and even further back to all the different data sources: where it was cleansed, et cetera, the ETL, where it was published and then picked up by the data science team. So all of these things are put together with this data fabric, governance being a huge, huge portion of that, going across everything that we're doing. Giving these tailor-made experiences for the different business personas, oh sorry, the data personas, and just making it extremely simple to generate insights that can be trusted. That is what we are trying to do with the Watson Data Platform. Since last fall when we announced it, we have had a huge update on our Data Science Experience. You heard a lot about that in the presentation this morning. As well as all of our other cloud data services and the governance put forth.
>> Dave: And that Data Science Experience is embedded, fundamental to the platform.
>> It is, it is.
>> Dave: You know, I want to ask you about that. Because I don't know if you remember, Jim and Manish, a few years ago, several years ago, Pivotal announced this thing called Chorus. It was a collaboration platform and it really went nowhere.
Now part of the reason it went nowhere was because it was early days, but also there wasn't the analytics solution underneath it. But a lot of people questioned, "Well, do we really need to collaborate across those personas?" Again, maybe they were immature at the time. So convince me that there's a need for that and that this is actually getting used in the world.
>> You've probably seen the Venn diagram for the data scientist, right? With all the different skills that they need, they are a unicorn, and there are no unicorns. It's extremely hard for our customers. In fact, just finding really good data scientists is extremely hard. There's a very limited supply of that talent. So that's one thing, right. You can't find enough of these folks to scale out the level of analytics that is needed if you want to use data for a comparative advantage. So that's one aspect, right, talent being a huge issue. The second aspect is that you really do need specialized skill in data engineering. You don't want your PhD data scientists spending 60% of their time finding and cleansing data. You have folks who do that really well, and you want to enable them to work closely with the data science team. And you really do need business analysts, who are the key to understanding the business problem that needs to be solved, because that's where you always want to start any analytics product. What is it that you're trying to improve, or reduce cost on, or whatever the problem is that you're addressing? And so it really is a team sport. You can't do it without that. Now if it is a team sport, how are these folks going to collaborate, right? And that is why, in all of our interactions with our customers and their data science teams.
They absolutely love the collaboration features that we have put in. We have put a lot of effort into Data Science Experience, and the same collaboration features are actually going to extend across the portfolio of these experiences on the data platform.
>> And the whole notion of personas is so fundamental to Watson Data Platform. I'm wondering, is IBM evolving the range and variety of personas for which you're providing these experiences? What I mean by that is, for example, we see more and more data science application development projects focusing on chat bots. That involves human conversation, so you possibly need another persona, a computational linguist. Or cognitive IoT, like Watson IoT: that's sensors, that's hardware devices, maybe hardware engineers, hardware engineering experiences. What I'm getting at is that data science centric projects are increasingly moving from the totally virtual world to being very much embedded in the physical world and the world of human guided, machine learning guided conversation. What are your thoughts about evolving the personas mix?
>> So application developers are the persona I actually missed when I was talking about this before. It's absolutely central, because almost anything that the data science team is doing is going to, at the end of the day, create models. But the hope is that they're going to be put into a production system. And that job is typically the role of an application developer. Now, Jim, you mentioned there's a lot of emphasis these days on conversational chat bots. And again, at the end of the day, with data science projects you are in many ways trying to improve the experience that you're giving your customers, or personalizing the experience that you're giving your customers. A celebrity experience that Rob talked about this morning.
And there are other personas involved in that sense. So to get a chat bot right, I mean, there is data that you can obviously harvest and use to create that flow and intelligence in a chat bot. But there are elements where you do need a subject matter expert to curate that, to make sure that it doesn't seem robotic, that it does feel genuine. And so there is a role for a subject matter expert, who sort of collaborates with a business analyst role, or persona. But yes, all of these roles play an important part in putting together the entire package so that it just feels seamless. And that's why I come back to saying that it is a team sport, and if you do not enable the teams to work closely together and enhance their productivity, you cannot go after all the data that's being generated and all the opportunity that data is presenting. And the prize is to gain a competitive advantage.
>> Dave: One of the things, Manish, you demonstrated last fall was a sort of recommendation engine, very personalized. It was quite a nice demo, and it wasn't a fake demo from what I understood. It was real data. Can you share with us, in the time we have remaining, some of your favorite examples of how people are applying the Watson Data Platform and affecting business?
>> Manish: Sure, yeah, I'll tell you a couple of examples. I was actually in London earlier this week, meeting with a customer, and they are using DSX, our Data Science Experience, with a couple of utility companies. One is a water utility company. And the problem that they're trying to solve is, they're supplying water in a hilly area and they want to optimize the power that they use to run the pumps that pump out water. Because it can be very expensive if the pumps are running all the time, et cetera.
And so they're using Data Science Experience to optimize when and how long the pumps need to run, to ensure that the customers are happy with the level of water supply that they're getting and the force that they're getting it with, while the utility company is optimizing the expense of actually powering these things. So that's just a recent example that comes to mind. There are others. There's a huge logistics and transportation company that's using Data Science Experience to optimize the refrigeration of the storage units that are going all across the globe, transporting food and other articles like that: how they can optimize the temperature of the goods they're transporting, again to make sure that there's absolutely the minimum amount of wastage in the transportation process, while at the same time optimizing the cost that they incur, because all of that shows up in the end product that you and I buy from retailers.
>> Dave: And is there instrumentation in the field involved in that? Is that kind of a semi-IoT example?
>> Absolutely, right. Actually, in both of these cases. In one case there are smart meters that are throwing out data every 15 minutes. In the other example, the logistics one, the data is almost streaming in. So in one case you can use batch processing, even though the data is coming in at 15 minute intervals, to predict what you want to do. In the other case it's streaming data, which you want to analyze as it streams.
>> Excellent. All right, well, exciting times here for you and your group.
>> Absolutely.
>> Dave: Congratulations on getting the product out and getting it adopted.
>> Thank you.
>> Glad to see that. And thanks for coming on theCUBE.
>> Manish: Thank you. Thanks for having me.
>> Alright!
>> Dave: Keep it right there, everybody. Jim and I will be back. We're live from Munich, Germany, unscripted, bringing theCUBE to you. Bringing Fast Track Your Data.
We'll be right back. (techno music)

Published Date : Jun 24 2017
