Robbie Strickland, IBM - Spark Summit East 2017 - #SparkSummit - #theCUBE
>> Announcer: Live from Boston, Massachusetts, this is theCube. Covering Spark Summit East 2017, brought to you by Databricks. Now here are your hosts, Dave Vellante and George Gilbert.
>> Welcome back to theCube, everybody, we're here in Boston. The Cube is the worldwide leader in live tech coverage. This is Spark Summit, hashtag #SparkSummit. And Robbie Strickland is here. He's the Vice President of Engines & Pipelines, I love that title, for the Watson Data Platform at IBM Analytics, formerly with The Weather Company that was acquired by IBM. Welcome to theCube, good to see you.
>> Thank you, good to be here.
>> So, my standing tongue-in-cheek line is the industry's changing, Dell buys EMC, IBM buys The Weather Company.
>> Robbie: That's right.
>> Wow! That sort of says it all, right? But it was kind of a really interesting blockbuster acquisition. Great for the folks at The Weather Company, great for IBM, so give us the update. Where are we at today?
>> So, it's been an interesting first year. Actually, we just hit our first anniversary of the acquisition and a lot has changed. Part of my role, new role at IBM, having come from The Weather Company, is a byproduct of the two companies bringing our best analytics work and kind of pulling those together. I don't know if we have some water but that would be great. So, (coughs) excuse me.
>> Dave: So, let me chat for a bit.
>> Thanks.
>> Feel free to clear your throat. So, you were at IBM, the conference at the time was called IBM Insight. It was the day before the acquisition was announced and we had David Kenny on. David Kenny was the CEO of The Weather Company. And I remember we were talking, and I was like, wow, you have such an interesting business model. Off camera, I was like, what do you want to do with this company, you guys are like prime. Are you going public, you going to sell this thing, I know you have an MBA background. And he goes, "Oh, yeah, we're having fun."
Next day was the announcement that IBM bought The Weather Company. I saw him later and I was like, "Aha!"
>> And now he's the leader of the Watson Group.
>> That's right.
>> Which is part of our, The Weather Company joined the Watson Group.
>> And the cloud and analytics groups have come together in recognition that analytics and the cloud are peanut butter and jelly.
>> Robbie: That's absolutely right.
>> And David's running that organization, right?
>> That is absolutely right. So, it's been an exciting year, it's been an interesting year, a lot of challenges. But I think where we are now with the Watson Data Platform is a real recognition that the use case where we want to try to make data and analytics and machine learning and operationalizing all of those, that that's not easy for people. And we need to make that easy. And our experience doing that at The Weather Company and all the challenges we ran into have informed the organization, have informed the road map and the technologies that we're using to kind of move forward on that path.
>> And the Watson Data Platform was announced in, I believe, October.
>> Robbie: That's right.
>> You guys had a big announcement in New York City. And you took many sort of components that were viewed as individual discrete functions--
>> Robbie: That's right.
>> And brought them together in a single data pipeline. Is that right?
>> Robbie: That's right.
>> So, maybe describe that a little bit for our audience.
>> So, the vision is, you know, one of the things that's missing in the market today is the ability to easily grab data from some source, whether it's a database or a Kafka stream, or some sort of streaming data feed, which is actually something that's often overlooked. Usually you have platforms that are oriented around streaming data, data feeds, or oriented around data at rest, batch data. One of the things that we really wanted to do was sort of combine those two together because we think that's really important.
So, to be able to easily acquire data at scale, bring it into a platform, orchestrate complex workflows around that, with the objective, of course, of data enrichment. Ultimately, what you want to be able to do is take those raw signals, whatever they are, and turn that into some sort of enriched data for your organization. And so, for example, we may take signals in from a mobile app, things like beacons, usage beacons on a mobile app, and turn that into a recommendation engine so we can feed real-time content decisions back into a mobile platform. Well, that's really hard right now. It requires lots of custom development. It requires you to essentially stitch together your pipeline end to end. It might involve a machine learning pipeline that runs a training pipeline. It might involve, it's all batch oriented, so you land your data somewhere, you run this machine learning pipeline maybe in Spark or Hadoop or whatever you've got. And then the results of that get fed back into some data store that gets merged with your online application. And then you need to have a RESTful API or something for your application to consume that and make decisions. So, our objective was to take all of the manual work of standing up those individual pieces and build a platform where that is just, that's what it's designed to do. It's designed to orchestrate those multiple combinations of real-time and batch flows. And then with a click of a button and a few configuration options, stand up a RESTful service on top of whatever the results are. You know, either at an interim stage or at the end of the line.
>> And you guys gave an example. You actually showed a demo at the announcement. And I think it was a retail example, and you showed a lot of what would traditionally be batch processes, and then real time, a recommendation came up and completed the purchase. The inference was this is an out of the box software solution.
>> Robbie: That's right.
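For readers following along, the flow Robbie describes above (usage beacons come in, a batch job trains a recommendation model, and the results sit in a store that an API can query) can be sketched roughly as follows. This is a minimal illustration in plain Python under stated assumptions, not IBM's implementation: the names `Beacon`, `train_co_occurrence`, and `recommend` are hypothetical, the data is made up, and simple co-occurrence counting stands in for whatever machine learning pipeline would actually run in Spark or elsewhere.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Beacon:
    """One usage signal from a mobile app (hypothetical shape)."""
    user_id: str
    item_id: str


def train_co_occurrence(beacons):
    """Batch step: count how often two items were used by the same user."""
    items_by_user = defaultdict(set)
    for beacon in beacons:
        items_by_user[beacon.user_id].add(beacon.item_id)

    model = defaultdict(Counter)
    for items in items_by_user.values():
        for first, second in combinations(sorted(items), 2):
            model[first][second] += 1
            model[second][first] += 1
    return model


def recommend(model, item_id, k=2):
    """Serving step: the lookup a REST endpoint could expose for one item."""
    # Sort by descending count, then by item id so ties are deterministic.
    ranked = sorted(model.get(item_id, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


# A landed batch of raw signals (made-up data for illustration only).
beacons = [
    Beacon("u1", "radar"), Beacon("u1", "hourly"),
    Beacon("u2", "radar"), Beacon("u2", "hourly"), Beacon("u2", "pollen"),
    Beacon("u3", "radar"), Beacon("u3", "pollen"),
]
model = train_co_occurrence(beacons)
print(recommend(model, "hourly"))
```

In a real deployment each of the three pieces (ingest, training, serving) would be a separate system, which is the stitching-together work the platform is meant to remove.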
>> And that's really what you're saying you've developed. A lot of people would say, oh, it's IBM, they've cobbled together a bunch of their old products, stuck them together, put an abstraction layer on, and wrapped a bunch of services around it. I'm hearing from you--
>> That's exactly, that's just WebSphere. It's WebSphere repackaged.
>> (laughing) Yeah, yeah, yeah.
>> No, it's not that. So, one of the things that we're trying to do is, if you look at our cloud strategy, I mean, this is really part and parcel, I mean, the nexus of the cloud strategy is the Watson Data Platform. What we could have done is we could have said let's build a fantastic cloud and compete with Amazon or Google or Microsoft. But what we realized is that there is a certain niche there of people who want to take individual services and compose them together and build an application. Mostly on top of just raw VMs with some additional, you know, let's stitch together something with Lambda or stitch together something with SQS, or whatever it may be. Our objective was to sort of elevate that a bit, not try to compete on that level. And say, how do we bring enterprise-grade capabilities to that space. Enterprise-grade data management capabilities, end-to-end application development, machine learning as a first class citizen, in a cohesive experience. So that, you know, the collaboration is key. We want to be able to collaborate with business users, data scientists, data engineers, developers, API developers, the consumers of the end results of that, whether they be mobile developers or whatever. One of the things that is sort of key, I think, to the vision is these roles that we've traditionally looked at. If you look at the way that tool sets are built, they're very targeted to specific roles. The data engineer has a tool, the data scientist has a tool. And what's been the difficult part is the boundaries between those have been very firm and the collaboration has been difficult.
And so, we draw the personas as a Venn diagram. Because it's very difficult, especially if you look at a smaller company, and even sometimes larger companies, the data engineer is the data scientist. The developer who builds the mobile application is the data scientist. And then in some larger organizations, you have very large teams of data scientists that have these artificial barriers between the data scientist and the data engineer. So, how do we solve both cases? And I think the answer was for us a platform that allows for seamless collaboration where there is not these clean lines between the personas, that the tool sets easily move from one to the other. And if you're one of those hybrid people that works across lines, that the tool feels like it's one tool for you. But if you're two different teams working together, that you can easily hand off. So, that was one of the key objectives we're trying to answer.
>> Definitely an innovative component of the announcement, for sure. Go ahead, George.
>> So, help us sort of bracket how mature this end-to-end tool suite is in terms of how much of the pipeline it addresses. You know, from the data origin all the way to a trained model and deploying that model. Sort of what's there now, what's left to do.
>> So, there are a few things we've brought to market. Probably the most significant is the Data Science Experience. The Data Science Experience is oriented around data science and has, as its sort of central interface, Jupyter Notebooks. Now, as well, we brought in RStudio, and those sorts of things. The idea there being that we'll start with the collaboration around data scientists. So, data scientists can use their language of choice, collaborate around data sets, save out the results of their work and have it consumed either publicly by some other group of data scientists. But the collaboration among data scientists, that was sort of step one.
There's a lot of work going on that's sort of ongoing, not ready to bring to market, around how do we simplify machine learning pipelines specifically, how do we bring governance and lineage, and catalog services and those sorts of things. And then the ingest, one of the things we're working on that we have brought to market is our product called Lift which connects, as well. And that's bringing large amounts of data easily into the platform. There are a few components that have sort of been brought to market. dashDB, of course, is a key source of data clouded. So, one of the things that we're working on is some of these existing technologies that actually really play well into the ecosystem, trying to tie them well together. And then add the additional glue pieces.
>> And some of your information management and governance components, as well. Now, maybe that is a little bit more legacy but they're proven. And I don't know if the exits and entries into those systems are as open, I don't know, but there's some capabilities there.
>> Speaking of openness, that's actually a great point. If you look at the IIG suite, it's a great on-premise suite. And one of the challenges that we've had in sort of past IBM cloud offerings is a lot of what has been the M.O. in the past is take a great on-prem solution and just try to stand it up as a service in the cloud. Which in some cases has been successful, in other cases, less so. One of the things we're trying to look at with this platform is how do we leverage (a) open source. So that whatever open source you may already be running, on-prem or in some other provider, that it's very easy to move your workloads. So, we want to be able to say if you've got 10,000 lines of fraud detection code in MapReduce, you don't need to rewrite that in anything. You can just move it.
And the other thing is where our existing legacy tech doesn't necessarily translate well to the cloud, our first strategy is see if there's any traction around an existing open source project that satisfies that need, and try to see if we can build on that. Where there's not, we go cloud first and we build something that's tailor made to come out.
>> So, who's the first one or two customers for this platform? Is it like IBM Global Business Services where they're building the semi-custom industry apps? Or is it the very, very big and sophisticated, like banks and telcos who are doing the same? Or have you gotten to the point where you can push it out to a much wider audience?
>> That's a great question, and it's actually one that is a source of lots of conversation internally for us. If you look at where the Data Science Experience is right now, it's a lot of individual data scientists, you know, small companies, those sorts of things coming together. And a lot of that is because some of the sophistication that we expect for enterprise customers is not quite there yet. So, we wouldn't expect enterprise customers to necessarily be onboarded as quickly at the moment. But if we look at sort of the, so I guess there's maybe a medium term answer and a long term answer. I think the long term answer is definitely the enterprise customers, you know, leveraging IBM's huge entry point into all of those customers today, there's definitely a play to be made there. And one of the things that we're differentiating, we think, over an AWS or Google, is that we're trying to answer that use case in a way that they really aren't even trying to answer it right now. And so, that's one thing. The other is, you know, going beta with a launch customer that's a healthcare provider or a bank where they have all sorts of regulatory requirements, that's more complicated.
And so, we are looking at, in some cases, we're looking at those banks or healthcare providers and trying to carve off a small niche use case that doesn't actually fall into the category of all those regulatory requirements. So that we can get our feet wet, get the tires kicked, those sorts of things. And in some cases we're looking for less traditional enterprise customers to try to launch with. So, that's an active area of discussion. And one of the other key ones is The Weather Company. Trying to take The Weather Company workloads and move The Weather Company workloads.
>> I want to come back to The Weather Company. When you did that deal, I was talking to one of your executives and he said, "Why do you think we did the deal?" I said, "Well, you've got 1500 data scientists, you've got all this data, you know, it's the future." He goes, "Yeah, it's also going to be a platform for IoT for IBM."
>> Robbie: That's right.
>> And I was like, "Hmmm." I get the IoT piece, how does it become a platform for IBM's IoT strategy? Is that really the case? Is that transpiring and how so?
>> It's interesting because that was definitely one of the key tenets behind the acquisition. And what we've been working on so hard over the last year, as I'm sure you know, sometimes boxes and arrows on an architecture diagram and reality are more challenging.
>> Dave: (laughing) Don't do that.
>> And so, what we've had to do is reconcile a lot of what we built at The Weather Company, existing IBM tech, and the new things that were in flight, and try to figure out how can we fit all those pieces together. And so, it's been complicated but also good. In some cases, it's just people and expertise. And bringing those people and expertise and leaving some of the software behind. And other cases, it's actually bringing software. So, the story is, obviously, where the rubber meets the road, more complicated than what it sounds like in the press release.
But the reality is we've combined those teams and they are all moving in the same direction together with various bits and pieces from the different teams.
>> Okay, so, there's vision and then the road map to execute on that, and it's going to unfold over several years.
>> Robbie: That's right.
>> Okay, good. Stuff at the event here, I mean, what are you seeing, what's hot, what's going on with Spark?
>> I think one of the interesting things with what's going on with Spark right now is a lot of the optimizations, especially things around GPUs and that. And we're pretty excited about that, being a hardware manufacturer, that's something that is interesting to us. We run our own cloud. Where some people may not be able to immediately leverage those capabilities, we're pretty excited about that. And also, we're looking at some of those, you know, taking Spark and running it on Power and those sorts of things to try to leverage the hardware improvements. So, that's one of the things we're doing.
>> Alright, we have to leave it there, Robbie. Thanks very much for coming on theCube, really appreciate it.
>> Thank you.
>> You're welcome. Alright, keep it right there, everybody. We'll be right back with our next guest. This is theCube. We're live from Spark Summit East, hashtag #SparkSummit. Be right back.
ENTITIES
Entity | Category | Confidence |
---|---|---|
David | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
George Gilbert | PERSON | 0.99+ |
George | PERSON | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Boston | LOCATION | 0.99+ |
The Weather Company | ORGANIZATION | 0.99+ |
Google | ORGANIZATION | 0.99+ |
Robbie | PERSON | 0.99+ |
Dave | PERSON | 0.99+ |
Robbie Strickland | PERSON | 0.99+ |
Watson Group | ORGANIZATION | 0.99+ |
David Kenny | PERSON | 0.99+ |
October | DATE | 0.99+ |
New York City | LOCATION | 0.99+ |
1500 data scientists | QUANTITY | 0.99+ |
two companies | QUANTITY | 0.99+ |
10,000 lines | QUANTITY | 0.99+ |
Dell | ORGANIZATION | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
One | QUANTITY | 0.99+ |
both cases | QUANTITY | 0.99+ |
Boston Massachusetts | LOCATION | 0.99+ |
Spark Summit | EVENT | 0.99+ |
IBM Analytics | ORGANIZATION | 0.99+ |
Spark | TITLE | 0.99+ |
one | QUANTITY | 0.99+ |
ADO | TITLE | 0.99+ |
Lambda | TITLE | 0.99+ |
Telcos | ORGANIZATION | 0.99+ |
The Cloud | ORGANIZATION | 0.98+ |
Spark Summit East 2017 | EVENT | 0.98+ |
first strategy | QUANTITY | 0.98+ |
IBM Global Business Services | ORGANIZATION | 0.98+ |
EMC | ORGANIZATION | 0.98+ |
one tool | QUANTITY | 0.98+ |
first anniversary | QUANTITY | 0.98+ |
Databricks | ORGANIZATION | 0.98+ |
last year | DATE | 0.98+ |
today | DATE | 0.97+ |
two customers | QUANTITY | 0.97+ |
single | QUANTITY | 0.97+ |
SQS | TITLE | 0.97+ |
first year | QUANTITY | 0.97+ |
two | QUANTITY | 0.96+ |
two different teams | QUANTITY | 0.96+ |
WebSphere | TITLE | 0.96+ |
#SparkSummit | EVENT | 0.95+ |
Jupyter | ORGANIZATION | 0.95+ |
Watson Data Platform | TITLE | 0.94+ |
Kafka | TITLE | 0.94+ |