Rob Bearden, Hortonworks & Rob Thomas, IBM Analytics - #DataWorks - #theCUBE
>> Announcer: Live from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2017, brought to you by Hortonworks.
>> Hi, welcome to theCUBE. We are live in San Jose, in the heart of Silicon Valley at the DataWorks Summit, day one. I'm Lisa Martin, with my co-host, George Gilbert. And we're very excited to be talking to two Robs. With Rob squared on the program this morning. Rob Bearden, the CEO of Hortonworks. Welcome, Rob.
>> Thank you for having us.
>> And Rob Thomas, the VP, GM rather, of IBM Analytics. So, guys, we just came from this really exciting, high energy keynote. The laser show was fantastic, but one of the great things, Rob, that you kicked off with was really showing the journey that Hortonworks has been on, and in a really pretty short period of time. Tremendous inertia, and you talked about the four mega-trends that are really driving enterprises to modernize their data architecture. Cloud, IOT, streaming data, and the fourth, next leg of this is data science. Data science, you said, will be the transformational next leg in the journey. Tell our viewers a little bit more about that. What does that mean for Hortonworks and your partnership with IBM?
>> Well, I think what IBM and Hortonworks now have the ability to do is to bring all the data together across a connected data platform. The data in motion, the data at rest, we now have in one common platform, irrespective of the deployment architecture, whether it's on-prem across multiple data centers or whether it's deployed in the cloud. And now that we have the large volume of data and access to it, we can now begin to drive the analytics in the end as that data moves through each phase of its life cycle.
And what really happens now, is now that we have visibility and access to the inclusive life cycle of the data, we can now put a data science framework over that to really now understand and learn those patterns and what's the data telling us, what's the pattern behind that. And we can bring simplification to the data science and turn data science actually into a team sport. Allow them to collaborate, allow them to have access to it. And sort of take the black magic out of doing data science with the framework of the tool and the power of DSX on top of the connected data platform. Now we can advance rapidly the insights in the end of the data and what that really does is drive value really quickly back into the customer. And then we can then begin to bring smart applications via the data science back into the enterprise. So we can now do things like connected car in real time, and have connected car learn as it's moving and through all the patterns, we can now, from a retail standpoint, really get smart and accurate about inventory placement and inventory management. From an industrial standpoint, we know in real time, down to the component, what's happening with the machine, and any failures that may happen and be able to eliminate downtime. Agriculture, same kind of... Healthcare, every industry, financial services, fraud detection, money laundering advances that we have, but it's all going to be attributable to how machine learning is applied and the DSX platform is the best platform in the world to do that with.
>> And one of the things that I thought was really interesting, was that, as we saw enterprises start to embrace Hadoop and Big Data and say, no, this needs to co-exist and inter-operate with our traditional applications, our traditional technologies. Now you're saying and seeing data science is going to be a strategic business differentiator. You mentioned a number of industries, and there were several of them on stage today.
Give us, maybe, one of your favorite examples of one of your customers leveraging data science and driving a pretty significant advantage for their business.
>> Sure. Yeah, well, to step back a little bit, just a little context, only ten companies have outperformed the S&P 500 in each of the last five years. We start looking at what are they doing. Those are companies that have decided data science and machine learning is critical. They've made a big bet on it, and every company needs to be doing that. So a big part of our message today was, kind of, I'd say, open the eyes of everybody to say there is something happening in the market right now. And it can make a huge difference in how you're applying data analytics to improve your business. We announced our first focus on this back in February, and one of our clients that spoke at that event is a company called Argus Healthcare. And Argus has massive amounts of data, sitting on a mainframe, and they were looking for how can we unleash that to provide better care for patients, better care for our hospital networks, and they did that with data they had in their mainframe. So they brought Data Science Experience and machine learning to their mainframe, that's what they talked about. What Rob and I have announced today is there's another great trove of data in every organization which is the data inside Hadoop. HDP, leading distribution for that, is a great place to start. So the use case that I just shared, which is on the mainframe, that's going to apply anywhere where there's large amounts of data. And right now there's not a great answer for data science on Hadoop, until today, where Data Science Experience plus HDP brings really, I'd say, an elegant approach to it. It makes it a team sport. You can collaborate, you can interact, you can get education right in the platform. So we have the opportunity to create a next generation of data scientists working with data and HDP. That's why we're excited.
>> Let me follow up with this question in your intro that, in terms of sort of the Data Science Experience as this next major building block, to extract, or to build on the value from the data lake, the two companies, your two companies have different sort of, go-to-markets, especially at IBM, but the industry solutions and global business services, you guys can actually build semi-custom solutions around this platform, both the data and the Data Science Experience. With Hortonworks, what are those, what's your go to market motion going to look like and what are the offerings going to look like to the customer?
>> There'll be several. You just described a great example, with IBM professional services, they have the ability to take those industry templates and take these data science models and instantly be able to bring those to the data, and so as part of our joint go to market motion, we'll be able to now partner, bring those templates, bring those models to not only our customer base, but also part of the new sales go to market motion in the white space, in new customer opportunities and the whole point is, now we can use the enterprise data platforms to bring the data under management in a mission critical way and then bring value to it through these kinds of use cases and templates that drive the smart applications into quick time to value. And just increase that time to value for the customers.
>> So, how would you look at the mix changing over time in terms of data scientists working with the data to experiment on the model development and the two hard parts that you talked about, data prep and operationalization. So in other words, custom models, the issue of deploying it 11 months later because there's no real process for that that's packaged, and then packaged enterprise apps that are going to bake these models in as part of their functionality that, you know, the way Salesforce is starting to do and Workday is starting to do. How does that change over time?
>> It'll be a layering effect. So today, we now have the ability to bring through the connected data platforms all the data under management in a mission critical manner from point of origination through the entire stream till it comes at rest. Now with the data science, through DSX, we can now, then, have that data science framework to where, you know, the analogy I would say, is instead of it being a black science of how you do data access and go through and build the models and determine what the algorithms are and how that yields a result, the analogy is you don't have to be a mechanic to drive a car anymore. The common person can drive a car. So, now we really open up the community to the business analyst, who can now participate in and enable data science through collaboration, and then we can take those models and build the smart apps and evolve the smart apps that go to that very rapidly and we can accelerate that process also now through the partnership with IBM and bringing their core domain and value drivers that they've already built and dropping that into the DSX environments and so I think we can accelerate the time to value now much faster and more efficiently than we've ever been able to do before.
>> You mentioned teamwork a number of times, and I'm curious about, you also talked about the business analyst, what's the governance like to facilitate business analysts and different lines of business that have particular access? And what is that team composed of?
>> Yeah, well, so let's look at what's happening in the big enterprises in the world right now. There's two major things going on. One is everybody's recognizing this is a multi-cloud world. There's multiple public cloud options, most clients are building a private cloud. They need a way to manage data as a strategic asset across all those multiple cloud environments.
The second piece is, we are moving towards, what I would call, the next generation data fabric, which is your warehousing capabilities, your database capabilities, married with Hadoop, married with other open source data repositories and doing that in a seamless fashion. So you need a governance strategy for all of that. And the way I describe governance, simple analogy, we do for data what libraries do for books. Libraries create a catalog of books, they know they have different copies of books, some they archive, but they can access all of the intelligence in the library. That's what we do for data. So when we talk about governance and working together, we're both big supporters of the Atlas project, that will continue, but the other piece, kind of this point around enterprise data fabric is what we're doing with Big SQL. Big SQL is the only 100% ANSI-SQL compliant SQL engine for data across Hadoop and other repositories. So we'll be working closely together to help enterprises evolve in a multi-cloud world to this enterprise data fabric and Big SQL's a big capability for that.
>> And an immediate example of that is in our EDW optimization suite that we have today we'll be loading Big SQL as the platform to do the complex query sector of that. That we'll go to market with almost immediately.
>> Follow-up question on the governance: to what extent is end-to-end governance, meaning from the point of origin through the last mile, you know, if the last mile might be some specialized analytic engine, versus having all the data management capabilities in that fabric, you mentioned operational and analytic, so, like, are customers going to be looking for a provider who can give them sort of end-to-end capabilities on both the governance side and on all the data management capabilities? Is that sort of a critical decision?
>> I believe so. I think there's really two use cases for governance. It's either insights or it's compliance.
And if your focus is on compliance, something like GDPR, as an example, that's really about the life cycle of data from when it starts to when it can be disposed of. So for the compliance use case, absolutely. When I say insights as a governance use case, that's really about self-service. The ideal world is you can make your data available to anybody in your organization, knowing that they have the right permissions, that they can access, that they can do it in a protected way and most companies don't have that advantage today. Part of the idea around data science on HDP is if you've got the right governance framework in place suddenly you can enable self-service which is any data scientist or any business analyst can go find and access the data they need. So it's a really key part of delivering on data science, is this governance piece. Now I just talked to clients, they understand where you're going. Is this about compliance or is this about insights? Because there's probably a different starting point, but the end game is similar.
>> Curious about your target markets, Tyler talked about the go to market model a minute ago, are you targeting customers that are on mainframes? And you said, I think, in your keynote, 90% of transactional data is in a mainframe. Is that one of the targets, or is it the target, like you mention, Rob, with the EDW optimization solution, are you working with customers who have an existing enterprise data warehouse that needs to be modernized, is it both?
>> The good news is it's both. It's about, really the opportunity and mission, is about enabling the next generation data architecture. And within that is again, back to the layering approach, is being able to bring the data under management from point of origination through the point it comes to rest.
Now if we look at it, you know, probably 90% of, at least transactional data, sits in the mainframe, so you have to be able to span all data sets and all deployment architectures, on-prem, multi-data center, as well as public cloud. And that then, is the opportunity, but for that to then drive value ultimately back, you've got to be able to have then the simplification of the data science framework and toolset to be able to then have the proper insights and basis on which you can bring the new smart applications. And drive the insights, drive the governance through the entire life cycle.
>> On the value front, you know, we talk about, and Hortonworks talks about, the fact that this technology can really help a business unlock transformational value across their organization, across lines of business. This conversation, we just talked about a couple of the customer segments, is this a conversation that you're having at the C-suite initially? Where are the business leaders in terms of understanding? We know there's more value here, we probably can open up new business opportunities or are you talking more the data science level?
>> Look, it's at different levels. So, data science, machine learning, that is a C-suite topic. A lot of times I'm not sure the audience knows what they're asking for, but they know it's important and they know they need to be doing something. When you go to things like a data architecture, the C-suite discussion there is, I just want to become more productive in how I'm deploying and using technology because my IT budget's probably not going up, if anything it may be going down, so I've got to become a lot more productive and efficient to do that. So it depends on who you're talking to, there's different levels of dialogue. But there's no question in my mind, I've seen, you know, just look at major press, Financial Times, Wall Street Journal, last year. CEOs are talking about AI, machine learning, using data as a competitive weapon.
It is happening and it's happening right now. What we're doing together, saying how do we make data simple and accessible? How do we make getting there really easy? Because right now it's pretty hard. But we think with the combination of what we're bringing, we make it pretty darn easy.
>> So one quick question following up on that, and then I think we're getting close to the end. Which is when the data lakes started out, it was sort of, it seemed like, for many customers a mandate from on high, we need a big data strategy, and that translated into standing up a Hadoop cluster, and that resulted in people realizing that there's a lot to manage there. It sounds like, right now people know machine learning is hot so they need to get data science tools in place, but is there a business capability sort of like the ETL offload was for the initial Hadoop use cases, where you would go to a customer and recommend do this, bite this off as something concrete?
>> I'll start and then Rob can comment. Look, the issue's not Hadoop, a lot of clients have started with it. The reason there hasn't been, in some cases, the outcomes they wanted is because just putting data into Hadoop doesn't drive an outcome. What drives an outcome is what do you do with it. How do you change your business process, how do you change what the company's doing with the data, and that's what this is about, it's kind of that next step in the evolution of Hadoop. And that's starting to happen now. It's not happening everywhere, but we think this will start to propel that discussion. Any thoughts you had, Rob?
>> Spot on. Data lake was about releasing the constraints of all the silos and being able to bring those together and aggregate that data.
And it was the first basis for being able to have a 360 degree or holistic centralized insight about something and/or pattern, but what then data science does is it actually accelerates those patterns and those lessons learned and the ability to have a much more detailed and higher velocity insight that you can react to much faster, and actually accelerate the business models around this aggregate. So it's a foundational approach with Hadoop. And it's then, as I mentioned in the keynote, the data science platforms, machine learning, and AI actually is what is the thing that transformationally opens up and accelerates those insights, so then new models and patterns and applications get built to accelerate value.
>> Well, speaking of transformation, thank you both so much for taking time to share your transformation and the big news and the announcements with Hortonworks and IBM this morning. Thank you Rob Bearden, CEO of Hortonworks, Rob Thomas, General Manager of IBM Analytics. I'm Lisa Martin with my co-host, George Gilbert. Stick around. We are live from day one at DataWorks Summit in the heart of Silicon Valley. We'll be right back. (tech music)
ENTITIES
Entity | Category | Confidence |
---|---|---|
Lisa Martin | PERSON | 0.99+ |
George Gilbert | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Rob Bearden | PERSON | 0.99+ |
San Jose | LOCATION | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
Rob | PERSON | 0.99+ |
Argus | ORGANIZATION | 0.99+ |
90% | QUANTITY | 0.99+ |
Rob Thomas | PERSON | 0.99+ |
Silicon Valley | LOCATION | 0.99+ |
IBM Analytics | ORGANIZATION | 0.99+ |
Tyler | PERSON | 0.99+ |
February | DATE | 0.99+ |
two companies | QUANTITY | 0.99+ |
second piece | QUANTITY | 0.99+ |
Argus Healthcare | ORGANIZATION | 0.99+ |
last year | DATE | 0.99+ |
360 degree | QUANTITY | 0.99+ |
GDPR | TITLE | 0.99+ |
one | QUANTITY | 0.99+ |
Hadoop | TITLE | 0.99+ |
One | QUANTITY | 0.99+ |
both | QUANTITY | 0.99+ |
DataWorks Summit | EVENT | 0.99+ |
ten companies | QUANTITY | 0.99+ |
two | QUANTITY | 0.99+ |
fourth | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
two hard parts | QUANTITY | 0.98+ |
DataWorks Summit 2017 | EVENT | 0.98+ |
11 months later | DATE | 0.98+ |
each | QUANTITY | 0.98+ |
two use cases | QUANTITY | 0.97+ |
100% | QUANTITY | 0.97+ |
one quick question | QUANTITY | 0.97+ |
SQL | TITLE | 0.96+ |
four mega-trends | QUANTITY | 0.96+ |
Big SQL | TITLE | 0.96+ |
first basis | QUANTITY | 0.94+ |
one common platform | QUANTITY | 0.94+ |
two major things | QUANTITY | 0.92+ |
Robs | PERSON | 0.92+ |
Wall Street Journal | ORGANIZATION | 0.92+ |
Financial Times | ORGANIZATION | 0.92+ |