Rob Thomas, IBM Analytics | IBM Fast Track Your Data 2017

>> Announcer: Live from Munich, Germany, it's theCUBE. Covering IBM: Fast Track Your Data. Brought to you by IBM. >> Welcome, everybody, to Munich, Germany. This is Fast Track Your Data brought to you by IBM, and this is theCUBE, the leader in live tech coverage. We go out to the events, we extract the signal from the noise. My name is Dave Vellante, and I'm here with my co-host Jim Kobielus. Rob Thomas is here, he's the General Manager of IBM Analytics, and longtime CUBE guest, good to see you again, Rob. >> Hey, great to see you. Thanks for being here. >> Dave: You're welcome, thanks for having us. So we're talking about, we missed each other last week at the Hortonworks DataWorks Summit, but you came on theCUBE, you guys had the big announcement there. You're sort of getting out, doing a Hadoop distribution, right? TheCUBE gave up our Hadoop distributions several years ago so. It's good that you joined us. But, um, that's tongue-in-cheek. Talk about what's going on with Hortonworks. You guys are now going to be partnering with them essentially to replace BigInsights, you're going to continue to service those customers. But there's more than that. What's that announcement all about? >> We're really excited about that announcement, that relationship, just to kind of recap for those that didn't see it last week. We are making a huge partnership with Hortonworks, where we're bringing data science and machine learning to the Hadoop community. So IBM will be adopting HDP as our distribution, and that's what we will drive into the market from a Hadoop perspective. Hortonworks is adopting IBM Data Science Experience and IBM machine learning to be a core part of their Hadoop platform. And I'd say this is a recognition. One is, companies should do what they do best. We think we're great at data science and machine learning. Hortonworks is the best at Hadoop. Combine those two things, it'll be great for clients. 
And, we also talked about extending that to things like Big SQL, where they're partnering with us on Big SQL, around modernizing data environments. And then third, which relates a little bit to what we're here in Munich talking about, is governance, where we're partnering closely with them around unified governance, Apache Atlas, advancing Atlas in the enterprise. And so, it's a lot of dimensions to the relationship, but I can tell you since I was on theCUBE a week ago with Rob Bearden, client response has been amazing. Rob and I have done a number of client visits together, and clients see the value of unlocking insights in their Hadoop data, and they love this, which is great. >> Now, I mean, the Hadoop distro, I mean early on you got into that business, just, you had to do it. You had to be relevant, you want to be part of the community, and a number of folks did that. But it's really sort of best left to a few guys who want to do that, and Apache open source is really, I think, the way to go there. Let's talk about Munich. You guys chose this venue. There's a lot of talk about GDPR, you've got some announcements around unified governance, but why Munich? >> So, there's something interesting that I see happening in the market. So first of all, you look at the last five years. There's only 10 companies in the world that have outperformed the S&P 500, in each of those five years. And we started digging into who those companies are and what they do. They are all applying data science and machine learning at scale to drive their business. And so, something's happening in the market. That's what leaders are doing. And I look at what's happening in Europe, and I say, I don't see the European market being that aggressive yet around data science, machine learning, how you apply data for competitive advantage, so we wanted to come do this in Munich. And it's a bit of a wake-up call, almost, to say hey, this is what's happening. 
We want to encourage clients across Europe to think about how do they start to do something now. >> Yeah, of course, GDPR is also a hook. The European Union and you guys have made some talk about that, you've got some keynotes today, and some breakout sessions that are discussing that, but talk about the two announcements that you guys made. There's one on DB2, there's another one around unified governance, what do those mean for clients? >> Yeah, sure, so first of all on GDPR, it's interesting to me, it's kind of the inverse of Y2K, which is there's very little hype, but there's huge ramifications. And Y2K was kind of the opposite. So look, it's coming, May 2018, clients have to be GDPR-compliant. And there's a misconception in the market that that only impacts companies in Europe. It actually impacts any company that does any type of business in Europe. So, it impacts everybody. So we are announcing a platform for unified governance that makes sure clients are GDPR-compliant. We've integrated software technology across analytics, IBM security, some of the assets from the Promontory acquisition that IBM did last year, and we are delivering the only platform for unified governance. And that's what clients need to be GDPR-compliant. The second piece is data has to become a lot simpler. As you think about my comment, who's leading the market today? Data's hard, and so we're trying to make data dramatically simpler. And so for example, with DB2, what we're announcing is you can download and get started using DB2 in 15 minutes or less, and anybody can do it. Even you can do it, Dave, which is amazing. >> Dave: (laughs) >> For the first time ever, you can-- >> We'll test that, Rob. >> Let's go test that. I would love to see you do it, because I guarantee you can. Even my son can do it. I had my son do it this weekend before I came here, because I wanted to see how simple it was. 
So that announcement is really about bringing, or introducing a new era of simplicity to data and analytics. We call it Download And Go. We started with SPSS, we did that back in March. Now we're bringing Download And Go to DB2, and to our governance catalog. So the idea is make data really simple for enterprises. >> You had a community edition previous to this, correct? There was-- >> Rob: We did, but it wasn't this easy. >> Wasn't this simple, okay. >> Not anybody could do it, and I want to make it so anybody can do it. >> Is simplicity, the rate of simplicity, the only differentiator of the latest edition, or I believe you have Kubernetes support now with this new edition, can you describe what that involves? >> Yeah, sure, so there's two main things that are new functionally-wise, Jim, to your point. So one is, look, we're big supporters of Kubernetes. And as we are helping clients build out private clouds, the best answer for that in our mind is Kubernetes, and so when we released Data Science Experience for Private Cloud earlier this quarter, that was on Kubernetes, extending that now to other parts of the portfolio. The other thing we're doing with DB2 is we're extending JSON support for DB2. So think of it as, you're working in a relational environment, now just through SQL you can integrate with non-relational environments, JSON, documents, any type of NoSQL environment. So we're finally bringing to fruition this idea of a data fabric, which is I can access all my data from a single interface, and that's pretty powerful for clients. >> Yeah, more cloud data development. Rob, I wonder if you can, we can go back to the machine learning, one of the core focuses of this particular event and the announcements you're making. Back in the fall, IBM made an announcement of Watson machine learning, for IBM Cloud, and World of Watson. In February, you made an announcement of IBM machine learning for the z platform. 
What are the machine learning announcements at this particular event, and can you sort of connect the dots in terms of where you're going, in terms of what sort of innovations are you driving into your machine learning portfolio going forward? >> I have a fundamental belief that machine learning is best when it's brought to the data. So, we started with, like you said, Watson machine learning on IBM Cloud, and then we said well, what's the next big corpus of data in the world? That's an easy answer, it's the mainframe, that's where all the world's transactional data sits, so we did that. Last week with the Hortonworks announcement, we said we're bringing machine learning to Hadoop, so we've kind of covered all the landscape of where data is. Now, the next step is about how do we bring a community into this? And the way that you do that is we don't dictate a language, we don't dictate a framework. So if you want to work with IBM on machine learning, or in Data Science Experience, you choose your language. Python, great. Scala or Java, you pick whatever language you want. You pick whatever machine learning framework you want, we're not trying to dictate that because there's different preferences in the market, so what we're really talking about here this week in Munich is this idea of an open platform for data science and machine learning. And we think that is going to bring a lot of people to the table. >> And with open, one thing, with open platform in mind, one thing to me that is conspicuously missing from the announcement today, correct me if I'm wrong, is any indication that you're bringing support for the deep learning frameworks like TensorFlow into this overall machine learning environment. Am I wrong? I know you have Power AI. Is there a piece of Power AI in these announcements today? >> So, stay tuned on that. We are, it takes some time to do that right, and we are doing that. 
But we want to optimize so that you can do machine learning with GPU acceleration on Power AI, so stay tuned on that one. But we are supporting multiple frameworks, so if you want to use TensorFlow, that's great. If you want to use Caffe, that's great. If you want to use Theano, that's great. That is our approach here. We're going to allow you to decide what's the best framework for you. >> So as you look forward, maybe it's a question for you, Jim, but Rob I'd love you to chime in. What does that mean for businesses? I mean, is it just more automation, more capabilities as you evolve that timeline, without divulging any sort of secrets? What do you think, Jim? Or do you want me to ask-- >> What do I think, what do I think you're doing? >> No, you ask about deep learning, like, okay, that's, I don't see that, Rob says okay, stay tuned. What does it mean for a business, that, if like-- >> Yeah. >> If I'm planning my roadmap, what does that mean for me in terms of how I should think about the capabilities going forward? >> Yeah, well what it means for a business, first of all, is what they're going, they're using deep learning for, is doing things like video analytics, and speech analytics and more of the challenges involving convolutional neural networks to do pattern recognition on complex data objects for things like connected cars, and so forth. Those are the kind of things that can be done with deep learning. >> Okay. And so, Rob, you're talking about here in Europe how the uptick in some of the data orientation has been a little bit slower, so I presume from your standpoint you don't want to over-rotate, to some of these things. But what do you think, I mean, it sounds like there is difference between certainly Europe and those top 10 companies in the S&P, outperforming the S&P 500. What's the barrier, is it just an understanding of how to take advantage of data, is it cultural, what's your sense of this? 
>> So, to some extent, data science is easy, data culture is really hard. And so I do think that culture's a big piece of it. And the reason we're kind of starting with a focus on machine learning, simplistic view, machine learning is a general-purpose framework. And so it invites a lot of experimentation, a lot of engagement, we're trying to make it easier for people to on-board. As you get to things like deep learning as Jim's describing, that's where the market's going, there's no question. Those tend to be very domain-specific, vertical-type use cases and to some extent, what I see clients struggle with, they say well, I don't know what my use case is. So we're saying, look, okay, start with the basics. A general purpose framework, do some tests, do some iteration, do some experiments, and once you find out what's hunting and what's working, then you can go to a deep learning type of approach. And so I think you'll see an evolution towards that over time, it's not either-or. It's more of a question of sequencing. >> One of the things we've talked to you about on theCUBE in the past, you and others, is that IBM obviously is a big services business. This big data is complicated, but great for services, but one of the challenges that IBM and other companies have had is how do you take that service expertise, codify it to software and scale it at large volumes and make it adoptable? I thought the Watson data platform announcement last fall, I think at the time you called it Data Works, and then so the name evolved, was really a strong attempt to do that, to package a lot of expertise that you guys had developed over the years, maybe even some different software modules, but bring them together in a scalable software package. So is that the right interpretation, how's that going, what's the uptake been like? >> So, it's going incredibly well. 
What's interesting to me is what everybody remembers from that announcement is the Watson Data Platform, which is a decomposable framework for doing these types of use cases on the IBM cloud. But there was another piece of that announcement that is just as critical, which is we introduced something called the Data First method. And that is the recipe book to say to a client, so given where you are, how do you get to this future on the cloud? And that's the part that people, clients, struggle with, is how do I get from step to step? So with Data First, we said, well look. There's different approaches to this. You can start with governance, you can start with data science, you can start with data management, you can start with visualization, there's different entry points. You figure out the right one for you, and then we help clients through that. And we've made Data First method available to all of our business partners so they can go do that. We work closely with our own consulting business on that, GBS. But that to me is actually the thing from that event that has had, I'd say, the biggest impact on the market, is just helping clients map out an approach, a methodology, to getting on this journey. >> So that was a catalyst, so this is not a sequential process, you can start, you can enter, like you said, wherever you want, and then pick up the other pieces from a maturity model standpoint? >> Exactly, because everybody is at a different place in their own life cycle, and so we want to make that flexible. >> I have a question about the clients, the customers' use of Watson Data Platform in a DevOps context. So, are more of your customers looking to use Watson Data Platform to automate more of the stages of the machine learning development and the training and deployment pipeline, and do you see, IBM, do you see yourself taking the platform and evolving it into a more full-fledged automated data science release pipelining tool? Or am I misunderstanding that? 
>> Rob: No, I think that-- >> Your strategy. >> Rob: You got it right, I would just, I would expand a little bit. So, one is it's a very flexible way to manage data. When you look at the Watson Data Platform, we've got relational stores, we've got column stores, we've got in-memory stores, we've got the whole suite of open-source databases under the Compose.io umbrella, we've got Cloudant. So we've delivered a very flexible data layer. Now, in terms of how you apply data science, we say, again, choose your model, choose your language, choose your framework, that's up to you, and we allow clients, many clients start by building models on their private cloud, then we say you can deploy those into the Watson Data Platform, so therefore then they're running on the data that you have as part of that data fabric. So, we're continuing to deliver a very fluid data layer which then you can apply data science, apply machine learning there, and there's a lot of data moving into the Watson Data Platform because clients see that flexibility. >> All right, Rob, we're out of time, but I want to kind of set up the day. We're doing CUBE interviews all morning here, and then we cut over to the main tent. You can get all of this on IBMgo.com, you'll see the schedule. Rob, you've got, you're kicking off a session. We've got Hilary Mason, we've got a breakout session on GDPR, maybe set up the main tent for us. >> Yeah, main tent's going to be exciting. We're going to debunk a lot of misconceptions about data and about what's happening. Marc Altshuller has got a great segment on what he calls the death of correlations, so we've got some pretty engaging stuff. Hilary's got a great piece that she was talking to me about this morning. It's going to be interesting. We think it's going to provoke some thought and ultimately provoke action, and that's the intent of this week. >> Excellent, well Rob, thanks again for coming to theCUBE. It's always a pleasure to see you. 
>> Rob: Thanks, guys, great to see you. >> You're welcome; all right, keep it right there, buddy. We'll be back with our next guest. This is theCUBE, we're live from Munich, Fast Track Your Data, right back. (upbeat electronic music)

Published Date : Jun 22 2017


ENTITIES

Entity	Category	Confidence
Jim Kobielus	PERSON	0.99+
Dave Vellante	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Jim	PERSON	0.99+
Europe	LOCATION	0.99+
Rob	PERSON	0.99+
Marc Altshuller	PERSON	0.99+
Hilary	PERSON	0.99+
Hilary Mason	PERSON	0.99+
Rob Bearden	PERSON	0.99+
February	DATE	0.99+
Dave	PERSON	0.99+
Hortonworks	ORGANIZATION	0.99+
Rob Thomas	PERSON	0.99+
May 2018	DATE	0.99+
March	DATE	0.99+
Munich	LOCATION	0.99+
Scala	TITLE	0.99+
Apache	ORGANIZATION	0.99+
second piece	QUANTITY	0.99+
Last week	DATE	0.99+
Java	TITLE	0.99+
last year	DATE	0.99+
two announcements	QUANTITY	0.99+
10 companies	QUANTITY	0.99+
GDPR	TITLE	0.99+
Python	TITLE	0.99+
DB2	TITLE	0.99+
15 minutes	QUANTITY	0.99+
last week	DATE	0.99+
IBM Analytics	ORGANIZATION	0.99+
European Union	ORGANIZATION	0.99+
five years	QUANTITY	0.99+
JSON	TITLE	0.99+
Watson Data Platform	TITLE	0.99+
third	QUANTITY	0.99+
One	QUANTITY	0.99+
this week	DATE	0.98+
today	DATE	0.98+
a week ago	DATE	0.98+
two things	QUANTITY	0.98+
SQL	TITLE	0.98+
last fall	DATE	0.98+
2017	DATE	0.98+
Munich, Germany	LOCATION	0.98+
each	QUANTITY	0.98+
Y2K	ORGANIZATION	0.98+

Raj Verma, Hortonworks - DataWorks Summit 2017

>> Announcer: Live from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2017. Brought to you by Hortonworks. >> Welcome back to theCUBE, we are live, on day two of the DataWorks Summit. I'm Lisa Martin. #DWS17, join the conversation. We've had a great day and a half. We have learned from a ton of great influencers and leaders about really what's going on with big data, data science, how things are changing. My cohost is George Gilbert. We're joined by my old buddy, the COO of Hortonworks, Rajnish Verma. Raj, it's great to have you on theCUBE. >> It's great to be here, Lisa. Great to see you as well, it's been a while. >> It has, so yesterday on the customer panel, the Raj I know had great conversation with customers from, Duke Energy was one. You also had Black Knight on the financial services side. >> Rajnish: And HSC. >> Yes, on the insurance side, and one of the things that, a couple things that really caught my attention, one was when Duke said, kind of, where they were using data and moving to Hadoop, but they are now a digital company. They're now a technology company that sells electricity and products, which I thought was fantastic. Another thing that I found really interesting about that was they all talked about the need to leverage big data, and glean insights and monetize that, really requires this cultural shift. So I know you love customer interactions. Talk to us about what you're seeing. Those are three great industry examples. What are you seeing? Where are customers on this sort of maturity model where big data and Hadoop are concerned? >> Sure, happy to. So one thing that I enjoy the most about my job is meeting customers and talking to them about the art of the possible. And some of the stuff that they're doing, and, which was only science fiction, really, about two or three years ago. 
And there are a couple of questions that you've just asked me as to where they are on their journey, what are they trying to accomplish, et cetera. I remember about, five, seven, 10 years ago where Marc Andreessen said "Software is eating the world." And to be honest with you, now, it's now more like every company is a data company. I wouldn't say data is eating the world, but without effective monetization of your data assets, you can't be a force to be reckoned with as a company. So that is a common theme that we are seeing irrespective of industry, irrespective of customer, irrespective of really the size of the customer. The only thing that sort of varies is the amount and complexity of data, from one company to the other. Now, when, I'm new to Hortonworks as you know. It's really my fifth month here. And one of the things that I've seen and, Lisa, as you know, I'm coming from TIBCO. So we've been dealing with data. I have been involved with data for over a decade and a half now, right. So the difference was, 15 years ago, we were dealing with really structured data and we actually connected the structured data and gleaned insights into structured data. Now, today, a seminal challenge that every CIO or chief data officer is trying to solve is how do you get actionable insights into semi-structured and unstructured data. Now, so, getting insights into that data first requires ability to aggregate data, right. Once you've aggregated data, you also need a platform to make sense of data in real-time, that is being streamed at you. Now once you do those two things, then you put yourself in a position to analyze that data. So in that journey, as you asked, where our customers are. Some are defining their data aggregation strategy. The others, having defined data aggregation, they're talking about streaming analytics as a platform, and then the others are talking about data science and machine learning and deep learning, as a journey. 
Now, you saw the customer panel yesterday. But the one point I'd like to make is, it's not only the Duke Energies and the Black Knights of the world, or the HSC, who I believe are big, large firms that are using data. Even a company like, an old agricultural company, or I shouldn't say old but steeped in heritage is probably the right word. 96, 97 year old agricultural company that's in the animal feed business. Animal feed. Multi-billion dollar animal feed business. They use data to monetize their business model. What they say is, they've been feeding animals for the last 70 years. So now they go to a farmer and they have enough data about how to feed animals, that they can actually tell the farmer, that this hog that you have right now, which is 17 pounds, I can guarantee you that I will have him or her on a nutrition that, by four months, it'll be 35 pounds. How much are you willing to pay? So even in the animal feed business, data is being used to drive not only insights, but monetization models. >> Wow. >> So. >> That's outstanding. >> Thank you. >> So in getting to that level of sophistication, it's not like every firm sort of has the skills and technology in place to do that. What are some of the steps that you find that they typically have to go through to get to that level of maturity? Like, where do they make mistakes? Where do they find the skills to manage on-prem infrastructure, if it is on-prem? What about, if they're trying to do a hybrid cloud setup. How complex is that? >> I think that's where the power of the community comes through at multiple levels. So we're committed to the open-source movement. We're committed to the community-based development of data. Now, this community-based business model does a few things. Firstly, it keeps the innovation at the leading edge, bleeding edge, number one. 
But as you heard the panel talk about yesterday, one of the biggest benefits that our customers see of using open source, is, sure economics is good, but that's not the leading reason. Keeping up with innovation, very high up there. Avoiding vendor lock-in, again very, very high up there. But one of the biggest reasons that CIOs gave me for choosing open source as a business model is more to do with the fact that they can attract good talent, and without open source, you can't actually attract talent. And I can relate to that because I have a sophomore at home. And it just happened to me that she's 15 now but she's been using open source since she was 11. The iPhone and, she downloads an application for free. She uses it, and if she stretches the limit of that, then she orders something more in a paid model. So the community helps people do a few things. Be able to fail fast if they need to. The second is, it lowers the barriers of entry, right. Because it's really free. You can have the same model. The third is, you can rely on the community for support and methodologies and best practices and lessons learned from implementations. The fourth is, it's a great hiring ground in terms of bringing people in and attracting Millennial talent, young talent, and sought-after talent. So that's really probably the answer that I would have for that. >> When you talk about the business model, the open-source business model and the attraction on the customer side, that sounded like there's this analogy with sort of the agribusiness customer in the sense that they're offering data along with their traditional product. If your traditional product is open-source data management, what Arun started telling us this morning was the machine learning that goes along with operating not only your own sort of internal workloads but customers, and being able to offer prescriptive advice on operations, essentially IT operations. 
Is that the core, will that become the core of sort of value-add through data for an open-source business model like yours? >> I don't want to be speculative but I'll probably answer it another way. I think our vision, which was set by our founder Rob Bearden, and he took you guys through that yesterday, was way back when, we did say that our mission in life is to manage the world's data. So that mission hasn't changed. And the second was, we would do it as an open-source community or as a big contributing part of that community. And that has really not changed. Now, we feel that machine learning and data science and deep learning are areas that we're very, very excited about, our customers are very, very excited about. Now, the one thing that we did cover yesterday and I think earlier today as well, I'm a computer science engineer. And when I was in college, way back when, 25 years ago, I was interested in AI and ML. And it has existed for 50 years. The reason why it hasn't been available to the common man, so as to speak, is because of two reasons. One is, it did not have a source of data that it could sit on top of, that makes machine learning and AI effective. Or at least not a commercially-viable option to do so. Now, there is one. The second is, the compute power required to run some of the large algorithms that really give you insights into machine learning and AI. So we've become the platform on which customers can take advantage of excellent machine learning and AI tools to get insights. Now, that is two independent sort of categories. One is the open source community providing the platform. And then what tools the customer has used to apply data science and machine learning, so. >> So, all right. I'm thinking something that is slightly different and maybe the nuance is making it tough to articulate. 
But it's how can Hortonworks take the data platform and data science tools that you use to help understand how to operate Hortonworks, whether it's on a customer prem, or in the cloud. In other words, how can you use machine learning to make it a sort of a more effective and automated managed service? >> Yeah, and I think that's, the nuance's not lost on me. I think what I'm trying to sort of categorize is, for that to happen, you require two things. One is data aggregator across on-prem and cloud. Because when you have data which is multi-tenancy, you have a lot of issues with data security, data governance, all the rest of it. Now, that is what we plan to manage for the world, so as to speak. Now, on top of that, customers who require to have data science or deep learning to be used, we provide that platform. Now, whether that is used as a service by the customer, which we would be happy to provide, or it is used in-house, on-prem, on various cloud models, that's more a customer decision. We don't want to force that decision. However, from the art of the possible perspective, yes it's possible. >> I love the mission to manage the world's data. >> Thank you. >> That's a lofty goal, but yesterday's announcements with IBM were pretty, pretty transformative. In your opinion as chief operating officer, how do you see this extension of this technology and strategic partnership helping Hortonworks on the next level of managing the world's data? >> Absolutely, it's game-changing for us. We're very, very excited. Our colleagues are very, very excited about the opportunity to partner. It's also a big validation of the fact that we now have a pretty large open-source community that contributes to this cause. So we're very excited about that. The opportunity is in actually our partnering with a leader in data science, machine learning, and AI, a company that is steeped in heritage, is known for game-changing, next technology moves. 
And the fact that we're powering it from a data perspective is something that we're very, very excited and pleased about. And the opportunities are limitless. >> I love that, and I know you are a game-changer, in your fifth month. We thank you so much, Raj, for joining us. It was great to see you. Continued success, >> Thank you. >> at managing the world's data and being that game-changer, yourself, and for Hortonworks as well. >> Thank you Lisa, good to see you. >> You've been watching theCUBE. Again, we're live, day two of the DataWorks Summit, #DWS17. For my cohost, George Gilbert, I'm Lisa Martin. Stick around guys, we'll be right back with more great content. (jingle)

Published Date : Jun 14 2017

ENTITIES

Entity	Category	Confidence
George Gilbert	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Marc Andreessen	PERSON	0.99+
Lisa Martin	PERSON	0.99+
Duke Energy	ORGANIZATION	0.99+
Lisa	PERSON	0.99+
TIBCO	ORGANIZATION	0.99+
Duke Energies	ORGANIZATION	0.99+
Raj Verma	PERSON	0.99+
35 pounds	QUANTITY	0.99+
Raj	PERSON	0.99+
Rob Bearden	PERSON	0.99+
50 years	QUANTITY	0.99+
San Jose	LOCATION	0.99+
17 pounds	QUANTITY	0.99+
fifth month	QUANTITY	0.99+
Silicon Valley	LOCATION	0.99+
Rajnish Verma	PERSON	0.99+
HSC	ORGANIZATION	0.99+
one	QUANTITY	0.99+
yesterday	DATE	0.99+
15	QUANTITY	0.99+
four months	QUANTITY	0.99+
One	QUANTITY	0.99+
Hortonworks	ORGANIZATION	0.99+
Black Knights	ORGANIZATION	0.99+
Duke	ORGANIZATION	0.99+
two reasons	QUANTITY	0.99+
two	QUANTITY	0.99+
two things	QUANTITY	0.99+
iPhone	COMMERCIAL_ITEM	0.99+
Firstly	QUANTITY	0.99+
second	QUANTITY	0.99+
third	QUANTITY	0.99+
one company	QUANTITY	0.99+
DataWorks Summit 2017	EVENT	0.98+
DataWorks Summit	EVENT	0.98+
three	QUANTITY	0.98+
#DWS17	EVENT	0.98+
Multi-billion dollar	QUANTITY	0.98+
fourth	QUANTITY	0.98+
one thing	QUANTITY	0.98+
today	DATE	0.97+
15 years ago	DATE	0.97+
11	QUANTITY	0.96+
this morning	DATE	0.95+
25 years ago	DATE	0.95+
one point	QUANTITY	0.94+
day two	QUANTITY	0.93+
Rajnish	PERSON	0.93+
first	QUANTITY	0.93+
five	DATE	0.91+
three years ago	DATE	0.91+
theCUBE	ORGANIZATION	0.9+
96, 97 year old	QUANTITY	0.89+
Hortonworks - DataWorks Summit 2017	EVENT	0.87+
earlier today	DATE	0.87+
COO	PERSON	0.86+
10 years ago	DATE	0.86+
about two	DATE	0.84+
seven	DATE	0.8+
couple	QUANTITY	0.8+
Hadoop	ORGANIZATION	0.75+
over a decade and a half	QUANTITY	0.72+
last 70 years	DATE	0.69+

Show Wrap - Data Platforms 2017 - #DataPlatforms2017

>> Announcer: Live from the Wigwam in Phoenix, Arizona. It's theCUBE. Covering Data Platforms 2017. Brought to you by Qubole. >> Hey welcome back everybody. Jeff Frick here with theCUBE along with George Gilbert from Wikibon. We've had a tremendous day here at DataPlatforms 2017 at the historic Wigwam Resort, just outside of Phoenix, Arizona. George, you've been to a lot of big data shows. What's your impression? >> I thought we're sort of at the edge of what could be a real bridge to something new, which is, we've built big data systems as traditional software for deployment on traditional infrastructure. Even if you were going to put it in a virtual machine, it's still not a cloud. You're still dealing with server abstractions. But what's happening with Qubole is, they're saying, once you go to the cloud, whether it's Amazon, Azure, Google or Oracle, you're going to be dealing with services. Services are very different. It greatly simplifies the administrative experience, the developer experience, and more than that, they're focused on turning Qubole the product into Qubole the service, so that they can automate the management of it. And we know that big data has been choking itself on complexity. Both admin and developer complexity. And they're doing something unique, both on sort of the big data platform management, but also data science operations. And their point, their contention, which we still have to do a little more homework on, is that the vendors who started with software on-prem can't really make that change very easily without breaking what they've done on-prem. Because they have traditional perpetual license physical software as opposed to services, which is what is in the cloud. >> The question is, are people going to wait for them to figure it out?
I talked to somebody in the hallway earlier this morning and we were talking about their move to put all their data into, it was S3, on their data lake. And he said, it's part of a much bigger transformational process that we're doing inside the company. And so, this move, from "is cloud, public cloud, viable?" to "give me a reason why it shouldn't go to the cloud," has really kicked in big time. And we hear over and over and over that speed and agility, not just in deploying applications, but in operating as a company, is the key to success. And we hear over and over how short the tenure is on the Fortune 500 now, compared to what it used to be. So if you're not fast and agile, for which you pretty much have to use cloud and software-driven automated decision-making >> Yeah. >> that's powered by machine learning to eat >> Those two things. >> a huge percentage of your transactions and decision-making, you're going to get smoked by the person that is. >> Let's sort of peel that back. I was talking to Monte Zweben who is the co-founder of Splice Machine, one of the most advanced databases that has sort of come out of nowhere over the last couple of years. And it's now, I think, in closed beta on Amazon. He showed me, like, a couple of screens for spinning it up and configuring it on Amazon. And he said, if I were doing that on-prem, he goes, I'd need a Hadoop cluster with HBase. It would take me like four plus months. And that's an example of software versus services. >> Jeff: Right. >> And when you pointed out that automated decision-making, powered by machine learning, that's the other part, which is these big data systems ultimately are in the service of creating machine learning models that will inform ever better decisions with ever greater speed, and the key then is to plug those models into existing systems of record. >> Jeff: Right. Right.
>> Because we're not going to, >> We're not going to rip those out and rebuild them from scratch. >> Right. But as you just heard, you can pull the data out that you need, run it through a new age application. >> George: Yeah. >> And then feed it back into the old system. >> George: Yes. >> The other thing that came up, it was Oskar, I have to look him up, Oskar Austegard from Gannett was on one of the panels. We always talk about the flexibility to add capacity very easily in a cloud-based solution. But he talked about how, with the separation of storage and compute, they actually have times where they turn off all their compute. It's off. Off. >> And that was-- If you had to boil down the fundamental compatibility break between on-prem and in the cloud, the Qubole folks, both the CEO and CMO said, look, you cannot reconcile what's essentially server-centric, where the storage is attached to the compute node, the server, with cloud, where you have storage separate from compute, allowing you to spin it down completely. He said those are just fundamentally incompatible. >> Yeah, yeah. And also, Andretti, one of the founders, in his talk, he talked about the big three trends, which we just kind of talked about, he summarized them, right: serverless, this continual push towards smaller and smaller units >> George: Yeah. >> of storage and compute. And the increasing speed of networks is one, from virtual servers to just no servers, to just compute. The second one is automation, you've got to move to automation. >> George: Right. >> If you're not, you're going to get passed by your competitor that is. Or the competitor that you don't even know exists that's going to come out from over your shoulder. And the third one was the intelligence, right. There is a lot of intelligence that can be applied. And I think the other cusp that we're on is this continuing crazy increase in compute horsepower. Which just keeps going.
The speed and the intelligence of these machines is growing at an exponential curve, not a linear curve. It's going to be bananas in the not too distant future. >> We're soaking up more and more of that intelligence with machine learning. The training part of machine learning, where the datasets to train a model are immense. Not only are the datasets large, but so is the amount of time to sort of chug through them to come up with just the right mix of variables and values for those variables. Or maybe even multiple models. So that we're going to see in the cloud. And that's going to chew up more and more cycles. Even as we have >> Jeff: Right. Right. >> specialized processors. >> Jeff: Right. But in the data ops world, in theory yes, but I don't have to wait to get it right. Right? I can get it 70% right. >> George: Yeah. >> Which is better than not right. >> George: Yeah. >> And I can continue to iterate over time. And that, I think, was the genius of DevOps. To stop writing PRDs and MRDs. >> George: Yeah. >> And deliver something. And then listen and adjust. >> George: Yeah. >> And within the data ops world, it's the same thing. Don't try to figure it all out. Take the data you know, have some hypothesis. Build some models and iterate. That's really tough to compete with. >> George: Yeah. >> Fast, fast, fast iteration. >> We're actually doing a fair amount of research on that. On the Wikibon side. Which is, if you build an enterprise application that is reinforced or informed by models in many different parts, in other words, you're modeling more and more digital entities within the business. >> Jeff: Right. >> Each of those has feedback loops. >> Jeff: Right. Right. >> And when you get the whole thing orchestrated and moving or learning in concert then you have essentially what Michael Porter many years ago called competitive advantage.
Which is when each business process reinforces all the other business processes in service of delivering a value proposition. And those models represent business processes and when they're learning and orchestrated all together, you have what Trump called a fine-tuned machine. >> I won't go there. >> Leaving out that it was bigly and it was a finely-tuned machine. >> Yeah, yeah. But at the end of the day, if you're using resources and effort to improve a different resource and effort, you're getting a multiplier effect. >> Yes. >> And that's really the key part. Final thought as we go out of here. Are you excited about this? Do you see, they showed the picture of the NASA headquarters with the big giant snowball truck loading up? Do you see more and more of this big enterprise data going into S3, going into Google Cloud, going into Microsoft Azure? >> You're asking-- >> Is this the solution for the data lake swamp issue that we've been talking about? >> You're asking the 64 dollar question. Which is, companies, we sensed a year ago at the Hortonworks DataWorks Summit in, was in June, down in San Jose last year. That was where we first got the sense that people were sort of throwing in the towel on trying to build large scale big data platforms on-prem. And what changes now is, are they now evaluating Hortonworks versus Cloudera versus MapR in the cloud or are they widening their consideration as Qubole suggests. Because now they want to look, not only at Cloud Native Hadoop, but they actually might want to look at Cloud Native Services that aren't necessarily related to Hadoop. >> Right. Right. And we know as a service wins. It continues. Platform as a service. Software as a service. Time and time again, as a service either eats a lot of share from the incumbent or knocks the incumbent out. So, Hadoop as a service, regardless of your distro, via one of these types of companies on Amazon, it seems like it's got to win, right. It's going to win.
>> Yeah but the difference is, so far, the Clouderas and the MapRs and the Hortonworks of the world are more software than service when they're in the cloud. They don't hide all the knobs. You still need a highly trained admin to get them up-- >> But not if you buy it as a service, in theory, right. It's going to be packaged up by somebody else and they'll have your knobs all set. >> They're not designed yet that way. >> HDInsight. >> Then they better be careful because it might be a new, as a service distro, of the Hadoop system. >> My point, which is what this is. >> Okay, very good, we'll leave it at that. So George, thanks for spending the day with me. Good show as always. >> And I'll be in a better mood next time when you don't steal my candy bars. >> All right. He's George Gilbert. I'm Jeff Frick. You're watching theCUBE. We're at the historic 99 years young, Wigwam Resort, just outside of Phoenix, Arizona. DataPlatforms 2017. Thanks for watching. It's been a busy season. It'll continue to be a busy season. So keep it tuned. SiliconAngle.TV or YouTube.com/SiliconAngle. Thanks for watching.

Published Date : May 26 2017

ENTITIES

Entity	Category	Confidence
Jeff Frick	PERSON	0.99+
Jeff	PERSON	0.99+
George	PERSON	0.99+
George Goodwin	PERSON	0.99+
George Gilbert	PERSON	0.99+
Michael Porter	PERSON	0.99+
Andretti	PERSON	0.99+
San Jose	LOCATION	0.99+
Amazon	ORGANIZATION	0.99+
64 dollar	QUANTITY	0.99+
70%	QUANTITY	0.99+
Trump	PERSON	0.99+
Oskar Austegard	PERSON	0.99+
June	DATE	0.99+
Oracle	ORGANIZATION	0.99+
Oskar	PERSON	0.99+
Google	ORGANIZATION	0.99+
NASA	ORGANIZATION	0.99+
Qubole	ORGANIZATION	0.99+
one	QUANTITY	0.99+
last year	DATE	0.99+
Hortonworks	ORGANIZATION	0.99+
four plus months	QUANTITY	0.99+
99 years	QUANTITY	0.99+
third one	QUANTITY	0.99+
Phoenix, Arizona	LOCATION	0.99+
a year ago	DATE	0.99+
Splice Machine	ORGANIZATION	0.98+
Both	QUANTITY	0.98+
Microsoft	ORGANIZATION	0.98+
Hadoop	TITLE	0.98+
both	QUANTITY	0.97+
Azure	ORGANIZATION	0.97+
Each	QUANTITY	0.96+
Monte Zweben	PERSON	0.96+
first	QUANTITY	0.94+
MapRs	ORGANIZATION	0.94+
earlier this morning	DATE	0.92+
Wigwam Resort	LOCATION	0.92+
two things	QUANTITY	0.92+
2017	DATE	0.92+
#DataPlatforms2017	EVENT	0.89+
Wikibon	ORGANIZATION	0.89+
second one	QUANTITY	0.89+
three trends	QUANTITY	0.89+
each business process	QUANTITY	0.87+
DataPlatforms	TITLE	0.86+
theCUBE	ORGANIZATION	0.85+
Cloudera	ORGANIZATION	0.85+
Hortonworks DataWorks Summit	EVENT	0.85+
Wigwam Resort	ORGANIZATION	0.85+
Qubole	PERSON	0.84+
Gannett	ORGANIZATION	0.82+
MapR	ORGANIZATION	0.8+
S3	TITLE	0.8+
many years ago	DATE	0.78+
DataPlatforms 2017	EVENT	0.74+
years	DATE	0.73+
YouTube.com/SiliconAngle	OTHER	0.72+
Clouderas	ORGANIZATION	0.7+
Cloud Native	TITLE	0.67+
Platforms	TITLE	0.67+
Google Cloud	TITLE	0.64+
Cloud Native Hadoop	TITLE	0.64+
last couple	DATE	0.64+
Azure	TITLE	0.61+

Search Results for Hortonworks - DataWorks Summit 2017: