Christoph Streubert, SAP - DataWorks Summit Europe 2017 - #DWS17 - #theCUBE

>> Announcer: Live from Munich, Germany, it's The CUBE, covering DataWorks Summit Europe 2017. Brought to you by Hortonworks. >> Okay, welcome back everyone, we are here live in Munich, Germany for DataWorks 2017, the DataWorks Summit, formerly Hadoop Summit. I'm John Furrier with SiliconANGLE's theCUBE, my co-host Dave Vellante, wrapping up day two of coverage here with Christoph Streubert, who's the Senior Director of SAP Big Data, handles all the go-to-market for SAP Big Data, @sapbigdata is the Twitter handle. You have a great shirt there, Go Live. >> Go Live or go home. (Laughs) >> John: You guys are a part. Welcome to theCUBE. >> Christoph: Thank you, I appreciate it. >> Thanks for joining us and on the wrap up. You and I have known each other, we've known each other for a long time. We've been in many Sapphires together, we've had many conversations around the role of data, the role of architecture, the role of how organizations are transforming at the speed of business, which is SAP, it's a lot of software that powers business, under transformation right now. You guys are no stranger to analytics, we have the HANA Cloud Platform now. >> Christoph: We know a thing or two about that, yeah. (laughs) >> You know a little bit about data and legacy as well. You guys power pretty much most of the Fortune 100, if not all of them. What's your thoughts on this? >> Yeah, good point. On the topic of some numbers, about 75% of the world GDP runs through SAP systems eventually. So yes, we know a thing or two about transactional and analytical systems, definitely. >> John: And you're a partner with Hortonworks. >> With Hortonworks and other Cloud providers, Hadoop providers, certainly, absolutely, but in this case, Hortonworks. We have, specifically, a solution that runs on Hadoop Spark and that allows, actually, our customers to unify much, much larger data sets with a system of records that we now do so many of them around the world for new and exciting use cases. 
>> And you were born in Munich. This is your hometown. >> This is actually a home gig for me, exactly. So, yes, unfortunately I'll also be presenting in English but yeah, I want to talk German, Bavarian, all the time. (laughs) >> I see my parents tonight. >> I wish we could help you >> but we don't speak Bavarian. But we do like to drink the beer though. It's the fifth season but a lot of great stuff here in Germany. Dave, you guys, I want to get your thoughts on something. I wanted to get you, just 'cause you're both, you're like an analyst, Christoph as well. I know you're over at SAP but, you know, you have such great industry expertise and Dave obviously covers the stuff everyday. I just think that the data world is so undervalued, in my mind. I think the ecosystem of startups that are coming out in the, out of the open source ecosystems, which are well-defined, by the way, and getting better. But now you have startups doing things like fintech, we just had a bank on. Startups creating value and things like blockchain on the horizon. Other new paradigms are coming on, is going to change the landscape of how wealth is created and value is created and charged. So, you've got a whole new tsunami of change. What's your thoughts on how this expands and obviously, certainly, Hortonworks as a public company and Cloudera is going public, so you expect to see that level up in valuation. >> They're in the process, yes. >> But I still think they're both undervalued. Your thoughts. >> Well it's not just the platform, right? And that's, I think, where Hadoop also came from. The legacy of Hadoop is that you don't have to really think about how you want to use your data. You don't have to think ahead what kind of schema you want to apply and how you want to correlate your data. You can create a large data lake, right? 
That's the term that was created a long time ago, that allows customers to just collect all that data and think in the second stage about what to use with it and how to correlate it. And that's exactly, now, we're also seeing in the third stage, to not just create analytics but also creating applications instead of analytics or on top of analytics, correlating with data that also drives the business, the core business, from an OLTP perspective or also from an OLAP perspective. >> I mean, Dave, you were the one who said Amazon's a trillion dollar TAM, will be the first trillion dollar company and you were kind of, but you looked at the thousand points of Live with Cloud enables, all these aggregated all together, what's your thoughts on valuation of this industry? Because if Hortonworks continues on this pure play and they've got Cloudera coming in and they're doing well, you could argue that they're both undervalued companies if you count the ecosystem. >> Well, we always knew that big data was going to be a heavy lift, right? And I would agree with what Christoph was saying, was that Hadoop is profound in that it was no schema on write and ship five megabytes of code to a petabyte of data. But it was hard to get that right. And I remember something you said, John, at one of our early SAP Sapphires, when the big data meme was just coming through. You said, "You know, SAP is not just big data, it's fast data". And you were talking about bringing transaction and analytic data together. >> John: Right. >> Again, something that has only recently been enabled. And you think about, you know, continuous streaming. I think that, now, big data has sort of entered the young-adulthood phase, we're going to start seeing the steep part of that S-curve returns, and I think the hype will be realized. I think it is undervalued, much like the internet was. It was overvalued, then nobody wanted to touch it, and then it became. 
Actually, if you think back to 1999, the internet was undervalued in terms of what it actually achieved. >> John: Yeah. >> I think the same or similar thing is going to happen with big data. And since we have an SAP guest on, I'll say as well, we all remember the early days of ERP. >> Mhm, oh yeah. >> It wasn't clear >> Nope. >> Who was going to emerge as the king. >> Right. >> There were a few solutions. You're right. >> That's right. And, as well, something else we said about big data, it was the practitioners of ERP that made the most money, that created the most value and the same thing is happening here. >> Yeah. In fact, on that topic, I believe that 2017 and 2018 will be the big years for big data, so to speak. >> John: Uh huh. >> In fact, because of some statistics. >> John: In what way? >> Well, we just did >> Adoption, S-curve? >> Right, exactly. Utilizing the value of big data. You're talking about valuation here, right? 75% of CEOs of the top 1000 believe that the next three years are more important to their business than the last 50. And so that tells me that they're willing to invest. Not just the financial market, where I believe they really run the most sophisticated big data analytics and models today. They had real use cases with real results very quickly. And so, they showed many how it's done. They created sort of the new role of a data scientist. They have roles like an AML officer. It's a real job, they do nothing else but anti-money laundering, right? So, in that industry they've shown us how to do that and I think others will follow. >> Yeah, and I think that when you look at this whole thing about digital transformation, it's all about data. >> John: Yeah. >> I mean, if you're serious about digital transformation, you must become a data-driven company and you have to hop on that curve. Even if you're talking to the, you know, bank today who got on in 2014, which was relatively late, but the pace at which they're advancing is astronomical. 
>> John: Yeah. >> I don't remember his name, a British mathematician, coined, about 11 years ago already, the phrase "Data is the new oil". >> John: Mhm. >> And I think it's very true because crude oil, in its original form, you also can't use it. >> John: It has to be refined. >> Right, exactly. It has to be refined to actually use it and use the value of it. Same thing with data. You have to distill it, you have to correlate it, you have to align it, you have to relate it to business transactions so the business really can take advantage of it. >> And then we're seeing, you know, to your point, you've got, I don't know, a list of big data companies that are now public is growing. It's still small, not much profit. >> I mean, I just think, and this is why I'm getting your reaction, I mean, I'm just reading right now some news popping on my dashboard. Google just released some benchmarks on the TPU, the Tensor Processing Unit, >> Dave: Right. >> Basically a chip dedicated to machine learning. >> Yep. >> You know, so, you're going to start to see some abstraction layers develop, whether it's a hardened-top processor hardware, you guys have certainly done innovation on the analytic side, we've seen that with some of the specialty apps. Just to make things go faster. I mean, so, more and more action is coming, so I would agree that this S-curve is coming. But the game might shift. I mean, this is not an easy, clear path. There's bets being made in big data and there's potential for huge money shift, of value. >> See, one of the things I see, and we talked to Hortonworks about this, the new president, you know, betting all on open source. I happen to think a hybrid model is going to win. I think the rich get richer here. 
SAP, IBM, even Oracle, you know, they can play the open source game and say, "Hey, we're going to contribute to open source, we're going to participate, we're going to utilize open source, but we're also going to put the imprimatur of our install base, our business model, our trusted brands behind so-called big data." We don't really use that term as much anymore. It's the confluence of not only the technology but the companies who, what'd you say, 75% of the world's transactions run through SAP at some point? >> Christoph: Yeah. >> With companies like SAP behind it, and others, that's when this thing, I think, really takes off. >> What I think a lot of people don't realize, and I've been a customer, also, for a long time before I joined the vendor side, and what is under-realized is the aspect of risk management. Once you have a system and once you have business processes digitized and they run your business, you can't introduce radical changes overnight as quickly anymore as you'd like or your business would like. So, risk management is really very important to companies. That's why you see innovation within organizations not necessarily come from the core digitization organization within their enterprise, it often happens on the outside, within different business units that are closer to the product or to the customer or something. >> Something else that's happening, too, that I wanted to address is this notion of digitization, which is all about data, allows companies to jump industries. You're seeing it everywhere, you're seeing Amazon getting into content, Apple getting into financial services. You know, there's this premise out there that Uber isn't about taxicabs, it's about logistics. >> John: Yeah. >> And so you're seeing these born-digital, born in the cloud companies now being able to have massive impacts across different industries. Huge disruption creates, you know, great opportunities, in my view. >> Christoph: Yeah. >> David: What do you think? 
>> I mean, I just think that the disruption is going to be brutal, and I want to, I'm trying to synthesize what's happening in this show, and you know, you're going to squint through all the announcements and the products, really an upgrade to 2.6, a new data platform. But here in Europe the IOT thing just, to me, is a catalyst point because it's really a proof point to where the value is today. >> David: Mhm. >> That people can actually look at and say, "This is going to have an impact on our business", to your digitization point, and I think IOT is pulling the big data industry and cloud together. And I think machine learning and things that come over the top on it are only going to make it go faster. And so that intersection point, where the AI, augmented intelligence, is going to come in, I think that's where you're going to start to see real proof points on value proposition of data. I mean, right now it's all kind of an inner circle game. "Oh yeah, got to get the insights, optimize this process here and there" and so there's some low hanging fruit, but the big shifting, mind blowing, CEO changing strategies will come from some bigger moves. >> To that point, actually, two things I want to mention that SAP does in that space, specifically, right? Startups, we have a program actually, SAP.io, that Bill McDermott also recently introduced again, where we invest in startups in this space to help foster innovation faster, right? And also connecting that with our customers. >> John: What is it called? >> SAP.io. Something to look out for. And on the topic of IOT, we made, also, an announcement at the beginning of the year, Project Leonardo. >> Yeah. >> It's a commitment, it's a solution set, and it's also an investment strategy, right? We're committed in this market to invest, to create solutions, we have solutions already in the cloud and also on premises. There are a few companies we also purchased in conjunction with Leonardo, IOT specifically. 
Some of our customers in the manufacturing space, very strong opportunity for IOT, sensor collection, creating SLAs for robotics on the manufacturing floor. For example, we have a complete solution set to make that possible and realize that for our customers and that's exactly a perfect example where these sensor applications in IOT, edge, compute rich environments come together also with a core where, then, a system of references like machine points, for example, matter because if you manage the SLA for a machine, for example, you not only monitor it, you want to also automatically trigger the replacement of a part, for example, and that's why you need an SAP component, as well. So, in that space, we're heavily investing, as well. >> The other thing I want to say about IOT is, I see it, I mean, cloud and big data have totally disrupted the IT business. You've seen Dell buying EMC, HP had to get out of the cloud business, Oracle pivoted to the cloud, SAP obviously, going hard after the cloud. Very, very disruptive, those two trends. I see IOT as not necessarily disruptive. I see those who have the install base as adopting IOT and doing very, very well. I think it's maybe disruptive to the economy at large, but I think existing companies like GE, like Siemens, like Daimler, are going to do very, very well as a result of IOT. I mean, to the extent they embrace digitization, which they would be crazy not to. >> Alright guys, final thoughts. What's your walkaway from this show? Dave, we'll start with you. >> I was going to say, you know, Hadoop has definitely not failed, in my mind, I think it's been wildly successful. It is entering this new phase that I call sort of young-adulthood and I think it's, we know it's gone mainstream into the enterprise, now it's about, okay, how do I really drive the value of data, as we've been discussing, and hit that steep part of the S-curve. 
Which, I agree, it's going to be within the next two years, you're going to start to see massive returns. And I think this industry is going to be realized, looked back, it was undervalued in 2017. >> Remember how long it took to align on TCP/IP? (laughter) >> Walk away, I mean interoperability was key with TCP/IP. >> Christoph: Yeah. One of the things that made things happen. >> I remember talking about it. (laughter) >> Yeah, two megabits per second. Yeah, but I mean, bringing back that, what's your walkaway? Because is it a unification opportunity? Is it more of an ecosystem? >> A good friend of mine, also at SAP on the West Coast, Andreas Walter, he shared an observation that he saw in another presentation years ago. It was suits versus hoodies. Different kind of way to run your IT shop, right? Top-down structure, waterfall projects, and suits, open source, hack it, quickly done, you know, get in, walk away, make money. >> Whoa, whoa, whoa, the suits were the waterfall, hoodies was the agile. >> Christoph: That's correct. >> Alright, alright, okay. >> Christoph: Correct. So, I think that it's not just the technology that's coming together, it's mindsets that are coming together. And I think organizationally for companies, that's the bigger challenge, actually. Because one is very prescribed, change control oriented, risk management aware. The other is very progressive, innovative, fast adopters. That these two can't bring those together, I think that's the real challenge in organizations. >> John: Mhm, yeah. >> Not the technology. And on that topic, we have a lot of very intelligent questions, very good conversations, deep conversations here with the audience at this event here in Munich. >> Dave, my walkaway was interesting because I had some preconceived notions coming in. Obviously, we were prepared to talk about, and because we saw the S-1 filing by Cloudera, you're starting to see the level of transparency relative to the business model. 
One's worth one billion dollars in private value, and then Hortonworks pushing only 2700 million in a public market, which I would agree with you is undervalued, vis a vis what's going on. So obviously, my observation coming in from here is that I think there's going to be a haircut for Cloudera. The question is how much value will be chopped down off Cloudera, versus how much value of Hortonworks will go up. So the question is, does Cloudera plummet, or does Cloudera get a little bit of a haircut or stay and Hortonworks rises? Either way, the equilibrium in the industry will be established. The other option would be >> Dave: I think the former and the numbers are ugly, let's not sugarcoat it. And so that's got to change in order for this prediction that we're making. >> John: Former being the haircut? >> Yeah, the haircut's going to happen, I think. But the numbers are really ugly. >> But I think the question is how far does it drop and how much of that is venture. >> Sure. >> Venture, arbitrage, or just how they are capitalized but Hortonworks could roll up. >> But my point is that those numbers have to change and get better in order for our prediction to come true. Okay, so, but in your second talk, sorry to interrupt you but >> No, I like a debate and I want to know where that line is. We'll be watching. >> Dave: Yeah. >> But the value in, I think you guys are pointing out but I walk away, is IOT is bigger here, and I already said that, but I think the S-curve is, you're right on. I think you're going to start to see real, fast product development around incorporating data, whether that's a Hortonworks model, which seems to be the nice unifying, partner-oriented one, that's going to start seeing specialized hardware that people are going to start building chips for using flash or other things, and optimizing hard complexities. You pointed that out on the intro yesterday. And putting real product value on the table. 
I think the cards are going to start hitting the table in ecosystem, and what I'm seeing is that happening now. So, I think just an overall healthy ecosystem. >> Without a doubt. >> Okay. >> Great. >> Any final comments? >> Let's have a beer. >> Great to see you in Munich. (laughter) >> We'll have a beer, we had a pig knuckle last night, Dave. We had some sauerkraut. >> Christoph: (speaks foreign word) >> Yeah, we had the (speaks foreign word). Dave, we'll grab the beer, thanks. Good to be with you again. Thanks to the crew, thanks to everyone watching. >> Thanks, John. >> The CUBE, signing off from Munich, Germany for DataWorks 2017. Thanks for watching, see ya next time. (soft techno music)

Published Date : Apr 7 2017



Nadeem Gulzar | DataWorks Summit Europe 2017

>> Announcer: Live from Munich, Germany, it's the CUBE, covering DataWorks Summit Europe 2017. Brought to you by Hortonworks. >> Hey welcome back everyone. We're here live in Munich Germany for DataWorks 2017 Summit, formerly known as Hadoop Summit, now called DataWorks. I'm John Furrier with the CUBE, my co-host Dave Vellante, here for two days of wall-to-wall coverage. Our next guest is Nadeem Gulzar, head of advanced Analytics at Danske Bank. Welcome to the CUBE. >> Thank you. >> You're a customer but also talking here at the event, bringing all your folks here. Your observation, I mean, Hadoop is not going away, certainly we see that. But now, as John Kreisa, who was MC'ing, was on earlier said, open up the aperture to analytics, is really where the action is. >> Nadeem: Absolutely. >> Your reaction to that. >> I completely agree, because again, Hadoop is basically just the basic infrastructure, right. Components build on components, and things like that. But, when you really utilize it, is when you add the advanced analytics frameworks. There are many out there. I'm not going to favor one over another. But the main thing is, you need that to really leverage Hadoop. And, at the same time, I think it's very important to realize how much power there actually is in this. For us, in Danske Bank, getting Hadoop, getting the advanced analytics framework, has really proven quite a lot. It allowed us actually to dig into our core data, transaction data for instance, which we haven't been able to for decades. >> So take me through, because you guys are an interesting use case because you're advanced. You're gettin' at the data, which is cutting edge. But you're going through this transformation, and you have to because you're on the front lines. Take us inside the company, without giving away any trade secrets, and describe the environment. 
What's the current situation, and how is it evolving from an IT standpoint, and also from the relationship with the stakeholders in the business side. >> So again, we are a bank with 20,000 employees, so of course in a large organization you have silos, people feeling okay, this is my domain, this is my kingdom, don't touch it. Don't approach me, or you can approach me, talk to me, you have to convince me, otherwise don't talk to me at all. So we get that quite a lot, and to be honest, from my point of view, if we do not lift as a bank, we're not going to succeed. If I have success, if my organization of almost 60 people have success, that's good in itself, but we are not going to succeed as a bank. So for me, it's quite important that I go down and break down these barriers, and allow us to come in, tell the business units, tell them what sort of capabilities do we bring, and include them. That is actually the main key. I don't want to replace them or anything like that. >> So an organizational challenge is to get the mindset shifted. How 'about process gaps and product gaps? 'Cause I mean I almost see the sequence, kind of a group hug if you will, organizational mindset, kind of a reset or calibration. And then identify processes and then product gaps, seem to be the next transition. >> Absolutely, absolutely, and there are some gaps. Still, even though we have been on this journey for a considerable amount of time, there are still gaps, both in terms of processes and products. Because again, even though we have top management buy in, it doesn't go through all the way down to the middle layer. So we still struggle with this from time to time. >> How do you break down those barriers? What do you do, what's your strategy? >> I'm humble, to be honest. I go in, I tell them, listen you guys I have some capabilities that I can add to your capabilities. I want you to leverage me to make your life easier. I want to lift you as an organization. 
I don't care about myself, I want you to be better at what you're doing. >> So Nadeem, the money business and the technology business have always had a close relationship. It was like in 2010 after we came out of the downturn, it was like this other massive collision. You had begun experimenting with Cloud, the shift, CapEx to OpEx. The data thing hit in a big way, obviously mobile became real. So talk about the confluence of those technologies, specifically in the context of your big data journey. Where did you get started, and how did it evolve? >> So actually it fit in quite nicely because we were coming out of this down period, right, so there was extreme amount of focus on cost. So, of course at the time where we wanted to go into this journey, a lot of people were asking, okay how much does this cost, what's the big strategy, and so on. And how's the road map going to look like, and what's the cost of the road map? The thing is, if you buy some off the shelf commercial product, it's quite expensive. We can easily talk like half a billion, something like that, for a full end to end system. So with this, you were allowed, or we were allowed, to start up with relatively small funding, and I'm actually talking about just like a million dollars, roughly. And that actually allowed us a substantial boost in the capability department, in allowing us to show what kind of use cases we could build, and what kind of value we could bring to Danske Bank. >> So you started with understanding Hadoop? Is that right, was that the starting point? >> Yes, in a fairly small, very researched team set up. We did the initial research, we looked at, okay what could this bring? We did some initial, what we call, proof of value. So small, small, pilot projects, looking at, okay this is the data. We can leverage it in this way, this is the value we can bring. How much can we actually boost the business? So everything is directly linked to business value. 
So, for instance, one of the use cases was within customers, understanding customer behavior, directly linking it to marketing, do more targeted marketing, and at the end get more results in terms of increased sales. >> We just started a journey 2009, 2010, is that right? Or was it later? >> No, we started somewhat later. The initial research was in '14. >> In '14? Okay, alright, so '14 you sort of became familiar with Hadoop, and then I imagine, like many customers, you said okay, wow this stuff is complicated, but you were takin' it in small chunks, low risk. Let's get some value. Marketing is an obvious use case. I would imagine fraud is another obvious use case. So then, how did that evolve? I mean it's only a few years now, but I imagine you've evolved very quickly. >> Extremely quickly. Actually, within two months of the research, we actually saw a huge benefit in this area, and directly we went with the material to the senior members of the different boards we wanted to affect, and actually, you could call it luck. But, maybe we were just well prepared and convincing, so we actually directly got funding at that point in time. They said, listen, this is very promising. Here you go, start off with the initial, slightly larger projects, prove some value, and then come back to us. Initially they wanted us to do two things, look into the customer journey, or doing deeper customer behavior analytics, and the second was within risk. Doing things like, text mining, financial statements, getting some deeper into that, doing some web crawling on financial data such as Bloomberg, etcetera, and then pull it into the system. >> To inform your investments as a financial institution. From an architecture and infrastructure standpoint, we talked about starting at Hadoop. Has it evolved, how has it evolved? Where do you see it going? >> It has evolved quite a lot in the past couple of years. 
And again, to be honest, it's like every quarter something new is happening and we need to do some adjustments even to the core architecture. And with the introduction of HDP 3 later this year, I think we're going to see a massive change once again. Hortonworks already calls it a major change, or a major release. But actually, the things they are doing is extremely promising, so we want to take that step with them. But again, it's going to affect us. >> What's exciting about that to you? >> The thing that's very exciting is, we are now at like a balance point, where we have played quite a lot, we have released a couple of production grade solutions, but we have really not reached the full enterprise potential. So getting like into the real deep stuff with living under heavy SLA's, regulation stuff. All these kind of things is not in place yet, from my point of view. >> We talk a lot about, in the CUBE, and in our company, about these emergent work loads; you had batch, interactive, and the world went back to batch with Hadoop, and now you have this continuous workload, this streaming real-time workloads. How is that affecting your organization, generally, and specifically, you're thinking about architecture. How real is that and where do you see that in the future? >> It's the core, to be honest. Again, one of the main things we are trying to do is look into, so, gone are the days with heavy, heavy batches of data coming in. Because if you look at web logs for instance, so when customers interact with our web, or our tablet solution, or mobile solution, the amount of data generated is humongous. So, no way on earth you can think about batches anymore. So it's more about streaming the data all the way in, doing real time analytics and then produce results. >> What would you say are your biggest, big data challenges, problems that you really want to attack and solve. >> So, what I really want to attack is, getting all sorts of data into the system. 
So, you can imagine, as a bank we have 2,000 plus systems. We have approximately 4,000 different points that deliver data. So getting all that mass into our data lake, it's a huge task. We actually underestimated it. But now, we have seen we have to attack it and get it in because that is the gold. Data is the future gold. So we need to mine it in, we need to do analytics on top of it and produce value. >> And then once you get it in there, I'm sure you're anticipating that you want to make sure this doesn't go stale, doesn't become a swamp, doesn't get frozen. It's your job to talk about data oceans, which is really the long term vision I presume, right? >> And that is a key as well because with the GDPR for instance, we need to have full mapping and full control of all the data coming in. We need to be able to generate metadata, we need to have full data lineage. We need to know, for all the data, where it came from, how it's interconnected, relations, all that. >> And that's what, two years away from implementation? Is that about right? >> It's going to take a while, of course. But again, the key thing is we make the framework so all the data coming in step by step, has that. >> Yeah, but so GDPR though, it goes into effect in '19, is that correct? >> It's actually May '18. >> May '18, oh, so it's a much tighter time frame than I realized. >> John: You're under the gun. >> Nadeem: Yes. >> Okay, observation here at this event, obviously a lot of IOT, for you that's people. People and things are kind of the edge of the network. The intelligent edge is a big, big topic. Very dynamic. >> Nadeem: Extremely dynamic. >> A lot of things happening. Lot of opportunities for you to be this humble service provider to your constituents, but also your customers. How do you guys view that? What's the current landscape look like as you look outside the company and look at what's happening around you, the world? >> A lot of cool things are going on, to be honest.
Especially in IOT, right? I mean, even though we are a core bank, still, there are a lot of sensors we can use. I talked a bit about, under the keynote, about ATM's, right? So, we're also looking at how can we utilize this technology? How can we enable our customers? If you look at our apps, they also generate extreme amounts of data, right? The mobile solution that we have, it gives away GPS location and things like that. And we want to include all that data in. At the end of the day, it's not for our gain, we are not always looking at making the next buck, right? It's also about being there for the customer, providing the services they need, making their banking life easier. >> And your ecosystem is evolving and rapidly adding new constituents to your network because, then you have the consumer with the phone, the mobile app alone, never mind the point of sale opportunity at the ATM. Now a digital, augmented reality experience could be enabled where you now have fintech suppliers, and potentially other suppliers in this now digital network that could be relational with you. >> Yes, and our job is to make sure that we leverage that. Acquiring a banking license is extremely difficult. But we have it, and what we need to do is to engage these fintechs, partners, even other banks, and say listen guys we invite you in. Utilize our services, utilize our framework, utilize our foundation and let's build something upon that. >> If you had to explain, Nadeem, this fintech start up trend because it is super hot, what is it? I mean how would you describe to someone who's not in the banking world. 'Cause most people would be scratching their head and say, isn't that banking? But, now this ecosystem is developing of new entrepreneurial activity and they're skyrocketing with success 'cause they have either a specialty focus, they do something extremely well. It may or may not be in a direct big space with a bank, but a white space. Use cases. So, is it good? Is it bad? 
Is it hype? What's the current state of the fintech situation? >> From my point of view, it's awesome. And the reason is, these guys are pushing us. Remember, we are a hundred fifty plus year old bank. And sometimes we do tend to just pat on our back and say, okay, this is going good, right? But, these guys are coming in, giving some competition, and we love it. >> Give me an example of a fintech capabilities. Randomly bring up some examples to highlight what fintech is. >> So what we've seen in, for instance the German market, is the fintechs coming in, utilizing some of the customer data, and then producing awesome new applications. Whether it is a new net bank, where a customer can interact with it, in a much, much more smoother way. Some of the banks tend to over clutter things, not make it simple. So things like, where you can put in, you can look at your transactions in a Google Map, for instance. You can see how much do you spend at this location. You can move around. >> You could literally follow the money, on a map. (laughing) >> So this is your home base, you go out here, you spend this amount of money, and maybe even add more on it. So, let's say you do your grocery shopping over here, but if I moved all my business from this company to this company, how much could I save? Imagine if you could just drag and drop it and see, okay, I could actually save a couple of thousand bucks, awesome. >> And machine learning is going to totally change the game with Augmented Intelligence. AI is called Artificial Intelligence, or Augmented Intelligence, depending upon your definition. This is a good thing for consumers. >> It is, it is. >> And thinking about disruption, what do you guys, what are your thoughts on blockchain? What is your research showing? You playing around with Hyperledger at all? >> Yes we are. And blockchain, it's also quite interesting. We're doing lots of research on that. 
What it's shown actually is that this is a technology that we can also use. And we can also really utilize, even the security aspects of it. If you just take that, you could really implement that. >> The identity aspect, it's federating identity around fraud, another area you can innovate on. I'm bullish on blockchain, a lot of people are skeptical, but Dave knows I really, I love blockchain. Because it's not about Bitcoin per se, it's sort of the underlying opportunity. It just seems fascinating. Dave you know, I got to get on my soapbox, blockchain soapbox. >> We've never really looked at Bitcoin as just a currency, it's more of a technology platform, and I have always been fascinated with the security angle. Virtually unhackable, put that in quotes. No need for a third party to intermediate. So many positive fundamentals, now it's guys like you figuring out, okay the practitioner saying, here's how we're going to implement it and commercialize it. >> And actually it fits in quite well with things like GDPR. This is also about opening up, the same with PSD2. Exposing the customer data, making it available for the general public. And ultimately the goal is, so you as a consumer, me as a consumer, we own our data. >> Nadeem, thank you so much for coming on the CUBE and sharing your practitioner situation, and your advice, as well as commentary. I'll give you the last word. As you and your team embark from DataWorks 2017 and head back to the ranch, so to speak, and bring back some stuff. What are you going to work on? What's the to-do item? What are you going to sharpen the saw on and cut when you get back? >> So for us on the very, very short term, it's about taking our platform and our capabilities and moving it into the real enterprise world. That is our first key milestone that we are going to go for. And, I'll tell you, we're going to go all in for that.
Because, unless we do that, we're not able to really attack the core of banking, which requires this, right? Please remember that a consumer doing a transaction somewhere in the world, he cannot stand and wait for ages for something to be processed. It needs to be instantaneous. So, this is what we need to do. >> You think this event, you're armed up with product. >> Absolutely, absolutely. Lots of good insight we've gotten from this. Lots of potential, lots of networking guys and other companies that we can talk to about this. >> Also great recruiting, get some developers out there too, lot of great people. Congratulations on your success and thanks for sharing this great insight here on the CUBE, exposing the data to you live on the CUBE. Silicon Angle dot TV, I'm John Furrier, with Dave Vellante my co-host, more great coverage stay with us here live in Munich, Germany for DataWorks 2017 Summit. We'll be right back.

Published Date : Apr 6 2017

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
John	PERSON	0.99+
Danske Bank	ORGANIZATION	0.99+
Nadeem	PERSON	0.99+
John Kreisa	PERSON	0.99+
Nadeem Gulzar	PERSON	0.99+
Dave	PERSON	0.99+
May '18	DATE	0.99+
2009	DATE	0.99+
2010	DATE	0.99+
John Furrier	PERSON	0.99+
Bloomberg	ORGANIZATION	0.99+
20,000 employees	QUANTITY	0.99+
two days	QUANTITY	0.99+
half a billion	QUANTITY	0.99+
two years	QUANTITY	0.99+
two months	QUANTITY	0.99+
Hortonworks	ORGANIZATION	0.99+
'19	DATE	0.99+
'14	DATE	0.99+
CUBE	ORGANIZATION	0.99+
two things	QUANTITY	0.99+
Google Map	TITLE	0.99+
Munich, Germany	LOCATION	0.99+
both	QUANTITY	0.98+
DataWorks 2017 Summit	EVENT	0.98+
GDPR	TITLE	0.98+
Hadoop	TITLE	0.98+
DataWorks	EVENT	0.98+
Munich Germany	LOCATION	0.98+
PSD 2	TITLE	0.98+
first key milestone	QUANTITY	0.98+
second	QUANTITY	0.97+
DataWorks Summit	EVENT	0.97+
Hadoop Summit	EVENT	0.97+
one	QUANTITY	0.97+
almost 60 people	QUANTITY	0.95+
Hadoop	ORGANIZATION	0.95+
later this year	DATE	0.94+
approximately 4,000 different points	QUANTITY	0.94+
2,000 plus systems	QUANTITY	0.93+
a hundred fifty plus year old	QUANTITY	0.93+
Silicon Angle dot TV	ORGANIZATION	0.93+
2017	EVENT	0.93+
a couple of thousand bucks	QUANTITY	0.87+
DataWorks Summit Europe 2017	EVENT	0.85+
decades	QUANTITY	0.85+
German	LOCATION	0.84+
OpEx	ORGANIZATION	0.82+
past couple of years	DATE	0.76+
a million dollars	QUANTITY	0.76+
earth	LOCATION	0.75+
CapEx	ORGANIZATION	0.74+
Hyperledger	ORGANIZATION	0.71+
2017	DATE	0.7+
3	COMMERCIAL_ITEM	0.68+
Bitcoin	OTHER	0.64+
SLA	TITLE	0.64+
Europe	LOCATION	0.59+
Weblocks	ORGANIZATION	0.59+
HDB	TITLE	0.58+
years	QUANTITY	0.57+
DataWorks	TITLE	0.49+
Cloud	TITLE	0.44+
CUBE	TITLE	0.41+

Gianthomas Volpe & Bertrand Cariou | DataWorks Summit Europe 2017

(upbeat music) >> Announcer: Live from Munich, Germany, it's the Cube covering DataWorks Summit Europe, 2017. Brought to you by Hortonworks. >> Hey, welcome back everyone. We're here live in Munich, Germany, at the DataWorks 2017 Summit. I'm John Furrier, my co-host Dave Vellante with the Cube, and our next two guests are Gianthomas Volpe, head of customer development EMEA for Alation. Welcome to the Cube. And we have Bertrand Cariou, who's the director of solution marketing at Trifacta with partners. Guys, welcome to the Cube. >> Thank you. >> Thank you for having us. >> Big fans of both your start-ups and growing. You guys are doing great. We had your CEO on our Big Data SV, Joe Hellerstein, he talked about data wrangling, all the cool stuff that's going on, and Alation, we know Stephanie has been on many times, but you guys are start ups that are doing very well and growing in this ecosystem, and, you know, everyone's going public. Cloudera has filed their S-1, great news for those guys, so the data world has changed beyond Hadoop. You're seeing it, obviously Hadoop is not dead, but it's still going to be a critical component of a larger ecosystem that's developing. You guys are part of that. So I want to get your thoughts of why you're here in Europe, okay? And how you guys are working together to take data to the next level, because, you know, we're hearing more and more data is a foundational conversation starter, because now there's other things happening, IOT, business analysts, you guys are in the heart of it. Your thoughts? >> You know, going to be you. >> All in, yeah, sure. So definitely at Alation what we're seeing is more and more people across the organization want to get access to the data, and we're kind of breaking out of the traditional roles around IT managing both metadata, data preparation, like Trifacta's focused on. So we're pretty squarely focused on how do we bring that access to a wider range of people?
How do we enable that social and collaborative approach to working with that data, whether it's in a data lake, so, here at DataWorks, clearly that's one of the main topics, but also other data sources within the organization. >> So you're freeing the data up and the whole collaboration thing is more of, okay, don't just look at IT as this black box of give me some data and now spit out some data at me. Maybe that's the old way. The new way is okay, all of the data's out there, they're doing their thing, but the collaboration is for the user to get into that data you know, ingestion. Playing with the data, using the data, shaping the data. Developing with the data. Whatever they're doing, right? >> It's just bringing transparency to not only what IT is doing and making that accessible to users, but also helping users collaborate across different silos within an organization, so. We look at things like logs to understand who is doing what with the data, so if I'm working in one group, I can find out that somebody in a completely different group in the organization is working with similar data, bringing new techniques to their analysis, and can start leveraging that and have a conversation that others can learn from, too. >> So basically it's like a discovery platform for saying hey, you know, Mary in department X has got these models. I can leverage that. Is that kind of what you guys are all about? >> Yeah, definitely. And breaking through that, enabling communication across the different levels of the organization, and teaching other people at all different levels of maturity within the company, how they can start interacting with data and giving them the tools to up skill throughout that process. >> Bertrand, how about Trifacta?
'Cause one of the things that I find exciting about your value proposition and talking to Joe, the founder, besides the fact that they all have GitHub on their about page, which is the coolest thing ever, 'cause they're all developers. But the reality is that a business person or person dealing with data in some part of a geography, could be whether it's in Europe or in the US, might have a completely different view and interest in data than someone in another area. It could be sales data, could be retail data, it doesn't matter but it's never going to be the same schema. So the issue is, you've got to take that complexity away from the user. That is really a fundamental change. >> Yeah. You're totally correct. So information is there, it is available. Alation helps identify what is the right information that can be used, so if I'm in marketing, I could reuse sales information, associating maybe with web logs information. Alation will give me the opportunity to know what information is available and if I can trust it. If someone in finance is using that information, I can trust that data. So now as a user, I want to take that data, maybe combine the data, and the data is always in a different format, structure, level of quality, and the work of data wrangling is really for the end user, you can be an analyst. Someone in the line of business most of the time, these could be like some of the customers we have here in Germany like Munich Re would be actuaries. Building risk models and/or claims forecasting, payment forecasting. So they are not technologists at all, but they need to combine these data sets by themselves, and at scale, and the work they're doing, they are producing new information and this information is used directly for their own business, but as soon as they share this information, back to the data lake, Alation will index this information, see how it is used, and give visibility to the other users for reuse as well.
>> So you guys have a partnership, or is this more of a standard API kind of thing? >> So we do have a partnership, we have planned development on the roadmap. It's currently happening. So I think by the end of the quarter, we're going to be delivering a new integration where whether I'm in Alation and looking for data and finding something that I want to work with, that I know needs to be prepared, I can quickly jump into Trifacta to do that. Or the other way around in Trifacta, if I'm looking for data to prepare, I can open the catalog, quickly find out what exists and how to work with it better. >> So basically the relationship, if I get this right is, you guys pass on your expertise of the data wrangling all the back processes you guys have, and advertise that into Alation. They discover it, make it surfaceable for the social collaboration or the business collaboration. >> Exactly. And when the data is wrangled, it is again indexed and so it's a virtuous circle where all the data that is created and combined is exposed to the user to be reused. >> So if I were Chief Data Officer, I'd say okay, there's three sequential things that I need to do, and you can maybe help me with a couple of them. So the first one is I need to understand how data contributes to the monetization of my company, if I'm a public company or a for profit company. That's, I guess my challenge. But then, there are two other things: I need to give people access to that data, and I need quality. So I presume Alation can help me understand what data's available. I can actually, it kind of helps with number one as well because like you said, okay, this is the type of data, this is how the business process works. Feed it. And then the access piece and quality. I guess the quality is really where Trifacta comes in. >> Gianthomas: Yes. >> What about that sequential flow that I just described? Is that common? >> Yeah >> In your business, your customer base. >> It's definitely very common.
So, kind of going back to the Munich Re examples, since we're here in Munich, they're very focused on providing better services around risk reduction for their customers. Data that can impact that risk can be of all kinds from all different places. You kind of have to think five, ten years ahead of where we are now to see where it might be coming from. So you're going to have a ton of data going in to the data lake. Just because you have a lot of data, that does not mean that people will know how to work with it; they won't know that it exists. And especially since the volumes are so high. It doesn't mean that it's all coming in at a greatly usable format. So Alation comes in to play in helping you find not only what exists, by automating that process of extraction but also looking at what data people are actually using. So going back to your point of how do I know what data's driving value for the organization, we can tell you in this schema, this is what's actually being used the most. That's a pretty good starting point to focus in on what is driving value and when you do find something, then you can move over to Trifacta to prepare it and get it ready for analysis. >> So keying on that for a second, so in the example of Munich Re, the value there is my reduction in expected loss. I'm going to reduce my risk, that puts money in my bottom line. Okay, so you can help me with number one, and then take that Munich Re example into Trifacta. >> Yes, so the user will be the same user using Alation and Trifacta. So it's an actuary. So as soon as the actuary identifies the data that is the most relevant for what they'll be planning, so the actuaries are working with terms like development triangles over 20 years. And usually it's column by column. So they have to pivot the data row by row. They have to associate that with the paid claims, the new claims coming in, so all this information is in different formats.
Then they have to look at maybe weather information, or additional third party information where the level of quality is not well known, so they are bringing data in the lake that is not yet known. And they're combining all this data. The outcome of that work, that helps in the risk modeling so that could be used by, they could use SAS or R or other technology for the risk modeling. But when they've done that modeling and building these new data sets, they're, again, available to the community because Alation would index that information and explain how it is used. The other thing that we've seen with our users is there's also a very strong, if you think about insurance, banks, pharma companies, there is a lot of regulation. So, as the user, as you are creating new data, you have to say where the data is coming from. Where the data is going, how is it used in the company? So we're capturing all that information. Trifacta would have the rules to transform the data, Alation will see the overall high-level picture from table to the source system where the data comes from. So super important as well for the team. >> And just one follow up. In that example, the actuary, I know hard core data scientists hate this term, but the actuary is the citizen data scientist. Is that right? >> The actuaries would know I would say statistics, usually. But you get multiple levels of actuaries. You get many actuaries, they're Excel users. They have to prepare data. They have to pivot, structure the data to give it to the next actuary that will be doing the pricing model or the next actuary that will do the risk modeling. >> You guys are hitting on a great formula which is cutting edge, which is why you guys are on the startups. But, Bertrand I want to talk to you about your experience at Informatica. You were the founder of Informatica France. And you're also involved in some product development in the old, I'd say old days, but like.
Back in the days when structured data and enterprise data, which was once a hard problem, deal with metadata, deal with search, you had schemas, all kinds of stuff to deal with. It was very difficult. You have expertise. I want you to talk about what's different now in this environment. Because it's still challenging. But now the world has got so much fast data, we got so much new IOT data, especially here in Europe. >> Oh yes. >> Where you have an industrialized focus, certainly Germany, like case in point, but it's pretty smart mobility going on in Europe. You've always had that mobile environment. You've got smart cities. A lot of focus on data. What's the new world like now? How are people dealing with this? What's your perspective? >> Yes, so there's, and we all know about, big data and with all this volume, additional volume and new structure of data. And I would say legacy technology can deal, as you mentioned, with well structured information. Also you want to give that information to the masses. Because the people who know the data best, are the business people. They know what to do with the data, but the access of this data is pretty complicated. So where Trifacta is really differentiating and has been thinking through that is to say whatever the structure of the data, IOT, web logs, key-value pairs, JSON, XML, that should be for an end user, just metrics. So that's the way you understand the data. The next thing when you play with data, usually you don't know what the schema would be at the end. Because you don't know what the outcome is. So, you are, as an end user, you are exploring the data, combining data sets, and the structure is created as you discover the data. So that is also something new compared to the old model where an end user would go to the data engineer to say I need that information, can you give me that information? And engineers would look at that and say okay. We can access here, what is the schema? There was all this back and forth.
>> There was so much friction in the old way, because the creativity of the user is independent now of all that scaffolding and all the wrangling, pre-processing. So I get that piece of the citizen analyst. But the key thing here is you were shrecking with the complexity to get the job done. So the question then comes in, because it's interesting, all the theme here at DataWorks Summit in Europe and in the US is all the big transformative conversations are starting with business people. So this is a business unit so the front lines if you will, not IT. Although IT now's got to support that. If that's the case, the world's shifting to the business owners. Hence your start up. Is that kind of getting that right? >> I think so. And I think that's also where we're positioning ourselves is you have a data lake, you can put tons of data in it, but if you don't find an easy way to make that accessible to a business user, you're not going to get a value out of it. It's just going to become a storage place. So really, what we've focused on is how do you make that layer easily accessible? How do you share around and bring some of the common business practices to that? And make sure that you're communicating with IT. So IT shouldn't be cast aside, but they should have an ongoing relationship with the business user. >> By the way, I'll point out that Dave knows I'm not really a big fan of the data lake concept mainly because they've turned it into data swamps because IT deploys it, we're done! You know, check the box. But, data's getting stale because it's not being leveraged. You're not impacting the data or making it addressable, or discoverable or even wrangleable. If that's a word. But my point is that's all complexities. >> Yes, so we call it sort of frozen data lake. You build a lake, and then it's frozen and nobody can go fishing. >> You play hockey on it. (laughs) >> You dig and you're fishing.
>> And you need to have this collaboration ongoing with the IT people, because they own the infrastructure. They can feed the lake with data with the business. If there is no collaboration, and we've seen that multiple times. Data lake initiatives, and then we come back one year after and there is no one using the lake, like one, two percent of the processing power, or the data is used. Nobody is going to the lake. So you need to index the data, catalog the data to know what is available. >> And the psychology for IT is important here, and I was talking yesterday with IBM folks, Nevacarti here, but this is important because IT is not necessarily in a position of doing it because doing the frozen lake or data swamp because they want to screw over the business people, they just do their job, but here you're empowering them because you guys have got some tech that's enabling the IT to do a data lake or data environment that allows them to free up the hassles, but more importantly, satisfy the business customer. >> Gianthomas: Exactly. >> There's a lot of tech involved. And certainly we've talked to you guys about that. Talk about that dynamic of the psychology because that's what IT wants. So what's that dev ops mindset for data, data ops if you will or you know, data as code if you will, constantly what we've been calling it but that's now the cloud ethos hits the data ethos. Kind of coming together. >> Yes, I think data catalogs are subtly different in that traditionally they are more of an IT function, but to some extent on the metadata side, whereas on the business side, they tended to be a siloed organization of information that business itself kept to maintain very manually. So we've tried to bring that together. All the different parties within this process from the IT side to the governance and stewardship all the way down to the analysts and data scientists can get value out of a data catalog that can help each other out throughout that process.
So if it's communicating to end users what kind of impact any change IT will make, that makes their life easier, and have one way to communicate that out and see what's going to happen. But also understand what the business is doing for governance or stewardship. You can't really govern or curate if you don't know what exists and what matters to the business itself. So bringing those different stages together, helping them help each other is really what Alation does. >> Tell us about the prospects that you guys are engaging in from a customer standpoint. What are some of the conversations of those customers you haven't gotten yet together. And also give an example of a customer that you guys have, and use cases where they've been successful. >> Absolutely. So typically what we see, is that an organization is starting up a data lake or they already have legacy data warehouses. Often it's both, together. And they just need a unified way of making information about those environments available to end users. And they want to have that better relationship. So we're often seeing IT engaged in trying to develop that relationship along with the business. So that's typically how we start and we, in the process of deploying, work in to that conversation of now that you know what exists, what you might want to work with, you're often going to have to do some level of preparation or transformation. And that's what makes Trifacta a great fit for us, as a partner, is coming to that next step. >> Yeah, on Mobile Market Share, one of our common customers, we have DNSS, also a common customer, eBay, a common customer. So we've got already multiple customers and so some information about the issue Market Share, they have to deal with their customer information. So the first thing they receive is data, digital information about ads, and so it's really marketing type of data. They have to assess the quality of the data.
They have to understand what values and combine the value with their existing data to provide back analytics to their customers. And that use case, we were talking to the business users, my people selling Market Share to their customers, because the faster they can onboard their data and qualify the quality of the data, the easier it is to deliver the right level of quality analytics. And also to engage more customers. So it really was about fast onboarding of customer data and delivering analytics. And where Alation comes in is that they can then analyze all the SQL statements that the customers, maybe I'll let you talk about the use case, but there's also, it was the same users looking at the same information, so we engage with the business users. >> I wonder if we can talk about the different roles. You hear about the data scientists obviously, the data engineer, there might be a data quality professional involved, there's certainly the application developer. These guys may or may not even be in IT. And then you got a DBA. Then you may have somebody who's a statistician. They might sit in the line of business. Am I overcomplicating it? Do larger organizations have these different roles? And how do you help bring them together? >> I'd say that those roles are still in flux in the industry. Sometimes they sit on IT's side, sometimes they sit in the business. I think there's a lot of movement happening, there's not a consistent definition of those different roles. So I think it comes down to different functions. Sometimes you find those functions happening within different places in the company. So stewardship and governance may happen on the IT side, it might happen on the business side, and it's almost a maturity scale of how involved the two sides are within that. So we play with all of those different groups so it's sometimes hard to narrow down exactly who it is.
But generally it's on the consumption side, whether it's the analysts or data scientists, and there's definitely a crossover between the two groups, moving up towards the governance and stewardship that wants to enable those users or document and curate the data for them, all the way to the IT data engineers that operationalize a lot of the work that the data scientists and analysts might be hypothesizing and working with in their research. >> And you sell to all of those roles? Who's your primary user constituency, or advocate? >> We sell both to the analytics groups as well as governance and they often merge together. But we tend to talk to all of those constituencies throughout a sales cycle. >> And how prominent in your customer base do you see the role of the Chief Data Officer? Is it only confined within regulated industries? Does it seep into non-regulated industries? >> I'd say for us, it seeps into non-regulated industries. >> What percent of the customers, for instance have, just anecdotally, not even customers, just people that you talk to, have a Chief Data Officer? Formal Chief Data Officer? >> I'd say probably about 60 to 70 percent. >> That high? >> Yeah, same for us. In regulated industries (mumbles). I think they play a role. The real advantage of a Chief Data and Analytics Officer, it's data and analytics, and they have to look at governance. Governance could be for regulation, because you have to, you've got governance policy, which data can be combined with which data, there is a lot. And you need to add that. But then, even if you are less regulated, you need to know what data is available, and what data is (mumbles). So you have this requirement as well. We see them a lot. They are more and more powerful, I would say, in the enterprise where they are able to collaborate with the business to enable the business. >> Thanks so much for coming on the Cube, I really appreciate it. Congratulations on your partnership.
Final word I'll give you guys before we end the segment. Share a story, obviously you guys have a unique partnership, you've been in the business for a while, breaking into the business with Alation. Hot startups. What observations out there that people should know about that might not be known in this data world. Obviously there's a lot of false premises out there on what the industry may or may not be, but there's certainly a sea change happening. You see AI, it gives a mental model for people, Machine Learning, Autonomous Vehicles, Smart Cities, some amazing, kind of magical things going on. But for the basic business out there, they're struggling. And there's a lot of opportunities if they get it right, what thing, observation, data, pattern you're seeing that people should know about that may not be known? It could be something anecdotal or something specific. >> You go first. (laughs) >> So maybe this will be surprising, but Kaiser is a big customer of ours. And you know Kaiser in California in the US. They have hundreds or thousands of hospitals. And surprisingly, some of the supply chain people have been working for years, trying to analyze, optimizing the relationship with their suppliers. Typically they would buy a staple gun without staples. Stupid. But they see that happening over and over with many products. They were never able to solve this, because why? There will be one product that has to go to IT, they have to work, it would take two months and there's another supplier, new products. So how to know- >> John: They're chasing their tail! >> Yeah. It's not super exciting, but they are now able to do that in a couple of hours. So for them, they are able, by going to the data lakes, see what data, see how this hospital is buying, they were not able to do it. So there is nothing magical here, it's just giving access to the data to those who know the data best, the analysts.
>> So your point is don't underestimate the innovation, as small as it may seem, or inconsequential, could have huge impacts. >> The innovation goes with the process to be more efficient with the data, not so much building new products, just basically being good at what you do, so then you can focus on the value you bring to the company. >> GianThomas, what are your thoughts? >> So it's sort of related. I would actually say something we've seen pretty often is companies, all sizes, are all struggling with very similar problems in the data space specifically, so it's not that big companies have it all figured out, small companies are behind trying to catch up, and small companies aren't necessarily super agile and aren't able to change at the drop of a hat. So it's a journey. It's a journey and it's understanding what your problems are with the data in the company and it's about figuring out what works best for your solution, or for your problems. And understanding how that impacts everyone in the business. So it's really a learning process to understand what's going- >> What do your friends who aren't in the tech business say to you? Hey, what's this data thing? How do you explain it? The fundamental shift, how do you explain it? What do you say to them? >> I'm more and more getting people that already have an idea of what this data thing is. Which five years ago was not the case. Five years ago, it was oh, what's data? Tell me more about that? Why do you need to know about what's in these databases? Now, they actually get why that's important. So it's becoming a concept that everyone understands. Now it's just a matter of moving it into practice and how that actually works. >> Operationalizing it, all the things you're talking about. Guys, thanks so much for bringing the insights. We wrangled it here on the Cube. Live. Congratulations to Trifacta and Alation. Great startups, you guys are doing great.
Good to see you guys successful again, and a rising tide floats all boats in this open source world we're living in, and we're bringing you more coverage here at DataWorks 2017. I'm John Furrier with Dave Vellante. Stay with us, more great content coming after this short break. (upbeat music)

Published Date : Apr 6 2017


ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Joe	PERSON	0.99+
Dave	PERSON	0.99+
Joe Hellerstein	PERSON	0.99+
John	PERSON	0.99+
John Furrier	PERSON	0.99+
Europe	LOCATION	0.99+
California	LOCATION	0.99+
Germany	LOCATION	0.99+
Bertrand	PERSON	0.99+
Bertrand Cariou	PERSON	0.99+
hundreds	QUANTITY	0.99+
Gianthomas Volpe	PERSON	0.99+
Informatica	ORGANIZATION	0.99+
Munich	LOCATION	0.99+
IBM	ORGANIZATION	0.99+
Alation	ORGANIZATION	0.99+
yesterday	DATE	0.99+
Stephanie	PERSON	0.99+
two groups	QUANTITY	0.99+
US	LOCATION	0.99+
two months	QUANTITY	0.99+
Mary	PERSON	0.99+
John Furrier	PERSON	0.99+
five	QUANTITY	0.99+
Kaiser	ORGANIZATION	0.99+
two sides	QUANTITY	0.99+
Munich Re	ORGANIZATION	0.99+
GianThomas	PERSON	0.99+
Trifacta	ORGANIZATION	0.99+
eBay	ORGANIZATION	0.99+
two things	QUANTITY	0.99+
Cloud Air	ORGANIZATION	0.99+
one product	QUANTITY	0.99+
Alation	PERSON	0.99+
five years ago	DATE	0.99+
Munich, Germany	LOCATION	0.98+
both	QUANTITY	0.98+
Excel	TITLE	0.98+
GeanThomas	PERSON	0.98+
over 20 years	QUANTITY	0.98+
DataWorks Summit	EVENT	0.98+
one	QUANTITY	0.98+
Five years ago	DATE	0.98+
Informatica France	ORGANIZATION	0.98+
two person	QUANTITY	0.98+
first one	QUANTITY	0.98+
Hadoop	TITLE	0.97+
DataWorks	ORGANIZATION	0.97+
thousands	QUANTITY	0.97+
Munich Re	ORGANIZATION	0.96+
Hortonworks	ORGANIZATION	0.96+
one group	QUANTITY	0.96+
DataWorks 2017 Summit	EVENT	0.96+
first	QUANTITY	0.96+
GitHub	ORGANIZATION	0.96+
ten years	QUANTITY	0.96+
about 60	QUANTITY	0.96+
first thing	QUANTITY	0.95+
Cube	ORGANIZATION	0.95+
Eugene Learning	ORGANIZATION	0.94+

Carlo Vaiti | DataWorks Summit Europe 2017

>> Announcer: Live from Munich, Germany, it's theCUBE, covering DataWorks Summit Europe 2017. Brought to you by Hortonworks. >> Hello, everyone, welcome back to live coverage at DataWorks 2017, I'm John Furrier with my cohost, Dave Vellante. Two days of coverage here in Munich, Germany, covering Hortonworks and Yahoo, presenting Hadoop Summit, now called DataWorks 2017. Our next guest is Carlo Vaiti, who's the HPE chief technology strategist, EMEA Digital Solutions, Europe, Middle East, and Africa. Welcome to theCUBE. >> Thank you, John. >> So we were just chatting before we came on, of your historic background at IBM, Oracle, and now HPE, and now back into the saddle there. >> Don't forget Sun Microsystems. >> Sun Microsystems, sorry, Sun, yeah. I mean, great, great run. >> It was a long run. >> You've seen the computer revolution happen. I worked at HP for nine years, from '88 to '97. Again, Dave was a premier analyst during that run of client-server. We've seen the computer revolution happen. Now we're seeing the digital revolution where the iPhone is now 10 years old, Cloud is booming, data's at the center of the value proposition, so a completely new disruptive capability. >> Carlo: Sure, yes. >> So what are you doing as the CTO, chief technologist for HPE, how are you guys bringing this story together? 'Cause there's so much going on at HPE. You got the services split, you got the software split, and HP's focusing on the new style of IT, as Meg Whitman calls it. >> So, yeah. My role in EMEA is actually all about having basically a visionary kind of strategy role for what's going to be HP in the future, in terms of IT. And one of the things that we are looking at is, specifically, we split our strategy in three different aspects, so three transformation areas. The first one, which we usually talk about, is what I call hybrid IT, right, which is basically making services around either On-Premise or on Cloud for our customer base.
The second one is actually power the Intelligent Edge, so it is actually looking after our collaboration and the Aruba components we acquired. And the third one, which is in the middle, and that's why I'm here at the DataWorks Summit, is actually the data-analytics aspects. And we have a couple of solutions in there. One is the enterprise-grade Hadoop, which is part of this. This is actually how we generalize all the figure and the strategy for HP. >> It's interesting, Dave and I were talking yesterday, being in Europe, it's obviously a different sideshow, it's smaller than the DataWorks or Hadoop Summit in North America in San Jose, but there's a ton of Internet of things, IoT or IIoT, 'cause here in Germany, obviously, a lot of industrial nations, but in Europe in general, a lot of smart cities initiatives, a lot of mobility, a ton of Internet of things opportunity, more than in the US. >> Absolutely. >> Can you comment on how you guys are tackling the IoT? Because it's an Intelligent Edge, certainly, but it's also data, it's in your wheelhouse. >> Yes, sure. So I'm actually working, it's a good question, because I'm actually working on a couple of projects in Eastern Europe, where it's all about Industrial IoT Analytics, IIoTA. That's the new terminology we use. So what we do is actually, we analyze from a business perspective, what are the business pain points, in an oil and gas company for example. And we understand for example, what kind of things that they need and must have. And what I'm saying here is, one of the aspects for example, is the drilling opportunity. So how much oil you can extract from a specific rig in the middle of the North Sea, for example. This is one of the key questions, because the customer wants to understand, in the future, how much oil they can extract. The other one is for example, the upstream business.
So doing on the retail side and having, say, when my customer is stopping in a gas station, I want go in the shop, immediately giving, I dunno, my daughter, a kind of campaign for the Barbie, because they like the Barbie. So IoT, Industrial IoT help us in actually making a much better customer experience, and that's the case of the upstream business, but is also helping us in actually much faster business outcomes. And that's what the customer wants, right? 'Cause, and was talking with your colleague before, I'm talking to the business guy. I'm not talking to the IT anymore in these kind of place, and that's how IoT allow us a chance to change the conversation at the industry level. >> These are first-time conversations too. You're getting at the kinds of business conversations that weren't possible five years ago. >> Carlo: Yes, sure. >> I mean and 10 years ago, they would have seemed fantasy. Now they're reality. >> The role of analytics in my opinion, is becoming extremely key, and I said this morning, for me my best center is that the detail, is the stone foundation of the digital economy. I continue to repeat this terminology, because it's actually where everything is starting from. So what I mean is, let's take a look at the analytic aspect. So if I'm able to analyze the data close to the shop floor, okay, close to the shop manufacturing floor, if I'm able to analyze my data on the rig, in the oil and gas industry, if I'm able to analyze doing preprocessing analytics, with Kafka, Druid, these kind of open-source software, where close to the Intelligent Edge, then my customers going to be happy, because I give them very fast response, and the decision-maker can get to decision in a faster time. Today, it takes a long time to take these type of decision. So that's why we want to move into the power Intelligent Edge. >> So you're saying, data's foundational, but if you get to the Intelligent Edge, it's dynamic. 
So you have a dynamic reactive, realtime time series, or presences of data, but you need the foundational pre-data. >> Perfect. >> Is that kind of what you're getting at? >> Yes, that's the first step. Preprocessing analytics is what we do. In the next generation of, we think is going to be Industrial IoT Analytics, we're going to actually put massive amount of compute close to the shop manufacturing floor. We call it internally, or actually externally, Converged Plant Infrastructure. And that's the key point, right? >> John: Converged Plant? >> Converged Plant Infrastructure, CPI. If you look it up on Google, you will find it. It's a solution we brought to the market a few months ago. We announced it in December last year. >> Yeah, Antonio's smart. He also had converged systems as well. One of the first ones. >> Yeah, so that's converged compute at the edge basically. >> Correct, converged compute-- >> Very powerful. >> Very powerful, and we run analytics on the edge. That's the key point. >> Which we love, because that means you don't have to send everything back to the Cloud because it's too expensive, it's going to take too long, it's not going to work. >> Carlo: The bandwidth on the network is much less. >> There's no way that's going to be successful, unless you go to the edge and-- >> It takes time. >> With a cost. >> Now the other thing is, of course, you've got the Aruba asset, to be able to, I always say, joke, connect the windmill. But, Carlo, can we go back to the IIoTA example? >> Carlo: Correct, yeah. >> I want to help, help our audience understand, sort of, the new HP, post these spin merges. So previously you would say, okay, we have Vertica. You still have partnership, or you still own Vertica, but after September 1st-- >> Absolutely, absolutely. It's part of the columnar side-- >> Right, yes, absolutely, but, so. But the new strategy is to be more of a platform for a variety of technology.
So how for instance would you solve, or did you solve, that problem that you described? What did you actually deliver? >> So again, as I said, we're, especially in the Industrial IoT, we are an ecosystem, okay? So we're one element of the ecosystem solution. For the oil and gas specifically, we're working with other system integrators. We're working with oil and gas industry expertise, like DXC company, right, the company that we just split a few days ago, and we're working with them. They're providing the industry expertise. We are an infrastructure provider around that, and the services around that for the infrastructure element. But for the industry expertise, we try to have a kind of little bit of knowledge, to start the conversation with the customer. But again, my role in the strategy is actually to be an ecosystem digital integrator. That's the new terminology we like to bring in the market, because we really believe that's the way HP's role is going to be. And the relevance of HP is totally depending on whether we are going to be successful in these type of things. >> Okay, now a couple other things you talked about in your keynote. I'm just going to list them, and then we can go wherever we want. There was Data Lake 3.0, Storage Disaggregation, which is kind of interesting, 'cause it's been a problem. Hadoop as a service, Realtime Everywhere, and then Analytics at the Edge, which we kind of just talked about. Let's pick one. Let's start with Data Lake 3.0. What is that? John doesn't like the term data lake. He likes data ocean. >> I like data ocean. >> Is Data Lake 3.0 becoming an ocean? >> It's becoming an ocean. So, Data Lake 3.0 for us is actually following what is going to be the future for HDFS 3.0. So we have three elements. The erasure coding feature, which is coming on HDFS. The second element is around having HDFS data tier, multi-data tier. So we're going to have faster SSD drives. We're going to have big memory nodes. We're going to have GPU nodes.
And the reason why I say disaggregation is because some of the workload will be only compute, and some of the workload will be only storage, okay? So we're going to bring, and the customer require this, because it's getting more data, and they need to have for example, YARN application running on compute nodes, and the same level, they want to have storage compute block, sorry, storage components, running on the storage model, like HBase for example, like HDFS 3.0 with the multi-tier option. So that's why the data disaggregation, or disaggregation between compute and storage, is the key point. We call this asymmetric, right? Hadoop is becoming asymmetric. That's what it mean. >> And the problem you're solving there, is when I add a node to a cluster, I don't have to add compute and storage together, I can disaggregate and choose whatever I need, >> Everyone that we did. >> based on the workload. >> They are all multitenancy kind of workload, and they are independent and they scale out. Of course, it's much more complex, but we have actually proved that this is the way to go, because that's what the customer is demanding. >> So, 3.0 is actually functional. It's erasure coding, you said. There's a data tier. You've got different memory levels. >> And I forgot to mention, the containerization of the application. Having dockerized the application for example. Using mesosphere for example, right? So having the containerization of the application is what all of that means, because what we do in Hadoop, we actually build the different clusters, they need to talk to each other, and change data in a faster way. And a solution like, a product like SQL Manager, from Hortonworks, is actually helping us to get this connection between the cluster faster and faster. And that's what the customer wants. >> And then Hadoop as a service, is that an on-premise solution, is that a hybrid solution, is it a Cloud solution, all three? >> I can offer all of them. 
Hadoop as a service could be run on-premise, could be run on a public Cloud, could be run on Azure, or could be a mix of them, partially on-premise, and partially on public. >> And what are you seeing with regard to customer adoption of Cloud, and specifically around Hadoop and big data? >> I think the way I see that adoption is all the customers want to start very small. The maturity is actually better from a technology standpoint. If you're asking me the same question maybe a year ago, I would say, it's difficult. Now I think they've got the point. Every large customer, they want to build this big data ocean, data lake, ocean, whatever you want to call it. >> John: Love that. (laughs) >> All right. They want to build this data ocean, and the point I want to make is, they want to start small, but they want to think very high. Very big, right, from their perspective. And the way they approach us is, we have a kind of methodology. We establish the maturity assessment. We do a kind of capability maturity assessment, where we find out if the customer is actually a pioneer, or is actually a very traditional one, so it's very slow-going. Once we determine where the stage of the customer is, we propose some specific proof of concept. And in three months usually, we're putting this in place. >> You also talked about realtime everywhere. We in our research, we talk about the, historically, you had batch or interactive, and now you have what we call continuous, or realtime streaming workloads. How prevalent is that? Where do you see it going in the future? >> So I think it is another trend for the future, as I mentioned this morning in my presentation. So Spark, which is actually doing the open-source in-memory engine processing, is actually the core of this stuff. We see 60 to 70 times faster analytics, compared to not using Spark. So many customers implemented Spark because of this.
The requirement are that the customer needs an immediate response time, okay, for a specific decision-making that they have to do, in order to improve their business, in order to improve their life. But this require a different architecture. >> I have a question, 'cause you, you've lived in the United States, you're obviously global, and spent a lot of time in Europe as well, and a lot of times, people want to discuss the differences between, let's make it specific here, the European continent and North America, and from a sophistication standpoint, same, we can agree on that, but there are still differences. Maybe, more greater privacy concerns. The whole thing with the Cloud and the NSA in the United States, created some concerns. What do you see as the differences today between North America and Europe? >> From my perspective, I think we are much more for example take IoT, Industrial IoT. I think in Europe we are much more advanced. I think in the manufacturing and the automotive space, the connected car kind of things, autonomous driving, this is something that we know already how to manage, how to do it. I mean, Tesla in the US is a good example that what I'm saying is not true, but if I look at for example, large German manufacturing car, they always implemented these type of things already today. >> Dave: For years, yeah. >> That's the difference, right? I think the second step is about the faster analytic approach. So what I mentioned before. The Power the Intelligent Edge, in my opinion at the moment, is much more advanced in the US compared to Europe. But I think Europe is starting to run back, and going on the same route. Because we believe that putting compute capacity on the edge is what actually the customer wants. But that's the two big differences I see. >> The other two big external factors that we like to look at, are Brexit and Trump. So (laughs) how 'about Brexit? 
Now that it's starting to sort of actually become, begin the process, how should we think about it? Is it overblown? It is critical? What's your take? >> Well, I think it's too early to say. UK just split a few days ago, right, officially. It's going to take another 18 months before it's going to be completed. From a commercial standpoint, we don't see any difference so far. We're actually working the same way. For me it's too early to say if there's going to be any implication on that. >> And we don't know about Trump. We don't have to talk about it, but the, but I saw some data recently that's, European sentiment, business sentiment is trending stronger than the US, which is different than it's been for the last many years. What do you see in terms of just sentiment, business conditions in Europe? Do you see a pick up? >> It's getting better, it is getting better. I mean, if I look at the major countries, the P&L is going positive, 1.5%. So I think from that perspective, we are getting better. Of course we are still suffering from the Chinese, and Japanese market sometimes. Especially in some of the big large deals. The inclusion of the Japanese market, I feel it, and the Chinese market, I feel that. But I think the economy is going to be okay, so it's going to be good. >> Carlo, I want to thank you for coming on and sharing your insight, final question for you. You're new to HPE, okay. We have a lot of history, obviously I was, spent a long part of my career there, early in my career. Dave and I have covered the transformation of HP for many, many years, with theCUBE certainly. What attracted you to HP and what would you say is going on at HP from your standpoint, that people should know about? >> So I think the number one thing is that for us the word is going to be hybrid. It means that some of the services that you can implement, either on-premise or on Cloud, could be done very well by the new Pointnext organization. I'm not part of Pointnext. 
I'm in the EG, Enterprise Group division. But I am fan for Pointnext because I believe this is the future of our company, is on the services side, that's where it's going. >> I would just point out, Dave and I, our commentary on the spin merge has been, create these highly cohesive entities, very focused. Antonio now running EG, big fans, of where it's actually an efficient business model. >> Carlo: Absolutely. >> And Chris Hsu is running the Micro Focus, CUBE Alumni. >> Carlo: It's a very efficient model, yes. >> Well, congratulations and thanks for coming on and sharing your insights here in Europe. And certainly it is an IoT world, IIoT. I love the analytics story, foundational services. It's going to be great, open source powering it, and this is theCUBE, opening up our content, and sharing that with you. I'm John Furrier, Dave Vellante. Stay with us for more great coverage, here from Munich after the short break.

Published Date : Apr 6 2017


ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Carlo	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
Europe	LOCATION	0.99+
IBM	ORGANIZATION	0.99+
Germany	LOCATION	0.99+
Trump	PERSON	0.99+
Meg Whitman	PERSON	0.99+
Vertica	ORGANIZATION	0.99+
Pointnext	ORGANIZATION	0.99+
Chris Hsu	PERSON	0.99+
John	PERSON	0.99+
Carlo Vaiti	PERSON	0.99+
John Furrier	PERSON	0.99+
HP	ORGANIZATION	0.99+
Munich	LOCATION	0.99+
HPE	ORGANIZATION	0.99+
Yahoo	ORGANIZATION	0.99+
Sun Microsystems	ORGANIZATION	0.99+
Antonio	PERSON	0.99+
US	LOCATION	0.99+
EG	ORGANIZATION	0.99+
second element	QUANTITY	0.99+
United States	LOCATION	0.99+
second step	QUANTITY	0.99+
Hortonworks	ORGANIZATION	0.99+
December last year	DATE	0.99+
iPhone	COMMERCIAL_ITEM	0.99+
San Jose	LOCATION	0.99+
1.5%	QUANTITY	0.99+
yesterday	DATE	0.99+
North America	LOCATION	0.99+
September 1st	DATE	0.99+
'97	DATE	0.99+
'88	DATE	0.99+
Africa	LOCATION	0.99+
one	QUANTITY	0.99+
Today	DATE	0.99+
three months	QUANTITY	0.99+
Eastern Europe	LOCATION	0.99+
Sun	ORGANIZATION	0.99+
Two days	QUANTITY	0.99+
60	QUANTITY	0.99+
DataWorks 2017	EVENT	0.99+
10 years ago	DATE	0.99+
DXC	ORGANIZATION	0.98+
EMEA Digital Solutions	ORGANIZATION	0.98+
five years ago	DATE	0.98+
a year ago	DATE	0.98+
Tesla	ORGANIZATION	0.98+

John Kreisa, Hortonworks– DataWorks Summit Europe 2017 #DWS17 #theCUBE

>> Announcer: Live from Munich, Germany, it's theCUBE, covering DataWorks Summit Europe 2017. Brought to you by Hortonworks. (electronic music) (crowd) >> Okay, welcome back everyone, we are here live in Munich, Germany, for DataWorks 2017, formerly Hadoop Summit, the European version. Again, different kind of show than the main show in North America, in San Jose, but it's a great show, a lot of great topics. I'm John Furrier, my co-host, Dave Vellante. Our next guest is John Kreisa, Vice President of International Marketing. Great to see you emceeing the event. Great job, great event! >> John Kreisa: Great. >> Classic European event, it's got the European vibe. >> Yep. >> Germany everything's tightly buttoned down, very professional. (laughing) But big IOT message-- >> Yes. >> Because in Germany a lot of industrial action-- >> That's right. >> And then Europe, in general, a lot of smart cities, a lot of mobility, and issues. >> Umm-hmm. >> So a lot of IOT, a lot of meat on the bone here. >> Yep. >> So congratulations! >> John Kreisa: Thank you. >> What are your thoughts? Are you happy with the event? Give us by the numbers, how many people, what's the focus? >> Sure, yeah, no, thanks, John, Dave. Long-time CUBE attendee, I'm really excited to be here. Always great to have you guys here-- >> Thanks. >> Thanks. >> And be participating. This is a great event this year. We did change the name as you mentioned from Hadoop Summit to DataWorks Summit. Perhaps, I'll just riff on that a little bit. I think that really was in response to the change in the community, the breadth of technologies. You mentioned IOT, machine learning, and AI, which we had some of in the keynotes. So just a real expansion from data loading, data streaming, analytics, and machine learning and artificial intelligence, which all sit on top and use the core Hadoop platform. We felt like it was time to expand the conference itself.
Open up the aperture to really bring in the other technologies that were involved, and really represent what was already starting to kind of feed into Hadoop Summit, so it's kind of a natural change, a natural evolution. >> And there's a 2-year visibility. We talked about this two years ago. >> John Kreisa: Yeah, yeah. >> That you are starting to see this aperture open up a little bit. >> Yeah. >> But it's interesting. I want to get your thoughts on this because Dave and I were talking yesterday. It's like we've been to every single Hadoop Summit. Even theCUBE's been following it all as you know. It's interesting the big data space was created by the Hadoop ecosystem. >> Umm-hmm. >> So, yeah, you rode in on the Hadoop horse. >> Yeah. >> I get that. A lot of people don't get that. They say, Oh, Hadoop's dead, but it's not. >> No. >> It's evolving to a much broader scope. >> That's right. >> And you guys saw that two years ago. Comment on your reaction to Hadoop is not dead. >> Yeah, wow (laughing). It's far from dead if you look at the momentum, largest conference ever here in Europe. I think strong interest from them. I think we had a very good customer panel, which talked about the usage, right. How they were really transforming. You had Walgreens Boots talking about how they're redoing their shelf, shelving, and how they're redesigning their stores. Danske Bank talking about how they're analyzing, how they replenish their cash machines. Centrica talking about how they redo their... Or how they're driving down cost of energy by being smarter around energy consumption. So, these are real transformative use cases, and so, it's far from dead. Really what might be confusing people is probably the fact that there are so many other technologies and markets that are being enabled by these open source technologies and the breadth of the platform. And I think maybe people see it kind of move a little bit back as a platform play. 
And so, we talk more about streaming and analytics and machine learning, but all that's enabled by Hadoop. It's all riding on top of this platform. And I think people kind of just misconstrue the fact that there's one enabling-- >> It's a fundamental element, obviously. >> John Kreisa: Yeah. >> But what's the new expansion? IOT, as I mentioned, is big here. >> Umm-hmm. >> But there's a lot more in connective tissue going on, as Shaun Connolly calls it. >> Yeah, yep. >> What are those other things? >> Yeah, so I think, as you said, smart cities, smart devices, the analytics, getting the value out of the technologies. The ability to load it and capture it in new ways with new open source technology, NiFi and some of those other things, Kafka we've heard of. And some of those technologies are enabling the broader use cases, so I don't think it's... I think that's really the fundamental change and shift that we see. It's why we renamed it to DataWorks Summit because it's all about the data, right. That's the thing-- >> But I think... Well, if you think about it from a customer perspective, to me anyway, what's happened is we went through the adolescent phase of getting this stuff to work and-- >> Yeah. >> And figuring out, Okay, what's the relationship with my enterprise data warehouse, and then they realize, Wow, the enterprise data warehouse is critical to my big data platform. >> Umm-hmm. >> So what customers have done as they've evolved, as Hadoop has evolved, their big data platforms internally-- >> Umm-hmm. >> And now they're turning to their business saying, Okay, we have this platform. Let's now really start to go up the steep part of the S-curve and get more value out of it. >> John Kreisa: Umm-hmm. >> Do you agree with that scenario? >> I would definitely agree with that. I think that as companies have, and in particular here in Europe, it's interesting because they kind of waited for the technology to mature and it's reached that inflection point. 
To your point, Dave, such that they're really saying, Alright, let's really get this into production. Let's really drive value out of the data that they see and know they have. And there's sort of... We see a sense of urgency here in Europe, to get going and really start to get that value out. >> Yeah, and we call it a ratchet game. (laughing) The ratchet is, Okay, you get the technology to work. Okay, you still got to keep the lights on. Okay, and oh, by the way, we need some data governance. Let's ratchet it up that side. Oh, we need a CDO! >> Umm-hmm. >> And so, because if you just try to ratchet up one side of the house (laughing) (cross-talk)-- >> Well, Carlo from HPE said it great on our last segment. >> Yeah. >> And I thought this was fundamental. And this was kind of like you had a CUBE moment where it's like, Wow, that's a really amazing insight. And he said something profound, The data is now foundational to all conversations. >> Right. >> And that's from a business standpoint. It hasn't always been the case. Now, it's like, Okay, you can look at data as a fundamental foundation building block. >> Right. >> And then react from there. So if you get the data locked in, to Dave's point about compliance, you can then do clever things. You can have a conversation about a dynamic edge or-- >> Right. >> Something else. So the foundational data is really now fundamental, and I think that is... Changes, it's not a database issue. It's just all data. >> Right, now all data-- >> All databases. >> You're right, it's all data. It's driving the business in all different functions. It's operational efficiency. It's new applications. It's customer intimacy. All of those different ways that all these companies are going, We've got this data. We now have the systems, and we can go ahead and move forward with it. 
And I think that's the momentum that we're seeing here in Europe, as evidenced by the conference and those kinds of things, just I think really shows how maybe... We used to say... I'd say when I first moved over here, that Europe was maybe a year and a half behind the U.S., in terms of adoption. I'd say that's shrunk to where a lot of the conversations are the exact same conversations that we're having with big European companies, that we're having with U.S. companies. >> And, even in... >> Yeah. >> Like we were just talking to Carlo, he was like, Well, and Europe is ahead in things like certain IOT-- >> Yeah. >> And Industrial IOT. >> Yeah. >> Yeah. >> Even IOT analytics. Some of the... Tesla notwithstanding, some of the automated vehicles. >> John Kreisa: Correct. >> Autonomous vehicles activity that's going on. >> John Kreisa: That's right. >> Certainly with Daimler and others. So there's an advancement. It almost reminds me of the early days of mobile, so... (laughing) >> It's actually, it's a good point. If you look at... Squint through some of the perspectives, it depends on where you are in the room and what your view is. You could argue there are many things that Europe is advanced on and where we're behind. If you look at Amazon Web Services, for instance. >> Umm-hmm. >> They are clearly running as fast as they can to deploy regions. >> Umm-hmm. >> So the scoop's coming out now. I'm hearing buzz that there's another region coming out. >> Right. >> From Amazon soon (laughing). They can't go fast enough. Google is putting out regions again. >> Right. >> Data centers are now pushing global, yet, there's more industrial here than is there. So it's interesting perspective. It depends on how you look at it! >> Yeah, yeah, no, I think it's... And it's perfectly fair to say there are many places where it's more advanced. I think in this technology and open source technologies, in general, are helping drive some of those and enable some of those trends. >> Yeah. 
>> Because if you have the sensors, you need a place to store and analyze that data whether it's smart cars or smart cities, or energy, smart energy, all those different places. That's really where we are. >> What's different in the international theater that you're involved in because you've been on both sides. >> Yep. >> As you came from the U.S. then when we first met. What's different out here now? And I see the gaps closing? What other things are notable that you could share? >> Yeah, yeah, so I'd say, we still see customers in the U.S. that are still very much wanting to use the shiniest, new thing, like the very latest version of Spark or the very latest version of NiFi or some other technologies. They want to push and use that latest version. In Europe, now the conversations are slightly different, in terms of understanding the security and governance. I think there's a lot more consciousness, if you will, around data here. There's other rules and regulations that are coming into place. And I think they're a little bit more advanced in how they think of-- >> Yeah. >> Data, personal data, how to be treated, and so, consequently, those are where the conversations are about the platform. How do we secure it? How does it get governed? So that you need regulations-- >> John Furrier: It's not as fast, as loose as the U.S. >> Yeah, it's not as fast. And you look and see some of the regulations. (laughing) My wife asked me if we should set up a VPN on our home WiFi because of this new rule about being able to sell the personal data. I've said, Well, we're not in the U.S., but perhaps, when we move to the U.S. >> In order to get the right to blockchain (laughing). (cross-talk) >> Yeah, absolutely (cross-talk). >> John Furrier: Encrypt everything. >> (laughing) Yeah, exactly. >> Well, another topic is... Let's talk about the ecosystem a little bit. >> Umm-hmm. 
>> You've got now some additional public brethren, obviously Cloudera's, there's been a lot of talk here about-- >> Umm-hmm. >> Talend and Alteryx have gone public. >> Yeah. >> The ecosystem you've evolved that. IBM was up on stage with you guys. >> Yeah, yep. >> So that continues to be-- >> Gallium C. >> Can we talk about that a little bit? >> Gallium C >> Gallium C. >> We had a great... Partners are great. We've always been about the ecosystem. We were talking about before we came on-screen that for us it's not Barney partnerships. They're very much of substance, engineering to try to drive value for the customers. It's where we see that value in that joint value. So IBM is working with us across all of the DataWorks Summit, but, even in all of the engineering work that we're doing, participated in the HDP 2.6 announcement that we just did. And I'm sure what you covered with Shaun and others, but those partnerships really help drive value for the customer. >> Umm-hmm. >> For us, it's all making sure the customer is successful. And to make a complete solution, it is a range of products, right. It is whether it's data warehousing, servers, networks, all of the different analytics, right. There's not one product that is the complete solution. It does take a stack, a multitude of technologies, to make somebody successful. >> Cloudera's S-1 was filed, which has been part of the conversation, and we've been digging into it. It's great to see the numbers. >> Umm-hmm. >> Anything surprise you in the S-1? Any advice you'd give to open source companies looking to go public because, as Dave pointed out, there's a string now of comrades in arms, if you will, MuleSoft, that's doing very well. >> Yeah, yeah. >> And Alteryx just went public. >> Yeah. >> You guys have been public for a long time. You guys been operating the public open-- >> Yeah. >> Both open source, pure open source. But also on the public markets. You guys have experience. You got some scar tissue. 
>> John Kreisa: (laughing) Yeah, yeah. >> What's your advice to Cloudera or others that are... Because there certainly will be a rush for more public companies. >> Yeah. >> It's a fantastic trend. >> I think it is a fantastic trend. I completely agree. And I think that it shows the strength of the market. It shows both the big data market, in general, the analytics market, kind of all the different components that are represented in some of those IPOs or planned IPOs. I think that for us, we're always driving for success of the customer, and I think any of the open source companies, they have to look at their business plan and take a step-wise approach that keeps an eye on making the customer successful because that's ultimately what's going to drive the company success and drive revenue for it and continue to do it. But we welcome as many companies as possible to come into the public market because A: it just allows everybody to operate in an open and honest way, in terms of comparison and understanding how growth is. But B: it shows that strength of how open source and related technologies can help-- >> Yeah. >> Drive things forward. >> And it's good for the customer, too, because now they can compare-- >> Yes! >> Apples to apples-- >> Exactly. >> Vis-a-vis Cloudera, and what's interesting is that they had such a head start on you guys, Hortonworks, but the numbers are almost identical. >> Umm-hmm, yeah. >> Really close. >> Yeah, I think it's indicative of the opportunity that they're now coming out and there's rumors of other companies coming out. And I think it just gives that visibility. We welcome it, absolutely-- >> Yeah. >> To show because we're very proud of our performance and now our growth. And I think that's something that we stand behind and stand on top of. And we want to see others come out and show what they got. >> Let's talk about events, if we can? >> Yeah. >> We were there at the first Hadoop Summit in San Jose. 
Thrilled to be-- >> John Kreisa: In a few years. >> In Dublin last year. >> Yeah. >> So what's the event strategy? I love going into the local flavor. >> Umm-hmm. >> Last year we had the Irish singers. This year we had a great (laughing) local band. >> John Kreisa: (laughing) Yeah, yeah, yeah. >> So I don't know if you've announced where next year's going to be? Maybe you can share with us some of the roll-out strategies? >> Yeah, so first of all, DataWorks Summit is a great event as you guys know. And you guys are long-time participants, so it's a great partnership. We're moving them international, of course, we did a couple... We are already international, but moving a couple to Asia last year so-- >> Right. >> Those were a tremendous success, we actually exceeded our targets, in terms of how many people we thought would go. >> Dave: Where did you do those? >> We were in Melbourne and Tokyo. >> Dave: That's right, yeah. >> Yeah, so in both places great community, kind of rushed to the event and kind of understanding, really showed that there is truly a global kind of data community around Hadoop and other related technologies. So from here as you guys know because you're going to be there, we're thinking about San Jose and really wanting to make sure that's a great event. It's already stacking up to be tremendous, call for papers is all done. And all that's announced, so even the sessions, we're really starting to build for that. Later this year we'll be in Sydney, so we're going to take DataWorks into Sydney, Australia, in September. So throughout the rest of this year, there's going to be continued building momentum and just really global participation in this community, which is great. >> Yeah. >> Yeah. >> Yeah, it's fantastic. >> Yeah, Sydney should be great. >> Yeah. >> Looking forward to it. We're going to expand theCUBE down under. Dave and I are excited-- >> Dave: Yeah, let's talk about that. >> We got a lot of interest (laughing). >> Alright. 
>> John, great to have you-- >> Come on down. >> On theCUBE again. Great to see you. Congratulations, I'm going to see you up on stage. >> Thank you. >> Doing the emcee. Great show, a lot of great presenters and great customer testimonials. And as always the sessions are packed. And good learning, great community. >> Yeah. >> Congratulations on your ecosystem. This is theCUBE broadcasting live from Munich, Germany for DataWorks 2017, presented by Hortonworks and Yahoo. I'm John Furrier with Dave Vellante. Stay with us, great interviews on day two still up. Stay with us. (electronic music)

Published Date : Apr 6 2017

Raj Verma | DataWorks Summit Europe 2017

>> Narrator: Live from Munich, Germany, it's theCUBE, covering DataWorks Summit Europe 2017. Brought to you by Hortonworks. >> Okay, welcome back everyone here at day two coverage of theCUBE here in Munich, Germany for DataWorks 2017. I'm John Furrier, my co-host Dave Vellante. Two days of wall-to-wall coverage from SiliconANGLE Media's theCUBE. Our next guest is Raj Verma, the president and COO of Hortonworks. First time on theCUBE, new to Hortonworks. Welcome to theCUBE. >> Thank you very much, John, appreciate it. >> Looking good with a three-piece suit, we were commenting when you were on stage. >> Raj: Thank you. >> Great scene here in Europe, again different show vis-a-vis North America, in San Jose. You got the show coming up there, it's the big show. Here, it's a little bit different. A lot of IOT in Germany. You got a lot of car manufacturers, but industrial nation here, smart city initiatives, a lot of big data. >> Uh-huh. >> What's your thoughts? >> Yeah no, firstly thanks for having me here. It's a pleasure and good chit chatting right before the show as well. We are very, very excited about the entire data space. Europe is leading many initiatives about how to use data as a sustainable, competitive differentiator. I just moderated a panel and you guys heard me talk to a retail bank, a retailer. And really, Centrica, which was nothing but British Gas, which is rather an organization steeped in history, so to speak, and that institution now calls itself a technology company. And, it's a technology company or an IOT company based on them using data as the currency for innovation. So now, British Gas, or Centrica, calls itself a data company, when would you have ever thought that? I was at dinner with a very large automotive manufacturer and the kind of stuff they are doing with data right from the driving habits, driver safety, real time insurance premium calculation, the autonomous drive. It's just fascinating no matter what industry you talk about. 
It's just very, very interesting. And, we are very glad to be here. International business is a big priority for me. >> We've been following Hortonworks since its inception when it spun out of Yahoo years ago. I think we've been to every Hadoop World going back, except for the first one. We watched the transition. It's interesting, it's always been a learning environment at these shows. And certainly the customer testimonials speak to the ecosystem, but I have to ask you, you're new to Hortonworks. You have interesting technology background. Why did you join Hortonworks? Because you've certainly seen the movie before and the cycles of innovation, but now we're living in a pretty epic, machine learning, data AI is on the horizon. What were the reasons why you joined Hortonworks? >> Yeah sure, I've had a really good run in technology, fortunately was associated with two great companies, Parametric Technology and TIBCO Software. I was 16 years at TIBCO, so I've been dealing with data for 16 years. But, over the course of the last couple of years whenever I spoke to a C-level executive, or a CIO, they were talking to us about the fact that structured data, which is really what we did for 16 years, was not good enough for innovation. Innovation and insights into unstructured data was the seminal challenge of most of the executives that I was talking to, senior level executives. And, when you're talking about unstructured data and making sense of it there isn't a better technology than the one that we are dealing with right now, undoubtedly. So, that was one. Dealing with data because data is really the currency of our times. Every company is a data company. Second was, I've been involved with proprietary software for 23 years. And, if there is a business model that's ready for disruption it's the proprietary software business model because I'm absolutely convinced that open source is what I call a green business model. It's good for planet Earth, so to speak. 
It's community based, it's based on innovation and it puts the customer and the technology provider on the same page. The customer success drives the vendor success. Yeah, so the open source community, data-- >> It's sustainable, pun intended, in the sense that it's had a continuing run. And, it's interesting Tier One software is all open source now. >> 100%, and by the way, not only that, if you see large companies like IBM and Microsoft, they have finally woken up to the fact that if they need to attract talent and if they want to be known as thought leaders they have to have some very meaningful open source initiatives. Microsoft loves Linux, when did we ever think that was going to happen, right? And, by the way-- >> I think Steve Ballmer once said it was the cancer of the industry. Now, they're behind it. But, the Linux Foundation has also grown. We saw a project this past week. Intel donated a big project to the Linux Foundation, now it's taking over, so more projects. >> Raj: Yes. >> There's more action happening than ever before. >> You know absolutely, John. Five years ago when I would go and meet a CIO and I would ask them about open source, they would wink and say, "Of course we do open source." But, it's less than 5%, right? Now, when I talk to a CIO they first ask their teams to go evaluate open source as the first choice. And, if they can't, they come kicking and screaming towards proprietary software. Most organizations, and some organizations with a lot of historical gravity, so to speak, have a 50/50 even split between proprietary and open source. And, that's happened in the last three years. And, I can make a bold statement, and I know it'll be true, but in the next three years, at most organizations the ratio of proprietary to open source will be 20 proprietary, 80 open source. >> So, obviously you've made that bet on open source, joining Hortonworks, but open is a spectrum. 
And, on one end of the spectrum you have Hortonworks which is, as I see it, the purest. Now, even Larry Ellison, when he gets onstage at Oracle OpenWorld, will talk about how open Oracle is, I guess that's the other end of the spectrum. So, my question is won't the Microsofts and the Oracles and the IBMs, they're like recovering alcoholics and they'll accommodate their platforms through open source, embracing open source. We'll see if AWS is the same, we know it's unidirectional there. How do you see that-- >> Well, not necessarily. >> Industry dynamic, we'll talk about that later. How do you see that industry dynamic shaking out? >> No, absolutely, I think I remember way back in, I think, the mid to late 90s, I still love that quote by Scott McNealy, who is a friend. Dell, not Dell, Digital came out with a marketing campaign saying OpenVMS. And, Scott said, "How can someone lie so much with one word?" (laughs) So, it's the fact that Oracle calling itself open, well I'll just leave it at, it's a good joke. I think the definition of open source, to me, is when you acquire a software you have three real costs. One is the cost of initially procuring that software and the hardware and all the rest of it. The second is implementation and maintenance. However, most people miss the third dimension of cost when acquiring software, which is the cost to exit the technology. Our software and open source has very low exit barriers to our technology. If you don't like our technology, switch it off. You own the software anyways. Switch off our services and the barriers to exit are very, very low. Having worked in proprietary software, as I said, for 23 years I very often had conversations with my customers where I would say, "Look, you really don't have a choice, because if you want to exit our technology it's going to probably cost you ten times more than what you've spent till date." So, it's a lock-in architecture and then you milk that customer through maintenance, correct? 
>> Switching costs really are the metric-- >> Raj: Switching costs, exactly. >> You gave the example of Blockbuster Camera, and the rental, the late charge fees. Okay, that's an example of lock in. So, as we look at the company you're most compared with, now that it's going public, Cloudera, in a way I see more similarities than differences. I mean, you guys are sort of both birds of a feather. But, you are going for what I call the long game with a volume subscription model. And, Cloudera has chosen to build proprietary components on top. So, you have to make big bets on open. You have to support those open technologies. How do you see that affecting the long-term business model? >> Yeah, I think we are committed to open source. There's absolutely no doubt about it. I do feel that our connected data platform, which is data at rest and data in motion across on-prem and cloud, is the business model that's going to win. We clearly have momentum on our side. You've seen the same filings that I have seen. You're talking about a company that had a three year head start on us, and a billion dollars of funding, all right, at very high valuations. And yet, they're only one year ahead in terms of revenue. And, they have burnt probably three times more cash than we have. So clearly, and it's not my opinion, if you look at the numbers purely, the numbers actually give us the credibility that our business model and what we are doing is more efficient and is working better. One of the arguments that I often hear from analysts and press is how are your margins on open source? According to the filings, again, their margins are 82% on proprietary software, my margins on open source are 84%. So, from a health of the business perspective we are better. Now, the other is they've claimed to have been making a pivot to more machine learning and deep learning and all the rest of it. And, they'd actually like us to believe that their competition is going to be Amazon, IBM, and Google. 
Now, with a billion dollars of funding, with the Intel ecosystem behind them, they could effectively compete against Hortonworks. What do you think are their chances of competing against Google, Amazon, and IBM? I just leave that for you guys to decide, to be honest with you. And, we feel very good that they have virtually vacated the space and we've got the momentum. >> On the numbers, what jumps out at you on the filing since obviously, I'm sure, everyone at Hortonworks was digging through the S-1 because for the first time now Cloudera exposes some of the numbers. I noticed some striking things different, obviously, besides their multiple on revenue valuation. Pretty obvious it's going to be a haircut coming after the public offering. But, on the sales side, which is your wheelhouse, there's a value proposition that you guys at Hortonworks, we've been watching, the cadence of getting new clients, servicing clients. With product evolution is challenging enough, but also expensive. It's not you guys, but it's getting better as Shaun Connolly pointed out yesterday, you guys are looking at some profitability targets on the EBITDA coming up in Q4. Publicly stated on the earnings call. How's that different from Cloudera? Are they burning more cash because of their sales motions or sales costs, or is it the product mix? What are your thoughts on the filings around Cloudera versus the Hortonworks? >> Well, look I just feel that, I can talk more about my business than theirs. Clearly, you've seen the same filings that I have and you've seen the same cash burn rates that we have seen. And, we clearly are more efficient, although we can still get better. But, because of being public for a little more than two years now we've had a thousand-watt bulb being shone on us and we have been forced to be more efficient because we were in the limelight. >> John: You're open. >> In the open, right? So, people knew what our figures are, what our efficiency ratios were. 
So, we've been working diligently at improving them and we've gotten better, and there's still scope for improvement. However, being private did not have the same scrutiny on Cloudera. And, some would say that they were actually spending money like drunken sailors if you really read their S-1 filing. So, they will come under a lot of scrutiny as well. I'm sure they'll get more efficient. But right now, clearly, you've seen the same numbers that I have, their numbers don't talk about efficiency either in the R&D side or the sales and marketing side. So, yeah we feel very good about where we are in that space. >> And, open source is this two-edged sword. Like, take YARN for example, at least from my perspective Hortonworks really led the charge to YARN and then well before Docker and Kubernetes ascendancy and then all of a sudden that happens and of course you've got to embrace those open source trends. So, you have the unique challenge of having to support sort of all the open source platforms. And, so that's why I call it the long game. In order for you guys to thrive you've got to both put resources into those multiple projects and you've got to get the volume of your subscription model, which you pointed out the marginal economics are just as good as most, if not any software business. So, how do you manage that resource allocation? >> Yes, so I think a lot of that is the fact that we've got plenty of contributors and committers to the open source community. We are seen as the angel child in open source because we are just pure, kosher open source. We just don't have a single line of proprietary code. So, we are committed to that community. We have over the last six or seven years developed models of our software development which helps us manage the collective bargaining power, so to speak, of the community to allocate resources and prioritize the allocation of resources. 
It continues to be a challenge given the breadth of the open source community and what we have to handle, but fortunately I'm blessed that we've got a very, very capable engineering organization that keeps us very efficient and on the cutting edge. >> We're here with Raj Verma, the new president and COO of Hortonworks, Chief Operating Officer. I've got to ask you because it's interesting. You're coming in with a fresh set of eyes, coming in as you mentioned, from TIBCO, interesting, which was very successful in the generation of its time and history of TIBCO where it came from and what it did was pretty fantastic. I mean, everyone knows connecting data together was very hard in the enterprise world. TIBCO has some challenges today, as you're seeing, with being disrupted by open source, but I got to ask you. As a perspective, new executive you got, looking at the battlefield, an opportunity with open source there's some significant things happening and what are you excited about because Hortonworks has actually done some interesting things. Some, I would say, the world spun in their direction, their relationship with Microsoft, for instance, and their growth in cloud has been fantastic. I mean, Microsoft stock price when they first started working with Hortonworks I think was like 26, and obviously with Satya Nadella on board, Azure, more open source, on Open Compute to Kubernetes and Micro Services, Azure doing very, very well. You also have a partnership with Amazon Web Services so you already are living in this cloud era, okay? And so, you have a cloud dynamic going on. Are you excited by that? You bring some partnership expertise in from TIBCO. How do you look at partners? Because, you guys don't really compete with anybody, but you're partners with everybody. So, you're kind of like Switzerland, but you're also doing a lot of partnerships. What are you excited about vis-a-vis the cloud and some of the other partnerships that are happening? 
>> Yeah, absolutely, I think having a robust partner ecosystem is probably my number one priority, maybe number two after being profitable in a short span of time, which is, again, publicly stated. Now, our partnership with Microsoft is very, very special to us. Being available in Azure we are seeing some fantastic growth rates coming in from Azure. We are also seeing a remarkable amount of traction from the market to be able to go and test out our platform with very, very low barriers of entry and, of course, almost zero barriers of exit. So, from a partnership platform cloud providers like Amazon, Microsoft, are very, very important to us. We are also getting a lot of interest from carriers in Europe, for example. Some of the biggest carriers want to offer business services around big data and almost 100%, actually not almost, 100% of the carriers that we have spoken to thus far want to partner with us and offer our platform as a cloud service. So, cloud for us is a big initiative. It gives us the entire capability to reach audiences that we might not be able to reach ringing one door bell at a time. So, it's, as I said, we've got a very robust, integrated cloud strategy. Our customers find that very, very interesting. And, building that with a very robust partner channel, high priority for us. Second, is using our platform as a development platform for applications on big data is, again, a priority. And that's, again, building a partner ecosystem. The third is relationships with global SIs, Accenture, Deloitte, KPMG. The Indian SIs of Infosys, and Wipro, and HCL and the rest. We have some work to do. We've done some good work there, but there's some work to be done there. And, not only that I think some of the initiatives that we are launching in terms of training as a service, free certification, they are all things which are aimed at reaching out to the partners and building, as I said, a robust partner ecosystem. 
>> There's a lot of talk at conferences like this about, especially in Hadoop, about complexity, complexity of the ecosystem, new projects, and the difficulties of understanding that. But, in reality it seems as though today anyway the technology's pretty well understood. We talked about Millennials off camera coming out today with social savvy and tooling and understanding gaming and things like that. Technology, getting it to work seems to not be the challenge anymore. It's really understanding how to apply it, how to value data, we heard in your panel today. The business process, which used to be very well known, it's accounting, it's payroll, simple. Now, it's kind of ever changing daily. What do you make of that? How do you think that will affect the future of work? >> Yeah, I think there's some very interesting questions that you've asked in that the first, of course, is what does it take to have a very successful big data, or Hadoop project. And, I think we always talk about the fact that if you have a very robust business case backing a Hadoop project that is the number one key ingredient to delivering a Hadoop project. Otherwise, you can tend to boil the ocean, all right, or try and eat an elephant in one bite as I like to say. So, that's one and I think you're right. It's not the technology, it's not the complexity, it's not the availability of the resources. It is a leadership issue in organizations where the leader demands certain outcomes, business outcomes from the Hadoop project team and we've seen whenever that happens the projects seem to be very, very successful. Now, the second part of the question about future of work, which is a very, very interesting topic and a topic which is very, very close to my heart. There are going to be more people than jobs in the next 20, 25 years. I think that any job that can be automated will be automated, or has been automated, right? So, this is going to have a societal impact on how we live. 
I've been lucky enough that I joined this industry 25 years ago and I've never had to change or switch industries. But, I can assure you that our kids, and we were talking about kids off camera as well, our kids will have to probably learn a new skill every five years. So, how does that impact education? We, in our generation, were testing champions. We were educated to score well on tests. But, the new form of education, which you and I were talking about, again in California where we live, and where my daughter goes to high school and in her school the number one, the number one priority is to instill a sense of learning and joy of learning in students because that is what is going to contribute to a robust future. >> That's a good point, I want to just interject here because I think that the trend we're seeing in the higher Ed side too also point to the impact of data science, to curriculum and learning. It's not just putting catalogs online. There's now kind of an iterative kind of non-linear discovery to proficiency. But, there's also the emotional quotient aspect. You mentioned the love of learning. The immersion of tech and digital is creating an interdisciplinary requirement. So, all the folks say that, what the statistic's like half the jobs that are going to be available haven't even been figured out yet. There's a value creation around interdisciplinary skill sets and emotional quotient. >> Absolutely. >> Social, emotional because of the human social community connectedness. This is also a big data challenge opportunity. >> Oh, 100% and I think one of the things that we believe is in the future, jobs that require a greater amount of empathy are least susceptible to automation. So, things like caring for old age people in the world, and nursing, and teaching, and artists, and all the rest will be professions which will be highly paid and numerous. 
I also believe that the entire big data challenge about how you use data to impact communities is going to come into play. And also, I think John, you and I were again talking about it, the entire concept of corporations is only 200 years old, really, 200, 300 years old. Before that, our forefathers were individual contributors who contributed a certain part in a community, barbers, tailors, farmers, what have you. We are going to go back to the future where all of us will go back to being individual contributors. And, I think, and again I'm bringing it back to open source, open source is the start of that community which will allow the community to go back to its roots of being individual contributors rather than being part of an organization or a corporation to be successful and to contribute. >> Yeah, the Coase's Penguin has been a very famous seminal piece of work. Obviously, Ronald Coase who wrote the book The Nature of the Firm is interesting, but that's been a kind of historical document. You look at blockchain for instance. Blockchain actually has the opportunity to disrupt what the Nature of the Firm is about because of smart contracts, supply chain, and what not. And, we have this debate on the CUBE all the time, there's some naysayers, Tim Conner's a VC and I were talking on our Friday show, Silicon Valley Friday show. He's actually a naysayer on blockchain. I'm actually pro blockchain because I think there's some skeptics that say blockchain is really hard because it requires an ecosystem. However, we're living in an ecosystem, a world of community. So, I think The Nature of the Firm will be disrupted by people organizing in a new way vis-a-vis blockchain 'cause that's an open source paradigm. >> Yeah, no I concur. So, I'm a believer in that entire concept. I 100%-- >> I want to come back to something you talked about, about individual contributors and the relationship in link to open source and collaboration. 
I personally, I think we have to have a frank conversation about, I mean machines have always replaced humans, but for the first time in our history it's replacing cognitive functions. To your point about empathy, what are the things that humans can do that machines can't? And, they become fewer and fewer every year. And, a lot of these conferences people don't like to talk about that, but it's a reality that we have to talk about. And, your point is right on, we're going back to individual contribution, open source collaboration. The other point is data, is it going to be at the center of that innovation because it seems like value creation and maybe job creation, in the future, is going to be a result of the combinatorial effects of data, open source, collaboration, other. It's not going to be because of Moore's Law, all right. >> 100%, and I think one of the aspects that we didn't touch upon is the new societal model that automation is going to create would need data driven governance. So, a data driven government is going to be a necessity because, remember, in those times, and I think in 25, 30 years countries will have to explore the impact of negative taxation, right? Because of all the automation that actually happens around citizen security, about citizen welfare, about cost of healthcare, cost of providing healthcare. All of that is going to be fueled by data, right? So, it's just, as the Chinese proverb says, "May you live in interesting times." We definitely are living in very interesting times. >> And, the public policy implications are, your friend and one of my business heroes, Scott McNealy says, "There's no privacy in the internet, get over it." We interviewed John Tapscott last week, he said "That's unacceptable, we have to solve that problem." So, it brings up a lot of public policy issues. 
>> Well, the social economic impact, right now there's a trend we're seeing where the younger generation, we're talking about the post 9/11 generation that's entering the workforce, they have a social conscience, right? So, there's an emphasis you're seeing on social good. AI for social good is one of the hottest trends out there. But, the changing landscape around data is interesting. So, the word democratization has been used whether you're looking at the early days of blogging and podcasting which we were involved in and research to now in media this notion of data and transparency and open source is probably at a tipping point, an all time high in terms of value creation. So, I want to hear your thoughts on this because as someone who's been in the proprietary world the mode of operation was get something proprietary, lock it down, build a fence and a wall, protect it with folks with machine guns and fight for the competitive advantage, right? Now, the competitive advantage is open. Okay, so you're looking at pure open source model with Hortonworks. It changes how companies are competing. What is the competitive advantage of Hortonworks? Actually, to be more open. >> 100%. >> How do you manage that? >> No absolutely, I just think the proprietary nature of software, like software has disrupted a lot of businesses, all right? And, it's not a resistance to disruption itself. I mean, there has never been a business model in the history of time where you charge a lot of money to build a software, or sell a software that you built and then whatever are the defects in that software you get paid more money to fix them, all right? That's the entire perpetual and maintenance model. That model is going to get disrupted. Now, there are hundreds of billions of dollars involved in it so people are going to come kicking and screaming to the open source world, but they will have to come to the open source world. 
Our advantage that we're seeing is innovation now in a closed loop environment, no matter what size of a company you are, cannot keep up with the changing landscape around you from a data perspective. So, without the collective innovation of the community I don't really think a technology can stay at par with the changes around them. >> This is what I say about, this is what I think is such an important point that you're getting at because we started SiliconANGLE actually in the Cloudera office, so we have a lot of friends that work there. We have a great admiration for them, but one of the things that Cloudera has done through their execution is they have been very profit oriented, go public at all costs kind of thing that they're doing now. You've seen that happen. The competitive advantage that you're pointing out is something we're seeing that's similar to what Andy Jassy's doing at AWS, which is it's not so much to build something proprietary per se, it's just to ship something faster. So, if you look at Amazon's competitive advantage, it's that they just continue to ship product faster and faster and faster than companies can build themselves. And also, the scale that they're getting with these economies is increasing the quality. So, open source has also hit the naysayers on security, right? Everyone said, "Oh, open source is not secure." As it turns out, it's more secure. Amazon at scale is actually becoming more secure. So, you're starting to see the new competitive advantage be ship more, be more open as the way to do business. What do you think the impact will be to traditional companies whether it's a startup competing or an existing bank? This is a paradigm shift, what's the impact going to be for a CIO or CEO of a big company? How do they incorporate that competitive advantage? >> Yeah, I think the proprietary software world is not going to go away tomorrow, John, you know that. 
There's so much installed software and there's a saying from where I come from that "Even a dead elephant is worth a million dollars," right? So, even that business model even though it is sort of dying it'll still be a good investment for the next ten years because of the locked in business model where customers cannot get out. Now, from a perspective of openness and what that brings as a competitive differentiator to our customer, just the very pace at which, as I've said I've lived in a proprietary world, you would be lucky if you were getting the next version of our software every 18 months, you'd be lucky. In the open source community you get a few versions in 18 months. So, the cadence at which releases come out has just completely disrupted the proprietary model. It is just the collective, as I said, innovative or innovation ability of the community has allowed us to release, to increase the release cadence to a few months now, all right? And, if our engineering team had its way it'll further be cut short, right? So, the ability of customers, and what does that allow the customer to do? Ten years ago if you looked for a capability from your proprietary vendor they would say you have to wait 18 months. So, what do you do, you build it yourself, all right? So, that is what the spaghetti architecture was all about. In the new open source model you ask the community and if enough people in the community think that that's important the community builds it for you and gives it to you. >> And, the good news is the business model of open source is working. So, you guys have been public, you got Cloudera going public, you have MuleSoft out there, a lot of companies out there now that are public companies are open source companies, a phenomenal change over. But, the other thing that's interesting is that the hiring factor for the large enterprise to the point of, your point about so proprietary not updating, it's the same is true for the enterprise. 
So, just hiring candidates out of open source has now increased the talent pool for a large enterprise. >> 100%, 100%. >> Well, I wonder if I could challenge this love fest for a minute. (laughs) So, there's another saying, I didn't grow up there, but a dying snake can still bite you. So, I bring that up because there is this hybrid model that's emerging because these elephants eventually they figure it out. And so, an example would be, we talked about Cloudera and so forth, but the better example, I think, is IBM. What IBM has done to embrace open source with investing years ago a billion dollars into Linux, what it's doing with Spark, essentially trying to elbow its way in and say, "Okay, now we're going to co-opt the ecosystem. And then, build our proprietary pieces on top of it." That, to me, that's a viable business model, is it not? >> Yes, I'm sure it is and to John's point with the Mule going IPO and with Cloudera having successfully built a $250 million, $261 million business is testimony, yeah, it's a testimony to the fact that companies can be built. Now, can they be more efficient, sure they can be more efficient. However, my entire comment on this is why are you doing open source? What is your intent of doing open source, to be seen as open, or to be truly open? Because, in our philosophy if you add a slim layer of proprietariness, why are you doing that? And, as a businessman I'll tell you why: you increase the stickiness factor by locking in your customer, right? So, let's not, again, we're having a frank conversation, proprietary code equals customer lock in, period. >> Agreed. And, as a business model-- >> I'm not sure I agree with that. >> As a business model. >> Please. (laughs) We'll come back to that. >> So, it's a customer lock in. 
Now, as a business model it is, if you were to go with the business models of the past, yes I believe most of the analysts will say it's a stickier, better business model, but then we would like to prove them wrong. And, that's our mission as open source purely. >> I would caution though, Amazon's the mother of all lock-ins. You kind of bristled at that before. >> They're not, I mean they use a lot of open source. I mean, did they open source it? Getting back to the lock in, the lock in is a function of stickiness, right? So, stickiness can be open source. Now, you could argue that Hortonworks through their relationship with partnering is a lock in spec with their stickiness of being open. Right, so I come back down to the proprietary-- >> Dave: My search engine I like Google. >> I mean Google's certainly got-- >> It's got to be locked in 'cause I like it? >> Well, there's a lot of do you care with proprietary technology that Google's built. >> Switching costs, as we talked about before. >> But, you're not paying for search. >> If the value exceeds the price of the lock in then it's an opportunity. So, Palma Richie's talking about the hardened top, the hardened top. Do you care what's in an Intel processor? Well, Intel is a proprietary platform that provides processing power, but it enables a lot of other value. So, I think the stickiness factor of say IBM is interesting and they've done a lot of open source stuff to defend them on Linux, for example they do a (mumbles) blockchain. But, they're priming the pump for their own business, that's clear for their lock in. >> Raj wasn't saying there's not value there. He's saying it's lock in, and it is. >> Well, some customers will pay for convenience. >> Your point is if the value exceeds the lock in risk then it's worth it. >> Yeah, that's my point, yeah. >> 100%, 100%. >> And, that's where the opportunity is. So, you can use open source to get to a value trajectory. 
That's the barriers to entry, we've seen 'em on the entrepreneurship side, right? It's easier to start a company now than ever before. Why? Because of open source and cloud, right? So, does that mean that every startup's going to be super successful and beat IBM? No, not really. >> Do you think there will be a Red Hat of big data and will you be it? >> We hope so. (laughs) If I had my that's definitely. That's really why I am here. >> Just an example, right? >> And, the one thing that excites us about this this year is as my former boss used to say you could be as good as you think you are or the best in the world but if you're in the landline business right now you're not going to have a very bright future. However, the business that we are in we pull from the market that we get, and you're seeing here, right? And, these are days that we have very often where customer pull is remarkable. I mean, this industry is growing at, depending on which analyst you're talking to somewhere between 50 to 80% year on year. All right, every customer is a prospect for us. There isn't a single conversation that we have with any organization almost of any size where they don't think that they can use their data better, or they can enhance and improve their data strategy. So, if that is in place and I am confident about our execution, very, very happy with the technology platform, the support that we get from our customers. So, all things seem to be lining up. >> Raj, thanks so much for coming on, we appreciate your time. We went a little bit over, I think, the allotted time, but wanted to get your insight as the new President and Chief Operating Officer for Hortonworks. Congratulations on the new role, and looking forward to seeing the results. Since you're a public company we'll be actually able to see the scoreboard. >> Raj: Yes. >> Congratulations, and thanks for coming on the CUBE. There's more coverage here live at Dataworks 2017. 
I'm John Furrier, stay with us for more great interviews, day two coverage. We'll be right back. (jaunty music)

Published Date : Apr 6 2017


ENTITIES

Entity	Category	Confidence
IBM	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
TIBCO	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Google	ORGANIZATION	0.99+
Raj Verma	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Scott	PERSON	0.99+
Steve Bauman	PERSON	0.99+
Centrica	ORGANIZATION	0.99+
British Gas	ORGANIZATION	0.99+
John	PERSON	0.99+
Tim Conner	PERSON	0.99+
John Tapscott	PERSON	0.99+
Hortonworks	ORGANIZATION	0.99+
Europe	LOCATION	0.99+
KPMG	ORGANIZATION	0.99+
Deloitte	ORGANIZATION	0.99+
California	LOCATION	0.99+
John Furrier	PERSON	0.99+
Scott McNeally	PERSON	0.99+
Sean Connolly	PERSON	0.99+
Larry Ellison	PERSON	0.99+
Ronald Coase	PERSON	0.99+
Dell	ORGANIZATION	0.99+
San Jose	LOCATION	0.99+
Germany	LOCATION	0.99+
Amazon Web Services	ORGANIZATION	0.99+
Raj	PERSON	0.99+
Scott McNeely	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
$261 million	QUANTITY	0.99+
Andy Jasseys	PERSON	0.99+
AWS	ORGANIZATION	0.99+
82%	QUANTITY	0.99+
$250 million	QUANTITY	0.99+
16 years	QUANTITY	0.99+
100%	QUANTITY	0.99+
Dave	PERSON	0.99+
84%	QUANTITY	0.99+
23 years	QUANTITY	0.99+
18 months	QUANTITY	0.99+
Scott Di	PERSON	0.99+
Cloudera	ORGANIZATION	0.99+
last week	DATE	0.99+
Extensia	ORGANIZATION	0.99+
Oracles	ORGANIZATION	0.99+

Steve Roberts, IBM– DataWorks Summit Europe 2017 #DW17 #theCUBE

>> Narrator: Covering DataWorks Summit, Europe 2017, brought to you by Hortonworks. >> Welcome back to Munich everybody. This is The Cube. We're here live at DataWorks Summit, and we are the live leader in tech coverage. Steve Roberts is here as the offering manager for big data on power systems for IBM. Steve, good to see you again. >> Yeah, good to see you Dave. >> So we're here in Munich, a lot of action, good European flavor. It's my second European, formerly Hadoop Summit, now DataWorks. What's your take on the show? >> I like it. I like the size of the venue. It's the ability to interact and talk to a lot of the different sponsors and clients and partners, so the ability to network with a lot of people from a lot of different parts of the world in a short period of time, so it's been great so far and I'm looking forward to building upon this and towards the next DataWorks Summit in San Jose. >> Terri Virnig VP in your organization was up this morning, had a keynote presentation, so IBM got a lot of love in front of a fairly decent sized audience, talking a lot about the sort of ecosystem and that's evolving, the openness. Talk a little bit about open generally at IBM, but specifically what it means to your organization in the context of big data. >> Well, I am from the power systems team. So we have an initiative that we launched a couple years ago called Open Power. And Open Power is a foundation of participants innovating from the power processor through all aspects, through accelerators, IO, GPUs, advanced analytics packages, system integration, but all to the point of being able to drive open power capability into the market and have power servers delivered not just through IBM, but through a whole ecosystem of partners. This complements quite well the Apache, Hadoop, and Spark philosophy of openness as it relates to software stack. 
So our story's really about being able to marry the benefits of open ecosystem for open power as it relates to the system infrastructure technology, which drives the same time to innovation, community value, and choice for customers as it relates to a multi-vendor ecosystem and coupled with the same premise as it relates to Hadoop and Spark. And of course, IBM is making significant contributions to Spark as part of the Apache Spark community and we're a key active member, as is Hortonworks with the ODPi organization forwarding the standards around Hadoop. So this is a one, two combo of open Hadoop, open Spark, either from Hortonworks or from IBM sitting on the open power platform built for big data. No other story really exists like that in the market today, open on open. >> So Terri mentioned cognitive systems. Bob Picciano has recently taken over and obviously has some cognitive chops, and some systems chops. Is this a rebranding of power? Is it sort of a layer on top? How should we interpret this? >> No, think of it more as a layer on top. So power will now be one of the assets, one of the sort of member family of the cognitive systems portion on IBM. System z can also be used as another great engine for cognitive in certain clients, certain use cases where they want to run cognitive close to the data and they have a lot of data sitting on System z. So power systems as a server really built for big data and machine learning, in particular our S822LC for high performance computing. This is a server which is landing very well in the deep learning, machine learning space. It offers the Tesla P100 GPU and with the NVIDIA NVLink technology can offer up to 2.8x bandwidth benefits CPU to GPU over what would be available through a PCIe Intel combination today. So this drives immediate value when you need to ensure that not just you're exploiting GPUs, but you of course need to move your data quickly from the processor to the GPU. 
>> So I was going to ask you actually, sort of what makes power so well suited for big data and cognitive applications, particularly relative to Intel alternatives. You touched on that. IBM talks a lot about Moore's Law starting to hit its peak, that innovation is going to come from other places. I love that narrative 'cause it's really combinatorial innovation that's going to lead us in the next 50 years, but can we stay on that thread for a bit? What makes power so substantially unique, uniquely suited and qualified to run cognitive systems and big data? >> Yeah, it actually starts with even more of the fundamentals of the power processors. The power processor has eight threads per core in contrast to Intel's two threads per core. So this just means for being able to parallelize your workloads and workloads that come up in the cognitive space, whether you're running complex queries and need to drive SQL over a lot of parallel pipes or you're running iterative computation over the same data set as when you're doing model training, these can all benefit from highly parallelized workloads, which can benefit from this 4x thread advantage. But of course to do this, you also need large, fast memory, and we have six times more cache per core versus Broadwell, so this just means you have a lot of memory close to the processor, driving that throughput that you require. And then on top of that, now we get to the ability to add accelerators, and unique accelerators such as I mentioned the NVIDIA NVLink scenario for GPU or using the open CAPI as an approach to attach FPGA or Flash to get access speeds, processor memory access speeds, but with an attached acceleration device. And so this is economies of scale in terms of being able to offload specialized compute processing to the right accelerator at the right time, so you can drive way more throughput. 
The upper bounds are driving workload through individual nodes and being able to balance your IO and compute on an individual node is far superior with the power system server. >> Okay, so multi-threaded, giant memories, and this open CAPI gives you primitive level access I guess to a memory extension, instead of having to-- >> Yeah, pluggable accelerators through this high speed memory extension. >> Instead of going through, what I often call the horrible storage stack, aka SCSI. And so that's cool, some good technology discussion there. What's the business impact of all that? What are you seeing with clients? >> Well, the business impact is not everyone is going to start with souped-up accelerated workloads, but they're going to get there. So part of the vision that clients need to understand as they begin to get more insights from their data is, it's hard to predict where your workloads are going to go. So you want to start with a server that provides you some of that upper room for growth. You don't want to keep scaling out horizontally by requiring to add nodes every time you need to add storage or add more compute capacity. So firstly, it's the flexibility, being able to bring versatile workloads onto a node or a small number of nodes and be able to exploit some of these memory advantages, acceleration advantages without necessarily having to build large scale out clusters. Ultimately, it's about improving time to insights. So with accelerators and with large memory, running workloads on similarly configured clusters, you're simply going to get your results faster. For example, in a recent benchmark we did with a representative set of TPC-DS queries on Hortonworks running on Linux and power servers, we're able to drive 70% more queries per hour over a comparable Intel configuration. So this is just getting more work done on what is now similarly priced infrastructure. 
'Cause the Power family is a broad family that now includes 1U, 2U, scale out servers, along with our 192 core horsepowers for enterprise grade. So we can directly price compete on a scale out box, but we offer a lot more flexible choice as clients want to move up in the workload stack or to bring accelerators to the table as they start to experiment with machine learning. >> So if I understand that right, I can turn two knobs. I can do the same amount of work for less money, TCO play. Or, for the same amount of money, I can do more work. >> Absolutely. >> Is that fair? >> Absolutely, now in some cases, especially in the Hadoop space, the size of your cluster is somewhat gated by how much storage you require. And if you're using the classic scale up storage model, you're going to have so many nodes no matter what 'cause you can only put so much storage on the node. So in that case, >> You're scaling storage. >> Your clusters can look the same, but you can put a lot more workload on that cluster or you can bring in a solution like IBM Spectrum Scale or our Elastic Storage Server, which allows you to essentially pull that storage off the nodes, put it in a storage appliance, and at that point, you now have high speed access to storage 'cause of course the network bandwidth has increased to the point that the performance benefit of local storage is no longer really a driving factor to a classic Hadoop deployment. You can get that high speed access in a storage appliance mode with the resiliency at far less cost 'cause you don't need 3x replication, you just have about a 30% overhead for the software erasure coding. And now with your compute nodes, you can really choose and scale those nodes just for your workload purposes. So you're not bound by the number of nodes equaling total storage required divided by storage per node, which is the classic how-big-is-my-cluster calculation.
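The sizing logic described above can be sketched as follows. This is an illustration under stated assumptions, not a sizing tool: the 3x replication factor and the roughly 30% erasure coding overhead are the speaker's figures, while the 500 TB of usable data, the 48 TB per node, and the function names are hypothetical.

```python
import math


def raw_capacity_tb(usable_tb: float, overhead_factor: float) -> float:
    """Raw capacity needed to hold usable_tb under a given protection scheme."""
    return usable_tb * overhead_factor


def nodes_for_storage(raw_tb: float, storage_per_node_tb: float) -> int:
    """Classic cluster sizing: total storage required divided by storage per node."""
    return math.ceil(raw_tb / storage_per_node_tb)


usable_tb = 500.0            # hypothetical amount of data to store
storage_per_node_tb = 48.0   # hypothetical local disk capacity per node

# Classic Hadoop model: three full copies of every block.
replicated_tb = raw_capacity_tb(usable_tb, 3.0)

# Storage appliance with software erasure coding: about 30% overhead.
erasure_coded_tb = raw_capacity_tb(usable_tb, 1.3)

# With local storage, the node count is dictated by capacity, not by workload.
storage_bound_nodes = nodes_for_storage(replicated_tb, storage_per_node_tb)

print(f"3x replication: {replicated_tb:.0f} TB raw")
print(f"Erasure coding: {erasure_coded_tb:.0f} TB raw")
print(f"Nodes forced by local storage: {storage_bound_nodes}")
```

Once the storage is pulled into a shared appliance, the node count in the last step no longer has to follow from capacity, so compute nodes can be sized for the workload alone.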
That just doesn't work if you get over 10 nodes, 'cause now you're just starting to get to the point where you're wasting something, right? You're either wasting storage capacity or typically you're wasting compute capacity 'cause you're over provisioned on one side or the other. >> So you're able to scale compute and storage independently and tune that for the workload and grow that resource efficiently, more efficiently? >> You can right size the compute and storage for your cluster, but also importantly you gain flexibility with that storage tier: that data plane can be used for other non-HDFS workloads. You can still have classic POSIX applications or you may have new object based applications, and you can, with a single copy of the data, have one virtual file system, which could also be geographically distributed, serving both Hadoop and non-Hadoop workloads, so you avoid the additional replicas of the data that would otherwise be required by onboarding everything onto a common data layer. >> So that's a return on asset play. You got an asset that's more fungible across the application portfolio. You can get more value out of it. You don't have to dedicate it to this one workload and then over provision for another one when you got extra capacity sitting here. >> It's a TCO play, but it's also a time saver. It's going to get you time to insight faster 'cause you don't have to keep moving that data around. The time you spend copying data is time you should be spending getting insights from the data, so having a common data layer removes that delay. >> Okay, 'cause it's HDFS ready I don't have to essentially move data from my existing systems into this new stovepipe. >> Yeah, we just present it through the HDFS API as it lands in the file system from the original application. >> So now, all this talk about rings of flexibility, agility, etc, what about cloud? How does cloud fit into this strategy?
What are you guys doing with your colleagues and cohorts at Bluemix, aka SoftLayer? You don't use that term anymore, but we do. When we get our bill it says SoftLayer still, but at any rate, you know what I'm talking about. The cloud with IBM, how does it relate to what you guys are doing in Power Systems? >> Well, the born on the cloud philosophy of the IBM software analytics team is still very much the motto. So as you see in the Data Science Experience, which was launched last year, born in the cloud, all our analytics packages, whether it be our BigInsights software or our business intelligence software like Cognos, our future generations are landing first in the cloud. And of course we have our whole arsenal of Watson based analytics and APIs available through the cloud. So what we're now seeing as well is we're taking those born in the cloud offerings, but now also offering a lot of those in an on-premise model. So they can also participate in the hybrid model, so Data Science Experience is now coming on premise, we're showing it at the booth here today. Bluemix has an on premise version as well, and the same software library, BigInsights, Cognos, SPSS, are all available for on prem deployment. So Power is still an ideal place for hosting your on prem data and to run your analytics close to the data, and now we can federate that through hybrid access to these elements running in the cloud. So the focus is really on the cloud applications being able to leverage the Power and System z based data through high speed connectors, and being able to build hybrid configurations where you're running your analytics where they most make sense based upon your performance requirements, data security and compliance requirements. And a lot of companies, of course, are still not comfortable putting all their jewels in the cloud, so typically there's going to be a mix and match.
We are expanding the footprint for cloud based offerings both in terms of Power servers offered through SoftLayer, but also through other cloud providers. Nimbix is a partner we're working with right now who actually is offering our Power AI package. Power AI is a package of open source deep learning frameworks, packaged by IBM, optimized for Power in an easily deployed package with IBM support available. And that could be deployed on premise in a Power server, but is also available on a pay per drink basis through the Nimbix cloud. >> All right, we covered a lot of ground here. We talked strategy, we talked strategic fit, which I guess is sort of an adjunct to strategy, we talked a little bit about the competition and where you differentiate, some of the deployment models, like cloud, other bits and pieces of your portfolio. Can we talk specifically about the announcements that you have here at this event, just maybe summarize for us? >> Yeah, no absolutely. As it relates to IBM, and Hadoop, and Spark, we really have the full stack support, the rich analytics capabilities that I was mentioning, deep insight, prescriptive insights, streaming analytics with IBM Streams, Cognos Business Intelligence, so this set of technologies is available for both IBM's Hadoop stack and Hortonworks' Hadoop stack today. Our BigInsights and IOP offering is now out for tech preview; the next release, the 4.3 release, is available for technical preview and will be available for both Linux on Intel and Linux on Power towards the end of this month, so that's kind of one piece of new Hadoop news at the analytics layer. As it relates to Power Systems, as Hortonworks announced this morning, HDP 2.6 is now available for Linux on Power, so we've been partnering closely with Hortonworks to ensure that we have an optimized story for HDP running on Power Systems servers, as with the data point I shared earlier with the 70% improved queries per hour.
At the storage layer, we have work in progress with Hortonworks to certify the Spectrum Scale file system, which really now unlocks abilities to offer this converged storage alternative to the classic Hadoop model. Spectrum Scale actually supports and provides advantages in both a classic Hadoop model with local storage, or it can provide the flexibility of offering the same sort of multi-application support, but in a scale out model for storage. It also has the ability to form a part of a storage appliance that we call Elastic Storage Server, which is a combination of Power servers and high density storage enclosures, SSD or spinning disk, or flash, depending on the configuration, and that certification will now have that as an available storage appliance, which could underpin either IBM Open Platform or HDP as a Hadoop data lake. But as I mentioned, not just for Hadoop, really for building a common data plane behind mixed analytics workloads that reduces your TCO through a converged storage footprint, but more importantly, provides you that flexibility of not having to create data copies to support multiple applications. >> Excellent, IBM opening up its portfolio to the open source ecosystem. You guys have always had, well not always, but in the last 20 years, major, major investments in open source. They continue on, we're seeing it here. Steve, people are filing in. The evening festivities are about to begin. >> Steve: Yeah, yeah, the party will begin shortly. >> Really appreciate you coming on The Cube, thanks very much. >> Thanks a lot Dave. >> You're welcome. >> Great to talk to you. >> All right, keep it right there everybody. John and I will be back with a wrap up right after this short break, right back.

Published Date : Apr 6 2017


ENTITIES

Entity	Category	Confidence
IBM	ORGANIZATION	0.99+
John	PERSON	0.99+
Steve	PERSON	0.99+
Steve Roberts	PERSON	0.99+
Dave	PERSON	0.99+
Munich	LOCATION	0.99+
Bob Picciano	PERSON	0.99+
Hortonworks	ORGANIZATION	0.99+
Terri	PERSON	0.99+
3x	QUANTITY	0.99+
six times	QUANTITY	0.99+
70%	QUANTITY	0.99+
last year	DATE	0.99+
San Jose	LOCATION	0.99+
two knobs	QUANTITY	0.99+
Bluemix	ORGANIZATION	0.99+
NVIDIA	ORGANIZATION	0.99+
eight threads	QUANTITY	0.99+
Linux	TITLE	0.99+
Hadoop	TITLE	0.99+
both	QUANTITY	0.98+
one	QUANTITY	0.98+
Nimbix	ORGANIZATION	0.98+
today	DATE	0.98+
DataWorks Summit	EVENT	0.98+
SoftLayer	TITLE	0.98+
second	QUANTITY	0.97+
Hadoop Summit	EVENT	0.97+
Intel	ORGANIZATION	0.97+
Spark	TITLE	0.97+
IBMs	ORGANIZATION	0.95+
single copy	QUANTITY	0.95+
end of this month	DATE	0.95+
Watson	TITLE	0.95+
S822LC	COMMERCIAL_ITEM	0.94+
Europe	LOCATION	0.94+
this morning	DATE	0.94+
firstly	QUANTITY	0.93+
HDP 2.6	TITLE	0.93+
first	QUANTITY	0.93+
HDFS	TITLE	0.91+
one piece	QUANTITY	0.91+
Apache	ORGANIZATION	0.91+
30%	QUANTITY	0.91+
ODPi	ORGANIZATION	0.9+
DataWorks Summit Europe 2017	EVENT	0.89+
two threads per core	QUANTITY	0.88+
SoftLayer	ORGANIZATION	0.88+

Day 1 Wrap - DataWorks Summit Europe 2017 - #DWS17 - #theCUBE

(Rhythm music) >> Narrator: Live, from Munich, Germany, it's The Cube. Coverage, DataWorks Summit Europe, 2017. Brought to you by Hortonworks. >> Okay, welcome back everyone. We are live in Munich, Germany for DataWorks 2017, formerly known as Hadoop Summit. This is The Cube special coverage of the Big Data world. I'm John Furrier with my co-host Dave Vellante. Two days of live coverage, day one wrapping up. Now, Dave, we're just kind of reviewing the scene here. First of all, Europe is a different vibe. But the game is still the same. It's about Big Data evolving from Hadoop to full open source penetration. Companies are now public in the markets: Hortonworks; Cloudera is now filing an S-1; MuleSoft, Talend, a variety of the other public companies. Alteryx. Hadoop is not dead, it's not dying. It certainly is going to have a position in the industry, but the Big Data conversation is front and center. And one thing that's striking to me is that in Europe, more than in North America, IOT is more centrally themed in this event. Europe is on the Internet of Things because of the manufacturing, smart cities. So there's a lot of IOT happening here, and I think this is a big discovery. Certainly, the Hortonworks event is much more of a community event than Strata Hadoop, which is much more about making money and modernization. This show's got a lot more engagement with real conversations and developer sessions. Very engaging audience. Well, yeah, it's Europe. So you've got a little bit different, smaller show than North America, but to me, IOT, Internet of Things, is bringing the other cloud world with Big Data. That's the forcing function. And real time data is the center of the action. I think it's going to be a continuing theme as we move forward. >> So, in 2010 John, it was all about 'What is Hadoop?' The middle part of that decade was all about Hadoop's got to go into the enterprise. It's gone mainstream into the enterprise, and now it's sort of 'what's next?' Same wine new bottle.
But I will say this, Hadoop, as you pointed out, is not dead. And I liken it to the early web. Web 1.0, it was profound. It was a new paradigm. The profundity of Hadoop was that you could ship five megabytes of code to a petabyte of data. And that was the new model and that's spawned, that's catalyzed the Big Data movement. That is with us now and it's entrenched, and now you're seeing layers of innovation on top of that. >> Yeah, and I would just reiterate and reinforce that point by saying that Cloudera, the founders of this industry if you will, with Hadoop, the first company to be commercially funded to do that, while Hortonworks came in after the fact out of Yahoo, came out of a web-scale world. So you have the cloud native DevOps culture, Amr Awadallah at Yahoo, Mike Olson, Jeff Hammerbacher, Christophe Bisciglia. These guys were hardcore large-scale data guys. Again, this is the continuation of the evolution, and I think nothing has changed in that regard because those pioneers have set the stage for now the commercialization and now the conversation around operationalizing this. Cloud is big. And having Alan Nance, a practitioner, rock-star, talking about radical deployments that can drop a billion dollars in cost savings to the bottom line. This is the kind of conversation we're going to see more of. This is going to change the game from, you know, "Hey, I'm the CFO buyer" or "CIO doing IT", to an operational CEO, chief operating officer level conversation. That operational model of cloud is now coming into view, like what ERP did in software, those kinds of megatrends, this is happening right now. >> As we talk about the open, the people who are going to make the real money on Big Data are the practitioners, those people applying it. We talked about Alan Nance's example of billion dollar, half a billion dollar cost-savings revenue opportunities, that's where the money's being made. It's not being made, yet anyway, with these public companies.
You're seeing it with Splunk, Tableau, now Cloudera, Hortonworks, MapR. Is MapR even here? >> Haven't seen 'em. >> No I haven't seen MapR, they used to have a pretty prominent display at the show. >> You brought up a point I want to get back to. This relates to those guys, which is, profitless prosperity. >> Yeah. >> A term used for open source. I think there's a trend happening and I can't put a finger on it but I can kind of feel it. That is the ecosystems of open source are now going to a dimension where they're not yet valued in the classic sense. Most people that build platforms value ecosystems, that's where developers came from. Developer ecosystems fuel open source. But if you look at enterprise, at transformations over the decades, you'd see the successful companies have ecosystems of channel partners; ecosystems of indirect sales if you will. We're seeing the formation, at least I can start seeing the formation of an indirect engine of value creation, vis-à-vis this organic developer community where the people are building businesses and companies. Shaun Connolly pointed to fintech as an example. Where these startups became financial services businesses that became fintech suppliers to the banks. They're not in the banking business per se, but they're becoming as important as banks 'cuz they're the providers in fintech, fintech being financial tech. So you're starting to see this ecosystem of not "channel partners", resell my equipment or software in the classic sense as we know them as they're called channel partners. But if this continues to develop, the thousand flower blooming strategy, you could argue that Hortonworks is undervalued as a company because they're not realizing those gains yet or those gains can't be measured. So if you're an MBA or an investment banker, you've got to be looking at the market saying, "wow, is there a net-present value to an ecosystem?" It begs the question Dave. >> Dave: It's a great question John.
>> This is wealth creation. A rising tide floats all boats, and in that rising tide there is an ecosystem value number. No one has their hands on that, no one's talked about that. That is the upshot in my mind, the silver-lining to what some are saying is the consolidation of Hadoop. Some are saying Cloudera is going to get a huge haircut off their $4.1 billion valuation. >> Dave: I think that's inevitable. >> Which is some say, they may lose two to three billion in value, in the IPO. Post IPO which would put them in line with Hortonworks based on the numbers. You know, is that good or bad? I don't think it's bad because the value shifts to the ecosystem. Cloudera and Hortonworks both play in open source so you can be glass half-empty on one hand, on the haircut, upcoming for Cloudera, or saying "No, the glass is half-full because it's a haircut in the short-term maybe", if that happens. I mean some said Pure Storage was going to get a haircut, they never really did Dave. So, again, no one yet has pegged the valuation of an ecosystem. >> Well, and I think that is a great point, personally I think, I've been sort of racking my brain, will this Big Data hype be realized. Like the internet. You remember the internet hyped up, then it crashed; no one wanted to own any of these companies. But it actually lived up to the hype. It actually exceeded the hype. >> You can get pet food online now, it's called Amazon. [Co-Hosts Chuckle Together] All the e-commerce played out. >> Right, e-commerce played out. But I think you're right. But everybody's expecting sort of, was expecting a similar type of cycle. "Oh, this will replace that." And that's not what's going to happen. What's going to happen is the ecosystem is going to create a flywheel effect, is really what you're saying. >> Jeff: Yes. >> And there will be huge valuations that emerge out of this.
But today, the guys that we know and love, the Hortonworks, the Clouderas, et cetera, aren't really on the winners list, I mean some of their founders maybe are. But who are the winners? Maybe the customers because they saw a big drop in cost. Apache's a big winner here. Wouldn't ya say? >> Yeah. >> Apache's looking pretty good, Apache Foundation. I would say AWS is a pretty big winner. They're drifting off of this. How about Microsoft and IBM? I mean I feel in a way IBM has sort of co-opted this Big Data meme, and said, "okay, cognitive." And layered all of its stuff on top of it. Bought The Weather Company, repositioned the company, now it hasn't translated into growth, but certainly has profitability implications. >> IBM plays well here, I'll tell you why. They're very big in open source, so that's positive. Two, they have a huge track record and staff dealing with professional services in the enterprise. So if transformation is the journey conversation, IBM's right there. You can't ignore IBM on this one. Now, the stack might be different, but again, beauty is in the eye of the beholder because it depends on what workloads you have. IBM is not going to leave you high and dry 'cuz they have a really you need for what they can do with their customers. Where people are going to get blindsided in my opinion, the IBMs and Oracles of the world, and even Microsoft, is what Alan Nance was talking about, the radical transformation around the operating model is going to force people to figure out when to start cannibalizing their own stacks. That's going to be a tell sign for winners and losers in the big game. Because if IBM can shift quickly and co-opt the megatrends, make it their own, get out in front of that next wave as Pat Gelsinger would say, they could surf that wave and then tweak, and then get out in front. If they don't get behind that next wave, they're driftwood.
It really is all about where you are in the spectrum, and analytics is one of those things in data where you've got to have a cohesive horizontal strategy. You got to be horizontally scalable with data. You got to make data freely available. You have to have an abstraction layer of software that will allow free movement of data across systems. That's the number one thing that comes out of seeing the Hortonworks Data Platform for instance. Shaun Connolly called it 'connective tissue'. Cloudera is the same thing, they have to start figuring out ways to be better at the data across the horizontal view. Cloudera like IBM has an opportunity as well, to get out in front of the next wave. I think you can see that with AI and machine learning, clearly they're going to go after that. >> Just to finish off on the winners and losers; I mean, the other winner is systems integrators to service these companies. But I like what you said about cannibalizing stacks as an indicator of what's happening. So let's talk about that. Oracle clearly cannibalizing its stacks, saying, "okay, we're going to the red stack to the cloud, go." Microsoft has made that decision to do that. IBM? To a large degree is cannibalizing its stack. HP sold off its stack, said, "we don't want to cannibalize our stack, we want to sell and try to retool." >> So, your question, your point? >> So, haven't they already begun to do that, the big legacy companies? >> They're doing their tweaking the collet and mog, as an example. At Oracle OpenWorld and IBM InterConnect, all the shows we, except for Amazon, 'cuz they're pure cloud. All are taking the unique differentiation approach to their own stuff. IBM is putting stuff that's related to IBM in their cloud. Oracle differentiates on their stack, for instance, I have no problem with Oracle because they have a huge database business.
And, you're high as a kite if you think Oracle's going to lose that database business when data is the number one asset in the world. What Oracle's doing which I think is quite brilliant on Oracle's part is saying, "hey, if you want to run on premise with hardware, we got Sun, and oh by the way, our database is the fastest on our stuff." Check. Win. "Oh you want to move to the cloud? Come to the Oracle cloud, our database runs the fastest in our cloud", which is their stuff in the cloud. So if you're an Oracle customer you just can't lose there. So they created an inimitability around their own database. So does that mean they're going to win the new database war? Maybe not, but they can coexist as a system of record so that's a win. Microsoft Office 365, tightly coupling that with Azure is a brilliant move. Why wouldn't they do that? They're going to migrate their customer base to their own clouds. Oracle and Microsoft are going to migrate their customers to their own cloud. Differentiate and give their customers a gateway to the cloud. VMware is partnering with Amazon. Brilliant move and they just sold vCloud Air, which we reported at SiliconANGLE last night, to a French company recently so vCloud Air is gone. Now that puts VMware clearly in bed with Amazon Web Services. Great move for VMware, benefit to AWS, that's a differentiation for VMware. >> Dave: Somebody bought vCloud Air? >> I think you missed that last night 'cuz you were traveling. >> Chuckling: That's tongue-in-cheek, I mean what did they get for vCloud Air? >> OVH bought them, French company. >> More de-levering by Michael. >> Well, they're inter-clouding right? I mean de-leveraging the focus, right? So OVH, French company, has a very much coexisted... >> What'd they pay? >> ... strategy. It's undisclosed. >> Yeah, well why? 'Cuz it wasn't a big number. That's my point. >> Back to the other cloud players, Google. I think Google's differentiating on their technology. Great move, smart move.
They just got to get, as someone who's been following them, and you know, you and I both love an enterprise experience. They got to speak the enterprise language and execute the language. Not through 19 year olds and interns or recent smart college grads and say, "we're instantly enterprise." There are diseconomies of scale for trying to ramp up and trying to be too heavy on the enterprise. Amazon's got the same problem, you can't hire sales guys fast enough, and oh by the way, find me a sales guy that has ten, 15 years of executive selling experience in complex strategic sales, like the enterprise where you now have stakeholders that are in multiple roles and changing roles as Alan Nance pointed out. So the enterprise game is very difficult. >> Yup. >> Very very difficult. >> Well, I think these Hadoop startups are seeing that. None of them are making money. Shaun Connolly basically said, "hey, it used to be growth, they would pay for growth, but now they're punishing you if you don't have growth plus profitability." By the way, that's not all totally true. Amazon makes no money, unless stock prices go through the roof. >> There is no self-service, there is no self-service business model for digital transformation for enterprise customers today. It doesn't exist. The value proposition doesn't resonate with customers. It works good for Shadow IT, and if you want to roll out G Suite in some pockets of your organization, but an AdSense sales force doesn't work in the enterprise. Everyone's finding that out right now because they're basically transforming their enterprise. >> I think Google's going to solve their problem. I think Google has to solve their problem 'cuz... >> I think they will, but to me it's, buy a company, there's a zillion companies out there they could buy tomorrow that are private, that have like 300 sales people that are senior people. Pay the bucks, buy a sales force, roll your stuff out and start speaking the language. I think Diane Greene gets this.
So, I think, I expect to see Google ... >> Dave: Totally. >> do some things in that area. >> And I think, to your point, I've always said the rich get richer. The traditional legacy companies, they're holding serve in this. They waited they waited they waited, and they said, "okay now we're going to go put our chips on the table." Oracle made its bets. IBM made its bets. HP, not really, betting on hardware. Okay. Fine. Cisco, Microsoft, they're all making their bets. >> It's all about bets on technology and profitability. This is what I'm looking at right now Dave. We talked about it on our intro. Shaun Connolly, who's in charge of strategy at Hortonworks, clarified that clearly revenue while losing money is not going to solve the problem for credibility. Profitability matters. This comes back to the point we've made on The Cube multiple years ago and even just as recently as last year, that the world's flipping back down to credibility. Customers in the enterprise want to see credibility and track record. And they're going to evaluate the suppliers based upon key fundamentals in their business. Can they make money? Can they deliver SLAs? These are going to be key requirements, not the shiny new toy from Silicon Valley. Or the cool machine learning algorithm. It has to apply to their product, their value, and they're going to look to companies on the scoreboard and say, "are you profitable?" As a proxy for relevance. >> Well I want to keep it, but I do want to, we've been kind of critical of some of the Hadoop players. Cloudera and Hortonworks specifically. But I want to give them props 'cuz you remember well John, when the legacy enterprise guys started coming into the Hadoop market they all said that they had the same messaging, "we're going to make Hadoop enterprise ready." You remember that well, and I have to say that Hortonworks, Cloudera, I would say MapR as well and the ecosystem, have done a pretty good job of making Hadoop and Big Data enterprise ready.
They were already working on it very hard, I think they took it seriously and I think that that's why they are in the mix and they are growing as they are. Shaun Connolly talked about them being operating cashflow positive. Eking out some plus cash. On the next earnings call, pressure's on. But we want to see, you know, rocket ships. >> I think they've done a good job, I mean, I don't think anyone's been asleep at the switch. At all, enterprise ready. The question's always been "can they get there fast enough?" I think everyone's recognized that cost of ownership's down. We still solicit on the OpenStack ecosystem, and that they move right from the valley properties. So we'll keep an eye on it, tomorrow we'll be checking in. We got a great day tomorrow. Live coverage here in Munich, Germany for DataWorks 2017. More coverage tomorrow, stay with us. I'm John Furrier with Dave Vellante. Be right back with more tomorrow, day two. Keep following us.

Published Date : Apr 6 2017


ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Cisco	ORGANIZATION	0.99+
IBM	ORGANIZATION	0.99+
Michael	PERSON	0.99+
Diane Greene	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Dave	PERSON	0.99+
Shaun Connolly	PERSON	0.99+
Cloudera	ORGANIZATION	0.99+
Jeff Hammerbacher	PERSON	0.99+
Alan Nance	PERSON	0.99+
Europe	LOCATION	0.99+
two	QUANTITY	0.99+
Pat Gelsinger	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Hortonworks	ORGANIZATION	0.99+
Jeff	PERSON	0.99+
Apache	ORGANIZATION	0.99+
John	PERSON	0.99+
Yahoo	ORGANIZATION	0.99+
tomorrow	DATE	0.99+
Christophe Bisciglia	PERSON	0.99+
Google	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
Thintech	ORGANIZATION	0.99+
HP	ORGANIZATION	0.99+
billion dollar	QUANTITY	0.99+
VMware	ORGANIZATION	0.99+
three billion	QUANTITY	0.99+
last year	DATE	0.99+
Silicon Valley	LOCATION	0.99+
Sun	ORGANIZATION	0.99+
Mike Olson	PERSON	0.99+
Two days	QUANTITY	0.99+
North America	LOCATION	0.99+
2010	DATE	0.99+
Neosoft	ORGANIZATION	0.99+
Talon	ORGANIZATION	0.99+

Chandra Mukhyala, IBM - DataWorks Summit Europe 2017 - #DW17 - #theCUBE

>> Narrator: theCUBE covering, DataWorks Summit Europe 2017. Brought to you by Hortonworks. >> Welcome back to the DataWorks Summit in Munich everybody. This is The Cube, the leader in live tech coverage. Chandra Mukhyala is here. He's the offering manager for IBM Storage. Chandra, good to see you. It always comes back to storage. >> It does, it's the foundation. We're here at a Data Show, and you got to put the data somewhere. How's the show going? What are you guys doing here? >> The show's going good. We have lots of participation. I didn't expect this big a crowd, but there is good crowd. Storage, people don't look at it as the most sexy thing but I still see a lot of people coming and asking. "What do you have to do with Hadoop?" kind of questions which is exactly the kind of question I expect. So, going good, we're able to-- >> It's interesting, in the early days of Hadoop and big data, I remember we interviewed, John and I interviewed Jeff Hammerbacher, founder of Cloudera and he was at Facebook and he said, "My whole goal at Facebook "when we're working with Hadoop was to "eliminate the storage container "and the expensive storage container." They succeeded, but now you see guys like you coming in and saying, "Hey, we have better storage." Why does the world need anything different than HDFS? >> This has been happening for the last two decades, right? In storage, every few years a startup comes, they address one problem very well. They address one problem and create a whole storage solution around that. Everybody understands the benefit of it and that becomes part of the main storage. When I say main storage, because these new point solutions address one problem but what about all the rest of the features storage has been developing for decades. Same thing happened with other solutions, for example, deduplication. Very popular, right at one point, dedupe appliances. Nowadays, every storage solution has dedupe in. I think same thing with HDFS right? 
HDFS is purpose-built for Hadoop. It solves that problem in terms of giving local access storage, scalable storage, big plural storage. But, it's missing out many things you know. One of the biggest problems they have with HDFS is it's siloed storage, meaning that data is only available, the data in HDFS is only for Hadoop. You can't, what about the rest of the applications in the organizations, who may need it through traditional protocols like NFS, or SMB or they maybe need it through new applications like S3 interfaces or Swift interfaces. So, you don't want that siloed storage. That's one of the biggest problems we have. >> So, you're putting forth a vision of some kind of horizontal infrastructure that can be leveraged across your application portfolio... >> Chandra: Yes. >> How common is that? And what's the value of that? >> It's not really common, that's one of the stories, messages we're trying to get out. And I've been talking to data scientists in the last one year, a lot of them. One of the first things they do when they are implementing a Hadoop project is, they have to copy a lot of data into HDFS, because until the data is in HDFS they can't run anything on it. That copy process takes days. >> Dave: That's a big move, yeah. >> It's not only wasting time from a data scientist, but it also makes the data stale. I tell them you don't have to do that if your data was on something like IBM Spectrum Scale. You can run Hadoop straight off that, why do you even have to copy into HDFS. You can use the same existing MapReduce applications with zero change and point them at Spectrum Scale, and they can still use the HDFS API. You don't have to copy that. And every data scientist I talk to is like, "Really?" "I don't know how to do this, I'm wasting time?" Yes. So, it's not very well known that, you know, most people think that there's only one way to do Hadoop applications, and that's HDFS. You don't have to.
And advantages there is, one, you don't have to copy, you can share the data with the rest of the applications but it's no more stale data. But also, one other big difference between the HDFS type of storage versus shared storage. In shared nothing, which is what HDFS is, the way you scale is by adding new nodes, which adds both compute and storage. What about applications which don't necessarily need more compute, all they need is more throughput? You're wasting compute resources, right? So there are certain applications where shared nothing is a better architecture. Now the solution which IBM has, will allow you to deploy it in either way. Shared nothing or shared storage but that's one of the main reasons, people want to, data scientists especially, want to look at these alternative solutions for storage. >> So when I go back to my Hammerbacher example, it worked for a Facebook of the early days because they didn't have a bunch of legacy data hanging around, they could start with, pretty much, a blank piece of paper. >> Yes. >> Re-architect, plus they had such scale, they probably said, "Okay, we don't want to go to EMC "and NetApp or IBM, or whomever and buy storage, "we want to use commodity components." Not every enterprise can do that, is what you're saying. >> Yes, exactly. It's probably okay for somebody like a very large search engine, when all they're doing is analytics, nothing else. But if you go to any large commercial enterprise, they have lots of, the whole point around analytics is they want to pool all of the data and look at that. So, find the correlations, right? It's not about analyzing one small, one dataset from one business function. It's about pooling everything together and see what insights can I get out of it. So that's one of the reasons it's very important to have support to access the data for your legacy enterprise applications, too, right?
Yeah, so NFS and SMB are pretty important, so are S3 and Swift, but also for these analytics applications, one of the advantages of the IBM solution here is we provide local access to the file system. Not necessarily through NAS protocols like NFS, we do that, but we also have POSIX access to give local access to the file system. With HDFS, you have to first copy the file into HDFS, you have to bring it back to do anything with that. All those copy operations go away. And this is important, again in enterprise, not just for data sharing but also to get local access. >> You're saying your system is Hadoop ready. >> Chandra: It is. >> Okay. And then, the other thing you hear a lot from IT practitioners anyway, not so much from the lines of business, that when people spin up these Hadoop projects, big data projects, they go outside of the edicts of the organization in terms of governance and compliance, and often, security. How do you solve, do you solve that problem? >> Yeah, that's one of the reasons to consider again, the enterprise storage, right? It's not just because you have, you're able to share the data with rest of applications, but also the whole bunch of data management features, including data governance features. You can talk about encryption there, you can talk about auditing there, you can talk about features like WORM, right, WORM, so data is, especially archival data, once you write you can't modify that. There are a whole bunch of features around data retention, data governance, those are all part of the data management stack we have. You get that for free. You not only get universal access, unified access, but you also get data governance. >> So is this one of the situations where, on the face of it, when you look at the CapEx, you say, "Oh, wow, I can use commodity components, save a bunch of money." You know, you remember the client server days.
"Oh, wow, cheap, cheap, cheap, "microprocessor based solution," and then all of a sudden, people realize we have to manage this. Have we seen a similar sort of trend with Hadoop, with the ability to or the complexity of managing all of this infrastructure? It's so high that it actually drives costs up. >> Actually there are two parts to it, right? There is actually value in utilizing commodity hardware, industry standards. That does reduce your costs right? If you can just buy a standard x86 server, a storage server, and utilize that, why not. That is kind of just because. But the real value in any kind of a storage data management solution is in the software stack. Now you can reduce CapEx by using industry standards. It's a good thing to do and we should, and we support that but in the end, the data management is there in the software stack. What I'm saying is HDFS is solving one problem by dismissing the whole data management problem, which we just touched on. And that all comes in software which runs on standard servers. >> Well, and you know, it's funny, I've been saying for years, that if you peel back the onion on any storage device, the vast majority anyway, they're all based on standard components. It's the software that you're paying for. So it's sort of artificial in that a company like IBM will say, "Okay, we've got all this value in here, "but it's on top of commodity components, "we're going to charge for the value." >> Right. >> And so if you strip that out, sure, you do it yourself. >> Yeah, exactly. And it's all standard servers. It's been like that always. Now one difference is ten years ago people used proprietary array controllers. Now all of the functionality's coming into software-- >> ASICs, >> Erasure coding. >> Yeah, 3PAR still has an ASIC, but most don't. >> Right, that's funny, they only come in like.. Almost everybody has some kind of a software-based erasure coding and they're able to utilize standard servers.
Now there is an advantage in the appliance model, though, because, yes, it can run on industry standard, but this is storage, that's the foundation of all of your infrastructure. And you want RAS, or you want reliability and availability. The only way to get that is a fully integrated, tight solution, where you're doing a lot of testing on the software and the hardware. Yes, it's supposed to work, but what really happens when it fails, how does the software react. And that's where I think there is still a value for integrated systems. If you're a large customer, you have a lot of storage-savvy administrators and they know how to build solutions and validate them. Yes, software based storage is the right answer for you. >> And you're the offering manager for Spectrum Scale, which is the file offering, right, that's right? >> Yes, right yes. >> And it includes object as well, or-- >> Spectrum Scale is a file and object storage pack. It supports file protocols. It also supports object protocols. The thing about object storage is it means different things to different people. To some people, it's the object interface. >> Yeah, to me it means get put. >> Yeah, if that's what the definition is, then it is object storage. But the fact is that everybody's supposed to stay in now. But to some of the people, it's not about the protocol, because they're going to still access by file protocols, but to them, it's about the object store, which means it's a flat name space and there's no hierarchical name structure, and you can get into billions of files without having any scalability issues. That's an object store. But to some other people it's neither of those, it's about erasure coding, which object storage uses, so it's cheap storage. It allows you to run on standard servers, and you get cheap storage. So it's three different things. So if you're talking about protocols yes, but Spectrum Scale is, by that definition, object storage also.
>> So in thinking about, well let's start with Spectrum Scale generally. But specifically, your angle in big data and Hadoop, and we talked about that a little bit, but what are you guys doing here, what are you showing, what's your partnership with Hortonworks. Maybe talk about that a little bit. >> So we've been supporting this, what we call as Hadoop connector on Spectrum Scale for almost a year now, which is allowing our existing Spectrum Scale customers to run Hadoop straight on it. But if you look at the Hadoop distributions, there are two or three major ones, right? Cloudera, Hortonworks, maybe MapR. One of the first questions we get is, we tell our customers you can run Hadoop on this. "Oh, is this supported by my distribution?" So that has been a problem. So what we announced is, we formed a partnership with Hortonworks, so now Hortonworks is certifying IBM Spectrum Scale. It's not new code changes, it's not new features, but it's a validation and a stamp from Hortonworks, that's in the process. The result of it is a Hortonworks certified reference architecture, which is what we announced. We announced it about a month ago. We should be publishing that soon. Now customers can have more confidence in the joint solutions. It's not just IBM saying that it's Hadoop ready, but it's Hortonworks backing that up. >> Okay, and your scope, correct me if I'm wrong, is sort of on prem and hybrid, >> Chandra: Yes. >> Not cloud services. That's kind of you might sell your technology internally, but-- >> Correct so IBM storage is primarily focused on on prem storage. We do have a separate cloud division, but almost every IBM storage product, especially Spectrum Scale, is what I can speak of, we treat them as hybrid cloud storage. What we mean by that is we have built in capabilities, we have a feature in most of our products called transparent cloud tiering, it allows you to set a policy on when data should be automatically tiered to the cloud.
Everybody wants public, everybody wants on prem. Obviously there are pros and cons of on-premises storage, versus off-premises storage, but basically, it boils down to, if you want performance and security, you want to be on premises. But there's always some which is better to be in the cloud, and we try to automate that with our feature called transparent cloud tiering. You set a policy based on age, based on the type of data, based on the ownership. The system will automatically tier the data to the cloud, and when a user accesses that data, it comes back automatically, too. It's all transparent to the end user. So yes, we're an on-premises storage business but our solutions are hybrid cloud storage. >> So, as somebody who knows the file business pretty well, let's talk about kind of the business file and sort of where it's headed. There's some mega trends and dislocations. There's obviously software defined. You guys have made a big investment in software defined a year and a half, two years ago. There's cloud, Amazon with S3 sort of shook up the world. I mean, at first it was sort of small, but then now, it's really catching on. Object obviously fits in there. What do you see as the future of file. >> That's a great question. When it comes to data layout, there's really block, file, or object. Software defined and cloud are various ways of consuming storage. If you're a large service provider, you would prefer a software based solution so you can run it on your existing servers. But who are your preferred solutions? Depending on the organization's preferences for security, and how concerned they are about security and performance needs, they will prefer to run some of the applications on cloud. These are different ways of consuming storage. But coming back to file and object, right? So object is perfect if you are not going to modify the data. You're done writing that data, and you're not going to change. It just belongs in an object store, right?
It's more scalable storage, I say scalable because file systems are hierarchical in nature. Because it's a file system tree, you have to traverse through the various subdirectory trees. Beyond a few million subdirectories, it slows you down. But file systems have a strength. When you want to modify the file, any application which is going to edit the file, which is going to modify the file, that application belongs on file storage, not on object. But let's say you are dealing with medical images. You're not going to modify an x-ray once it's done. That's better suited on an object storage. So file storage will always have a place. Take video editing and all these videos they are doing, you know video, we do a lot of video editing. That belongs on file storage, not on object. If you care about file modifications and file performance, file is your answer, but if you're done and you just want to archive it, you know, you want a scalable storage, billions of objects, then object is answer. Now either of these can be software based storage or it could be appliance. That's again an organization's preference for do you want to integrate a robust ready, ready made solution, then appliance is an answer. "Ah, no I'm a large organization. "I have a lot of storage administrators," they can build something on their own, then software based is answer. Most vendors will give you a choice. >> What brought you to IBM. You used to be at NetApp. IBM's buying the weather company. Dell's buying EMC. What attracted you to IBM? >> Storage is the foundation which we have, but it's really about data, and it's really about making sense of it, right? And everybody's saying data is the new oil, right? And IBM is probably the only company I can think of, which has the tools and the IP to make sense of all this. NetApp, it was great in early 2000s. Even as a storage foundation, they have issues, with scale out and a true scale out, not just a single name space. EMC is a pure storage company.
In the future it's all about, the reason we are here at this conference is about analyzing the data. What tools do you have to make sense of that. And that's where machine learning, then deep learning comes. Watson is very well-known for that. IBM has the IP and it has a rightful research going on behind that, and I think storage will make more sense here. And also, IBM is doing the right thing by investing almost a billion dollars in software defined storage. They are one of the first companies who did not hesitate to take the software from the integrated systems, for example, XIV, and made the software available as software only. We did the same thing with Storwize. We took the software off it and made it available as Spectrum Virtualize. We did not hesitate at all to take the same software which was available. Some other vendors say, "I can't do that. "I'm going to lose all my margins." We didn't hesitate. We made it available as software. 'Cause we believe that's an important need for our customers. >> So the vision of the company, cognitive, the halo effect of that business, that's the future, is going to bring a lot of storage action, is sort of the premise there. >> Chandra: Yes. >> Excellent, well Chandra, thanks very much for coming to theCUBE. It was great to have you, and good luck with attacking the big data world. >> Thank you, thanks for having me. >> You're welcome. Keep it right there everybody. We'll be back with our next guest. We're live from Munich. This is DataWorks 2017. Right back. (techno music)

Published Date : Apr 5 2017

ENTITIES

Entity	Category	Confidence
Jeff Hammerbacher	PERSON	0.99+
John	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Hortonworks	ORGANIZATION	0.99+
Dave	PERSON	0.99+
two	QUANTITY	0.99+
Munich	LOCATION	0.99+
Chandra Mukhyala	PERSON	0.99+
Facebook	ORGANIZATION	0.99+
Chandra	PERSON	0.99+
two parts	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
billions	QUANTITY	0.99+
EMC	ORGANIZATION	0.99+
Dell	ORGANIZATION	0.99+
DataWorks Summit	EVENT	0.99+
Swift	TITLE	0.99+
early 2000s	DATE	0.99+
One	QUANTITY	0.99+
one problem	QUANTITY	0.99+
DataWorks Summit	EVENT	0.99+
Cloudera	ORGANIZATION	0.99+
S3	TITLE	0.98+
one	QUANTITY	0.98+
both	QUANTITY	0.98+
MapR	ORGANIZATION	0.98+
first	QUANTITY	0.98+
Spectrum Scale	TITLE	0.97+
ten years ago	DATE	0.97+
two years ago	DATE	0.97+
first questions	QUANTITY	0.96+
first companies	QUANTITY	0.96+
billions of objects	QUANTITY	0.95+
Hadoop	TITLE	0.95+
#DW17	EVENT	0.95+
one point	QUANTITY	0.95+
2017	EVENT	0.94+
decades	QUANTITY	0.94+
one business function	QUANTITY	0.94+
zero	QUANTITY	0.94+
a year and a half	DATE	0.93+
DataWorks Summit Europe 2017	EVENT	0.92+
one dataset	QUANTITY	0.92+
one way	QUANTITY	0.92+
three different things	QUANTITY	0.92+
DataWorks 2017	EVENT	0.91+
SMB	TITLE	0.91+
CapEx	ORGANIZATION	0.9+
last one year	DATE	0.89+

Alan Nance, Virtual Clarity - DataWorks Summit Europe 2017 - #DW17 - #theCUBE

>> Narrator: At the DataWorks Summit, Europe 2017. Brought to you by Hortonworks. >> Hey, welcome back everyone. We're here live from Munich, Germany at DataWorks 2017, Hadoop Summit formerly, the conference name before it changed to DataWorks. I'm John Furrier with my cohost Dave Vellante. Our next guest, we're excited to have Alan Nance who flew in, just for the CUBE interview today. Executive Vice President with Virtual Clarity. Former star, I call practitioner of the Cloud, knows the Cloud business. Knows the operational aspects of how to use technology. Alan, it's great to see you. Thanks for coming on the CUBE. >> Thank you for having me again. >> Great to see you, you were in the US recently, we had a chance to catch up. And one of the motivations that we talked with you today was, a little bit about some of the things you're looking at, that are transformative. Before we do that, let's talk a little about your history. And what your role is at Virtual Clarity. >> So, as you guys have, basically, followed that career, I started out in the transformation time with ING Bank. And started out, basically, technology upwards. Looking at converged infrastructure, converged infrastructure into VDI. When you've got that, you start to look at Clouds. Then you start to experiment with Clouds. And I moved from ING, from earlier experimentation, into Philips. So, while Philips, at that time had both the health care and lighting group. And then you start to look at consumption based Cloud propositions. And you remember the big thing that we were doing at that time, when we identified that 80% of the IT spend was non differentiating. So the thing was, how do we get away from almost a 900 million a year spend on legacy? How do we turn that into something that's productive for the Enterprise? So we spent a lot of time creating the consumption based infrastructure operating platform. A lot of things we had to learn.
Because let's be honest, Amazon was still trying to become the behemoth it is now. IBM still didn't get the transition, HP didn't get it. So there was a lot of experimentation on which of the operating model-- >> You're the first mover on the operating model, The Cloud, that has scaled to it. And really differentiated services for your business, for also, cost reductions. >> Cost reductions have been phenomenal. And we're talking about halving the budget over a three year period. We're talking about 500 million a year savings. So these are big, big savings. The thing I feel we still need to tackle, is that when we re-platform your business, it should lead to agile acceleration of your growth path. And I think that's something that we still haven't conquered. So I think we're getting better and better at using platforms to save money, to suppress the expenditure. What we now need to do is to convert that into growth platform business. >> So, how about the data component? Because you were CIO of infrastructure at Philips. But lately, you've been really spending a lot of time thinking about the data, how data adds value. So talk about your data journey. >> Well if I look at the data journey, the journey started for me, with, basically, a meeting with Tom Ritz in 2013. And he came with a very, very simple proposition. "You guys need to learn how to create "and store, and reason over data, "for the benefit of the Enterprise." And I think, "Well that's cool." Because up until that point, nobody had really been talking about data. Everyone was talking about the underlying technologies of the Cloud, but not really of the data element. And then we had a session with JP Rangaswami, who was at Salesforce, who basically, also said, "Well don't just think "about data lakes, but think also "about data streams and data rivers. "Because the other thing that's "going to happen here is that data's "not going to be stagnant in a company like yours."
So we took that, and what happened, I think, in Philips, which I think you see in a lot of companies, is an explosion across the Enterprise. So you've got people in social doing stuff. You got CDO's appearing. You've got the IOT. You've got the old, legacy systems, the systems of record. And so you end up with this enormous fragmentation of data. And with that you get a Wild West of what I call data stewardship. So you have a CDO who says, "Well I'm in charge of data." And you got a CMO who says, "Well I'm in charge of marketing data." Or you've got a CSO, says, "Yeah, "but I'm the security data guy." And there's no coherence, in terms of moving the Enterprise forward. Because everybody's focused on their own functionality around that data and not connecting it. So where are we now? I think right now we have a huge proliferation of data that's not connected, in many organizations. And I think we're going to hybrid but I don't think that's a future proof thing for most organizations. >> John: What do you mean by that? >> Well, if I look at what a lot of those suppliers are saying, they're really saying, "The solution "that you need, is to have a hybrid solution "between the public Cloud and your own Cloud." I thought, "But that's not the problem "that we need to solve." The problem that we need to solve is first of all, data gravity. So if I look at all the transformations that are running into trouble, what do they forget? When we go out and do IOT, when we go out and do social media analysis, it all has to flow back into those legacy systems. And those legacy systems are all going to be in the old world. And so you get latency issues, you get formatting issues. And so, we have to solve the data gravity issue. And we have to also solve this proliferation of stewardship. Somebody has to be in charge of making this work. And it's not going to be, just putting in a hybrid solution. Because that won't change the operating model.
>> So let me ask the question, because on one of the things you're kind of dancing around, Dave brought up the data question. Something that I see as a problem in the industry, that hasn't yet been solved, and I'm just going to throw it out there. The CIO has always been the guy managing IT. And then he would report to the CFO, get the budget, blah, blah, blah. We know that's kind of played out its course. But there's no operational playbook to take the Cloud, mobile data at scale, that's going to drive the transformative impact. And I think there's some people doing stuff here and there, pockets. And maybe there's some organizations that have a cadence of managers, that are doing compliance, security, blah, blah, blah. But you have a vision on this. And some information that you're tracking around. An architecture that would bring it to scale. Could you share your thoughts on this operational model of Cloud, at a management level? >> Well, part of this is also based on your own analyst, Peter Burris. When he says, "The problem with data "is that its value is inverse to its half life." So, what the Enterprise has to do is it has to get to analyzing and making this data valuable, much, much faster than it is right now. And Chris Selland of Unifi recently said, "You know, the problem's not big data. "The problem's fast data." So, now, who is best positioned in the organization to do this? And I believe it's the COO. >> John: Chief Operating Officer? >> Chief Operating Officer. I don't think it's going to be the CIO. Because I'm trying to figure out who's got the problem. Who's got the problem of connecting the dots to improving the operation of the company? Who is in charge of actually creating an operating platform that the business can feed off of? It's the COO. >> John: Why not the CFO? >> No, I think the CFO is going to be a diminishing value, over time. Because a couple of reasons. First of all, we see it in Philips.
There's always going to be a fiduciary role for the CFO. But we're out of the world of capex. We're out of the world of balancing assets. Everything is now virtual. So really, the value of a CFO, as sitting on the tee, if I use the racquetball, the CFO standing on the tee is not going to bring value to the Enterprise. >> And the CIO doesn't have the business juice, is your argument? Is that right? >> It depends on the CIO. There are some CIO's out there-- >> Dave: But in general, we're generalizing. >> Generally not. Because they've come through the ranks of building applications, which now has to be thrown away. They've come through the ranks of technology, which is now less relevant. And they've come through the ranks of having huge budgets and huge people to deploy certain projects. All of that's going away. And so what are you left with? Now you're left with somebody who absolutely has to understand how to communicate with the business. And that's what they haven't done for 30 years. >> John: And streamline business process. >> Well, at least get involved in the conversation. At least get involved in the conversation. Now if I talk to business people today, and you probably do too, most of them will still say there's this huge communication gulf. Between what we're trying to achieve and what the technology people are doing with our goals. I mean, I was talking to somebody the other day. And this lady heads up the sales for a global financial institution. She's sitting on the business side of this. And she's like, "The conversation should be "about, if our company wants to improve "our cost income ratio, and they ask me, "as sales to do it, I have to sell 10 times "more to make a difference. "Than if IT would save money. "So for every Euro they save. "And give me an agile platform, "is straight to the bottom line. "Every time I sell, because of our "cost income ratio, I just can't sell against that.
"But I can't find on the IT side, "anybody who, sort of, gets my problem. "And is trying to help me with it." And then you look at her and what? You think a hybrid solution's going to help her? (laughs) I have no idea what you're talking about. >> Right, so the business person here then says, "I don't really care where it runs." But to your point, you care about the operational model? >> Alan: Absolutely. >> And that's really what Cloud should be, right? >> I think everybody who's going to achieve anything from an investment in Cloud, will achieve it in the operating world. They won't just achieve it on the cost savings side. Or on making costs more transparent, or more commoditized. Where it has to happen is in the operating model. In fact, we actually have data of a very large, transportation, logistics company, who moved everything that they had, in an attempt to be in a zero Cloud. And on the benchmark, saved zero. And they saved zero because they weren't changing the operating model. So they were still-- >> They lifted and shifted, but didn't change the operational mindset. >> Not at all. >> But there could have been business value there. Maybe things went faster? >> There could have been. >> Maybe simpler? >> But I'm not seeing it. >> Not game changing. >> Not game changing, certainly yes. >> Not as meaningful, it was a stretch. >> Give an example of a game changing scenario. >> Well for me, and I think this is the next most exciting thing. Is this idea of platforms. There's been an early adoption of this in Telco. Where we've seen people coming in and saying, "If you stock all of this IT, as we've known it, "and you leverage the ideas of Cloud computing, "to have scalable, invisible, infrastructure. "And you put a single platform on top of it "to run your business, you can save money." Now, I've seen business cases where people who are about to embark on this program are taking a billion a year out of their cost base.
And in this company, it's 1/7th of their total profit. That's a game changer, for me. But now, who's going to help them do that? Who's going to help them-- >> What's the platform look like? >> And a million's a lot of money. >> Let's go, grab a sheet of paper how we-- >> So not everybody will even have a billion-- >> But that gets the attention of certainly, the CEO, the COO, CFO says, "Tell me more." >> You're alluding to it, Dave. You need to build a layer to punch, to doing that. So you need to fix the data stewardship problem. You have to create the invisible infrastructure that enables that platform. And you have to have a platform player who is prepared to disrupt the industry. And for me-- >> Dave: A Cloud player. >> A Cloud player, I think it's a born in the Cloud player. I think, you know, we've talked about it privately. >> So who are the forces to attract? You got Microsoft, you got AWS, Google, maybe IBM, maybe Oracle. >> See, I think it's Google. >> Dave: Why, why do you think it's Google? >> I think it's because, the platforms that I'm thinking of, and if I look in retail, if I look in financial services, it's all about data. Because that's the battle, right. We all agree, the battle's on data. So it's got to be somebody who understands data at scale, understands search at scale, understands deep learning at scale. And understands technology enough to build that platform and make it available in a consumption model. And for me, Google would be the ideal player, if they would make that step. Amazon's going to have a different problem because their strategy's not going down that route. And I think, for people like IBM or Oracle, it would require cannibalizing too much of their existing business. But they may dally with it. And they may do it in a territory where they have no install base. But they're not going to be disrupting the industry. I just don't think it's going to be possible for them. 
>> And you think Google has the Enterprise chops to pull it off? >> I think Google has the platform. I would agree with Alan on this. Something, I've been very critical on Google. Dave brings this up because he wants me to say it now, and I will. Google is well positioned to be the platform. I am very bullish on Google Cloud with respect to their ability to moon shot or slingshot to the future faster, than, potentially others. Or as they say in football, move the goal posts and change the game. That being said, where I've been critical of Google, and this is where, I'll be critical, is their dogma is very academic, very, "We're the technology leader, "therefore you should use Google G Suite." I think that they have to change their mindset, to be more Enterprise focused, in the sense of understand not the best product will always win, but the B chip they have to develop, have to think about the Enterprise. And that's a lot of white glove service. That's a lot of listening. That's not being too arrogant. I mean, there's a borderline between confidence and arrogance. And I think Google crosses it a little bit too much, Dave. And I think that's where Google recognizes, some people in Google recognize that they don't have the Enterprise track record, for sure on the sales side. You could add 1,000 sales reps tomorrow but do they have experience? So there's a huge translation issue going on between Google's capability and potential energy. And then the reality of them translating that into an operational footprint. So for them to meet the mark of folks like you, you can't be speaking Russian and English. You got to speak the same language. So, the language barrier, so to speak, the linguistics is different. That's my only point. >> I sense in your statements, there's a frustration here. Because we know that the key to some really innovative, disruption is with Google. And I think what we'd all like to do, even while I was addressing the camera. 
I'd love to see Diane, who does understand Enterprise, who's built a whole career servicing Enterprises extremely well, I'd like to see a little bit of a glimpse of, "We are up for this." And I understand when you're part of the bigger Google, the numbers are a little bit skewed against you to make a big impact and carry the firm with you. But I do believe there's an enormous opportunity in the Enterprise space. And people are just waiting for this. >> Well Diane Greene knows the Enterprise. So she came in, she's got to change the culture. And I know she's doing it. Because I have folks at Google, that I know that work there, that tell me privately, that it's happening, maybe not fast enough. But here's the thing. If you walked in the front door at Google, Alan Nance, this is my point, and said, "I have experience and I have a plan to build a platform, to knock a billion dollars off seven companies that I know, personally. That I can walk in and win. And move a billion dollars to their bottom line with your platform." They might not understand what that means. >> I don't know, you know I was at Google Next a few weeks ago, last month. And I thought they were more, to your point, open to listening. Maybe not as arrogant as you might be presenting. And somewhat more humble. Still pretty ballsy. But I think Google recognizes that it needs help in the Enterprise. And here's why. Something that we've talked about in the past, is, you've got top down initiatives. You've got bottom up initiatives. And you've got middle out. What frequently happens, and I'd love for you to describe your experiences. The leaders say, the top CXOs say, "Okay we're going." And they take off and the organization doesn't follow them. If it's bottoms up, you don't have the top down in premature. So how do you address that? What are you seeing and how do you address that problem? >> So I think that's a really, really good observation.
I mean, what I see in a lot of the big transformations that I've been involved in, is that speed is of the essence. And I think when CEOs, because usually it's the CEO. The CEO comes in and they think they've got more time than they actually have to make the impact in the Enterprise. And it doesn't matter if they're coming in from the outside or they've grown up. They always underestimate their ability to do change, in time. And now what's changed over the past few years, is the average tenure of a CEO is six years. You know, I mean, Jack Welch was 20 years at GE. You can do a lot of damage in 20 years. And he did a lot of great things at GE over a 20 year period. You've only got six years now. And what I see in these big transformation programs is they start with a really good vision. I mean McKinsey, Bain, Boston. They know the essence of what needs to happen. >> Dave: They can sell the dream. >> They can sell the dream. And the CEO sort of buys into it. And then immediately you get into the first layer, "Okay, okay, so we've got to change the organization." And so you bring in a lot of these companies that will run 13 work streams over three years, with hundreds of people. And at the end of that time, you're almost halfway through your tenure. And all you've got is a new design. Or a new set of job descriptions or strategies. You haven't actually achieved anything. And then the layer down is going to run into real problems. One of the problems that we had at the company I worked at before, was in order to support these platforms you needed really good master data management. And we suddenly realized that. And so we had to really put in an accelerated program to achieve that, with Informatica. We did it, but it cost us a year and a half. At a bank I know, they can't move forward because they're looking at 700 million of technology debt they can't get past. So they end up going down a route of, "Maybe one of these big suppliers can buy our old stuff.
"And we can tag on some transformational "deal at the back end of that." None of those are working. And then what happens is, in my mind, if the CEO, from what I see, has not achieved escape velocity at the end of year three. So he's showing the growth, or she's showing the digital transformation, it's kind of game over. The Enterprise has already figured out they've stalled it long enough, not intentionally. And then we go back into an austerity program. Because you got to justify the millions you've spent in the last three years. And you've got nothing to show for it. >> And you're preparing three envelopes. >> So you got to accelerate those layers. You got to take layers out and you've got to have a really, I would say almost like, 90 day iteration plans that show business outcomes. >> But the technology layer, you can put in an abstraction layer, use APIs and infrastructure as code, all that cool stuff. But you're saying it's the organizational challenges. >> I think that's the real problem. It is the real problem, is the organization. And also, because what you're really doing in terms of the Enterprise, is you're moving from a more traditional supply chain that you own. And you've matriculated with SAP or with Oracle. Now you're talking about creating a digital value chain. A digital value chain that's much more based on a more mobile ecosystem, where you would have thin text in one area or insurance text, that have to now fit into an agile supply chain. It's all about the operating model. If you don't have people who know how to drive that, the technology's not going to help you. So you've got to have people on the business side and the technology side coming together to make this work. >> Alan, I have a question for you. What's you're prediction, okay, knowing what you know. And kind of, obviously, you have some frustrations in platforms with trying to get the big players to listen. And I think they should listen to you. But this is going to happen. 
So I would believe that what you're saying with the COO, operational things radically changing differently. Obviously, the signs are all there. Data centers are moving into the Cloud. I mean this is radical stuff, in a good way. And so, what's your prediction for how this plays out vis a vis Amazon Web Services, Google Cloud Platform, Azure, IBM Cloud SoftLayer. >> Well here's my concern a little bit. I think if Google enters the fray I think everybody will reconfigure. Because if we'd assume that Google plays to its strengths and goes out there and finds the right partners, it's going to reconfigure the industry. If they don't do that, then what the industry's going to do is what it's done. Which means that the platforms are going to be hybrid platforms that are dominated by the traditional players. By the SAPs, by the Oracles, by the IBMs. And what I fear is that there may actually be a disillusionment. Because they will not bring the digital transformation and all the wonderful things that we all know are out there to be gained. So you may get, "We've invested all this money." You see it a little bit with big data. "I've got this huge layer. I've got petabytes. Why am I not smarter? Why is my business not going so much better? I've put everything in there." I think we've got to address the operating problem. And we have to find a dialogue at the C Suite. >> Well to your point, and we talked about this. You know, you look at the core of Enterprise apps, the Oracle stuff is not moving in droves to the Cloud. Oracle's freezing the market right now. Betting that it can get there before the industry gets there. And if it does-- >> Alan: It's not. >> And it might, but if it does, it's not going to be that radical transformation you're prescribing. >> They have too much to lose. Let's be honest, right. So Oracle is a victim of its own success, pretty much like SAP. It has to go to the Cloud as a defensive play.
Because the last thing either of those want is to be disintermediated by Amazon. Which may or may not happen anyway. Because a lot of companies will disintermediate if they can. Because the licensing is such a painful element for most enterprises, when they deal with these companies. So they have to believe that the platform is not going to look like that. >> And they're still trying to figure out the pricing models, and the margin models, and Amazon's clearly-- >> You know what's driving the pricing models is not the growth on the consumer side. >> Right, absolutely. >> That's not what's driving it. So I think we need another player. I really think we need another player. If it's not Google, somebody else. I can't think who would have the scale, the money to-- >> The only guys who have the scale, you got Tencent, maybe a couple China Clouds, maybe one Japan Cloud and that's it. >> To be honest, you raise a good point. I haven't really looked at the Alibabas and the other people like that who may pick up that mantle. I haven't looked at them. Alibaba's interesting, because just like Amazon, they have their own business that runs on platforms. And a very diverse business, which is growing faster than Amazon and is more profitable than Amazon. So they could be interesting. But I'm still hopeful. We should figure this out. >> Google should figure it out. You're absolutely right. They're investing, and I thought they put forth pretty good messaging at Google Next. You covered it remotely but I think they understand the opportunity. And I think they have the stomach for it. >> We had reporters there as well, at the event. We just did, they came to our studio. Google is self aware that they need to work on the Enterprise. I think the bigger thing that you're highlighting is the operational model is shifting to a scale point where it's going to change stewardship and COO meaning to be, I like that.
The other thing I want to get your reaction to is something I heard this morning, on the CUBE from Shaun Connolly. Which goes with some of the things that we're seeing, where you're seeing Cloud becoming a more centralized view. Where IOT is an Edge case. So you have now, issues around architectural things. Your thoughts and reaction to this balance between Edge and Cloud. >> Well I think this is where you're also going to have your data gravity challenge. So, Dave McCrory has written a lot about the concept of data gravity. And in my mind, too many people in the Enterprise don't understand it. Which is basically, that data attracts more data. And the more data you have, it'll attract more. And then you create all these latency issues when you start going out to the Edge. Because when we first went out to the Edge I think, even at Philips, we didn't realize how much interaction needed to come back. And that's going to vary from company to company. So some companies are going to want to have that data really quickly because they need to react to it immediately. Others may not have that. But what you do have is you have this balancing act. About, "What do I keep central? And what do I put at the Edge?" I think Edge technology is amazing. And when we first looked at it, four years ago, I mean, it's come such a long way. And what I am encouraged by is that, that data layer, so the layer that Shaun talks about, there's a lot of exciting things happening. But again, my problem is what's the Enterprise going to do with that? Because it requires a different operating model. If I take an example of a manufacturing company, I know a manufacturing company right now that does work in China. And it takes all the data back to its central mainframes for processing. Well if you've got the Edge, you want to be changing the way you process. Which means that the decision makers on the business need to be in situ. They need to be in China.
And we need to be bringing systems of record data and combining it with local social data and Edge data, so we get better decisions. So we can drive growth in those areas. If I just enable it with technology but don't change the business model the business is not going to grow. >> So Alan, we always loved having you on. Great practitioner, but now you've kind of gone over to the dark side. We've heard of a company called Virtual Clarity. Tell us about what you're doing there. >> So what we're vested in, what I am very much vested in, with my team at Virtual Clarity, is creating this concept of precision guided transformation. Where you work on the business, on what are the outcomes we really need to get from this? And then we've combined, I would say it's like a data nerve center. So we can quickly analyze, within a matter of weeks, where we are with the company, and what routes to value we can create. And then we'll go and do it. So we do it in 90 day increments. So the business now starts to believe that something's really going to happen. None of these big, "insert miracle here" after three year programs. But actually going out and doing it. The second thing that I think that we're doing that I'm excited about is bringing in enlightened people who represent the Enterprise. So, one of my colleagues, former COO of Unilever, we just brought on a very smart lady, Dessa Glasser, who was the CDO at JP Morgan Chase. And the idea is to combine the insights that we have on the demand side, the buy side, with the insights that we have on the technology side to create better operating models. So that combination of creating a new view that is acceptable to the C Suite. Because these people understand how you talk to them. But at the same time, runs on this concept of doing everything quickly. That's what we're about right now. >> That's awesome, we should get you hooked up with our new analyst we just hired, James Kobielus, from IBM. Was focusing on exactly that.
The intersections of developers, Cloud, AI, machine learning and data, all coming together. And IOT is going to be a key application that we're going to see coming out of that. So, congratulations. Alan, thank you for spending the time to come in. >> Thanks for allowing me. >> To see us in the CUBE. It's the CUBE, bringing you more action. Here from DataWorks 2017. I'm John Furrier with my cohost Dave Vellante, here on the CUBE, SiliconANGLE Media's flagship program. Where we've got the events, straight from SiliconANGLE. Stay with us for more great coverage. Day one of two days of coverage at DataWorks 2017. We'll be right back.

Published Date : Apr 5 2017


ENTITIES

Entity	Category	Confidence
Alan	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Virtual Clarity	ORGANIZATION	0.99+
Chris Sellender	PERSON	0.99+
Diane Greene	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Dave	PERSON	0.99+
Dave Vallante	PERSON	0.99+
John	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Jack Welch	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
Dave McCrory	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Dessa Grassa	PERSON	0.99+
Peter Boris	PERSON	0.99+
Google	ORGANIZATION	0.99+
2013	DATE	0.99+
China	LOCATION	0.99+
JP Rangaswami	PERSON	0.99+
10 times	QUANTITY	0.99+
James Corbelius	PERSON	0.99+
HP	ORGANIZATION	0.99+
Unifi	ORGANIZATION	0.99+
Sean	PERSON	0.99+
Diane	PERSON	0.99+
six years	QUANTITY	0.99+
Alan Nance	PERSON	0.99+
Phillips	ORGANIZATION	0.99+
US	LOCATION	0.99+
John Furrier	PERSON	0.99+
ING Bank	ORGANIZATION	0.99+
700 million	QUANTITY	0.99+
JP Morgan Chase	ORGANIZATION	0.99+
10 cents	QUANTITY	0.99+
20 years	QUANTITY	0.99+
Sean Connelly	PERSON	0.99+
Unilever	ORGANIZATION	0.99+
90 day	QUANTITY	0.99+
30 years	QUANTITY	0.99+
80%	QUANTITY	0.99+
GE	ORGANIZATION	0.99+
IBMs	ORGANIZATION	0.99+
last month	DATE	0.99+
ING	ORGANIZATION	0.99+
Tom Ritz	PERSON	0.99+
three year	QUANTITY	0.99+
Amazon Web Services	ORGANIZATION	0.99+

Shaun Connolly, Hortonworks - DataWorks Summit Europe 2017 - #DW17 - #theCUBE

>> Announcer: Coverage DataWorks Summit Europe 2017 brought to you by Hortonworks. >> Welcome back everyone. Live here in Munich, Germany for theCUBE's special presentation of Hortonworks Hadoop Summit now called DataWorks 2017. I'm John Furrier, my co-host Dave Vellante, our next guest is Shaun Connolly, Vice President of Corporate Strategy, Chief Strategy Officer. Shaun great to see you again. >> Thanks for having me guys. Always a pleasure. >> Super exciting. Obviously we're always pontificating on the status of Hadoop, and "Hadoop is dead, long live Hadoop," but rumors of its demise are greatly exaggerated. The reality is that there are no major shifts in the trends other than the fact that the amplification with AI and machine learning has upleveled the narrative around data to the mainstream. Big data has been written on, on gen one, on Hadoop, DevOps, culture, open-source. Starting with Hadoop you guys certainly have been way out in front of all the trends, in how you guys have been rolling out the products. But it's now with IoT and AI as that sizzle, the future self driving cars, smart cities, you're starting to really see demand for comprehensive solutions that involve data-centric thinking. Okay, that's one. Two, open-source continues to dominate. MuleSoft went public, you guys went public years ago, Cloudera filed their S-1. A crop of public companies that are open-source, haven't seen that since Red Hat. >> Exactly. '99 is when Red Hat went public. >> Data-centric, big megatrend with open-source powering it, you couldn't be happier for the stars lining up. >> Yeah, well we definitely placed our bets on that. We went public in 2014 and it's nice to see that graduating class of Talend and MuleSoft, Cloudera coming out. That, I think, just helps socialize this movement of enterprise open-source, whether it's for on-prem or powering cloud solutions pushed out to the edge, and technologies that are relevant in IoT. That's the wave.
We had a panel earlier today where Dahl Jeppe from Centrica British Gas was talking about his ... The digitization of energy and virtual power plant notions. He can't achieve that without open-source powering and fueling that. >> And the thing about it is just kind of ... For me personally, being my age, in this generation of the computer industry since I was 19, to see open-source go mainstream the way it is, it even gets better every time, but it really is the thousand flowers bloom strategy. Throwing the seeds out there of innovation. I want to ask you as a strategy question, you guys from a performance standpoint, I would say kind of got hammered in the public market. Cloudera's valuation privately is 4.1 billion, you guys are close to 700 million. Certainly Cloudera's going to get a haircut looks like. The public market is based on the multiples from Dave and I's intro, but there's so much value being created. Where's the value for you guys as you look at the horizon? You're talking about white spaces that are really developing with use cases that are creating value. The practitioners in the field creating value, real value for customers. >> So you covered some of the trends, but I'll translate 'em into how the customers are deploying. Cloud computing and IoT are somewhat related. One is a centralization, the other is decentralization, so it actually calls for a connected data architecture as we refer to it. We're working with a variety of IoT-related use cases. Coca-Cola East Japan spoke at Tokyo Summit about beverage replenishment analytics. Getting vending machine analytics from vending machines even on Mount Fuji. And optimizing their flow-through of inventory in just-in-time delivery. That's an IoT-related story running on Azure. It's a cloud-related story and it's a big data analytics story that's actually driving better margins for the business and actually better revenues because they're getting the inventory where it needs to be so people can buy it.
Those are really interesting use cases that we're seeing being deployed and it's at this convergence of IoT, cloud and big data. Ultimately that leads to AI, but I think that's what we're seeing the rise of. >> Can you help us understand that sort of value chain. You've got the edge, you got the cloud, you need something in-between, you're calling it connected data platform. How do you guys participate in that value chain? >> When we went public our primary workhorse platform was Hortonworks Data Platform. We had first class cloud services with Azure HDInsight and Hortonworks Data Cloud for AWS, curated cloud services pay-as-you-go, and Hortonworks DataFlow, which I call our connective tissue. It manages all of your data motion, it's a data logistics platform, it's like FedEx for data delivery. It goes all the way out to the edge. There's a little component called MiNiFi, mini and NiFi, which does secure intelligent analytics at the edge and transmission. These smart manufacturing lines, you're gathering the data, you're doing analytics on the manufacturing lines, and then you're bringing the historical stuff into the data center where you can do historical analytics across manufacturing lines. Those are the use cases for the connected data architecture-- >> Dave: A subset of that data comes back, right? >> A subset of the data, yep. The key events of that data, it may not be full of-- >> 10%, half, 90%? >> It depends. If you have operational events that you want to store, sometimes you may want to bring full fidelity of that data so you can do ... As you manufacture stuff and when it got deployed and you're seeing issues in the field, like Western Digital hard drives that fail in the field, they want that data in full fidelity, with the connected data architecture and analytics around that data. You need to ... One of the terms I use is in the new world, you need to play it where it lies. If it's out at the edge, you need to play it there.
If it makes a stop in the cloud, you need to play it there. If it comes into the data center, you also need to play it there. >> So a couple years ago, you and I were doing a panel at our Big Data NYC event and I used the term "profitless prosperity," I got the hairy eyeball from you, but nonetheless, we talked about you guys as a steward of the industry, you have to invest in open-source projects. And it's expensive. I mean HDFS itself, YARN, Tez, you guys lead a lot of those initiatives. >> Shaun: With the community, yeah, but we-- >> With the community yeah, but you provided contributions and co-leadership let's say. You're there at the front of the pack. How do we project it forward without making forward-looking statements, but how does this industry become a cashflow positive industry? >> We've been a public company since the end of 2014. The markets turned at the beginning of 2016: prior to that, high growth with some losses was palatable; after that, losses were not palatable. That hit us, Splunk, Tableau, most of the IT sector. That's just the nature of the public markets. As more public open-source, data-driven companies come in I think it will better educate the market on the value. There's only so much I can do to control the stock price. What I can do from a business perspective is hit key measures on a path to profitability. At the end of Q4 2016, we hit what we call adjusted EBITDA breakeven, which is a stepping stone. On our earnings call at the end of 2016 we ended with 185 million in revenue for the year. Only five years into this journey, so that's a hard revenue growth pace, and we basically stated that in Q3 or Q4 of '17, we will hit operating cashflow neutrality. So we are operating business-- >> John: But you guys also hit a 100 million at record pace too, I believe. >> Yeah, in four years. So revenue is one thing, but operating margins, like if you look at our margins on our subscription business for instance, we've got 84% margin on that.
It's a really nice margin business. We can make that better margins, but that's a software margin. >> You know what's ironic, we were talking about Red Hat off camera. Here's Red Hat kicking butt, really hitting all cylinders, three billion dollars in bookings, one would think, okay hey I can maybe project forth some of these open-source companies. Maybe the flip side of this, oh wow we want it now. To your point, the market kind of flipped, but you would think that Red Hat is an indicator of how an open-source model can work. >> By the way Red Hat went public in 99, so it was a different trajectory, like you know I charted their trajectory out. Oracle's trajectory was different. They didn't even in inflation adjusted dollars they didn't hit a 100 million in four years, I think it was seven or eight years or what have you. Salesforce did it in five. So these SaaS models and these subscription models and the cloud services, which is an area that's near and dear to my heart. >> John: Goes faster. >> You get multiple revenue streams across different products. We're a multi-products cloud service company. Not just a single platform. >> So we were actually teasing this out on our-- >> And that's how you grow the business, and that's how Red Hat did it. >> Well I want to get your thoughts on this while we're just kind of ripping live here because Dave and I were talking on our intro segment about the business model and how there's some camouflage out there, at least from my standpoint. One of the main areas that I was kind of pointing at and trying to poke at and want to get your reaction to is in the classic enterprise go-to-market, you have sales force expansive, you guys pay handsomely for that today. Incubating that market, getting the profitability for it is a good thing, but there's also channels, VARs, ISVs, and so on. You guys have an open-source channel that kind of not as a VAR or an ISV, these are entrepreneurs and or businesses themselves. 
There's got to be a monetization shift there for you guys in the subscription business certainly. When you look at these partners, they're co-developing, they're in open-source, you can almost see the dots connecting. Is this new ecosystem, there's always been an ecosystem, but now that you have kind of a monetization inherently in a pure open distribution model. >> It forces you to collaborate. IBM was on stage talking about our system certified on the Power Systems. Many may look at IBM as competitive, we view them as a partner. Amazon, some may view them as a competitor with us, they've been a great partner in our for AWS. So it forces you to think about how do you collaborate around deeply engineered systems and value and we get great revenue streams that are pulled through that they can sell into the market to their ecosystems. >> How do you vision monetizing the partners? Let's just say Dave and I start this epic idea and we create some connective tissue with your orchestrator called the Data Platform you have and we start making some serious bang. We make a billion dollars. Do you get paid on that if it's open-source? I mean would we be more subscriptions? I'm trying to see how the tide comes in, whose boats float on the rising tide of the innovation in these white spaces. >> Platform thinking is you provide the platform. You provide the platform for 10x value that rides atop that platform. That's how the model works. So if you're riding atop the platform, I expect you and that ecosystem to drive at least 10x above and beyond what I would make as a platform provider in that space. >> So you expect some contributions? >> That's how it works. You need a thousand flowers to be running on the platform. >> You saw that with VMware. They hit 10x and ultimately got to 15 or 16, 17x. >> Shaun: Exactly. >> I think they don't talk about it anymore. I think it's probably trading the other way. >> You know my days at JBoss Red Hat it was somewhere between 15 to 20x. 
That was the value that was created on top of the platforms. >> What about the ... I want to ask you about the forking of the Hadoop distros. I mean there was a time when everybody was announcing Hadoop distros. John Furrier announced SiliconANGLE was announcing a Hadoop distro. So we saw consolidation, and then you guys announced the ODP, then the ODPI initiative, but there seems to be a bit of a forking in Hadoop distros. Is that a fair statement? Unfair? >> I think if you look at how the Linux market played out, you clearly had Red Hat, you had Canonical Ubuntu, you had SUSE. You're always going to have curated platforms for different purposes. We have a strong opinion and a strong focus in the area of IoT, fast analytic data from the edge, and a centralized platform with HDP in the cloud and on-prem. Others in the market, Cloudera is running sort of a different play where they're curating different elements and investing in different elements. Doesn't make either one bad or good, we are just going after the markets slightly differently. The other point I'll make there is in 2014 if you looked at the Venn diagrams, there was a lot of overlap. Now if you draw the areas of focus, there's a lot of white space that we're going after that they aren't going after, and they're going after other places and other new vendors are going after others. With the market dynamics of IoT, cloud and AI, you're going to see folks chase the market opportunities. >> Is that diversity not a problem for customers now or is it challenging? >> There has to be a core level of interoperability and that's one of the reasons why we're collaborating with folks in the ODPI, as an example. There's still, when it comes to some of the core components, there has to be a level of predictability, because if you're an ISV riding atop, you're slowed down by death by infinite certification and choices. So ultimately it has to come down to just a much more sane approach to what you can rely on. 
>> When you guys announced ODP, then ODPI, the extension, Mike Olson wrote a blog saying it's not necessary, people came out against it. Now we're three years in looking back. Was he right or not? >> I think the ODPI takeaway this year is there's more that we can do above and beyond the Hadoop platform. It's expanded to include SQL and other things recently, so there's been some movement on this spec, but frankly you talk to John Mertic at ODPI, you talk to SAS and others, I think we want to be a bit more aggressive in the areas that we go after and try and drive there from a standardization perspective. >> We had Wei Wang on earlier-- >> Shaun: There's more we can do and there's more we should do. >> We had Wei on with Microsoft at our Big Data SV event a couple weeks ago. Talk about the Microsoft relationship with you guys. It seems to be doing very well. Comments on that. >> Microsoft was one of the two companies we chose to partner with early on, so in 2011, 2012 Microsoft and Teradata were the two. Microsoft was how do I democratize and make this technology easy for people. That's manifested itself as Azure Cloud Service, Azure HDInsight-- >> Which is growing like crazy. >> Which is globally deployed and we just had another update. It's fundamentally changed our engineering and delivery model. This latest release was a cloud-first delivery model, so one of the things that we're proud of is the interactive SQL and the LLAP technology that's in HDP, that went out through Azure HDInsight and Hortonworks Data Cloud first. Then it was certified in HDP 2.6 and it went to Power at the same time. It's that cadence of delivery and cloud-first delivery model. We couldn't do it without a partnership with Microsoft. I think we've really learned what it takes-- >> If you look at Microsoft at that time. I remember interviewing you on theCUBE. Microsoft was trading something like $26 a share at that time, around their low point. Now the stock is performing really well. 
Satya Nadella, very cloud oriented-- >> Shaun: They're very open-source. >> They're very open-source and friendly, they've been donating a lot to the OCP, to the data center piece. Extremely different Microsoft, so you slipped into that beautiful spot, reacted on that growth. >> I think as one of the stalwarts of enterprise software providers, I think they've done a really great job of bending the curve towards cloud and still having a mixed portfolio, but incenting a field, and incenting a channel, and selling cloud and growing that revenue stream, that's nontrivial, that's hard. >> They know the enterprise sales motions too. I want to ask you how that's going overall within Hortonworks. What are some of the conversations that you're involved in with customers today? Again we were saying in our opening segment, it's on YouTube if you're not watching, but the customer is the forcing function right now. They're really putting the pressure on the suppliers, you're one of them, to get tight, reduce friction, lower costs of ownership, get into the cloud, flywheel. And so you see a lot-- >> I'll throw in another aspect some of the more late majority adopters traditionally, over and over right here by 2025 they want to power down the data center and have more things running in the public cloud, if not most everything. That's another eight years or what have you, so it's still a journey, but this journey to making that an imperative because of the operational, because of the agility, because of better predictability, ease of use. That's fundamental. >> As you get into the connected tissue, I love that example, with Kubernetes containers, you've got developers, a big open-source participant and you got all the stuff you have, you just start to see some coalescing around the cloud native. How do you guys look at that conversation? 
>> I view container platforms, whether they're container services that are running on cloud or what have you, as the new lightweight rail that everything will ride atop. The cloud currently plays a key role in that, I think that's going to be the de facto way. Particularly if you go cloud-first models, particularly for delivery. You need that packaging notion and you need the agility of updates that that's going to provide. I think Red Hat as a partner has been doing great things on hardening that, making it secure. There's others in the ecosystem as well as the cloud providers. All three cloud providers actually are investing in it. >> John: So it's good for your business? >> It removes friction of deployment ... And I ride atop that new rail. It can't get here soon enough from my perspective. >> So I want to ask about clouds. You were talking about the Microsoft shift, personally I think Microsoft realized holy cow, we could actually make a lot of money if we're selling hardware services. We can make more money if we're selling the full stack. It was sort of an epiphany and so Amazon seems to be doing the same thing. You mentioned earlier you know Amazon is a great partner, even though a lot of people look at them as a competitor, it seems like Amazon, Azure etc., they're building out their own big data stack and offering it as a service. People say that's a threat to you guys, is it a threat or is it a tailwind, is it it is what it is? >> This is why I bring up industry-wide we always have waves of centralization, decentralization. They're playing out simultaneously right now with cloud and IoT. The fact of the matter is that you're going to have multiple clouds, on-prem data and data at the edge. That's the problem I am looking to facilitate and solve. 
I don't view them as competitors, I view them as partners because we need to collaborate because there's a value chain of the flow of the data and some of it's going to be running through and on those platforms. >> The cloud's not going to solve the edge problem. Too expensive. It's just physics. >> So I think that's where things need to go. I think that's why we talk about this notion of connected data. I don't talk hybrid cloud computing, that's for compute. I talk about how do you connect to your data, how do you know where your data is and are you getting the right value out of the data by playing it where it lies. >> I think IoT has been a great sweet trend for the big data industry. It really accelerates the value proposition of the cloud too because now you have a connected network, you can have your cake and eat it too. Central and distributed. >> There's different dynamics in the US versus Europe, as an example. US definitely we're seeing a cloud adoption that's independent of IoT. Here in Europe, I would argue the smart mobility initiatives, the smart manufacturing initiatives, and the connected grid initiatives are bringing cloud in, so it's IoT and cloud and that's opening up the cloud opportunity here. >> Interesting. So on a prospects for Hortonworks cashflow positive Q4 you guys have made a public statement, any other thoughts you want to share. >> Just continue to grow the business, focus on these customer use cases, get them to talk about them at things like DataWorks Summit, and then the more the merrier, the more data-oriented open-source driven companies that can graduate in the public markets, I think is awesome. I think it will just help the industry. >> Operating in the open, with full transparency-- >> Shaun: On the business and the code. (laughter) >> Welcome to the party baby. This is theCUBE here at DataWorks 2017 in Munich, Germany. Live coverage, I'm John Furrier with Dave Vellante. Stay with us. 
More great coverage coming after this short break. (upbeat music)

Published Date : Apr 5 2017


ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Dave Vellante	PERSON	0.99+
John	PERSON	0.99+
Europe	LOCATION	0.99+
Amazon	ORGANIZATION	0.99+
2014	DATE	0.99+
John Furrier	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
John Mertic	PERSON	0.99+
Mike Olson	PERSON	0.99+
Shaun	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Shaun Connolly	PERSON	0.99+
Centric	ORGANIZATION	0.99+
Teradata	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Coca-Cola	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
2016	DATE	0.99+
4.1 billion	QUANTITY	0.99+
Cloudera	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
90%	QUANTITY	0.99+
two	QUANTITY	0.99+
100 million	QUANTITY	0.99+
five	QUANTITY	0.99+
2011	DATE	0.99+
Mount Fuji	LOCATION	0.99+
US	LOCATION	0.99+
seven	QUANTITY	0.99+
185 million	QUANTITY	0.99+
eight years	QUANTITY	0.99+
four years	QUANTITY	0.99+
10x	QUANTITY	0.99+
Dahl Jeppe	PERSON	0.99+
YouTube	ORGANIZATION	0.99+
FedEx	ORGANIZATION	0.99+
Hortonworks	ORGANIZATION	0.99+
100 million	QUANTITY	0.99+
one	QUANTITY	0.99+
MuleSoft	ORGANIZATION	0.99+
2025	DATE	0.99+
Red Hat	ORGANIZATION	0.99+
three years	QUANTITY	0.99+
15	QUANTITY	0.99+
two companies	QUANTITY	0.99+
2012	DATE	0.99+
Munich, Germany	LOCATION	0.98+
Hadoop	TITLE	0.98+
DataWorks 2017	EVENT	0.98+
Wei Wang	PERSON	0.98+
Wei	PERSON	0.98+
10%	QUANTITY	0.98+
eight years	QUANTITY	0.98+
20x	QUANTITY	0.98+
Hortonworks Hadoop Summit	EVENT	0.98+
end of 2016	DATE	0.98+
three billion dollars	QUANTITY	0.98+
SiliconANGLE	ORGANIZATION	0.98+
Azure	ORGANIZATION	0.98+
DataWorks Summit	EVENT	0.97+

Mike Merritt-Holmes, Think Big - DataWorks Summit Europe 2017 - #DW17 - #theCUBE

>> Narrator: Covering DataWorks Summit Europe 2017 brought to you by Hortonworks. (uptempo, energetic music) >> Okay, welcome back everyone. We're here live in Germany at Munich for DataWorks Summit 2017, formerly Hadoop Summit. I'm John Furrier, my co-host Dave Vellante. Our next guest is Mike Merritt-Holmes, Senior Vice President of Global Services Strategy at Think Big, a Teradata company, formerly the co-founder of the Big Data Partnership, which merged in with Think Big and Teradata. Mike, welcome to The Cube. >> Mike: Thanks for having me. >> Great having an entrepreneur on, you're the co-founder, which means you've got that entrepreneurial blood, and I got to ask you, you know, you're in the big data space, you got to be pretty pumped by all the hype right now around AI because that certainly gives a lot of that extra, extra steroid of recognition. People love AI, it gives a face to it, and certainly IoT is booming as well, Internet of Things, but big data's cruising along. >> I mean it's a great place to be. The train is certainly going very, very quickly right now. But the thing for us is, we've been doing data science and AI and trying to build business outcomes, and value for businesses for a long time. It's just great now to see this really, the data science and AI both were really starting to take effect and so companies are starting to understand it and really starting to really want to embrace it which is amazing. >> It's inspirational too, I mean I have a bunch of kids in my family, some are in college and some are in high school, even the younger generation are getting jazzed up on just software, right, but the big data stuff's been cruising along now. It's been a good decade now of really solid DevOps culture, cloud now accelerating, but now the customers are forcing the vendors to be very deliberate in delivering great product, because the demand (chuckling) for real time, the demand for more stuff, is at an all time high. 
Can you elaborate your thoughts on, your reaction to what customers are doing, because they're the ones driving everyone, not to create friction, to create simplicity. >> Yeah, and you know, our customers are global organizations, trying to leverage this kind of technology, and they are, you know, doing an awesome amount of stuff right now to try to move them from, effectively, a step change in their business, whether it's, kind of, shipping companies doing preventive asset maintenance, or whether it's retailers looking to target customers in a more personalized way, or really understand who their customers are, where they come from, they're leveraging all those technologies, and really what they're doing is pushing the boundaries of all of them, and putting more demands on all of the vendors in the space to say, we want to do this quicker, faster, but more easily as well. >> And then the things that you're talking about, I want to get your thoughts on, because this is the conversation that you're having with customers, I want to extract is, have those kind of data-driven mindset questions, have come out of the hype of the Hadoop. So, I mean we've been on a hype cycle for awhile, but now it's back to reality. Where are we with the customer conversations, and, from your stand point, what are they working on? I mean, is it mostly IT conversation? Is it a front-office conversation? Is it a blend of both? Because, you know, data science kind of threads both sides of the fence there. 
>> Yeah, I mean certainly you can't do big data without IT being involved, but since the start, I mean, we've always been engaged with the business, it's always been about business outcome, because you bring data into a platform, you provide all this data science capability, but unless you actually find ROI from that, then there's no point, because you want to be moving the business forward, so it's always been about business engagement, but part of that has always been also about helping them to change their mindset. I don't want a report, I want to understand why you look at that report and what's the thing you're looking for, so we can start to identify that for you quicker. >> What's the coolest conversation you've been in, over the past year? >> Uh, I mean, I can't go into too much detail, but I've had some amazing conversations with companies like Lego, for instance, they're an awesome company to work with. But when you start to see some of the things we're doing, we're doing some amazing object recognition with deep-learning in Japan. We're doing some fraud analytics in the Nordics with deep-learning, we're doing some amazing stuff that's really pushing the boundaries, and when you start to put those deep-learning aspects into real world applications, and you start to see, customers clambering over to want to be part of that, it's a really exciting place to be. >> Let me just double-click on that for a second, because a lot of, the question I get a lot on The Cube, and certainly off-camera is, I want to do deep-learning, I want to do AI, I love machine learning, I hear, oh, it's finally coming to reality so people see it forming. How do they get started, what are some of the best practices of getting involved in deep-learning? Is it using open-source, obviously, is one avenue, but what advice would you give customers? 
>> From a deep-learning perspective, so I think first of all, I mean, a lot of the greatest deep-learning technologies run open-source, as you rightly said, but I think actually there's a lot of tutorials and stuff on there, but really what you need is someone who has done it before, who knows where the pitfalls are, but also knows when to use the right technology at the right time, and also to know around some of the aspects about whether using a deep-learning methodology is going to be the right approach for your business problem. Because a lot of companies are, like, we want to use this deep-learning thing, it's amazing, but actually it's not appropriate, necessarily, for the use case you're trying to draw from. >> It's the classic holy grail, where is it, if you don't know what you're looking for, it's hard to know when to apply it. >> And also, you've got to have enough data to utilize those methods as well, so. >> You hear a lot about the technical complexity associated with Hadoop specifically, but just ol' big data generally. I wonder if you could address that, in terms of what you're seeing, how people are dealing with that technical complexity but what other headwinds are there, in terms of adopting these new capabilities. >> Yeah, absolutely, so one of the challenges that we still see is that customers are struggling to leverage value from their platform, and normally that's because of the technical complexities. So we really, we introduced to the open-source world last month Kylo, something you can download free of charge. It's completely open-source on the Apache license, and that really was about making it easier for customers to start to leverage the data on the platform, to self-serve ingestion onto that, and for data scientists to wrangle the data better. 
So, I think there's a real push right now about that next level up, if you like, in the technology stack to start to enable non-technical users to start to do interesting things on the platform directly, rather than asking someone to do it for them. And that, you know, we've had technologies in the BI space like Tableau, and, obviously, the (mumbling) did a data-warehouse solutions on Teradata that have been giving customers something, before and previously, but actually now they're asking for more, not just that, but more as well. And that's where we are starting to see the increases. >> So that's sort of operationalizing analytics as an example, what are some of the business complexities and challenges of actually doing that? >> That's a very good question, because, I think, when you find out great insight, and you go, wow you've built this algorithm, I've seen things I've never seen before, then the business wants to have that always on, they want to know that it's that insight all the time, is it changing, is it going up, is it going down, do I need to change my business decisions? And doing that and making that operational means, not only just deploying it but also monitoring those models, being able to keep them up to date regularly, understanding whether those things are still accurate or not, because you don't want to be making business decisions on algorithms that are now a bit stale. So, actually operationalizing it is about building out an entire capability that's keeping these things accurate, online, and, therefore, there's still a bit of work to do, I think, actually in the marketplace still, around building out an operational capability. >> So you kind of got bottom-up, top-down. Bottom-up is, you know, the Hadoop experiments, and then top-down is CXO saying we need to do big data. Have those two constituencies come together now, who's driving the bus? Are they aligned or is it still, sort of, a mess organizationally? 
>> Yeah, I mean, generally, in the organization, there's someone playing the Chief Data Officer, whether they have that as a title or a role, ultimately someone is in charge of generating value from the data they have in the organization. But they can't do that without IT, and I think where we've seen companies struggle is where they've driven it from the bottom-up, and where they succeed is where they drive it from the top-down, because by driving it from the top-down, you really align what you're doing with the business and strategy that you have. So, the company strategy, and what you're trying to achieve, but ultimately, they both need to meet in the middle, and you can't do one without the other. >> And one of our practitioner friends was describing this situation in our office in Palo Alto, a couple of weeks ago. He said, you know, the challenge we have as an organization is, you've got top people saying alright, we're moving. And they start moving, the train goes, and then you've got kind of middle management, sort of behind them, and then you got the doers that are far behind, and aligning those is a huge challenge for this particular organization. How do you recommend organizations to address that alignment challenge, does Think Big have capabilities to help them through that, or is that, sort of, you got to call Accenture? >> In essence, our reason for being is to help with those kind of things, and, you know, whether it's right from the start, so, oh, my God, my Chief Data Officer or my CEO is saying we need to be doing this thing right now, come on, let's get on with it, and we help them to understand what does that mean, what are the use cases, how, where's the value going to come from, what's that architecting to look like, or whether it's helping them to build out capability, in terms of data science or building out the cluster itself, and then managing that and providing training for staff. 
Our whole reason for being is supporting that transformation as a business, from, oh, my God, what do I do about this thing, to, I'm fully embracing it, I know what's going on, I'm enabling my business, and I'm completely comfortable with that world. >> There was a lot talk three, or four or five years ago, about the ROI of so-called big data initiatives, not being really, you know, there were edge cases which were huge ROI, but there was a lot of talk about not a lot of return. My question is, has that, first question, has that changed, are you starting to see much bigger phone numbers coming back where the executives are saying yeah, lets double down on this. >> Definitely, I'm definitely seeing that. I mean, I think it's fair to say that companies are a bit nervous about reporting their ROI around this stuff, in some cases, so there's more ROI out there than you necessarily see out in the public place, but-- >> Why is that? Because they don't want to expose to the competition, or they don't want to front run their earnings, or whatever it is? >> They're trying to get a competitive edge. The minute you start saying, we're doing this, their competitors have an opportunity to catch up. >> John: Very secretive. >> Yeah and I think, it's not necessarily about what they're doing, it's about keeping the edge over their customers, really, over their competitors. So, but what we're seeing is that many customers are getting a lot of ROI more recently because they're able to execute better, rather than being struggling with the IT problems, and even just recently, for instance, we had a customer of ours, the CEO phones us up and says, you know what, we've got this problem with our sales. We don't really know why this is going down, you know, in this country, in this part of the world, it's going up, in this country, it's going down, we don't know why, and that's making us very nervous. 
Could you come in and just get the data together, work out why it's happening, so that we can understand what it is. And we came in, and within weeks, we were able to give them a very good insight into exactly why that is, and they changed their strategy, moving forward, for the next year, to focus on addressing that problem, and that's really amazing ROI for a company to be able to get that insight. Now, we're working with them to operationalize that, so that particular insight is always available to them, and that's an example of how companies are now starting to see that ROI come through, and a lot of it is about being able to articulate the right business question, rather than trying to worry about reports. What is the business question I'm trying to solve or answer, and that's when you can start to see the ROI come through. >> Can you talk about the customer orientation when they get to that insight, because you mentioned earlier that they got used to the reports, and you mentioned visualization, Tableau, they become table stakes, once you get addicted to the visualization, you want to extract more insights so the pressure seems to be getting more insight. So, two questions, process gap around what they need to do process-wise, and then just organizational behavior. Are they there mentally, what are some of the criteria in your mind, in your experiments, with customers around the processes that they go through, and then organizational mindset. >> Yeah, so what I would say is, first of all, from an organizational mindset perspective, it's very important to start educating, not just the analysis team, but the entire business on what this whole machine-learning, big data thing is all about, and how to ask the right questions. So, really starting to think about the opportunities you have to move your business forward, rather than what you already know, and think forward rather than retrospective. 
So, the other thing we often have to teach people, as well, is that this isn't about what you can get from the data warehouse, or replacing your data warehouse or anything like that. It's about answering the right questions, with the right tools, and here is a whole set of tools that allow you to answer different questions that you couldn't before, so leverage them. So, that's very important, and so that mindset requires time actually, to transform business into that mindset, and a lot of commitment from the business to make that happen. >> So, mindset first, and then you look at the process, then you get to the product. >> Yep, so, and basically, once you have that mindset, you need to set up an engine that's going to run, and start to drive the ROI out, and the engine includes, you know, your technical folk, but also your business users, and that engine will then start to build up momentum. The momentum builds more interest, and, over time, you start to get your entire business into using these tools. >> It kind of makes sense, just kind of riffing in real time here, so the product-gap conversation should probably come after you lay that out first, right? >> Totally, yeah, I mean, you don't choose a product before you know what you need to do with it. So, but actually often companies don't know what they need to do with it, because they've got the wrong mindset in the first place. And so part of the road map stuff that we do, that we have a road map offering, is about changing that mindset, and helping them to get through that first stage, where we start to put, articulate the right use cases, and that really is driving a lot of value for our customers. Because they start from the right place-- >> Sometimes we hear stories, like the product kind of gives them a blind spot, because they tend to go into, with a product mindset first, and that kind of gives them some baggage, if you will. 
>> Well, yeah, because you end up with a situation, where you go, you get a product in, and then you say what can we do with it. Or, in fact, what happens is the vendor will say, these are the things you could do, and they give you use cases. >> It constrains things, forecloses tons of opportunities, because you're stuck within a product mindset. >> Yeah, exactly that, and you're not, you don't want to be constrained. And that's why open-source, and the kind of ecosystem that we have within the big data space is so powerful, because there's so many different tools for different things but don't choose your tool until you know what you're trying to achieve. >> I have a market question, maybe you just give us opinion, caveat, if you like, it's sort of a global, macro view. When we started first looking at the big data market, we noticed right away the dominant portion of revenue was coming from services. Hardware was commodity, so, you know, maybe sort of less than you would, obviously, in a mainframe world, and open-source software has a smaller contribution, so services dominated, and, frankly, has continued to dominate, since the early days. Do you see that changing, or do you think those percentages, if you will, will stay relatively constant? >> Well, I think it will change over time, but not in the near future, for sure, there's too much advancement in the technology landscape for that to stop, so if you had a set of tools that weren't really evolving, becoming very mature, and that's what tools you had, ultimately, the skill sets around them start to grow, and it becomes much easier to develop stuff, and then companies start to build out industry- or solutions-specific stuff on top, and it makes it very easy to build products. 
When you have an ecosystem that's evolving, growing with the speed it is, you're constantly trying to keep up with that technology, and, therefore, services have to play an awful big part in making sure that you are using the right technology, at the right time, and so, for the near future, for certain, that won't change. >> Complexity is your friend. >> Yeah, absolutely. Well, you know, we live in a complex world, but we live and breathe this stuff, so what's complex to some is not to us, and that's why we add value, I guess. >> Mike Merritt-Holmes here inside The Cube with Teradata Think Big. Thanks for spending the time sharing your insights. >> Thank you for having me. >> Understand the organizational mindset, identify the process, then figure out the products. That's the insight here on The Cube, more coverage of Data Works Summit 2017, here in Germany after this short break. (upbeat electronic music)

Published Date : Apr 5 2017


ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
John	PERSON	0.99+
Japan	LOCATION	0.99+
Mike	PERSON	0.99+
John Furrier	PERSON	0.99+
Lego	ORGANIZATION	0.99+
Mike Merritt-Holmes	PERSON	0.99+
Teradata	ORGANIZATION	0.99+
Germany	LOCATION	0.99+
Palo Alto	LOCATION	0.99+
Think Big	ORGANIZATION	0.99+
two questions	QUANTITY	0.99+
first question	QUANTITY	0.99+
Munich	LOCATION	0.99+
Accenture	ORGANIZATION	0.99+
last month	DATE	0.99+
one	QUANTITY	0.99+
Horton Works	ORGANIZATION	0.99+
Big Data Partnership	ORGANIZATION	0.99+
both	QUANTITY	0.99+
both sides	QUANTITY	0.98+
two constituencies	QUANTITY	0.98+
next year	DATE	0.98+
first	QUANTITY	0.98+
Nordics	LOCATION	0.98+
first stage	QUANTITY	0.98+
#DW17	EVENT	0.97+
Data Works Summit 2017	EVENT	0.97+
DataWorks Summit 2017	EVENT	0.96+
Tableau	TITLE	0.95+
Hadoop	TITLE	0.95+
four	DATE	0.93+
Hadoop Summit	EVENT	0.93+
five years ago	DATE	0.9+
Apache	TITLE	0.89+
The Cube	ORGANIZATION	0.87+
Vice President	PERSON	0.87+
Data Works Summit Europe 2017	EVENT	0.83+
a couple of weeks ago	DATE	0.82+
one avenue	QUANTITY	0.82+
DataWorks Summit Europe 2017	EVENT	0.8+
Kaylo	PERSON	0.8+
past year	DATE	0.79+
Global Services Strategy	ORGANIZATION	0.79+
Teradata Think Big	ORGANIZATION	0.77+
three	QUANTITY	0.76+
double	QUANTITY	0.75+
Think Big -	EVENT	0.71+
Covering	EVENT	0.69+
Hadoob	ORGANIZATION	0.62+
decade	QUANTITY	0.58+
second	QUANTITY	0.58+
Cube	COMMERCIAL_ITEM	0.56+
CXO	PERSON	0.48+
Cube	ORGANIZATION	0.46+
#theCUBE	ORGANIZATION	0.45+

Scott Gnau | DataWorks Summit Europe 2017

(soothing technological music) >> Announcer: Live from Munich, Germany, it's theCUBE. Covering DataWorks Summit Europe 2017. Brought to you by Hortonworks. (soft technological music) >> Okay, welcome back everyone, we're here in Munich, Germany for DataWorks Summit 2017, formerly Hadoop Summit, powered by Hortonworks. It's their event, but now called DataWorks because data is at the center of the value proposition: Hadoop plus all data and storage. I'm John, with my co-host David. Our next guest is Scott Gnau, he's the CTO of Hortonworks, joining us again from the keynote stage. Good to see you again. >> Thanks for having me back, great to be here. >> Good having you back. Let's get down and dirty and get technical. I'm super excited about the conversations that are happening in the industry right now for a variety of reasons. One is you can't get more excited about what's happening in the data business. Machine learning and AI have really brought up the hype, and to me that humanizes it: people can visualize AI and see the self-driving cars and understand how software's powering all this. But still it's data driven, and Hadoop is extending into data, seeing that natural extension, and Cloudera has filed their S-1 to go public. So it brings back the conversations of this open source community that's been doing all this work in the big data industry, originally riding in on the horse of Hadoop. You guys have an update to your Hadoop data platform, which we'll get to in a second, but I want to ask you about a lot of the stories around Hadoop. I say Hadoop was the first horse that everyone rode in on in the big data industry. When I say big data, I mean like DevOps, cloud, the whole open source ethos. But it's evolving, it's not being replaced. So I want you to clarify your position on this, because we were just talking about some of the false premises, a lot of stories being written about the demise of Hadoop. Long live Hadoop. >> Yeah, well, how long do we have? 
(laughing) I think you hit it first: we're at DataWorks Summit 2017, and we rebranded; it was previously Hadoop Summit. We rebranded it to really recognize that there's this bigger thing going on and it's not just Hadoop. Hadoop is a big contributor, a big driver, a very important part of the ecosystem, but it's more than that. It's really about being able to manage and deliver analytic content on all data across that data's lifecycle, from when it gets created at the edge, to its moving through networks, to its being landed and stored in a cluster, to analytics being run and decisions going back out. It's that entire lifecycle, and you mentioned some of the megatrends, and I talked about this this morning in the opening keynote. With AI and streaming and IoT, all of these things kind of converging are creating a much larger problem set and, frankly, opportunity for us as an industry to go solve. So that's the context that we're really looking-- >> And there's real demand there. This is not like, I mean there's certainly a hype factor on AI, but IoT is real. You have data now as not just a back-office concept, you have a front-facing, business-centric... I mean, there's real customer demand here. >> There's real customer demand, and it really creates the ability to dramatically change a business. A simple example that I used onstage this morning is, think about the electric utility business. I live in Southern California. 25 years ago, by the way, I studied to be an electrical engineer. 20 years ago, 30 years ago, that business, not entirely simple, was about building a big power plant and distributing electrons out to all the consumers of electrons. One direction, and optimization of that grid, that network and that business was very hard, and there were billions of dollars at stake. Fast forward to today: now you've still got those generating plants online, but you've also got folks like me generating their own power and putting it back into the grid. So now you've got bidirectional electrons. 
The optimization is totally different. Then how do you figure out how most effectively to create capacity and distribute that capacity? Because created capacity that's not consumed is 100% spoiled. So it's a huge data problem, but it's a huge data problem meeting IoT, right? Devices, smart meter devices out at the edge, creating data and doing it in real time. A cloud blew over, my generating capacity on my roof went down, so I've got to pull from the grid. Combining all of that data to make real-time decisions, we're talking hundreds of billions of dollars, and it's being done today in an industry that's not a high-tech Silicon Valley kind of industry. Electric utilities are taking advantage of this technology today. >> So we were talking off-camera about, you know, some commentary that Hadoop has failed, and obviously you take exception to that, and you also made the point that it's not just about Hadoop, but in a way it is, because Hadoop was the catalyst of all this open-- Why has Hadoop not failed, in your view? >> Well, because we have customers, and you know, the great thing about conferences like this is we're actually able to get a lot of folks to come in and talk about what they're doing with the technology and how they're driving business benefit, and share that business benefit with their colleagues, so we see that business benefit coming along. You know, in any hype cycle people can go down a path where maybe they had false expectations, right? Early on, you know, six years ago, we were talking about, hey, is open source Hadoop going to come along and replace the EDW? Complete fallacy, right? What I talked about in that opportunity, being able to store all kinds of disparate data, being able to manage and maneuver analytics in real time, that value proposition is very different than some of the legacy tech. 
So if you view it as, hey, this thing is going to replace that thing, okay, maybe not, but the point is, it's very successful for what it is-- >> Just to clarify what you just said there, you guys never took that position. Cloudera did, with Impala; that was their initial one. You can tell me if you don't agree with that. >> Publicly they would say, oh, it's not a replacement, but you're right, I mean, the actions were maybe designed to do that. >> And set in the marketplace that that might be one of the outcomes. >> Yeah, but they pivoted quickly when they realized that was a failed strategy, but that became a premise that people locked in on. >> If that becomes your yardstick for measuring, then-- >> Oh, but wouldn't you agree that Hadoop in many respects was designed to solve some of the problems that the EDW never could? >> Exactly. So, you know, again, when you think about the variety of data, when you think about the analytic content, doing time series analysis is very hard to do in a relational model, so it's a new tool in the workbench to go solve analytic problems. And so when you look at it from that perspective, and I use the utility example, the manufacturing example, financial, consumer finance, telco, all of these companies are using this technology, leveraging this technology, to solve problems they couldn't solve, and frankly to build new businesses that they couldn't build before because they didn't have access to that real time-- >> And so money did shift from pouring money into the EDW with limited returns, because you were at the steep part or the flat part of the S-curve, to, hey, let's put it over here in this so-called big data thing, and that's why the market, I think, was conditioned to sort of come to that simple conclusion. But the dollars, the spending, did shift, did it not? 
>> Yeah, I mean, if you subscribe to that herd mentality, you know, the net increase, the net new expenditure in the new technology, is always going to outpace the growth of the existing, kind of plateaued technologies. That's just math. >> The growth, yes, but not the size, not the absolute dollars, and so you have a lot of companies right now struggling in the traditional legacy space and you've got this rocket ship going in-- >> And again, I think if you think about kind of the converging forces that are out there, in addition to, you know, IoT and streaming, frankly Hadoop is an enabler of AI. When you think about the success of AI and machine learning, it's about having massive, massive, massive amounts of data, right? And I think back 25 years ago, my first data mart was 30 gigabytes and we thought that was all the data in the world. Now that fits on your phone. So when you think about just having the utter capacity and the ability to actually process that capacity of data, these are technology breakthroughs that have been driven in the core open source and Hadoop community. When combined with the ability then to execute in clouds and ephemeral kinds of workloads, you combine all that stuff together, and now, instead of going to the capital committee for 20 million dollars for a bunch of hardware to do an exabyte kind of study where you may not get an answer that means anything, you can spin that up in the cloud and for a couple of thousand dollars get the answer, take that answer and go build a new system of insight that's going to drive your business. And this is a whole new area of opportunity driven by the convergence of all that. >> So I agree, I mean, it's absurd to say Hadoop and big data have failed, it's crazy. 
Okay, but despite the growth, what I've called profitless prosperity, can the industry fund itself? I mean, you've got to make big bets: YARN, Tez, different clouds. How does the industry turn into one that is profitable and growing? >> Well, I mean, obviously it creates new business models and new ways of monetizing software and deploying software. You know, one of the key things that is core to our belief system is that really leveraging and working with and nurturing the community is going to be a key success factor for our business, right? Nurturing that innovation and collaboration across the community to keep up with the pace of change is one of the aspects of being relevant as a business, and then obviously creating a great service experience for our customers, so that they know that they can depend on enterprise-class support, enterprise-class security and governance and operational management, in the cloud and on-prem. Creating that value proposition, along with the advanced and accelerated delivery of innovation, is where I think, you know, we kind of intersect uniquely in the industry. >> And one of the things that I think people point out, and I have this conversation all the time with people who try to squint through the, you know, Wall Street implications of the value proposition of the industry and this and that, and I want to get your thoughts on it, because open source in this era that we're living in today is bringing so much value outside of just Hortonworks and your company. Dave made a comment in the intro package we're doing, that the practitioners are getting a lot of value, people out in the field. So these are the white spaces of value, and they're actually transformative. Can you give some examples where things are getting done that are of real value, use cases that you guys can highlight? I think that's the unwritten story that no one thought about: that rising tide floating all boats is happening. 
>> Yeah, yes. I mean, most of the use cases, the white space, so you have some of those use cases, again, it really involves kind of integrating legacy, traditional transactional information, right? Very valuable information about a company, its operations, its customers, its products and all this kind of thing, and being able to combine that with the ability to do real-time sensor management and ultimately have a technology stack that enables the connection of all of those sources of data for an analytic. And that's an important differentiation. You know, for the first 25 years of my career, right, it was all about, let's pull all this data into a place and then let's do something with it, and then we can push analytics back. Not an entirely bad model, but a model that breaks in the world of IoT connected devices. There just frankly isn't enough money to spend on bandwidth to make that happen, and as fast as the speed of light is, it creates latency, so those decisions aren't going to be able to be made in time. So we're seeing, even in traditional industries, I mentioned the utility business, think about manufacturing, oil and gas, right, sensors everywhere. Being able to take advantage not of collecting all the sensor data centrally and all of that, but being able to actually create analytics based on sensor data and put those analytics out at the sensors to make real-time decisions that can affect hundreds of millions of dollars of production or equipment, those are the use cases that we're seeing be deployed today, and that's complete white space that was unavailable before. >> Yeah, and customer demand too. I mean, Dave and I were also debating about this not being a new trend, this is just big data happening. The customers are demanding production workloads, so you're seeing a lot more of a forcing function driven by the customer. And you guys have some news I want to get to and get your thoughts on: HDP, the Hortonworks Data Platform, 2.6. What's the key news there? You're talking about real time. 
>> Yeah, it's about real time, flexibility and choice. You know, motherhood and apple pie. >> And the major highlights of that upgrade? >> So the upgrade is really inside of Hive. We now have operational analytic query capabilities with tactical response times, second and sub-second kind of response time. >> You know, Hadoop and Hive weren't previously known for that kind of tactical response. We've been able to now add inside of that technology the ability to handle that workload. We have customers who are building these white space applications who have hundreds or thousands of users or applications that depend on consistency of very quick analytic response time. We now deliver that inside the platform. What's really cool about it, in addition to the fact that it works, is that we did it inside of Hive, so we didn't create yet another project or yet another thing that a customer has to integrate to or rewrite their application for. So any Hive-based application can now take advantage of this performance enhancement, and that's part of our thinking of it as a platform. The second thing inside of that that we've done, that really speaks to those kinds of workloads, is we've really enhanced the ability to do incremental data acquisition, right? Whether it be streaming, whether it be batch upserts, right, for the SQL person doing upserts, being able to do that data maintenance in an ACID-compliant fashion, completely automatically and behind the scenes, so that those applications again can just kind of run without any heavy lifting. >> Data in motion kind of thing going on? >> Right, it's anywhere from data in motion to batch to mini-batch and anywhere in between, but when we're doing those incremental data loads, you know, it's easy to get the same file twice by mistake. You don't want to double count, you want to have sanctity of the transactions. We now handle that inside of Hive with ACID compliance. 
>> So a layperson question for the CTO, if I may. You mentioned Hadoop was not known for sort of real-time response. You just mentioned ACID; it was never in the early days known for sort of ACID, you know, compliance. Others would say, you know, Hadoop, the original big data platform, is not designed for the matrix math of AI, for example. Are these misconceptions? And, like Tim Berners-Lee, when we met, Tim Berners-Lee said of Web 2.0, this is what the web was designed for. Would you say the same thing about Hadoop? >> Yeah. Ultimately, from my perspective, and kind of netting it out, Hadoop was designed for the easy acquisition of data, the easy onboarding of data, and then once you've onboarded that data, it also was known for enabling new kinds of analytics that could be plugged in. Certainly starting out, MapReduce and HDFS were kind of the core, but the whole idea is I now have a flexible way to easily acquire data in its native form without having to apply schema, without having any formatting distortion. I can get it exactly as it was and store it, and then I can apply whatever schema, whatever rules, whatever analytics on top of that that I want. So the center of gravity, in my mind, has really moved up to YARN, which enables a multi-tenancy approach to having pluggable, multiple different kinds of file formats and pluggable different kinds of analytics and data access methods, whether it be SQL, whether it be machine learning, whether it be HBase lookup and indexing, and anywhere in between. It's that Swiss Army knife, as it were, for handling all of this new stuff that is changing every second. As we sit here, data has changed. >> And just a quick follow-up, if I can, just a clarification. So you said new types of analytics that can be plugged in, by design, because of its openness, is that right? 
>> By design, because of its openness and the flexibility that the platform was built for. In addition, on the performance, we've also got a new update to Spark, and usability, consumability and collaboration for data scientists using the latest versions of Spark inside the platform. We've got a whole lot of other features and functions that our customers have asked for. And then on the flexibility and choice, it's available on public cloud infrastructure as a service, public cloud platform as a service, on-prem x86, and net new on-prem with Power. >> Just a final question for you. As the industry evolves, what are some of the key areas that open source can pivot to that really take advantage of the machine learning and AI trends going on? Because you start to see that really increase the narrative around the importance of data, and a lot of people are scratching their heads, going, okay, I need to do the back office, set up my IT to have all this great stuff, all these open source projects, all that, the Hadoop data platform, but then I've got to get down and dirty. I might do multiple clouds, there's the hybrid cloud going on, I might want to leverage Mesos, the new cool containers and Kubernetes and microservices, and almost DevOps. Where's that transition happening? As a CTO, what do you see, how do you talk to customers about this transition, this evolution of how the data business is getting more and more mainstream? 
>> Yeah, I mean, I think the big thing that people had to get over is we've reversed polarity from, again, 30 years of, I want a stack vendor to have an integrated stack of everything, plug-and-play, it's integrated, and it might not be a hundred percent what I want, but there's the cost leverage that I get out of the stack versus what I'm going to go do that's perfect. In this world, it's the opposite. It's about enabling the ecosystem, and by the way, it's a combination of open source and proprietary software. You know, some of our partners have proprietary software, that's okay. But it's really about enabling the ecosystem, and I think the biggest service that we as an open source community can do is to continue to kind of keep that standard kernel for the platform and make it very usable and very easy for many apps and software providers and other folks. >> A thousand flowers bloom kind of concept, and that's what you've done with the white spaces, as use cases are evolving very rapidly, and then the bigger apps are kind of going to settle into a workload with real time. >> Yeah, all the time. You know, think about the next generation of IT professional, the next generation of business professional. They grew up with iPhones, they grew up in a mini-app world. I mean, download an app, I'm going to try it, it's a widget, boom, and it's going to help me get something done, but it's not a big stack that I'm going to spend 30 years to implement. And I like it, and then I want to take two of those widgets and connect them together to do things that I haven't been able to do before, and that's how this ecosystem is really-- >> Great DevOps culture, very agile, that's their mindset. So Scott, congratulations on your 2.6 upgrade and >> Scott: We're thrilled about it. 
>> Great stuff. ACID compliance, really big deal. Again, these compliance things, little things, are important in the enterprise. Great. All right, thanks for coming on theCUBE, here at DataWorks in Munich, Germany. I'm John, thanks for watching. More coverage live here in Germany after this short break.

Published Date : Apr 5 2017


ENTITIES

Entity	Category	Confidence
Scott	PERSON	0.99+
100%	QUANTITY	0.99+
John	PERSON	0.99+
David	PERSON	0.99+
Dave	PERSON	0.99+
Germany	LOCATION	0.99+
Southern California	LOCATION	0.99+
30 years	QUANTITY	0.99+
30 gigabytes	QUANTITY	0.99+
Scott Gnau	PERSON	0.99+
hundreds	QUANTITY	0.99+
Hortonworks	ORGANIZATION	0.99+
Swiss Army	ORGANIZATION	0.99+
six years ago years ago	DATE	0.99+
America	LOCATION	0.99+
25 years ago	DATE	0.99+
Hadoop	TITLE	0.99+
Munich, Germany	LOCATION	0.99+
today	DATE	0.98+
Dataworks Summit 2017	EVENT	0.98+
30 years ago	DATE	0.98+
two points	QUANTITY	0.98+
iphones	COMMERCIAL_ITEM	0.98+
telco	ORGANIZATION	0.98+
Hadoop	ORGANIZATION	0.98+
hundred percent	QUANTITY	0.98+
billions of dollars	QUANTITY	0.98+
first 25 years	QUANTITY	0.97+
DevOps	TITLE	0.97+
hundreds of millions of dollars	QUANTITY	0.97+
20 years ago	DATE	0.97+
20 millioin dollars	QUANTITY	0.97+
twice	QUANTITY	0.97+
DataWorks Summit	EVENT	0.97+
first	QUANTITY	0.97+
one	QUANTITY	0.97+
One	QUANTITY	0.96+
second thing	QUANTITY	0.96+
Tim Berners-lee	PERSON	0.96+
Silicon Valley	LOCATION	0.96+
Munich	LOCATION	0.96+
Hadoop Summit	EVENT	0.96+
One direction	QUANTITY	0.96+
first horse	QUANTITY	0.95+
first data	QUANTITY	0.95+
Dataworks	ORGANIZATION	0.94+
second	QUANTITY	0.92+
Cloud	TITLE	0.92+
EDW	ORGANIZATION	0.85+
2017	EVENT	0.85+
couple of thousand dollars	QUANTITY	0.84+
Dataworks Summit Europe 2017	EVENT	0.84+
MapReduce	TITLE	0.84+
thousands of users	QUANTITY	0.83+
lot of folks	QUANTITY	0.83+
this morning	DATE	0.8+
S1	TITLE	0.79+
Europe	LOCATION	0.78+
A thousand flower bloom	QUANTITY	0.78+
2.6	OTHER	0.76+
apps	QUANTITY	0.73+

Day One Kickoff– DataWorks Summit Europe 2017 - #DW17 - #theCUBE

>> Narrator: Covering DataWorks Summit Europe 2017. Brought to you by Hortonworks. >> Hello everyone, welcome to theCUBE's special presentation here in Munich, Germany for DataWorks Summit 2017. This is the Hadoop Summit powered by Hortonworks. This is their event and, again, it shows the transition from the Hadoop world to the big data world. I'm John Furrier, with my co-host Dave Vellante. Good to see you, Dave. We're back in the seats together, usually on different events, but now here together in Munich. Great beer, great scene here. Small European event for Hortonworks and the ecosystem, but it's called DataWorks 2017. Strata Hadoop is calling themselves Strata and Data. We're starting to see the word Hadoop being sunsetted from these events, which is a big theme of this year: the transition from Hadoop being the branded category to data. >> Well, you're certainly seeing that in a number of ways. The titles of these events. Well, first of all, I love being in Europe. These venues are great, right? They're so Euro, very clean and magnificent. But back to your point. You're seeing the Hadoop Summit now called the DataWorks Summit. You're seeing that Strata Plus Hadoop is now Strata Plus, I don't even know what it is. Right, it's not Hadoop driven anymore. You see it also in Cloudera's IPO. You'd expect them to talk about Hadoop and the Hadoop distro. They're a Hadoop distro vendor, but they talked about being a data management company. And John, I think we are entering the era, or are well deep into the era, of what I have been calling for the last couple of years profitless prosperity. Really, you see it in the Cloudera IPO. As you know, they raised money from Intel, over $600 million, at a $4.1 billion valuation. The Wall Street Journal says they'll have a tough time getting a billion dollar valuation. 
For every dollar each of these companies spends, Hortonworks and Cloudera, they lose between $1.70 and $2.50. So we've always said at SiliconANGLE, Wikibon and theCUBE that the people who are going to make money in big data are the practitioners of big data, and it's hard to find those guys, it's hard to see them, but that's really what's happening: the industries are transforming, and those are the guys that are putting money into their bottom line. Not so much the technology vendors. >> Great to unpack that, but first of all, I want to just say congratulations to Wikibon for getting it right again. As usual, Wikibon was ahead of the curve, being out there and getting it right, because I think you nailed it, and I think Wikibon saw this first of all the research firms. Kind of, you know, patting ourselves on the back here, but the truth is that practitioners are making the money, and I think you're going to see more of that. In fact, last night as I was having a nice beer here in Germany, I just like to listen to the conversations in the bar area, and there were a lot of conversations, real conversations, around, you know, doing deals, and, you know, deployments. You know, you're hearing about HBase, you're hearing about clusters, you're hearing about service revenue, and I think this is the focus. Cloudera, I think, in a classic Silicon Valley way, their hubris was tempered by their lack of scale. I mean, they didn't really blow it out. I mean, now they do 200 million in revenue. Nothing to shake a stick at, they did a great job, but they're buying revenue, and Hortonworks is as well. But the ecosystem is the factor, and this is the wildcard. I'm making a prediction. Profitless prosperity, which you point out, is right, but I think that there's longevity with these companies like Hortonworks and Cloudera and others, like MapR, because the ecosystem's robust. If you factor in the ecosystem revenue, that is enough of a rising tide, in my opinion. 
The question is how do they become sustainable as a standalone venture, given that the Red Hat for Hadoop never worked, as Pat Gelsinger, you know, predicted. So, I think you're going to see a quick shift and pivot by Hortonworks, and certainly Cloudera's going to be under the microscope once they go public. I'm expecting that valuation to plummet like a rock. They're going to go public, Silicon Valley people are going to get their exits, but. >> Accel will be happy. >> Everyone, yeah, they'll be happy. They already sold in 2013. They did a big sale, I mean, all of them cashed out two years ago when that liquidation event happened with Intel, but that's fine. But now it's back to business building, and Hortonworks has been doing it for years, so when you see their valuation is less than a billion, I'm expecting Cloudera to plummet like a rock. I would not buy the IPO at all, because I think it's going to go well under a billion dollars. >> And I think it's the right call, and as we know, at the end of last year, Fidelity and other mutual funds devalued their holdings in Cloudera, and so, you know, you've got this situation where, as you say, they're at a couple hundred, maybe, you know, on the way to 300 million in revenue, with Hortonworks on the way to 200 million in revenue. Add up the ecosystem, yeah, maybe you get to a billion, throw in all of what IBM and Oracle call big data, and it's kind of a more interesting business, but you've called it same wine, new bottle. Is it a new bottle? Now, what I mean by that is the shift from Hadoop, and then again, you read Cloudera's S-1, it's all about AI, machine learning, you know, the cloud. Interesting, we'll talk about the cloud a little later, but is it same wine, new bottle, or is this really a shift toward a new era of innovation? >> It's not a new shift. It's the same innovation that Hortonworks was founded on. 
Big data is the category, and Hadoop was the horse they rode in on, but I think what's changing is the fact that customers are now putting real projects on the table, and under scrutiny those projects have to produce value, and the value comes down to total cost of ownership and business value. And that's becoming a data-specific thing, and you look at all the successes in the big data world, Spark and others, and you're seeing a focus on cloud integration and real-time workloads. These are real projects. This isn't fantasy. This isn't hype. This isn't early adopter. These are real companies saying, we are moving to a new paradigm of digitally transforming our companies and we need cost efficiencies, but also revenue-producing applications and workloads that are going to be running in the cloud with data at the heart of it. So, this is a customer-forcing function, where the customers are genuinely excited about machine learning, moving to real-time classification of workloads. This is the deal, and no hubris, no technology posturing, no open standards jockeying can right the situation. Customers have demands and they want them filled, and we're going to have a lot of guests on here and I'm going to ask them those direct questions. What are you looking for and? >> Well, I totally agree with what you're saying, and when we first met, it was right around, you know, the midpoint of the Web 2.0 era, and I remember Tim Berners-Lee commenting on all this excitement, everything everybody's doing. He said, this is what the web was invented to do, and this is what big data was invented to do. It was to produce deep analytics, deep learning, machine learning, you know, cognitive, as IBM likes to brand that, and so it really is the next era, even though people don't like to use the term big data anymore. We were talking to, you know, some of the folks in our community earlier, John, you and I, about some of the challenges. Why is it profitless, you know? 
Why is there so much growth but no profit? And you know, we have to point out here that people like Hortonworks and Cloudera, they've made some big bets, take HDFS, for example. And now you have the cloud guys, particularly Amazon, coming in, you know, with S3. Look at YARN, big open source project. But you've got Docker and Kubernetes, which seem to be mopping that up. Tez was supposed to replace MapReduce and now you've got. >> I mean, I wouldn't say mopping up, I mean. >> You've got Spark. >> At the end of the day the ecosystem's going to revolve around what the customers want, and portability of workloads, Kubernetes and microservices, these are areas that just absolutely make a lot of sense and I think, you know, people will move to where the frictionless action is and that's going to happen with Kubernetes and containers and microservices, but that just speaks to the devops culture, and I think the Hadoop ecosystem, again, was grounded in the devops culture. So, yeah, there are some projects that are maybe going to go out of favor, but there's other stuff coming up through the ranks in open source and I think it's compelling. >> But where I disagree with what you're saying is well, the point I'm trying to make, is you have to, if you're Cloudera and Hortonworks, you have to support those multiple projects and it's expensive as hell. Whereas the cloud guys put all their wood behind one arrow, to use an old Scott McNealy phrase, and you know, Amazon, I would argue is mopping up in big data. I think the cloud guys, you know, it's ironic to me that Cloudera in the cloud era picked that name, you know, but really never had. >> John: They missed the cloud. >> They've never really had a strong cloud play, and I would say the same thing with Hortonworks and MapR. They have to play in the cloud and they talk about cloud, but they've got to support hybrid, they've got to support on-prem, they got to pick the clouds that they're going to support, AWS, Azure, maybe IBM's cloud.
>> Look, Cloudera completely missed the cloud era, pun intended. However, they didn't miss open source. What they're great at, and what I admire Cloudera and Hortonworks for, is that their open source ethos is what drove them, and so they kind of got isolated with some of their product decisions, but that's not a bad thing. I mean, ultimately, I'm really bullish on Cloudera and Hortonworks because of the ecosystem points I mentioned earlier. I'm not high on the, I wouldn't buy the IPO, I think I'd buy them at a discount, but Cloudera's not going to go away, Dave. They're going to go public. I think the valuation's going to drop like a rock and then settle around a billion, but they have good management. The founders are still there, Michael Olson, Amr Awadallah. So, you're going to see Cloudera transform as a company. They have to do business out in the open and they're not afraid to, obviously they're open source. So, we're going to start to see that transition from a private, venture-backed, scale up, buy revenue, in the playbook of Silicon Valley venture capitalists Accel Partners and Greylock. Now they go public and get liquid, and now the next phase of their journey is going to be building a public company, and I think that they will do a good job doing it and I'm not down on them at all for that and I think it's just going to be a transition. >> Well, they're going to raise what? A couple hundred million dollars? But this industry, yeah, this industry's cashflow negative, so I agree with you. Open source is great, let's rah-rah for open source and it drives innovation, but how does this industry pay for itself? That's what I want to know. How do you respond to that? >> Well, I think they have sustainability issues around services and I think partnering with the big companies like Intel that have professional services might help them on that front, but Michael Olson said in his founder's letter in his S-1, kind of AI washing, he said AI and cognitive.
But that's okay because Cloudera could easily pivot with their brain power, and same with Hortonworks, to AI. Machine learning is very open source driven. Open source culture is growing, it's not going away, so I think Cloudera's in a very good position. >> I think the cloud guys are going to kill them in that game, and cloud guys and IBM are going to cream these profitless startups in that AI and machine learning game. >> We'll see. >> You disagree? >> I disagree, I think. Well, I mean, it depends. I mean, you know, I'm not going to, you know, forecast what the managements might do, but I mean, if I'm cloud looking at what Cloudera's done. >> What would you do? >> I would do exactly what Mike Olson's doing, I'd basically pivot immediately to machine learning. Look at Google. TensorFlow, it's got so much traction with their cloud because it's got machine learning built into it. Open source is where the action is, and that's where you could do a lot of good work and use it as an advantage in that they know that game. I would not count out the open source game. >> So, we know how IBM makes money at that, you know, in theory anyway it wants. We know how Amazon's going to make money at that with their proprietary approach, Microsoft will do the same thing. How do Cloudera and Hortonworks make money? >> I think it's a product transition around getting to the open source with cloud technologies. Amazon is not out to kill open source, so I think there's an opportunity to wedge in a position there, and so they just got to move quickly. If they don't make these decisions then that's a failed execution on the management team at Cloudera and Hortonworks and I think they're on it. So, we'll keep an eye on that. >> No, Amazon's not trying to kill open source, I would agree, but they are bogarting open source in a big way and profiting amazingly from it. >> Well, they just do what Andy Jassy would say, they're customer driven.
So, if a customer doesn't want to do five things to do one thing, this is back to my point. The customers want real-time workloads. They want it with open source and they don't want all these steps in the cost of ownership. That's why this is not a new shift, it's the same wine, new bottle because now you're just seeing real projects that are demanding successful and efficient code and support and whoever delivers it builds the better mousetrap. In this case, the better mousetrap will win. >> And I'm arguing that the better mousetrap and the better marginal economics, I know I'm like a broken record on this, but if I take Kinesis and DynamoDB and Redshift and wrap it into my big data play, offer it as a service with a set of APIs on the cloud, like AWS is going to do, or is doing, and Azure is doing, that's a better business model than, as you say, five different pieces that I have to cobble together. It's just not economically viable for customers to do that. >> Well, we've got some big news coming up here. We're going to have two days of wall-to-wall coverage of DataWorks 2017. Hortonworks announcing 2.6 of their Hadoop Hortonworks Data Platform. We're going to talk to Scott Gnau, the CTO, coming up shortly. Stay with us for exclusive coverage of DataWorks in Munich, Germany 2017. We'll be back with more after this short break.

Published Date : Apr 5 2017

SUMMARY :

Brought to you by Hortonworks. Hortonworks and the ecosystem and it's hard to find those guys, and you know, deployments. going to go well under and then again, you read Cloudera's S1, and I'm going to ask them and so, it really is the next era I mean, I wouldn't and that's going to happen with Kubernetes and you know, Amazon, that they're going to support, and I think that they will Well, they're going to raise what? and same with Hortonworks to AI. and cloud guys and IBM are going to cream I mean, you know, and that's where you could to make money at that and so they just got to move quickly. to kill open source, and they don't want all these steps and the better marginal economics, We're going to talk to Scott now, the CTO,

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Michael Olson	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Hortonworks	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Europe	LOCATION	0.99+
2013	DATE	0.99+
Andy Jassy	PERSON	0.99+
John	PERSON	0.99+
Cloudera	ORGANIZATION	0.99+
Fidelity	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Mike Olson	PERSON	0.99+
Germany	LOCATION	0.99+
Munich	LOCATION	0.99+
Wikibon	ORGANIZATION	0.99+
$2.50	QUANTITY	0.99+
Dave	PERSON	0.99+
Scott	PERSON	0.99+
John Furrier	PERSON	0.99+
last year	DATE	0.99+
MapR	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
200 million	QUANTITY	0.99+
Pat Gilson	PERSON	0.99+
Intel	ORGANIZATION	0.99+
less than a billion	QUANTITY	0.99+
two days	QUANTITY	0.99+
Scott McNealy	PERSON	0.99+
Tim Berners-Lee	PERSON	0.99+
Silicon Valley	LOCATION	0.99+
over $600 million	QUANTITY	0.99+
The Cube	ORGANIZATION	0.99+
SiliconANGLE	ORGANIZATION	0.99+
DataWorks Summit	EVENT	0.99+
Hadoop	ORGANIZATION	0.98+
Hadoop Distro	ORGANIZATION	0.98+
300 million	QUANTITY	0.98+
two years ago	DATE	0.98+
DataWorks 2017	EVENT	0.98+
Google	ORGANIZATION	0.98+
Hadoop Summit	EVENT	0.98+
each	QUANTITY	0.98+
a billion	QUANTITY	0.97+
DataWorks Summit 2017	EVENT	0.97+
billion dollar	QUANTITY	0.97+
Amr Awadallah	PERSON	0.97+
Munich, Germany	LOCATION	0.97+

Joe Morrissey, Hortonworks | Dataworks Summit 2018

>> Narrator: From Berlin, Germany, it's theCUBE! Covering Dataworks Summit Europe 2018. Brought to you by Hortonworks. >> Well, hello. Welcome to theCUBE. I'm James Kobielus. I'm lead analyst at Wikibon for big data analytics. Wikibon, of course, is the analyst team inside of SiliconANGLE Media. One of our core offerings is theCUBE and I'm here with Joe Morrissey. Joe is the VP for International at Hortonworks and Hortonworks is the host of Dataworks Summit. We happen to be at Dataworks Summit 2018 in Berlin! Berlin, Germany. And so, Joe, it's great to have you. >> Great to be here! >> We had a number of conversations today with Scott Gnau and others from Hortonworks and also from your customers and partners. Now, you're International, you're VP for International. We've had a partner of yours from South Africa on theCUBE today. We've had a customer of yours from Uruguay. So there's been a fair amount of international presence. We had Munich Re from Munich, Germany. Clearly Hortonworks is, you've been in business as a company for seven years now, I think it is, and you've established quite a presence worldwide, I'm looking at your financials in terms of your customer acquisition, it just keeps going up and up so you're clearly doing a great job of bringing the business in throughout the world. Now, you've told me before the camera went live that you focus on both Europe and Asia-Pac, so I'd like to open it up to you, Joe. Tell us how Hortonworks is doing worldwide and the kinds of opportunities you're selling into. >> Absolutely. 2017 was a record year for us. We grew revenues by over 40% globally. I joined to lead the internationalization of the business and you know, not a lot of people know that Hortonworks is actually one of the fastest growing software companies in history. We were the fastest to get to $100 million. Also, now the fastest to get to $200 million but the majority of that revenue contribution was coming from the United States.
When I joined, it was about 15% of international contribution. By the end of 2017, we'd grown that to 31%, so that's a significant improvement in contribution overall from our international customer base even though the company was growing globally at a very fast rate. >> And that's also not only fast by any stretch of the imagination in terms of growth, some have said, "Oh well, maybe Hortonworks, just like Cloudera, maybe they're going to plateau off because the bloom is off the rose of Hadoop." But really, Hadoop is just getting going as a market segment or as a platform but you guys have diversified well beyond that. So give us a sense for going forward. What are your customers? What kind of projects are you positioning and selling Hortonworks solutions into now? Is it a different, well you've only been there 18 months, but is it shifting towards more things to do with streaming, NiFi and so forth? Does it shift into more data science related projects? 'Cause this is worldwide. >> Yeah. That's a great question. This company was founded on the premise that data volumes and diversity of data are continuing to explode and we believe that it was necessary for us to come and bring enterprise-grade security and management and governance to the core Hadoop platform to make it really ready for the enterprise, and that's what the first evolution of our journey was really all about. A number of years ago, we acquired a company called Onyara, and the logic behind that acquisition was we believe companies now wanted to go out to the point of origin, of creation of data, and manage data throughout its entire life cycle and derive pre-event as well as post-event analytical insight into their data. So what we've seen is our customers are moving beyond just unifying data in the data lake and deriving post-transaction insight into their data. They're now going all the way out to the edge.
They're deriving insight from their data in real time all the way from the point of creation and getting pre-transaction insight into data as well so-- >> Pre-transaction data, can you define what you mean by pre-transaction data? >> Well, I think if you look at it, it's really the difference between data in motion and data at rest, right? >> Oh, yes. >> A specific example would be if a customer walks into the store and they've interacted in the store maybe on social before they come in or in some other fashion, before they've actually made the purchase. >> Engagement data, interaction data, yes. >> Engagement, exactly. Exactly. Right. So that's one example, but that also extends out to use cases in IoT as well, so data in motion and streaming data, as you mentioned earlier, has since become a very, very significant use case that we're seeing a lot of adoption for. Data science, I think companies are really coming to the realization that that's an essential role in the organization. If we really believe that data is the most important asset, that it's the crucial asset in the new economy, then data scientist becomes a really essential role for any company. >> How do your Asian customers' requirements differ, or do they differ from your European customers'? 'Cause European customers clearly already have their backs against the wall. We have five weeks until GDPR goes into effect. Do many of your Asian customers, I'm sure a fair number sell into Europe, are they putting a full court, I was going to say in the U.S., a full court press on complying with GDPR, or do they have equivalent privacy mandates in various countries in Asia or a bit of both? >> I think that one of the primary drivers I see in Asia is that a lot of companies there don't have the years of legacy architecture that European companies need to contend with. In some cases, that means that they can move towards next generation data-orientated architectures much quicker than European companies have.
They don't have layers of legacy tech that they need to sunset. A great example of that is Reliance. Reliance is the largest company in India, they've got a subsidiary called Jio, which is the fastest growing telco in the world. They've implemented our technology to build a next-generation OSS system to improve their service delivery on their network. >> Operational support system. >> Exactly. They were able to do that from the ground up because they formed their telco division around being a data-only company and giving away voice for free. So they can, to some extent, move quicker and innovate a little faster in that regard. I do see much more emphasis on regulatory compliance in Europe than I see in Asia. I do think that GDPR amongst other regulations is a big driver of that. The other factor though I think that's influencing that is Cloud and Cloud strategy in general. What we've found is that customers are drawn to the Cloud for a number of reasons. The economics sometimes can be attractive, the ability to be able to leverage the Cloud vendors' skills in terms of implementing complex technology is attractive, but most importantly, the elasticity and scalability that the Cloud provides is hugely important. Now, the key concern for customers as they move to the Cloud though, is how do they leverage that as a platform in the context of an overall data strategy, right? And when you think about what a data strategy is all about, it all comes down to understanding what your data assets are and ensuring that you can leverage them for a competitive advantage but do so in a regulatory compliant manner, whether that's data in motion or data at rest. Whether it's on-prem or in the Cloud or in data across multiple Clouds. That's very much a top of mind concern for European companies.
>> For your customers around the globe, specifically of course, your area of Europe and Asia, what percentage of your customers are deploying Hortonworks into a purely public Cloud environment like HDInsight in Microsoft Azure or HDP inside of AWS, in a public Cloud versus in a private on-premises deployment versus in a hybrid public-private multi Cloud? Is it mostly on-prem? >> Most of our business is still on-prem to be very candid. I think almost all of our customers are looking at migrating some workloads to the Cloud. Even those that had intended to have a Cloud-first strategy have now realized that not all workloads belong in the Cloud. Some are actually more economically viable to be on-prem, and some just won't ever be able to move to the Cloud because of regulation. In addition to that, most of our customers are telling us that they actually want Cloud optionality. They don't want to be locked in to a single vendor, so we very much view the future as hybrid Cloud, as multi Cloud, and we hear our customers telling us that rather than just have a Cloud strategy, they need a data strategy. They need a strategy to be able to manage data no matter where it lives, on which tier, to ensure that they are regulatory compliant with that data. But then to be able to understand that they can secure, govern, and manage those data assets at any tier. >> What percentage of your deals involve a partner? Like IBM is a major partner. Do you do a fair amount of co-marketing and joint sales and joint deals with IBM and other partners or are they mostly Hortonworks-led? >> No, partners are absolutely critical to our success in the international sphere. Our partner revenue contribution across EMEA in the past year grew, every region grew by over 150% in terms of channel contribution. Our total channel business was 28% of our total, right? That's a very significant contribution. The growth rate is very high. IBM are a big part of that, as are many other partners.
We've got a very significant reseller channel, we've got IHV and ISV partners that are critical to our success also. Where we're seeing the most impact with IBM is where we go to some of these markets where we haven't had a presence previously, and they've got deep and long-standing relationships and that helps us accelerate time to value with our customers. >> Yeah, it's been a very good and solid partnership going back several years. Well, Joe, this is great, we have to wrap it up, we're at the end of our time slot. This has been Joe Morrissey who is the VP for International at Hortonworks. We're on theCUBE here at Dataworks Summit 2018 in Berlin, and want to thank you all for watching this segment and tune in tomorrow, we'll have a full slate of further discussions with Hortonworks, with IBM and others tomorrow on theCUBE. Have a good one. (upbeat music)

Published Date : Apr 18 2018

SUMMARY :

Brought to you by Hortonworks. and Hortonworks is the host of Dataworks Summit. and the kinds of opportunities you're selling into. Also, now the fastest to get to $200 million of the imagination in terms of growth, and governance to the core Hadoop platform Pre-transaction data, can you define what you mean maybe on social before they come in or Engagement data, that that's an essential role in the organization. Do many of your Asian customer, that they need to sunset. the ability to be able to leverage the Cloud vendors' skills and Microsoft Azure or Most of our business is still on-prem to be very candid. and joint deals with IBM that are critical to our success also. and want to thank you all for watching this segment and

ENTITIES

Entity	Category	Confidence
James Kobielus	PERSON	0.99+
Joe Morrissey	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Asia	LOCATION	0.99+
Europe	LOCATION	0.99+
Joe	PERSON	0.99+
Uruguay	LOCATION	0.99+
Hortonworks	ORGANIZATION	0.99+
India	LOCATION	0.99+
Scott Gnau	PERSON	0.99+
seven years	QUANTITY	0.99+
Wikibon	ORGANIZATION	0.99+
28%	QUANTITY	0.99+
South Africa	LOCATION	0.99+
Onyara	ORGANIZATION	0.99+
Berlin	LOCATION	0.99+
United States	LOCATION	0.99+
$100 million	QUANTITY	0.99+
$200 million	QUANTITY	0.99+
31%	QUANTITY	0.99+
five weeks	QUANTITY	0.99+
18 months	QUANTITY	0.99+
Jio	ORGANIZATION	0.99+
tomorrow	DATE	0.99+
2017	DATE	0.99+
both	QUANTITY	0.99+
GDPR	TITLE	0.99+
one example	QUANTITY	0.99+
one	QUANTITY	0.98+
today	DATE	0.98+
U.S.	LOCATION	0.98+
Dataworks Summit 2018	EVENT	0.98+
AWS	ORGANIZATION	0.98+
Berlin, Germany	LOCATION	0.98+
over 40%	QUANTITY	0.98+
Microsoft	ORGANIZATION	0.98+
Reliance	ORGANIZATION	0.98+
over 150%	QUANTITY	0.97+
Dataworks Summit	EVENT	0.97+
EMEA	ORGANIZATION	0.97+
first evolution	QUANTITY	0.96+
2018	EVENT	0.96+
European	OTHER	0.96+
SiliconANGLE Media	ORGANIZATION	0.95+
Munich, Germany	LOCATION	0.95+
One	QUANTITY	0.95+
end of 2017	DATE	0.94+
Hadoop	TITLE	0.93+
Cloudera	ORGANIZATION	0.93+
about 15%	QUANTITY	0.93+
past year	DATE	0.92+
theCUBE	ORGANIZATION	0.92+
single vendor	QUANTITY	0.91+
telco	ORGANIZATION	0.89+
Munich Re	ORGANIZATION	0.88+

Rob Thomas, IBM Analytics | IBM Fast Track Your Data 2017

>> Announcer: Live from Munich, Germany, it's theCUBE. Covering IBM: Fast Track Your Data. Brought to you by IBM. >> Welcome, everybody, to Munich, Germany. This is Fast Track Your Data brought to you by IBM, and this is theCUBE, the leader in live tech coverage. We go out to the events, we extract the signal from the noise. My name is Dave Vellante, and I'm here with my co-host Jim Kobielus. Rob Thomas is here, he's the General Manager of IBM Analytics, and longtime CUBE guest, good to see you again, Rob. >> Hey, great to see you. Thanks for being here. >> Dave: You're welcome, thanks for having us. So we're talking about, we missed each other last week at the Hortonworks DataWorks Summit, but you came on theCUBE, you guys had the big announcement there. You're sort of getting out, doing a Hadoop distribution, right? TheCUBE gave up our Hadoop distributions several years ago so. It's good that you joined us. But, um, that's tongue-in-cheek. Talk about what's going on with Hortonworks. You guys are now going to be partnering with them essentially to replace BigInsights, you're going to continue to service those customers. But there's more than that. What's that announcement all about? >> We're really excited about that announcement, that relationship, just to kind of recap for those that didn't see it last week. We are making a huge partnership with Hortonworks, where we're bringing data science and machine learning to the Hadoop community. So IBM will be adopting HDP as our distribution, and that's what we will drive into the market from a Hadoop perspective. Hortonworks is adopting IBM Data Science Experience and IBM machine learning to be a core part of their Hadoop platform. And I'd say this is a recognition. One is, companies should do what they do best. We think we're great at data science and machine learning. Hortonworks is the best at Hadoop. Combine those two things, it'll be great for clients. 
And, we also talked about extending that to things like Big SQL, where they're partnering with us on Big SQL, around modernizing data environments. And then third, which relates a little bit to what we're here in Munich talking about, is governance, where we're partnering closely with them around unified governance, Apache Atlas, advancing Atlas in the enterprise. And so, it's a lot of dimensions to the relationship, but I can tell you since I was on theCUBE a week ago with Rob Bearden, client response has been amazing. Rob and I have done a number of client visits together, and clients see the value of unlocking insights in their Hadoop data, and they love this, which is great. >> Now, I mean, the Hadoop distro, I mean early on you got into that business, just, you had to do it. You had to be relevant, you want to be part of the community, and a number of folks did that. But it's really sort of best left to a few guys who want to do that, and Apache open source is really, I think, the way to go there. Let's talk about Munich. You guys chose this venue. There's a lot of talk about GDPR, you've got some announcements around unified governance, but why Munich? >> So, there's something interesting that I see happening in the market. So first of all, you look at the last five years. There's only 10 companies in the world that have outperformed the S&P 500, in each of those five years. And we started digging into who those companies are and what they do. They are all applying data science and machine learning at scale to drive their business. And so, something's happening in the market. That's what leaders are doing. And I look at what's happening in Europe, and I say, I don't see the European market being that aggressive yet around data science, machine learning, how you apply data for competitive advantage, so we wanted to come do this in Munich. And it's a bit of a wake-up call, almost, to say hey, this is what's happening.
We want to encourage clients across Europe to think about how do they start to do something now. >> Yeah, of course, GDPR is also a hook. The European Union and you guys have made some talk about that, you've got some keynotes today, and some breakout sessions that are discussing that, but talk about the two announcements that you guys made. There's one on DB2, there's another one around unified governance, what do those mean for clients? >> Yeah, sure, so first of all on GDPR, it's interesting to me, it's kind of the inverse of Y2K, which is there's very little hype, but there's huge ramifications. And Y2K was kind of the opposite. So look, it's coming, May 2018, clients have to be GDPR-compliant. And there's a misconception in the market that that only impacts companies in Europe. It actually impacts any company that does any type of business in Europe. So, it impacts everybody. So we are announcing a platform for unified governance that makes sure clients are GDPR-compliant. We've integrated software technology across analytics, IBM security, some of the assets from the Promontory acquisition that IBM did last year, and we are delivering the only platform for unified governance. And that's what clients need to be GDPR-compliant. The second piece is data has to become a lot simpler. As you think about my comment, who's leading the market today? Data's hard, and so we're trying to make data dramatically simpler. And so for example, with DB2, what we're announcing is you can download and get started using DB2 in 15 minutes or less, and anybody can do it. Even you can do it, Dave, which is amazing. >> Dave: (laughs) >> For the first time ever, you can-- >> We'll test that, Rob. >> Let's go test that. I would love to see you do it, because I guarantee you can. Even my son can do it. I had my son do it this weekend before I came here, because I wanted to see how simple it was. 
So that announcement is really about bringing, or introducing a new era of simplicity to data and analytics. We call it Download And Go. We started with SPSS, we did that back in March. Now we're bringing Download And Go to DB2, and to our governance catalog. So the idea is make data really simple for enterprises. >> You had a community edition previous to this, correct? There was-- >> Rob: We did, but it wasn't this easy. >> Wasn't this simple, okay. >> Not anybody could do it, and I want to make it so anybody can do it. >> Is simplicity, the rate of simplicity, the only differentiator of the latest edition, or I believe you have Kubernetes support now with this new edition, can you describe what that involves? >> Yeah, sure, so there's two main things that are new functionality-wise, Jim, to your point. So one is, look, we're big supporters of Kubernetes. And as we are helping clients build out private clouds, the best answer for that in our mind is Kubernetes, and so when we released Data Science Experience for Private Cloud earlier this quarter, that was on Kubernetes, extending that now to other parts of the portfolio. The other thing we're doing with DB2 is we're extending JSON support for DB2. So think of it as, you're working in a relational environment, now just through SQL you can integrate with non-relational environments, JSON, documents, any type of NoSQL environment. So we're finally bringing to fruition this idea of a data fabric, which is I can access all my data from a single interface, and that's pretty powerful for clients. >> Yeah, more cloud data development. Rob, I wonder if you can, we can go back to the machine learning, one of the core focuses of this particular event and the announcements you're making. Back in the fall, IBM made an announcement of Watson machine learning, for IBM Cloud, and World of Watson. In February, you made an announcement of IBM machine learning for the z platform.
What are the machine learning announcements at this particular event, and can you sort of connect the dots in terms of where you're going, in terms of what sort of innovations are you driving into your machine learning portfolio going forward? >> I have a fundamental belief that machine learning is best when it's brought to the data. So, we started with, like you said, Watson machine learning on IBM Cloud, and then we said well, what's the next big corpus of data in the world? That's an easy answer, it's the mainframe, that's where all the world's transactional data sits, so we did that. Last week with the Hortonworks announcement, we said we're bringing machine learning to Hadoop, so we've kind of covered all the landscape of where data is. Now, the next step is about how do we bring a community into this? And the way that you do that is we don't dictate a language, we don't dictate a framework. So if you want to work with IBM on machine learning, or in Data Science Experience, you choose your language. Python, great. Scala or Java, you pick whatever language you want. You pick whatever machine learning framework you want, we're not trying to dictate that because there's different preferences in the market, so what we're really talking about here this week in Munich is this idea of an open platform for data science and machine learning. And we think that is going to bring a lot of people to the table. >> And with open, one thing, with open platform in mind, one thing to me that is conspicuously missing from the announcement today, correct me if I'm wrong, is any indication that you're bringing support for the deep learning frameworks like TensorFlow into this overall machine learning environment. Am I wrong? I know you have Power AI. Is there a piece of Power AI in these announcements today? >> So, stay tuned on that. We are, it takes some time to do that right, and we are doing that. 
But we want to optimize so that you can do machine learning with GPU acceleration on Power AI, so stay tuned on that one. But we are supporting multiple frameworks, so if you want to use TensorFlow, that's great. If you want to use Caffe, that's great. If you want to use Theano, that's great. That is our approach here. We're going to allow you to decide what's the best framework for you. >> So as you look forward, maybe it's a question for you, Jim, but Rob I'd love you to chime in. What does that mean for businesses? I mean, is it just more automation, more capabilities as you evolve that timeline, without divulging any sort of secrets? What do you think, Jim? Or do you want me to ask-- >> What do I think, what do I think you're doing? >> No, you ask about deep learning, like, okay, that's, I don't see that, Rob says okay, stay tuned. What does it mean for a business, that, if like-- >> Yeah. >> If I'm planning my roadmap, what does that mean for me in terms of how I should think about the capabilities going forward? >> Yeah, well what it means for a business, first of all, is what they're going, they're using deep learning for, is doing things like video analytics, and speech analytics and more of the challenges involving convolutional neural networks to do pattern recognition on complex data objects for things like connected cars, and so forth. Those are the kind of things that can be done with deep learning. >> Okay. And so, Rob, you're talking about here in Europe how the uptick in some of the data orientation has been a little bit slower, so I presume from your standpoint you don't want to over-rotate, to some of these things. But what do you think, I mean, it sounds like there is a difference between certainly Europe and those top 10 companies in the S&P, outperforming the S&P 500. What's the barrier, is it just an understanding of how to take advantage of data, is it cultural, what's your sense of this?
>> So, to some extent, data science is easy, data culture is really hard. And so I do think that culture's a big piece of it. And the reason we're kind of starting with a focus on machine learning, simplistic view, machine learning is a general-purpose framework. And so it invites a lot of experimentation, a lot of engagement, we're trying to make it easier for people to on-board. As you get to things like deep learning as Jim's describing, that's where the market's going, there's no question. Those tend to be very domain-specific, vertical-type use cases and to some extent, what I see clients struggle with, they say well, I don't know what my use case is. So we're saying, look, okay, start with the basics. A general purpose framework, do some tests, do some iteration, do some experiments, and once you find out what's hunting and what's working, then you can go to a deep learning type of approach. And so I think you'll see an evolution towards that over time, it's not either-or. It's more of a question of sequencing. >> One of the things we've talked to you about on theCUBE in the past, you and others, is that IBM obviously is a big services business. This big data is complicated, but great for services, but one of the challenges that IBM and other companies have had is how do you take that service expertise, codify it to software and scale it at large volumes and make it adoptable? I thought the Watson data platform announcement last fall, I think at the time you called it Data Works, and then so the name evolved, was really a strong attempt to do that, to package a lot of expertise that you guys had developed over the years, maybe even some different software modules, but bring them together in a scalable software package. So is that the right interpretation, how's that going, what's the uptake been like? >> So, it's going incredibly well. 
What's interesting to me is what everybody remembers from that announcement is the Watson Data Platform, which is a decomposable framework for doing these types of use cases on the IBM cloud. But there was another piece of that announcement that is just as critical, which is we introduced something called the Data First method. And that is the recipe book to say to a client, so given where you are, how do you get to this future on the cloud? And that's the part that people, clients, struggle with, is how do I get from step to step? So with Data First, we said, well look. There's different approaches to this. You can start with governance, you can start with data science, you can start with data management, you can start with visualization, there's different entry points. You figure out the right one for you, and then we help clients through that. And we've made Data First method available to all of our business partners so they can go do that. We work closely with our own consulting business on that, GBS. But that to me is actually the thing from that event that has had, I'd say, the biggest impact on the market, is just helping clients map out an approach, a methodology, to getting on this journey. >> So that was a catalyst, so this is not a sequential process, you can start, you can enter, like you said, wherever you want, and then pick up the other pieces from a maturity model standpoint? >> Exactly, because everybody is at a different place in their own life cycle, and so we want to make that flexible. >> I have a question about the clients, the customers' use of Watson Data Platform in a DevOps context. So, are more of your customers looking to use Watson Data Platform to automate more of the stages of the machine learning development and the training and deployment pipeline, and do you see, IBM, do you see yourself taking the platform and evolving it into a more full-fledged automated data science release pipelining tool? Or am I misunderstanding that?
>> Rob: No, I think that-- >> Your strategy. >> Rob: You got it right, I would just, I would expand a little bit. So, one is it's a very flexible way to manage data. When you look at the Watson Data Platform, we've got relational stores, we've got column stores, we've got in-memory stores, we've got the whole suite of open-source databases under the Compose.io umbrella, we've got Cloudant. So we've delivered a very flexible data layer. Now, in terms of how you apply data science, we say, again, choose your model, choose your language, choose your framework, that's up to you, and we allow clients, many clients start by building models on their private cloud, then we say you can deploy those into the Watson Data Platform, so therefore then they're running on the data that you have as part of that data fabric. So, we're continuing to deliver a very fluid data layer which then you can apply data science, apply machine learning there, and there's a lot of data moving into the Watson Data Platform because clients see that flexibility. >> All right, Rob, we're out of time, but I want to kind of set up the day. We're doing CUBE interviews all morning here, and then we cut over to the main tent. You can get all of this on IBMgo.com, you'll see the schedule. Rob, you've got, you're kicking off a session. We've got Hilary Mason, we've got a breakout session on GDPR, maybe set up the main tent for us. >> Yeah, main tent's going to be exciting. We're going to debunk a lot of misconceptions about data and about what's happening. Marc Altshuller has got a great segment on what he calls the death of correlations, so we've got some pretty engaging stuff. Hilary's got a great piece that she was talking to me about this morning. It's going to be interesting. We think it's going to provoke some thought and ultimately provoke action, and that's the intent of this week. >> Excellent, well Rob, thanks again for coming to theCUBE. It's always a pleasure to see you.
>> Rob: Thanks, guys, great to see you. >> You're welcome; all right, keep it right there, buddy. We'll be back with our next guest. This is theCUBE, we're live from Munich, Fast Track Your Data, right back. (upbeat electronic music)

Published Date : Jun 22 2017

SUMMARY :

Brought to you by IBM. This is Fast Track Your Data brought to you by IBM, Hey, great to see you. It's good that you joined us. and machine learning to the Hadoop community. You had to be relevant, you want to be part of the community, So first of all, you look at the last five years. but talk about the two announcements that you guys made. Even you can do it, Dave, which is amazing. I would love to see you do it, because I guarantee you can. but it wasn't this easy. and I want to make it so anybody can do it. extending that now to other parts of the portfolio. What are the machine learning announcements at this And the way that you do that is we don't dictate I know you have Power AI. We're going to allow you to decide So as you look forward, maybe it's a question No, you ask about deep learning, like, okay, that's, and speech analytics and more of the challenges But what do you think, I mean, it sounds like And the reason we're kind of starting with a focus One of the things we've talked to you about on theCUBE And that is the recipe book to say to a client, process, you can start, you can enter, and deployment pipeline, and do you see, IBM, models on their private cloud, then we say you can deploy and then we cut over to the main tent. and that's the intent of this week. It's always a pleasure to see you. This is theCUBE, we're live from Munich,

ENTITIES

Entity	Category	Confidence
Jim Kobielus	PERSON	0.99+
Dave Vellante	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Jim	PERSON	0.99+
Europe	LOCATION	0.99+
Rob	PERSON	0.99+
Marc Altshuller	PERSON	0.99+
Hilary	PERSON	0.99+
Hilary Mason	PERSON	0.99+
Rob Bearden	PERSON	0.99+
February	DATE	0.99+
Dave	PERSON	0.99+
Hortonworks	ORGANIZATION	0.99+
Rob Thomas	PERSON	0.99+
May 2018	DATE	0.99+
March	DATE	0.99+
Munich	LOCATION	0.99+
Scala	TITLE	0.99+
Apache	ORGANIZATION	0.99+
second piece	QUANTITY	0.99+
Last week	DATE	0.99+
Java	TITLE	0.99+
last year	DATE	0.99+
two announcements	QUANTITY	0.99+
10 companies	QUANTITY	0.99+
GDPR	TITLE	0.99+
Python	TITLE	0.99+
DB2	TITLE	0.99+
15 minutes	QUANTITY	0.99+
last week	DATE	0.99+
IBM Analytics	ORGANIZATION	0.99+
European Union	ORGANIZATION	0.99+
five years	QUANTITY	0.99+
JSON	TITLE	0.99+
Watson Data Platform	TITLE	0.99+
third	QUANTITY	0.99+
One	QUANTITY	0.99+
this week	DATE	0.98+
today	DATE	0.98+
a week ago	DATE	0.98+
two things	QUANTITY	0.98+
SQL	TITLE	0.98+
last fall	DATE	0.98+
2017	DATE	0.98+
Munich, Germany	LOCATION	0.98+
each	QUANTITY	0.98+
Y2K	ORGANIZATION	0.98+

Arun Murthy, Hortonworks | DataWorks Summit 2017

>> Announcer: Live from San Jose, in the heart of Silicon Valley, it's theCUBE covering DataWorks Summit 2017. Brought to you by Hortonworks. >> Good morning, welcome to theCUBE. We are live at day 2 of the DataWorks Summit, and have had a great day so far, yesterday and today. I'm Lisa Martin with my co-host George Gilbert. George and I are very excited to be joined by a multiple-time CUBE alumnus, the co-founder and VP of Engineering at Hortonworks, Arun Murthy. Hey, Arun. >> Thanks for having me, it's good to be back. >> Great to have you back, so yesterday, great energy at the event. You could see and hear behind us, great energy this morning. One of the things that was really interesting yesterday, besides the IBM announcement, and we'll dig into that, was that we had your CEO on, as well as Rob Thomas from IBM, and Rob said, you know, one of the interesting things over the last five years was that there have been only 10 companies that have beat the S&P 500, have outperformed, in each of the last five years, and those companies have made big bets on data science and machine learning. And as we heard yesterday, these four mega-trends: IoT, cloud, streaming analytics, and now the fourth big leg, data science. Talk to us about what Hortonworks is doing, you've been here from the beginning, as a co-founder, as I've mentioned, you've been with Hadoop since it was a little baby. How is Hortonworks evolving to become one of those big users making big bets on helping your customers, and yourselves, leverage machine learning to really drive the business forward? >> Absolutely, a great question. So, you know, if you look at some of the history of Hadoop, it started off with this notion of a data lake, and then, I'm talking about the enterprise side of Hadoop, right? I've been working for Hadoop for about 12 years now, you know, the last six of it has been as a vendor selling Hadoop to enterprises.
They started off with this notion of data lake, and as people have adopted that vision of a data lake, you know, you bring all the data in, and now you're starting to get governance and security, and all of that. Obviously the, one of the best ways to get value over the data is the notion of, you know, can you, sort of, predict what is going to happen in your world of it, with your customers, and, you know, whatever it is with the data that you already have. So that notion of, you know, Rob, our CEO, talks about how we're trying to move from a post-transactional world to a pre-transactional world, and doing the analytics and data sciences will be, obviously, with me. We could talk about, and there's so many applications of it, something as simple as, you know, we did a demo last year of, you know, of how we're working with a freight company, and we're starting to show them, you know, predict which drivers and which routes are going to have issues, as they're trying to move, alright? Four years ago we did the same demo, and we would say, okay this driver has, you know, we would show that this driver had an issue on this route, but now, within the world, we can actually predict and let you know to take preventive measures up front. Similarly internally, you know, you can take things from, you know, machine learning, and log analytics, and so on, we have an internal problem, you know, where we have to test two different versions of HDP itself, and as you can imagine, it's a really, really hard problem. We have to support 10 operating systems, seven databases, like, if you multiply that matrix, it's, you know, tens of thousands of options. So, if you do all that testing, we now use machine learning internally, to look through the logs, and kind of predict where the failures were, and help our own, sort of, software engineers understand where the problems were, right?
An extension of that has been, you know, the work we've done in SmartSense, which is a service we offer our enterprise customers. We collect logs from their Hadoop clusters, and then we can actually help them understand where they can either tune their applications, or even tune their hardware, right? They might have a, you know, we have this example I really like where at a really large enterprise Financial Services client, they had literally, you know, hundreds and, you know, and thousands of machines on HDP, and we, using SmartSense, we actually found that there were 25 machines which had bad NIC configuration, and we proved to them that by fixing those, we got 30% throughput back on their cluster. At that scale, it's a lot of money, it's a lot of capex, it's a lot of opex. So, as a company, we try to use it ourselves, as much as we, kind of, try to help our customers adopt it. Does that make sense? >> Yeah, let's drill down on that even a little more, 'cause it's pretty easy to understand what's the standard telemetry you would want out of hardware, but as you, sort of, move up the stack the metrics, I guess, become more custom. So how do you learn, not just from one customer, but from many customers especially when you can't standardize what you're supposed to pull out of them? >> Yeah so, we're sort of really big believers in, sort of, dogfooding your own stuff, right? So, we talk about the notion of data lake, we actually run a SmartSense data lake where we actually get data across, you know, the hundreds of our customers, and we can actually do predictive machine learning on that data in our own data lake. Right? And to your point about how we go up the stack, this is, kind of, where we feel like we have a natural advantage because we work on all the layers, whether it's the SQL engine, or the storage engine, or, you know, above and beyond the hardware. So, as we build these models, we understand that we need more, or different, telemetry right?
And we put that back into the product so the next version of HDP will have those metrics that we wanted. And, now we've been doing this for a couple of years, which means we've done three, four, five turns of the crank, obviously something we always get better at, but I feel like, compared to where we were a couple of years ago when SmartSense first came out, it's actually matured quite a lot, from that perspective. >> So, there's a couple different paths you can add to this, which is customers might want, as part of their big data workloads, some non-Hortonworks, you know, services or software when it's on-prem, and then can you also extend this management to the Cloud if they want a hybrid setup where, in the not too distant future, the Cloud vendor will be also a provider for this type of management. >> So absolutely, in fact it's true today when, you know, we work with, you know, Microsoft's a great partner of ours. We work with them to enable SmartSense on HDI, which means we can actually get the same telemetry back, whether you're running the data on an on-prem HDP, or you're running this on HDI. Similarly, we shipped a version of our Cloud product, our Hortonworks Data Cloud, on Amazon and again SmartSense preplanned there, so whether you're on an Amazon, or a Microsoft, or on-prem, we get the same telemetry, we get the same data back. We can actually, if you're a customer using many of these products, we can actually give you that telemetry back. Similarly, if you guys probably know this we have, you were probably there in an analyst when they announced the Flex Support subscription, which means that now we can actually take the support subscription you have to get from Hortonworks, and you can actually use it on-prem or on the Cloud.
>> So in terms of transforming, HDP for example, just want to make sure I'm understanding this, you're pulling in data from customers to help evolve the product, and that data can be on-prem, it can be in Microsoft Azure, it can be in AWS? >> Exactly. The HDP can be running in any of these, we will actually pull all of them to our data lake, and they actually do the analytics for us and then present it back to the customers. So, in our support subscription, the way this works is we do the analytics in our lake, and it pushes it back, in fact to our support team tickets, and our sales force, and all the support mechanisms. And they get a set of recommendations saying, Hey, we know these are the workloads you're running, we see these are the opportunities for you to do better, whether it's tuning hardware, tuning an application, tuning the software, we sort of send the recommendations back, and the customer can go and say, Oh, that makes sense, they accept that and we'll, you know, we'll update the recommendation for you automatically. Then you can have, or you can say, Maybe I don't want to change my kernel parameters, let's have a conversation. And if the customer, you know, is going through with that, then they can go and change it on their own. We do that, sort of, back and forth with the customer. >> One thing that just pops into my mind is, we talked a lot yesterday about data governance, are there particular, and also yesterday on stage were >> Arun: With IBM >> Yes exactly, when we think of, you know, really data-intensive industries, retail, financial services, insurance, healthcare, manufacturing, are there particular industries where you're really leveraging this, kind of, bi-directional, because there's no governance restrictions, or maybe I shouldn't say none, but. Give us a sense of which particular industries are really helping to fuel the evolution of Hortonworks data lake. >> So, I think healthcare is a great example.
You know, when we started off, sort of this open-source project, Atlas, you know, a couple of years ago, we got a lot of traction in the healthcare sort of insurance industry. You know, folks like Aetna were actually founding members of that, you know, sort of consortium of doing this, right? And, we're starting to see them get a lot of leverage out of this. Similarly now as we go into, you know, Europe and expand there, things like GDPR are really, really important, right? And, you guys know GDPR is a really big deal. Like, you pay, if you're not compliant by, I think it's like March of next year, you pay a portion of your revenue as fines. That's, you know, big money for everybody. So, I think that's what we're really excited about the partnership with IBM, because we feel like the two of us can help a lot of customers, especially in countries where they're significantly more highly regulated than the United States, to actually leverage our, sort of, giant portfolio of products. And IBM's been a great contributor to Atlas, they've adopted it wholesale as you saw, you know, in the announcements yesterday. >> So, you're doing a Keynote tomorrow, so give us maybe the top three things, you're giving the Keynote on Data Lake 3.0, walk us through the evolution. Data Lakes 1.0, 2.0, 3.0, where you are now, and what folks can expect to hear and see in your Keynote. >> Absolutely. So as we've, kind of, continued to work with customers and we see the maturity model of customers, you know, initially people are standing up a data lake, and then they'd want, you know, sort of security, basic security what it covers, and so on. Now, they want governance, and as we're starting to go through that journey, clearly, our customers are pushing us to help them get more value from the data. It's not just about putting the data lake, and obviously managing data with governance, it's also about, Can you help us, you know, do machine learning, Can you help us build other apps, and so on.
So, as we look at it, there's a fundamental evolution that, you know, the Hadoop ecosystem had to go through, with the advent of technologies like, you know, Docker, it's really important first to help the customers bring more than just workloads which are sort of native to Hadoop. You know, Hadoop started off with MapReduce, obviously Spark's been great, and now we're starting to see technologies like Flink coming, but increasingly, you know, we want to do data science. Mass market data science is, obviously, you know, people, like, want to use Spark, but the mass market is still Python, and R, and so on, right? >> Lisa: Non-native, okay. >> Non-native. Which are not really built, you know, these predate Hadoop by a long way, right. So now as we bring these applications in, having technology like Docker is really important, because now we can actually containerize these apps. It's not just about running Spark, you know, running Spark with R, or running Spark with Python, which you can do today. The problem is, in a true multi-tenant governed system, you want, not just R, but you want specific sets of libraries for R, right. And the libraries, you know, George wants might be completely different than what I want. And, you know, you can't do a multi-tenant system where you install both of them simultaneously. So Docker is a really elegant solution to problems like those. So now we can actually bring those technologies into a Docker container, so George's Docker containers will not, you know, conflict with mine. And you can actually be off to the races, you know, doing data science. Which is really key for technologies like DSX, right? Because with DSX if you see, obviously DSX supports Spark with technologies like, you know, Zeppelin which is a front-end, but they also have Jupyter, which is going to work for the mass market users for Python and R, right?
So we want to make sure there's no friction whether it's, sort of, the guys using Spark, or the guys using R, and equally importantly DSX, you know, in the roadmap will also support things like, you know, the classic IBM portfolio, SPSS and so on. So bringing all of those things in together, making sure they run with data in the data lake, and also the compute in the data lake, is really big for us. >> Wow, so it sounds like your Keynote's going to be very educational for the folks that are attending tomorrow, so last question for you. One of the themes that occurred in the Keynote this morning was sharing a fun-fact about these speakers. What's a fun-fact about Arun Murthy? >> Great question. I guess, you know, people have been looking for folks with, you know, 10 years of experience on Hadoop. I'm here finally, right? There's not a lot of people but, you know, it's fun to be one of those people who've worked on this for about 10 years. Obviously, I look forward to working on this for another 10 or 15 more, but it's been an amazing journey. >> Excellent. Well, we thank you again for sharing time again with us on theCUBE. You've been watching theCUBE live on day 2 of the DataWorks Summit, hashtag DWS17, for my co-host George Gilbert. I am Lisa Martin, stick around we've got great content coming your way.

Published Date : Jun 14 2017

SUMMARY :

Brought to you by Hortonworks. We are live at day 2 of the DataWorks Summit, and Rob said, you know, one of the interesting and we're starting to show them, you know, when you can't standardize what you're or the storage engine, or, you know, some non-Hortonworks, you know, services when, you know, we work with, you know, And if the customer, you know, Yes exactly, when we think of, you know, Similarly now as we go into, you know, Data Lakes 1.0, 2.0, 3.0, where you are now, with advance of technologies like, you know, And the libraries, you know, George wants One of the themes that occurred in the Keynote this morning There's not a lot of people but, you know, Well, we thank you again for sharing time again

ENTITIES

Entity	Category	Confidence
George Gilbert	PERSON	0.99+
Lisa Martin	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Rob	PERSON	0.99+
Hortonworks	ORGANIZATION	0.99+
Rob Thomas	PERSON	0.99+
George	PERSON	0.99+
Lisa	PERSON	0.99+
30%	QUANTITY	0.99+
San Jose	LOCATION	0.99+
Microsoft	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
25 machines	QUANTITY	0.99+
10 operating systems	QUANTITY	0.99+
hundreds	QUANTITY	0.99+
Arun Murthy	PERSON	0.99+
Silicon Valley	LOCATION	0.99+
two	QUANTITY	0.99+
Aetna	ORGANIZATION	0.99+
10 years	QUANTITY	0.99+
Arun	PERSON	0.99+
today	DATE	0.99+
Spark	TITLE	0.99+
yesterday	DATE	0.99+
AWS	ORGANIZATION	0.99+
both	QUANTITY	0.99+
Python	TITLE	0.99+
last year	DATE	0.99+
Four years ago	DATE	0.99+
15	QUANTITY	0.99+
tomorrow	DATE	0.99+
CUBE	ORGANIZATION	0.99+
three	QUANTITY	0.99+
DataWorks Summit	EVENT	0.99+
seven databases	QUANTITY	0.98+
four	QUANTITY	0.98+
DataWorks Summit 2017	EVENT	0.98+
United States	LOCATION	0.98+
Dataworks Summit	EVENT	0.98+
10	QUANTITY	0.98+
Europe	LOCATION	0.97+
10 companies	QUANTITY	0.97+
One	QUANTITY	0.97+
one customer	QUANTITY	0.97+
thousands of machines	QUANTITY	0.97+
about 10 years	QUANTITY	0.96+
GDPR	TITLE	0.96+
Docker	TITLE	0.96+
Smartsense	ORGANIZATION	0.96+
about 12 years	QUANTITY	0.95+
this morning	DATE	0.95+
each	QUANTITY	0.95+
two different versions	QUANTITY	0.95+
five turns	QUANTITY	0.94+
R	TITLE	0.93+
four meta-trains	QUANTITY	0.92+
day 2	QUANTITY	0.92+
Data Lakes 1.0	COMMERCIAL_ITEM	0.92+
Flink	ORGANIZATION	0.91+
first	QUANTITY	0.91+
HDP	ORGANIZATION	0.91+


Search Results for Dataworks Summit Europe 2017: