Chuck Yarbough, Pentaho | Big Data NYC 2017
>> Announcer: Live from Midtown Manhattan, it's theCUBE. Covering Big Data New York City 2017, brought to you by SiliconANGLE Media and its ecosystem sponsors.

>> Hey, welcome back everyone. Live here in New York City, it's theCUBE's special presentation, Big Data NYC. This is our fifth year doing our own event here in New York City, our eighth year covering the Hadoop World ecosystem from the beginning. Through eight years, it's had a lot of evolutions: Hadoop World, Strata Conference, Strata Hadoop, and now it's called Strata Data, happening right around the corner. We run our own event here, talk to thought leaders and the expert CEOs, entrepreneurs. Getting the data for you, sharing that with you. I'm John Furrier, co-host of theCUBE, with my co-host here, Jim Kobielus, who's the Lead Analyst at Wikibon Big Data. And Chuck Yarbough, who's the Vice President at Pentaho Solutions, part of Hitachi's new Vantara. A new company, just announced last week. Hitachi is combining a variety of their portfolio technologies into a new company, out to bring in a lot of those integrated solutions. Chuck, great to see you again, theCUBE alumni. We've chatted multiple times at Pentaho World, going back to 2015.

>> Always great to be at theCUBE.

>> What a couple of years it's been. Give us quickly the hard news. It's pretty awesome. You guys had a variety of things at Pentaho, you know, with Hitachi, that happened, now the market's evolved. What's this new entity, this new company they're bringing together?

>> Yes, so the big news is Hitachi Vantara. So what that is: two years ago, Hitachi Data Systems acquired Pentaho. Fast forward two years, and a new company gets created from Hitachi Data Systems, Pentaho, and a third organization at Hitachi called the Insight Group, so Hitachi Insight Group. Those three groups come together to form Hitachi Vantara.

>> What's the motivation behind that? I mean, I can connect the dots, but I want to hear your perspective, because it really is about pulling things together.
The trend this year at the show is, as Jim calls it, hybrid data, integrated data. Things seem to be coming together. Is that part of the purpose? What's the reason behind pulling this together?

>> Yeah, I think there's a lot of reasons. One of them is what we're seeing not just in our own business but in our customers' business, and that is digital transformation. Right, this need to evolve. So Hitachi Vantara is all about data and analytics. And a big focus of what we do is what Pentaho's been doing for years, which is driving in all kinds of data, big data, all data. I think we're on the cusp of closing out the big data term, but you know, it's all data, right?

>> Data everywhere, every application.

>> And applying analytics across the board. One of the big initiatives was part of why Pentaho was originally acquired. Actually, Hitachi Data Systems was a customer of Pentaho when we got acquired, so we knew each other pretty well. And part of the reason for that acquisition was to drive analytics around the internet of things. The IoT space, which is something that Hitachi, being a very large IT and operational technology, OT, company, probably does as well as anybody, if not better.

>> So going back a couple of years, I'm just looking at my notes here from our video index. You visited theCUBE in 2015, but really the concepts have evolved significantly. I want to just highlight a few of them. Data warehouse optimization, we talked about that. Data refinery concepts, the 360 view as applied to big data. Again, those were foundational concepts that are all in play right now.

>> Absolutely.

>> What is the update in those areas? Because refinery, everyone talks about data refinery, you know, oil, the easy oil example. But I mean, come on, data is everywhere, it is most important, and you can use it multiple times, unlike oil, as you were pointing out.

>> So, interesting you bring that up.
So to me, data refinery in a digital transformation, really in an IoT world, where lots of data is streaming through. In fact, yesterday I read something by IDC that 95% of all data in the future, and the data growth is dramatic, it's 10x what it is today in just a few years, 95% of that growth of data is IoT related. The question is how are you using most of that, right, and what are you going to do with it? So that data is streaming through, there's a lot happening, we can do things at the edge, we can apply analytics and filtering and do things. But ultimately that data is going to land somewhere, and that's where that refinery is. Think of it as the big data center refinery, right, where I'm going to take that large amount of data and do the things that Jim does, you know, and apply machine learning and deep algorithms too, really.

>> I had some thoughts on the IoT. Jim and I were arguing, not arguing, discussing, with others in theCUBE about the role.

>> We were bickering.

>> The role of the edge, because I was saying the refinery of the data can come back depending on what kind of data, or you push compute to the edge, kind of known concepts, people have been discussing that. But the issue has been, how do you view the edge? I'd love to get your reaction to that question, because a lot of people are saying you have to think of IoT as a completely different category than just cloud, than just data center. The way some people are looking at IoT, and I know this can be semantics, whether it's industrial or just a straight internet of things device, or person, that is a different animal when it comes to what you call it and how it gets put into a bucket. I mean, most people put a lot of it in the IT bucket. But some are saying the IoT edge should be a completely different category of how you look at those problems. Your thoughts on how that IoT conversation shapes up?

>> The question I always ask when I'm talking to somebody about the edge is, well, what do you mean?
Because it is something that can be defined a little bit differently, but in an industrial IoT context I think, you know, we look at it as, one, you have to know what those things are, you have to really understand them. And part of understanding those things is having a digital representation of what those things are.

>> A digital twin?

>> A digital twin. Right, or asset avatar, as we call it at Hitachi.

>> Oh, I like that.

>> So this idea of really managing those assets, understanding what they are, and then being able to know what the current state is, what the previous state was, things like that. And then that refinery we just talked about is sort of where that information goes, so you can do other kinds of analytics, right? But when you're talking about the edge, typically what we're seeing is that the kinds of analytics that might happen at the edge are probably more around filtering. You know, it's not quite as complex analytics. That's what we're seeing today. Now, the future, I don't know.

>> Sort of tiered analytics from the edge on in, with more minimal, I mean, not minimal, that's the wrong term, with more narrowly scoped inference. Like predictions and so forth being handled at the edge, with larger, more complex models, deep learning, whatever, being processed in the cloud. Is that it?

>> Yeah, that's exactly the way that I see it. Now, the other thing about the edge, depending on who you're talking to, again, is what is an edge device, or the gateways, or the compute, right? So IoT, in my mind, it's not cloud, it's not on-prem, I mean, it's a little bit of everything, right? It depends on the use case and what you're operating. We have a customer who does trains as a service in England, in Europe, and so they don't sell the trains anymore. They actually manufacture trains, and they sell the service of getting a passenger from here to there. But for them, edge is everything that happens on those trains.
And tracking, as a digital representation, the train, and then being able to drill down deeper and deeper. And, you know, one of the things that I understand is that one of the major delays for train service is doors opening and closing or being delayed, so maybe that comes down to a small part and the vibration of it and tracking that. So you've got to be able to track that appropriately. Now, on a train you might have a lot of extra space, so you could put in compute devices that have a lot of power.

>> What's interesting is you said the edge, in this context, is everything that happens on that train. In other words, it sounds like all the real-world outcomes that are enabled, perhaps optimized, by embedding the analytics in those physical devices or in that entire vehicle. That's one way you're describing the edge: not as a single device but as a complete assembly of devices that play together, amongst themselves and with the services in the cloud. Is that a logical sort of framework?

>> That's why I said I usually ask what do we mean by edge. If you've got millions, thousands, whatever, devices out there, sensors, whatever, feeding this data, collecting, processing, you know, there's some level of edge computing, gateways, processes that are going to happen.

>> Well, my question for ya, I'd like to get your thoughts. Again, we love the hype around IoT, we think it's completely legit and it's going to continue to be hyped, because it's obvious what you see with IoT standing on the edge. But a lot of customers we talk to are like, look, I've got a lot going on. I've got application development, I've got to break out my security, got to build that up. I've got data governance issues, and now you throw in IoT over the top. They're like, I'm choking on projects. So they come down to a selection criteria. How do they define a working IoT project?
And the trend that we're seeing is that it has to do with their industrial equipment or something related to their business. Call it industrial IoT, because if they have something in their business, say trains, as a critical part of what they do, it's easy to say let's justify this. Everything else then tends to go on the back burner if they don't have clear visibility into what they're instrumenting. That's kind of weird. Do you agree with that? Do you see a pattern as well in what customers are doing, saying I'm going to bring this project in and we're going to connect our IoT?

>> That's exactly what I see. Industrial internet of things is where I see the biggest value today, when you have trains or mining equipment or, you know, whatever.

>> John: Whatever your business runs.

>> Your manufacturing line, right, and being able to fine-tune those lines to either predict failures or maybe improve quality. Those are impactful and they can be done right now, today, and that's what we're seeing as kind of the big emerging thing. IoT's interesting to talk about, but the reality is it's really digital transformation that we're seeing. Companies transforming into new business models, doing things significantly differently to grow into the future. And IoT is an enabler of that. So you're not going to see IoT everywhere today.

>> The low-hanging fruit is where it gets to the real business.

>> Yeah, but it's going to go across all verticals, right, no doubt.

>> So what solutions does Pentaho have for digital twins, or managing digital twins, the objects, the data itself, within an IoT context? Is this something you're engaged in already?

>> So within Hitachi Vantara, the larger company, the bigger company, we have what we call our Lumada IoT Platform, and in that there is this asset avatar technology that does exactly what you're describing. Now I'm going to throw a quick plug out, if you don't mind. Pentaho World, in about a month.
>> John: theCUBE will be there.

>> theCUBE will be there, and we're excited to have theCUBE, and we're going to give you complete information about asset avatar with all the right people.

>> There's a movie in there somewhere, I can feel it, Avatar two. There's a lot of great representations of data. I want to get your thoughts on how the new firm's going to solve customer problems. Because now, as the customers see this new entity from you guys, Vantara's been doing real well, we covered the acquisition, and you were kind of left alone. Pentaho was integrating in, but it wasn't like a radical shift. Now there's some movement. What does it mean to the customer, what's the story to the customer?

>> You know, I think it's great news for the customer, because Pentaho's always been very customer focused. But when you look at Hitachi Vantara, there's the wealth of technology and expertise. All of the great IT-oriented stuff that Hitachi Data Systems has done and been well known for in the past still exists. But there's this broader focus of taking data and processing it in a variety of ways to solve real business problems. All the way to orchestrating machine learning and applying algorithms, and then with the Hitachi.

>> What specifically in Hitachi is coming into this? Because again, this is a focused solution company now with data, so Hitachi Data Centers,

>> Yeah, so Hitachi Data Systems, think of it as the infrastructure company. Hitachi Insight was really focused largely on the IoT platform development, with some Pentaho assets, and then the Pentaho business. But here's the thing about Hitachi: very large company, builds everything. Mining equipment and all kinds of stuff. So nobody understands how all those things fit together better, I believe, than Hitachi. But one of the things that we have at that organization is this idea of the Hitachi labs.
And data scientists that are really doing interesting things. Jim, you'd love to get more embedded into what some of those things are. And making that available to customers is a huge opportunity for customers to now be able to embrace a lot of the technologies we've been talking about. I said last year that this year was going to be the year of machine learning. And if you look through the expo hall, that's what everybody's talking about. Right, it's AI or machine learning.

>> I'm wondering if you're commercializing R&D that's coming straight out of Hitachi labs already, or whether the Vantara combination will enable that. In other words, more innovation straight out of the labs, into the commercial arena.

>> That's something that we are absolutely trying to do, right, because there are great things at these lab organizations, and at Hitachi they're big labs. They're really legit, I kind of joke about that. The kinds of stuff that they're able to bring about now, Pentaho is part of the engine to help actually commercialize those things.

>> Chuck, I know you're looking forward to Pentaho World. I'll give you the final word here in this segment on how you see the big data world evolving. Take your Pentaho hat off and put your industry guru hat on. What's happening? I mean, this AI watch, that's pretty obvious. Not a lot of blockchain discussion, which is going to completely open up some things. We're getting into the decentralized application market, which is going to complement the distributed nature of how we see data analytics flow, and certainly the immutability of it is interesting. But that's kind of down the road. But here you're starting to see the swim lanes in the industry. You've seen people who've been successful and the ones who have fallen by the wayside. But now the customers, they want real solutions.
They don't want more hype, they don't want another eighth year of hype. They want, OK, let's get into the real meat and potatoes of data impact to my organization, call it digital transformation. What's happening, what is going on in the landscape?

>> So, you know, I mentioned it before, and to me it's digital transformation, which is a big, huge thing. But that's what companies are interested in, that's what they're beginning to think about. If they're not thinking about those things, they're falling behind. Five, six, seven years ago we talked about the same exact thing with big data. It was like, big data is really, you know, a big opportunity, and they were like, well, I don't know. Those that didn't adopt it aren't necessarily in a position now to transform digitally and to do some of the things that they're going to need to do to evolve into new business opportunities.

>> And the big data examples of winners are the ones who actually made it valuable. Whether it's insight that converted to a new customer or changed an outcome in a positive way, they go, that wouldn't have been possible without data. The proof points kind of hit the table.

>> That's right. The other thing is, you know, who's going to win, who's going to lose. I think people that are implementing technology for technology's sake are going to lose. People that are focused on the outcomes are going to win. That's what it is. Technology enables all that, but you've really got to be focused on.

>> I want to get one more quick thing before we go. I know we're tight on time, but I want to get your thoughts on the open ecosystem. Open source is going to a whole other level. The projections are that code will be shipping at an exponential rate, there'll be a lot of onboarding of new stuff, so open obviously works, community models work, partnering is critical. So we're seeing good partnerships, not fake deals or optical deals or Barney deals, whatever you want to call it. But real partnerships.
You're starting to see technology partnerships. What's your view on that? How is the new Vantara going to go forward, are you going to continue to do partnerships, and what's the strategy?

>> Yeah, I think the opportunity with, one, Hitachi Vantara is we have a breadth that can touch many different aspects. So as Pentaho we had great partnerships, very meaningful, but it always comes down to what we're doing for the customer. How are we changing things for the customer? So I'm not a believer in those Barney kind of relationships. Those are nice, but let's talk about what we're doing for customers.

>> Yeah, real proof points.

>> You guys will continue to partner.

>> Yes, we will continue to do that.

>> Okay great, Chuck, thank you so much. CUBE coverage, live in New York City, in Manhattan. It's theCUBE with Big Data NYC, our fifth year doing our own event in conjunction with Strata Data. Now that's the new name of the show. It was Strata Hadoop, Hadoop World before that. But we're still theCUBE, covering eight years of the action here. Back with more after this short break.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Jim Kobielus | PERSON | 0.99+ |
Hitachi | ORGANIZATION | 0.99+ |
Jim | PERSON | 0.99+ |
Chuck Yarbough | PERSON | 0.99+ |
Europe | LOCATION | 0.99+ |
John | PERSON | 0.99+ |
England | LOCATION | 0.99+ |
Hitachi Data Systems | ORGANIZATION | 0.99+ |
Chuck | PERSON | 0.99+ |
Vantara | ORGANIZATION | 0.99+ |
2015 | DATE | 0.99+ |
Pentaho | ORGANIZATION | 0.99+ |
millions | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
95% | QUANTITY | 0.99+ |
New York City | LOCATION | 0.99+ |
John Furrier | PERSON | 0.99+ |
10x | QUANTITY | 0.99+ |
last year | DATE | 0.99+ |
eighth year | QUANTITY | 0.99+ |
three groups | QUANTITY | 0.99+ |
fifth year | QUANTITY | 0.99+ |
Hitachi Vantara | ORGANIZATION | 0.99+ |
last week | DATE | 0.99+ |
eight years | QUANTITY | 0.99+ |
SiliconANGLE Media | ORGANIZATION | 0.99+ |
yesterday | DATE | 0.99+ |
two years | QUANTITY | 0.99+ |
Hitachi Insight Group | ORGANIZATION | 0.99+ |
Big Data | ORGANIZATION | 0.99+ |
Insight Group | ORGANIZATION | 0.99+ |
this year | DATE | 0.99+ |
Midtown Manhattan | LOCATION | 0.98+ |
Strata Conference | EVENT | 0.98+ |
third organization | QUANTITY | 0.98+ |
theCUBE | ORGANIZATION | 0.98+ |
two years ago | DATE | 0.98+ |
Strata Hadoop | EVENT | 0.98+ |
Wikibon Big Data | ORGANIZATION | 0.98+ |
seven years ago | DATE | 0.97+ |
Hitachi Insight | ORGANIZATION | 0.97+ |
today | DATE | 0.97+ |
Strata Data | EVENT | 0.97+ |
Hadoop World | EVENT | 0.96+ |
one | QUANTITY | 0.96+ |
One way | QUANTITY | 0.96+ |
NYC | LOCATION | 0.96+ |
Pentaho Solutions | ORGANIZATION | 0.96+ |
thousands | QUANTITY | 0.95+ |
Hitachi Data Centers | ORGANIZATION | 0.95+ |