

David Hitz, NetApp | NetApp Insight 2018


 

(electronic music) >> Narrator: Live from Las Vegas it's theCUBE! Covering NetApp Insight 2018. Brought to you by NetApp. >> Welcome back to theCUBE's coverage of NetApp Insight 2018, Lisa Martin with Stu Miniman and guess who's here now, Dave Hitz, EVP and founder of NetApp, Dave, welcome back to theCUBE. >> Thank you and glad to be here. >> This is a big event, we were in the keynote this morning when we were walking out, standing room only, really strong messages delivered by George Kurian, who stopped by for the first time a couple hours ago. Great customer story, the futurist had a very interesting perspective, 26 years ago, can you envision? >> You know the futurist? >> Where you are? >> Never mind that, I have a very different perspective than him, I think we are entering the golden decade of artificial intelligence. It's smart enough to be super, super cool and it hasn't figured out how to kill us yet, decade. (laughing) >> Lisa: That's good. >> Enjoy your last 10 years. >> Oh no, that's it? >> I, no, no, you asked, you asked that I envision this 26 years ago, oh my god, no, I mean, you know, we were a little start-up and we had these spreadsheets that said we would grow to, you know it basically that, what the VCs told us, if we could get to 100 million in revenue we can go public, so, naturally our spreadsheets showed 200 million (laughs) in revenue, you know or five, six, somewhere in there, and it's like, we're so far beyond anything I imagined when we started, and we were doing technical nerdy products for little engineers and little work groups, you know and the idea that that part of the storage market would merge against the heavy duty, high-end enterprise storage market doing databases, and then that would end up colliding with the cloud market and helping, like no we didn't even imagine this stuff that's happening now, I mean it's so far beyond. >> Enabling DreamWorks to make movies, I mean-- >> I love that, you know they do showings, they do previews for their vendors and so I've gotten to take my 11-year-old daughter, she's 11 now, but to see, you know early viewing of some of these movies it's, it's just fun. >> So, Dave, it's always interesting in the industry, a lot of times you say like, okay, this architecture is long in the tooth, there's a new generation that does things better and everything like that. ONTAP, been around for a long time now... >> You know, so let me-- >> Seems like it's been reinvigorated with the cloud and everything like that, you know. >> Let me make a comment about that. >> Yeah. >> 'Cause people do this, oh, ONTAP is so old, isn't that the old generation? So let's talk about old. Mainframes are old, and AS400s are old, and Unix is old, and then there's Windows which is kind of younger, and ONTAP's younger than that, and then there's Windows NT, which was a rewrite of Windows, and Clustered ONTAP is younger than that, so like stop with the old, you know I mean iOS is after that, so okay fine we're older than iOS, but it's not ancient, and then we've revamped it again to go run in the cloud, I mean we first started doing ONTAP running in Azure, sorry I mean Amazon initially, we started that work in 2013 and shipped it in 2014, so like that was yet another refresh so. >> Well, but you bring up a point, you've, it is adjusted and moved, it wasn't something that's static. Can you speak a little bit, that cloud, the you know, the rewrite and focus around the cloud and what that meant internally, I know you've been reinvigorated. >> Ha! 
>> With everything that's happened for the last few years. >> You know, the cloud everybody's doing it now and everybody's trying to be cloud relevant, we were really struggling early on I will say you know 2013, 2014 we were really trying to get our heads around what to do and a lot of people were stepping back like, no, no, no, let's see if we can slow it down, and, I mean not just outside of NetApp but NetApp as well, and the guy that was the CEO at the time, Tom Georgens, and George Kurian was part of the staff then. We, I'm proud of what we did was we said, you know let's really lean in, it's either going to happen or it's not going to happen, probably not, based on what we do, and if it does happen we'll be way better off leaning into it early, learning how to make this stuff work, and that's, you know we shipped ONTAP in the cloud in 2014, and it sucked, I mean, and nobody else had anything like it, it was awesome, right, whenever you look at old tech, like, the first iPhone sucked too, but it was both great, but it needed so much more work, like the very first rev I remember a story, Joe CaraDonna, as a programmer, he's like, we tried to get our own IT organization to use it and they told us the security wasn't good enough, so we had to fix the security, like, I mean we've been through so much stuff, that's almost five years ago. We've been working on it, and so you do all of this work and then Cloud Volumes is a complete, have you guys had Anthony on? >> Both: Yes. >> Couple hours ago. >> I love how Anthony thinks, so, he's a cloudy guy right from the foundation, he joins the executive staff, whole new perspective on stuff, so Cloud ONTAP, like ONTAP's my baby and we put it in the cloud. I'm proud of that, like you have our forward leaning cloud and Anthony's like, you know, just so you know, that's not nearly good enough, like, that is a very old school infrastructural thing, probably storage infrastructure people will like that they can have their same old OS running in the cloud, but it's not what cloudy people want, cloudy people don't want to run a storage OS in the cloud, cloudy people just want to say, I'd like a volume, please. Here's your volume. Thank you. And by the way, it should be a RESTful API, like God, ONTAP was none of those things, and so if you look at the work we're doing now it's like, okay, here's a RESTful API, here's the JSON schema, send it to the Azure Resource Manager. Like, that's cloudy, and so, it was because, you know we did a good job engineering getting it in but we didn't, we didn't have that like the, what does cloud smell like? If you know what I mean, like, the right whiff of cloud. Anyway, so Anthony really brought that and I, and I just feel really good about where we are at now, because, it's like cloud developers develop this stuff for other cloud developers, it feels like that. >> Well in the last five years it sounds like tremendous amounts of transformation, reinvigoration, NetApp has some bold marketing messaging. We are the data authority, we help customers become data driven, you talk about these three business imperatives, customers have lots of choices that, you know public cloud, private cloud, hybrid, George talked about this morning in his keynote that hybrid and multi-cloud is now de facto. >> You know, someone asked me, I was giving a talk and they asked me, okay so much cloud, how long do you think till NetApp's not shipping hardware? 
And I was like, no, no, like we don't see that going away anytime soon, if anything we think our success in the cloud, 'cause customers want to do that, will help us gain share on-prem because customers also want to do that, right? George's picture shows, yes there is traditional on-prem IT, enterprise IT, there's private cloud people, HCI, converged, CI, and then there's public cloud. To me the interesting question is why do people do those different things, the number one driver for public cloud is innovation, like, if you just, like all the catchwords you can think of, if you want to start up a DevOps team to go program, I would like a new mobile phone app and I want it to take a picture of the person's face, oh look it's a woman, she looks happy, and then you want it to listen to her, to the voice, and like transcribe the voice and then do a sentiment analysis on the words, oh, she looked happy but it's snarky, and then you want to feed that into a neural net deep learning engine, and say, what should we try to sell her, like, I guarantee you, the team working on the public cloud will beat the on-prem team hands down every time. Right, I mean that's, so when you look at people and they go, we want all in on the cloud, or there's got to be 100% cloud. My question is what, what's your, like, don't start with that, what's your problem? If it's to drive innovation, that's the public cloud. For the private cloud, typically that's just all about speed. They're so uniform, regular, they're all the same, you have extra capacity, you know you got empty rack space for where the next one goes, someone says, I need some storage, and you say, hey, it's got a self-service, software-defined API, like, just do it yourself, and then in the enterprise space, the enterprise IT, Unix, Windows, client, server, like that zone, probably the bulk of your investment, right? That's where you've been spending the money historically. Probably still the bulk of most people's investment, but they want to modernize it, they don't want to get rid of it, they don't want to turn it off, it's working, but they'd like it to work better, so flash enable it, just get the performance issues out of the way. By the way, it shrinks your footprint in the data center, frees up space, and connect it to the cloud. Like not moving it, but just back it up or do DR, or like something cloudy, and so to me those three goals are tightly linked to the three styles of infrastructure. Notice I haven't talked about products yet? The conversations I like to have with customers these days are, help me understand what your business challenges are, you're trying to move faster, be more innovative, modernize the stuff you have. Okay, like what ratio, now let's talk about how we could do those things together with the Data Fabric and let you build the Data Fabric you need, I mean, our Data Fabric strategy is not to tell customers what to do, it's to help them build the Data Fabric they need for their needs based on, oh, we're all about innovation, all on the cloud, like okay fine. We can do that like, but let's talk about that or is it. Now I'm stuttering. >> You bring up a great point there, Dave. >> I'm excited about this stuff. >> It's really exciting 'cause you know I think back, you know, just a couple of years ago, if you go to the enterprise, oftentimes storage was the boat anchor to prevent me from moving forward. Now we know that data is absolutely going to be one of the drivers going forward, how do we help those people make that transition? 
How do you see NetApp driving that transition? >> So, boat anchor, that's an interesting word, because I think if you look at cloud compute, it's very easy to move compute into the cloud, right. >> Stu: Yes. >> The thing about compute is it just happens and then it's done, like you turn it on, you turn it off. You spin up the VM, you spin down the VM, it's easy. The reason data is a boat anchor is not because it's a boat anchor, it's because data is the hard part, like you fired up the compute in the cloud but usually you're computing some data, well, how did you get the data to the place where the compute is? And then when you're finished a lot of times you created some data, well, how do you keep track of the data you created in the cloud, and is it legal for it to stay in the cloud, and now you want to put the data in a different cloud or put the data in your own data center and like, who's watching all that data? It's not a boat anchor because data sucks, it's a boat anchor actually because it's the important thing you want to keep forever, right? I mean, maybe you do or maybe you want to delete it and know for sure it's gone. Like, those, compute doesn't have any of those issues. So, what's my point, whatever is hard, like if this was easy anybody could do it, right? Whatever is hard, you go hire lots and lots of smart people to work on hard problems and then customers are like, whoa, you're solving hard problems, I guess I will pay you after all. Isn't that what business is? >> So the majority of your conversations start with helping customers identify what they've got, where best to spread out their investments, it's not product based, it's about business outcomes. I'd love to get, kind of in the last few minutes here, your perspective on NetApp's own IT and digital, and cultural transformation, how does that help your legacy long time enterprise customers feel an even stronger trust with NetApp? >> I think prior to our cloud work customers for the most part, customers and potential customers, they knew us, you know, it was interesting even as we thought about marketing the new work that we are doing, one of the questions was like, how much should be about the cloud, how much should be about the old stuff, and we've really leaned in almost 100% on telling people our new cloud stories, they're both public and private. And our VP of marketing I think she had a really, Jean English, she had a really good perspective. She basically said look, we've been telling the on-prem storage iron story for 26 years and if there's a customer who's out there waiting to decide who to use, I don't think telling them that story again in year 27 is going to be the thing that makes the difference, like, they've decided they're happy with their Hitachi or their EMC, whatever it is, but, but they don't know that NetApp can help them in this brave new world. Right, they have no clue that ONTAP is also running on Amazon, I mean, it's like, seriously, I can run ONTAP on Amazon? Yeah like fire it up, it's five bucks an hour, or whatever the number is, it's like that's crazy, you know and so, so and then people go, well, we've had so many conversations where they're trying to get a cloud strategy together, and we talk about all these things and data movement and data management and cloud, and like just all of these tools and they're very excited about where they're trying to go and they said, you know, by the way, I do also have an on-prem storage need. 
Could you do me a quote for like what I need this week, and meanwhile let's do some planning about what I need next year, right, you've got both of them working together, and I think it's that combo that's important. >> Last question, how do you, if only you had more energy and excitement like legitimately about this, but how do you keep some of the NetApp folks that have been here for a long time? How have you helped reinvigorate them to, to really be able to digest the massive impact that you guys are being able to make across industries? >> One of the things I think helps, 'cause there is a... Let me back up a step, you know, Steve Jobs is such an awesome guy and also in his life he made so many mistakes, and one of the things he did when, when Apple was almost entirely floated on their Apple III business and, was that Apple III, Apple II? And he was doing the Mac, and basically his message to everybody else was, if you're not working on the Mac, you suck, except, by the way, that's the product that's floating the entire business and generating all the products, and I really was conscious of, like that's the wrong way to do it. And when I look in particular at what we're doing we've got new operating systems like E-Series and like SolidFire, the HCI is a whole new thing, and yet ONTAP is still shot through our entire product line. I mean, Cloud Volumes is the cool, hottest new thing. It's ONTAP under the covers, right, and you look at the HCI, it's got the SolidFire block storage built in there as a very scalable model, oh but if you'd like files guess what? We run ONTAP in a VM, it's HCI, it runs VMs, and so actually if you look at what's going on in there, the work that we've done going way back, and yes it's evolved, it's changed, but that same work is actually shot through as technology, no longer the front piece but it's shot through all of it as technology, so it is kind of a unifying characteristic. If you talk about that, I think it helps people get more comfortable both internally but, we have the same, you know, you asked how do you get employees comfortable, a lot of customers have the same problem, you know-- >> Lisa: Right. >> They've spent a lot of investment in learning ONTAP's foibles over the years and Cloud Volumes hides all of that. So, gee, maybe I don't like this, you know what, if you need all those features, Cloud ONTAP, you can run ONTAP, like some people do want to do that, so, I just feel like the fact that the pieces all fit together, work together, actually gets people comfortable with it. >> Excellent, well Dave thanks so much for stopping by. >> Thank you for having me. >> Thank you for sharing your energy, and your excitement, your passion and all this wisdom and looking at where you guys are 26 years later, we look forward to year 27. >> Great, thank you. >> We want to thank you for watching theCUBE, I'm Lisa Martin with Stu Miniman, we're at NetApp Insight 2018 in Vegas. Stick around, Stu and I will be right back with our next guest. (electronic music)
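The "I'd like a volume, please" contrast Hitz draws is easiest to picture as a single schema-driven REST call. The sketch below is purely illustrative: the endpoint, token, and field names are invented for this example and are not NetApp's or Azure's actual Cloud Volumes API; it only shows the general shape of the request-a-volume interaction he describes.

```python
# Hypothetical sketch only: a "give me a volume" REST request of the kind
# described above. The endpoint, fields, and credential are made up.
import requests

API = "https://example-cloud.invalid/api/v1"   # placeholder endpoint, not a real service
TOKEN = "example-token"                        # placeholder credential

volume_request = {
    "name": "app01-data",
    "sizeGiB": 500,
    "protocol": "nfs3",
    "performanceTier": "premium",
}

resp = requests.post(
    f"{API}/volumes",
    json=volume_request,                       # JSON body following a published schema
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json().get("volumeId"))             # the caller just gets a volume back
```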

Published Date : Oct 24 2018

SUMMARY :

Live from NetApp Insight 2018 in Las Vegas, Lisa Martin and Stu Miniman sit down with Dave Hitz, EVP and founder of NetApp. Hitz reflects on how far the company has come since its start as a small workgroup-storage startup 26 years ago, and pushes back on the idea that ONTAP is old technology, pointing to the rework that put it in the public cloud starting in 2013 and 2014 and to the Cloud Volumes effort that made it consumable through RESTful APIs. He maps the three styles of infrastructure, public cloud, private cloud, and traditional enterprise IT, to the goals of innovation, speed, and modernization, argues that data rather than compute is the hard part of any cloud strategy, and describes how leaning into the cloud story, with ONTAP still shot through the whole product line, helps both employees and long-time customers get comfortable with the transition.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
George | PERSON | 0.99+
George Kurian | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
Anthony | PERSON | 0.99+
David Hitz | PERSON | 0.99+
Dave Hitz | PERSON | 0.99+
Steve Jobs | PERSON | 0.99+
Dave | PERSON | 0.99+
2014 | DATE | 0.99+
200 million | QUANTITY | 0.99+
2013 | DATE | 0.99+
Stu Miniman | PERSON | 0.99+
100% | QUANTITY | 0.99+
Joe CaraDonna | PERSON | 0.99+
Lisa | PERSON | 0.99+
Amazon | ORGANIZATION | 0.99+
100 million | QUANTITY | 0.99+
Vegas | LOCATION | 0.99+
11 | QUANTITY | 0.99+
Jean English | PERSON | 0.99+
Apple | ORGANIZATION | 0.99+
26 years | QUANTITY | 0.99+
next year | DATE | 0.99+
iOS | TITLE | 0.99+
DreamWorks | ORGANIZATION | 0.99+
NetApp | ORGANIZATION | 0.99+
both | QUANTITY | 0.99+
Windows | TITLE | 0.99+
Mac | COMMERCIAL_ITEM | 0.99+
Stu | PERSON | 0.99+
iPhone | COMMERCIAL_ITEM | 0.99+
Unix | TITLE | 0.99+
one | QUANTITY | 0.99+
Tom Georgens | PERSON | 0.99+
11-year-old | QUANTITY | 0.99+
five | QUANTITY | 0.99+
Both | QUANTITY | 0.99+
ONTAP | TITLE | 0.99+
Windows NT | TITLE | 0.99+
six | QUANTITY | 0.98+
first | QUANTITY | 0.98+
Apple II | COMMERCIAL_ITEM | 0.98+
Apple III | COMMERCIAL_ITEM | 0.98+
theCUBE | ORGANIZATION | 0.98+
Azure | TITLE | 0.98+
three goals | QUANTITY | 0.98+
couple hours ago | DATE | 0.97+
26 years ago | DATE | 0.97+
NetApp | TITLE | 0.97+
26 years later | DATE | 0.97+
HCI | TITLE | 0.97+
three styles | QUANTITY | 0.97+
Cloud Volume | TITLE | 0.96+
this week | DATE | 0.96+
five bucks an hour | QUANTITY | 0.96+
Cloud ONTAP | TITLE | 0.95+
Hitatchi | ORGANIZATION | 0.95+
NetApp Insight 2018 | EVENT | 0.93+
first time | QUANTITY | 0.93+
One | QUANTITY | 0.93+
almost 100% | QUANTITY | 0.93+

Arik Pelkey, Pentaho - BigData SV 2017 - #BigDataSV - #theCUBE


 

>> Announcer: Live from Santa Fe, California, it's the Cube covering Big Data Silicon Valley 2017. >> Welcome back, everyone. We're here live in Silicon Valley in San Jose for Big Data SV in conjunction with Strata Hadoop, part two. Three days of coverage here in Silicon Valley and Big Data. It's our eighth year covering Hadoop and the Hadoop ecosystem. Now expanding beyond just Hadoop into AI, machine learning, IoT, cloud computing with all this compute is really making it happen. I'm John Furrier with my co-host George Gilbert. Our next guest is Arik Pelkey who is the senior director of product marketing at Pentaho that we've covered many times and covered their event at PentahoWorld. Thanks for joining us. >> Thank you for having me. >> So, in following you guys I see Pentaho was once an independent company bought by Hitachi, but still an independent group within Hitachi. >> That's right, very much so. >> Okay so you guys have some news. Let's just jump into the news. You guys announced some of the machine learning. >> Exactly, yeah. So, Arik Pelkey, Pentaho. We are a data integration and analytics software company. You mentioned you've been doing this for eight years. We have been at Big Data for the past eight years as well. In fact, we're one of the first vendors to support Hadoop back in the day, so we've been along for the journey ever since then. What we're announcing today is really exciting. It's a set of machine learning orchestration capabilities, which allows data scientists, data engineers, and data analysts to really streamline their data science processes. Everything from ingesting new data sources through data preparation, feature engineering, which is where a lot of data scientists spend their time, through tuning their models, which can still be programmed in R, in Weka, in Python, and any other kind of data science tool of choice. What we do is we help them deploy those models inside of Pentaho as a step inside of Pentaho, and then we help them update those models as time goes on. So, really what this is doing is it's streamlining. It's making them more productive so that they can focus their time on things like model building rather than data preparation and feature engineering. >> You know, it's interesting. The market is really active right now around machine learning and even just last week at Google Next, which is their cloud event, they had made the acquisition of Kaggle, which is kind of an open data science community. You mentioned the three categories: data engineer, data science, data analyst. Almost on a progression, super geek to business facing, and there's different approaches. One of the comments from the CEO of Kaggle on the acquisition, when we wrote it up at SiliconANGLE, was, and I found this fascinating, I want to get your commentary and reaction to it, he says the data science tools are as early as generations ago, meaning that all the advances in open source and tooling and software development are far along, but now data science is still at that early stage and is going to get better. So, what's your reaction to that, because this is really the demand we're seeing, there's a lot of heavy lifting going on in the data science world, yet there's a lot of runway of more stuff to do. What is that more stuff? >> Right. Yeah, we're seeing the same thing. Last week I was at the Gartner Data and Analytics conference, and that was kind of the take there from one of their lead machine learning analysts, was this is still really early days for data science software. 
>> So, there's a lot of Apache projects out there. There's a lot of other open source activity going on, but there are very few vendors that bring to the table an integrated kind of full platform approach to the data science workflow, and that's what we're bringing to market today. Let me be clear, we're not trying to replace R, or Python, or MLlib, because those are the tools of the data scientists. They're not going anywhere. They spent eight years in their PhD program working with these tools. We're not trying to change that. >> They're fluent with those tools. >> Very much so. They're also spending a lot of time doing feature engineering. Some research reports say between 70 and 80% of their time. What we bring to the table is a visual drag and drop environment to do feature engineering in a much faster, more efficient way than before. So, there's a lot of different kind of disparate, siloed applications out there that all do interesting things on their own, but what we're doing is we're trying to bring all of those together. >> And the trends are to reduce the time it takes to do stuff and take away some of those tasks that you can use machine learning for. What unique capabilities do you guys have? Talk about that for a minute, just what Pentaho is doing that's unique and added value to those guys. >> So, the big thing is I keep going back to the data preparation part. I mean, that's 80% of the time and that's still a really big challenge. There's other vendors out there that focus on just the data science kind of workflow, but where we're really unique is around being able to accommodate very complex data environments, and being able to onboard data. >> Give me an example of those environments. >> Geospatial data combined with data from your ERP or your CRM system and all kinds of different formats. So, there might be 15 different data formats that need to be blended together and standardized before any of that can really happen. That's the complexity in the data. So, Pentaho, very consistent with everything else that we do outside of machine learning, is all about helping our customers solve those very complex data challenges before doing any kind of machine learning. One example is a customer called Caterpillar Marine Asset Intelligence. They're doing predictive maintenance onboard container ships and on ferries. So, they're taking data from hundreds and hundreds of sensors onboard these ships, combining that kind of operational sensor data together with geospatial data, and then they're serving up predictive maintenance alerts if you will, or giving signals when it's time to replace an engine or replace a compressor or something like that. >> Versus waiting for it to break. >> Versus waiting for it to break, exactly. That's one of the real differentiators, that very complex data environment, and then I was starting to move toward the other differentiator which is our end to end platform which allows customers to deliver these analytics in an embedded fashion. So, kind of full circle, being able to send that signal, but not to an operational system, which is sometimes a challenge because you might have to rewrite the code. Deploying models is a really big challenge. Within Pentaho, because it is this fully integrated application, you can deploy the models within Pentaho and not have to jump out into a mainframe environment or something like that. 
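To make the predictive-maintenance example above concrete, here is a minimal, hypothetical sketch of the kind of blend-then-score step being described: sensor readings in one format joined with position data in another, then flagged by a stand-in maintenance rule. The column names, thresholds, and values are invented for illustration; this is not Pentaho or Caterpillar code.

```python
# Hypothetical illustration of blending multi-format data before any machine
# learning: shipboard sensor readings joined with position data, then scored
# by a simple stand-in rule that plays the role of a trained model.
import pandas as pd

# Engine sensor readings streamed off each vessel (one source format)
sensors = pd.DataFrame({
    "vessel_id": ["V1", "V1", "V2"],
    "ts": pd.to_datetime(["2017-03-01 10:00", "2017-03-01 11:00", "2017-03-01 10:00"]),
    "engine_temp_c": [88.0, 97.5, 84.0],
    "vibration_mm_s": [4.1, 7.8, 3.2],
})

# Geospatial position data from a separate system (another source format)
positions = pd.DataFrame({
    "vessel_id": ["V1", "V2"],
    "lat": [37.8, 36.6],
    "lon": [-122.4, -121.9],
})

# Blend the two sources on a shared key, as a data-preparation step would
blended = sensors.merge(positions, on="vessel_id", how="left")

# Stand-in for the deployed model: flag readings that look like impending failure
blended["maintenance_alert"] = (
    (blended["engine_temp_c"] > 95) | (blended["vibration_mm_s"] > 7)
)

print(blended[["vessel_id", "ts", "maintenance_alert"]])
```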
So, I'd say the differentiators are very complex data environments, and then this end to end approach where deploying models is much easier than ever before. >> Perhaps, let's talk about alternatives that customers might see. You have a tool suite, and others might have to put together a suite of tools. Maybe tell us about some of those; the geeky version would be the impedance mismatch. You know, like the chasms you'd find between each tool where you have to glue them together, so what are some of those pitfalls? >> One of the challenges is, you have these data scientists working in silos oftentimes. You have data analysts working in silos, you might have data engineers working in silos. One of the big pitfalls is not really collaborating enough to the point where they can do all of this together. So, that's a really big area where we see pitfalls. >> Is it binary, not collaborating, or is it that the round trip takes so long that the quality or number of collaborations is so drastically reduced that the output is of lower quality? >> I think it's probably a little bit of both. I think they want to collaborate but one person might sit in Dearborn, Michigan and the other person might sit in Silicon Valley, so there's just a location challenge as well. The other challenge is, some of the data analysts might sit in IT and some of the data scientists might sit in an analytics department somewhere, so it kind of cuts across both location and functional area too. >> So let me ask from the point of view of, you know we've been doing these shows for a number of years and most people have their first data lakes up and running and their first maybe one or two use cases in production, very sophisticated customers have done more, but what seems to be clear is the highest value coming from those projects isn't to put a BI tool in front of them so much as to do advanced analytics on that data, apply those analytics to inform a decision, whether a person or a machine. >> That's exactly right. >> So, how do you help customers over that hump and what are some other examples that you can share? >> Yeah, so speaking of transformative, I mean, that's what machine learning is all about. It helps companies transform their businesses. We like to talk about that at Pentaho. One customer kind of industry example that I'll share is a company called IMS. IMS is in the business of providing data and analytics to insurance companies so that the insurance companies can price insurance policies based on usage. So, it's a usage model. So, IMS has a technology platform where they put sensors in a car, and then using your mobile phone, can track your driving behavior. Then, your insurance premium that month reflects the driving behavior that you had during that month. In terms of transformative, this is completely upending the insurance industry, which has always had a very fixed approach to pricing risk. Now, they understand everything about your behavior. You know, are you turning too fast? Are you braking too fast, and they're taking it further than that too. They're able to now do kind of a retroactive look at an accident. So, after an accident, they can go back and kind of decompose what happened in the accident and determine whether it was your fault or was in fact the ice on the street. So, transformative? I mean, this is just changing things in a really big way. >> I want to get your thoughts on this. I'm just looking at some of the research. You know, we always have the good data but there's also other data out there. 
In your news, 92% of organizations plan to deploy more predictive analytics, however 50% of organizations have difficulty integrating predictive analytics into their information architecture, which is what the research shows. So my question to you is, there's a huge gap between the technology landscapes of front end BI tools and then complex data integration tools. That seems to be the sweet spot where the value's created. So, you have the demand and then front end BI's kind of sexy and cool. Wow, I could power my business, but the complexity is really hard in the backend. Who's accessing it? What are the data sources? What's the governance? All these things are complicated, so how do you guys reconcile the front end BI tools and the backend complexity integrations? >> Our story from the beginning has always been this one integrated platform, both for complex data integration challenges together with visualizations, and that's very similar to what this announcement is all about for the data science market. We're very much in line with that. >> So, is it the cart before the horse? Is it like the BI tools are really driven by the data? I mean, it makes sense that the data has to be key. Front end BI could be easy if you have one data set. >> It's funny you say that. I presented at the Gartner conference last week and my topic was, this just in: it's not about analytics. Kind of in jest, but it drew a really big crowd. So, it's about the data, right? It's about solving the data problem before you solve the analytics problem, whether it's a simple visualization or it's a complex fraud machine learning problem. It's about solving the data problem first. To that quote, I think one of the things that they were referencing was the challenging information architectures into which companies are trying to deploy models, and so part of that is when you build a machine learning model, you use R and Python and all these other ones we're familiar with. In order to deploy that into a mainframe environment, someone has to then recode it in C++ or COBOL or something else. That can take a really long time. With our integrated approach, once you've done the feature engineering and the data preparation using our drag and drop environment, what's really interesting is that you're like 90% of the way there in terms of making that model production ready. So, you don't have to go back and change all that code, it's already there because you used it in Pentaho. >> So obviously for those two technology groups I just mentioned, I think you had a good story there, but it creates problems. You've got product gaps, you've got organizational gaps, you have process gaps between the two. Are you guys going to solve that, or are you currently solving that today? There's a lot of little questions in there, but that seems to be the disconnect. You know, I can do this, I can do that, do I do them together? >> I mean, sticking to my story of one integrated approach to being able to do the entire data science workflow, from beginning to end, that's where we've really excelled. To the extent that more and more data engineers and data analysts and data scientists can get on this one platform, even if they're using R and Weka and Python. >> You guys want to close those gaps down, that's what you guys are doing, right? >> We want to make the process more collaborative and more efficient. >> So Dave Vellante has a question on CrowdChat for you. Dave Vellante was in the snowstorm in Boston. 
Dave, good to see you, hope you're doing well shoveling out the driveway. Thanks for coming in digitally. His question is, HDS has been known for mainframes and storage, but Hitachi is an industrial giant. How is Pentaho leveraging Hitachi's IoT chops? >> Great question, thanks for asking. Hitachi acquired Pentaho about two years ago, this is before my time. I've been with Pentaho for about ten months. One of the reasons that they acquired Pentaho is because of a platform that they've announced called Lumada, which is their IoT platform, so Pentaho is the analytics engine that drives that IoT platform Lumada. So, Lumada is about solving more of the hardware sensor side, bringing data from the edge into being able to do the analytics. So, it's an incredibly great partnership between Lumada and Pentaho. >> Makes an internal customer too. >> It's a 90 billion dollar conglomerate so yeah, the acquisition's been great and we're still very much an independent company going to market on our own, but we now have a much larger channel through Hitachi's reps around the world. >> You've got IoT use cases right there in front of you. >> Exactly. >> But you are leveraging it big time, that's what you're saying? >> Oh yeah, absolutely. We're a very big part of their IoT strategy. It's the analytics. Both of the examples that I shared with you are in fact IoT, not by design but it's because there's a lot of demand. >> You guys seeing a lot of IoT right now? >> Oh yeah. We're seeing a lot of companies coming to us who have just hired a director or vice president of IoT to go out and figure out the IoT strategy. A lot of these are manufacturing companies or coming from industries that are inefficient. >> Digitizing the business model. >> So to the other point about Hitachi that I'll make, as it relates to data science, as a 90 billion dollar manufacturing and otherwise giant, we have a very deep bench of PhD data scientists that we can go to when there's very complex data science problems to solve at customer sites. So, if a customer's struggling with some of the basics of how to get up and running doing machine learning, we can bring our bench of data scientists at Hitachi to bear in those engagements, and that's a really big differentiator for us. >> Just to be clear and one last point, you've talked about how you handle the entire life cycle of modeling from acquiring the data and prepping it all the way through to building a model, deploying it, and updating it, which is a continuous process. I think as we've talked about before, data scientists or just the DevOps community has had trouble operationalizing the end of the model life cycle where you deploy it and update it. Tell us how Pentaho helps with that. >> Yeah, it's a really big problem and it's a very simple solution inside of Pentaho. It's basically a step inside of Pentaho. So, in the case of fraud, let's say for example, a prediction might say fraud, not fraud, fraud, not fraud, whatever it is. We can then bring that kind of full lifecycle back into the data workflow at the beginning. It's a simple drag and drop step inside of Pentaho to say which were right and which were wrong and feed that back into the next prediction. We could also take it one step further where there has to be a manual part of this too, where it goes to the customer service center, they investigate and they say yes fraud, no fraud, and then that gets funneled back into the next prediction. 
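The update loop just described, where predictions later confirmed as fraud or not fraud are fed back in so the next round of predictions improves, can be pictured in a few lines of code. This is a stand-in sketch using scikit-learn's incremental learning rather than Pentaho's drag-and-drop step, and the transactions and labels are made up.

```python
# Hypothetical sketch of a model-update feedback loop (not Pentaho's mechanism):
# confirmed outcomes from investigators flow back into the model as an
# incremental update, so the next batch of predictions reflects them.
import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier(random_state=0)

# Initial batch of historical transactions: [amount, transactions_last_hour]
X_hist = np.array([[120.0, 1], [900.0, 6], [35.0, 1], [700.0, 5]])
y_hist = np.array([0, 1, 0, 1])                      # 1 = fraud, 0 = not fraud
model.partial_fit(X_hist, y_hist, classes=np.array([0, 1]))

# Score new transactions; some get routed to a human investigator
X_new = np.array([[650.0, 4], [20.0, 1]])
print("predicted:", model.predict(X_new))

# Investigators confirm the true outcomes, and those labels are fed back
# into the model as the next incremental update
y_confirmed = np.array([1, 0])
model.partial_fit(X_new, y_confirmed)
```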
So yeah, it's a big challenge and it's something that's relatively easy for us to do just as part of the data science workflow inside of Pentaho. >> Well Arik, thanks for coming on The Cube. We really appreciate it, good luck with the rest of the week here. >> Yeah, very exciting. Thank you for having me. >> You're watching The Cube here live in Silicon Valley covering Strata Hadoop, and of course our Big Data SV event, we also have a companion event called Big Data NYC. We program with O'Reilly Strata Hadoop, and of course have been covering Hadoop really since it was founded. This is The Cube, I'm John Furrier, with George Gilbert. We'll be back with more live coverage today for the next three days here inside The Cube after this short break.

Published Date : Mar 14 2017

SUMMARY :

John Furrier and George Gilbert talk with Arik Pelkey, senior director of product marketing at Pentaho, at Big Data SV 2017 in San Jose. Pelkey walks through Pentaho's newly announced machine learning orchestration capabilities, which streamline the data science workflow from data ingestion, preparation, and feature engineering through deploying and updating models built in R, Python, Weka, or MLlib, with deployment handled as a step inside Pentaho rather than a recoding exercise. He cites customers such as Caterpillar Marine Asset Intelligence, which blends shipboard sensor and geospatial data for predictive maintenance, and IMS, which enables usage-based insurance pricing, and explains how Pentaho serves as the analytics engine for Hitachi's Lumada IoT platform, backed by Hitachi's deep bench of PhD data scientists.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
George Gilbert | PERSON | 0.99+
Hitachi | ORGANIZATION | 0.99+
Dave Alonte | PERSON | 0.99+
Pentaho | ORGANIZATION | 0.99+
Dave | PERSON | 0.99+
90% | QUANTITY | 0.99+
Arik Pelkey | PERSON | 0.99+
Boston | LOCATION | 0.99+
Silicon Valley | LOCATION | 0.99+
Hitatchi | ORGANIZATION | 0.99+
John Furrier | PERSON | 0.99+
one | QUANTITY | 0.99+
50% | QUANTITY | 0.99+
eight years | QUANTITY | 0.99+
Arick | PERSON | 0.99+
One | QUANTITY | 0.99+
Lumata | ORGANIZATION | 0.99+
Last week | DATE | 0.99+
two technologies | QUANTITY | 0.99+
15 different data formats | QUANTITY | 0.99+
first | QUANTITY | 0.99+
92% | QUANTITY | 0.99+
One example | QUANTITY | 0.99+
Both | QUANTITY | 0.99+
Three days | QUANTITY | 0.99+
Python | TITLE | 0.99+
Kaggle | ORGANIZATION | 0.99+
one customer | QUANTITY | 0.99+
today | DATE | 0.99+
eighth year | QUANTITY | 0.99+
last week | DATE | 0.99+
Santa Fe, California | LOCATION | 0.99+
two | QUANTITY | 0.99+
each tool | QUANTITY | 0.99+
90 billion dollar | QUANTITY | 0.99+
80% | QUANTITY | 0.99+
Caterpillar | ORGANIZATION | 0.98+
both | QUANTITY | 0.98+
NYC | LOCATION | 0.98+
first data | QUANTITY | 0.98+
Pentaho | LOCATION | 0.98+
San Jose | LOCATION | 0.98+
The Cube | TITLE | 0.98+
Big Data SV | EVENT | 0.97+
COBOL | TITLE | 0.97+
70 | QUANTITY | 0.97+
C++ | TITLE | 0.97+
IMS | TITLE | 0.96+
MLlib | TITLE | 0.96+
one person | QUANTITY | 0.95+
R | TITLE | 0.95+
Big Data | EVENT | 0.95+
Gardener Data and Analytics | EVENT | 0.94+
Gardner | EVENT | 0.94+
Strata Hadoop | TITLE | 0.93+