David Hsieh, Qubole - DataWorks Summit 2017
>> Announcer: Live from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2017. Brought to you by Hortonworks.
>> Hey, welcome back to theCUBE. We are live on day one of the DataWorks Summit in the heart of Silicon Valley. I'm Lisa Martin with my co-host Peter Burris. We were just chatting with our next guest about the Warriors' win yesterday; we're also pretty excited about that. David Hsieh, the SVP of Marketing from Qubole. Hi, David.
>> David: Hey, thanks for having me.
>> Welcome to theCUBE, we're glad you still have a voice after no doubt cheering on the home team last night.
>> It was a close call 'cause I was yelling pretty loud yesterday.
>> So talk to us about you, the SVP of Marketing for Qubole. Big data platform in the cloud. You guys just had a big announcement a few weeks ago.
>> David: Right.
>> What are your thoughts, what's going on with Qubole? What's going on with big data? What are you seeing in the market?
>> So, you know, we're a cloud-native data platform, and when we talk to customers, they're really complaining about how they're just struggling with complexity and the barriers to entry, and, you know, they're really crying out for help. And the good news, I suppose, is we're in an industry that has a very high pace of innovation. That's great, right? Spark has had eight versions now in two years, but that pace of innovation is, you know, making the complexity even harder. I was watching Cloudera bragging about how their new product is a combination of 24 open source projects. You know, that's tough stuff, right? So if you're a practitioner trying to get big data operationalized in your company, and trying to scale the use of data and analytics across the company, the nature of open source is it's designed for flexibility. Right, the source code's public, you have all these options, configuration settings, et cetera.
But moving those into production and then scaling them in a reliable way is just crushing practitioners. And so data teams are suffering, and I think frankly it's bad for our industry, because, you know, Gartner's talking about an 80% failure rate of big data projects by 2018. Think about that: what industry can survive when 70 or 80% of the projects fail?
>> Well, let me push on that a little bit. Because I think the concern is not that 70 to 80% of the efforts to reach an answer in a complex big data thing are going to fail. We can probably accommodate that, but what we can't accommodate is failure in the underlying infrastructure.
>> David: Absolutely.
>> So the research we've done suggests something as well: that we are seeing an enormous amount of time spent on the underlying infrastructure. And there's a lot of failures there. People would say, I have a question, I want to know if there's an answer, and try to get to that answer, and not getting the answer they want,
>> David: Yep.
>> or getting a different answer. That kind of failure is still okay.
>> David: Right.
>> Because that's experience, you get more and more and more.
>> David: Absolutely.
>> So it's not the failure in the data science side or the application side.
>> Actually, I would say getting to an answer you don't like is a form of success. Like, you have an idea, you try it out, that's all great.
>> So what Gartner is really saying is it's failure in the implementation of the infrastructure.
>> That's exactly right.
>> So it's the administrative and operational sides.
>> Correct, it's a project that didn't deliver the end result. If the end result isn't what you hoped, great.
>> You couldn't even answer your question.
>> Exactly, couldn't even answer the question.
>> So let me test something on you, Dave, David.
We've been carrying a thesis at Wikibon for a while that it looks like open source is proving that it's very good at mimicking, and not quite as good at inventing.
>> David: Right.
>> So by that I mean if you drop an operating system in front of Linus Torvalds, he can look at that and say, I can do that.
>> David: Right.
>> And do a great job of it. If you put a development tool, same kind of thing. But big data is very complex, a lot of it, an enormous number of use cases.
>> David: Correct.
>> And open source has done a good job at a tool level, and it looks as though the tools are being built to make other tools more valuable,
>> David: Ha, right.
>> as opposed to making it easy for a business to operationalize data science and the use of big data in their business. Would you agree or disagree with that?
>> Yeah, I think that's sort of fundamentally the philosophy of open source. You know, I'm going to do my work, something I need for me, but I'm going to share it with everybody else. And they can contribute. But at the end of the day, you know, unlike commercial software, there's sort of no one throat to choke. Right, and there's nobody who is going to guarantee the interoperability and the success of the piece of software that you're trying to deploy.
>> There's not even a real coherent vision in many respects.
>> David: No, absolutely not.
>> What the final product's going to end up looking like.
>> So what you have is a lot of really great cutting-edge technology that a lot of really smart people sort of poured their hearts and souls into. But that's a little different than trying to get to an end result. And, you know, like it or not, commercial software packages are designed to deliver the result you pay for. Open source, being sort of philosophically very different, I think breeds, you know, inherent complexity. And that complexity right now is, I think, the root of the problem in our industry.
>> So give us an example, David. You know, you're a marketing guy, I'm a marketing gal.
>> Sure.
>> Give us an example of a customer, maybe one of your favorite examples. Where are you helping them? They're struggling here, they've made significant investments from an infrastructure perspective. They know there's value in the data,
>> David: Yup.
>> varying degrees, as we've talked about before. How does Qubole get in there and start helping this use case customer start to optimize, and really start making this big data project successful?
>> That's a great question. So there's really two things. Number one is that we are a SaaS-based platform in the cloud, and what we do basically is make big data into more of a turnkey service. So actually the other day, I was sort of surfing the internet, and we have a customer from Sonic Drive-In. You know, they do hamburgers and stuff.
>> Lisa: Oh yeah.
>> And they're doing a bunch of big data, and this guy was at a data science meetup, talking about it. We didn't put him up to this, he just volunteered. He was talking about how we made his life so much easier. Why? Because all of the configuration stuff, the settings, and, you know, how to manage costs, was basically filling out a form and setting policy and parameters. And not having to write scripts and figure out all these configuration settings: if I set this one this way and that one that way, what happens? You know, we have a sort of more curated environment that makes that easy. But the thing that I'm really excited about is we think this is the time to really look at having data platforms that can, you know, build or run autonomously. Today companies have to hire really expensive, really highly skilled, super smart data engineers and data ops people to run their infrastructure. And you know, if you look at studies, we're about 180,000 people short of the number of data engineers and data ops people this industry needs. So trying to scale by adding more smart people is super hard.
Right, but instead, if you could start to get machines to do what people are doing, just faster, cheaper, more reliably, then you can scale your data platform. So we basically made an announcement a couple weeks ago, kind of about the industry's first autonomous data platform. And what we're building are software agents that can take over certain types of data management tasks so that data engineers don't have to do them. Or don't have to be up at three in the morning making sure everything is going right.
>> And from a market segmentation perspective, where's your sweet spot for that? Enterprise, SMB, somewhere in the middle?
>> The bigger you have to scale. It's not about company size, it's really about sort of the scope and scale of your big data efforts. So you know, the more people you have using it, and the more data you have, the more you want automation to make things easier. It's sort of true of any industry, it's certainly going to be true of the big data industry.
>> Peter: Yeah, more complexity in the question set,
>> Correct.
>> The more complexity--
>> Or the more users you have, the more it gives. Adds more data sources.
>> Which presumably is going to be correlated.
>> Absolutely correct.
>> Which is, we can use a big data project to ascertain that.
>> Well, in fact that's sort of what we're doing. Because we're a SaaS platform, we take in the metadata from what our customers are doing. What users, what clusters, what queries, which tables, all that stuff. We basically use machine learning and artificial intelligence to analyze how you're using your data platform, and tell you what you could do better, or automate stuff that you don't have to do anymore.
>> So we've presumed that the industry at some point in time, the big data industry at some point in time, is going to start moving its attention to things like machine learning and A.I., you know, up into applications.
>> David: Yep.
>> Are we going to see the big data industry basically move pretty rapidly into more of a service or application conversation, or are we going to see a rebirth, as folks try to bring a more coherent approach to many of the existing tools that are here right now?
>> David: Right.
>> What do you think?
>> Well, I think we're going to see some degree of industry consolidation, and you're going to see vendors, and you're seeing it today, try to simplify and consolidate. Right, so some of that is moving up the stack towards applications, some of that is about repackaging their offerings and adding simplicity. It's about using artificial intelligence to make the operational platform itself easier. I think you'll see a variety of those things, because, you know, companies have too many places where they can stumble in their deployment. And, you know, the vendor community has to step in and simplify those things to basically gain greater adoption.
>> So if you think about it, I mean, I have my own idea, but what do you think is the metric that businesses should be using as they conceive of how to source different tools and invest in different tools, put things together? I think increasingly we're going to talk about time to value. What do you think?
>> I think time to value is one. I think another one you could look at is the number of people who have access to the data to create insights. Right, so, you know, if you can say 100% of my company has access to the data and analytics that they need to help their function run better, whatever it is, that's a pretty awesome accomplishment. And you know, there's a bunch of people who may or may not have 100%, but they're pretty close, right? And they've really become a data-driven enterprise. And then you have lots of companies that are sort of stuck with, okay, we have this use case running, thank goodness.
Took us two years and a couple million bucks, and now they're trying to figure out how to get to the next step. And so they have five users who are able to use their data platform successfully. That's, you know, I think that's a big measure of success.
>> So I want to talk quickly, if I may, about the cloud.
>> David: Yeah.
>> Because it's pretty clear that there are some very, very large shops
>> David: Yep.
>> that are starting to conceive of important parts of their overall approach to data
>> David: Right.
>> and putting things into the cloud. There's a lot of advantages of doing it that way. At the same time they're also thinking about, how am I going to integrate the models that I generate out of big data back into applications that might be running in a lot of different places?
>> Right.
>> That suggests there's going to be a new challenge on the horizon, of how do we think about end to end, bringing applications together with predictable data movement and control and other types of activities.
>> David: Yeah.
>> Do you agree that's on the horizon, of how we think about end-to-end performance across multiple different clouds?
>> I think that's coming. You know, I think I'm still surprised at how many people have not figured out that the economic and agility advantages of cloud are so great that you'd be honestly foolish not to, you know, consider cloud and have a proactive way to migrate there. And so there is just, you know, a shocking number of companies that are still plodding away, you know, and building their own on-prem infrastructures, et cetera. And they still have hesitancy and questions about the cloud. I do think that you're right, but I think what you're talking about is, you know, three to five years out for the mainstream in the industry. Certainly there are early adopters, you know, who have sort of gotten there. They're talking about that now.
But as sort of a mainstream phenomenon, I think that's a couple years out.
>> Excuse me, Peter, one of the things that just kind of made me think of was, you know, these companies, as you're saying, that still have hesitancy regarding cloud.
>> Right.
>> And kind of vendor lock-in popped into my head. And that kind of brought me back to one of the things that you were mentioning in the beginning. Open source, complexity there.
>> David: Yep.
>> Are you seeing, or are you helping companies to go back to more of that commercialized proprietary software? Are you seeing a shift in enterprises being less concerned about lock-in because they want simplicity?
>> You know, that's a great question. I think in the big data space it's hard to avoid, you know, sort of going down the open source path. I think what people are getting concerned about is getting locked into a single cloud vendor. So more and more of the conversations we have are about, what are your multi-cloud and eventually cross-cloud capabilities?
>> Peter: That's the question I just asked, right.
>> Exactly, so I think more and more of that's coming to the front. I was with a large, very large healthcare company a week ago, and I said, what's your cloud strategy? And they said, we have a no-vendor-left-behind policy. So, you know, we're standardized on Azure, we've got a bunch of pilots on AWS, and we're planning to move from a data warehousing vendor to Oracle in the cloud. Ha, so I think for large companies, a lot of them can't control the fact that different divisions, departments, whatever will use different clouds. So architecturally, they're going to have to start to think about using these multi-cloud, cross-cloud, you know, scenarios. And you know, most large companies, given a choice, will not bet the farm on a single cloud provider. And you know, we're great partners and we love Amazon, but every time they have, you know, an S3 outage like they had a few months ago.
You know, it really makes people think carefully about what their infrastructure is and how they're dealing with reliability.
>> Well, in fairness, they don't have that many,
>> They don't, it only takes one.
>> That's right, that's right, and there's reasons to suspect that there will be increased specialization of services in the cloud.
>> David: Correct.
>> So I mean, it's going to get more complex as we go as well.
>> David: Oh, absolutely correct.
>> Not less.
>> Well, David Hsieh, SVP of Marketing at Qubole. Thank you so much for joining,
>> Thank you.
>> and sharing your insights with Peter and myself. It's been very insightful.
>> Right.
>> So this is another great example of how we've been talking about the Warriors and food; Sonic was brought into play here.
>> David: Exactly, go Sonic.
>> Very exciting, you never know what's going to happen on theCUBE. So for David and Peter, I am Lisa Martin. You're watching day one of the DataWorks Summit, in the heart of Silicon Valley. But stick around, because we've got more great content coming your way.