Michael Weiss & Shere Saidon, NASDAQ | PentahoWorld 2017
>> Narrator: Live from Orlando, Florida, it's theCube covering PentahoWorld 2017 brought to you by Hitachi Vantara. >> Welcome back to theCube's live coverage of PentahoWorld brought to you by Hitachi Vantara. My name is Rebecca Knight, I'm your host along with my co-host, Dave Vellante. We're joined by Michael Weiss, he is the senior manager at NASDAQ, and Shere Saidon, who is analytics manager at NASDAQ. Thanks so much for coming back to theCube, I should say, you're Cube veterans now. >> We are, at least I am. This is his first year, this is his first time at PentahoWorld. So, excited to bring him along. >> Okay so you're a newbie but you're a veteran so. (laughing) >> Great. So, tell us a little bit about what has changed since the last time you came on, which was 2015, back then? >> So the biggest thing that's happened in the past 18 months is we've launched seven new exchanges. Integrated seven new exchanges. We bought the ISE, the International Securities Exchange, which is three options markets. We just completed that integration in August. We've also bought the Canadian, CHI-X, the Canadian Exchange, which also had three equities markets, so we integrated them, and we went live with a dark pool offering for Goldman back in June. So now we operate a dark pool for Goldman Sachs, and we're looking to kind of expand that offering at this point. >> So you're just getting bigger and bigger. So tell our viewers a little bit how Pentaho fits into this. >> So Pentaho is the engine that kind of does all our analytics behind the scenes at post trade, right. So we do a lot of traditional ETL, where we're doing batch processing. In the back-end we're doing a little bit more with the Hadoop ecosystem leveraging things like EMR, Spark, Presto, that type of stuff, and Pentaho kind of helps blend that stuff together a little bit. We use it for reporting, we do some of the BA, we're actually now looking to have the data Pentaho generates plug into Tableau a little bit.
So, we're looking to expand it and really leverage that data in other ways at this point. Even doing some things more externally, doing more data offerings via Pentaho externally. >> So I got to do a NASDAQ 101 for my 13 year-old. Came up to me the other day and said, "Daddy, what's the NASDAQ index and how does it work?" Well, give us a 20 second answer. >> Michael: On the NASDAQ index? >> Yeah, what's the NASDAQ Index and how does it work? >> Probably the wrong person to answer that one but, the index is generally just a blend of various stocks. So the S&P 500 is a blend of different stocks. Much like that, the Qs are NASDAQ's equivalent of the S&P, right, so, we use a different algorithm to determine the companies that make up that blend, but it's an index just like the S&P. >> They're weighted by market cap- >> Michael: Right, yeah. >> And that determines the number at the end- >> Michael: Correct. >> And it goes up and down based on what the stock's index. >> Right, and that's how most people know NASDAQ, right. They see the S&P went up by 5 points, the Dow went down by 3 and the NASDAQ went up by a point, right. But most people don't realize that NASDAQ also operates 27 exchanges worldwide, I think it is now. So, probably a little bit more, maybe closer to 32, but... >> So you mentioned that you're doing a dark pool for Goldman. >> Michael: Yes. >> So that's interesting. We were talking off camera about HFT and kind of the old days, and dark pools were criticized at the time. Now Goldman was one of the ones shown to be honest and above board, but what does that mean, the dark pool, for your business and how does that all tie in? >> Michael: So, dark pools are isolated markets, right, so they don't necessarily interact with the NASDAQ exchange themselves, it's all done within the pool. You interact with only people trading on that pool.
What NASDAQ has done is we took our technology and we now host it for Goldman so, we have INET, our trading system, so we gave them INET, we built all the surrounding solutions, how you manage symbols, how you manage membership. Even the data, we curate their data in AWS. We do some Pentaho transformations for them. We do some analytics for them. And that's actually going to start expanding, but yeah, we've provided them an entire solution, so now they don't have to manage their own dark pool. And now we're going to look to expand that to other potential clients. >> Dave: So that's NASDAQ as a technology >> Yes. >> Dave: Provider. Very interesting. So I was saying, earlier, the Hong Kong Stock Exchange is basically closing the facility where they house humans, again another example of machines replacing humans. So they're joining, well NASDAQ, kind of, but NYSE, London Stock Exchange, Singapore, now Hong Kong... Essentially, electronic trading. So, brings us to the sort of technology underpinnings of NASDAQ. Shere, maybe you can talk a little bit about your role, and paint a picture of the technology infrastructure. >> Yeah so I focus primarily on the financial side of corporate finance. So we leverage Pentaho to do a lot of data integration, allow us to really answer our business questions. So, previously it would take days to put basic reporting together, now you've got it all automated, or we're working towards getting it mostly automated, and it just answers the questions that we need. And no longer use our gut to drive decisions, we're using hard data. And so that's helped us instrumentally in a lot of different places. >> Dave: So, talk more about the data pipeline, where the data's coming from, how you're blending it, and how you're bringing it through the pipeline and operationalizing it. >> Yeah, so we've got a lot of different billing systems, so we integrate companies, and historically we've let them keep their billing systems.
So just kind of bring it all together into our core ERP, seeing how quantities...and just getting the data, and just figuring out on the basic side, how much do we make from a certain customer? What are we making from them? What happens in different scenarios if they consolidate, or if they default? And some of the pipeline there is just blending it all together, normalizing the data, making sure it's all in the same format, and then putting it in a format where our executives or business managers can actually make decisions off of it. >> Well you're talking about the decision making process, and you said it's no longer gut, you're using data to drive your decisions, to know which direction is the right direction. How big a change is that, just culturally speaking? How has that changed? >> Yeah, it's huge, at least on our side, it's making us a lot more confident in the decisions we're making. We're no longer going in saying, hey this is probably how we should do it. No, the numbers are showing us that this is going to pay off, and we stick to it and look at the hard facts, rather than what do we think is going to happen? >> So, talk a little bit about what you guys are seeing here, and you're doing a lot of speaking here, we were joking earlier, you're kind of losing your voice. You're telling your story, what kind of reactions are you getting? Share with us the behind the scenes at the conference. >> I think at this conference you're seeing a lot of people kind of fall in line with similar ideas that we're trying to get to. Taking advantage more, instead of your traditional MPPs, or your traditional relational databases, moving more towards this Hadoop ecosystem. Leveraging Spark, Presto, Flume, all these various new technologies that have emerged over the past two to five years, and are now more viable than ever.
They're easier to scale, if you look at your traditional MPPs, like we're a big Redshift user, but every time you scale it there's a cost with that, and we don't necessarily need to maintain all that data all the time, so something in the Hadoop ecosystem now lets us maintain that data without all the unnecessary cost. I see a lot more of that than I did two years ago, a lot more people are following that trend. I think the other interesting trend I've seen this week is this idea of becoming more cloud agnostic. Where do you operate, and how do you store your data should be irrelevant to the data processing, and I think it's going to be a tough nut to crack for Pentaho, or any vendor. But if you can figure out a way to either do some type of cloud parity, where you have support across all your services, but you don't have to know which service you deploy to when you design your pipelines, I think that's going to be huge. I think we're a little ways from that, but that's been a common theme this week as well, both private and your big three cloud providers right now, your Googles, your Azures, and your AWS. >> So when I asked, you said cloud agnostic, that's great, good vision and aspiration. The follow up would be, am I correct that you don't see it as data location agnostic, right, you want to bring the cloud model to your data, versus try to force your data into a cloud? Or not necessarily? >> A lot of it I think is being driven by not wanting to be vendor locked in, so they want to have the ability to, and I think this is easier said than done, the ability to move your data to different cloud providers based on pricing or offerings, right, and right now going from AWS to Google to Azure would be a very painful process. So you move petabytes of data across, it's not cost efficient and all the savings you want to realize by moving to maybe a Google in the future, are not going to be realized 'cause of all the effort it's going to take to get there.
>> Dave: We had CERN on earlier, and they were working on that problem... >> Yeah, it's not a trivial problem to solve, but if you can crack that, and you can then say hey I wanna...even if I have a service offering, like our operating a dark pool for Goldman. We also have a market tech side, where we sell our trading platform and various solutions to other exchanges worldwide. If we can come up with a way to be able to deploy to any cloud provider, even on an on-prem cloud, without having to do a bunch of customizations each time, that would be huge, it would revolutionize what we do. We're, as our own company, starting to look at that, and talking with Pentaho, they're also... are going to eye that as a potential way to go, with abstractions and things like that, but it's going to take some time. >> Were you guys here yesterday for the keynotes? >> Michael: Saw some of the keynotes, yes. >> The big messaging, like every conference that you go to, is be the disruptor, or you're going to get disrupted. We talked earlier off camera... Trading volumes are down, so the way you traditionally did business is changing, and made money is changing. >> Michael: Right. >> We talked earlier about you guys becoming a technology provider, I wonder if you could help us understand that a little bit, from the standpoint of NASDAQ strategy, when we hear your CEOs talk, real visionary, technology driven transformations. >> Yeah, I think Adena's coming in is definitely looking at that as a trend, right? Trading volumes are down, they've been going down, they've kind of stabilized a little bit, and we're still able to make money in that space, but the problem is there's not a ton of growth. We acquire the ISE, we acquire the CHI-X, we're buying market share at that point. So you increase revenue, but you also increase overhead in that way.
And you can only do so many major acquisitions at a time, you can only do how many one billion dollar acquisitions a year before you have to call it a day. And we can look at more strategic, smaller acquisitions for exchanges, but that doesn't necessarily bring you the transformation, the net revenue you're looking for. So what Adena has started to look at is, how do we transform to more of a technology company? We're really good at operating exchanges, how do we take that, and we already have market tech doing it, but how do we make that more scalable, not just to the financial sector, but to your other exchanges, your Ubers or your StubHubs of the world? How do you become a service provider, or a platform as a service for these other companies, to come in and use your tech? So we're looking at how do we rewrite our entire platform, from trading to the back-end, to do things like: Can we deploy to any cloud provider? Can we deploy on-prem? Can we be a little bit more technology agnostic so to speak, and offer these as services, and offer a bunch of microservices, so that if a startup comes up and wants to set up an exchange, they can do it, they can leverage our services, then build whatever other applications they want on top of it. I think that's a transformation we need to go through, I think it's good vision, and I'm looking forward to executing it. It's going to be a couple years before we see the fruits of that labor, but Adena's really doing a great job of coming in, and really driving that innovation, and Brad Peterson as well, our CIO, has really been pushing this vision, and I think it's really going to work out for us, assuming we can execute. >> Well you know what's interesting about that, if I may, is financial services is usually so secretive about their technology, right? But your business, you guys are becoming a technology provider, so you got to face the world and start marketing your capabilities now, and opening up about that.
It's sort of an interesting change. >> I think you'll see that starting to become more of a thing over the next year or two, as we start actually looking to build out the platform and figure it out. We do market on the market tech side, I mean it's not a small business, but we're more strategic about who we market to, 'cause we're still targeting your financial exchanges, more internationally than in the U.S., but there's only so many of them, again you have to start looking at rebranding, rebuilding, and rethinking how we think about exchanges in general, and not thinking of them as just a financial thing. >> Well that's what I wanted to get into, because you're talking about this rebranding, and this rebuilding, this transformation, against the backdrop of an industry that is changing rapidly, and we have sort of the threat of legislative reform, perhaps some administrative reforms coming down all the time, so how do you manage that? I mean, those are a lot of pressures there, are you constantly trying to push the envelope right up until any changes take place? Or what would you say Shere and Michael? >> Probably again not the right person to ask about this, but we're definitely trying to stay on top of the cutting edge in innovation and the technologies out there, whether it be Blockchain, or different types of technologies. I mean we're definitely trying to make sure we're investing in them, while maintaining our core businesses. >> Right, it's trying to find that balance right now of when to make the next step in the technology food chain, and when to balance that with regulatory obligations. And if you look at it, going back to the idea of being able to launch marketplaces, I think what you're ending up seeing over the coming years is your Ubers, your StubHubs, I think they're going to become more regulated at some level.
And we're good at operating more regulated markets, so I think that's where we can kind of come in and play a role, and help wade through those regulations a little bit more, and help build software to adhere to those regulations. >> Since you brought up Blockchain, Jamie Dimon craps all over Blockchain, or you know, Bitcoin, and then clarifies his remarks, saying look, technology underneath is here to stay. Thoughts on Blockchain? Obviously Financial Services is looking at it very closely, doing some really advanced stuff, what can you tell us? >> Yeah, I think there's no argument that it's definitely an innovation and a disruptive technology. I think that it's definitely in its early stages across the board, so we're investing in it where we can, and trying to keep a close eye on it. We think that there's a lot of potential in a lot of different applications. >> As the NASDAQ transforms its business, how does that affect the sort of back-end analytics activity and infrastructure? >> The data is just growing, that's like the biggest challenge we have now. Data that used to be done in Excel, it's just no longer an option, so now in order to get the insights that we used to get just from having a couple people doing Excel transformations, you need to now invest in the infrastructure in the back-end, and so there's a lot that needs to go into building out an infrastructure to be able to ingest the data, and then also having the UI on the front-end, so that the business can actually view it the way they want. >> So skills wise, how's that affecting who you guys are hiring and training? And how's that transformation going? >> Michael: I'll let you go first. >> I think there's definitely, data analytics is a hot field. It's very new, there's definitely a big skills gap in administrative work and in the analytics side.
Usually you'd have people who could perform analytical functions just by being administrative or operational, and now it's really, we're investing in analysts, and making sure that we have the right people in place to be able to do these transformations, or pull the data and get the answers that we need from them. >> I mean from the tech side, I think what you're seeing is where we traditionally would just plug a developer in there, whether a Java developer, or an ETL developer, I think what you're seeing now is we're looking to bring more of a business minded data analyst to the tech side, right? So we're looking to bring a data engineer, so to speak, more to the tech side. So we're not looking to hire a traditional four year Computer Science degree, or Software Engineering degree, you're looking for a different breed of person, 'cause quite honestly your traditional Java dev or C++ developer, they're not skilled or geared towards data. And when we've tried to plug that paradigm in, it just doesn't really work, so we're looking now to hiring more of an analyst, but someone who's a little bit more techie as well. They still need to have those skills to do some level of coding, and what we are finding is that skill gap is still very much... There's a gap there. There's a huge gap. And I think it's closing, but- >> And as you have to fund those for the new areas, I presume, like many companies in your business, you're trying to move away from the sort of undifferentiated low-level infrastructure deployment hassles, and the IT labor costs there, especially as we move to the cloud, presumably, so is that shift palpable? I mean, can you see that going on? >> Yeah, I think we made a lot of progress over the past couple years in doing that. We do more one button deployments, where the operation cost is a lot lower, a lot more automation around alerting, around when things go wrong, so there's not necessarily a human being sitting there watching a computer.
We've invested a lot in that area to kind of reduce the costs, and make the experience better for our end user. And even from a development side, the cost of a new application is a lot less every time you have to do a release. The question is, how do you balance that with the regulations, and make sure you still have a good process in place. The idea of putting single button deployments in place is a great one, but you still have to balance that with making sure that what you push to production's been tested, well defined, and it meets the need, and you're not just arbitrarily throwing things out there. So we're still trying to hit that balance a little bit, it's more on the back-end side. The trading system is not quite there for obvious reasons, we're way more protective of what goes out there than what's surrounding it a lot of the time, but I can see a future where, again going back to this idea of transforming our business, where you can stand up and do an exchange with the click of a button. I think that's a trend we're looking at. >> Rebecca: It's not too far in the future. >> No, I don't think it is. >> Last question, Pentaho report card. What are they doing really well? What do you want to see them do better? >> I think they continue to focus in the right areas, focusing more on the data processing side, and with the big data technologies, trying to fill that gap in the big data, and be the layer that you don't have to tie yourself to, like Cloudera or MapR, you can kind of be a little bit more plug and play. I think they still need to do some improvements on their visualizations in their front-ends. I think they've been so much more focused on the data processing, that part of it, that the visualization's kind of lagged behind, so I think they need to put a little more focus into that, but all in all, they're an A, and we've been extremely happy with them as a software provider. >> Great.
>> Shere: I think the visualization part is the part that allows people to understand the value being created at Pentaho. So I think being able to maybe improve a little bit on the visualization could go a long way. >> Michael, Shere, it's been so much fun having you on theCube, and having this conversation, keep that bull market coming please, do whatever you can. >> We'll do our best. >> I'm Rebecca Knight. We are here at PentahoWorld, sponsored by Hitachi Vantara. For Dave Vellante, we will have more from theCube in just a little bit.
Joel Horwitz, IBM & David Richards, WANdisco - Hadoop Summit 2016 San Jose - #theCUBE
>> Narrator: From San Jose, California, in the heart of Silicon Valley, it's theCUBE. Covering Hadoop Summit 2016. Brought to you by Hortonworks. Here's your host, John Furrier. >> Welcome back everyone. We are here live in Silicon Valley at Hadoop Summit 2016, actually San Jose. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. Our next guest, David Richards, CEO of WANdisco. And Joel Horwitz, strategy and business development, IBM Analytics. Guys, welcome back to theCUBE. Good to see you guys. >> Thank you for having us. >> It's great to be here, John. >> Give us the update on WANdisco. What's the relationship with IBM and WANdisco? 'Cause, you know. I can just almost see it, but I'm not going to predict. Just tell us. >> Okay, so, I think the last time we were on theCUBE, I was sitting with Re-ti-co who works very closely with Joel. And we began to talk about how our partnership was evolving. And of course, we were negotiating an OEM deal back then, so we really couldn't talk about it very much. But this week, I'm delighted to say that we announced, I think it's called IBM Big Replicate? >> Joel: Big Replicate, yeah. We have a big everything and Replicate's the latest addition. >> So it's going really well. It's OEM'd into IBM's analytics, big data products, and cloud products. >> Yeah, I'm smiling and smirking because we've had so many conversations, David, on theCUBE with you on and following your business through the bumpy road or the wild seas of big data. And it's been a really interesting tossing and turning of the industry. I mean, Joel, we've talked about it too. The innovation around Hadoop and then the massive slowdown and realization that cloud is now on top of it. The consumerization of the enterprise created a little shift in the value proposition, and then a massive rush to build enterprise grade, right? And you guys had that enterprise grade piece of it. IBM, certainly you're enterprise grade.
You have enterprise everywhere. But the ecosystem had to evolve really fast. What happened? Share with the audience this shift. >> So, it's classic product adoption lifecycle and the buying audience has changed over that time continuum. In the very early days when we first started talking more at these events, when we were talking about Hadoop, we all really cared about whether it was Pig and Hive. >> You once had a distribution. That's a throwback. Today's Thursday, we'll do that tomorrow. >> And the buying audience has changed, and consequently, the companies involved in the ecosystem have changed. So where we once used to really care about all of those different components, we don't really care about the machinations below the application layer anymore. Some people do, yes, but by and large, we don't. And that's why cloud for example is so successful because you press a button, and it's there. And that, I think, is where the market is going to very, very quickly. So, it makes perfect sense for a company like WANdisco who've got 20, 30, 40, 50 sales people to move to a company like IBM that have 4 or 5,000 people selling our analytics products. >> Yeah, and so this is an OEM deal. Let's just get that news on the table. So, you're an OEM. IBM's going to OEM their product and brand it IBM, Big Replication? >> Yeah, it's part of our Big Insights Portfolio. We've done a great job at growing this product line over the last few years, with last year talking about how we decoupled all the value-adds from the core distribution. So I'm happy to say that we're both part of the ODPI. It's an ODPI-certified distribution. That is Hadoop that we offer today for free. But then we've been adding not just in terms of the data management capabilities, but the partnership here that we're announcing with WANdisco and how we branded it as Big Replicate is squarely aimed at the data management market today. But where we're headed, as David points out, is really much bigger, right?
We're talking about support for not only distributed storage and data, but we're also talking about a hybrid offering that will get you to the cloud faster. So not only does Big Replicate work with HDFS, it also works with the Swift object store, which as you know, is kind of the underlying storage for our cloud offering. So what we're hoping to see from this great partnership is as you see around you, Hadoop is a great market. But there's a lot more here when you talk about managing data that you need to consider. And I think hybrid is becoming a lot larger of a story than simply distributing your processing and your storage. It's becoming a lot more about okay, how do you offset different regions? How do you think through that there are multiple, I think there's this idea that there's one Hadoop cluster in an enterprise. I think that's factually wrong. I think what we're observing is that there's actually people who are spinning up, you know, multiple Hadoop distributions at the line of business for maybe a campaign or for maybe doing fraud detection, or maybe doing log file, whatever. And managing all those clusters, and they'll have Cloudera. They'll have Hortonworks. They'll have IBM. They'll have all of these different distributions that they're having to deal with. And what we're offering is sanity. It's like give me sanity for how I can actually replicate that data. >> I love the name Big Replicate, fantastic. Big Insights, Big Replicate. And so go to market, you guys are going to have bigger sales force. It's a nice pop for you guys. I mean, it's good deal. >> We were just talking before we came on air about sort of a deal flow coming through. It's coming through, this potential deal flow coming through, which has been off the charts. I mean, obviously when you turn on the tap, and then suddenly you enable thousands and thousands of sales people to start selling your products. I mean, IBM, are doing a great job.
And I think IBM are in a unique position where they own both cloud and on-prem. There are very few companies that own both the on-prem-- >> They're going to need to have that connection for the companies that are going hybrid. So hybrid cloud becomes interesting right now. >> Well, actually, it's, there's a theory that says okay, so, and we were just discussing this, the value of data lies in analytics, not in the data itself. It lies in your being able to pull out information from that data. Most CIOs-- >> If you can get the data. >> If you can get the data. Let's assume that you've got the data. So then it becomes a question of, >> That's a big assumption. Yes, it is. (laughs) I just had Nancy Handling on about metadata. No, that's an issue. People have data they store they can't do anything with it. >> Exactly. And that's part of the problem because what you actually have to have is CPU slash processing power for an unknown amount of data at any one moment in time. Now, that sounds like an elastic use case, and you can't do elastic on-prem. You can only do elastic in cloud. That means that virtually every distribution will have to be a hybrid distribution. IBM realized this years ago and began to build this hybrid infrastructure. We're going to help them to move data, completely consistent data, between on-prem and cloud, so when you query things in the cloud, it's exactly the same results and the correct results you get. >> And also the stability too on that. There's so many potential, as we've discussed in the past, that sounds simple and logical. To do an enterprise grade is pretty complex. And so it just gives a nice, stable enterprise grade component. >> I mean, the volumes of data that we're talking about here are just off the charts. >> Give me a use case of a customer that you guys are working with, or has there been any go-to-market activity or an ideal scenario that you guys see as a use case for this partnership?
>> We're already seeing a whole bunch of things come through. >> What's the number one pattern that bubbles up to the top? Use case-wise. >> As Joel pointed out, he doesn't believe that any one company just has one version of Hadoop behind their firewall. They have multiple vendors. >> 100% agree with that. >> So how do you create one, single cluster from all of those? >> John: That's one problem you solved. >> That's of course a very large problem. Second problem that we're seeing in spades is I have to move data to cloud to run analytics applications against it. That's huge. That requires completely guaranteed consistent data between on-prem and cloud. And I think those two use cases alone account for pretty much every single company. >> I think there's even a third here. I think the third is actually, I think frankly there's a lot of inefficiencies in managing just HDFS and how many times you have to actually copy data. If I looked across, I think the standard right now is having like three copies. And actually, working with Big Replicate and WANdisco, you can actually have more assurances and actually have to make fewer copies across the cluster and actually across multiple clusters. If you think about that, you have three copies of the data sitting in this cluster. Likely, analysts have dragged a bunch of the same data into other clusters, so that's another multiple of three. So there's an amount of waste in terms of the same data living across your enterprise. I think there's a huge cost-savings component to this as well. >> Does this involve anything with Project Atlas at all? You guys are working with, >> Not yet, no. >> That project? It's interesting. We're seeing a lot of opening up the data, but all they're doing is creating versions of it. And so then it becomes version control of the data. You see a master or a centralization of data? Actually, not centralize, pull all the data in one spot, but why replicate it? Do you see that going on? 
I guess I'm not following the trend here. I can't see the mega trend going on. >> It's cloud. >> What's the big trend? >> The big trend is I need an elastic infrastructure. I can't build an elastic infrastructure on-premise. It doesn't make economic sense to build massive redundancy maybe three or four times the infrastructure I need on premise when I'm only going to use it maybe 10, 20% of the time. So the mega trend is cloud provides me with a completely economic, elastic infrastructure. In order to take advantage of that, I have to be able to move data, transactional data, data that changes all the time, into that cloud infrastructure and query it. That's the mega trend. It's as simple as that. >> So moving data around at the right time? >> And that's transaction. Anybody can say okay, press pause. Move the data, press play. >> So if I understand this correctly, and just, sorry, I'm a little slow. End of the day today. So instead of staging the data, you're moving data via the analytics engines. Is that what you're getting at? >> You use data that's being transformed. >> I think you're accessing data differently. I think today with Hadoop, you're accessing it maybe through like Flume or through Oozie, where you're building all these data pipelines that you have to manage. And I think that's obnoxious. I think really what you want is to use something like Apache Spark. Obviously, we've made a large investment in that earlier, actually, last year. To me, what I think I'm seeing is people who have very specific use cases. So, they want to do analysis for a particular campaign, and so they may just pull a bunch of data into memory from across their data environment. And that may be on the cloud. It may be from a third-party. It may be from a transactional system. It may be from anywhere. And that may be done in Hadoop. It may not, frankly. 
>> Yeah, this is the great point, and again, one of the themes on the show is, this is a question that's kind of been talked about in the hallways. And I'd love to hear your thoughts on this. Is there are some people saying that there's really no traction for Hadoop in the cloud. And that customers are saying, you know, it's not about just Hadoop in the cloud. I'm going to put in S3 or object store. >> You're right. I think-- >> Yeah, I'm right as in what? >> Every single-- >> There's no traction for Hadoop in the cloud? >> I'll tell you what customers tell us. Customers look at what they actually need from storage, and they compare whatever it is, Hadoop or any on-premise proprietary storage array and then look at what S3 and Swift and so on offer to them. And if you do a side-by-side comparison, there isn't really a difference between those two things. So I would argue that it's a fact that functionally, storage in cloud gives you all the functionality that any customer would need. And therefore, the relevance of Hadoop in cloud probably isn't there. >> I would add to that. So it really depends on how you define Hadoop. If you define Hadoop by the storage layer, then I would say for sure. Like HDFS versus an object store, that's going to be a difficult one to find some sort of benefit there. But if you look at Hadoop, like I was talking to my friend Blake from Netflix, and I was asking him so I hear you guys are kind of like replatforming on Spark now. And he was basically telling me, well, sort of. I mean, they've invested a lot in Pig and Hive. So if you think now about Hadoop as this broader ecosystem which you brought up Atlas, we talk about Ranger and Knox and all the stuff that keeps coming out, there's a lot of people who are still invested in the peripheral ecosystem around Hadoop as that central point. My argument would be that I think there's still going to be a place for distributed computing kind of projects. 
And now whether those will continue to interface through Yarn via and then down to HDFS, or whether that'll be Yarn on say an objects store or something and those projects will persist on their own. To me that's kind of more of how I think about the larger discussion around Hadoop. I think people have made a lot of investments in terms of that ecosystem around Hadoop, and that's something that they're going to have to think through. >> Yeah. And Hadoop wasn't really designed for cloud. It was designed for commodity servers, deployment with ease and at low cost. It wasn't designed for cloud-based applications. Storage in cloud was designed for storage in cloud. Right, that's with S3. That's what Swift and so on were designed specifically to do, and they fulfill most of those functions. But Joel's right, there will be companies that continue to use-- >> What's my whole argument? My whole argument is that why would you want to use Hadoop in the cloud when you can just do that? >> Correct. >> There's object store out. There's plenty of great storage opportunities in the cloud. They're mostly shoe-horning Hadoop, and I think that's, anyway. >> There are two classes of customers. There were customers that were born in the cloud, and they're not going to suddenly say, oh you know what, we need to build our own server infrastructure behind our own firewall 'cause they were born in the cloud. >> I'm going to ask you guys this question. You can choose to answer or not. Joel may not want to answer it 'cause he's from IBM and gets his wrist slapped. This is a question I got on DM. Hadoop ecosystem consolidation question. People are mailing in the questions. Now, keep sending me your questions if you don't want your name on it. Hold on, Hadoop system ecosystem. When will this start to happen? What is holding back the M and A? >> So, that's a great question. 
First of all, consolidation happens when you sort of reach that tipping point or leveling off, that inflection point where the market levels off, and we've reached market saturation. So there's no more market to go after. And the big guys like IBM and so on come in-- >> Or there was never a market to begin with. (laughs) >> I don't think that's the case, but yes, I see the point. Now, what's stopping that from happening today, and you're a naughty boy by the way for asking this question, is a lot of these companies are still very well funded. So while they still have cash on the balance sheet, of course, it's very, very hard for that to take place. >> You picked up my next question. But that's a good point. The VCs held back in 2009 after the crash of 2008. Sequoia's memo, you know, the good times roll, or RIP good times. They stopped funding companies. Companies are getting funded, continually getting funding. Joel. >> So I don't think you can look at this market as like an isolated market like there's the Hadoop market and then there's a Spark market. And then even there's like an AI or cognitive market. I actually think this is all the same market. Machine learning would not be possible if you didn't have Hadoop, right? I wouldn't say it. It wouldn't have had the resurgence that it has had. Mahout was one of the first machine learning languages that caught fire from Ted Dunning and others. And that kind of brought it back to life. And then Spark, I mean if you talk to-- >> John: I wouldn't say it creates it. Incubated. >> Incubated, right. >> And created that Renaissance-like experience. >> Yeah, deep learning. Some of those machine learning algorithms require you to have a distributed kind of framework to work in. And so I would argue that it's less of a consolidation, but it's more of an evolution of people going okay, there's distributed computing. 
Do I need to do that on-premise in this Hadoop ecosystem, or can I do that in the cloud, or in a growing Spark ecosystem? But I would argue there's other things happening. >> I would agree with you. I love both areas. My snarky comment there was never a market to begin with, what I'm saying there is that the monetization of commanding the hill that everyone's fighting for was just one of many hills in a bigger field of hills. And so, you could be in a cul-de-sac of being your own champion of no paying customers. >> What you have-- >> John: Or a free open-source product. >> Unlike the dotcom era where most of those companies were in the public markets, and you could actually see proper valuations, most of the companies, the unicorns now, most are not public. So the valuations are really difficult to, and the valuation metrics are hard to come by. There are only a few of those companies that are in the public market. >> The cash story's right on. I think to Joel's point, it's easy to pivot in a market that's big and growing. Just 'cause you're in the wrong corner of the market, pivoting or vectoring into the value is easier now than it was 10 years ago. Because, one, if you have a unicorn situation, you have cash in the bank. So they're flush with cash. Your runway's so far out, you can still do your thing. If you're a startup, you can get time to value pretty quickly with the cloud. So again, I still think it's very healthy. In my opinion, I kind of think you guys have good analysis on that point. >> I think we're going to see some really cool stuff happen working together, and especially from what I'm seeing from IBM, in the fact that in the IT crowd, there is a behavioral change that's happening that Hadoop opened the door to. That we're starting to see more and more IT professionals walk through. In the sense that, Hadoop has opened the door to not thinking of data as a liability, but actually thinking about data differently as an asset. 
And I think this is where this market does have an opportunity to continue to grow as long as we don't get carried away with trying to solve all of the old problems that we solved for on-premise data management. Like if we do that, then we're just, then there will be a consolidation. >> Metadata is a huge issue. I think that's going to be a big deal. And on the M and A, my feeling on the M and A is that, you got to buy something of value, so you either have revenue, which means customers, and/or intellectual property. So, in a market of open source, it comes back down to the valuation question. If you're IBM or Oracle or HP, they can pivot too. And they can be agile. Now slower agile, but you know, they can literally throw some engineers at it. So if there's no customers and IP, they can replicate, >> Exactly. >> That product. >> And we're seeing IBM do that. >> They don't know what they're buying. My whole point is if there's nothing to buy. >> I think it depends on, ultimately it depends on where we see people deriving value, and clearly in WANdisco, there's a huge amount of value that we're seeing our customers derive. So I think it comes down to that, and there is a lot of IP there, and there's a lot of IP in a lot of these companies. I think it's just a matter of widening their view, and I think WANdisco is probably the earliest to do this frankly. Was to recognize that for them to succeed, it couldn't just be about Hadoop. It actually had to expand to talk about cloud and talk about other data environments, right? >> Well, congratulations on the OEM deal. IBM, great name, Big Replicate. Love it, fantastic name. >> We're excited. >> It's a great product, and we've been following you guys for a long time, David. Great product, great energy. So I'm sure there's going to be a lot more deals coming on your. Good strategy is OEM strategy thing, huh? >> Oh yeah. >> It reduces sales cost. >> Gives us tremendous operational leverage. 
Getting 4,000, 5,000-- >> You get a great partner in IBM. They know the enterprise, great stuff. This is theCUBE bringing all the action here at Hadoop. IBM OEM deal with WANdisco all happening right here on theCUBE. Be back with more live coverage after this short break.
ENTITIES
Entity | Category | Confidence |
---|---|---|
David | PERSON | 0.99+ |
Joel | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
Joe | PERSON | 0.99+ |
David Richards | PERSON | 0.99+ |
Joel Horowitz | PERSON | 0.99+ |
2009 | DATE | 0.99+ |
John | PERSON | 0.99+ |
4 | QUANTITY | 0.99+ |
WANdisco | ORGANIZATION | 0.99+ |
John Furrier | PERSON | 0.99+ |
20 | QUANTITY | 0.99+ |
San Jose | LOCATION | 0.99+ |
HP | ORGANIZATION | 0.99+ |
thousands | QUANTITY | 0.99+ |
Joel Horwitz | PERSON | 0.99+ |
Ted Dunning | PERSON | 0.99+ |
Big Replicate | ORGANIZATION | 0.99+ |
last year | DATE | 0.99+ |
Silicon Valley | LOCATION | 0.99+ |
Big Replicate | ORGANIZATION | 0.99+ |
40 | QUANTITY | 0.99+ |
30 | QUANTITY | 0.99+ |
Silicon Valley | LOCATION | 0.99+ |
third | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
Hadoop | TITLE | 0.99+ |
San Jose, California | LOCATION | 0.99+ |
three | QUANTITY | 0.99+ |
two things | QUANTITY | 0.99+ |
2008 | DATE | 0.99+ |
5,000 people | QUANTITY | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
100% | QUANTITY | 0.99+ |
David Richards | PERSON | 0.99+ |
Blake | PERSON | 0.99+ |
4,000, 5,000 | QUANTITY | 0.99+ |
S3 | TITLE | 0.99+ |
two classes | QUANTITY | 0.99+ |
tomorrow | DATE | 0.99+ |
Second problem | QUANTITY | 0.99+ |
both areas | QUANTITY | 0.99+ |
three copies | QUANTITY | 0.99+ |
Hadoop Summit 2016 | EVENT | 0.99+ |
Swift | TITLE | 0.99+ |
both | QUANTITY | 0.99+ |
Big Insights | ORGANIZATION | 0.99+ |
one problem | QUANTITY | 0.98+ |
Today | DATE | 0.98+ |
Jim Campigli, WANdisco - #BigDataNYC 2015 - #theCUBE
>> Live from New York. It's The Cube, covering Big Data NYC 2015. Brought to you by Hortonworks, IBM, EMC, and Pivotal. Now for your hosts, John Furrier and Dave Vellante. >> Hello, everyone. Welcome back to live in New York City for the Cube. A special big data [inaudible 00:00:27] our flagship program will go out to the events. They expect a [Inaudible 00:00:30] We are here live as part of Strata Hadoop Big Data NYC. I'm John Furrier. My co-host, Dave Vellante. Our next guest is Jim Campigli, the Chief Product Officer at WANdisco. Welcome back to The Cube. Great to see you. >> Thanks, great to be here. >> You've been COO of WANdisco, head of marketing, now Chief Product Officer for a few years. You guys have always had the patent. David was on earlier. I asked him specifically, why don't the other guys just do what you do? I wanted you to comment deeper on that because he had a great answer. He said, patents. But you guys do something that's really hard that people can't do. >> Right. >> So let's get into it because Fusion is a big announcement you guys made. Big deal with EMC, lot of traction with that, and it's one of these things that is kind of talked about, but not talked about. It's really a big deal, so what is the reason why you guys are so successful on the product side? >> Well I think, first of all, it starts with the technology that we have patented, and it's this true active active replication capability that we have. Other software products claim to have active active replication, but when you drill down on what they're really doing, typically, what's happening is they'll have a set of servers that they replicate across, and you can write a transaction at any server, but then that server is responsible for propagating it to all of the other servers in the implementation. 
There's no mechanism for pre-agreeing to that transaction before it's actually written, so there's no way to avoid conflicts up front, there's no way to effectively handle scenarios where some of the servers in the implementation go down while the replication is in process, and very frequently, those solutions end up requiring administrators to do periodic resynchronization, go back and manually find out what didn't take, and deal with all the deltas, whereas we offer guaranteed consistency. And effectively what happens is with us, you can write at any server as well, but the difference is we go through a peer-to-peer agreement process, and once a quorum of the servers in the implementation agree to the transaction, they all accept it, and we make sure everything is written in the same order on every server. And every server knows the last good transaction it processed, so if it goes down at some point in time, as soon as it comes back up, it can grab all the transactions it missed during that time slice while it was offline, resync itself automatically without an administrator having to do anything. And you can use that feature not only for network and server outages that cause downtime, but even for planned maintenance, which is one of the biggest causes of Hadoop availability issues, because obviously if you've got a global deployment, when it's midnight on Sunday in the U.S., it's the start of the business day on Monday in Europe, and then it's the middle of the afternoon in Asia. So if you take Hadoop clusters down, somebody somewhere in the world is going to be going without their applications and data. >> It's interesting; I want to get your comments on this because this has a great highlight into the next conversation we've been hearing all throughout The Cube this week is analytics, outcomes. These are the kind of things that people talk about because that means there's checks being written. Hadoop is moving into production. People have done the clusters. 
It used to be the conversation, hey, x number of clusters, you do this, you do that, replication here and there, YARN, all these different buzz words. Really feeds and speeds. Now, Hadoop is relevant, but it's kind of invisible. It's under the hood. >> Right. >> Yet, it's part of other things in the network, so high availability, non-disruptive operations, are table stakes now. So I want you to talk about that nuance because that's what we're seeing as the things that are powering, as the engine of Hadoop deployments. What is that? Take us through that nuance, because that's one of the things that you guys have been doing a lot of work in that's making it reliable and stable. To actually go out and play with Hadoop, deploy it, make sure it's always on. >> Well, we really come into play when companies are moving Hadoop out of the lab and into production. When they have defined application SLAs, when they can only have so much down time, and it may be business requirements, it may be regulatory compliance issues, for example, financial services. They pretty much always have to have their data available. They have to have a solid back-up of the data. That's a hard requirement for them to put anything into production in their data centers. >> The other use case we've been hearing is okay, I've got Hadoop, I've been playing with it, now I need to scale it up big time. I need to double, triple my clusters. I have to put it with my applications. Then the conversation's, okay, wait, do I need to do more sysadmin work? How do you address that particular piece because I think that's where I think Fusion comes in from how I'm reading it, but is that a Fusion value proposition? Is it a WANdisco thing, and what does the customer, and is that happening? >> Yeah, so there's actually two angles to that, and the first is how do we maintain that up-time? How do we make sure there's performance availability to meet the SLAs, the production SLAs? 
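The agreement process Jim describes earlier in the interview (a write is accepted only once a quorum agrees, every server applies transactions in the same order, and a server that was offline replays what it missed from its last good transaction) can be illustrated with a toy model. This is only a sketch under stated assumptions, not WANdisco's patented algorithm: the names `Server`, `Cluster`, `propose`, and `recover` are hypothetical, and a single in-memory `agreed` list stands in for what would really be a peer-to-peer agreement protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    online: bool = True
    log: list = field(default_factory=list)  # transactions applied, in agreed order

    @property
    def last_good(self) -> int:
        # Index of the last transaction this server applied (-1 if none).
        return len(self.log) - 1


class Cluster:
    """Toy stand-in for a replicated group of servers (hypothetical names)."""

    def __init__(self, servers):
        self.servers = servers
        self.agreed = []  # the one global order every server must follow

    def propose(self, txn: str) -> bool:
        # A transaction is accepted only if a strict majority is reachable.
        voters = [s for s in self.servers if s.online]
        if len(voters) * 2 <= len(self.servers):
            return False  # no quorum, so nothing is written anywhere
        self.agreed.append(txn)
        for s in voters:
            self.catch_up(s)
        return True

    def catch_up(self, server: Server) -> None:
        # Replay everything after the server's last good transaction.
        server.log.extend(self.agreed[server.last_good + 1:])

    def recover(self, server: Server) -> None:
        # Bring a server back online and resync it with no manual step.
        server.online = True
        self.catch_up(server)
```

In this sketch, taking a server offline for planned maintenance and calling `recover` afterwards leaves its log identical to its peers', which mirrors the automatic resync described in the interview. The majority rule is an assumption made for illustration; the interview does not say how a quorum is defined.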
The active active replication that we have patents for, that I described earlier, and it's embodied in our DConE distributed coordination engine, is at the core of Fusion, and once a Fusion server's installed with each of your Hadoop clusters, that active active replication capability is extended to them, and we expose that HDFS API so the client applications, Sqoop, Flume, Impala, Hive, anything that would normally run against a Hadoop cluster, would talk through us. If it's been defined for replication, we do the active active replication of it. Pass straight through and process normally on the local cluster. So how does that address the issues you were talking about? What you're getting by default with our active active replication is effectively continuous hot back-up. That means if one cluster or an entire data center goes offline, that data exists elsewhere. Your users can fail over. They can continue accessing the data, running their applications. As soon as that cluster comes back online, it resyncs automatically. Now what's the other >> No user involvement? No admin? >> No user involvement in that. Now the only time, and this gets back into what I was talking about earlier, if I take servers offline for planned maintenance, upgrade the hardware, the operating system, whatever it may be, I can take advantage of that feature, as I was alluding to earlier. I can take the servers of the entire cluster offline, and Fusion knows the last good transactions that were processed on that cluster. As soon as the admin turns it back on, it'll resync itself automatically. So that's how you avoid down time, even for planned maintenance, if you have to take an entire location off. Now, to your other question, how do you scale this stuff up? Think about what we do. We eliminate idle standby hardware, because everything is full read write. You don't have standby read-only back-up clusters and servers when we come into the picture. 
So let's say we walk into an existing implementation, and they've got two clusters. One is the active cluster where everything's being written to, read from, actively being accessed by users. The other's just simply taking snapshots or periodic back-ups, or they're using DistCp or something else, but they really can't get full utilization out of that. We come in with our active active replication capability, and they don't have to change anything, but what suddenly happens is, as soon as they define what they want replicated, we'll replicate it for them initially to the other clusters. They don't have to pre-sync it, and the cluster that was formerly for disaster recovery, for back-up, is now live and fully usable. So guess what? I'm now able to scale up to twice my original implementation by just leveraging that formerly read-only back-up cluster that I was >> Is there a lot of configuration involved in that, or is it automatically? >> No, so basically what happens, again, you don't have to synchronize the clusters in advance. The way we replicate is based on this concept of folders, and you can think of a folder as basically a collection of files and subdirectories that roll up into root directories, effectively, that reflect typically particular applications that people are using with Hadoop or groups of users that have data sets that they access for their various sets of applications. And you define the replicated folders, basically a high level directory that consists of everything in it, and as soon as you do that, what we'll do automatically, in a new implementation. Let's keep it simple. Let's say you just have two clusters, two locations. We'll replicate that folder in its entirety to the target you specify, and then from that point on, we're just moving the deltas over the wire. So you don't have to do anything in advance. And then suddenly that back-up hardware is fully usable, and you've doubled the size of your implementations. You've scaled up to 2x. 
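The folder model described above (define a replicated folder, copy it in full once, then ship only the deltas, with either cluster accepting writes) can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than how Fusion is implemented: clusters are modeled as plain dictionaries mapping paths to contents, `ReplicatedFolder`, `initial_sync`, and `write` are hypothetical names, and conflict handling is left out because that is the job of the agreement step.

```python
class ReplicatedFolder:
    """Toy model of one replicated folder shared by several clusters."""

    def __init__(self, folder, clusters):
        self.folder = folder.rstrip("/")
        self.clusters = clusters  # each cluster is a dict: path -> contents
        self.wire = []            # paths that actually crossed the wire

    def covers(self, path):
        # True when the path is the folder itself or anything beneath it.
        return path == self.folder or path.startswith(self.folder + "/")

    def _replicate(self, origin, path, data):
        # Copy one file to every cluster except the one it came from.
        for cluster in self.clusters:
            if cluster is not origin:
                cluster[path] = data
        self.wire.append(path)

    def initial_sync(self, source):
        # One-time full copy of the folder; no pre-syncing by an admin.
        for path, data in list(source.items()):
            if self.covers(path):
                self._replicate(source, path, data)

    def write(self, origin, path, data):
        # Every cluster is read-write; only changes inside the folder move.
        origin[path] = data
        if self.covers(path):
            self._replicate(origin, path, data)
```

With two dictionaries `a` and `b`, calling `initial_sync(a)` copies only the files under the replicated folder into `b`, and a later `write(b, ...)` inside that folder shows up in `a` as a single delta. Files outside the folder stay local, which matches the selective replication described in the interview.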
>> So, I mean what you're describing before, really strikes me that the way you tell the complexity of a product and the value of a product in this space is what happens when something goes wrong. >> Yep. >> That's the question you always ask. How do you recover, because recovery's a very hard thing, and your patents, you've got a lot of math inside there. >> Right. >> But you also said something that's interesting, which is you're an asset utilization play. >> Right. >> You're being able to go in relatively simply and say, okay, you've got this asset that's underutilized. I'm now going to give you back some capacity that's on the floor and take advantage of that. >> Right, and you're able to scale up without spending any more on hardware and infrastructure. >> So I'm interested in, so another company. You're now with an EMC partnership this week. And they sort of got into this way back in the mainframe days with SRDF. I always thought when I first heard about WANdisco, it's like SRDF for Hadoop, but it's active active. Then they bought that Yada Yada. >> And there's no distance limitations for their active active. >> So what's the nature of the relationship with EMC? >> Okay, so basically EMC, like the other storage vendors that want to play in the Hadoop space, expose some form of an HDFS API, and in fact, if you look at Hortonworks or Cloudera, if you go and look at Cloudera Manager, one of the things it asks you when you're installing it is are you going to run this on regular HDFS storage, effectively a bunch of commodity boxes typically, or are you going to use EMC Isilon or the various other options? 
And what we're able to do is replicate across Hadoop clusters running on Isilon, running on EMC ECS, running on standard HDFS, and what that allows these companies to do is without modifying those storage systems, without migrating that data off of them, incorporate it into an enterprise-wide data lake, if that's what they want to do, and selectively replicate across all of those different storage systems. It could be a mix of different Hadoop distributions. You could have replication between CDH, HDP, Pivotal, MapR, all of those things, including EMC Storage that I just mentioned, it was mentioned in the press release, Isilon, and ECS effectively has a Hadoop-compatible API support. And we can create in effect a single virtual cluster out of all of those different platforms. >> So is it a go-to-market relationship? Is it an OEM deal? >> Yeah, it was really born out of the fact that we have some mutual customers that want to do exactly what I just described. They have standard Hortonworks or Cloudera deployments in house. They've got data running on Isilon, and they want to deploy a data lake that includes what they've got stored on Isilon with what they've got in HDFS and Hadoop and replicate across that. >> Like onerous EMC certification process? >> Yeah, we went through that process. We actually set up environments in our labs where we had EMC, Isilon, and ECS running and did demonstration integrations, replication across Isilon to HDP to Hortonworks, Isilon to Cloudera, ECS to Isilon to HDP and Cloudera and so forth. So we did prove it out. They saw that. In fact, they lent us boxes to actually do this in our labs, so they were very motivated, and they're seeing us in some of their bigger accounts. 
>> Talk about the aspect of two things: non-disruptive operations, meaning I have to want to deploy stuff because now that Hadoop has a hardened top with some abstraction layer, with analytics to focus, there's a lot of work going on under the hood, and a large scale enterprise might have a zillion versions of Hadoop. They might have little Hortonworks here. They might have something over here, so there might be some diversity in the distributions. That's one thing. The other one is operational disruption. >> Right. >> What do you guys do there? Is it zero disruption, and how do you deal with multiple versions of the distro? >> Okay, so basically what we're doing, the simplest way to describe it is we're providing a common API across all of these different distributions, running on different storage platforms and so forth, so that the client applications are always interacting with us. They're not worrying about the nuances of the particular Hadoop APIs that these different things expose. So we're providing a layer of abstraction effectively. So we're transparent in effect, in that sense, operationally, once we're installed. The other thing is, and I mentioned this earlier, we come in, basically, you don't have to pre-sync clusters, you don't have to make sure they're all the same versions or the same distros or any of that, just install us, select the data that you want to replicate, we'll replicate it over initially to the target clusters, and then from that point on, you just go. It just works, and we talked about the core patent for active active replication. We've got other patents that have been approved, three patents now and seven applications pending, that allow this active active replication to take place while servers are being added and removed from implementations without disrupting user access or running applications and so forth. >> Final question for you, sum up the show this week. What's the vibe here? What's the aroma? 
Is it really Hadoop next? What is the overall Big Data NYC story here in Strata Hadoop? What's the main theme that you're seeing coming out of the show? >> I think the main theme that we're starting to see, it's twofold. I think one is we are seeing more and more companies moving this into production. There's a lot of interest in Spark and the whole fast data concept, and I don't think that Spark is necessarily orthogonal to Hadoop at all. I think the two have to coexist. If you think about Spark streaming and the whole fast data concept, basically, Hadoop provides the historical data at rest. It provides the historical context. The streaming data provides the point in time information. What Spark together with Hadoop allows you to do is that real time analysis, do the real time informed decision-making, but do it within historical context instead of a single point in time vacuum. So I think what's happening, and you notice the vendors themselves aren't saying, oh it's all Spark, forget Hadoop. They're really talking about coexisting. >> Alright, Jim, from WANdisco, Chief Product Officer, really in the trenches, talking about what's under the hood and making it all scale in the infrastructure so his analysts can hit the scene. Great to see you again. Thanks for coming and sharing your insight here on The Cube. Live in New York City. We are here, day two of three days of wall-to-wall coverage of Big Data NYC in conjunction with Strata. We'll be right back with more live coverage in a moment here in New York City after this short break.
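The idea from this interview of judging a point-in-time reading against historical context can be sketched in a few lines. This is plain Python rather than Spark or Hadoop code, and the sensor names, values, and the 25 percent threshold are invented purely for illustration.

```python
from statistics import mean

# Historical data at rest (the role Hadoop plays in the discussion above).
history = {
    "sensor-a": [20.0, 21.0, 19.0, 20.0],
    "sensor-b": [50.0, 52.0, 48.0, 50.0],
}

# One baseline per key, computed once from the historical data.
baselines = {key: mean(values) for key, values in history.items()}


def assess(event, threshold=0.25):
    """Compare one streaming reading with its historical baseline."""
    baseline = baselines.get(event["key"])
    if baseline is None:
        return "no-history"  # a point-in-time value with no context
    deviation = abs(event["value"] - baseline) / baseline
    return "unusual" if deviation > threshold else "normal"


# Streaming data: each event is a single point-in-time reading.
stream = [
    {"key": "sensor-a", "value": 21.0},
    {"key": "sensor-b", "value": 80.0},
    {"key": "sensor-c", "value": 5.0},
]
for event in stream:
    print(event["key"], assess(event))
```

Without the baselines, the reading of 80.0 is just a number; with them, it can be flagged as far from what that sensor has historically reported. A real deployment would compute the baselines with a batch job and apply them inside a streaming job, which this sketch does not attempt.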
ENTITIES
Entity | Category | Confidence |
---|---|---|
David | PERSON | 0.99+ |
Jim | PERSON | 0.99+ |
Jim Campigli | PERSON | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
Europe | LOCATION | 0.99+ |
WANdisco | ORGANIZATION | 0.99+ |
EMC | ORGANIZATION | 0.99+ |
Asia | LOCATION | 0.99+ |
U.S. | LOCATION | 0.99+ |
New York | LOCATION | 0.99+ |
John Furrier | PERSON | 0.99+ |
Horton Works | ORGANIZATION | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
John Furrier | PERSON | 0.99+ |
New York City | LOCATION | 0.99+ |
two locations | QUANTITY | 0.99+ |
Strata Hadoop | TITLE | 0.99+ |
first | QUANTITY | 0.99+ |
Pivotal | ORGANIZATION | 0.99+ |
one | QUANTITY | 0.99+ |
two things | QUANTITY | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
Hadoop | TITLE | 0.99+ |
One | QUANTITY | 0.99+ |
two | QUANTITY | 0.99+ |
two clusters | QUANTITY | 0.99+ |
three days | QUANTITY | 0.99+ |
Monday | DATE | 0.99+ |
three patents | QUANTITY | 0.98+ |
this week | DATE | 0.98+ |
seven pending applications | QUANTITY | 0.98+ |
two angles | QUANTITY | 0.98+ |
two clusters | QUANTITY | 0.98+ |
Spark | TITLE | 0.97+ |
this week | DATE | 0.97+ |
one cluster | QUANTITY | 0.97+ |
00:00:30 | DATE | 0.95+ |
ECS | TITLE | 0.95+ |
HDP | ORGANIZATION | 0.94+ |
Cloudera Manager | TITLE | 0.94+ |
single point | QUANTITY | 0.94+ |
#BigDataNYC | EVENT | 0.94+ |
each | QUANTITY | 0.94+ |
Impala | TITLE | 0.93+ |
NYC | LOCATION | 0.93+ |
twofold | QUANTITY | 0.93+ |
Strata | ORGANIZATION | 0.92+ |
Flume | TITLE | 0.92+ |
00:00:27 | DATE | 0.92+ |
Sqoop | TITLE | 0.92+ |
Fusion | TITLE | 0.91+ |
Isilon | ORGANIZATION | 0.89+ |
Cloudera | ORGANIZATION | 0.89+ |
midnight | DATE | 0.89+ |
Sunday | DATE | 0.88+ |
Isilon | TITLE | 0.88+ |
single | QUANTITY | 0.88+ |
HIVE | TITLE | 0.87+ |
one thing | QUANTITY | 0.83+ |
double | QUANTITY | 0.83+ |
Amr Awadallah - Hadoop Summit 2013 - theCUBE - #HadoopSummit
>>We're back here. This is Silicon Valley coverage of Hadoop Summit. I'm John Furrier, the founder. We're pleased to have a friend inside the Cube. It's rare to have such luminaries: Amr Awadallah, good friend and also co-founder of Cloudera, really the pioneer in the space that helped build this industry that we're living here at Hadoop Summit. I'm with Dave Vellante from wikibon.org. Amr, welcome back to the Cube, Cube alumni. Thank you for having me here. Wow, what a journey. You co-founded Cloudera. I remember when you were in stealth mode: "I really can't talk about it." And then of course the history of SiliconANGLE being, you know, founded and kind of built in your office when you only had like 20-something employees. Yep. We owe a great deal of gratitude to you, and congratulations to you, Michael Olson, and the team for building an industry. So I just wanted to thank you. Thank you. And welcome to the Cube. >>Thank you. It's great to be here. >>So what do you think, what's your take on the current Hadoop ecosystem right now? I mean, obviously a lot's happened. I mean it's big now. It's growing up fast. Yeah. The term enterprise grade is out there. You're seeing it move from, you know, trying to change the world. Our first interview, you said, I've seen the future, I want to bring it to the mainstream. It's here. Yeah. It's hitting mainstream right now. Yeah. What's your take on the current situation of the ecosystem and its value? >>Yeah, so I have a quick question first. Should I look to you or look to the camera? Look to >>The camera or both? Whatever you'd like. >>So I think the ecosystem is definitely growing, which is very, very healthy. However, there is a side question there, which is what do you think of all the competition coming into the space? So five years ago when Cloudera was started, it was just Cloudera. There was no other commercial vendor trying to support or enable Hadoop in the industry for enterprises. 
And today there are at least 10 of them trying to compete with us, right? And that includes big, established companies that decided, hey, we're gonna start addressing the space, but it also includes many, many newcomers like Hortonworks, who were founded over the last couple of years. That's a healthy thing. I mean, that's absolutely a sign of a growing market. If the market wasn't growing, if there wasn't money in the market, if it was just hype, there wouldn't have been all of these new companies and new ventures showing up. That said, I never look at competition as something that worries me, that I'm afraid of, or what's gonna happen to me. That's normal. That's exactly what happens to successful companies. If you look at Red Hat, when Red Hat was launching with Linux, they had 25 competitors or even more, 30 competitors. That's when Red Hat was forming. And today, out of these 25, 30 competitors, they still have six or seven left. So I think it's a very, very healthy sign of the growth of this market and the maturity that it's reaching. >>What do you think about some of the white spaces that are evolving? You guys have obviously been involved in a lot of deployments at Cloudera. Again, you're doing a lot of work with the top names, and the clients that you have aren't usually disclosed cuz you really can't disclose them. What are you seeing right now as the white spaces for things to do in the Hadoop platform? >>It's a very, very good question. So first, I can't talk about the future roadmap. Right now we're becoming a big company, at that level where we can't comment on future roadmaps. >>Ah, that's a sign of the >>Time. You're well media trained. Good to see they're doing a good job keeping you >>Ah, you want more information on that? I can connect you with PR. >>Please. No, no, no, we're good. We're good. We'll get it outta you. 
But, >>But our vision for Cloudera from day one, like you were saying earlier, we saw the future, right? So our vision from day one was really to build this data system where we can have data of any type, whether that data is structured or unstructured or images, it doesn't matter. And then on top of that data run any type of workload. That workload could be the initial genesis of Hadoop, which is MapReduce, which is batch processing. But now, as we made many announcements through the last few years, we also have Impala for interactive analytics as a workload. We have a very, very strong partnership with SAS for doing machine learning and statistics as a workload. And a few weeks ago we announced search as another workload. So you have multiple types of workloads that can handle different types of problems that you have within your organization, and you bring all of these workloads to all of your data regardless of type. And that's the vision that we'll continue to deliver on. That's exactly what we're building going into the >>Future. So how's that fit in with YARN, right? We're hearing a lot at this conference about YARN, the ability to, you know, do more with less, a lot of the things that you typically hear within the enterprise. And so talk about that a little bit. >>YARN is a very core part of our platform. In fact, YARN has been part of CDH 4 for more than a year now out in the market. So I think we were the first vendor who brought YARN into a distribution of Hadoop out there. It's very, very fundamental to us because that is how we're gonna coordinate. We are gonna be using YARN to coordinate launching all of these different types of workloads. You're gonna have the MapReduce workload, which is very batch oriented. The Impala workload, which is very latency sensitive. The search workload, which is also very latency sensitive. 
The machine learning workload, which is more batch oriented, et cetera, et cetera. And YARN is a very, very central piece to helping us coordinate all of these different types of workloads onto the >>Platform. Cloudera has been a great citizen in the community also. You mentioned, and we witnessed, that your team created the industry. You guys were there, you took the chance, you were the first ones commercially funded by the venture capitalists, you know, then others followed, and now we see a huge ecosystem here. Yes. A lot of noise. A lot of people trying to get attention. So I've got to ask you, because I want you to address this, because I know it's been talked about in some of the other blogs: there's a lot of FUD going on around who's doing what, and in some cases maybe flat out, you know, misinformation, and that happens in a growing market, you know, the elbows get sharp. Yes. So I want you to share with the audience anything that you want to say about the FUD around what people say about Cloudera or about others or what you're doing. Just to clarify, cuz there has been, I mean I've gotten back channel information around, you know, not sure the committers this, and it's been well documented. There's a lot of FUD out there. What would you say to the folks out there to clarify >>That? Yes, I would say that our focus should be to continue to work as a community to push the platform forward. I would say that at Cloudera we do a lot of contributions. Hortonworks definitely is one of the top contributors out there as well. I'll acknowledge that. So are many, many other companies, and we wanna continue to see the platform evolve. I will stress though that at Cloudera we do have a number of the original project founders working at the company. So it's not just the contribution that we bring, but the fact that we have the founders of these projects working at Cloudera. 
And some of these projects actually were created at Cloudera from day one, as opposed to created in some other company, and then you hire the employee and they work for you. So I'll give you examples from Cloudera: Doug Cutting. >>He is the creator of Hadoop. Doug Cutting is also the creator of Lucene, which became Solr, which is part of the search project that we launched recently. Doug Cutting wasn't with Cloudera from day one, right? So when he created these technologies, he actually was at Yahoo, for example; when he created Hadoop he was at Yahoo, he wasn't at Cloudera. However, he now works for Cloudera. So we get that because Doug Cutting now works for Cloudera. So that's one example. On the flip side, there are projects like Flume and Sqoop that are now part of every single distribution out there. And Flume and Sqoop were both created at Cloudera. They were actually created inside of Cloudera. Yeah. So the key point is, and that's what I would like to say to all of the vendors out there that are trying to leverage Hadoop and get benefit out of Hadoop: please don't be just takers. >>There are some vendors out there who are just takers. They just wanna take from the open source, take from the open source, and don't give back. Right? I'm not gonna name them, but there are a few of them out there. Please, please, please. I mean, that is very, very selfish behavior. It's not gonna help the ecosystem in the long term. We would like to see you both take and give at the same time. So that would be my core message. And that's, for example, why I thank Hortonworks, because that's exactly what Hortonworks is doing. They're both giving and taking at the same >>Time. You guys have always been clear on that. I mean, your contribution to open source has been well documented and there's no question about that. John and I have talked about it a lot, that you guys helped get it all started. 
And even Haak, when we had 'em on a couple years ago, when Hortonworks came to the market, said, Hey, the more people work on open source, the better. >>Yeah, >>Exactly. So yeah, it's always been your posture. You're not playing games there. Anyways, having said that, you have a strategy to layer on top of that open source some of your own proprietary code. And so you have choices to make. Yes. In terms of how you allocate those resources. So as an engineering manager, how do you allocate those resources in terms of, okay, what do we do for the community and what do we do for our own, you know, future because of the business model that we chose? How do you make those trade-offs? >>Yes, that's a very, very good question. So first it's important to stress that our core platform, CDH, is open source. Everything we put in the core platform is open source. So for example, Impala, which we launched very recently as GA (we launched the beta last year, but now it's GA), is a hundred percent Apache licensed, a hundred percent open source. Search, which we announced very recently, is also open source. So for the platform itself, we're committing to everything in there being open source. Now we believe fundamentally, just from having lots of history in studying the open source markets, from our CEO Mike Olson himself being one of the very first open source people in the world with Sleepycat, the company that he sold to Oracle before founding Cloudera, and from our investors helping many other open source companies: to have a successful open source company, you need to have a very good engine between the business model that generates revenue and the product that you are creating. If you don't have a good feedback loop there between these two, you won't be able to sustain the innovation to continue to push the boundaries of how good the product is. 
So we strongly believe in that. If your product is literally a hundred percent open source, meaning both the management and everything, there is nothing proprietary whatsoever inside of your products. I can't tell what that is. It's >>Taking a picture. >>Oh, sorry, I thought somebody was waiting >>For me. >>Sorry about that. >>It's a cheap signal. >>It >>Was like a's really good. >>I thought it's like a card of paper with some writing. You, >>You have fans out there. They're storming the concert here. >>Okay, that's good to hear. That's good to hear. Sorry about that interruption. So if you have everything a hundred percent open source, that creates two problems. First, you have no differentiation whatsoever, meaning another big corporation, without naming who the big corporations could be, can just take everything you do, literally every single bit of source code you have, and say, Hey, we can do it too. Come to us, don't work with those guys. Right? We have the latest, greatest things that they have. Why do you wanna continue to work with them? So no differentiation is number one, which is very dangerous. And number two, if it's a hundred percent open source and there are lots of other vendors able to take the open source artifact and work with it, then it becomes purely about maintenance and insurance on the products, which is a commodity product, and obviously the prices for that will go down to the ground, and you won't be able to sustain this positive feedback effect between your business model and your product roadmap, and you won't be able to build a long-lasting company. >>So that's why we do have a combination of open source artifacts and proprietary artifacts. Now our proprietary artifacts are always around the management of the system, right? So how do we manage the security of the system? How do we manage the data flow within the system? 
How do we manage the services inside of the system across all layers, right? Not just the Hadoop layer but the HBase layer, the ZooKeeper layer, et cetera, et cetera. So that's where we focus our efforts going forward and that's how we differentiate ourselves from other vendors out there. Cloudera Manager and Cloudera Navigator are very unique to us. Nobody else has anything close to those capabilities out there. >>So it sounds like the contributions you make to open source are cultural in nature, I mean DNA of sorts. Right. And so that's something that you guys do cuz you've always done it. Absolutely. And then the artifacts that are proprietary are essentially around rationalizing the revenue opportunity with the expense that you're gonna apply there and making a business case to decide >>How to balance. That's one. And then two, the differentiation from other competitors. So these two things. Yes. >>Okay. >>I believe that's fundamental to open source business models. >>Yeah, I mean there are many open source business models, right? You can go pure service, you can go, like you said, you can totally bogart the code. >>There is no pure service open source model company that was able to build a long-lasting, surviving public company. It never happened in history. They always get acquired because it becomes a commodity. I >>Mean, right. I mean, and even IBM, right? >>Amr, I want to ask you about the storage thing. We were talking before the camera about the Hortonworks storage announcement. What's your take on that? >>Which one? The Gluster, the one with Red Hat? Yes. Yes. So Red Hat, yeah, there has been recent news about Red Hat with Hortonworks having a version of the Hadoop platform that uses MapReduce for the computation but uses Red Hat for the storage, right? So Red Hat has a new storage offering that was built based off of a company they acquired that was called Gluster. 
And that news was very, very surprising to me. And the reason why it was surprising, it's correlated also with a shift in messaging from Hortonworks. If you look at Hortonworks last year at Hadoop Summit, one of the key messages that they delivered to us is that within the next five years or by 2015, the tagline back then was by 2015, and you're doing research right now to see if I'm saying the right thing. By 2015, half the world's data will be stored in Hadoop. Yes. If you look today at the slides, it >>Doesn't say that, it says within five years, >>Right? No, no, no. It says, well >>That was the second iteration, it was within five years. And now they say something >>Different. Now they say by 2015, half the world's data will be processed by Hadoop instead of stored by Hadoop. And that's a very, very fundamental So >>It's a nuance. >>It's a very important >>Nuance. Well it's a big deal because yes, when I first saw that I said, Hmm, what does this all mean? And then 2015 sounds a little early. Yes. And now you're saying processed by. Okay, that's different. >>Yes, exactly. And the reason why is we believe HDFS is very, very core to the Hadoop platform. HDFS is very core to the Hadoop platform, the storage system of Hadoop. It's really the layer that made Hadoop what it is; more than anything else, it is how scalable, how reliable and how economical the HDFS storage layer is. So we really, I mean, ask Hortonworks and ask all the companies working in the Hadoop community not to fragment at the storage layer. We need the storage for Hadoop to stay inside of Hadoop and not to fragment that out. That's very, very critical. >>Okay. So but so >>You're saying that they're indicating through the gesture that, they've not come out saying we're going to fragment HDFS, but the way that this is positioned might signal >>No, no, no. 
The announcement with Red Hat is >>That is the direct signal. It's >>Literally, you'll be able to run MapReduce directly on top of Red Hat storage instead of HDFS. >>Okay. So >>I >>Interpreted it as Hortonworks just hedging on its prediction, which I said, Okay, I'll give 'em a break on that. You're saying it's something different, >>It's a shift in strategy potentially. Yeah. Which can be dangerous. It's a shift in strategy. >>Is that a compliance issue? Cuz you know, the, the (inaudible) POSIX. Yeah. Red Hat does have a lot of enterprise customers. Yeah. So is that just maybe if >>Then invest in making Hadoop POSIX compliant, which actually, by the way, we are as a community investing in. Yeah. Yes. You must have. Yeah. So we are investing in adding POSIX compliance to Hadoop, we're investing in adding snapshots into Hadoop, which will be coming very, very soon. >>Well, do you think that, pick a year, I don't care if it's 2015 2000, 22,000 whenever, that the majority of the world's data will be running in Hadoop >>The majority of the world's data that has to do with analytics. Yes. Okay. So there is, >>So that is that >>It's very important, the caveat. Yes, exactly. Because there are lots of types of data that are not very suitable for Hadoop at all. For example, the data storage for Oracle systems, for Oracle database systems. No, you wanna store that in a NetApp or EMC, you don't wanna store that in Hadoop. The data storage for streaming video files, right? For just streaming lots and lots of video files. No, you don't wanna store that in Hadoop. It's >>A huge >>Proportion of the data. Yeah. Which is a huge, huge >>Proportion of data files, in fact that could overwhelm the data. >>Yeah. So the nuance, I would say, is I agree with the half thing, but the half thing within the world of data for the purpose of analysis. >>Yeah. Okay. So that's, that's >>Narrow down the >>Yeah, okay. 
But it's a more reasonable, But I've, I >>Never, It's still a huge market by the way. It is. Yeah, >>It is. Yes. Okay. So what's next for you? You've gone on this journey, you started this company. You've been traveling around like crazy working with customers. What's the next phase of Amr Awadallah's, you know, career? >>What >>Do you want to have happen next? I mean, what excites you? What are you working on? >>Yeah, it's just to continue to grow Cloudera to be the biggest company it can be. I mean, literally, we want to be one of the very few companies that were able to take an open source model and turn that into a large publicly traded corporation. >>So you've talked about that. You guys brought a new CEO on. Right. Look at the background of the CEO and, you know, clearly he's got some IPO chops. Yes. So that's an aspiration that you guys have put forth. Okay. >>And you're outward facing now. So you're doing a lot of travel. Yes. So where have your travels taken you? You've been in China, obviously you've got a European office open. Yeah. So what's going on internationally? Give us some sound bites of what's happening in the field. Yeah, >>So internationally, I mean, Europe definitely is our next big focus right now. And we now have a big operation in Europe, and we have an office presence in Europe and a big team down there. And it's growing very quickly. I would say Europe is about two years behind the US; that's how the growth usually mirrors what's happening here. And yeah, so our next big market is Europe. We are looking at China. We don't have a big presence in China right now. Japan, we have a big presence in Japan. Japan is growing very quickly. So yeah, I mean, obviously Canada with the US is growing very quickly as well. 
>>Great to have you on the Cube again, for me personally and for Dave. And I wanna say thanks to Cloudera for some great support over the years. You guys have been fantastic. You know, I'd say you've built a great company. It's so hard to build a company. You guys have done a great job. I gotta ask you the final question, because you did bring that first sound bite, which was, I saw the future. This is back when you guys were just in your B round in the Palo Alto office, just starting to ramp. What's next? What do you see as around the corner? Obviously we're on a trajectory right now. A lot of things are gonna get done. POSIX compliance, a lot of stuff's gonna fill in. The platform's gonna get stronger. Yeah. We think that open source will win. Yeah. Through all the democratization of open source. What's next? What's around the corner that you're watching personally, that's interesting to you, around where this will take us? >>Yeah. So what's next is having this vision become true. Having this future vision that you refer to become true. Meaning having a single platform that can store all of your data, regardless of the type of that data, and allow you to extract value for different types of workloads, whether that be batch, interactive, machine learning or search or more, right? There will be more things that will come to the platform, but how to bring your applications, all of your data applications, to your data, and all of your data, as opposed to having the data go to them. >>And what are the landmines out there that you need to avoid? Yes. That the industry and community need to avoid to make that a reality. >>The key landmine, it's a bit technical. 
The landmine is a bit technical, which is making sure that the YARN vision continues to evolve and that we have the capability to properly have a multi-workload resource management system that allows me to run all of these types of workloads without having them step on each other's toes. That's the key step going forward. And >>Of course, playing well together in the sandbox. And as always, competition is good. And again, Hadoop is doing great. Amr Awadallah, co-founder of Cloudera, inside the Cube. This is SiliconANGLE and Wikibon's exclusive coverage of Hadoop Summit here in Silicon Valley. We'll be right back with our next guest after the short break.
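The multi-workload resource management concern raised in this answer can be illustrated with a toy allocator. This is not YARN's scheduler or API. It is a minimal sketch of one simple policy, under the assumption that each workload gets a guaranteed share first and any spare capacity is handed out afterwards; the queue names and numbers are made up.

```python
def allocate(capacity, queues):
    """Split `capacity` containers among workload queues.

    queues maps a name to (guaranteed, requested). Each queue first
    receives up to its guaranteed share, so a large batch request
    cannot take capacity reserved for a latency-sensitive workload.
    Spare capacity then goes to queues that still want more, in the
    order the queues were given.
    """
    if sum(g for g, _ in queues.values()) > capacity:
        raise ValueError("guaranteed shares exceed cluster capacity")

    # Pass 1: honor guarantees, but never grant more than was requested.
    grants = {name: min(g, r) for name, (g, r) in queues.items()}

    # Pass 2: hand out whatever is left to queues with unmet demand.
    spare = capacity - sum(grants.values())
    for name, (_, requested) in queues.items():
        extra = min(spare, requested - grants[name])
        grants[name] += extra
        spare -= extra
    return grants


demo = allocate(100, {
    "interactive": (40, 25),  # latency sensitive, currently lightly loaded
    "search": (20, 20),       # latency sensitive
    "batch": (40, 90),        # batch job asking for most of the cluster
})
print(demo)
```

In this run the batch queue ends up with more than its guarantee because the interactive queue is not using its full share, yet the interactive and search queues receive everything they asked for. Real schedulers also deal with preemption, priorities, and changing demand over time, none of which is modeled here.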
ENTITIES
Entity | Category | Confidence |
---|---|---|
Michael Olson | PERSON | 0.99+ |
John | PERSON | 0.99+ |
Europe | LOCATION | 0.99+ |
Mike Olson | PERSON | 0.99+ |
six | QUANTITY | 0.99+ |
John Fur | PERSON | 0.99+ |
China | LOCATION | 0.99+ |
Dave | PERSON | 0.99+ |
Amma Aala | PERSON | 0.99+ |
Cloudera | ORGANIZATION | 0.99+ |
Silicon Valley | LOCATION | 0.99+ |
Horton Works | ORGANIZATION | 0.99+ |
Japan | LOCATION | 0.99+ |
2015 | DATE | 0.99+ |
25 | QUANTITY | 0.99+ |
last year | DATE | 0.99+ |
seven | QUANTITY | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
25 competitors | QUANTITY | 0.99+ |
Dave Ante | PERSON | 0.99+ |
Ama Aala | PERSON | 0.99+ |
two | QUANTITY | 0.99+ |
two problems | QUANTITY | 0.99+ |
Red Hat | ORGANIZATION | 0.99+ |
30 competitors | QUANTITY | 0.99+ |
Calera | ORGANIZATION | 0.99+ |
today | DATE | 0.99+ |
First | QUANTITY | 0.99+ |
both | QUANTITY | 0.99+ |
ADU Summit | EVENT | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
five years ago | DATE | 0.99+ |
second iteration | QUANTITY | 0.99+ |
one | QUANTITY | 0.98+ |
22,000 | QUANTITY | 0.98+ |
Horton | ORGANIZATION | 0.98+ |
first vendor | QUANTITY | 0.98+ |
five years | QUANTITY | 0.98+ |
hundred percent | QUANTITY | 0.98+ |
Red Hat | TITLE | 0.98+ |
Canada | LOCATION | 0.98+ |
Tia | ORGANIZATION | 0.98+ |
Tom | PERSON | 0.98+ |
Hor Works | ORGANIZATION | 0.97+ |
first | QUANTITY | 0.97+ |
Horton | PERSON | 0.97+ |
two things | QUANTITY | 0.97+ |
first interview | QUANTITY | 0.97+ |
Stealth Mo | LOCATION | 0.97+ |
half | QUANTITY | 0.96+ |
Haak | PERSON | 0.96+ |
one example | QUANTITY | 0.96+ |
Hadoop Summit 2013 | EVENT | 0.95+ |
Dr. Amr Awadallah - Interview 2 - Hadoop World 2011 - theCUBE
Yeah, Amr Awadallah, the co-founders back to back. This is theCUBE, SiliconANGLE.com and SiliconANGLE.tv's production of theCUBE, our flagship telecast. We go out to the events. That was a great conversation. It was really just cool. We could have probably hit on a few more things; he's obviously well read. Awesome. Co-founder of Cloudera. You did a good job teaming up with that co-founder, huh? He's not bad on theCUBE, is he? >>He reads the internet. >>That's what I'm saying. >>Anything that's going on. >>He's a CUBE star, you know. >>And technology. Jeff knows it. Yeah. >>I tell you, I'm smarter just by being at Cloudera all those years. And I actually was following what he was saying, and it didn't bust my brain. Okay, so you're back. So we were talking earlier with Michael Olson about the relational database thing, and I want to pick that up where we left off with you. He was really excited. It's like, hey, we saw that relational database movement happen. He was part of that generation. And things are happening in a similar way now, but it's still early. So I was trying to really peg with him how early we are on the curve. This is 1,400 people; it's not the Javits Center yet. Maybe Hadoop World next year might be at the Javits Center, 35,000; just don't go to Vegas. So I'm trying to figure out where we are on that curve. Are we on the upward slope, or down here, not even hitting that? >>I think we're moving up quicker than previous waves. If you look, for example, at Oracle, I think it took them 15, 20 years until they really became a mature company. VMware, which started about 12, 13 years ago, took maybe eight years to be a big company, a mature company, and I'm hoping we're gonna do it in five. 
So a couple more years. >>Highly accelerated. >>Yes. I mean, I've been surprised by the growth. I had been warned about enterprise software and that it takes a long time for production to take place. >>But the consumerization trend is really changing that. It seems that the enterprise is always last. Why the shorter cycle? >>I think the shorter cycle is coming from having the right solution for the right problem at the right time. I think that's a big part of it. So luck definitely is a big part of this. Now, in terms of why the adoption is changing compared to a couple of decades ago, I think that's coming just because of how quickly the technology itself, the underlying hardware, is evolving. Right now, the fact that you can buy a single server with eight to 16 cores and 12 hard drives of two terabytes each is something that's just pushing the limits of what you can do with the existing systems, and hence making it more likely for new systems to disrupt them. >>Yeah. We talk about that a lot. It's very easy for people to actually start a big data project. >>Yes. >>For example. >>Yes. >>And the hardest part is, okay, what problem do I need to solve? How am I gonna monetize it? Right? Those are the hard parts. It's not the underlying technology. >>Yes, yes, that's true. >>You're saying, eh. >>Because I'm seeing both. I'm seeing cases where you're right. There are some companies that are like, 'Oh, this Hadoop thing is so cool. What problem can I solve with it?' And I see other companies that say, 'I have this huge problem,' and they don't know that Hadoop exists. And once they know, they just jump on it right away. 
It's like when you have a headache and you're searching for the medicine, and it's aspirin. Wow, it works. >>I was talking to Jeff Hammerbacher before he came on stage, and I didn't even get to it because we were on such a nice riff there, like a bunch of musicians playing guitar together. But we talked about the IT dynamics, and he said something that I thought was right on the money, and SAP is saying the same thing: they're going to the lines of business. >>Yes. >>Because IT is the gatekeeper. It's like selling minicomputers to a mainframe team, or selling client-server to a minicomputer team. >>Yeah. We're seeing both as well. More likely the former, meaning that lines of business and departments adopt the technology, and then IT comes in and sees there are already these five different departments having it, and they think, okay, now we need to formalize this across the organization. >>So what happens then? What are you seeing out there when that happens? People get their hands on it: 'Hey, we've got a problem to solve.' Is that what it comes down to? 'Well, Hadoop exists. Go get Hadoop.' They plop it in there, and what does it do? >>So they pop it into their own installation or on the cloud, and they show that this actually is working and solving the problem for them. And when that happens, it's a very easy adoption from there on, because they just go tell IT, 'We need this right now because it's solving this problem and it's gonna make us much more money.' >>Moving it right in. Yes. No problems. >>Is that another reason why the cycle's compressed? I mean, you think of client-server, there was a lot of resistance from IT. Same thing with mobile. I mean, mobile has flipped, right? So okay, bring it in. We've gotta deal with it. >>Yep. >>I would think the same thing. 
We have a data problem. Let's turn it into an opportunity. >>Yeah. And it goes back to what I said earlier, the right solution for the right problem at the right time. When you have larger amounts of unstructured data, there isn't anything else out there that can even touch what Hadoop can do. >>So Amr, I need to just change gears here a minute. The gaming stuff. We're featured on justin.tv right now on the front page. >>Oh wow. >>But the numbers aren't coming in because there's a competing stream of the recently released Modern Warfare 3. >>Yes. Yes. >>So we have to compete with Modern Warfare 3. Can we talk about Modern Warfare 3 for a minute, and share with the folks what you think of the current version, if you've played it? >>Yeah. So unfortunately I'm waiting to get back home. I don't have my Xbox with me here. >>I'm talking about my lines of business. >>Boom. Modern Warfare's like a Christmas tree here. Sorry. >>You know, I'm a big gamer, a big video gamer. At Cloudera, every Thursday at 5:30 in the office, we play Call of Duty 4, which is Modern Warfare 1, actually. And I challenge people out there to come challenge our team. Just ping me on Twitter and we'll do a Cloudera versus... >>Let's reframe that. The team out there at Amr Awadallah's company: these are the geeks that invent the future. Jeff Hammerbacher, at Facebook, now at Cloudera. Amr leading the charge. These guys are gamers. So all the young gamers out there, Amr is saying he's gonna challenge you. At which version? >>Modern Warfare 1. >>Modern Warfare 1. >>Yes. >>How do they get in? Can you set up an external... >>We'll figure it out. We'll figure it out. >>Okay. >>Yeah. Just ping me on Twitter and we'll... >>We can carry it live, actually; we can stream that. >>Yeah, that'd be great. >>Great. >>Yeah. 
So I'll tell you, some of our best Hadoop committers and Hadoop developers... >>Paint a picture: Modern Warfare 3. >>Going now to Modern Warfare 3: very excited about the game. I saw the trailers for it, and the graphics look just amazing. I've loved the series since the first one that came out. And I'm looking forward to getting back home and playing the game. >>I can't play; my son won't let me play. I'm such a fumbler with the thumbs. I'm a keyboard guy. I can't work the Xbox controller. I have a coordination problem at my age and I'm just a klutz, and it's like, 'Dad, sorry, charity's over. Can I play with my friends?' But I'm a big gamer. >>But something I wanted to bring up is how to link up gaming with big data and analysis and so on. I'm a big gamer. I love playing games, but at the same time, whenever I play games, I feel a little bit guilty because it's kind of like wasted time. I mean, yeah, it's fun and I'm getting lots of enjoyment and it makes my life much more cheerful. But still, how can we harness all of these hours that gamers spend playing a game like Modern Warfare 3? How can we collect and instrument all of the data that's coming from that and come up, for example, with something useful, something predictive? >>This is exactly the kind of application that's mainstream: gaming. >>Yeah. >>Danny at Riot Games was telling me, we saw him at Oracle OpenWorld, he was up there for JavaOne. He said that they don't really have a big data platform, and their business is about understanding user behavior: tons of data about user playing time, who they're playing with. >>Yeah. >>How they want to get into currency trading, you know. >>I can't mention the names, but some of the biggest gaming companies out there are using Hadoop right now. 
And depending on CDH for doing exactly that kind of thing. >>Creating a good user experience. >>Today, they're doing it for the purpose of enhancing the user experience and improving retention. So they do track everything. Every single bullet you fire, everything; in baseball, every hit you get, every home run you hit. >>In a Modern Warfare 3 type of game, every consecutive headshot you get. >>Everything is being tracked, yeah. Every headshot you get and so on. But as you said, they are using that information today to sell more products and retain their users. Now what I'm suggesting is, how can you harness that energy for good as well? I mean, making money is good and everything, but how can you harness that for doing something useful, so that all of this entertainment time is actually productive time as well? I think that'd be a holy grail in this environment if we can achieve that. >>Yeah. It used to be that porn was what telegraphed the future of applications, but gaming really is now. If you look at gaming, you get the headset on, it's a collaborative environment. >>Oh yeah. >>You've got unified communications. >>Yeah. And you see our teenage kids, how many hours they spend on these things. >>You've got play environments, very social, collaborative. What I'm saying is that that's the future work environment, with Skype evolving. Our multiplayer game's called our job, right? >>Yeah. >>So I'm big on gaming. So all the gamers out there, Amr has challenged you. Got a big data example. What else are we seeing? Let's talk about the software. One of the things you were talking about that I really liked, you were going down the list. On Mike's slide he had all the new features around the core. Can you just go down the core and rattle off your version of what it means and what it is? 
So you start off with, say, HBase; we talked about that already. What are the other ones that are out there? >>So the projects that we have right there? >>The projects that are around, those tools that are being built. >>Yeah, so the foundational ones, as we mentioned before, are HDFS for storage and MapReduce for processing. And then the immediate layer above that is how to make MapReduce easier for the masses. Not everybody knows how to write MapReduce in Java, but everybody knows SQL, right? So one of the most successful projects right now, the one with the highest attach rate, meaning people usually install it when they install Hadoop, is Hive. Jeff Hammerbacher, my co-founder, when he was at Facebook, his team built the Hive system. Essentially Hive takes SQL, so you don't have to learn a new language, you already know SQL, and converts that into MapReduce for you. That not only expands the developer base for how many people can use Hadoop, but also makes it easier to integrate Hadoop through ODBC and JDBC with BI tools like MicroStrategy and Tableau and Informatica, et cetera. >>You mentioned R too. You mentioned the R program. >>R as well. Yeah, R is one of our best partnerships. We're very, very happy with them. So that's one of the very key projects, Hive. A sister project to Hive is called Pig. Pig Latin is a language that Yahoo invented; you have to learn the language, but it's very easy to learn compared to MapReduce. Once you learn it, you can specify very deep data pipelines, right? SQL is good for queries. It's not good for data pipelines because it becomes very convoluted; it becomes very hard for the human brain to understand. So Pig is much more natural to the human. It's more like Perl, very similar to scripting languages. 
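To make the Hive point above concrete: a query such as `SELECT page, COUNT(*) FROM visits GROUP BY page` can be expressed as a map phase, a shuffle, and a reduce phase. The following is a minimal in-memory sketch in Python, not Hive's actual query planner; the `visits` rows and every function name are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical input rows, standing in for a table called "visits".
visits = [
    {"user": "a", "page": "/home"},
    {"user": "b", "page": "/home"},
    {"user": "a", "page": "/cart"},
]

def map_phase(rows):
    """Emit one (key, 1) pair per row; the key is the GROUP BY column."""
    for row in rows:
        yield row["page"], 1

def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the values for each key, which corresponds to COUNT(*)."""
    return {key: sum(values) for key, values in grouped.items()}

def count_by_page(rows):
    """Chain the three phases for the whole query."""
    return reduce_phase(shuffle(map_phase(rows)))

print(count_by_page(visits))
```

On a real cluster the map and reduce functions would run on many machines and the shuffle would move data across the network; the sketch only shows the shape of the translation.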
So with Pig you can write very, very long data pipelines. Again, a very successful project, doing very, very well. Another key project is HBase, like you said. HBase allows you to do low latencies, so you can do very quick lookups, and it also allows you to do transactions, so you can do updates, inserts, and deletes. One of the talks here at Hadoop World that we recommend people watch when the videos come out is the talk by Jonathan Gray from Facebook. He talked about how they use HBase. >>Jonathan's coming on here on theCUBE later. >>Yeah. So drill him on that. They use HBase now for many things within Facebook. They have a big team now committed to building and improving HBase with us and with the community at large, and they're using it for their online messaging system. The live mail system in Facebook is powered by HBase right now. Again, eBay's Cassini project, they gave a keynote earlier today at the conference as well, is using HBase too. So HBase is definitely one of the projects that's growing very quickly right now within the Hadoop system. Another key project that Jeff alluded to earlier when he was on here is Flume. Flume is very instrumental because you have this nice system, Hadoop, but Hadoop is useless unless you have data inside it. So how do you get the data inside Hadoop? >>Flume essentially is this very nice framework for having agents all over your infrastructure, inside your web servers, your application servers, your mobile devices, your network equipment, that collect all of that data and then reliably materialize it inside Hadoop. So Flume does that. Another good project is Oozie. There are so many of them, I don't know how long you want me to keep going here. But Oozie is great. Oozie is a workflow processing system. It allows you to define a series of jobs, some of them in Pig, some of them in Hive, some of them in MapReduce. 
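A workflow of this kind can be pictured as a dependency graph in which a job runs only after the jobs it depends on have finished. The sketch below is plain Python using the standard library's `graphlib` (Python 3.9+), not Oozie's own workflow definition format; the job names, the `run_job` stub, and the bounded retry count are all assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each job maps to the set of jobs it must wait for.
workflow = {
    "clean_logs_pig": set(),
    "load_users_hive": set(),
    "join_mapreduce": {"clean_logs_pig", "load_users_hive"},
    "report_hive": {"join_mapreduce"},
}

def run_job(name, attempt):
    """Stand-in for submitting a job to the cluster; returns True on success."""
    return True

def run_workflow(graph, max_attempts=3):
    """Run jobs in dependency order, retrying each a bounded number of times."""
    completed = []
    for job in TopologicalSorter(graph).static_order():
        for attempt in range(1, max_attempts + 1):
            if run_job(job, attempt):
                completed.append(job)
                break
        else:
            # The for-else branch runs only if no attempt succeeded.
            raise RuntimeError(f"{job} failed after {max_attempts} attempts")
    return completed

order = run_workflow(workflow)
print(order)
```

The sketch runs jobs one at a time for simplicity; a real scheduler would typically run independent jobs, such as the two with no prerequisites here, in parallel.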
You can define a series of them and then link them to each other and say, only start this job when these other two jobs finish, because I'm waiting for the input from them before I can kick off, and so on. >>So Oozie is a very nice framework that will do that. It will manage the whole graph of jobs for you and retry things when they fail, et cetera. Another good project is Whirr, W-H-I-R-R. Whirr allows you to very easily start a Hadoop cluster on top of Amazon EC2, on top of Rackspace, a virtualized environment. It's for kicking off Hadoop instances or HBase instances on any virtual infrastructure. >>Okay. VMware, vCloud? >>It supports all of the major virtualized infrastructure systems out there, Eucalyptus as well, and so on. So that's Whirr. Avro is another key project. It's Doug Cutting's main project right now. Doug Cutting came on stage with you guys. So Avro is a project about how we encode, within our files, the schema of these files, right? >>Because when you open up a text file and you don't know what the columns mean and how to parse it, it becomes very hard to work with it. Avro allows you to do that much more easily. It's also useful for doing RPC, remote procedure calls, for having different services talk to each other. Avro is very useful for that as well. And the list keeps going on and on. >>Mahout. >>Yeah, thanks for reminding me of Mahout. We just added Mahout very recently, actually. >>What is that? I'm not familiar with it. >>So Mahout is a data mining library. Mahout takes some of the most popular data mining algorithms for doing clustering and regression and statistical modeling and implements them using the MapReduce model. >>They have machine learning in it too? >>Yes, yes. So that's the machine learning. >>So, yes. 
Support vector machines and so on. >>What about Sqoop? >>Sqoop. You know all of them. Thanks for feeding me all the names. >>The ones I don't understand. >>But there are so many of them, right? I can't even remember all of them. So Sqoop actually is a very interesting project. It's short for SQL to Hadoop, hence the name Sqoop, right? So SQ from SQL and OOP from Hadoop, and it also means scoop as in scooping up stuff, like when you scoop up ice cream. And the idea for Sqoop is to make it easy to move data between relational systems like Oracle, Teradata, Netezza, Vertica and so on, and Hadoop. So you can very simply say: Sqoop, the name of the table inside the relational system, the name of the file inside Hadoop, and the table will be copied over to the file. And vice versa, you can say: Sqoop, the name of the file in Hadoop, the name of the table over there, and it'll move the data over there. So it's a connectivity tool between the relational world and the Hadoop world. >>Great, great tutorial. >>And all of these are Apache projects. They're all projects built... >>It's not part of your unique proprietary... >>Yes. >>But these are things that you've been contributing to. >>We're contributing to the whole ecosystem. Yes. >>And you understand them very well. >>Yes. >>And they contribute to your knowledge of the marketplace. >>Absolutely. We collaborate with the community on creating these projects. We employ committers and founders for many of these projects. Like Doug Cutting, the founder of Hadoop; he works at Cloudera. The founder of the Oozie project works at Cloudera; for ZooKeeper, also at Cloudera. So we have a number of them on staff. >>So we had Arun from Hortonworks on. >>Yes. >>And it was really good, because I tell you, I walked away from that conversation and I've gotta say for the folks out there, there really isn't a war going on in Apache. >>In Apache, there isn't. I mean, there isn't, to be honest. 
In the developer community, we are friends, we're working together. We want to achieve the... >>There's no war. It's all Kumbaya. Everyone understands the rising tide floats all boats, and all are playing nice in the same sandbox. >>Yes. >>It's just a competitive landscape with Hortonworks. >>In the business. >>Business, competitive business, PR. >>And PR. We're trying to be friendly, as friendly as we can. >>Yeah, no, I mean, they're hyping it up. But he was cool. Like, 'Hey, we know each other.' >>Yes. >>'We all know each other and we're just gonna offer it free and charge for support. And so are they. And that's okay.' And they've got other things going on. But he brought up a point: he said they're launching a management console. So I said, Cloudera's got a significant lead. He kind of didn't really answer the question. So the question is, that's your core bread and butter, that's your... >>Yes and no. I mean, if you look at Cloudera Enterprise, and I mentioned this earlier when we talked in the morning, it has two main things in it. Cloudera Enterprise has the management suite, but it also has the support and maintenance that we provide to our customers and all the experience that we have in our team. >>Part of that subscription. >>Yes, part of the subscription. And I wanna stress the point that the fact that I built a sports car doesn't mean that I'm good at running that sports car. The driver of the car usually is much better at driving the car than the guy who built the car, right? So yes, we have many people on staff that are helping build Hadoop, but we have many more people on staff that help run Hadoop at large scale, in the financial industry, retail industry, telecom industry, media industry, health industry, et cetera. So that's very important for our customers: all that experience that we bring in on how to run the system technically. >>Yeah. 
Within these verticals. >>But their strategy's clear: 'We're gonna create an open source project within Apache for a management console.' >>Yes. >>'And we sell support too.' >>Yes. >>So there'll be a free alternative for management. >>So we have to see. But I mean, we look at the product, I mean, our products... >>It's gotta come down to product differentiation. >>Our product has been in the market for two years; they just started building their product. >>It's alpha. It's just alpha. >>The product is in alpha right now. Yeah. >>Okay. Well, the Apache project, it is Apache, right? >>Yeah. The Apache project is out. So we'll see how it compares to ours. But I think ours is way, way ahead of anything else out there. >>Yeah. >>I encourage people to try that for themselves and see. >>Essentially, John, when I asked Arun why the world needs Hortonworks, eventually the answer we got was, well, it's free. It needs to be more open. Hadoop needs to be more open. >>No, there's... >>That's not really the reason why Hortonworks... >>No, they want to go make money. >>Exactly. >>He wasn't gonna say that to you. >>I kept pushing and pushing, and that's ultimately the closest we could get. >>'Cause you just listed 12 open source projects. >>Yes. >>I mean, yeah, you can't get much more open. >>Yeah. Look at the management console... >>But Cloudera's not shipping all those. I mean, not only we are... >>No, no, we absolutely are. >>No, you are contributing. But that's not all your projects. There are other people involved. >>Yeah, we didn't start all of these projects. Yeah, that's true. >>You're contributing heavily to all of them. >>Yes, we are. >>And that's clear. Todd Lipcon said that he contributed his first patch to HBase in 2008. >>Yes. >>So I mean, you go back through the ranks of your people... >>And Todd now is a committer on HBase, is a committer on Hadoop itself. 
So on a number of them. >>You're clearly in the lead, and you know... >>But there is a concern. We've heard it and I wanna just ask you. So there's a concern that if I build processes around a proprietary management console... >>Yes. >>I'm gonna end up being locked into that proprietary management console, CA all over again. Now, this is so far from CA. >>Yes. Right. >>But that's a concern that some people have expressed. And I think it's one of the reasons why Hortonworks is getting so much attention. So... >>Yes. >>Talk about that. >>It's a very good observation to make. Actually, there are two separate things here. There's the platform, where all the data sits, and then there's this management console beside the platform. Now why did we make the management console? Why did Cloudera make the management console? Because it makes our job of supporting the customers much more achievable. When a customer calls in and says, 'We have a problem, help us fix this problem,' when they go to our management console, there is a button they click that gives us a dump of the state of the cluster. And that's what allows us to very quickly debug what's going on and within minutes tell them, you need to do this and you need to do that. Without that, we just can't offer the support services. >>There's real value there. >>Yes. But you have to keep in mind that the underlying platform is completely open source and free. CDH is completely a hundred percent open source, a hundred percent free, a hundred percent Apache. So a year from now, when it comes time to renew with us, if the customer is not happy with our management suite, is not happy with our support, they can go to Hortonworks. >>That's what people are afraid of. >>They can go to IBM. >>The data, you can take the data? >>You don't even need to take the data. You're not gonna move the data. It's the same system, the same software. 
Everything in CDH is Apache, right? We're not putting anything in CDH which is not Apache. So a year from now, if you're not happy with our service to you and the value that we're providing, you can switch. There is no lock-in. >>And your argument would be the switching costs... >>The only lock-in is happiness. >>Happiness, satisfaction, customer delight. By the way, we just wrote a piece about those wars and we said the risk of lock-in is low. We made that statement. We've got some heat for it. >>Yes. >>And this is sort of at scale, though. What the people throwing the tomatoes are saying is, again, in theory, at scale, the customers are so comfortable with the console that they don't switch. Now my argument was... >>Yes, but that means they're happy with it. That means they're satisfied and happy with it. And it's more economical for them than going and hiring people full-time on staff. >>Yeah. So you're always in check, as long as the customer doesn't feel like it's Oracle. >>Yeah. See, that's different. Oracle is very... >>Oracle is different, right? Here it's like Cisco routers: they get nested into the environment and provide value. That's just good competitive product strategy. >>Yes. >>If they're happy. >>Yeah. >>It's called open washing with Oracle. >>I mean, our number one core attribute in the company, the number one value for us, is customer satisfaction. Keeping our customers happy with the service that we provide. >>So differentiate on the product. >>Yes. >>Keep the commanding lead. That's the strategy. That's your goal. >>Yes. That's what's happening. Absolutely. >>Okay. Co-founder of Cloudera, always a pleasure to have you on theCUBE. We really appreciate all the hospitality over the last year and a half. 
And I wanna personally thank you for letting us sit in your office, and we'll miss you. >>And we'll miss you too. >>We'll see you at theCUBE events, or swing by. Thanks for coming on theCUBE, great to see you, and congratulations on all your success. >>Thank you. >>And thanks for the review on Modern Warfare 3. >>Yeah, yeah. Call me again if there's any gaming stuff, you know, I.
SUMMARY :
Yeah, I'm Aala, They're the co-founder back to back. Yeah. So I kind of pick that up where we left off with you around, you know, he was really excited. So a couple more years. takes long for production to take place. But the consumerization trend is really changing that. So right now, the fact that you can buy a single server and it It's very easy for people to actually start a, a big data Those are the hard parts. I mean, It's like, we know when you have a headache and you're On money and SAP is talking the same thing and said they're going to the lines of business. the former one meaning, meaning that yes, line of business and departments, they adopt the technology and What are you seeing out there? So they pop it into their, in their own installation or on the, on the cloud and they show that this actually is working and Yes. I mean, you know, you think client server, there was a lot of resistance from for the right problem at the right time. Do. So Amar, I need to just change gears here a minute. of the current version, if any, if you played it. I don't have my Xbox with me here. And I challenge, I challenge people out there to come challenge our team. So all the young gamers out there am are saying they're gonna challenge you. Can you set up an We'll figure it out. We can carry it live actually we can stream that. Modern Warfare I love the Sirius since the first one that came out. You the box. but at the same time, whenever I play games, I feel a little bit guilty because it's kind of like wasted time. Danny at Riot G is telling me, we saw him at Oracle Open World. Buy, I can't, I can't mention the names, but some of the biggest giving companies out there are using Hadoop So they do Now what I'm suggesting is that how can you harness that energy for the good as well? but gaming really is, if you look at gaming, you know, you get the headset on. 
So around the core, can you just go down the core and rattle off your version of what, The projects that are around those tools that are being built. Yeah, so the foundational, the foundational one as we mentioned before, is sdfs for storage map use You mentioned R too. So one of the talks here that had World we Jonathan, something on here in the Cube later. So Edge Base is definitely one of the projects that's growing very, very quickly right now So Uzi allows you to define a series of So that it supports all of the major vCloud, So ARU allows you to do that much more easily. So MAHA takes some of the most popular data mining So that's the machine learning. So, So yes. So Scoop, you know, all of them. And the idea for Scoop is to make it easy to move data between relational systems like Oracle metadata And all of these are Apache projects. To, We're contributing to the whole ecosystem. And you understand very well. So we have a number of them on And and it was really good because I tell you, Like, and in the developer community, It's all Kumbaya. So the question is, the experience that we have in our team part That subscription. So there'll be a free alternative to management. Our product has been in the market for two years, so they just started building their products. Alpha, It's just Alpha. Product is Alpha in Alpha right now. So we'll see how it does it compare to ours. You know, eventually the answer We wasn't Not gonna Yes. Yeah. I mean, I mean not only we are No, But that's not all your projects. Yeah, we didn't start, we didn't start all of these projects. So I mean, you go back through the ranks So on a number But we, we've heard it and I wanna just ask you No, no. So there's a concern that So Yes. It's, it's a very good, it's a very good observation to make. And within minutes tell them you need to do this and you to do that. So a year from now, when it comes time to renew with us, if the customer is And works. It's the same system, the same software. 
The only lock-in is which Which, by the way, we just wrote a piece about those wars and we said the risk of lock-in is low. the console that they don't switch. Yes, but that means they're happy with it. And it's more economical for them than going and hiring people full-time on stuff. Oracle is very, Oracle Yeah. I mean our number one core attribute on the company, the number one value for us is customer satisfaction. So differentiate in the product. And wanna personally thank you for letting us sit in your office and we'll miss you And we'll miss you too. you and congratulations on all your success. Yeah, yeah. If there's any gaming stuff, you know, I.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Jeff | PERSON | 0.99+ |
Jeff Hiba | PERSON | 0.99+ |
Todd Lipkin | PERSON | 0.99+ |
2008 | DATE | 0.99+ |
Cisco | ORGANIZATION | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
John | PERSON | 0.99+ |
Mike | PERSON | 0.99+ |
Modern Warfare three | TITLE | 0.99+ |
Apache | ORGANIZATION | 0.99+ |
Danny | PERSON | 0.99+ |
Jonathan Gray | PERSON | 0.99+ |
Jeff Haer Baer | PERSON | 0.99+ |
15 | QUANTITY | 0.99+ |
two years | QUANTITY | 0.99+ |
Calera | ORGANIZATION | 0.99+ |
Modern Warfare | TITLE | 0.99+ |
16 cores | QUANTITY | 0.99+ |
Jeff Harm Becker | PERSON | 0.99+ |
Todd | PERSON | 0.99+ |
eight cores | QUANTITY | 0.99+ |
Jonathan | PERSON | 0.99+ |
both | QUANTITY | 0.99+ |
ORGANIZATION | 0.99+ | |
Amazon | ORGANIZATION | 0.99+ |
Java | TITLE | 0.99+ |
next year | DATE | 0.99+ |
Skype | ORGANIZATION | 0.99+ |
two jobs | QUANTITY | 0.99+ |
Vegas | LOCATION | 0.99+ |
Michaels | PERSON | 0.99+ |
Cloudera | ORGANIZATION | 0.99+ |
one | QUANTITY | 0.99+ |
Hadoop | TITLE | 0.99+ |
hundred percent | QUANTITY | 0.99+ |
35,000 | QUANTITY | 0.99+ |
Horton Works | ORGANIZATION | 0.99+ |
Today | DATE | 0.99+ |
Peggy | PERSON | 0.99+ |
eBay | ORGANIZATION | 0.99+ |
Horton | LOCATION | 0.99+ |
12 hards | QUANTITY | 0.99+ |
Each | QUANTITY | 0.99+ |
vCloud | TITLE | 0.99+ |
HPAC | ORGANIZATION | 0.99+ |
Aala | PERSON | 0.99+ |
Adam | PERSON | 0.99+ |
Tyler | PERSON | 0.98+ |
UIE | ORGANIZATION | 0.98+ |
Hadoop World | TITLE | 0.98+ |
first one | QUANTITY | 0.98+ |
12 open source projects | QUANTITY | 0.98+ |
Edge Base | TITLE | 0.98+ |
W H I R R | TITLE | 0.98+ |
five | QUANTITY | 0.98+ |
Hammerer | PERSON | 0.98+ |
Xbox | COMMERCIAL_ITEM | 0.98+ |
Port Works | ORGANIZATION | 0.98+ |
Hive | TITLE | 0.98+ |
Amar | PERSON | 0.98+ |
five different departments | QUANTITY | 0.98+ |
today | DATE | 0.98+ |
Christmas | EVENT | 0.98+ |
SQL | TITLE | 0.97+ |
Silicon angle dot TV | ORGANIZATION | 0.97+ |
Tableau | TITLE | 0.97+ |
two | QUANTITY | 0.97+ |
W H I R R | TITLE | 0.97+ |