Jagane Sundar, WANdisco | CUBEConversation, January 2019

>> Hello everyone. Welcome to this CUBE Conversation here in Palo Alto, California. I'm John Furrier, host of theCUBE. I'm here with Jagane Sundar, CTO, chief technology officer of WANdisco. Jagane, great to see you again. Thanks for coming on. >> Thank you for having me, John. >> So the conversation I want to have is about the technology behind WANdisco, and we've had many conversations. For the folks watching, go to our YouTube channel and search to see the evolution of those conversations over, I think, eight or nine years now that we've been chatting. What a level up. You guys are now in cloud, with big announcements around multicloud, live data in particular. So the technology is the gift that keeps giving for WANdisco. You guys are continuing to take territory, now in a big way with cloud: big growth, a lot of changes, a lot of hires. What's going on? >> So, as you well know, WANdisco stands for wide area network distributed computing, and the value of the wide area network aspect is really shining through now, because nobody goes to the cloud saying, I'm going to put it in one data center. It's always multiple regions, multiple data centers in each region. Suddenly, the problem of keeping your data consistent across multiple cloud vendors, or on-prem to cloud, becomes a real challenge. We stepped in. We had something that was a good solution for small users, small data. But we developed it into something that's fantastic for large data volumes, and people are running into the problem. The biggest problem that IT providers have is that data scientists do not respect data that's not consistent. If you look at a replica of data and you're not sure whether it's exactly accurate or not, the data scientist who spent all his time building algorithms to predict some model is going to look at it and go, that data's not quite right. I'm not going to look at it.
So if you use an inconsistent tool or an inadequate tool to replicate your data, you have the problem that nobody is going to respect the replicas. Everybody's going to go back to the source of truth. We solved that problem elegantly and accurately. >> State the problem specifically. Is it the integrity of the data? What is the specific problem statement that you guys solve with technology? >> Let me give you an example. You have notifications that come out of cloud object stores when an object is placed into the store or deleted from the store. That's best-effort delivery. If there are logjams in the mechanism used to deliver them, some notifications may be dropped. The problem with using that notification mechanism to replicate your data is that over a period of time, say you have two or three petabytes of data and you're replicating it over a month or a month and a half, you'll find that maybe 0.1 percent of your data is not quite accurate anymore. So the value of the replica is essentially zero. >> Like a leaky pipe, basically. >> Indeed, if you have a leaking pipe, then it's just totally... >> We need to have integrity end to end. All right, let's get back to some of the things I want to ask, because I think it's fascinating. I've been following your story for years. You had a point solution for wide area. You had the replication, active-active, great for data centers. So disaster recovery: not mission critical, but certainly critical, correct, depending on the mission. And then in comes cloud. You mentioned wide area networks. If you go back to the old days when I was breaking into the business, that's when they had, you know, dial-up modems and pagers. Not even cell phones; those were just starting. The wide area network was a really complicated beast, and all the best resources worked on it: expensive bandwidth, you had remote offices, and you had campus networking then. So wide area networking went through that phase one, correct?
Now we're living in the WAN all the time. Cloud is WAN, wide area. >> Correct, cloud is WAN. But there are subtle aspects that people miss all the time. If you go to store an object in Amazon S3, for example, you pick a region. If it's a completely wide area distributed entity, why do you need to pick a region? The truth is, each cloud vendor hides a number of region-specific or local area network-specific aspects of their service. DynamoDB runs in one region, in two or three availability zones within that region. If you want to replicate that data, you don't really have much help from the cloud vendor themselves. So you need to parse the truth from what is offered. What you will find is that the WAN is still a very challenging problem for a lot of these data replication problems. >> Talk about the wide area network challenges in the modern era we're living in, which is cloud computing. You mentioned some of the nuances around regions and availability zones. Basically, the cloud grew up as building blocks, and the plumbing underneath is essentially a hybrid of certain techniques in networking: local area networks, VLANs, tunneling, all this stuff, NATs, routers. So it's obviously plumbing. What's different now that's important to take that to the next level? Because, you know, there are arguments saying, hey, with GDPR, I might want to have certain regions be smarter, right? So you're starting to see a level up where Amazon and others are going. Google, in particular, talks about this a lot, as does Microsoft. What's that next level of WAN, where the plumbing is upgraded from basically the other things? >> So the problem really has to be stated in terms of your data architecture. If you look at your data and figure out that you need this set of data to be available for your business-critical applications, then the problem turns into:
I need replicas of this data in this region and in other regions, perhaps in two different cloud vendor locations, because you don't want to be tied down to the availability of one cloud vendor. Then the problem turns into: how do you hide the complexity of replicating and keeping this data consistent from the users of the data, the data scientists, the application authors and so on? Now, that's where we step in. We have a transparent replication solution that fits into the plumbing. It's often offered by the IT folks as part of their cloud offering or as part of the hybrid offering. The application developers don't really need to worry about those things. A specific example would be Hive tables that users are building in one data center. An IT professional from that organization can buy our replication software. That table will be available in multiple data centers and multiple regions, available for both read and write. The user did not do anything and does not need to be aware. So if you have problems such as GDPR requiring the data to be here, while this summarized data can be available across all of these regions, then we can solve the problem elegantly for you without any application rewiring or reauthoring. >> Talk about the technology that makes all this happen. Again, this has been a key part of your success at WANdisco. I always love the name, wide area. I was a big wide area network fan; I did that in my early days configuring router tables. You know how it has been. You know, hardcore back then. Distributed systems is certainly large scale; now it's part of the cloud. So all the large-scale guys like me, when we grew up in our computer science days, had to think about systems architecture at scale. We're actually living it now, correct? So talk about the technology. What specifically do you guys have that's your technology, and talk about the impact on the scale piece. I think that's a real key technology piece. >> Indeed.
So the core of our algorithm is enhancements to, and a superior implementation of, an algorithm called Paxos. Now Paxos itself is the only mathematically proven algorithm for keeping replicas in multiple machines, multiple regions, or multiple data centers. The other alternatives, such as Raft and the ZooKeeper protocol, are all compromises for the sake of ease of implementation. Now we don't fear the cost of implementation. We spent many years doing the research on it, so we have a fantastic implementation of Paxos that's extended for use over wide area networks without any special hardware. I mention the "without any special hardware" piece because Google Spanner, which is one of our primary competitors, has an implementation that needs its own specific network and hardware. So the value of... >> Because they've tied the clock, the atomic clock, actually, to the infrastructure for their timings, so that's all synchronized. So it's only within Google Cloud? >> Exactly. It cannot even be made available to Google's customers of Google Cloud. That was a feature that they added recently, but it's rolling out in a very limited... >> They inherited that from their large scale, correct? Google, yes, which is Bigtable, Spanner. These are awesome products. >> These are awesome products, but they're very specific. >> Tailored for Google. >> Yes, they're great in the Google environment. They're not so great outside of Google. Now we have technology that makes you able to run this across Google's cloud and Microsoft's cloud and Amazon's cloud. The value of this is that you have truly cloud-neutral solutions. You don't need to worry about vendor lock-in, you don't need to worry about availability problems in one of the cloud vendors, and then you can scale your solution. You can go in with an approach such that when the virtual machines or the compute resources in one cloud vendor are really inexpensive, we'll use that; when it's very expensive, we'll move our workloads to other locations.
You can think up architectures like that, with our solution underpinning your replication. >> Right. Again, I'm gonna ask you the technical questions; I love these conversations, getting down and dirty under the hood. So Joel Horwitz was on, your new CMO, former Microsoft, CUBE alumni. David Richards, the CEO, talked about the same thing. Moving data around is the key value proposition. That's tied right into the legacy of your IP, and that value is moving data with integrity from point A to point B. But the world's also moving to identify scenarios where I'm going to move compute rather than move the data, because people have recognized that moving data is hard: you've got latency, and there's cost in bandwidth. So two schools of thought, not mutually exclusive. When do you pick one? >> Okay, absolutely. They're not mutually exclusive, because there are data availability needs that define some replication scenarios, and there are compute needs that can be more flexible. If you had the ability to, say, have data in Amazon's cloud and in Microsoft's cloud, you may want to use some Amazon-specific tools for specific compute scenarios and at the same time use Microsoft tools for other scenarios, or perhaps use open source tools like Hadoop in either one of those clouds. Those are all mechanisms that work perfectly well, but at the core you have to figure out your data architecture. If you can live with your data in one region or in one data center, clearly that's what you should do. But if you cannot have that data be unavailable, you do have to replicate it. At that point, you should consider replicating to a different cloud vendor, because availability is a concern with all these vendors. >> So two things I hear you say. One, availability is a driver. The other one is user preference. >> Yes. >> Why not have people who know Microsoft tools and Microsoft software work on the Microsoft framework, or someone using something else in another cloud? The same data can live in both places. You guys make that happen?
Is that what you're saying? >> Exactly. >> That's a big deal. >> Absolutely. And we guarantee the consistency, a guarantee that you will not get from any other vendor. >> So this basically debunks the whole lock-in, yes? You guys are a solution to essentially relieve this notion of lock-in. So me as a customer can say, hey, I'm on Amazon right now. We're all in on Amazon. But, you know, I've got some temptation to go to Azure or Google. Why wouldn't I, if I have the ability to make my data consistent? Is that what you're saying? >> That is exactly what I'm saying. You have this ability to experiment with different cloud vendors. You also have the ability to mitigate some of the cost aspects. If you're going to pay for copies in two different geographic locations, you might as well do it on two different cloud vendors, so you have a richer set of applications and better availability. >> So for people who say data is the lock-in spec for cloud, it's kind of right, unless they use WANdisco, because in a sense... I mean, your data's there, you stay there. Yeah, that's kind of common sense. It's not so much technical lock-in, so there's no real technical lock-in. It's more operational lock-in with data, correct? But if you're afraid of lock-in, you go with WANdisco. That's live data multicloud, is that... >> That is live data multicloud, and it's this new ability to actually have active data sets that are available in different cloud vendor locations. >> Well, that's a killer app right there. How do you feel? You must feel pretty good. You know, you and I have talked many times. But this is like, you've been waiting for this moment. This is actually real wide area, a.k.a. cloud. It was a big data problem, which is only getting bigger. >> Exactly. >> Replication is now the transport between clouds for anti-lock-in. This is the Holy Grail for... >> It is the Holy Grail for the industry.
We've been talking about it for years now, and we feel completely redeemed. Now we feel that the industry has gotten to the point where they understand what we've talked about. I feel very excited by the customer traction we're seeing, and watching our customers light up when we describe the attributes we bring, it's... >> Exciting, and just the risk management alone is a hedge. I mean, if I'm someone in... the cyber security challenges alone on data, you've got data sovereignty, compliance. Never mind the productivity piece of it, which is pretty amazing. So you guys are changing the data equation. >> Indeed. Our, you know, most excited customers are CEOs, because of mitigating risk from things like cyber security. As you point out, you may have a breach in one cloud vendor. You can turn that off and use your replica in the other cloud vendor's site instantly. Those are comforts you do not get with other solutions. >> So we're all having a love fest here. I love the whole multicloud data, no lock-in. I think that's a killer feature. I think you'll sell that, baby. I'm going to say, OK, that's all good, but I'm going to get you on this one: security. No one's solved security yet. So if you solve that, then you've pretty much got it all. So tell me the security story. >> So I'll start by saying, right, our biggest customer base is the financial industry: banking companies, insurance companies, health care. There is no industry in the world that's more security conscious than banking. >> And the government, perhaps? >> Perhaps, I would... I mean, the banks are really security conscious. >> Their money's money. >> Money is money. And they have a fiduciary responsibility both to governments and to their customers. So we've catered to these customers for upwards of a decade now. Every technical decision we make has security as one of the focus items. >> Years. So you're good on security. >> We feel very secure when it comes to data, yes. >> Encryption.
Is that what this is? >> It's encrypted on the wire. We support all of the data-at-rest encryption schemes. We support all the Hadoop and cloud vendor security mechanisms. We have a cross-cloud product, so the security problems are multiplied, and we take care of each of those specifically. So you can be confident that your data is secure. >> And wire speed security, no overhead involved? >> No overhead involved at all. It's not measurable. >> So, well, congratulations on where you guys are. A lot more work to do. You guys are going to staff up. So you're hiring a lot of people. Talk about the talent you're hiring real quick, because, you know, attracting large-scale talent is also one indicator of a successful opportunity. I think the positioning is phenomenal. Congratulations. Tell us about the hiring. >> As you know, as David mentioned a few minutes ago, we hired Joel from IBM for our marketing department. He's CMO. Wonderful hire. We've got Ronchi, who's from the University of Denver. He left as the head of that computer science department to come work for us. Another amazing guy. Terrific background. We've got [inaudible], who's another alum, UT Austin PhD. He's running engineering for us. We're so pleased to be able to hire talent at this level. As you well know, it's the people who make these jobs interesting and the products interesting. >> So what are some of the things that those guys say when they really get exposed to it? I mean, what would it take for someone to quit their tenured professor job at a university, which is pretty much retirement, to engage in a growing opportunity? What do they say? >> So the single theme that you'll find in all of this is very complex, unique technology that has been refined, and it's on the verge of exploding to probably something ten to one hundred times the size it is today.
People see that when we show them the value of what we've got and the market that we're taking this to. They just get excited. >> Well, congratulations. You guys have certainly worked hard. It's been great to watch the entrepreneurial journey of getting into that growth stream, with the wind at your back after all that hard work on the technology. Phenomenal. Again, multicloud data, not worrying about where your data is, is going to give people some ease and rest at night as well, because that's the number one worry besides security. >> Absolutely. >> Jagane Sundar, CTO, chief technology officer of WANdisco, here inside theCUBE in Palo Alto. I'm John Furrier. Thanks for watching.
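
The "leaky pipe" point from the conversation, that best-effort notifications can silently drop a small fraction of changes, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the 0.1% drop rate echoes the figure Sundar mentions, while the function names, the in-memory dictionaries standing in for object stores, and the fixed random seed are hypothetical and not part of any WANdisco or cloud vendor API.

```python
import random


def replicate_via_notifications(source, drop_rate, seed=0):
    """Copy objects to a replica using best-effort notifications.

    Each write emits one notification; a dropped notification means the
    replica never learns about that object. `drop_rate` is a hypothetical
    parameter chosen for illustration.
    """
    rng = random.Random(seed)
    replica = {}
    for key, value in source.items():
        if rng.random() >= drop_rate:  # notification was delivered
            replica[key] = value
    return replica


def missing_keys(source, replica):
    """Return the keys whose replica copy is absent or differs."""
    return [k for k, v in source.items() if replica.get(k) != v]


# Stand-in for an object store: 100,000 objects, 0.1% notification loss.
source = {f"obj-{i}": i for i in range(100_000)}
replica = replicate_via_notifications(source, drop_rate=0.001)
drift = missing_keys(source, replica)
print(f"{len(drift)} of {len(source)} objects diverged "
      f"({100 * len(drift) / len(source):.3f}%)")
```

Even a tiny per-notification loss leaves a replica that cannot be trusted without a separate full comparison against the source, which is the problem described in the interview.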

Published Date : Jan 23 2019


Joel Horwitz, WANdisco | CUBEConversation, January 2019

(soaring orchestral music) >> Everyone, welcome to this CUBE Conversation here at Palo Alto, California. I'm John Furrier, host of theCUBE. We are here with Joel Horwitz, who's the CMO of WANdisco. Joel, great to see you. Formerly of IBM, we've known you for many years, we've had great conversations when you were at IBM, rising star, now at WANdisco, congratulations. >> Thank you, yeah, it's really great to be at WANdisco, and great to be here with theCUBE. >> So we've had many conversations, again, goin' back, you were a rising star in data, you know the cloud real well. Why WANdisco, why leave IBM for WANdisco, what attracted you to the opportunity? >> Yeah, really three things. First and foremost, the people. I've known the WANdisco team now for years. Back in my Hadoop days, when I was at Datameer, I used to hang out with the WANdisco team at Data After Dark, in New York, which was great, and they had the best marketing there at the time. Two, the product, I mean I won't join a company unless the product is really legit, and they have an absolutely great technology, and they are applying it to some really tough problems. And third is just the potential, really, the potential of this company is not even close to being tapped. So there's a ton of runway there, and so, for me, I'm just totally grateful, and totally honored, to be a part of WANdisco. >> What's the tailwind for them, that wave that they're on, if you will, because you mentioned, there's a lot of runway or headroom, a lot of market growth. Certainly cloud, David Richards will talk about that. But what attracted you, 'cause you knew the cloud game too. >> Yeah, yeah. >> IBM made a big run at the cloud. >> Yeah well, I came in, at IBM, through the data door, so to speak, and then I walked through the cloud door, as well, while I was there. And the reality is that data continues to be the lifeblood of an enterprise, no matter what.
And so, what I saw in WANdisco was that they had technology that allowed people in large enterprises to, frankly, replicate or manage their data across Hadoop clusters, from cluster to cluster. And then, when I spoke with you last, with David here, we also recognized the opportunity: just as copying large-scale data from one Hadoop cluster to another is challenging, it's really not that different from copying data from, say, HDFS to an object store or S3. It's a pretty similar problem. And so that's why, just this past week, we announced live data for multicloud. >> Explain live data for multicloud, I've read it in the news, got some buzz, it's this great trend, live. We're doing a lot of live videos on theCUBE, live implies real time. Data's data. Multicloud is clearly becoming one of those enterprise categories. >> Yeah. >> First it was public cloud, then hybrid cloud. >> Yeah. >> Now it's multicloud. How does live data fit into multicloud? >> Yeah, so multicloud, and live data, as I just mentioned, we have live data for Hadoop, so that's fairly obvious, so if you're going multi-cluster you can do that. As well as from, even on-prem, data center to data center, so, multi-site if you will. But multicloud is a really interesting phrase that's kind of cropped up this year. We're seeing it used quite a lot. The focus in multicloud has been mainly on applications. And so, talking about, how do you have a container strategy? Or a virtualization strategy, for your applications? And so, I think of it really as a multicloud strategy, as opposed to a multicloud architecture. So we're helping our enterprise clients think about their multicloud strategy. So they're not locked in to any one vendor, so they're able to take advantage of all the great innovations that are happening, if you ask me, on the cloud first, and then ultimately comes down to, at times, on-prem.
>> What are the pitfalls between multicloud strategy and multicloud architecture? You just said customers don't want to get locked in, obviously, no-one wants to get locked in. Multi-vendor used to be a big buzzword during that last wave of computing, client-server. >> Yeah. >> Now multicloud seems like multi-vendor. What do you mean by architecture versus strategy, how do you parse that? >> Yeah, so like I said, in terms of your data, right, and it all comes back to your data. If you go all in on, say, one vendor, and you're architecting for that vendor only and you're choosing your migration, your data management tools, for a particular cloud vendor, and, said a different way, if you're only using the native tools from that vendor, then it's very difficult to ever move off of that cloud, or to take advantage of other clouds as they, for example, maybe have new IoT offerings, or have new blockchain offerings, or have new AI offerings, as many others come on the scene. And so, that's what I mean by strategy: if you choose one vendor for a certain toolset, then it's going to be very difficult to maintain arbitrage between the different vendors. >> Talk about how you guys are attacking the market. Obviously, it's clear that data has been a fundamental part of WANdisco's value proposition. Moving data around has been a top concern, even back in the Hadoop days, now it's in the cloud. >> Yeah. >> Moving data across the network, whether it's cloud to cloud, or cloud to data center, or to the edge of the network-- >> Yep. Yep. >> Is a challenge. >> You know at IBM, when I was there in 2016, and we were coming up with our strategy when I was in Corp Dev, we talked about four different areas of data. We talked about data gravity, so data has gravity. We talked about data movement, and we talked about data science. And we talked about data governance. And I still think those are relatively the four major themes around this topic of data.
And so, absolutely data has gravity, and not just in terms of the absolute size and weight, if you will. But it also has applications that depend on it, the business itself depends on it, and so, the types of strategies that we've seen to migrate data, say, to the cloud, or have a hybrid data management strategy, have been lift and shift, or to load it on to the back of, I always picture that image of the forklift lifting all those tape drives onto the airplane, you know, the IBM version of that. And that's like a century old at this point, so, we have a way to replicate data continuously, using our patented consensus technology, that's in the lifeblood of our company, which is distributed computing. And so having a way to migrate data to the cloud without disrupting your business is not just marketing speak, but it's really what we are able to do for our clients. >> How do you guys go to market, how do you guys serve customers, what's the strategy? >> So, primarily we've formed a number of strategic partnerships, obviously one with IBM that I helped spearhead while I was there. We actually just recently announced that we now support Big SQL, so it's actually the first opportunity where, if you are using a database provided by IBM, you can actually replicate across different databases and still query it with Big SQL. Which is a big deal, right, it means you can still have access to your data while it's in motion, right, that's pretty cool. And then so IBM is there, and then secondly, we've formed a number of other strategic partnerships with the other cloud vendors, of course. Alibaba, we have an OEM; Microsoft, we have a preferred selling motion with them; AWS, of course, we're in their marketplace.
So primarily, we sell through a number of our key partnerships, because we are fairly integrated, like I said, into the architecture of these platforms, and, just to comment more deeply on that, when you look at object storage on each of these various public cloud vendors, they may look similar on the surface, maybe they all use the same APIs or have some level of similar interaction, they look like they're the same, the pricing might be the same. We go like one level deeper, and they're all very different, they're all very different flavors of object storage. And so while it might seem like, "Oh, that's trivial to work with," it really isn't, it's extremely non-trivial, so, we help, not only our customers solve that, but we also help our partners significantly, help their clients move to the cloud, to their cloud, faster. >> So you basically work through people who sell your product, to the end user customer, or through their application or service. >> Yeah, that's our main route to market, I would say. The other, obviously, is that we have a direct sales force, who's out there, working with the best clients in the world. AMD is a great customer of ours, who we recently helped migrate to Microsoft Azure. And we have a number of other large enterprise customers, in retail, and finance, and media. And so really, when it comes down to it, yeah it's those two major motions, one through the cloud vendors themselves, 'cause frankly, in most cases, they don't have this technology to do it, you know, they're trying to basically take snapshots of data, and they're struggling to convince their customers to move to their cloud. >> It becomes a key feature in platforms. >> Yes it does. >> So that's obviously what attracts sellers. What other things would attract sellers or partners, for you, what motivates them? Obviously the IP, clearly, is the number one, economics, what's the other value proposition?
>> The end goal isn't to move data to the cloud, the end goal is to move business processes to the cloud, and then be able to take advantage of the other value adds that already exist in the cloud. And so if you're saying, what's the benefit there, well, once you do that move, then you can sell into, clients with all your additional value adds. So that's really powerful, if you are stuck with this stage of "Eh, how do we actually migrate data to the cloud?" >> So IBM Think is coming up, what's your view of what's happening there, what are you guys going to be doing there, as are you, on the IBM side-- >> Yeah. >> Now you're on the other side of the table. You've been on both sides of the table. >> Yeah. >> So what's goin' on at Think, and how does WANdisco, vector, and certainly CUBE will be there. >> Yeah, we'll be there, so WANdisco is a sponsor of IBM Think as well, clearly, as I mentioned, we'll be talking about Big Replicate, which is our Hadoop replication offering, that's sold with IBM. The other one, as I mentioned, is Big SQL, so that's a new offering that we just announced this past month. So we'll be talking about that, and showing a number of great examples of how that actually works, so if you're going to be at Think, come by our booth, and check that out. In addition to that, I mean, clearly, IBM is also talking about multicloud and hybrid cloud, so hybrid data management, hybrid cloud is a big topic. You can expect to see, at IBM Think, a lot of conversations on the application side. In terms of, obviously with their acquisition of Red Hat, you can well imagine they're going to be talking a lot about the software stack, there. But I would say that, we'll be talking, and spending most of our time talking about, how to manage your data across different environments. >> Where's the product roadmap heading, I know you guys don't like to go into specifics in public- >> Yeah. 
>> Sensitive information, but, generally speaking, where's the main trendlines that you guys are going to be building on, obviously, cloud data, they'll come in together, good core competency there for WANdisco, what's next, what's the next level for you guys? >> So what's really fascinating, and I actually didn't realize this when I joined WANdisco, just to be completely transparent. WANdisco has a core piece of technology called DConE, Distributed Coordination Engine. It essentially is a form of blockchain, really, it's a consensus technology, it's an algorithm. And that's been their secret sauce since the founding of the company. And so they originally applied that to code, through source code management, and then only in this last few years they've applied it to data. So you can guess, at other areas that we might apply it to, and already this past year, we actually filed two patents, in the area of blockchain, or really, distributed ledger technology, as we're starting to hear it called in the actual enterprise that's using it. But you can expand that to any other enterprise asset, really. That's big, right, that has value, and that you want to manage across different environments, so you can imagine, lots of other assets that we could apply this to, not only code, not only data, not only ledgers, but what are the other assets? And so that's essentially what we're working on. >> Is that protectable IP the patents, so those are filed on the blockchain? >> Yeah, yeah. >> For instance? >> So DConE is certainly patented, I'm sure Jagane'll talk more about this. >> Yeah, we'll get into it. >> There's probably a handful of people in the world, and they might all be working at WANdisco at this point. (chuckles) Who actually know how that works, and it's essentially Paxos, which is a really gnarly problem to solve, a really difficult math problem. 
And as David mentioned earlier, Google, the other smartest company in the world, published their paper on Spanner, and as you said, they used brute force, really, to solve the problem. Where we have a very elegant solution, using software, right? So it's a really great time to be at WANdisco, because I just see that there's so many applications of our technology, but, right now, we're mainly focused on what our customers are asking for. >> You've said a great quote, thanks Joel, final question for you, where do you see it going, WANdisco, what are your plans, do you have anything in mind, do you want to share anything notable, around what you're doing, and what you think WANdisco will be in a few years. >> We have an incredible team, as I mentioned, the people that are joining WANdisco, as David mentioned, I myself, not to say too much there, but, the new folks that have joined our Research and Development Team, but we've been making some great hires, to WANdisco. So I'm really excited about the team, I'm going, actually, to visit, we have a great team in Europe, in the UK, in the United Kingdom, so I'm going to go see them next week. But we have just the company culture is what drives me, I think that's just one of those hard things, really, to find. And so that's what I'm really excited about, so there's a lot of cool stuff happening there. You know, on that note, it's actually kind of funny, because on one of the articles that talked about live data for multicloud, asked the question, and her headline was "Are You Down to Boogie?" So, disco continues to be a great meme for us, with our name. (John chuckles) Unintentional, so, as a marketer, it's a pretty fun time to be at WANdisco. 
>> Seventies and eighties were great times, certainly I'm an eighties guy, Joel, thanks for comin' on, appreciate the update, Joel Horwitz, CMO, Chief Marketing Officer, WANdisco, really on a nice wave right now, cloud growth, data growth, all comin' together, real IP, lookin' forward to hearing more, what comes down the pipe for those guys, you'll see him at IBM Think. I'm John Furrier here, in the studios at Palo Alto, thanks for watching. (soaring orchestral music)

Published Date : Jan 23 2019

SUMMARY :

we've had great conversations when you were at IBM, and great to be here with theCUBE. and they are applying it to some really tough problems. that wave that they're on, if you will, a big run at the cloud. And the reality is that data continues to be I've read it in the news, got some buzz, Now it's multicloud. data center to data center, so, multi-site if you will. and multicloud architecture, you just said, and it all comes back to your data. even back in the Hadoop days, now it's in the cloud. and so, the types of strategies that we've seen it means you can still have access to your data So you basically work through and they're struggling to convince their customers in platforms. the end goal is to move business processes to the cloud, You've been on both sides of the table. and how does WANdisco, vector, a lot of conversations on the application side. and that you want to manage across different environments, So DConE is certainly patented, So it's a really great time to be at WANdisco, and what you think WANdisco will be in a few years. And so that's what I'm really excited about, in the studios at Palo Alto, thanks for watching.

ENTITIES

Entity	Category	Confidence
David	PERSON	0.99+
Joel	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Europe	LOCATION	0.99+
WANdisco	ORGANIZATION	0.99+
Alibaba	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Joel Horwitz	PERSON	0.99+
Joe	PERSON	0.99+
John Furrier	PERSON	0.99+
Joel Horowitz	PERSON	0.99+
UK	LOCATION	0.99+
2016	DATE	0.99+
David Richards	PERSON	0.99+
Google	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
New York	LOCATION	0.99+
United Kingdom	LOCATION	0.99+
January 2019	DATE	0.99+
Data After Dark	ORGANIZATION	0.99+
next week	DATE	0.99+
Palo Alto, California	LOCATION	0.99+
AMD	ORGANIZATION	0.99+
Palo Alto	LOCATION	0.99+
Are You Down to Boogie	TITLE	0.99+
John	PERSON	0.99+
first opportunity	QUANTITY	0.99+
Two	QUANTITY	0.99+
third	QUANTITY	0.99+
Think	ORGANIZATION	0.99+
two patents	QUANTITY	0.99+
theCUBE	ORGANIZATION	0.98+
one	QUANTITY	0.98+
Datamere	ORGANIZATION	0.98+
this year	DATE	0.98+
First	QUANTITY	0.98+
both sides	QUANTITY	0.97+
each	QUANTITY	0.97+
Multicloud	ORGANIZATION	0.96+
Jagane	PERSON	0.95+
IBM Think	ORGANIZATION	0.95+
four major themes	QUANTITY	0.94+
Big SQL	TITLE	0.92+
Red Hat	ORGANIZATION	0.92+
one vendor	QUANTITY	0.92+
two majors	QUANTITY	0.92+
Paxos	ORGANIZATION	0.92+
DConE	ORGANIZATION	0.91+
first	QUANTITY	0.91+
multicloud	TITLE	0.9+

David Richards WANdisco | CUBEConversation, January 2019

(upbeat instrumental music) >> Welcome to the special CUBE Conversation here, in Palo Alto, I'm John Furrier, host of theCUBE. I'm here with David Richards the CEO of WANdisco, CUBE alumni, been on many times. WANdisco continues to make the right bets. The bet they recently made has been on cloud many years. We've covered it certainly on theCUBE. But live data is the new hot thing. Multiple clouds is turning out to be the trend. That's your friend. David, good to see you. >> Great to be back. >> Thanks for coming on. So we talk all the time about how you guys have always evolved the business and continued to stay out front in all the major waves. Now again, another good call. You've certainly bet on Cloud. We've talked about that, Open Source, Big Data, Cloud, you saw that coming, positioned for that. But now you got some great momentum and resonance with customers around live data, which is not a stretch, given what you guys have done with replication, things in the past, the core intellectual property. Give us the update. You guys have been in the news lately. >> So, thanks and I think you enumerated the past history over the past two or three years, which we like to say that we're living in dog years. Everything's happening seven times faster than it would do normally. So of course, we started out life by making a prediction that storage arrays would change. People are beginning to store, companies beginning to store structured and unstructured data, mammoth sizes that we've never seen previously. We're going to have to resort to Open Source software, running on commoditized hardware that we'd already seen the social media companies move to. Then we've seen, we began to see a problem emerge, even in that marketplace, where spiky compute, all the applications which were going to be heavily compute, would need to run in Cloud and Cloud environments where you have complete elastic compute at remarkably low cost. And that leads to a problem. 
So this iceberg kind of that we like to talk about underneath the oceans, so moving data for static archival data, really simple problem. And that's not live data, that's archival data. You just FTP it from point A to point B. But if we're talking about transactional systems where 10, 20, 30, 40, 50 percent of the data set changes all of the time, that creates a humongous problem in moving data from on-premises to cloud, either for hybrid cloud or between clouds for multi-cloud. And that's the precise problem that WANdisco solves. And we've seen customer traction, recently we've just announced the deal, jointly with Microsoft Azure. Where a big healthcare company, who 12 months ago were not talking about cloud suddenly they got over that hump where security keys could be managed by themselves within the cloud, were able to move petabyte-scale data from their on-premise systems into the cloud, without any interruption to service, without any blocking. That's a trend that we're seeing. Our pipeline's now full of companies, all trying to do that. >> It's like you hit the oil gusher with data, because the data tsunami has been there, and we've documented certainly on theCUBE, and our Research team at Wikibon, have been talking about it for years, and now you're starting to see it, and you guys are getting the benefits of it, is that people figured out that moving data around is expensive. And it's hard to do so you push compute to the edge, but you still got to move the data around because the key part of the latency piece of the cloud. So how do you do that at scale? So this is the thing that you guys have, and I want you to explain what it is. You guys have live data for multi-cloud. What does that mean? What is all the hubbub about? What's the buzz? Why is this such a hot topic, live data for multi-cloud. 
>> Okay so let's just take a step back and talk about what multi-cloud actually is in today's definition, which is the vendor's definition, which is very convenient. So what they mean is, moving, putting applications into a container, Kubernetes or whatever, picking it up and shifting it somewhere else. And hey presto, I've got applications running, the same applications running in two different clouds. That is not multi-cloud because you're forgetting about the data, and the iceberg underneath the ocean of this colossal amount of data. If I've got petabyte-scale, multi-terabyte-scale data sets, and I need to run the same applications, or different applications but against the same data set, I need guaranteed consistent data, and that is, by definition, a data consistency problem. It is not a data replication problem. So all of the stuff that we used to use in the past for gigabyte-scale data, for traditional, relational database problems, none of that stuff works in a live data world. And by live data, we're talking about multi-terabyte, petabyte-scale data. Data sets that are so large that we've never seen them before running in N cloud locations. It's different or same applications, but guaranteed consistent data in every location. >> So you guys have had this core competency around integrity around the data, whether it's in replication. Sounds like the same thing's true around moving data. >> Yep. >> You guys are managing the life cycle of end-to-end of data movement. >> Yep. >> Point A to point B. >> Yep. >> The other approach is to move compute to the data. >> Yep. >> We're just seeing Amazon do a deal with VMware on-premise. So there's two schools of thought. When should customers think about each approach? Can you just kind of debunk or just clarify those two positions? >> So it's not really a chicken and egg because we know which comes first. It's definitely the chicken. It's definitely the data. 
So if I'm going to rebuild my application infrastructure, in the cloud, I'm going to do it piece-by-piece. I can't do lift-and-shift for a thousand applications that are running against this data set and just hope that the data that block for six months because I've got petabyte-scale data, and wait for it to all arrive in the cloud, or put it to the back of, you know, use a Snowmobile or some physical device to move the data. I need to do this, I need to kind of build the aircraft while it's taking off and flying and that's probably a good analogy. So what we see is, for companies, the first step is to get consistent data on-premise to cloud, or between different clouds. Then what that enables me to do of course, is to piece-by-piece then rebuild my application infrastructure at the pace that I want to. I mean there's a great ad that I keep on seeing on TV. Where it's migration day. As though I can press a button and then suddenly you know, in this Alice in Wonderland magical world, everything just appears. Realistically, and I saw the CEO of VMware a couple of years ago talk about being in a hybrid cloud scenario for 20 years. I think that's probably accurate. We've got billions of applications. A mix of homegrown stuff, a mix of, you know, actuarial applications in the insurance industry that are impossible to build overnight. This is going to take an elongated period of time. >> I was talking on Twitter with a bunch of thought leaders. We were talking about hybrid cloud and multi-cloud, and the kindergarten class is hybrid, right? >> Yeah. >> So you got some public cloud, then you got some on-premise data center. So getting that operational thing nailed down is great. But as you get old, you know, you progress in the grades, and get smarter, as you increase your I.T. I.Q., you're dealing with multiple, potentially multiple data centers or bigger on site, or an IOT edge, and multiple clouds. >> Yep. 
>> So that sounds easy on paper, but when you have to move data around the different work loads, that's the core problem that people are talking about today. How do you guys address this problem? Because I buy multi-cloud, I can see that certain tools and certain clouds the right work load and the right cloud, I get that. >> Yeah. >> It makes a lot of sense to me. The data is the problem. >> Yep. >> So how do you guys address that? This is the number one concern. >> So the closest, people ask me all the time about competition. The closest is Google. Google have got a product called Google Spanner. And Google Spanner is a time-sensitive, active-active WAN-scope data replication solution. That looks on paper very close to what WANdisco does. It enables them to keep active data in all of their different geolocations that they've built for their ad services years and years and years ago. The trouble with that is, it only works on their own proprietary network, against their own proprietary applications because they launched a satellite and stuck it in the sky, they put dark fiber under the ocean, and they put GPS atomic clocks on every single one of their servers because it uses time and time accuracy in order to synchronize all of their data. We can do all of that over the public internet. So we're not a hardware solution. This is a pure software solution that can work over the public internet. So we can do that for any cloud vendor, and any provider of applications. And that's what we do. We're licensing our I.P. all over the place at the moment. >> So which clouds are, I imagine there's a great uptake for the clouds. Which one are you working with now? Can you talk about the deals you've done? >> We're very close. We announced the Azure partnership with Microsoft, and their Azure product, and we've been very impressed with the traction that we're seeing with them, particularly in enterprise cloud. 
I mean the early stage of cloud obviously was dominated by Amazon, Amazon Web Services. And they did a fantastic job of really bringing cloud to the market by accident kind of inventing cloud and then bringing it to market very very quickly. The fastest ever company to, if it's an independent company to 15 billion dollars, but most of those applications and projects and companies were born in the cloud. I mean a lot of the modern companies today were actually of course, you have Airbnb et cetera, were born in the cloud. So that, the second inning of cloud is certainly enterprise. We've also been impressed with the traction that we've seen from Google GCP as being extremely impressive. And of course Amazon continued to thrive. In cloud we also have an OEM deal with Ali, with Alibaba with their cloud as well. So they're really the only full. >> If Google has Spanner, how do you differentiate between Google Spanner? >> So Google Spanner only works on their proprietary network. Which is great for Google and between their data centers, but what about 99.9 percent of the rest of the problem, which is the rest of us right, who operate on the public internet. So we can do what Google Spanner does active-active, geo, WAN-scope replication of data but over the public internet. >> So you guys have been talking active-active for many times. We've had many conversations here on theCUBE. So I get that. How has your business changed with cloud? You had mentioned prior to coming on camera. You made a bet on cloud. It's paying off obviously. People who have made the right bets on cloud at the right time, it's certainly paying off. You're one of them. How does the live data in the multi-cloud change your business? Does it increase your trajectory? Is there a pivot? I mean what does it mean for WANdisco? 
>> So the very, so my thesis or the company's thesis, I won't take the credit for it, but the company's thesis was really simplistic, which is our bet was in the small data world of gigabyte-scale data, in order to do data replication, small data equals small outage. When you get data sets that are growing exponentially, and you get, you know, data sets that are a thousand or a million times greater than what we've seen previously, what was a small outage or small blocking of client applications will become an elongated blocking of client applications that we're talking about, you know, six months to move 20 petabytes of data. You can't block applications, business critical applications for six months. That was the bet that we made. We expected initially to see that happen on-premise in the data lake world, in the Hadoop world if you will. That didn't quite happen, or has not happened to date. We don't think that's probably going to happen. We're certainly seeing a huge desire of companies moving those data lakes into cloud, and we've actually innovated, we've got some new inventions coming out that enable you to move in a single pass, massive quantity of data that will be exponentially faster than anything else, and just doing a unidirectional data move into clouds. That was our bet that we said "Okay, companies in order to achieve the kind of scale that they need to achieve, they're going to have to do this in cloud. In order to get to cloud, they're going to have to move that data there, and they're not going to be able to block even for a day in order to move that data to cloud." And that was the bet we made, and it was the right bet. >> Talk about where you guys go from here. Give a company update. What's the status of the company? Get some new personnel? Any changes, notable updates? >> So we, really interestingly, my Co-Founder and Chief Scientist is a genius, Dr. Yeturu Aahlad, Ph.D. 
from UT, and undergrad from IIT, a new VP of Engineering Sakthi, IIT, Ph.D. at U.T. under Draxler. This fantastic Ph.D. program they did there. My new Head of Research came from, was Chairman of Computer Science at the University of Denver. He was an IIT undergrad, Ph.D with Aahlad at UT. And I said jokingly to Aahlad: "There must be a fourth guy that we can bring on board here that went through the same program." He said, "We can but we can't hire him, because he's the CTO of Microsoft, so." That was, he was the fourth guy. Joel, who I know, is going to be coming on theCUBE shortly. He also has joined us from IBM to run Marketing for us. So we've made some fantastic new hires. The company's doing really well. You know cloud certainly has played a big part in the second half of last year. I think it's going to play a big part. It's definitely going to play a big part in 2019. We've seen a pivot in pipeline, that's moved away from possibly even disaster recovery, data lake in the first half of last year. We pivoted to more of a reliable subscription revenue in the second half of the year. We announced some pretty big deals, big healthcare companies. We've got really good public reference with AMD. We announced a motor vehicle company one of the new use cases there is four petabytes of data per day they're generating. That all has to be moved from on-premise to cloud. So we've got some ginormous deals in pipeline. We'll see how they play out in the coming weeks and months. >> It's great to see the change, and certainly on theCUBE. We've been talking, I think we've known each other for almost, this is our tenth year. >> Yeah. Ever since we first met. >> It's fun to see how you guys entered the market at Hadoop, staying on the data wave and thinking enterprise, integrity of the data, active-active, the key I.P. And how cloud is just assumed data, and it's not just data, it's large scale. 
So if you look at the new people you hired, you've got jobs in large scale systems. >> Yep. >> We're talking about a large systems, now data is just given. So you're really nailing the large scale, moving from an enterprise nice feature, certainly table stakes for fault tolerance, and active-active. Just add recovery to mission critical >> Yep. >> Ingredient in large scale cloud. >> Well it's ironic isn't it because our value actually increases with the volume of data. So we're an unusual company in that context where the larger the data set, the greater the problem, and the greater the problem that we solve. See we made a pretty good bet, the active-active replication, that live data would be a critical component of both hybrid cloud and multi-cloud. And that's playing out I think really well for us. >> And certainly a lot more changes to come. Great to have you on. >> Yeah. >> Cloud and multi-cloud. Certainly cloud has proven the economics proven large scale value of moving at cloud speed but now you have multiple clouds. That's going to change the game on applications, work loads. It's not going to change the data equation. There's still more tsunami of data that's not stopping. >> Exactly. >> I think you've got a good wave you're riding. >> Yeah. >> Data cloud wave. David Richards, CEO of WANdisco here in CUBE Conversations here in Palo Alto. I'm John Furrier, thanks for watching. (upbeat instrumental music)

Published Date : Jan 22 2019

SUMMARY :

But live data is the new hot thing. So we talk all the time about how you guys And that leads to a problem. And it's hard to do so you push compute to the edge, So all of the stuff that we used to use in the past So you guys have had this core composite around are managing the life cycle of end-to-end of data movement. to move compute to the data. Can you just kind of debunk in the cloud, I'm going to do it piece-by-piece. and the kindergarten class is hybrid, right? So you got some that's the core problem It makes a lot of sense to me. So how do you guys address that? We can do all of that over the public internet. Can you talk about the deals you've done? I mean a lot of the modern companies today but over the public internet. So you guys have been talking in the Hadoop world if you will. What's the status of the company? in the second half of the year. It's great to see the change, It's fun to see how you guys entered the market at Hadoop, Just add recovery to mission critical and the greater the problem that we solve. Great to have you on. It's not going to change the data equation. David Richards, CEO of WANdisco here

ENTITIES

Entity	Category	Confidence
Microsoft	ORGANIZATION	0.99+
Joel	PERSON	0.99+
David	PERSON	0.99+
Alibaba	ORGANIZATION	0.99+
IBM	ORGANIZATION	0.99+
WANdisco	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
David Richards	PERSON	0.99+
2019	DATE	0.99+
Amazon Web Services	ORGANIZATION	0.99+
Yeturu Aahlad	PERSON	0.99+
20 years	QUANTITY	0.99+
Palo Alto	LOCATION	0.99+
six months	QUANTITY	0.99+
Aahlad	PERSON	0.99+
Google	ORGANIZATION	0.99+
15 billion dollars	QUANTITY	0.99+
January 2019	DATE	0.99+
20 petabytes	QUANTITY	0.99+
two schools	QUANTITY	0.99+
VMware	ORGANIZATION	0.99+
Wikibon	ORGANIZATION	0.99+
tenth year	QUANTITY	0.99+
AMD	ORGANIZATION	0.99+
10	QUANTITY	0.99+
forth guy	QUANTITY	0.99+
both	QUANTITY	0.99+
40	QUANTITY	0.99+
50 percent	QUANTITY	0.99+
first step	QUANTITY	0.99+
each approach	QUANTITY	0.99+
University of Denver	ORGANIZATION	0.99+
two positions	QUANTITY	0.99+
CUBE	ORGANIZATION	0.98+
one	QUANTITY	0.98+
today	DATE	0.98+
Airbnb	ORGANIZATION	0.98+
fourth guy	QUANTITY	0.98+
UT	LOCATION	0.98+
IIT	ORGANIZATION	0.97+
first	QUANTITY	0.97+
seven times	QUANTITY	0.97+
a day	QUANTITY	0.97+
second inning	QUANTITY	0.97+
12 months ago	DATE	0.96+
UT	ORGANIZATION	0.96+
20	QUANTITY	0.96+
Spanner	TITLE	0.96+
Draxler	PERSON	0.95+
Twitter	ORGANIZATION	0.95+
a million times	QUANTITY	0.95+
30	QUANTITY	0.94+
about 99.9 percent	QUANTITY	0.92+
four petabytes	QUANTITY	0.91+
single pass	QUANTITY	0.89+
billions of applications	QUANTITY	0.88+
couple of years ago	DATE	0.86+
one premises	QUANTITY	0.85+
Ali	ORGANIZATION	0.83+
first half of last year	DATE	0.83+
a thousand	QUANTITY	0.83+
Google Spanner	TITLE	0.82+
of the year	DATE	0.81+
thousand applications	QUANTITY	0.8+
theCUBE	ORGANIZATION	0.79+
two different clouds	QUANTITY	0.78+
Dr.	PERSON	0.78+
Hadoop	ORGANIZATION	0.74+
Azure	TITLE	0.74+
last year	DATE	0.72+

David Richards, WANdisco | theCUBE NYC 2018

Live from New York, it's theCUBE. Covering theCUBE, New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Okay, welcome back everyone. This is theCUBE live in New York City for our CUBE NYC event, #cubenyc. This is our ninth year covering the big data ecosystem going back to the original Hadoop world, now it's evolved to essentially all things AI, future of AI. Peter Burris is my cohost. He gave a talk two nights ago on the future of AI presented in his research. So it's all about data, it's all about the cloud, it's all about live action here in theCUBE. Our next guest is David Richards, who's been in the industry for a long time, seen the evolution of Hadoop, been involved in it, has been a key enabler of the technology, certainly enabling cloud recovery replication for cloud, welcome back to theCUBE. It's good to see you. >> It's really good to be here. >> I got to say, you've been on theCUBE pretty much every year, I think every year, we've done nine years now. You made some predictions and calls that actually happened. Like five years ago you said the cloud's going to kill Hadoop. Yeah, I think you didn't say that off camera, but it might (laughing) maybe you said it on camera. >> I probably did, yeah. >> [John] But we were kind of pontificating but also speculating, okay, where does this go? You've been right on a lot of calls. You also were involved in the Hadoop distribution business back in the day. >> Oh god. >> You got out of that quickly. (laughing) You saw that early, good call. But you guys have essentially a core enabler that's been just consistently performing well in the market both on the Hadoop side, cloud, and as data becomes the conversation, which has always been your perspective, you guys have had a key part of the infrastructure for a long time. What's going on? Is it still doing deals, what's? 
>> Yes, I mean, the history of WANdisco's play in big data and Hadoop has been, as you know because you've been with us for a long time, kind of an interesting one. So we back in sort of 2013, 2014, 2015 we built a Hadoop-specific product called Non-Stop NameNode and we had a Hadoop distribution. But we could see this transition, this change in the market happening. And the change wasn't driven necessarily by the advent of new technology. It was driven by overcomplexity associated with deploying, managing Hadoop clusters at scale because lots of people, and we were talking about this off-camera before, can deploy Hadoop in a fairly small way, but not many companies are equipped or built to deploy massive scale Hadoop distributions. >> Sustain it. >> They can't sustain it, and so the call that I made you know, actions speak louder than words. The company rebuilt the product, built a general purpose data replication platform called WANdisco Fusion that, yes, supported Hadoop but also supported object store and cloud technologies. And we're now seeing use cases in cloud certainly begin to overtake Hadoop for us for the first time. >> And you guys have a patent that's pretty critical in all this, right? >> Yeah. >> So there's some real IP. >> Yes, so people often make the mistake of calling us a data replication business, which we are, but data replication happens post-consensus or post-agreement, so at the very heart of WANdisco, our 35 patents are all based around a Paxos-based consensus algorithm, which wasn't a very cool thing to talk about. Now, with the advent of blockchain and decentralized computing, consensus is at the core of pretty much that movement, so what WANdisco does is a consensus algorithm that enables things like hybrid cloud, multi cloud, poly cloud as Microsoft call it, as well as disaster recovery for Hadoop and other things. 
>> Yeah, as you have more disparate parts working together, say multi cloud, I mean, you're really perfectly positioned for multi cloud. I mean, hybrid cloud is hybrid cloud, but also multi cloud, they're two different things. Peter has been on the record describing the difference between hybrid cloud and multi cloud, but multi cloud is essentially connecting clouds. >> We're on a mission at the moment to define what those things actually are because I can tell you what it isn't. A multi cloud strategy doesn't mean you have disparate data and processes running in two different clouds that just means that you've got two different clouds. That's not a multi cloud strategy. >> [Peter] Two cloud silos. >> Yeah, correct. That's kind of creating problems that are really going to be bad further down the road. And hybrid cloud doesn't mean that you run some operations and processes and data on premise and a different siloed approach to cloud. What this means is that you have a data layer that's clustered and stretched, the same data that's stretched across different clouds, different on-premise systems, whether it's Hadoop on-premise and maybe I want to build a huge data lake in cloud and start running complex AI and analytics processes over there because I'm, let's face it, banks et cetera ain't going to be able to manage and run AI themselves. It's already being done by Amazon, Google, Microsoft, Alibaba, and others in the cloud. So the ability to run this simultaneously in different locations is really important. That's what we do. >> [John] All right, let me just ask this directly since we're filming and we'll get a clip out of this. What is the definition of hybrid cloud? And what is the definition of multi cloud? Take, explain both of those. >> The ability to manage and run the same data set against different applications simultaneously. And achieve exactly the same result. >> [John] That's hybrid cloud or multi cloud? >> Both. >> So they're the same. >> The same. 
>> You consider hybrid cloud multi cloud the same? >> For us it's just a different end point. Hybrid kind of implies that you're running something on-premise. A multi cloud or poly cloud implies that you're running between different cloud venues. >> So hybrid is location, multi is source. >> Correct. >> So but let's-- >> [David] That's a good definition. >> Yes, but let's unpack this a little bit because at the end of the day, what a business is going to want to do is they're going to want to be able to apply their data to the best service. >> [David] Correct. >> And increasingly that's what we're advising our clients to think about. >> [David] Yeah. >> Don't think about being an AWS customer, per se, think about being a customer of AWS services that serve your business. Or IBM services that serve your business. But you want to ensure that your dependency on that service is not absolute, and that's why you want to be able to at least have the option of being able to run your data in all of these different places. >> And I think the market now realizes that there is not going to be a single, dominant vendor for cloud infrastructure. That's not going to happen. Yes, it happened, Oracle dominated in relational data. SAP dominated for ERP systems. For cloud, it's democratized. That's not going to happen. So everybody knows that Amazon probably have the best serverless compute, Lambda functions, available. They've got millions of those things already written or in the process of being written. Everybody knows that Microsoft are going to extend the wonderful technology that they have on desktop and move that into cloud for analytics-based technologies and so on. Google have been working on artificial intelligence for an elongated period of time, so vendors are going to arbitrage between different cloud vendors. They're going to choose the best-of-breed approach.
>> [John] They're going to go to Google for AI and scale, they're going to go to Amazon for robustness of services, and they're going to go to Microsoft for the Suite. >> [Peter] They're going to go for the services. They're looking at the services, that's what they need to do. >> And the thing that we'll forget, that we don't at WANdisco, is that that requires guaranteed consistent data sets underneath the whole thing. >> So where does Fusion fit in here? How is that getting traction? Give us some update. Are you working with Microsoft? I know we've been talking about Amazon, what about Microsoft? >> So we've been working with Microsoft, we announced a strategic partnership with them in March where we became a tier zero vendor, which basically means that we're partnered with them in lockstep in the field. We executed extremely well since that point and we've done a number of fairly large, high-profile deals. A retailer, for example, that was based in Amazon didn't really like being based in Amazon, so had to build a poly cloud implementation to move petabyte-scale data from AWS into Azure. That went seamlessly. It was an overnight success. >> [John] And they're using your technology? >> They're using our technology. There's no other way to do that. I think the world has now, what Microsoft and others have realized, CDC technology, change data capture, doesn't work at this kind of scale, where you batch up a bunch of changes and then you ship them, block shipping or whatever, every 15 minutes or so. We're talking about petabyte-scale ingest processes. We're talking about huge data lakes. That technology simply doesn't work at this kind of scale.
>> [John] We've got a couple minutes left, I want to just make sure we get your views on blockchain, you mentioned consensus, I want to get your thoughts on that because we're seeing blockchain is certainly experimental, it's got, it's certainly powering money, Bitcoin and the international markets, it's certainly becoming a money backbone for countries to move billions of dollars out. It's certainly in the tank right now about 600 million below its mark in January, but blockchain is fundamentally supply chain, you're seeing consensus, you're seeing some of these things that are in your realm, what's your view? >> So first of all, at WANdisco, we separate the notion of cryptocurrency and blockchain. We see blockchain as something that's been around for a long time. It's basically the world is moving to decentralization. We're seeing this with airlines, with supermarkets, and so on. People actually want to decentralize rather than centralize now. And the same thing is going to happen in the financial industry where we don't actually need a central transaction coordinator anymore, we don't need a clearinghouse, in other words. Now, how do you do that? At the very heart of blockchain is an incorrect assumption. So most people think that Satoshi's invention, whoever that may be, was based around the blockchain itself. Blockchain is pieced-together technologies that don't actually scale, right? So it takes a game-theoretic approach to consensus. And I won't get, we don't have enough time for me to delve into exactly what that means, but our consensus algorithm has already proven to scale, right? So what does that mean? Well, it means that if you want to go and buy a cup of coffee at the Starbucks next door, and you want to use a Bitcoin, you're going to be waiting maybe half an hour for that transaction to settle, right? Because the-- >> [John] The buyer's got to create a block, you know, all that step's in one.
>> The game-theoretic approach basically-- >> Bitcoin's running 500,000 transactions a day. >> Yeah. That's eight. >> There's two transactions per second, right? Between two and eight transactions per second. We've already proven that we can achieve hundreds of thousands, potentially millions of agreements per second. Now the argument against using Paxos, which is what our technology's based on, is it's too complicated. Well, no shit, of course it's too complicated. We've solved that problem. That's what WANdisco does. So we've filed a patent-- >> So you've abstracted the complexity, that's your job. >> We've abstracted the complexity. >> So you solve the complexity problem by being a complex solution, but you're making and abstracting it even easier. >> We have an algorithmic, not a game-theoretic, approach. >> Solving the scale problem. >> Correct. >> Using Paxos in a way that allows real developers to be able to build consensus algorithm-based applications. >> Yes, and 90% of blockchain is consensus. We've solved the consensus problem. We'll be launching a product based around Hyperledger very soon, we're already in tests and we're already showing tens of thousands of transactions per second. Not two, not 2,000, two transactions. >> [Peter] The game theory side of it is still going to be important because when we start talking about machines and humans working together, programs don't require incentives. Human beings do, and so there will be very, very important applications for this stuff. But you're right, from the standpoint of the machine-to-machine when there is no need for incentive, you just want consensus, you want scale. >> Yeah and there are two approaches to this world of blockchains.
There's public, which is where the Bitcoin guys are and the anarchists who firmly believe that there should be no oversight or control, then there's the real world which is permissioned blockchains, and permissioned blockchains is where the banks, where the regulators, where NASDAQ will be when we're trading shares in the future. That will be a permissioned blockchain that will be overseen by a regulator like the SEC, NASDAQ, or London Stock Exchange, et cetera. >> David, always great to chat with you. Thanks for coming on, again, always on the cutting edge, always having a great vision while knocking down some good technology and moving your IP on the right waves every time, congratulations. >> Thank you. >> Always on the next wave, David Richards here inside theCUBE. Every year, doesn't disappoint, theCUBE bringing you all the action here. Cube NYC, we'll be back with more coverage. Stay with us; a lot more action for the rest of the day. We'll be right back; stay with us for more after this short break. (upbeat music)
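
The throughput figures quoted in this conversation can be checked with simple arithmetic: 500,000 transactions a day is a little under six per second, which sits inside the "between two and eight" range mentioned. The sketch below is illustrative only. The 500,000 figure is the one quoted by the speakers, not a measured value, and the variable names are made up for this example.

```python
# Illustrative arithmetic only; the daily figure is the one quoted in the
# conversation, not a measurement.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

quoted_tx_per_day = 500_000
tx_per_second = quoted_tx_per_day / SECONDS_PER_DAY

print(f"{tx_per_second:.2f} transactions per second")  # prints 5.79 transactions per second
```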

Published Date : Sep 13 2018

ENTITIES

Entity	Category	Confidence
David	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Alibaba	ORGANIZATION	0.99+
Peter Burris	PERSON	0.99+
John	PERSON	0.99+
Peter	PERSON	0.99+
Google	ORGANIZATION	0.99+
David Richards	PERSON	0.99+
SEC	ORGANIZATION	0.99+
NASDAQ	ORGANIZATION	0.99+
March	DATE	0.99+
two	QUANTITY	0.99+
IBM	ORGANIZATION	0.99+
January	DATE	0.99+
AWS	ORGANIZATION	0.99+
2014	DATE	0.99+
millions	QUANTITY	0.99+
Oracle	ORGANIZATION	0.99+
90%	QUANTITY	0.99+
2013	DATE	0.99+
WANdisco	ORGANIZATION	0.99+
London Stock Exchange	ORGANIZATION	0.99+
2015	DATE	0.99+
New York City	LOCATION	0.99+
nine years	QUANTITY	0.99+
both	QUANTITY	0.99+
two transactions	QUANTITY	0.99+
eight	QUANTITY	0.99+
five years ago	DATE	0.99+
New York	LOCATION	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
half an hour	QUANTITY	0.99+
35 patents	QUANTITY	0.99+
hundreds of thousands	QUANTITY	0.99+
2,000	QUANTITY	0.99+
Both	QUANTITY	0.99+
ninth year	QUANTITY	0.98+
first time	QUANTITY	0.98+
billions of dollars	QUANTITY	0.98+
Hadoop	TITLE	0.98+
SAP	ORGANIZATION	0.98+
Starbucks	ORGANIZATION	0.98+
Paxos	ORGANIZATION	0.98+
two nights ago	DATE	0.97+
single	QUANTITY	0.97+
two approaches	QUANTITY	0.97+
500,000 transactions a day	QUANTITY	0.97+
about 600 million	QUANTITY	0.96+
theCUBE	ORGANIZATION	0.96+
Satoshi	PERSON	0.92+
two different clouds	QUANTITY	0.91+
NYC	LOCATION	0.89+
one	QUANTITY	0.88+
theCUBE	EVENT	0.87+

Jagane Sundar, WANdisco | CUBEConversation, May 2018

(regal music) >> Hi, I'm Peter Burris, and welcome to another Cube conversation from our beautiful studios here in Palo Alto, California. Got another great guest today, Jagane Sundar is the CTO of WANdisco. Jagane, welcome back to the Cube. >> Good morning, Peter. >> So, Jagane, I want to talk about something that I want to talk about. And I want you to help explicate for our clients what this actually means. So there's two topics that I want to discuss. We've done extensive research in both of them, and one is this notion that we call plastic infrastructure. And the other one, related, is something we call networks of data. Let's start with networks of data because I think that that's perhaps foundational for plastic infrastructure. If we look back at the history of computing, we've seen increasing decentralization of data. Yet today many people talk about data gravity and how the Cloud is going to bring all data into the Cloud. Our belief, however, is that there's a relationship between where data is located and the actions that have to be taken. And data locality has a technical reality to it. We think we're going to see more distribution of data, but in a way that nonetheless allows us to federate. To bring that data into structures that nonetheless can ensure that the data is valuable wherever it needs to be. When you think of the notion of networks of data, what does that make you think about? >> That's a very interesting concept, Peter. When you consider the Cloud, and you talk about S3 for example and buckets of objects, people automatically assume that it's a global storage system for objects. But if you scratch a little deeper under the surface you'll find that each bucket is located in one region. If you want it available in other regions you've got to set up something called cross-region replication, which replicates in an eventually consistent fashion. It may or may not get there in time.
So even in the Cloud storage systems, there is a notion of locality of data. It's something that you have to pay attention to. Now you hit the nail on the head when you said networks of data. What does that mean? Where does the data go? How is it used? Our own platform, the Fusion platform for replication of data, is a strongly consistent platform which helps you conform to legal requirements and locality of data and many such things. It was built with such a thing in mind. Of course, we didn't quite call it that way, but I like your way of describing it. >> So as we think then about where this is, the idea is, 40 years ago, ARPANET allowed us to create networks of devices in a relatively open application-oriented way. And the web allowed us to create networks of pages of content. But again, that content was highly stylized. More recently social media's allowed us to create networks of identities. All very important stuff. Now as we start talking about digital business and the fact that we want to be able to rearrange our data assets very quickly in response to new business opportunities whether it's customer experience or operational-oriented, this notion of networks of data allows us to think about the approach to doing that, so that we can have the data being in service to existing business opportunities, new business opportunities, and even available for future activities. So we're talking about creating networks out of these data sources, but as you said, to do that properly, we need to worry about consistency, we need to worry about cost. The platform for doing this, Fusion is a good one, it's going to require over time, however we think, some additional types of capabilities. The ability to understand patterns of data usage, the ability to stage data in advance and predictably, et cetera. Where do you think this goes as we start conceiving of networks of data as a fundamental value proposition for technology and business?
>> Sure, one of the first things that occurs to me when you talk about a network of data, if you consider that as parallel to a network of computers, you don't have a notion of things like read-only computers versus read-write computers. That's just silly. You want all computers to be roughly equal in the world. If you have a network of servers, and a network of computers, any of them can read. Any of them can write, and any of them can store. Now our Fusion platform brings about that capability to your definition of a network of data. What we call live data is the ability for you to store replicas of the data in different data centers around the world with the ability to write to any of those locations. If one of the locations happens to go down, it's a non-event. You can continue writing and reading from the other locations. That truly makes the first step towards building this network of data that you're talking about feasible. >> But I want to build on that notion a little bit because we are seeing increased specialization for example, AI, or GPUs. >> Sure. >> AI-specific processors, so even though we are still looking forward to general purpose nonetheless we see some degree of specialization. But let me also take that notion of live data and say I expect that we're going to see something similar. So for example, the same data set can be applied to multiple different classes of applications where each application may take advantage of underlying hardware advantages. But you don't have a restriction on how you deploy it built into the data. Have I got that right? >> Absolutely. Our Fusion platform includes the capability to replicate across Cloud vendors. You can replicate your storage between Amazon S3 and Azure Blob store. Now this is interesting because suddenly, you may discover that Redshift is great for certain applications while Azure SQLDW is better for others.
We give you the freedom to invent new applications based on what location is best suited for that purpose. You've taken this concept of network of data, you've applied a consistent replication platform, now you have the ability to build applications in different worlds, in completely different worlds. And that's very interesting to us because if we look at data as the primary asset of any company, consider a company like Netflix, their data and the way they manage their data is the most important thing to that company. We bring the capability to distribute that data across different Cloud vendors, different storage systems, and run different applications. Perhaps you have a GPU heavy Cloud that maybe a GPU vendor offers. Replicate your data into that Cloud, and run your AI applications against that particular replica. We give you truly the freedom to invent new applications for your purpose. >> But very importantly, you are also providing, and I think this is essential, a certainty that there's consistency no matter how you do it. And I think that's the basis of the whole, the Paxos algorithms you guys are using. >> Exactly. The fundamental fact is this. Data scientists hate to deal with outdated data. Because all the work they're doing may be for no use if the data that they're applying it to is outdated, invalid, or partially consistent. We give you guarantees that the data is constantly updated, live data, it's completely consistent. If you ask the same question of two replicas of your data, you will get exactly the same answer. There is no other product in the industry today that can offer that guarantee. And that's important for our customers. >> Now building on the foundation, we're going to have to add some additional things to it. So pattern recognition, ML inside the tool. Is that on the drawing board? And I don't want you to go too far in the future, but is that kind of the future that you see too? 
>> We are a platform company with an excellent plug-in API. And one of the uses of our plug-in API, I'll give you a simple example, we have banking customers and they need to prevent credit card numbers from flying over the wire under certain circumstances. Our plug-in API enables them to do that. Applying an ML intelligence program into the plug-in API, again, a very simple development effort to do that. We are facilitating such capabilities. We expect third-party developers. We already have a host of third-party developers and companies building to our plug-in API. We expect that to be the vehicle for this. We won't claim expertise in ML, but there are plenty of companies that will do that on our platform. >> All right, so that leads to the second set of questions that I wanted to ask you about. We've defined what we call plastic infrastructure as a future for the industry. And to make sense of that, what we've done is we've said let's take a look at three phases of infrastructure, not based on the nature of the hardware, but based on the fundamental capabilities of the infrastructure. Static infrastructure is when we took an application, we wired it to a particular class of infrastructure. New load hits it, often you broke the infrastructure. Elastic infrastructure is the ability to be able to take a set of workloads and have it vary up and down, so that you can consume more and release the infrastructure so it has a kind of a rubber orientation. You hit it with a new load, it will deform for as long as you need it to, then it snaps back into shape. So you've predictability about what your costs are. We think that increasingly digital business is going to have to think about plastic infrastructure. The ability to very rapidly have the infrastructure deform in response to new loads, but persist that new shape, that new structure in response to how the load has impacted the business if in fact that is a source of value for the business. >> Sure. 
>> What do you think about that notion of plastic infrastructure? >> I love the way you describe it. In our own internal terminology we have this notion of live data and freedom to invent. What you've described is exactly that. The plastic infrastructure matches exactly with our notion of freedom to invent. Once you've solved the problem of making your data consistently available in different Clouds, different regions, different data centers, the next step of course is the freedom to invent new applications. You're going to throw experimental things at it. You're going to find that there is specific business intelligence that you can draw from this by virtue of a new application. Use it to make some critical decisions, improve profitability perhaps. That results in what you describe as plastic infrastructure. I really love that description by the way. Because we've gone from, the Cloud brought us elastic infrastructure, we've replicated, we've built a system that enables innovation and invention of new ideas. That's plastic infrastructure. I really like the idea that you're proposing. >> So as you think about this concept of plastic infrastructure, obviously there's a lot of changes that are going to take place in the industry. But Fusion in particular, by providing consistency, by increasing the availability, more importantly even the delivery of data where it's required facilitates at that data level, that notion of plasticity. >> Absolutely. The notion that you can throw brand new applications at it in a Cloud vendor of your choice, the fact that we can replicate across different Clouds is important for plastic infrastructure. Perhaps there are certain applications that work better in one Cloud versus the other. You definitely want to try that out. And if that results in some real valuable applications, continue running it. So your definition that elastic becomes plastic infrastructure matches perfectly with that.
We love this notion that we take the CIO's problems of mundane data management away and introduce the capability to invent and innovate in their space. >> So let me give you a very, or let me ask you a very practical, simple question. Historically, the back-up and restore people, and the application development people didn't spend a lot of time with each other, and that has created some tension. Are we now because of our ability to do this live data, are we able to bring those two worlds more closely together so that developers can now think about building increasingly complex, increasingly rich applications? And at the same time ensure that the data that they're building and testing with is in fact very close to the live data that they're actually going to use. >> Absolutely. We do bridge that gap. We enable application developers to think of more complex, more sophisticated applications without actually worrying about the availability or the consistency of data. And the IT administrators and the CIO, who run operations that need to deliver that, have the confidence that they can in fact deliver it with the levels of consistency and availability that they need. >> So I'm going to give you the last word in this. We've talked a fair amount now about this notion of networks of data, and infrastructure plasticity, where do you think this kind of matures over the course of the next four or five years? And what should your peer CTOs of large businesses that are thinking about these challenges of data management be focusing on? >> So the first thing that you have to acknowledge is that people need to stop thinking about machines and servers, and consider this as infrastructure that they acquire from different Cloud vendors. Different Cloud vendors because in fact there is going to be a few, a handful of good Cloud vendors that'll give you different capabilities.
Once you get to that conclusion, you need your data available in all of these different Cloud vendors, perhaps on your on-prem location as well, with strong consistency. Our platform enables you to do that. Once you get to that point, you have the freedom to build new applications, build business-critical systems that can depend on the consistency and availability of data. That is your definition of plasticity and networks of data. I truly like that. >> Yeah, and so we, great, great summary. We would say that we would agree with you, that increasingly the CIO, or the CDO, whoever it's going to be, has to focus on how do I increase returns on my business's data, and to do that they need to start thinking differently about their data, about their data assets, both now and in the future. Very, very important stuff. Jagane, thank you very much for being on the Cube. >> Thank you, Peter. >> And once again, I'm Peter Burris, and this has been a Cube conversation with Jagane Sundar, CTO of WANdisco. Thanks again. (regal music)
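
The claim made in this conversation, that asking the same question of two replicas yields exactly the same answer, can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not WANdisco Fusion's API or a Paxos implementation: a single in-memory list (`agreed_log`) stands in for the agreed ordering of writes that a consensus protocol would produce, and the `Replica` class, `write` helper, and key names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A toy replica that applies agreed writes strictly in log order."""
    name: str
    store: dict = field(default_factory=dict)
    applied: int = 0  # index of the next log entry this replica has not yet applied

    def catch_up(self, log):
        # Apply every agreed entry this replica has not seen yet, in order.
        for key, value in log[self.applied:]:
            self.store[key] = value
        self.applied = len(log)


# Assumption: this list stands in for the ordering a consensus protocol
# would agree on; a real system would not share one Python list.
agreed_log = []


def write(key, value):
    """Record a write in the agreed order (hypothetical helper)."""
    agreed_log.append((key, value))


west = Replica("west")
east = Replica("east")

write("policy:1", "draft")
write("policy:1", "approved")
write("policy:2", "draft")

west.catch_up(agreed_log)
east.catch_up(agreed_log)

# Both replicas applied the same entries in the same order,
# so the same query returns the same answer on either one.
same_answer = west.store == east.store
print(same_answer)
```

The design point of the sketch is that replicas stay identical because they apply one agreed sequence deterministically; a best-effort feed that can drop or reorder events gives no such guarantee.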

Published Date : May 17 2018

ENTITIES

Entity	Category	Confidence
Jagane	PERSON	0.99+
Peter Burris	PERSON	0.99+
Jagane Sundar	PERSON	0.99+
Peter	PERSON	0.99+
Netflix	ORGANIZATION	0.99+
May 2018	DATE	0.99+
second set	QUANTITY	0.99+
two topics	QUANTITY	0.99+
WANdisco	ORGANIZATION	0.99+
both	QUANTITY	0.99+
Palo Alto, California	LOCATION	0.99+
Amazon	ORGANIZATION	0.99+
two replicas	QUANTITY	0.99+
first step	QUANTITY	0.98+
one region	QUANTITY	0.98+
today	DATE	0.98+
one	QUANTITY	0.98+
each application	QUANTITY	0.98+
40 years ago	DATE	0.97+
Redshift	TITLE	0.96+
two worlds	QUANTITY	0.96+
each bucket	QUANTITY	0.95+
S3	TITLE	0.94+
Paxos	ORGANIZATION	0.94+
first thing	QUANTITY	0.93+
Azure SQLDW	TITLE	0.92+
first things	QUANTITY	0.92+
Azure Blob	TITLE	0.9+
ARPANET	ORGANIZATION	0.9+
five years	QUANTITY	0.89+
CTO	PERSON	0.87+
Cloud	TITLE	0.8+
three phases	QUANTITY	0.77+
Cube	ORGANIZATION	0.77+
questions	QUANTITY	0.72+
Fusion	TITLE	0.69+
CUBEConversation	EVENT	0.65+
four	QUANTITY	0.49+
next	QUANTITY	0.36+

Jagane Sundar, WANdisco | CUBEConversation, May 2018

(intense orchestral music) >> Hi, I'm Peter Burris, welcome to another CUBEConversation. Today we've got a special guest from WANdisco, Jagane Sundar, who's the CTO. Jagane, welcome to theCUBE again! >> Thanks Peter, happy to be here! >> So Jagane, we've got a lot to talk about today, WANdisco's doing a lot of new things, but clearly the industry is, itself, in the midst of a relatively important evolution. Now we at Wikibon and SiliconANGLE have been calling it the transformation to digital business. Everybody talks about this, but we've been pretty specific, we think that it boils down to how a company uses data as an asset, and the degree to which it's institutionalizing, or re-institutionalizing work around those assets. How does WANdisco see this big transformation that we're in the midst of right now? >> So, you're exactly right, businesses are transforming from traditional means to a digital based business, and the most important thing about that is the data. WANdisco is at the forefront of making your data available for your innovation. We start off with the basic use-cases, disaster recovery, that's a traditional problem that people have half-solved in many different ways, but we have the ability to solve that problem, take you to the next stage, which is what we call live data, where you don't worry about the availability or the location of your data anymore. Finally, we take you from that live data platform to a place where you can invent with your data, the freedom to invent phase of our--of what we call. Now, that's what you're calling the digital transformation and there's great synergy between our two terminologies, that's an important aspect here. >> So let me unpack that a little bit, if I can. So the core notion is: that every business has to start acknowledging that data is something more than the exhaust that comes out of applications, it really is a core data asset.
So let's start with this notion of backup and restore, or disaster recovery, the historical orientation is: I have these very expensive assets, typically in the form of hardware, or maybe applications, and I have to ensure that I can back those assets up. So backup restore used to be back up a device, backup a volume, backup whatever else it might be, and now it's moved to more of a backup of a virtual machine. I think we're talking about something different when we talk about your approach to backup and restore, we're really talking about backing up data assets, do I have that right? >> That is correct. You have gone from a place where you are backing up PCs and Macintoshes and cellphones, to a place where the digital assets of your company, that are useful for analytics, are far more important. Now, a simple backup, where you take the contents in one data center, push it to another data center, is a half-solution to the problem. What we've come up with is this notion called live data. You have multiple data centers, some of them you own, they're on premise, some of them are Cloud vendor data centers, they definitely reside in different parts of the world. Your data also is generated in different parts of the world, now all of this data goes into this data system, this platform that we've built for you, and it's available under all circumstances. If a region of a Cloud vendor goes down or if your own data center goes down, that's a non-event, because that data is available in other data centers around the world. This gives you the flexibility to treat this as a live data platform. You can write data where you want, you can read and run analytics wherever you want. You've gone from backing up PCs and phones, to actually using your digital assets in a manner such that you can make critical business decisions based on that. Imagine that insurance company that's making-- underwriting policies based on this digital data.
If the data's not available, you've got a full halt on the business, that's not acceptable. If the data is not available because a specific data center went down, you can't call a full-stop to your business, you've got to make it available. Those are simple examples of how digital transformation is happening, and regular backup and DR are really inadequate to fuel your digital transformation. >> In fact, we like to think, we're advising our clients, that as they think about digital transformation, the role that data's playing, a digital business is not just backing up and restoring or sustaining or avoiding disasters associated with the data, they're really talking about backing up and restoring their entire business. That's kind of what we mean when we talk about DR in the digital business sense, disaster recovery, or backup and recovery, restore, in a digital business sense. And as you said, this notion of live data increases our ability to do that, but partly that requires a second kind of a step. By that I mean, most people think about storage, they think about where data's located in terms of persisting the data. When we talk about this new approach, we're talking about ensuring that we can deliver the data. Restore takes on more importance than backup than it has before, would you agree with that? That really talking about live data is really about being able to restore data wherever it's needed. >> It's an interesting new approach where we don't really define a primary and a backup. One of the important things about our Paxos-based replication system is that each location, or each instance, replica of your data, is exactly equal. 
So if you have a West Coast data center, and an East Coast data center, and a Midwest data center, and your West Coast data center happens to go down, none of the activities that you perform on your data will stop, you can continue writing your data to your Midwest and your East Coast data center, you continue writing and reading, running your applications against this data set. Now there wasn't a definition that the West Coast is primary and East Coast is backup. When a disaster strikes, we will cut over to the backup, we'll start using that, when the primary comes back, now we have to reconcile it, that's the traditional way of doing things, and it brings about some really bad attributes. Such as you need to have all your data pumped into one data center, that's counter to our philosophy. We believe that live data is where each of these replicas is equal, we build a platform for you where you can write to any of these, you can run your analytics against any of those. Once you get past that mental hurdle, what you've got is the freedom to innovate. You can look at it and go: I've got my data available everywhere, I can write to it, I can read from it, what can I do with this data? How can I quickly iterate so I can make more interesting business decisions, more relevant business decisions that will result in better business, profits and revenue. This interesting outcome is because you're now, not concerned about the availability of data or the primary, backup, and failover and failback, all those disappear from your radar. >> So let me build on that a little bit too, Jagane. 
So the way we would describe that is that a digital business must have those data assets, those crucial data assets, available, so that they can be delivered to applications and new activities, so we think in terms of what we call data zones, where the idea is that you take a look at what your digital business value proposition is, what activities are essential to delivering on that value proposition, and then, whether or not the data is in a zone proximate to that activity, so that activity can actually be executed. So that means, from a physical standpoint, it needs to be there, from a legal standpoint, from an intellectual property control standpoint, from cost, but also from a consistency standpoint, you don't want dramatically different behaviors in your business just because the data that's over there is not consistent with the data that's over here, that's kind of what you guys are looking at. Now, ultimately that means, going out a little bit, but ultimately that means that this notion of deploying data so it serves your business now, has to also include a futures orientation. That we want to choose technologies that give us high value options on data futures as well. Is that what you mean by effectively, freedom to invent? >> It's definitely one aspect of our definition of freedom to invent. We are focused fully on complying with some of these requirements that you talked about. Regions of data, for example, there are parts of the world where you cannot take the data from that part of the world outside but often you need to do analytics in a global manner, such that if you detect a flaw or a problem that is surfaced by data in one part of the world, the chances are very good that that'll apply to this restricted zone as well. You want to be able to apply your analytics against that. Critical business decisions may need to be made, yet you cannot export that data out of that country, we facilitate such capabilities. 
So we've gone from a simpler primary backup type of system to a live data platform. And finally, we've given you the freedom to invent because you can now take a look at it and go I can start building applications that are in the critical business path because I'm confident of the availability of my data, the fact that we comply with all regulatory compliance things like aging out data after a certain number of months or days, we can help you do that really well with our platform. So yes, in fact the notion that data resides in different pools, in different areas, replicated consistently, available under all circumstances, enables business to think about their data in a completely different manner, up-level it. >> And satisfying physical, legal, intellectual property, and cost realities. >> Exactly. Those are all compliance requirements that need to be addressed by this replication platform. >> So as we think about where customers are going with this clearly they've started around this backup and restore, but it sounds like you guys are helping them today conceive of what it means to do backup and restore and analytics, that is a particularly sensitive issue for a lot of businesses right now that are trying to marry together data science and good practices associated with IT. How is that playing out, can you give us some insight into how customers are doing a better job of that? >> Sure. A global auto maker that has acquired our software to do replication started off by using it for two very simple use-cases. They were looking at migrating from an older version of a data system to a newer version, we enabled them to do that without downtime, that was a clear win for us. The second thing they wanted to do was enable a disaster recovery type scenario. 
Once we got to that stage, we showed them how easy it was for them to continue writing to what was originally notionally the backup system, that made about twice as much compute resources available for them, because their original notion was that the backup system would just be a backup system, nothing could be done on it. Light bulbs went off in our customer's head, they looked at it and went I can continue writing here, even if my primary goes down, there's no real notion of a backup, there's no real notion of failover and failback, that opened their minds to a whole bunch of new ideas. Now, they are in a position to build some business critical applications. Gone are the days when an analytics thing meant you run a report once a week and send it off to the CIO, it's not that anymore, it's up-to-the-minute accuracy, things like insurance companies making underwriting decisions, and healthcare companies tracking the spread of diseases based on up-to-the-minute information that they're getting. These are not once-weekly analytics applications anymore, these are truly businesses that are based on their digital data. >> So a fundamental promise of live data is that wherever the data is, the application is live? >> Jagane: Yes, absolutely. >> Alright one more thing I think we want to talk about very quickly Jagane is there is some differences in mindset that a CIO has to apply here, again the CIO used to look at the assets and say machines, the hardware, yes, and maybe the applications, and now, to really see the value of this, they have to think of this in terms of data being the asset. How are your customers starting to evolve that notion so that they see the problem differently? >> So, I think the first thing that happened was the Cloud, we can't take credit for that, of course, but it helped our cause a great deal because people looked at infrastructure with a completely different viewpoint. 
They don't look at it as I'm going to buy a server with this size to run my Oracle, that mentality went away, and people started looking at, I have to store my data here and I can run an elastic application on this, I can grow our resources on demand and surrender those resources back to the Cloud when I don't need that. We take that to the next step, we enable them to have consistent replicas of their data across multiple regions of Cloud vendors, across different Cloud vendors. Suddenly they have the ability to do things like, I can run this analytics on Redshift here in Amazon really well, I can use this same data to run it on Azure SQL DW here, which is a better application for this specific use-case. We've opened up the possibilities to them, such that, they don't worry about what data they're going to use, how much resources they're going to get, resources are truly elastic now, you can buy and surrender resources, as per your demand, so it's opening up possibilities that they never had before. >> Excellent! Jagane Sundar, CTO of WANdisco, talking about live data, and the journey the customers are on to make themselves more fully digital businesses. >> Thanks, Peter. >> Once again this is Peter Burris from theCUBE, CUBEConversation with Jagane Sundar of WANdisco. (intense orchestral music)
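
The "no primary, no backup" behavior described in this conversation can be pictured with a small Python sketch. It is an illustration only, not WANdisco's implementation: the `OrderedLog` class is a hypothetical stand-in for the agreement step that a consensus protocol such as Paxos would provide across sites, and the `Replica` and `demo` names are likewise invented for the example. The assumption is simply that every write receives one agreed global position and that every site applies writes in that order.

```python
from dataclasses import dataclass, field


@dataclass
class OrderedLog:
    """Hypothetical stand-in for the agreement step.

    A real system would agree on the order of writes with a consensus
    protocol such as Paxos across sites; one in-memory list is used here
    only to keep the sketch short.
    """
    entries: list = field(default_factory=list)

    def append(self, key, value):
        self.entries.append((key, value))
        return len(self.entries) - 1  # global position of this write


@dataclass
class Replica:
    name: str
    log: OrderedLog
    data: dict = field(default_factory=dict)
    applied: int = 0   # number of log entries applied so far
    up: bool = True

    def write(self, key, value):
        """Any site that is up accepts writes; no site is the primary."""
        if not self.up:
            raise RuntimeError(f"{self.name} is down")
        self.log.append(key, value)
        self.catch_up()

    def catch_up(self):
        """Apply the agreed writes in log order, skipping none."""
        if not self.up:
            return
        for key, value in self.log.entries[self.applied:]:
            self.data[key] = value
        self.applied = len(self.log.entries)


def demo():
    log = OrderedLog()
    west, midwest, east = (Replica(n, log) for n in ("west", "midwest", "east"))
    west.write("car-1", "ok")
    west.up = False                     # the West Coast site fails
    east.write("car-2", "fault")        # writes continue at the other sites
    midwest.write("car-1", "serviced")
    west.up = True                      # the site returns and replays the log
    for replica in (west, midwest, east):
        replica.catch_up()
    return west.data, midwest.data, east.data
```

Calling `demo()` returns the three sites' data, which end up identical. There is no failover or failback step in the sketch; the returning site only replays the writes it missed.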

Published Date : May 17 2018

ENTITIES

Entity	Category	Confidence
Peter Burris	PERSON	0.99+
Jagane Sundar	PERSON	0.99+
Peter	PERSON	0.99+
WANdisco	ORGANIZATION	0.99+
Jagane	PERSON	0.99+
May 2018	DATE	0.99+
Amazon	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
each location	QUANTITY	0.99+
Wikibon	ORGANIZATION	0.99+
two terminologies	QUANTITY	0.99+
each instance	QUANTITY	0.99+
Today	DATE	0.98+
each	QUANTITY	0.98+
once a week	QUANTITY	0.98+
second	QUANTITY	0.97+
first thing	QUANTITY	0.97+
today	DATE	0.97+
one part	QUANTITY	0.97+
two very simple use-cases	QUANTITY	0.96+
One	QUANTITY	0.96+
second thing	QUANTITY	0.95+
East Coast	LOCATION	0.95+
Midwest	LOCATION	0.94+
Azure SQL DW	TITLE	0.94+
about twice	QUANTITY	0.94+
one data center	QUANTITY	0.94+
SiliconANGLE	ORGANIZATION	0.93+
West Coast	LOCATION	0.92+
one aspect	QUANTITY	0.91+
CTO	PERSON	0.83+
Macintosh	COMMERCIAL_ITEM	0.82+
one more thing	QUANTITY	0.8+
Cloud	TITLE	0.73+
CUBEConversation	EVENT	0.72+
Paxos	ORGANIZATION	0.56+
Redshift	TITLE	0.52+
theCUBE	ORGANIZATION	0.52+

Jagane Sundar, WANdisco | AWS Summit SF 2018

>> Voiceover: Live from the Moscone Center, it's theCUBE. Covering AWS Summit San Francisco 2018. Brought to you by Amazon Web Services. >> Welcome back, I'm Stu Miniman and this is theCUBE's exclusive coverage of AWS Summit here in San Francisco. Happy to welcome back to the program Jagane Sundar, who is the CTO of WANdisco. Jagane, great to see you, how have you been? >> Well, been great Stu, thanks for having me. >> All right so, every show we go to now, data really is at the center of it, you know. I'm an infrastructure guy, you know, data is so much of the discussion here, here in the cloud in the keynotes, they were talking about it. IOT of course, data is so much involved in it. We've watched WANdisco from the days that we were talking about big data. Now it's you know, there's AI, there's ML. Data's involved, but tell us what is WANdisco's position in the marketplace today, and the updated role on data? >> So, we have this notion, this brand new industry segment called live data. Now this is more than just itty-bitty data or big data, in fact this is cloud-scale data located in multiple regions around the world and changing all the time. So you have East Coast data centers with data, West Coast data centers with data, European data centers with data, all of this is changing at the same time. Yet, your need for analytics and business intelligence based on that is across the board. You want your analytics to be consistent with the data from all these locations. That, in a sense, is the live data problem. >> Okay, I think I understand it but, you know, we're not talking about like, in the storage world there was like hot data, what's hot and cold data. And we talked about real-time data for streaming data and everything like that. But how do you compare and contrast, you know, you said global in scope, talked about multi-region, really talking distributed. From an architectural standpoint, what's enabling that to be kind of the discussion today? 
Is it the likes of Amazon and their global reach? And where does WANdisco fit into the picture? >> So Amazon's clearly a factor in this. The fact that you can start up a virtual machine in any part of the world in a matter of minutes and have data accessible to that VM in an instant changes the business of globally accessible data. You're not simply talking about a primary data center and a disaster recovery data center anymore. You have multiple data centers, the data's changing in all those places, and you want analytics on all of the data, not part of the data, not on the primary data center, how do you accomplish that, that's the challenge. >> Yeah, so drill into it a little bit for us. Is this a replication technology? Is this just a service that I can spin up? When you say live, can I turn it off? How do those kind of, when I think about all the cloud dynamics and levers? >> So it is indeed based on active-active replication, using a mathematically strong algorithm called Paxos. In a minute, I'll contrast that with other replication technologies, but the essence of this is that by using this replication technology as a service, so if you are going up to Amazon's web services and you're purchasing some analytics engine, be it Hive or Redshift or any analytics engine, and you want to have that be accessible from multiple data centers, be available in the face of data center or entire region failure, and the data should be accessible, then you go with our live data platform. >> Yeah so, we want you to compare and contrast. What I think about, you know, I hear active-active, speed of light's always a challenge. You know globally, you have inconsistency it's challenging, there's things like Google Spanner out there to look at those. You know, how does this fit compared to the way we've thought of things like replication and globally distributed systems in the past? >> Interesting question. 
So, ours is great for analytics applications, but something like Google Spanner is more like a MySQL database replacement that runs into multiple data centers. We don't cater to that kind of database-transaction type of application. We cater to analytics applications of batch, very fast streaming applications, enterprise data warehouse-type analytics applications, for all of those. Now if you take a look inside and see what kind of replication technology will be used, you'll find that we're better than the other two different types. There are two different types of existing replication technologies. One is log shipping. The traditional Oracle, GoldenGate-type, ship the log, once the change is made to the primary. The second is, take a snapshot and copy differences between snapshots. Both have their deficiencies. Snapshot of course is time-based, and it happens once in a while. You'll be lucky if you can get one day RTO with those sorts of things. Also, there's an interesting anecdote that comes to mind when I say that because the Hadoop folks in their HDFS, implemented a version of snapshot and snapdiff. The unfortunate truth is that it was engineered such that, if you have a lot of changes happening, the snapshot and snapdiff code might consume too much memory and bring down your NameNode. That's undesirable, now your backup facility just brought down your main data capability. So snapshot has its deficiencies. Log shipping is always active/passive. Contrast that with our technology of live data, where you can have multiple data centers filled with data. You can write your data to any of these data centers. It makes for a much more capable system. >> Okay, can you explain, how does this fit with AWS and can it live in multi-clouds, what about on-premises, the whole you know, multi and hybrid cloud discussion? >> Interesting, so the answer is yes. It can live in multiple regions within the same cloud, multiple regions within different clouds. 
It'll also bridge data that exists on your on-prem, Hadoop or other big data systems, or object store systems within Cloud, S3 or Azure, or any of the BLOB stores available in the cloud. And when I say this, I mean in a live data fashion. That means you can write to your on-prem storage, you can also write to your cloud buckets at the same time. We'll keep it consistent and replicated. >> Yeah, what are you hearing from customers when it comes to where their data lives? I know last time I interviewed David Richards, your CEO, he said the data lakes really used to be on premises, now there's a massive shift moving to the public clouds. Is that continuing, what's kind of the breakdown, what are you hearing from customers? >> So I cannot name a single customer of ours who is not thinking about the cloud. Every one of them has a presence on premise. They're looking to grow in the cloud. On-prem does not appear to be on a growth path for them. They're looking at growing in the cloud, they're looking at bursting into the cloud, and they're almost all looking at multi-cloud as well. That's been our experience. >> At the beginning of the conversation we talked about data. How are customers doing you know, exploiting and leveraging or making sure that they aren't having data become a liability for them? >> So there are so many interesting use cases I'd love to talk about, but the one that jumps out at me is a major auto manufacturer. Telematics data coming in from a huge number, hundreds of thousands, of cars on the road. They chose to use our technology because they can feed their West Coast car telematics into their West Coast data center, while simultaneously writing East Coast car data into the East Coast data center. We do the replication, we build the live data platform for them, they run their standard analytics applications, be it Hadoop-sourced or some other analytics applications, they get consistent answers. 
Whether you run the analytics application on the East Coast or the West Coast, you will get the same exact answer. That is very valuable because if you are doing things like fault detection, you really don't want spurious detection because the data on the West Coast was not quite consistent and your analytics application was led astray. That's a great example. We also have another example with a top three bank that has a regulatory concern where they need to operate out of their backup data centers, so-called backup data center, once every three months or so. Now with live data, there is no notion of active data center and backup data center. All data centers are active, so this particular regulatory requirement is extremely simple for them to implement. They just run their queries on one of the other data centers and prove to the regulators that their data is indeed live. I could go on and on about a number of these. We also have a top two retailer who has got such a volume of data that they cannot manage it in one Hadoop cluster. They use our technology to create the live data data lake. >> One of the challenges always, customers love the idea of global but governance, compliance, things like GDPR pop up. Does that play into your world? Or is that a bit outside of what WANdisco sees? >> It actually turns out to be an important consideration for us because if you think about it, when we replicate the data flows through us. So we can be very careful about not replicating data that is not supposed to be replicated. We can also be very careful about making sure that the data is available in multiple regions within the same country if that is the requirement. So GDPR does play a big role in the reason why many of our customers, particularly in the financial industry, end up purchasing our software. >> Okay, so this new term live data, are there any other partners of yours that are involved in this? As always, you want like a bit of an ecosystem to help build out a wave. 
>> So our most important partners are the cloud vendors. And they're multi-region by nature. There is no idea of a single data center or a single region cloud, so Microsoft, Amazon with AWS, these are all important partners of ours, and they're promoting our live data platform as part of their strategy of building huge hybrid data lakes. >> All right, Jagane give us a little view looking forward. What should we expect to see with live data and WANdisco through the rest of 2018? >> Looking forward, we expect to see our footprint grow in terms of dealing with a variety of applications, all the way from batch, Pig scripts that used to run once a day to Hive that's maybe once every 15 minutes to data warehouses that are almost instant and queryable by human beings, to streaming data that pours things into Kafka. We see the whole footprint of analytics databases growing. We see cross-capability meaning perhaps an Amazon Redshift to an Azure SQL DW replication. Those things are very interesting to us, to our customers, because some of them have strengths in certain areas and others have strengths in other areas. Customers want to exploit both of those. So we see us as being the glue for all world-scale analytics applications. >> All right well, Jagane, I appreciate you sharing with us everything that's happening at WANdisco. This new idea of live data, we look forward to catching up with you and the team in the future and hearing more about the customers and everything on there. We'll be back with lots more coverage here from AWS Summit here in San Francisco. I'm Stu Miniman, you're watching theCUBE. (electronic music)
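
The snapshot-and-diff style of replication mentioned in this interview can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the HDFS snapshot code or any vendor's implementation: the stores are plain dictionaries, and the function names `snapshot`, `snap_diff`, `apply_diff` and `demo` are hypothetical. The point it shows is the one made in the conversation: the target only reflects the source as of the last snapshot cycle, so anything written between cycles is invisible there until the next one runs.

```python
def snapshot(store):
    """Take a point-in-time copy of a key/value store."""
    return dict(store)


def snap_diff(old, new):
    """Return (upserts, deletes) that turn snapshot `old` into snapshot `new`."""
    upserts = {k: v for k, v in new.items() if k not in old or old[k] != v}
    deletes = [k for k in old if k not in new]
    return upserts, deletes


def apply_diff(target, upserts, deletes):
    """Bring the target up to date with one snapshot difference."""
    target.update(upserts)
    for key in deletes:
        target.pop(key, None)


def demo():
    source = {"a": 1, "b": 2}
    target = {}
    last = {}

    # Cycle 1: the first snapshot copies everything.
    snap = snapshot(source)
    apply_diff(target, *snap_diff(last, snap))
    last = snap

    # The source keeps changing between cycles.
    source["a"] = 10
    del source["b"]
    source["c"] = 3
    stale = dict(target)  # what a reader of the target sees in the meantime

    # Cycle 2: only the differences since the last snapshot are shipped.
    snap = snapshot(source)
    apply_diff(target, *snap_diff(last, snap))
    return stale, target, source
```

In `demo()`, the `stale` value is the target's view between the two cycles; it still holds the old contents even though the source has already changed, which is the time-based gap the interview describes.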

Published Date : Apr 4 2018

ENTITIES

Entity	Category	Confidence
Amazon	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Amazon Web Services	ORGANIZATION	0.99+
David Richards	PERSON	0.99+
Jagane	PERSON	0.99+
San Francisco	LOCATION	0.99+
Jagane Sundar	PERSON	0.99+
Stu Miniman	PERSON	0.99+
WANdisco	ORGANIZATION	0.99+
GDPR	TITLE	0.99+
Stu	PERSON	0.99+
One	QUANTITY	0.99+
East Coast	LOCATION	0.99+
Both	QUANTITY	0.99+
second	QUANTITY	0.99+
two	QUANTITY	0.98+
MySQL	TITLE	0.98+
West Coast	LOCATION	0.98+
two different types	QUANTITY	0.98+
one	QUANTITY	0.98+
both	QUANTITY	0.98+
one day	QUANTITY	0.98+
Kafka	TITLE	0.98+
S3	TITLE	0.97+
Moscone Center	LOCATION	0.97+
Oracle	ORGANIZATION	0.96+
once a day	QUANTITY	0.95+
Google Spanner	TITLE	0.95+
single data center	QUANTITY	0.95+
NameNode	TITLE	0.94+
hundreds of thousands	QUANTITY	0.94+
today	DATE	0.93+
theCUBE	ORGANIZATION	0.92+
Azure	TITLE	0.91+
WANdisco	TITLE	0.9+
snapdiff	TITLE	0.89+
SQL EDW	TITLE	0.89+
Redshift	TITLE	0.88+
single customer	QUANTITY	0.87+
AWS Summit	EVENT	0.87+
AWS Summit San Francisco 2018	EVENT	0.86+
single region	QUANTITY	0.85+
2018	DATE	0.84+
snapshot	TITLE	0.81+
Jagane	ORGANIZATION	0.76+
three bank	QUANTITY	0.74+
once every 15 minutes	QUANTITY	0.73+
European	LOCATION	0.73+
AWS Summit SF 2018	EVENT	0.71+
once	QUANTITY	0.7+
Cloud	TITLE	0.65+
every three months	QUANTITY	0.64+
GoldenGate	ORGANIZATION	0.57+
of cars	QUANTITY	0.55+
minute	QUANTITY	0.53+
Paxos	ORGANIZATION	0.53+
HTFS	TITLE	0.53+
Hive	TITLE	0.49+
Hadoop	ORGANIZATION	0.41+
BLOB	TITLE	0.4+

Jagane Sundar, WANdisco | AWS re:Invent 2017

>> Announcer: Live from Las Vegas, it's theCube, covering AWS re:Invent 2017, presented by AWS, Intel, and our ecosystem of partners. >> Welcome back to our live coverage. theCube here at AWS re:Invent 2017. Our fifth year covering Amazon Web Services and their massive growth. I'm John Furrier, my co-host Lisa Martin. Here our next guest is CTO of WANdisco, Jagane Sundar. Welcome back to theCube. >> Thank you John. >> You guys are everywhere. WANdisco around the table and all these deals so you guys have been doing extremely well with (indistinct talking) property. What's new? You got some news? >> Yes we do, we recently announced integration with Amazon's AWS Snowball device which gives you the ability to do migration of on-premises workload into the Cloud without down time, and then the end result is a hybrid cloud environment that you can have an active-for-write environment on both sides. That's a unique capability, nobody else can do that today. >> What does it mean for AWS and their customers 'cause they're very customer focused. What are you guys bringing to the table? >> We bring a whole lot of big data workloads, analytics workloads, IoT workloads into their Cloud. And the beauty of the cloud is that you may have a 20 node cluster on-premises but you can run analytics with a 1000 nodes up in the Cloud on demand and pay just for that use. We think it's a very powerful value proposition. >> Where are you seeing the most traction? We've been talking about the massive growth, the 18 billion dollar annual run rate at AWS, and in Andy's conversation with you, John, the other day he said we haven't gotten that big on startups alone. So even some of the things like the advertising that AWS is now starting to do suggests they're going up the stack to the Enterprise and to the C-suite. Where are you guys seeing the most traction with AWS? Is it in the Enterprise space, is it in the start up space, both? 
>> So somewhat because of our roots, what we're finding is that the large majority of Big Data customers and analytics customers from the last two, three years are all considering some form of addition of a cloud to their environment. If it's not a wholesale migration, it's a hybrid environment. It's bursting out into the cloud type of use case and what you're finding is that growth of on-premise Big Data and analytics systems is slowing down because once you get to the Cloud, the plethora of tools you have, the facilities that the scale brings to you is just unmatched. That's the trend we really see in the market. >> We've seen a lot of people go and use it in the marketplace. Juniper Networks for instance, are seeing some activity at the network. Who would have thought a network player is gonna pick it in the Cloud, but this is what industrial-strength cloud looks like. You guys have the active active. Where does that fit in for the customers who wanna leverage the apps, and don't wanna worry about the networks? >> Exactly, the traditional model of thinking was use the Cloud for back up. You have your on-premise stuff. The cheapest way to back it up is into the Cloud. But that's really just scratching the tip of the iceberg. Once you put your data up in the Cloud, you have the ability to have it strongly consistently replicated, then you can do amazing things from the Cloud. You can do a whole new analytics system. Perhaps you want to experiment with Spark in the Cloud and have it on Hive on-premise, that works very well. Now that both sides are actively writeable, you can create partitions of your data that are dynamically generated, written to both sides. These are things that people did not consider. Once they stumble upon it, it just opens their mind to a whole new way of operating. >> Business Park, I've heard some rumors and rumblings in the developer community here that they're running Spark on Lambda. People always hacking with new stuff. 
So Lambda serverless, I think, is coming down. How does that relate to some of things that are driving WANdisco's, how do you relate to that? Does that help you? Does that hurt you guys? >> It helps us, the way we look at it. We're all about strong replication of storage. Lambda has no storage, you talk to the underlying storage of some kind. It's S3, it's EBS volumes, whatever. So long as the storage comes through our system, any growth, any simple easy way for applications to be written is hugely positive for us. >> What are the start ups out there? We've seen a lot of start ups really miss the mark. They misfired on the Cloud and you've seen some stars that have played it well. They've got in the tornadoes as we say. In fact, Geoffrey Moore, I think is rewriting his book Inside the Tornado, which is a management paradigm. But there really seems to be a new business model. You guys are like evergreen at WANdisco because you're unique (indistinct talking) property. How are you guys working with that business model and what are some of the things you're seeing with start ups and companies who are trying to play the cloud but are misfiring? >> Right so WANdisco as you know stands for Wide Area Network Distributed Computing, and the Cloud is like a huge bonus to it. It's all about the Wide Area Network. We are now consolidating a bunch of work in the cloud, but guess what? It's going to go back out to the edge in some way 'cause the edges are getting smarter. You need replication between those. We see a lot of that coming up in the next two, three, five years. IoT workloads and use cases all involve somewhat of edge smart computing. We replicate between those really well. >> Lisa, we always talk about the trend is your friend. In your case, Cloud is your friend. >> Indeed, it is. 
The Cloud is all about wide area network computing and we are the ones who can really replicate-- >> How does a customer know what to do when it comes down to getting involved with WANdisco? It's not obvious. Spell it out, why do they need you guys? When do you get involved? What specific things should be red flags to a potential customer or a customer who says I'm gonna go in on the Cloud. Unpack that. >> Let me give you a simple example. We look at Amazon S3, it's a Cloud service storage. But do you know that it's actually on a per region basis. When you create a bucket to put objects into the bucket, it's located in one region. If you want it replicated elsewhere, they have cross-region replication which is an eventually consistent replication system that doesn't give you the consistent results that you want. If you have such a situation, employing our technology immediately gives you consistent replication. Be it Cloud regions, Cloud to Cloud or on-premise to Cloud. The end result is the minute you step into replication across the WAN, every solution out there doesn't do it consistently and that's our core-- >> And that's your unique IP. >> Indeed, it is. >> Okay so I'm seeing Amazon racing to roll out regions. They got one coming in China, one in the Middle East. That's a big part of the strategy. Does that help you or what does that do? >> Absolutely it helps us a great deal, partly because customers now do not look at their applications as single-region applications. That doesn't fly anymore. The notion that my banking app cannot work because a data center went down is just not acceptable in the modern world anymore. The fact that we depend so much on the services means they need to be up all the time. More regions, more data replication. That's why we step in. >> So that sounds like a lot of symbiosis here. You talk about S3 and replication challenges. So tell us how WANdisco is actually helping AWS. 
That's one example, but help us understand the symbiosis in your relationship with AWS. >> The best example I can give you is a large travel service company on the internet. They had a Hadoop infrastructure that was growing out of control. They wanted to manage costs by moving some workloads to Amazon but didn't really know where to start, because you can't do such a thing as take a copy of the data, ship it off on a Snowball into the Cloud and tell the users of that data, stop writing to it now. It's gonna be available in the Cloud a week, 10 days from now. Then you can start writing again. That's just not acceptable. This is a live data problem. The problem here is that you need to be able to ship out your data on Snowballs and continue to write to the on-premise storage. When it shows up in the Cloud, start writing to that. Both are consistently replicated, and you have a proper hybrid Cloud environment. So this was a great bonus to them. As for AWS, they watched this and they look at it as an easy way to move the vast majority of data from on-premise big data analytics systems. >> Have they been a fuel to your fire, in a sense that they've been on this incredible acceleration of their innovation and, as Andy Jassy said many times to you, John, it's speed and customer focus. So how has their accelerated pace of innovation helped fuel WANdisco's, like you were saying, unique value? How have they really ignited that? >> So they started off with just plain Snowball two years ago. Last year they announced Snowball Edge, which is a pretty improved device. Now they have in the works capability to do some compute on those boxes. That's very interesting to us. Now our services can reside on the Snowball. It arrives at a customer site. He plugs it in, turns it on: instant replication capabilities. Those are fueled both by Amazon's drive and extreme speed and our own capabilities. 
So Amazon is a wonderful partner for us, partly because their charge towards innovation is quite amazing. >> Snowball, Snowmobile, it's gonna be a white Christmas for you guys. Business is good. >> Business is great. >> Okay, final question. What are the conversations you're having here this year? Share with us some of the quick conversations you're having in the hallways, meetings, Amazon execs, partners. >> So most of the conversations are about moving workloads from on-premise into the Cloud. I personally am very interested in IoT use cases because I see the volume of data and the ability for us to do some interesting replications as being critical. That's where our focus is right now. >> Jagane Sundar, CTO of WANdisco. Big announcement, partnership with Amazon Web Services and Snowball replication active-active. Great solution for replication. You got regions across regions. Check out WANdisco. Thanks for coming by, great to see you again. Congratulations on all your success. This is theCube, live coverage day one. It's coming down to an end. The hall's open, we got two more days of packed two Cubes. Stay tuned for more, we got some great guests coming up, stay with us. (uptempo techno music)

Published Date : Nov 29 2017

SUMMARY :

It's theCube covering AWS re:Invent 2017 Welcome back to our live coverage. so you guys have been doing extremely well a hybrid cloud environment that you can have an active What are you guys bringing to the table? that you may have a 20 node cluster on-premises that fit AWS and Andy's conversation with you John the plethora of tools you have, Where does that fit in for the customers you have the ability to have it strongly consistently Does that hurt you guys? you talk to the underlying storage of some kind. and you've seen some stars that have played it well. and the Cloud is like a huge bonus to it. Lisa, we always talk about the trend is your friend. and we are the ones who can really replicate-- Spell it out, why do they need you guys? The end result is the minute you step Does that help you or what does that do? The notion that my banking app cannot work the symbiosis with your relationship with AWS. The problem here is that you need to be able to ship out many times to you John. Now our services can reside on the Snowball, it's gonna be a white Christmas for you guys. What are the conversations you're having here So most of the conversations are about moving workloads Thanks for coming by, great to see you again.

ENTITIES

Entity	Category	Confidence
Lisa Martin	PERSON	0.99+
Geoffrey Moore	PERSON	0.99+
Andy Jassy	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
John	PERSON	0.99+
Jagane Sundar	PERSON	0.99+
Jugane Sundar	PERSON	0.99+
China	LOCATION	0.99+
Andy	PERSON	0.99+
Amazon Web Services	ORGANIZATION	0.99+
Last year	DATE	0.99+
Lisa	PERSON	0.99+
WANdisco	ORGANIZATION	0.99+
both sides	QUANTITY	0.99+
fifth year	QUANTITY	0.99+
Inside the Tornado	TITLE	0.99+
two years ago	DATE	0.99+
1000 nodes	QUANTITY	0.99+
Adobe	ORGANIZATION	0.99+
Middle East	LOCATION	0.99+
both	QUANTITY	0.99+
10 days	QUANTITY	0.99+
18 billion dollar	QUANTITY	0.98+
Both	QUANTITY	0.98+
this year	DATE	0.98+
CTO	PERSON	0.98+
one example	QUANTITY	0.98+
20 node	QUANTITY	0.98+
Intel	ORGANIZATION	0.98+
three	QUANTITY	0.98+
five years	QUANTITY	0.97+
Christmas	EVENT	0.97+
one region	QUANTITY	0.97+
two Cubes	QUANTITY	0.97+
Spark	TITLE	0.97+
Las Vegas	LOCATION	0.96+
Cloud	TITLE	0.96+
one	QUANTITY	0.95+
Juniper Networks	ORGANIZATION	0.95+
day one	QUANTITY	0.94+
Lando	ORGANIZATION	0.94+
three years	QUANTITY	0.93+
a week	QUANTITY	0.92+
S3	TITLE	0.91+
today	DATE	0.91+
theCube	COMMERCIAL_ITEM	0.88+
two more days	QUANTITY	0.88+
single region	QUANTITY	0.88+
Sea Suites	ORGANIZATION	0.86+
Snowball	ORGANIZATION	0.86+
Snowball Edge	COMMERCIAL_ITEM	0.84+
two	QUANTITY	0.79+
Snowball	COMMERCIAL_ITEM	0.79+
re:Invent 2017	EVENT	0.78+
mobile	COMMERCIAL_ITEM	0.78+
Invent 2017	EVENT	0.76+

Jagane Sundar, WANdisco | BigData NYC 2017

>> Announcer: Live from midtown Manhattan, it's theCUBE, covering BigData New York City 2017, brought to you by SiliconANGLE Media and its ecosystem sponsors. >> Okay welcome back everyone here live in New York City. This is theCUBE special presentation of our annual event with theCUBE and Wikibon Research called BigData NYC, it's our own event that we have every year, celebrating what's going on in the big data world now. It's evolving to all data, cloud applications, AI, you name it, it's happening. In the enterprise, the impact is huge for developers, the impact is huge. I'm John Furrier, cohost of the theCUBE, with Peter Burris, Head of Research, SiliconANGLE Media and General Manager of Wikibon Research. Our next guest is Jagane Sundar, who's the CTO of WANdisco, Cube alumni, great to see you again as usual here on theCUBE. >> Thank you John, thank you Peter, it's great to be back on theCUBE. >> So we've been talking the big data for many years, certainly with you guys, and it's been a great evolution. I don't want to get into the whole backstory and history, we covered that before, but right now is a really, really important time, we see you know the hurricanes come through, we see the floods in Texas, we've seen Florida, and Puerto Rico now on the main conversation. You're seeing it, you're seeing disasters happen. Disaster recovery's been the low hanging fruit for you guys, and we talked about this when New York City got flooded years and years ago. This is a huge issue for IT, because they have to have disaster recovery. But now it's moving more beyond just disaster recovery. It's cloud. What's the update from WANdisco? You guys have a unique perspective on this. >> Yes, absolutely. So we have capabilities to replicate between the cloud and Hadoop multi data centers across geos, so disasters are not a problem for us. And we have some unique technologies we use. 
One of the things we do is we can replicate in an active-active mode between different cloud vendors, between cloud and on-prem Hadoop, and we are the only game in town. Nobody else can do that. >> So okay let me just stop right there. When you say the only game in town I got a little skeptic here. Are you saying that nobody does active-active replication at all? >> That is exactly what I'm saying. We had some wonderful announcements from Hortonworks, they have a great product called the Dataplane. But if you dig deep, you'll find that it's actually an active-passive architecture, because to do active-active, you need this capability called the Paxos algorithm for resolving conflict. That's a very hard algorithm to implement. We have over 10 years' experience in that. That's what gives us our ability to do this active-active replication, between clouds, between on-prem and cloud. >> All right so just to take that a step further, I know we're having a CTO conversation, but the classic cliche is skate to where the puck is going to be. So you kind of didn't just decide one morning you're going to be the active-active for cloud. You kind of backed into this. You know the world spun in your direction, the puck came to you guys. Is that a fair statement? >> That is a very fair statement. We've always known there's tremendous value in this technology we own, and with the global infrastructure trends, we knew that this was coming. It wasn't called the cloud when we started out, but that's exactly what it is now, and we're benefiting from it. >> And the cloud is just a data center, it's just, you don't own it. (mumbles) Peter, what's your reaction to this? Because when he says only game in town, implies some scarcity. >> Well, WANdisco has a patent, and it actually is very interesting technology, if I can summarize very quickly. 
You do continuous replication based on writes that are performed against the database, so that you can have two writers and two separate databases and you guarantee that they will be synchronized at some point in time because you guarantee the writing of the logs and the messaging to both locations >> Absolutely. >> in order, which is a big issue. You guys put a stamp on the stuff, and it actually writes to the different locations with order guaranteed, and that's not the way most replication software works. >> Yes, that's exactly right. That's very hard to do, and that's the only way for you to allow your clients in different data centers to write to the same data store, whether it's a database, a Hadoop folder, whether it's a bucket in a cloud object store, it doesn't matter. The core fact remains, the Paxos algorithm is the only way for you to do active-active replication, and ours is the only Paxos implementation that can work over the >> John: And that's patented by you guys? >> Yes, it's patented. >> And so for someone to replicate that, they'd have to essentially reverse engineer and have a little twist on it to get around the patents. Are you licensing the technology or are you guys hoarding it for yourselves? >> We have different ways of engaging with partners. We are very reasonable with that, and we work with several powerful partners >> So you partner with the technology. >> Yes. >> But the key thing, John, in answer to your question is that it's unassailable. I mean there's no argument that as companies move more towards a digital way of doing things, largely driven by what customers want, your data becomes more of an asset. As your data becomes more of an asset, you make money by using that data in more places, more applications and more times. 
That is possible with data, but the problem you end up with consistency issues, and for certain applications, it's not an issue, you're basically writing, or if you're basically reading data it's not an issue. But the minute that you're trying to write on behalf of a particular business event or a particular value proposition, then now you have a challenge, you are limited in how you can do it unless you have this kind of a technology. And so this notion of continuous replication in a world that's going to become increasingly dependent upon data, data that is increasingly distributed, data that you want to ensure has common governance and policy in place, technologies like WANdisco provides are going to be increasingly important to the overall way that a business organizes itself, institutes its work and makes sure it takes care of its data assets. >> Okay, so my next question then, thanks for the clarification, it's good input there and thanks for summarizing it like that, 'cause I couldn't have done that. But when we last talked, I always was enamored by the fact that you guys have the data center replication thing down. I always saw that as a great thing for you guys. Okay, I get that, that's an on-premise situation, you have active-active, good for disaster recovery, lot of use cases, people should be beating down your door 'cause you have a better mousetrap, I get that. Now how does that translate to the cloud? So take me through why the cloud now fits nicely with that same paradigm. >> So, I mean, these are industry trends, right. What we've found is that the cloud object stores are very, very cost effective and efficient, so customers are moving towards that. They're using their Hadoop applications but on cloud object stores. Now it's trivial for us to add plugins that enable us to replicate between a cloud object store on one side, and a Hadoop on the other side. It could also be another cloud object store from a different cloud provider on the other side. 
Once you have that capability, now customers are freed from lock-in from either a cloud vendor or a Hadoop vendor, and they love that, they're looking at it as another way to leverage their data assets. And we enable them to do that without fear of lock-in from any of these vendors. >> So on the cloud side, the regions have always been a big thing. So we've heard Amazon have a region down here, and there was fix it. We saw at VMworld push their VMware solution to only one western region. What's the geo landscape look like in the cloud? Does that relate to anything in your tech? >> So yes, it does relate, and one of the things that people forget is that when you create an Amazon S3 bucket, for example, you specify a region. Well, but this is the cloud, isn't it worldwide? Turns out that object store actually resides in one region, and you can use some shaky technologies like cross-region replication to eventually get the data to the other region. >> Peter: Which just boosts the prices you pay. >> Yes, not just boost the price. >> Well they're trying to save price but then they're exposed on reliability. >> Reliability, exactly. You don't know when the data's going to be there, there are no guarantees. What we offer is, take your cloud storage, but we'll guarantee that we can replicate it in a synchronous fashion to another region. Could be the same provider, could be another provider. That gives tremendous benefits to the customers. >> So you actually have a guarantee when you go to customers, say with an SLA guarantee? Do you back it up with like money back, what's the guarantee? >> So the guarantees are, you know we are willing to back it up with contracts and such like, and our customers put us through rigorous testing procedures, naturally. But we stand up to every one of those. We can scale and maintain the consistency guarantees that they need for modern businesses. >> Okay, so take me through the benefits. Who wants this? 
Because you can almost get kind of sucked into the complexities of it, and the nuances of cloud and everything as Peter laid out, it's pretty complex even as he simplified it. Who buys this? (laughs) I mean, who's the guy, is it the IT department, is it the ops guy, is it the facilities, who... >> So we sell to the IT departments, and they absolutely love the technology. But to go back to your initial statement, we have all these disasters happening, you know, hopefully people are all doing reasonably okay at the end of these horrible disasters, but if you're an enterprise of any size, it doesn't have to be a big enterprise, you cannot go back to your users or customers and say that because of a hurricane you cannot have access to your data. That's sometimes legally not allowed, and other times it's just suicide for a business >> And HPE in Houston, it's a huge plant down there. >> Jagane: Indeed. >> They got hit hard. >> Yep, in those sort of circumstances, you want to make sure that your data is available in multiple data centers spread throughout the world, and we give you that capability. >> Okay, what are some of the successes? Let's talk through now, obviously you've got the technology, I get that. Where's the stakes in the ground? Who's adopting it? I know you do a lot of biz dev deals. I don't know if they're actually OEM-type deals, or they're just licensing deals. Take us through to where your successes are with this technology. >> So, biz dev wise, we have a mix of OEM deals and licenses and co-selling agreements. The strong ones are all OEMs, of course. We have great partnerships with IBM, Amazon, Microsoft, just wonderful partnerships. The actual end customers, we started off selling mostly to the financial industry because they have a legal mandate, so they were the first to look into this sort of a thing. But now we've expanded into automobile companies. 
A lot of the auto companies are generating vast amounts of data from their cars, and you can't push all that data into a single data center, that's just not reasonable. You want to push that data into a single data store that's distributed across the world in just wherever the car is closest to. We offer that capability that nobody else can, so that we've got big auto manufacturers signed up, we've got big retailers signed up for exactly the same capability. You cannot imagine ingesting all that data into a single location. You want this replicated across, you want it available no matter what happens to any single region or a data center. So we've got tremendous success in retail, banking, and a lot of this is through partnerships again. >> Well congratulations, I got to ask, you know, what's new with you guys? Obviously you have success with the active-active. We'll dig into the Hortonworks things to check your comment around them not having it, so we'll certainly look with the Dataplane, which we like. We interviewed Rob Bearden. Love the announcement, but they don't have the active-active, we're going to document that, and get that on the record. But you guys are doing well. What's new here, what's in New York, what are some of your wins, can you just give a quick update on what's going on at WANdisco? >> Okay, so quick recap, we love the Hortonworks Dataplane as well. We think that we can build value into that ecosystem by building a plugin for them. And we love the whole technology. I have wonderful friends there as well. As for our own company, we see all of our, a lot of our business coming from cloud and hybrid environments. It's just the reality of the situation. 
You had, you know, 20 years ago, you had NFS, which was the great upender of all storage, but turned out to be very expensive, and you had, 10 years, seven years ago, you had HDFS come along, and that upended the cost model of NFS and SANs, which those industries were still working their way through. And now we have cloud object stores, which have upended the HDFS model; it's much more cost-efficient to operate using cloud object stores. So we will be there, we have replication products for that. >> John: And you're in the major clouds, you in Azure? >> Yes, we are in Azure. >> Google? >> Jagane: Yes, absolutely. >> AWS? >> AWS, of course. >> Oracle? >> Oracle, of course. >> So you got all the top four companies. >> We're in all of them. >> All right, so here's the next question is, >> And you're also in IBM stuff too. >> Yes, we're built tightly into IBM >> So you've got a pretty strong legacy >> And a monopoly. >> On the mainframe. >> Like the Fibre Channel of replication. (John and Jagane laugh) That was a bad analogy. I mean it's like... Well, I mean Fibre Channel has only limited suppliers 'cause they have unique technology, it was highly important. >> But the basic proposition is, look, any customer that wants to ensure that a particular data source is going to be available in a distributed way, and you're going to have some degree of consistency, is going to look at this as an option. >> Yes. >> Well you guys certainly had a great team under your leadership, it's got great tech. The final question I have for you here is, you know, we've had many conversations about the industry, we like to pontificate, I certainly like to speculate, but now we have eight years of history now in the big data world, we look back, you know, we're doing our own event in New York City, you know, thanks to great support from you guys and other great friends in the community. Appreciate everyone out there supporting theCUBE, that's awesome. But the world's changed. 
So I got to ask you, you're a student of the industry, I know that from knowing you personally. What's been the success formula that keeps the winners around today, and what do people need to do going forward? 'Cause we've seen the train wreck, we've seen the dead bodies in the industry, we've kind of seen what's happened, there've been some survivors. Why did the current list of characters and companies survive, and what's the winning formula in your opinion to stay relevant as big data grows in a huge way from IoT to AI, cloud and everything in between? >> I'll quote Stephen Hawking in this. Intelligence is the capability to adapt to changes. That's what keeps industries, that's what keeps companies, that's what keeps executives around. If you can adapt to change, if you can see things coming, and adapt your core values, your core technology to that, you can offer customers a value proposition that's going to last a long time. >> And in a big data space, what is that adaptive key focus, what should they be focused on? >> I think at this point, it's extracting information from this volume of data, whether you use machine learning in the modern days, or whether it was simple Hive queries, that's the value proposition, and making sure the data's available everywhere so you can do that processing on it, that remains the strength. >> So the whole concept of digital business suggests that increasingly we're going to see our assets rendered in some form as data. >> Yes. >> And we want to be able to ensure that that data is able to be where it needs to be when it needs to be there for any number of reasons. It's a very, very interesting world we're entering into. >> Peter, I think you have a good grasp on this, and I love the narrative of programming the world in real time. What's the phrase you use? It's real time but it's programming the world... Programming the real world. >> Yeah, programming the real world. 
>> That's a huge, that means something completely, it's not a tech, it's a not a speed or feed. >> Well the way we think about it, is that we look at IoT as a big information transducer, where information's in one form, and then you turn it into another form to do different kinds of work. And that big data's a crucial feature in how you take data from one form and turn it into another form so that it can perform work. But then you have to be able to turn that around and have it perform work back in the real world. There's a lot of new development, a lot of new technology that's coming on to help us do that. But any way you look at it, we're going to have to move data with some degree of consistency, we're still going to have to worry about making sure that if our policy says that that action needs to take place there, and that action needs to take place there, that it actually happens the way we want it to, and that's going to require a whole raft of new technologies. We're just at the very beginning of this. >> And active-active, things like active-active in what you're talking about really is about value creation. >> Well the thing that makes active-active interesting is, again, borrowing from your terms, it's a new term to both of us, I think, today. I like it actually. But the thing that makes it interesting is the idea that you can have a source here that is writing things, and you can have a source over there that are writing things, and as a consequence, you can nonetheless look at a distributed database and keep it consistent. >> Consistent, yeah. >> And that is a major, major challenge that's going to become increasingly a fundamental feature of our digital business as well. >> It's an enabling technology for the value creation and you call it work. >> Yeah, that's right. >> Transformation of work. Jagane, congratulations on the active-active, and WANdiscos's technology and all your deals you're doing, got all the cloud locked up. What's next? 
Well, are you going to lock up the edge? You're going to lock up the edge too, the cloud. >> We do like this notion of the edge cloud and all the intermediate steps. We think that replicating data between those systems or running consistent compute across those systems is an interesting problem for us to solve. We've got all the ingredients to solve that problem. We will be on that. >> Jagane Sundar, CTO of WANdisco, back on theCUBE, bringing it down. New tech, whole new generation of modern apps and infrastructure happening in distributed and decentralized networks. Of course theCUBE's got it covered for you, and more live coverage here in New York City for BigData NYC, our annual event, theCUBE and Wikibon here in Hell's Kitchen in Manhattan, more live coverage after this short break.
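
The ordering guarantee discussed in this conversation (two writers, two locations, writes applied everywhere in one agreed order) can be illustrated with a small sketch. This is not WANdisco's implementation and it is not Paxos: the `Sequencer` class below is a hypothetical, single-process stand-in for the agreement step, and the class names, paths, and values are all assumptions made for illustration. It only shows why replicas that apply the same writes in the same agreed order end up identical, even when the network delivers those writes in different orders.

```python
"""Toy model of ordered, active-active replication.

Assumption: a single in-process Sequencer stands in for the agreement
step (Paxos, in the conversation above). Real consensus runs across
machines and tolerates failures; this sketch only shows why an agreed
order keeps replicas identical.
"""
import random


class Sequencer:
    """Hands out one global position per write (hypothetical stand-in for consensus)."""

    def __init__(self):
        self.next_seq = 0

    def agree(self, write):
        seq = self.next_seq
        self.next_seq += 1
        return seq, write


class Replica:
    """Applies writes strictly in agreed order, buffering any that arrive early."""

    def __init__(self, name):
        self.name = name
        self.store = {}
        self.applied = 0      # next sequence number this replica may apply
        self.pending = {}     # seq -> write, delivered but not yet applicable

    def deliver(self, seq, write):
        self.pending[seq] = write
        # Drain every buffered write that is now next in line.
        while self.applied in self.pending:
            key, value = self.pending.pop(self.applied)
            self.store[key] = value
            self.applied += 1


def run_demo(seed=0):
    rng = random.Random(seed)
    sequencer = Sequencer()
    on_prem, cloud = Replica("on_prem"), Replica("cloud")

    # Writers at both sites update the same keys concurrently.
    writes = [("/data/a", "on_prem v1"), ("/data/a", "cloud v1"),
              ("/data/b", "cloud v2"), ("/data/a", "on_prem v3")]
    agreed = [sequencer.agree(w) for w in writes]

    # Each replica receives the agreed writes in a different network order.
    for replica in (on_prem, cloud):
        shuffled = agreed[:]
        rng.shuffle(shuffled)
        for seq, write in shuffled:
            replica.deliver(seq, write)

    return on_prem.store, cloud.store


if __name__ == "__main__":
    a, b = run_demo()
    print(a == b, a)
```

Calling `run_demo()` with different seeds changes the delivery order at each replica but not the final contents of either store, which is the property the speakers describe as "order guaranteed".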

Published Date : Sep 27 2017

SUMMARY :

brought to you by SiliconANGLE Media great to see you again as usual here on theCUBE. Thank you John, thank you Peter, Disaster recovery's been the low hanging fruit for you guys, One of the things we do is we can replicate Are you saying that nobody does because to do active-active, you need this capability the puck came to you guys. and with the global infrastructure trends, And the cloud is just a data center, and the messaging to both locations You guys put a stamp on the stuff, is the only way for you to do active-active replication, or are you guys hoarding it for yourselves? and we work with several powerful partners But the key thing, John, in answer to your question that you guys have the data center replication thing down. Once you have that capability, Does that relate to anything in your tech? and you can use some shaky technologies but then they're exposed on reliability. Could be the same provider, could be another provider. So the guarantees are, you know we are willing to is it the ops guy, is it the facilities, who... you cannot have access to your data. And HPE in Houston, and we give you that capability. I know you do a lot of biz dev deals. and you can't push all that data into a single data center, and get that on the record. and that appended the cost model of NFS and SANs, So you got all Like the fiber channel of replication. But the basic proposition is look, in the big data world, we look back, you know, Intelligence is the capability to adapt to changes. and making sure the data's available everywhere So the whole concept of digital business is able to be where it needs to be What's the phrase you use? That's a huge, that means something completely, that it actually happens the way we want it to, in what you're talking about really is about is the idea that you can have a source here that's going to become increasingly and you call it work. Well you going to lock up the edge? We've got all the ingredients to solve that problem. 
and more live coverage here in New York City

ENTITIES

Entity	Category	Confidence
IBM	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
John	PERSON	0.99+
Jagane Sundar	PERSON	0.99+
Rob Bearden	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Peter Burris	PERSON	0.99+
Jagane	PERSON	0.99+
John Furrier	PERSON	0.99+
Peter	PERSON	0.99+
WANdisco	ORGANIZATION	0.99+
Stephen Hawking	PERSON	0.99+
two writers	QUANTITY	0.99+
Houston	LOCATION	0.99+
New York City	LOCATION	0.99+
Puerto Rico	LOCATION	0.99+
Texas	LOCATION	0.99+
New York	LOCATION	0.99+
AWS	ORGANIZATION	0.99+
Wikibon Research	ORGANIZATION	0.99+
VMworld	ORGANIZATION	0.99+
Florida	LOCATION	0.99+
Google	ORGANIZATION	0.99+
eight years	QUANTITY	0.99+
both	QUANTITY	0.99+
Oracle	ORGANIZATION	0.99+
two separate databases	QUANTITY	0.99+
20 years ago	DATE	0.99+
Hortonworks	ORGANIZATION	0.99+
Cube	ORGANIZATION	0.99+
first	QUANTITY	0.99+
WANdiscos	ORGANIZATION	0.98+
over 10 years'	QUANTITY	0.98+
theCUBE	ORGANIZATION	0.98+
SiliconANGLE Media	ORGANIZATION	0.98+
one form	QUANTITY	0.97+
Wikibon	ORGANIZATION	0.97+
One	QUANTITY	0.97+
today	DATE	0.97+
seven years ago	DATE	0.96+
one	QUANTITY	0.96+
one region	QUANTITY	0.96+
Hadoop	TITLE	0.96+
Hortonworks Dataplane	ORGANIZATION	0.95+
NYC	LOCATION	0.95+
four companies	QUANTITY	0.94+
single region	QUANTITY	0.94+
years	DATE	0.93+
Dataplane	ORGANIZATION	0.91+
single location	QUANTITY	0.91+
single data center	QUANTITY	0.91+
HPE	ORGANIZATION	0.9+
one side	QUANTITY	0.9+
one western	QUANTITY	0.89+
Paxos	TITLE	0.89+
Paxos	OTHER	0.88+
both locations	QUANTITY	0.88+
10 years	QUANTITY	0.88+
BigData	EVENT	0.87+
Azure	TITLE	0.86+

David Richards, WANdisco | AWS Summit 2017

>> Narrator: Live from Manhattan, it's theCUBE, covering AWS Summit New York City 2017, brought to you by Amazon Web Services. >> And welcome back to New York, here. AWS Summit, theCUBE continue our coverage of what's happening here in the Big Apple. I'm John Walls along with Stu Miniman, and what this is is maybe not the most prolific CUBE guest of all time, but he's in the hall of fame. He really is a CUBE MVP for sure. It's good to have David Richards with us, the president, chairman, CEO of WANdisco. Good to see you, sir. >> It's a pleasure to be back again. It feels like home. >> It is like home. We need to get you your own microphone, I think, you know? >> David: I know it. I need my name on the back of the seat or something. >> This isn't quite a home game for you. All right, so you've got an office in Sheffield, England. >> David: Yeah. >> You've got an office out in the valley, Silicon Valley. We got ya right in the middle, I think. >> David: Yeah. >> Almost, don't we? So-- >> Exactly. >> We kind of split the difference for you this one. >> I always tell people I'm recolonizing the United States. I've been here for about 20 years. I can change the accent. >> Right. >> I'll get you all, eventually. >> All right, well, another year or two, we'll see how that works for ya. Big, big, I guess six, seven months for you, right? As far as some acquisitions you've done, some vice partnerships and arrangements you've done. >> Yes, as a business, we've really progressed well in the first half of the year. I've got to be a little bit careful. We've got results coming out September the sixth in London, but we did do a pre-announcement of a business update. We signed a record big data cloud contract with a very large bank for over four million dollars. That was our largest ever contract win. We signed a major retailer who we can't name, obviously, which is another sort of cloud ObjectStore on premises. 
A big data win, and interestingly, we stopped burning cash and investors really like this kind of perfect storm of, 175%, 173% growth in our cloud big data revenue, booking, sorry, combined with a flat cost-base, which meant, first half of last year, burning five point four million dollars down to virtually zero, just $600,000 in the first half. So, investors really like that. We really like that, and it demonstrates that perfect storm of flat cost-base and growing sales. >> David, I'm curious, does working with Amazon, and your customers being on Amazon, does the speed and agility and everything like that contribute to that profitability? >> Well, Amazon kind of changes the game for all vendors, right? Because nobody, it used to be this sort of big four, five, six, whatever it is these days, consulting companies that had to implement ERP systems and all those complex applications. I don't necessarily think they're the people, they're not the go-to people anymore for cloud. So, it's down to uniqueness of technology. Amazon have got such a wide array, we were talking earlier about some of their announcements out today as they continue to go up the stack with applications and so on. So, it does lend itself very well to small vendors with sticky, unique intellectual property and unique products and services that are going to really thrive in this kind of cloud environment. So, we've really enjoyed working with Amazon, but we're also working with the other cloud vendors, as well, and I have to say, when we first saw the Snowmobile and the Snowball, well, actually, the Snowmobile, drive out on stage in New York, was it 12, 18 months ago? It's dog years, so everything goes seven times faster. >> John: Right, right, right. >> I was laughing. I was like, "How on Earth can you possibly use a truck to move data?" But a customer came to us, a prospect came to us the other day, he wanted to move a hundred petabytes of data. 
Now, if you're going to use the public internet to do that, that's going to take a hell of a long time. So, this idea of a mix between physical and digital data movement, when moving to cloud, is actually fascinating. I think it's a really fascinating subject area. One that customers are definitely going to use. >> Yeah, you've got a great vantage point looking at customers' migrations. >> David: Yeah. >> It was actually something big in the keynote talking about, there are so many migrations out there that Amazon released AWS Migration Hub. So, obviously, physics is always a challenge, my legacy mindset. Customers, we heard a customer up onstage and it's usually not lift and shift maybe for the private cloud, but for public cloud, I usually, I need to rewrite, I need to do micro-services. What is the friction for customers, and how are you and Amazon and the other clouds helping customers work through those challenges? >> OK, so, just to take a step back and think about the problems that happen at hyper-scale data movement. So, small-scale data, gigabyte-scale data, the stuff that you typically see in a relational database, they're not particularly big problems. It's kind of minimal outage, press pause, move data, make it consistent, and you're done. You can have a sort of, a small outage, maybe 15 minutes or even a day to move data, but when it gets to hyper-scale, when it gets to petabyte-scale, multi-terabyte-scale data moves, that's when you have a problem, and that's really the problem that we solve. So, the idea that you can move data that's moving and changing without an interruption to service from on-premise to cloud and support a hybrid cloud topology for an elongated period of time is fascinating. 
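Editor's note: David's point about moving a hundred petabytes over the public internet can be made concrete with a rough back-of-the-envelope calculation. The sketch below is only illustrative: the link speeds and the 80% sustained-efficiency figure are assumptions chosen for the example, not numbers from the interview.

```python
PETABYTE = 10**15   # bytes (decimal petabyte)
GIGABIT = 10**9     # bits per second


def transfer_days(data_bytes: float, link_bits_per_second: float,
                  efficiency: float = 0.8) -> float:
    """Days needed to push data_bytes through a link.

    efficiency is an assumed fraction of the nominal link speed that is
    actually sustained (protocol overhead, contention, retries).
    """
    seconds = (data_bytes * 8) / (link_bits_per_second * efficiency)
    return seconds / 86_400  # seconds per day


if __name__ == "__main__":
    # Hypothetical link speeds; the interview does not name any.
    for gbps in (1, 10, 100):
        days = transfer_days(100 * PETABYTE, gbps * GIGABIT)
        print(f"{gbps:>3} Gbit/s link: about {days:,.0f} days")
```

Under these assumptions even a dedicated 10 Gbit/s link needs roughly three years for 100 PB, and the time grows linearly with the amount of data, which is why shipping physical devices becomes attractive at this scale.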
I was listening at an investor conference to the CEO of VMware who was talking about, we're going to be in a situation of hybrid cloud for the next 20, 25 years because, overnight, not everybody can just repurpose every single application that they're running on-premise, whether it's a mainframe application, or a relational data application, or whether it's an ERP application, and repurpose that in cloud overnight. So, we're going to have to gradually move and migrate those applications over. So, it's highly likely we're going to be in a hybrid cloud environment for the foreseeable future, and that's actually fantastic news for us. We're moving, as I said, at scale companies into cloud with transactional data, and nobody else can touch us in terms of the uniqueness of the IP, which is fantastic news for us. >> In terms of just big data in general, Stu has one use for it, I have a different use for it. It's going to live in a lot of different places. How are you responding to different needs within your clients and trying to make them more effective, make them more efficient? And yet, when you're dealing with more and more data, that's a big storm to handle. >> That's a great question. I went to speak a couple of months ago to a new customer of ours who is a major healthcare provider on the east coast, and I kind of said to him, "OK, you've had this Hadoop cluster for the past three years. Why are you calling us? Why now?" Which is the question that I always ask our customers. "Why? What changed? Why are you doing this right now?" And maybe for the past three years they've been putting legal data into the system. That's data, but who cares if you can't get access to it? We can move to telephone. We can move to e-mails. We can go into an archive, into a paper archive even, to find it, but the why now is that they're now putting patient record data, patient information with regulated SLAs into this system, and that really is our sweet spot. 
As you get to, remember that investment thesis, small-scale gigabyte outage is small outage, when you get into petabyte, exabyte-scale, when you've got data sets that are a thousand, a million times greater, it's linear to the quantum of data. That outage becomes a thousand or a million times greater. So, that's kind of intolerable. So, we love it when strategic applications, regardless of what the use case is, we could all have different, it might be patient data, it might be retail information, it might be banking data, it might be customer retention information, when those strategic applications move onto this hyper-scale infrastructure, you have to support RTO and RPO, and that's what we do. >> And is a byte a byte a byte? You have these thousands of needles in haystacks, right? How do you assign value to one as opposed to another? >> So, this is another great question and one that investors kind of ask me a lot. So, we used to model our business from kind of the ground up. So, we take the classic enterprise sales team, you have a sales and marketing organization that's quite large, you would multiply that by their quota and then multiply it by 66% because that's how many of them are going to be successful in selling product. Well, we completely threw that away when we launched WANdisco Fusion, our new technology, early 2016. Then, we moved to a channel-based approach. So, we have IBM, we have an OEM, 5,000 quota-carrying enterprise sales guys at IBM selling our products. That was a fantastic deal for us. We signed it in April 2016, and they've done the first half of this year, and made at least six million dollars in sales that we have also announced, and then, we've got strategic partnerships with Amazon, with Microsoft, with Google, and we model our business by those channels. So, we're not looking for needles in haystacks. 
We don't, we could never hire another, I mean, if we had to come into the market and say, "We need to go and hire 5,000 enterprise sales guys," we'd have to be raising, doing fund-raisers like Uber or something. It'd just be untenable. We couldn't do it. So, we have a product that lends itself very well to a channel-based approach, and that's working very nicely for us. So, we're not looking for, we're just looking for haystacks. Somebody else can go and find the needles. >> John: Find me and you, right? >> Right. >> David, how are your customers managing the pace of change these days? We've said Amazon is an example. It's like every day there's three new services coming out. Are they excited? Are they completely overwhelmed? What do you see these days? >> So, I think it's classic sort of product adoption lifecycle stuff. The sort of technical enthusiasts, they love all this change. The early-stage companies that are implementing this new cloud-based technology, ObjectStore technology and so on, they're managing very well. It's the later-stage companies you might go to and say, "ObjectStore," and they'll go, "What's ObjectStore? We're just getting our head around Hadoop, and Hive, and Pig, and all this other stuff that you were talking about three years ago," and sales guys go in there now and say, "Oh, no, no, no, don't worry about Hadoop. Nobody's going to run Hadoop in the cloud." It's like, "Well, that's what you told me three years ago." So, I think the market's certainly divided. I think you're going to see, as we move up the product adoption lifecycle, you're going to see lots and lots and lots of interesting moves happen. The companies that seem to be owning cloud, I think Alibaba is coming up really fast. We're seeing them doing some interesting things. Obviously, they've got dominance in the Chinese market. Amazon is the first mover, Microsoft's future's dependent on cloud. 
So, they all have their different spin and different take on applications that they're going to run in cloud. I think there is, I think it's a bit like the cellphone industry. There's lots and lots of different plans, lots and lots of different confusing nomenclature, but that's going to settle out in the next couple of years, but there's unquestionably, if you look at the audience here today, unquestionably large-scale movement of applications and data to cloud. >> Well, we appreciate the time, as always. Great to see you. Another notch in your CUBE belt. (laughing) So, congratulations for that, and maybe you can settle in to New York for a day or two. You said your travels have had you flip-floppin' back and forth between England and here. So, maybe you can settle in for a day or two. >> Yeah, I need to replicate myself. I need to put myself in at least two different places at the same time. >> Live data replication right here. (laughing) All right, David, thanks for bein' with us. David Richards. >> Thank you. Thanks guys. >> Back with more here on theCUBE, we continue our coverage of AWS Summit from New York City right after this break. (upbeat music)

Published Date : Aug 14 2017


ENTITIES

Entity	Category	Confidence
Microsoft	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
David	PERSON	0.99+
Google	ORGANIZATION	0.99+
John	PERSON	0.99+
Alibaba	ORGANIZATION	0.99+
John Walls	PERSON	0.99+
Stu Miniman	PERSON	0.99+
April 2016	DATE	0.99+
London	LOCATION	0.99+
Amazon Web Services	ORGANIZATION	0.99+
David Richards	PERSON	0.99+
six	QUANTITY	0.99+
New York	LOCATION	0.99+
15 minutes	QUANTITY	0.99+
Uber	ORGANIZATION	0.99+
WANdisco	ORGANIZATION	0.99+
IBM	ORGANIZATION	0.99+
England	LOCATION	0.99+
New York City	LOCATION	0.99+
two	QUANTITY	0.99+
September	DATE	0.99+
175%	QUANTITY	0.99+
66%	QUANTITY	0.99+
a day	QUANTITY	0.99+
173%	QUANTITY	0.99+
Big Apple	LOCATION	0.99+
Earth	LOCATION	0.99+
VMware	ORGANIZATION	0.99+
seven months	QUANTITY	0.99+
early 2016	DATE	0.99+
five	QUANTITY	0.99+
three years ago	DATE	0.99+
AWS	ORGANIZATION	0.99+
Sheffield, England	LOCATION	0.99+
four million dollars	QUANTITY	0.98+
over four million dollars	QUANTITY	0.98+
United States	LOCATION	0.97+
three new services	QUANTITY	0.97+
zero	QUANTITY	0.97+
ObjectStore	ORGANIZATION	0.97+
today	DATE	0.97+
5,000	QUANTITY	0.97+
thousands of needles	QUANTITY	0.96+
about 20 years	QUANTITY	0.96+
AWS Summit	EVENT	0.96+
first	QUANTITY	0.95+
Silicon Valley	LOCATION	0.95+
Hive	ORGANIZATION	0.95+
first half	QUANTITY	0.95+
AWS Summit New York City 2017	EVENT	0.94+
AWS Summit 2017	EVENT	0.93+
sixth	DATE	0.92+
CUBE	ORGANIZATION	0.9+
first half of last year	DATE	0.89+
5,000 enterprise sales guys	QUANTITY	0.88+
a million times	QUANTITY	0.88+
couple of months ago	DATE	0.88+
$600,000	QUANTITY	0.87+
seven times	QUANTITY	0.87+
theCUBE	ORGANIZATION	0.86+
Snowmobile	ORGANIZATION	0.86+
two different places	QUANTITY	0.85+
a thousand	QUANTITY	0.85+
One	QUANTITY	0.85+
Pig	ORGANIZATION	0.85+
at least six million dollars	QUANTITY	0.84+
past three years	DATE	0.83+
four	QUANTITY	0.83+

Paul Scott-Murphy, WANdisco - Google Next 2017 - #GoogleNext17 - #theCUBE

>> Narrator: You are Cube Alumni. Live from Silicon Valley, it's the Cube. Covering Google Cloud Next 17. >> Welcome back to the Cube's coverage of Google Next 2017. Having a lot of conversations as to how enterprises are really grappling with cloud. You know, move from on premises to public cloud, multi-cloud, hybrid-cloud, all those pieces in between. Happy to welcome to the program a first time guest, Paul Scott-Murphy who's the vice president of product management at WANdisco, thanks so much for joining us. >> Yeah, thanks very much, it's great to be here and join your program. >> Alright, so you know, Paul, I think a lot of our audience probably is familiar with WANdisco, we've had many of your executives on, really dug into your environment for the last few years, usually see you guys a lot of not only the big data shows, we've got Strata coming up next week, last time I did an interview with you guys was at AWS re:Invent. So you know, WAN, replication, data, all those things put together, you've got a big bucket of big data in cloud. Tell us a little bit about kind of your background, your role at the company. >> Okay. So I've been at WANdisco now for about two and a half years. I previously worked for TIBCO Software for a decade. Working out of Asia-Pacific, held the CTO role there for APJ. And joined WANdisco two and half years ago, just as we were entering into the big data market with our replication capabilities. I now run product management for the company and work out of our headquarters here in the Bay area. >> Stu Miniman: Great. And connect with us you know, what you guys are doing at Google, what's the conversations you're having with customers that are attending. >> Yeah, so Google is definitely one of the key strategic partners for WANdisco, obviously particularly in the Cloud space for us. 
We're hosting a booth here for the conference and using that as an opportunity to speak to other vendors and the customers that we have attending the Google conference. Particularly around what we're doing for replication between on premises and cloud environments, and how we support Google Cloud Dataproc, and Google Cloud Storage as well. >> Can you help unpack for us a little bit, where are your customers, give us a sense of the customers, you know they're saying hey, I want to start using this cloud stuff, how are they figuring out what applications stay on premises, what goes to the public cloud, and that data piece is a challenging thing, moving data is not easy, there's a whole data gravity piece that fits into it, maybe you can help walk us through some of the scenarios. >> Yeah, as we're progressing the technology, we're certainly finding a broader and broader range of customers getting interested in what they can do around data replication. The sorts of organizations that we deal with primarily are those who are looking to leverage both on premises and cloud infrastructure. Or those who are moving from a situation where they've been toying with these environments and moving into production-ready scenarios where the demands of enterprise-level SLAs or availability, or the needs around disaster recovery, backup and migration use cases become a lot more dominant for them. The organizations that we work with typically they are larger organizations, we deal a lot with retail, with financial services, telecommunications, with research institutions as well. All of whom have larger needs around taking advantage of cloud infrastructure. 
Of course they all share the same challenge of the availability of their data, where it's sourced from, isn't always necessarily in the cloud, taking advantage of cloud infrastructure then requires them to think about how they make their information available both to their on premises systems and to the cloud environment where they can run perhaps larger analytic workloads against it, or use the cloud services that they would otherwise not have access to. >> One of the challenges we've seen is when we've got kind of that hybrid or multi-cloud environment, you know, managing my data, kind of the whole, you know, orchestrating pieces and getting my arms around how I take care of it and leverage it can be challenging. Is that something you guys help with or are there other partners that get involved, how are customers helping to sort out and mature these environments? >> Yeah it's a big question of course, you've touched on the management of data as a whole and what that means, and how organizations handle that. WANdisco's role in supporting organizations with those challenges is in ensuring that when they need to take advantage of more than one environment or when they need their data to be available in more than one place, they can do that seamlessly and easily. What we purport to do and what we encourage our customers to do with our technology is rather than keeping one copy of data on premises and using it solely there, or copying your data to another location in order that you can act upon it there, we treat those environments as the same and say well, have the best of both worlds. Have your data available in each location, let your applications use it at the local speed and do that without regard to the need for retaining a workflow by which you exchange data between environments. WANdisco's technology can take care of all of that, and to do so it has to do some very smart things under the covers, around consistency and making it work across wide-area networks. 
Makes it particularly suited to cloud environments where we can leverage those underlying capabilities in conjunction with the scale of the cloud which is a native home for data at scale. >> Can you give us some, you know, where do you see customers kind of in this maturation, Dan Green made a statement that today 5% of the data is in the public cloud, so what are some of those barriers that are stopping people from getting more data in the Cloud, is it something that we will just see a massive adoption of data in the cloud, or what's your guys' viewpoint as to where data's going to live, how that movement is happening. >> Yeah, I think longer term the economic advantages of using cloud environments are undeniable. The cost advantages of hosting information in the cloud and the benefits that come from the scalability of those environments is certainly far surpassing the capabilities that organizations can invest in themselves through their own data centers. So that natural migration of data to the cloud is a common theme that we see across all sorts of organizations. But as many people say, data has gravity, and if the majority of your application information resides today in your own environments or in environments outside of the cloud, whether that's internet connected devices, or in points of ingest that reside outside of cloud environments, there's a natural tendency for data to remain in place where they're either ingested or created. What you need to do to better take advantage of cloud environments then is the ability to easily access that data from cloud infrastructure. So the sorts of organizations that are looking to that are those with either burgeoning problems around consuming data at multiple points. They might operate environments that span multiple continents. They might have jurisdictional restrictions around where their data can reside but need to control its flow between separate environments as well. 
So WANdisco can certainly help with all of those problems, the underlying replication technology that we bring to bear is very well suited to it. But we are a part of the overall solution. We're not the full answer to everything. We certainly deal very well with replication and we believe we cover that very well. >> I'm curious when you talk about kind of the dispersion of data and where it's being created, of course edge use cases for things like IoT, are quite a hot topic at that point. Is that something you guys are touching on yet, gets involved in discussions, you know, where does that sit? >> Yeah, definitely. The interesting thing about WANdisco's approach to data replication is that we base it on this foundation of consistency. And using a mathematically proven approach to distributed consensus to guarantee that changes made in one environment are represented in others equally, regardless of where those changes occur. Now whether you apply that to batch-based data storage or streaming environments, or other forms of ingest, is relatively irrelevant as long as you have that same underlying capability to guarantee consistency regardless of where changes occur. If you're talking about IoT environments where you naturally have infrastructure sitting outside of the cloud, and this is the type of infrastructure that needs to reside out of the cloud, right, your edge points where data are captured, where you're consuming information or generating it from devices, perhaps from an automotive vehicle or from an embedded device, some sort of sensor array, whatever that happens to be, these are the types of environments where it means you're generating data outside of the cloud. So if you're looking to use that inside of the cloud itself, you need some way of moving data around, and you need to do that with some degree of consistency between those environments to make sure you're not just challenged with extra copies of information. 
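Editor's note: Paul's description of distributed consensus guaranteeing that changes are "represented in others equally, regardless of where those changes occur" can be illustrated with a toy example. This is a minimal sketch of the general idea only (replicas that apply the same changes in the same agreed order end up identical); it is not WANdisco's implementation, and the operation format, paths, and function names are hypothetical.

```python
from typing import Dict, List, Tuple

# A change is (action, path, value); this format is invented for the example.
Op = Tuple[str, str, str]


def apply_ops(ops: List[Op]) -> Dict[str, str]:
    """Apply a sequence of changes to an empty store and return its final state."""
    state: Dict[str, str] = {}
    for action, path, value in ops:
        if action == "put":
            state[path] = value
        elif action == "delete":
            state.pop(path, None)
    return state


# Two sites change the same path at about the same time.
op_from_edge: Op = ("put", "/sensors/car42", "speed=88")
op_from_cloud: Op = ("delete", "/sensors/car42", "")

# Without agreement, each site applies changes in its own arrival order.
edge_unordered = apply_ops([op_from_edge, op_from_cloud])
cloud_unordered = apply_ops([op_from_cloud, op_from_edge])

# With an agreed order (the thing a consensus protocol decides),
# every site applies exactly the same sequence.
agreed_log = [op_from_cloud, op_from_edge]
edge_ordered = apply_ops(agreed_log)
cloud_ordered = apply_ops(agreed_log)

if __name__ == "__main__":
    print("arrival order, replicas equal:", edge_unordered == cloud_unordered)
    print("agreed order, replicas equal: ", edge_ordered == cloud_ordered)
```

The hard part, which this sketch leaves out entirely, is getting all sites to agree on that single order over an unreliable wide-area network; that is the job of the consensus algorithm.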
>> The other really interesting topic around data that's being discussed at the Google Cloud event is artificial intelligence, machine learning, I'm curious, are your customers involved in that, where do you see that kind of on the radar today? >> Yeah, it's obviously an absolutely critical part of where the IT industry in general is going, and the type of solution that's fed off data. These systems are better as your data set grows. The more information you have, the better they work, and the more capable they become. It's certainly an aspect of how well machine learning technique and artificial intelligence approaches have been adopted in the industry, and the rapid rate of change in that side of IT is driving a lot of the demand for increasing access to data sets. We see some of our customers using that for really interesting things. You might've seen some of the recent news around our involvement in a research project led through the University of Sheffield, looking to use data sets captured from a variety of research institutions and medical environments to solve the problem of identifying and responding to dementia. And it's a great outcome from that type of environment. Through which machine learning techniques are being applied across data sets. What you find though is that because there's a large set of institutions sharing access to data, no single data set is sufficient to support those outcomes, regardless of what intelligence you can place against the machine learning models that you build up. So by enabling the ability to bring those data sets together, have them available in a single location, being the cloud, where larger models can be assessed against the data sets means much better outcomes for those types of environments. >> Okay. 
Paul, in your role of product management, we've been through some of the hot buzz terms out there, how do you help the company identify those trends, focused on the ones that are important to your customers and the kind of feedback loops that you get from them. >> I guess a lot of work in the end is how we do it but we need to listen to customers directly of course, understand what they're looking to do with their information systems. What they're aiming for. Their goals at a business level, what type of value that they want to get out of their data, and how they're approaching that. That's really critical. We also need to look to the industry in general. We're obviously in a very rapidly changing environment where technologies, the organizations that build IT systems, are increasingly adopting new approaches and building systems that simply weren't available days ago. You look at the announcements from Google of late around their video intelligence APIs as a service, their image APIs as well, all new capabilities that organizations today now have access to. So bringing those things together, understanding where the general IT trends are, how that applies to our customers, and what WANdisco can do with the unique value that we bring is really key to the product management role. >> Alright, and Paul, you've been at the show, curious, any cool things you saw, interesting customer conversations that may want to give our audience a flavor of what's going on, why 10 thousand people are excited to be at the event. >> Yeah well it is a very exciting event, just the scale of these types of events run by Google and similar organizations is something in itself to behold. We're really excited to be a part of that. The things that are really interesting for me out of the show tend to be where we see customers or opportunities coming to us, identifying challenges that they can't address without the type of technology that we bring to bear. 
Those tend to be areas where either they're looking to do migration from on premises systems into the cloud which is obviously very strong interest for Google themselves, they need to bring customers in to take better advantage of the services that they have. WANdisco can play a strong role in that. We're seeing a lot of interesting things around the edge too, so all of the ways in which data can be used are always exciting and interesting to see. The combination of technologies like artificial intelligence, like virtual reality, the type of work that WANdisco does also, is certainly going to bring forward I think a new wave of applications and systems that we just hadn't considered even a few years ago. >> Yeah. Lots of really interesting things. There's personal assistants at home and personal assistants that are listening. Okay Google, subscribe to SiliconANGLE on Youtube. We'll be back with lots more coverage here from the Cube, talking about Google Next 2017. You're watching the Cube.

Published Date : Mar 9 2017


ENTITIES

Entity	Category	Confidence
Paul	PERSON	0.99+
Paul Scott-Murphy	PERSON	0.99+
WANdisco	ORGANIZATION	0.99+
Stu Miniman	PERSON	0.99+
Dan Green	PERSON	0.99+
Google	ORGANIZATION	0.99+
University of Sheffield	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Silicon Valley	LOCATION	0.99+
next week	DATE	0.99+
APJ	ORGANIZATION	0.99+
10 thousand people	QUANTITY	0.98+
more than one place	QUANTITY	0.98+
each location	QUANTITY	0.98+
One	QUANTITY	0.98+
5%	QUANTITY	0.98+
single	QUANTITY	0.98+
both worlds	QUANTITY	0.98+
both	QUANTITY	0.98+
one copy	QUANTITY	0.98+
first time	QUANTITY	0.98+
TIBCO Software	ORGANIZATION	0.97+
Strata	TITLE	0.97+
today	DATE	0.97+
Youtube	ORGANIZATION	0.95+
about two and a half years	QUANTITY	0.95+
one	QUANTITY	0.95+
Bay	LOCATION	0.94+
Asia-Pacific	LOCATION	0.93+
Cube	ORGANIZATION	0.93+
Dataproc	ORGANIZATION	0.92+
two and half years ago	DATE	0.91+
more than one environment	QUANTITY	0.88+
Google	EVENT	0.88+
few years ago	DATE	0.86+
one environment	QUANTITY	0.85+
days	DATE	0.77+
Google Cloud Next 17	TITLE	0.77+
Google Next	TITLE	0.73+
last few years	DATE	0.69+
Cube	COMMERCIAL_ITEM	0.66+
Google Next 2017	TITLE	0.66+
Next 2017	TITLE	0.63+
re:Invent	EVENT	0.6+
Google Cloud	TITLE	0.56+
Cloud Storage	TITLE	0.54+
2017	DATE	0.5+
SiliconANGLE	ORGANIZATION	0.47+
Cloud	TITLE	0.33+

Jagane Sundar, WANdisco - BigDataNYC - #BigDataNYC - #theCUBE

>> Announcer: Live from New York, it's theCUBE covering BigData New York City 2016, brought to you by headline sponsors Cisco, IBM, Nvidia, and our ecosystem sponsors. Now here are your hosts, Dave Vellante and Peter Burris. >> Welcome back to theCUBE everybody. This is BigData NYC and we are covering wall to wall, we've been here since Monday evening. We were with Nvidia, talking about deep learning, machine learning. Yesterday we had a full slate, we had eight data scientists up on stage yesterday and then we covered the IBM event last night, the rooftop party. Saw David Richards there, hanging out with him, and wall to wall today and tomorrow. Jagane Sundar is here, he is the CTO of WANdisco, great to see you again Jagane. >> Thanks for having me Dave. >> You're welcome. It's been a while since you and I sat down and I know you were on theCUBE recently at Oracle Headquarters, which I was happy to see you there and see the deals that are going on, you've got good stuff going on with IBM, good stuff going on with Oracle, the Cloud is eating the world as we sort of predicted and knew but everybody wanted to put their head in the sand but you guys had to accommodate that, didn't you? >> We did and if you remember us from a few years ago we were very very interested in the Hadoop space but along the journey we realized that our replication platform is actually much bigger than Hadoop. And the Cloud is just a manifestation of that vision. We had this ability to replicate data, strongly consistent, across wide area networks in different data centers and across storage systems so you can go from HDFS to a Cloud storage system like S3 or Azure WASB and we will do it with strong consistency. And that turned out to be a bigger deal than actually providing just replication for the Hadoop platform. So we expanded beyond our initial Hadoop focus and now we're big in the Cloud. 
We replicate data to many Cloud providers and customers use us for many use cases like disaster recovery, migration, active/active, Cloud bursting, all of those interesting use cases. >> So any time I get you on theCUBE I like to refresh the 101 for me and for the audience that may not be familiar with it but you say strongly consistent, versus you hear the term eventual consistency, >> Jagane: Correct. >> What's the difference, why is the latter inadequate for the applications that you're serving. >> Right so when people say eventually consistent, what they don't remember is that eventually consistent systems often have different data in the different replicas and once in a while, once every five minutes or 15 minutes, they have to run an anti-entropy process to reconcile the differences and entropy is the total randomness right if you go back to your physics, high school physics. What you're really talking about is having random data and once every 10 minutes making it reconcile and the reconciliation process is very messy, it's like last write wins and the notion of time becomes important, how do you keep time accurate between those. Companies like Google have wonderful infrastructure where they have GPS and atomic clocks and they can do a better job but for the regular enterprise user that's a hard problem so often you get wrong data that's reconciled. So asking the same query you may get different answers on your different replicas. That's a bad sign, you want it consistent enough so you can guarantee results. >> Dave: And you've done this with math, right? >> Exactly, our basis is an algorithm called Paxos, which was invented by a gentleman called Leslie Lamport back in '89 but it took many decades for that algorithm to be widely understood. Our own chief scientist spent over a decade developing that, adding enhancements to make it run over the wide area network. 
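Editor's note: Jagane's remark that last-write-wins reconciliation makes "the notion of time" important can be shown with a small example. The sketch below is illustrative only: the values, times, and clock errors are made up, and it models nothing more than two writers whose local clocks disagree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: str
    true_time: float   # when the write really happened (not visible to the system)
    clock_skew: float  # error of the writer's local clock, in seconds

    @property
    def timestamp(self) -> float:
        """The timestamp the writer's skewed local clock attaches to the write."""
        return self.true_time + self.clock_skew


def last_write_wins(a: Write, b: Write) -> Write:
    """Reconcile two conflicting writes by keeping the one with the larger timestamp."""
    return a if a.timestamp >= b.timestamp else b


# The first write really happens earlier, but its writer's clock runs 2 s fast.
first = Write("balance=100", true_time=10.0, clock_skew=+2.0)   # stamped 12.0
# The second write really happens later, but its writer's clock runs 0.5 s slow.
second = Write("balance=250", true_time=11.0, clock_skew=-0.5)  # stamped 10.5

winner = last_write_wins(first, second)

if __name__ == "__main__":
    print("kept after reconciliation:", winner.value)
    print("actually written last:    ", second.value)
```

With these made-up numbers the reconciliation keeps the older value, because the earlier write carries the larger timestamp. Avoiding that dependence on clocks is one reason to agree on the order of writes up front instead of reconciling afterwards.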
The end result is a strongly consistent system, mathematically proven, that runs over the wide area network and it's completely resistant to failure of all sorts. >> That allows you to sort of create the same type of availability, data consistency as you mentioned Google with the atomic clocks, Spanner I presume, is this fascinating, I mean when the paper came out I was, my eyes were bleeding reading it and but that's the type of capability that you're able to bring to enterprises right? >> That's exactly right, we can bring similar capabilities across diverse networks. You can have regular networking gear, time synchronized by NTP, out in the Cloud, things are running in a virtual machine where time drifts most of the time, people don't realize that VMs are pretty bad at keeping time and all you get up in the Cloud is VMs. Across all those environments we can give you strongly consistent replication at the same quality that Google does with their hardware. So that's the value that we bring to the Fortune 500. >> So increasingly enterprises are recognizing that data has an, I don't want to say intrinsic value but data is a source of value in context all by itself. Independent of any hardware, independent of any software. That it's something that needs to be taken care of and you guys have an approach for ensuring that important aspects of it are better taken care of. Not the least of which, is that you can provide an option to a customer who may make a bad technology choice one day to make a better technology choice the next day and not be too worried about dead ending themselves. I'm reminded of the old days when somebody who was negotiating an IBM mainframe deal would put an Amdahl coffee cup in front of IBM or put an Oracle coffee cup in front of SAP. Do you find customers metaphorically putting a WANdisco coffee cup in front of those different options and say these guys are ensuring that our data remains ours? 
>> Customers are a lot more sophisticated now, the scenarios that you pointed out are very very funny but what customers come to us for is the exact same thing, the way they ask it is, I want to move to Cloud X, but I want to make sure that I can also run on Cloud Y and I want to do it seamlessly without any downtime on my on-prem applications that are running. We can give them that. Not only are they building a disaster recovery environment, often they're experimenting with multiple Clouds at the same time and may the better Cloud win. That puts a lot of competition and pressure on the actual Cloud applications they're trying. That's a manifestation in modern Cloud terms of the coffee cup competitor in the face that you just pointed out. Very funny but this is how customers are doing it these days. >> So are you using or are they starting to, obviously you are able to replicate with high fidelity with strong fidelity, strong consistency, large volumes of data. Are you starting to see customers, based on that capability actually starting to redesign how they set up their technology plant? >> Absolutely, when customers were talking about hybrid Cloud which was pretty well hyped a year or so ago, they basically had some data on-prem and some other data in the Cloud and they were doing stuff but what we brought to them was the ability to have the same data both on-prem and in the Cloud, maybe you had a weekly analytics job that took a lot of resources. You'd burst that out into the Cloud and run it up there, move the result of that analytics job back on-prem. You'd have it with strong consistency. The result is that true hybrid Cloud is enabled only when you have the same exact data available in all of your Cloud locations. We're the only company that can provide that so we've got customers that are expanding their Cloud options because of the data consistency we offer. >> And those Cloud options are obviously increasing >> Jagane: They are.
>> But there's also a recognition that as we gain more experience with Cloud, that different workloads are better than others as we move up there. Now Oracle with some of their announcements last week may start to push the envelope on that a little bit but as you think about where the need for moving large volumes of data with high, with strong consistency what types of applications do you think people are focusing on? Is it mainly big data or are there other application styles or job types that you think are going to become increasingly important? >> So we've got much more than big data, one of the big sources of leads for us now is our capability to migrate NetApp filers up into the Cloud and that has suddenly become very important because an example I'd like to give is a big financial firm that has all of its binaries and applications and user data on NetApp filers, the actual data is in HDFS on-prem. They're moving their binaries from the NetApp up into the Cloud in a specific Cloud vendor's equivalent to the filer and the big data part of it from HDFS up into Cloud object store, we are the only platform that can deal with both in the strong consistent manner that I've talked about and we're a single replication platform so that gives them the ability to make this sort of a migration with very low risk. One of the attributes of our migration is that we do it with no downtime. You don't have to take your online, your on-prem environment offline in order to do the migration so they are doing that so we see a lot of business from that sort of migration efforts where people have data in NAS filers, people have data in other non-HDFS storage systems. We're happy to migrate all of those. Our replication platform approach, which we've taken in the last year and a half or so is really paying off in that respect. >> And you couldn't do that with conventional migration techniques because it would take too long, you'd have to freeze the applications?
>> A couple of things, one you'd probably have to take the applications offline, second you'd be using tools of periodic synchronization variety such as rsync and anybody in devops or operations who's ever used rsync across the wide area network will tell you how bad that experience is. It really is a very bad experience. We've got capability to migrate NetApp filer data without imposing a load on the NetApps on-prem so we can do it without pounding the crap out of the NetApp server such that they can't offer service to their existing customers. Very low impact on the network configuration, application configuration. We can go in, start the migration without downtime, maybe it takes two, three days for the data to get up over there because of the network link. After that is done, you can start playing with it up in the Cloud. And you can cut over seamlessly so there's no real downtime, that's the capability we've seen. >> But you've also mentioned one data type, binaries, they can't withstand error propagation. >> Jagane: Absolutely. >> And so being able to go to a customer and say you're going to have to move these a couple times over the course of the next n-months or years, as a consequence of the new technology that's now available and we can do so without error propagation is going to have a big impact on how well their IT infrastructure, their IT asset base runs in five years. >> Indeed, indeed. That's very important. Having the ability to actually start the application, having the data in a consistent and true form so you can start, for example, the database and have it mount the actual data so you can use it up in the Cloud, those are capabilities that are very important to customers. >> So there's another application. If you think about, you tend to be more bulk, the question I'm going to ask is and at what point in time is the low threshold in terms of specific types of data movement. Here's why I'm asking.
IOT data is a data source or is a use-case that has often the most stringent physical constraints possible. Time, speed of light, has an implication but also very importantly, this notion of error propagation really matters. If you go from a sensor to a gateway to another gateway to another gateway you will lose bits along the way if you're not very careful. >> Correct. >> And in a nuclear power plant, that doesn't work that way. >> Jagane: Yeah. >> Now we don't have to just look at a nuclear power plant as an example but there's increasingly industrial IOTs starting to dramatically impact not just life and death circumstances but business success or failure. What types of smaller batch use-cases do you guys find yourselves operating in, in places like IOT where this notion of error control, strong consistency is so critical? >> So one of the most popular applications that use our replication is Spark and Spark Streaming which as you can imagine is a big part of most IOT infrastructure, we can do replication such that you ingest into the closest data center, you go from your server or your car or whatever to the closest data center, you don't have to go multiple hops. We will take care of consistency from there on. What that gives you is the ability to say I have 12 data centers with my IOT infrastructure running, one data center goes down, you don't have a downtime at all. It's only the data that was generated inside the data center that's lost. All client machines connecting to that data center will simply connect to another data center, strong replication continues, this gives you the ability to ingest at very large volumes while still maintaining the consistency and IOT is a big deal for us, yes. >> We're out of time but I got a couple of last minute questions if I may. So when you integrate with IBM, Oracle, what kind of technical issues do you encounter, what kind of integration do you have to do, is it lightweight, heavyweight, middleweight?
>> It's middleweight I would say. IBM is a great example, they have a deep integration with our product and some of the authentication technology they use was more advanced than what was available in open source at that time. We did a little bit of work, and they did a little bit of work to make that work, but other than that, it's a pretty straightforward process. The end result is that they have a number of their applications where this is a critical part of their infrastructure. >> Right, and then road map. What can you tell us about, what should we look for in the future, what kind of problems are you going to be solving? >> So we look at our platform as the best replication engine in the world. We're building an SDK, we expect custom plugins for different other applications, we expect more high-speed streaming data such as IOT data, we want to be the choice for replication. As for the plugins themselves, they're getting easier and easier to build so you'll see wide coverage from us. >> Jagane, thanks so much for coming to theCUBE, always a pleasure to have you. >> Thank you for having me. >> You're welcome. Alright keep it right there everybody, we'll be back to wrap. This is theCUBE, we're live from NYC. We'll be right back. (upbeat electronic music)
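
The last-write-wins reconciliation that Jagane describes depends on every replica agreeing on the time. As a rough illustration only (this is not WANdisco's implementation; the `Write` class, the `last_write_wins` function, and the two-second skew are all made up for this sketch), the Python below shows how a small clock drift on one replica can let an older write overwrite a newer one:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: str
    wall_clock: float  # timestamp taken from the writer's own local clock (seconds)


def last_write_wins(a: Write, b: Write) -> Write:
    """Reconcile two conflicting writes by keeping the larger local timestamp."""
    return a if a.wall_clock >= b.wall_clock else b


# Hypothetical scenario: replica B's clock runs 2 seconds behind real time.
SKEW_B = -2.0

# In real time, A writes first (t=100.0) and B writes afterwards (t=101.0).
write_a = Write("balance=50", wall_clock=100.0)
write_b = Write("balance=75", wall_clock=101.0 + SKEW_B)  # stamped as 99.0

# Reconciliation trusts the timestamps, so the genuinely newer write loses.
winner = last_write_wins(write_a, write_b)
print(winner.value)
```

In this toy run the reconciled value is the stale `balance=50`, which is the kind of "wrong data that's reconciled" the interview refers to. A consensus-based design avoids the issue by agreeing on the order of writes rather than comparing clocks.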

Published Date : Sep 29 2016

ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Peter Burris	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Nvidia	ORGANIZATION	0.99+
Cisco	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Jagane	PERSON	0.99+
Jagane Sundar	PERSON	0.99+
Google	ORGANIZATION	0.99+
NYC	LOCATION	0.99+
two	QUANTITY	0.99+
15 minutes	QUANTITY	0.99+
David Richards	PERSON	0.99+
yesterday	DATE	0.99+
Cloud X	TITLE	0.99+
12 data centers	QUANTITY	0.99+
Cloud Y	TITLE	0.99+
tomorrow	DATE	0.99+
last week	DATE	0.99+
three days	QUANTITY	0.99+
five years	QUANTITY	0.99+
New York	LOCATION	0.99+
SAP	ORGANIZATION	0.99+
Jugane	PERSON	0.99+
One	QUANTITY	0.99+
Leslie Lamport	PERSON	0.99+
both	QUANTITY	0.99+
Yesterday	DATE	0.99+
Monday evening	DATE	0.99+
'89	DATE	0.99+
WANdisco	ORGANIZATION	0.98+
last night	DATE	0.98+
today	DATE	0.98+
Amdahl	ORGANIZATION	0.97+
one day	QUANTITY	0.97+
over a decade	QUANTITY	0.97+
single	QUANTITY	0.97+
Cloud	TITLE	0.95+
Hadoop	TITLE	0.95+
BigData	ORGANIZATION	0.95+
S3	TITLE	0.95+
one	QUANTITY	0.95+
next day	DATE	0.94+
eight data scientists	QUANTITY	0.93+
a year or so ago	DATE	0.9+
five minutes	QUANTITY	0.88+
BigDataNYC	ORGANIZATION	0.88+
once	QUANTITY	0.88+
Spark	TITLE	0.87+
few years ago	DATE	0.87+
one data center	QUANTITY	0.86+
Azure Wasabi	TITLE	0.86+
BigData	EVENT	0.84+
Paxos	OTHER	0.81+
101	QUANTITY	0.79+
one data	QUANTITY	0.77+
once every 10 minutes	QUANTITY	0.77+
last year and a half	DATE	0.77+
CTO	PERSON	0.76+
theCUBE	TITLE	0.75+
next n-months	DATE	0.74+
York City 2016	EVENT	0.71+
Oracle Headquarters	ORGANIZATION	0.67+
couple	QUANTITY	0.63+
Fortune 500	ORGANIZATION	0.58+
many	QUANTITY	0.58+
WANdisco	COMMERCIAL_ITEM	0.55+

David Richards, WANdisco - BigDataNYC - #BigDataNYC - #theCUBE

(silence) (upbeat techno music) >> Narrator: Live from New York, it's theCUBE, covering Big Data NYC 2016, brought to you by headline sponsors: Cisco... IBM... Nvidia, and our ecosystem sponsors. Now, here are your hosts, Dave Vellante and Peter Burris. >> Welcome back to New York City, everybody. This is theCUBE, the worldwide leader in live tech coverage. David Richards is here. He's the CEO of WANdisco, a long time CUBE alum. Great to see you again. >> Great to be back. >> It was good fun hanging out with you last night and a good surprise at the IBM event. There was good action across the street. >> Yeah, you're both looking surprisingly well, actually. >> (Dave laughs) Yes. >> Well, we also heard about the WANdisco versus theCUBE golf tournament, that apparently theCUBE just did really, really well in it and WANdisco went running away with their tail between their legs. >> Well, I talked to Furrier last night. I said, "David Richards was telling me "that he kicked your butt on the golf course." He goes, "Yeah, that's true, actually." (laughter) >> I think I've got some video proof that he actually gave me $20 live on air because, of course, his wallet was empty. (laughter) He was blowing the dust off it, you know? >> Of course, yeah, the body swerve. >> Alligator arms. >> So David, it's, again, great to see you again. You guys have been in this business since day one, and things are evolving. How are things changing for WANdisco? >> So, when we first came into this market, back in the mid-2006, 2007, and then we obviously made a bunch of acquisitions around 2011 and 2012 that took us headlong into the big data marketplace. We pretty much had a completely different business model to our business model now. Then, we had a product called Non-Stop NameNode... My God, can you imagine that?
(Dave laughs) That was very focused on the Hadoop marketplace because, at that time, we believed, like everybody else, that Hadoop was going to take over the world, people were going to move to commoditized servers, open-source software, and solve the huge storage problems that they were going to have from both a cost and efficiency perspective. What I think has happened, or is happening right now, is this evolution, and it really is more of a revolution than an evolution is taking place, where workloads, and we were discussing this last night, are moving at massive scale to cloud, and people are really skipping that step, where we thought they were going to have 5, 10,000 sort of clusters on-premise, but now they have some clusters on-prem, but the bulk of the workloads are actually moving into cloud. I was just discussing with George, off-camera a few minutes ago, why that is happening, and there's a lot of applications that are very efficient. The cloud apps are up there ready to use, off the shelf, and it becomes very simplistic, and to be quite frank, do we really care anymore about all these different open-source components? Is the CIO waking up in the middle of the night thinking, oh, my God, am I going to use Ignite, am I going to use Spark, am I going to use Pig, am I going to use Hive, et cetera, et cetera, et cetera? Of course they're not. They really just want to-- Let's inverse the question to ourselves. If you were going to start a competitor to Uber tomorrow, would you go and build a data center (Dave laughs) or would you just throw up a thousand servers up in the cloud and have done with it, and use all the apps that are up there? Of course, the answer's simple, so that's really what's happening. >> Well, one of the things that I... I wrote a piece of research a million years ago in which I prognosticated, the Dictionary Word of the Day, that the value of middleware was inversely proportional to the degree to which anybody knew anything about it.
(Dave laughs) CIOs are waking up and asking those questions today, which is an indication that they're creating a problem. >> Yep. >> Infrastructure has to do no harm in the organization. I had a CIO friend for years who still asks his chief CTO, "To what degree is infrastructure creating a problem "for me today?" >> Yeah. >> And if it's creating a problem, it's a problem. >> Mm-hmm. >> You don't want to have to know about this stuff, and so to what degree are you helping companies mask some of those... that visibility, so that people can spend less time worrying about the infrastructure? >> So, what we're focused on is a business model that has gone from direct, where we were hiring out a very large direct sales force enterprise, the classic enterprise sales guys that would go knock on doors, knock deals down, go and sell to the Global 1000s, to an indirect model, and we announced that OEM, recently with IBM, IBM Big Replicate, that is under the covers, is WANdisco Fusion, which is a great deal for us. So, our focus very much is on data movement, and data movement between data centers, for companies that want to stay on-prem, and between data centers and in and out of cloud seamlessly, and the word there is seamlessly. So, we worked very hard for the past 18 months on our product such that anybody can go to, if you want to go to the AWS Marketplace, you can, in a few clicks, begin to replicate petabyte-scale in and out of cloud, and we think, and we were discussing this last night, that the hybrid-cloud model is really fascinating, so the ability to take data on-premise, query it in cloud, get complete consistency between on-prem and cloud, but also have all the efficiency in the cloud economics, the elasticity, all the applications that exist in cloud, and I think that model is really interesting, and what's interesting is, I'm not sure that the little guys can execute in that model other than, like we're doing, via an OEM, an indirect model.
So, I'm not sure whether or not, just to go back to the conversation, CIOs are as concerned as they used to be about which Hadoop distribution, for example, they're using. I never hear that question anymore. That question was a 2012, 2013 question. What the CIOs are now concerned about is the economics of cloud, and how do I get that less than $5 per terabyte of data economics that I get in a cloud environment. >> Well, but also increasingly, they're talking about the use cases. >> David: Yeah. >> They want to get their people... They don't want to replicate the Linux or Unix versus NT wars of the 1990s, which was made possible because they were focused on what accounting package am I going to run? Am I going to run it-- >> Yeah. >> on this or that? You know, it was known process, unknown technology. In today's universe, it's unknown process, and they don't want to know as much about the technology, so they're focused on how do I get my men and women focused on use cases that are delivering value for their business. >> Exactly, and the economics question is really simple. Am I going to build a massive, partially used, elastic infrastructure on-premise or am I just going to go and use the elastic infrastructure that already exists in the cloud? That's a no-brainer. That's already happening, and the good news for us, the good news for WANdisco, is it's precisely what we do. It's a data movement problem. Now, I'm bound to say that, but it is actually a data movement problem. In this idea that you have data that changes, active transactional data, as we call it, so the active transactional data movement is a really hard problem. You can't just take a snapshot, right? A file scan and then a snapshot and then move the data, and that's the problem that all the other data replication guys have got. 
That's what IBM, OEM, that's why we've got strategic partnerships with companies like Oracle, like Amazon, and why I'm sure we'll be announcing things in due course with the other cloud vendors, like Google, for example, and Microsoft with their Azure products. They all have that problem, so data movement, in and out of cloud, if it's batch, if it's static, if it's archival data, easy problem to solve. There's a million and one different replication products. >> Dave: Right. >> You can use rsync if you really wanted to do that, but active transactional data, data that changes, data that moves, you know, at petabyte scale, hard problem. That's the problem that we solve. >> Because you've got speed of light problems and you're exposing yourself to data loss-- >> Yep. >> if something goes wrong. >> Peter: Fidelity is a problem. >> An eventual consistency replication model-- >> Yeah, it... >> doesn't work. You can't... If I'm query... We've got a customer that's trying to look at cardiographs, right, in and out of cloud. I mean, would you really feel comfortable in your cardiograph eventually getting into the cloud and being analyzed? You know, would you? You've got to be absolutely crystal clear that the data is completely consistent from the stuff that I'm generating on-premise versus the models that I'm building in cloud. It's vitally important. >> Well, I would imagine there's regulations, in certain industries anyway, that-- >> Oh, yeah, absolutely. >> require that eventual consistency doesn't fit, right? >> Yeah. Well, I mean, at the moment, without us, that's all you got, I'm afraid... >> Okay. >> Well, so, I'm on a mission, let me and I want to get your take on it, that we always talk about elastic infrastructure, which is a given workload, being able to scale up and scale down. >> David: Yeah. >> I think it's time to start talking about plastic infrastructure-- >> David: Oh, yeah, I like it.
>> where a given workload, but a reconfiguration of how that workload is applied because of the value of data, because of integration, because of the need to be able to move in response to business needs. So we talk about plastic infrastructure, where we are reconfiguring based on policy and rules and some other things. What do you think about that? >> I love it, and the reason I love it is because, just to take a step back, the definition of hybrid cloud is... You would imagine it would be relatively simple, but to me, a hybrid means that you have... You know, it's a bit like a hybrid golf club. It's neither a driver nor an iron. It's somewhere in between. So, you have the same workload that can exist both on-premise and in the cloud. I can use both the cloud and on-premise interchangeably. What hybrid cloud actually means, for all the vendors, and this is their dirty little secret, it means that you have some workloads running against some data in the cloud and others that will run against some data on-premise. Now, why do they do that? Because they have to. Because they can't guarantee complete consistency between on-premise and cloud. Our definition of hybrid cloud is exactly the same data, if you want, between on-premise and cloud, and I love this plastic phrase, the idea of repurposing all of those applications, and they can live anywhere. It doesn't matter 'cause it's the same data. >> Yeah, so we have two terms we have to copyright here, plastic infrastructure. >> Plastic... >> What was the other one we heard? >> Data portfolio. >> Data portfolio, yeah. We'll run the tape back >> Plastic infrastructure. (laughter) >> Plastic infrastructure. >> I'm going to steal it (laughs). >> Please do, you know? But the key thing is, as these technologies get more deeply embedded within business and how the business runs, it's incumbent upon the technology leadership to be able to rapidly be able to reconfigure the infrastructure in response to what the business needs. 
That's not elasticity. >> Yeah. >> That's plasticity. >> I love it, absolutely. (Peter laughs) And I think you're touching on something that's changing, and what we discussed earlier, which is that CIOs aren't waking up in the middle of the night thinking, am I going to use Pig or Hive or any of those other open source components. They're thinking about the applications that they're going to build. How am I actually going to start using this data? And I think the agenda's kind of moved on, and walking around the whole... There's still a little bit of confusion. You still have people talking about infrastructure like it really still matters. I'm not absolutely sure it does. >> Well, so let's talk about that. We got a few minutes or something like that. >> Dave: It matters when it breaks, you know? >> What's that? >> It matters when it breaks. >> It sure does matter when it breaks. >> You know, but otherwise, nobody wants to think about it. >> No, yeah, because like I said earlier, it's the degree to which-- >> We have time, but I want to explore the new distribution model as well. >> Yeah, go ahead. >> Let me do that, get that out, tick that box, if I can. Help me understand, David, how it all works. So you, the partnership with IBM and others, you mentioned Amazon, how does it work? You are in the IBM cloud offering? IBM is actually selling that offering? Is it a branded IBM product? >> So, it's in the big data analytics and cloud offerings. So, at the moment, IBM are very focused, as you know, on owning the platform. IBM, as a company, have to own the platform. >> Dave: Yeah, absolutely. >> So, I'm delighted to say that we're embedded into their platform. Now, they had a big launch of some products last night. >> Yeah. >> I know that they were talking about IBM Big Replicate, which is 100% white label OEM of WANdisco Fusion to solve some very specific problems, primarily around data movement.
So, at the hybrid cloud, how do I punch data out into clouds, run the analytics against it, and be sure that I'm going to get the right results? That's what Big Replicate solves, and also, they're moving into mixed environments, whether they're NetApp, just kind of Teradata environment, SAS-based environments, or whether a customer already has an existing distribution of, say, Cloudera or Hortonworks, so they can live alongside that, so we can replicate data between existing deployments, where they may have already made a strategic decision to go with one of those distributions, and also be able to migrate not just into IBM Big Insights, but also into their cloud offering, so that's a great deal for us. We're not... They're selling it themselves. I mean, obviously we've done a lot of field enablement, trained 5,000 or so IBM sales reps, and, you know, if a small company like WANdisco, or a small company like virtually any of the vendors in there that are not in the Global 1000 list, the go-to market has to be indirect. >> And so you're... Totally agree, and so you're basically, if I understand it correctly, you're moving what are conventional filers into the cloud. Customers are doing that. >> Oh. >> How fast is that happening and why are they doing that? >> My, God. I mean, we have not announced this product yet, but we're in the middle of launching it. It's, at scale, moving petabyte-scale data from, and this is transactional data, so it's a hard problem to solve, right, so it's an active data... It's an active transactional data replication problem. So, a lot of... The dirty little secret in the cloud is that a lot of those NFS filers have not moved yet-- >> Right. >> And why haven't they moved? 'Cause they can't. Because you can't just...
You know, if you were to travel, one of the customaries of banks and travel companies is they can't press pause in their organization, do a file scan that's going to take six months, and then turn it back on again, and hey, presto, it's in the cloud. You can't do that. So, you kind of have to... Every single migration of those filers, of any sort of data, is a hybrid model, so you have to be able to run both on-prem and cloud while that migration is happening, and there, I can tell you, are a lot, a hell of a lot of NetApp filers that are going to move very soon here, in time. >> Dave: Oh, 'cause that's the problem that you solve. Otherwise, you'd have to freeze everything, which would kill your business, so you can't do it. >> Yeah, so when human beings imagine things, we're always imagining small use cases, small sets, like moving a few files into Dropbox or something, and that's okay that I can't edit those files for the few seconds it takes to move. I took a look at a deal the other day that was 3 billion files. (Dave laughs) Right, 3 billion. You can't even... My brain can't even calculate that, right? That's a three to six month data movement, and Amazon, for example, thought of this product called Snowball, which-- >> Yeah. >> You know, no techy ever believes this story, but, of course, they FedEx a box, a ruggedized hard drive to you essentially, a ruggedized server that you pour your data into and then you mail it back to them and they can put it there. That doesn't work, of course, for transactional data, for data that changes all the time. >> These are hard problems to solve, and our go-to-market, getting back to your question, it is all about indirect, you know? So, AWS, a strategic partnership, that, Oracle, a strategic partnership, that, IBM... And as I said, I'm sure that we'll be doing things with Google and Microsoft soon, and they're the five partnerships that I really care about, to be quite frank with you. >> Mm-hmm.
>> And this comes back to this notion of infrastructure, the value of infrastructure, and just to touch on it for a second, so many years ago, when we were doing client-server, >> David: Mm-hmm. >> We would test it on a local area network and deploy it on a WAN (David laughs) and wonder why it blew up. >> David: Yeah. >> The realities of the speed of light and the practical limitations have a real impact on design, and so where infrastructure still matters is we still have to worry about design, we still have to worry about legacy financial assets, how we're deploying those assets, and I want to come back to this because we were talking earlier about data as an asset, the value of data within the business, and you don't want to be limited by the legacy as you try to find new ways of generating value out of your data, and what you guys are trying to allow is that the data can be moved in response to the use case as opposed to the use case not being made possible because of the legacy decisions about where to put your data. >> David: That's precisely it, and I don't think that any CIO, in their right mind, wants to continue with the huge maintenance costs, maintenance payments they have to make to some of those vendors, some of those NFS-based vendors. They need to shut them down. They have to figure out a way to move them into cloud so you get cloud economics, and also be able to query the data in a massively efficient way. You simply cannot do that at the moment. They simply cannot do that at the moment, so, as I said, as we continue to launch these products in the marketplace, I'm sure you'll see, at scale, some pretty large companies surprising-- You know, the two that spring to my mind are that the regulators in the US and the UK, FINRA and the FCA, are both in the process of moving all into cloud, 100% into cloud, and I would expect to see that trend continue. I mean, the re:Invent...
I don't want to talk about another-- and we're here at Strata, but the AWS re:Invent, I would expect to see several major financial service companies announcing cloud strategy. >> Yeah, and FINRA's a big user of the AWS cloud. They talk about it pretty aggressively, and really interesting use case there. So, yeah, so we got to end. What's next for you guys? You've mentioned you're going to be at re:Invent, you're going to be at World of Watson (laughs)? Where are we going to find you next? >> Both of those. Obviously, the white label with IBM is a really interesting deal for us. I can't talk about deal flow yet 'cause it's our end of quarter at the moment, but I can tell you that they're doing a pretty damn good job of selling, so we're in execution mode at the moment, where we've already announced some key partnerships. There'll be more key partnerships to come, I'm sure. We're obviously chasing deals down with some of the other cloud vendors, and I'd expect to see us announcing some interesting new customer wins in the coming days and weeks. >> Dave: Great. Well, congratulations on the momentum and the renewed strategy. I love it, and I appreciate you coming to theCUBE. >> Always a pleasure. >> All right, keep it right there, buddy. We'll be back with our next guest. This is theCUBE. We're live at Big Data NYC, Strata and Hadoop World. Be right back. (spacey electronica music)
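
David's point that a file scan followed by a copy cannot handle active transactional data can be sketched in a few lines. This is a toy model under stated assumptions, not any vendor's tool: `scan_then_copy`, the in-memory dictionaries, and the `/ledger/...` paths are hypothetical stand-ins for a one-off scan and a live filer that keeps taking writes.

```python
def scan_then_copy(source: dict, writes_during_copy: list) -> dict:
    """Copy a point-in-time scan of `source` while the live system keeps changing."""
    target = dict(source)  # one-off scan: captures only what exists right now
    for path, data in writes_during_copy:
        source[path] = data  # applications keep writing while the copy is in flight
    return target  # what actually arrives at the destination


# Hypothetical on-prem filer contents at the moment the scan starts.
source = {"/ledger/a": "v1", "/ledger/b": "v1"}

# Writes that land on-prem while the (slow) copy is still running.
target = scan_then_copy(source, [("/ledger/a", "v2"), ("/ledger/c", "v1")])

# Paths whose destination copy is missing or out of date once the copy finishes.
stale = sorted(p for p in source if target.get(p) != source[p])
print(stale)
```

Here the destination ends up with an old version of one file and no copy of another, so either the application is paused for the duration or the changes have to be replicated continuously while the migration runs.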

Published Date : Sep 29 2016

ENTITIES

Entity	Category	Confidence
David	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
FCA	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Dave	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Nvidia	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Peter Burris	PERSON	0.99+
FINRA	ORGANIZATION	0.99+
David Richards	PERSON	0.99+
$20	QUANTITY	0.99+
Cisco	ORGANIZATION	0.99+
George	PERSON	0.99+
2012	DATE	0.99+
three	QUANTITY	0.99+
100%	QUANTITY	0.99+
New York City	LOCATION	0.99+
Peter	PERSON	0.99+
WANdisco	ORGANIZATION	0.99+
OAM	ORGANIZATION	0.99+
six months	QUANTITY	0.99+
3 billion	QUANTITY	0.99+
two terms	QUANTITY	0.99+
Uber	ORGANIZATION	0.99+
5,000	QUANTITY	0.99+
US	LOCATION	0.99+
two	QUANTITY	0.99+
Hortonworks	ORGANIZATION	0.99+
FedEx	ORGANIZATION	0.99+
Linux	TITLE	0.99+
mid-2006	DATE	0.99+
Both	QUANTITY	0.99+
both	QUANTITY	0.99+
five partnerships	QUANTITY	0.99+
2013	DATE	0.99+
tomorrow	DATE	0.99+
Unix	TITLE	0.98+
2011	DATE	0.98+
one	QUANTITY	0.98+
six month	QUANTITY	0.98+
5, 10,000	QUANTITY	0.98+
last night	DATE	0.97+
less than $5 per terabyte	QUANTITY	0.97+
Hadoop World	LOCATION	0.97+
1990s	DATE	0.96+
Dropbox	ORGANIZATION	0.96+

David Richards, WANdisco - #AWS - #theCUBE - @DavidRichards

>> Announcer: Live from San Jose, in the heart of Silicon Valley, it's theCUBE. Covering AWS Summit 2016. (upbeat electronic music) >> Hello everyone, welcome to theCUBE. Here, live in Silicon Valley, at Amazon Web Services, AWS Summit, in Silicon Valley. I'm John Furrier, this is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm here with my co-host. Introducing Lisa Martin on theCUBE, new host. Lisa, you look great. Our first guest here is David Richards, CEO of WANdisco. Welcome to theCUBE, good to see you. >> Good to see you, John, as always. >> So, I've promised a special CUBE presentation, $20 bill here that I owe David. We played golf on Friday, our first time out in the year. He sandbagged me, he's a golfer, he's a pro. I don't play very often. There's your winnings, there you go, $20, I paid. (smooching) (laughing) I did not challenge your swing well, so it's been paid. Great fun, good to see you. >> It was great fun and I'm sorry that I cheated a little bit, mirror in the bathroom still running through your ears. >> I love the English style. Like all the inner game and playing music on the course, it was a great time. When we went golfing last week, we were talking, just kind of had a social get-together but we were talking about some things on the industry mind right now. And you had some interesting color around your business. We talked about your strategy of OEMing your core technology to IBM and also you have other business deals. Can you shed some light on your strategy at WANdisco with your core IP, and how that relates to what's going on in this phenom called Amazon Web Services? They've been running the table on the enterprise now and certainly public cloud for years. $10 billion, Wikibon called that years ago. We see that trajectory not stopping but clearly the enterprise cloud is what they want. Do you have a deal with Amazon? Are you talking to them, and how does that impact your business? 
>> Well I mean the wonderful thing is if you go to AWS Marketplace, you go to that front page, we're one of the featured products on the front page of the AWS Marketplace, so I think that tells you that we're pretty strategic with Amazon. We're solving a big problem for them which is the movement of data in and out of public cloud. But you asked an interesting question about our business model. When we first came into the whole big data marketplace we went for the whole direct selling thing like everybody does, but that doesn't give you a lot of operational leverage. I mean we're in accounts with IBM right now, you mentioned earlier, MR technology. At a big automotive company they have 72 enterprise sales guys, 72. We could never get to that scale any time soon. >> And you have relationships too. So it's not like they like, you know, just knocking on doors selling used cars. They are strategic high-end enterprise sales. >> Exactly. That gives us a tremendous amount of operational leverage and AWS is one of the great stories, will be one of the great IT stories of the century. To go from zero to 15 billion. If AWS was an independent company, faster than any other enterprise software company in the history of mankind, is just incredible. >> Yeah, well, enterprise obviously, they care about hybrid cloud, which you know all about through your IBM relationship. Andy Jassy at Amazon, the CEO now of Amazon. Newly announced title, he's certainly SVP, basically he's been the CEO of Amazon. He's been on record, certainly on stage, and on theCUBE saying, why do even companies need data centers? That kind of puts you out of business. You have a data center product, or is the cloud just one big data center? Will there ultimately be no data center at all? What's your thoughts? >> That's a great question. We see the cloud as just one great big data center or actually many great big data centers. 
And how you actually integrate those together, how you move data between data centers, how you arbitrage between cloud vendors. Are you really going to put all your eggs into one basket? You're going to put everything into AWS. Everything into Azure. I don't think you will. I think you'll need to move data around between those different data centers and then how about high availability? How do you solve that problem? Well WANdisco solves that problem as well. >> So a couple of questions for you David. One of the things that Dr. Wood said in the keynote today was friends don't let friends build data centers. So I wanted to get your take on that as well as from an IBM perspective. We just talked about the OEM opportunity that you're working there to get to those large enterprises. Does that mean that you're shifting your focus for enterprise towards IBM? Where does that leave WANdisco and Amazon as we see Amazon making a big push to the enterprise? >> So I think that was some big news that came out last week that was missed largely by the industry, which was the FCA, the financial regulatory authority in the United Kingdom, came out and said, we see no reason why banks cannot move to cloud from a regulatory perspective. That was one of the big fears that we all had which is are banks actually going to be able to move core infrastructure into a public cloud environment? Well now it turns out they can. So we're all in on cloud. I mean, we can see, if you look at the partnerships that we're focused on, it's the sort of four or five cloud vendors. It's the IBM, the AWS, Azure, Oracle, when they finally built that cloud, and so on. They're the key partnerships that we see in the marketplace. That will be our go-to market strategy. That is our go-to market strategy. >> So one of the things that's clear is the data value and you do a lot of replications. 
So one of the things that, I forget which CUBE segment we've done over the years, that's Hurricane Sandy I think it was, in New York City. You guys were instrumental in keeping the up-time and availability. >> Lisa mentioned, Amazon vis-a-vis IBM, obviously two different strategies, kind of converging in on the same customer. Amazon's had problems with availability zones and they're rushing and running like the wind to put up new data centers. They just announced a new data center in India just recently. Andy Jassy and team were out there kicking that off. So they're rushing to put points of presence, if you will, for lack of a better word, around the world. Does that fit into your availability concept and how do customers engage with you guys with specifically that kind of architecture developing very fast? >> I think that's a really great question. There are problems, there have been historic problems with general availability in cloud. There are lots of 15-minute outages and so on that cost billions and billions of dollars. We're working very closely, and I can't say too much about it, with the teams that are focused on enabling availability. Clearly the IBM OEM is very focused on the movement of data for the hybrid cloud, from a data availability perspective. But there's a great deal of value in data that sits in cloud and I think you'll see us do more and more deals around general cloud availability moving forward. >> Is there a specific project on that front that you can share with us where you've really helped a customer gain significant advantage by working with AWS and facilitating those availability objectives, security compliance? 
>> So, one of the big use cases that we see, and it's kind of all happening at once really, is I built an on-premise infrastructure to store lots and lots of data, now I need to run compute and analytics against that data and I'm not going to build a massive redundant infrastructure on-premise in order to do that, so I need to figure out a way to move that data in and out of cloud without interruption to service. And when we are talking about large volumes of data, you simply can't move transactional data in and out of cloud using existing technology. AWS offers something called Snowball where you put it into a ruggedized drive and then you ship it to them, but that's not really streaming analytics is it? Most of our use cases today are involved in either the migration of data from on-premise into cloud infrastructure, or the movement of data on a temporary basis so I can run compute against that data and take advantage of the elastic compute available in cloud. They are really the two major use cases that we see, and we're working with a lot of customers right now that have those exact problems. >> So the majority of your customers are using hybrid cloud more, versus all in the public cloud? >> Hybrid falls into two categories. I'm going to use hybrid in order to migrate data because I need to keep on using it while it's moving. And secondly I need to use hybrid because I need to build a compute infrastructure that I simply can't build behind firewall. I need to build it in cloud. >> So the new normal is the cloud. There was a tweet here that says, database migration, now we can have an Oracle Exadata data dispute that we're ready to throw into the river. (David laughs) Database migration is a big thing and you mentioned it on the first question that moving in and out of the cloud is a top concern for enterprises. This is one of those things, it's the elephant in the room, so to speak. No pun intended AKA Hadoop. 
Moving the data around is a big deal and you don't want to get a roach motel situation where you can check in and can't check out. That is the lock-in that enterprise customers are afraid of with Amazon. Your thoughts there, and what do you guys offer your customers? And if you can give some color on this whole database migration issue, real, not real? >> The big problem that the Hadoop market has had from a growth perspective is applications. And why they had a problem, well it's the concept of data gravity. The way that the AWS execs will look at their business the way that the Azure execs will look at their business at Microsoft. They will look at how much data they actually have. Data gravity. The implication being if I have data then the applications follow. The whole point of cloud is that I can build my applications on that ubiquitous infrastructure. We want to be the kings of moving data around right? Wherever the data lands is where the applications follow. If the applications follow, you have a business. If the applications don't follow, then it's probably a roach motel situation, as you so quaintly put it. But basically the data is temporal. It will move back to where the applications are going to be. So where the applications are, and it's who is going to be the king of applications, will actually win this race. >> So, question, in terms of migration, we're hearing a lot about mass migration. Amazon's even doing partner competency programs for migration. Not to trivialize it, talk to us about some of the challenges that you are helping customers overcome when they sort of don't know where to start when it comes to that data problem? >> If it's batch data, if it's stuff that I'm only going to touch if it's an archive, that I'm only going to touch once in a blue moon, then I can put it into Snowball and I can ship my Snowball device. 
I can sort of press the pause button akin to when I'm copying files into a network drive where you can't edit them, and then wait for two months, three months. Wait for them to turn up in AWS and that's fine. If it's transactional data where maybe 80% of my data set changes on a daily basis and I've got petabyte scale data to move, that's a hard problem. That requires active transactional data migration. That's a big mouthful, but that's really important for run-time transactional data. That's the problem that we solve. We enable customers to move massive-scale active transactional data into cloud without any interruption of service. So I can still use it while it's moving. >> One of the things we were talking about before you came on was the whole global economy situation. I think a year and a half ago, or two years ago, you predicted the housing bubble bursting in London. You're in the London Exchange, you're a public company. Brexit, EU. These are huge issues that are going to impact, certainly North America looking healthy right now but some are saying that there's a big challenge and certainly the uncertainty of the U.S. presidency candidates or lack thereof. The general sentiment in the U.S. We're in a world of turmoil. So specifically the Brexit situation. You guys are in London. How does this impact your business and is that going to happen? Or give us some color and insight into what the countrymen are thinking over there. >> Okay, so, I get asked by, I live here of course, and I've lived here for 19 years. It feels like I'm recolonizing sometimes, I have to say. No, I'm joking. I get asked by a lot of Americans what the situation is with Brexit and why it happened. And for that you have to look at economics. If you sort of take a step back, in Northern Europe nine of the 10 poorest parts of Northern Europe are in the U.K. And one, only one of the top 10 richest parts is in the U.K. and that's London. 
So basically outside of London the U.K. has a really big problem. Those people are dissatisfied. When people are dissatisfied, if they're not benefiting from an economic upturn, if governments make it, like the conservative government for the past four years made huge cuts, those people don't benefit, and they really feel pissed off and they will vote against the government. >> John: So protest vote pretty much? >> Brexit was really, I think, a protest vote. It's people dissatisfied. It's people voting basically anti-immigration which is, being in the U.S., is a really foreign thing to us. >> But there are some implications to business. I mean obviously there's filings, there's legal issues, obviously currency. Have you been impacted positively, negatively and what is the outlook on WANdisco's business going forward with the Brexit uncertainty and/or impact? >> We're in great shape because we buy pounds. We buy labor that's now discounted by 20% in the U.K. I just got back from the U.K. If you want to go on vacation, Americans, anywhere, go to London this summer and go shopping because everything is humongously discounted for us Americans right now. It's a great time to be there. So from a WANdisco perspective-- >> John: How does that affect the housing bubble too? >> I said to you about a year ago that the London housing market was akin to the jewelry shops that existed in Hong Kong a few years ago, where the Chinese used to come over and basically launder money by buying huge diamonds and bars of gold and things. If you look at the London housing market it is primarily fueled by the Saudis and by the Russians who have been buying Hyde Park Corner 100 million pounds, $160 million, well $140 million now, apartments and so on in London. Now seven, and I repeat seven housing funds in the U.K. last week canceled redemptions. Which means that they can foresee liquidity problems coming in those funds. 
I think you're about to see a housing crash in London, the like of which we've never seen before, and I think it would be very sad and I think that will make people really question the Brexit decision. >> John: So sell London property now people? >> Yes. >> Before the crash. >> And go shopping, I heard the go shopping. So following along that, you talked about the significant differential between London and the rest of the U.K. You're from Sheffield, you're very proud of that. You've also been proud of your business really helping to fuel that economy. How do you think Brexit is going to affect WANdisco in your home area of Sheffield. >> I don't think it really will. I think our employees there, relative terms, very well paid. They're working on interesting things. They're working very closely with the AWS team, for example, the S3 team, the MR team. And building our technology, we're liaising very closely with them. They're doing lots of interesting things. I suspect their vacations into Europe and their vacations to the United States have just gone up by about 20% which will reduce the amount of beer that they can drink. It's a big beer drinking part of the world in Sheffield. Sheffield is, in terms of cost of living, is relatively low compared to the rest of the U.K. and I think those people will be pretty happy. >> David, I appreciate you coming on theCUBE. I want to give you the final word here on the segment because you're a chief executive officer of a public company. You've been in the industry for awhile. You've seen the trials and tribulations of the Hadoop ecosystem. Now basically branded as the data ecosystem. As Hortonworks has recently announced, Hadoop Summit is now being called Data Works Summit. They're moving from the word Hadoop to Data. Clearly that's impacting all the trends. Cloud data, mobile is really the key. 
I want you, and I'm sure you get this question a lot, I would like you to take a minute and explain to the audience that's watching, what's this phenom of Amazon Web Services really all about? What's all the hub-bub about? Why is everyone fawning over Amazon now? When you go back five years ago, or 10 years ago when it started, they were ridiculed. I remember when this started I loved it, but they were looked at as just a kind of a tinkering environment. Now they're the behemoth and just on an unstoppable run and certainly the expansion has been fantastic under Andy Jassy's leadership. How do you explain to normal people what's going on at Amazon? Take a minute please. >> So Amazon is, and that's a brilliant question, by the way. Amazon is the best investor-relation story ever, and I mean ever. What Bezos did is never talked about the potential size of the market. Never talked about this thing was going to generate lots of cash. He just said, you know what, we're building this little internet thing. It might, it might not work. It's not going to make any money. And then in the blink of an eye, it's a $15 billion revenue business growing faster than any other part of his business and throwing off cash like there's no tomorrow. It is just the most non-obvious story in technology, in business, of any public company ever. I mean AWS, arguably, as a stand alone entity, is almost worth as much as Oracle. An unbelievable, an unbelievable story and to do that with all the complexity. I mean running a public company with shareholder expectations, with investor relations where you have to constantly be positive about what's going on. For him to do that and never talk about making a profit, never talk about this becoming a multi-billion dollar segment of their business, is the most incredible thing. >> So they've been living the agile. Certainly that's the business story, but they've been living the agile story relative to announcing the slew of new products. 
Basic building blocks S3, EC2 to start with, as the story goes from Andy Jassy himself, and then a slew of new services. It's a tsunami of new services at every event. What is the disruptive enabler? What's the disruption under the hood for Amazon? How do you explain that? >> Well, I mean what they did is they took a really simple concept. They said, okay, storage, how do we make storage completely elastic, completely public, in a way that we can use the public internet to get data in and out of it. Right? That sounds simple. What they actually built underneath the covers was an extremely complex thing called object store. Everybody else in the industry completely missed this. Oracle missed it, Microsoft missed it, everybody missed it. Now we're all playing catch-up trying to develop this thing called object store. It's going to take over, I mean, somebody said to me, what's the relevance of Hadoop in cloud? And you have to ask that question. It's a relevant question. Do you really need it when you've got object store? Show me side-by-side, object store versus every, you know, NetApp, Teradata, or any of those guys. Show me side-by-side the difference between the two things. There ain't a lot. >> Amazon Web Services is a company that can put incumbents out of business. David, thanks so much. As we always say, what inning are we in? It's really a double-header. Game one swept by Amazon Web Services. Game two is the enterprise and that's really the story here at Amazon Web Services Summit in Silicon Valley. Can Amazon capture the enterprise? Their focus is clear. We're theCUBE. I'm John Furrier with Lisa Martin. We'll be right back with more after this short break. (techno music)

Published Date : Jul 27 2016

ENTITIES

Entity	Category	Confidence
David	PERSON	0.99+
Europe	LOCATION	0.99+
IBM	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
John	PERSON	0.99+
London	LOCATION	0.99+
Amazon Web Services	ORGANIZATION	0.99+
FCA	ORGANIZATION	0.99+
Andy Jassy	PERSON	0.99+
David Richards	PERSON	0.99+
Lisa Martin	PERSON	0.99+
80%	QUANTITY	0.99+
India	LOCATION	0.99+
Microsoft	ORGANIZATION	0.99+
$20	QUANTITY	0.99+
two months	QUANTITY	0.99+
zero	QUANTITY	0.99+
Lisa	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
U.K.	LOCATION	0.99+
three months	QUANTITY	0.99+
seven	QUANTITY	0.99+
Hong Kong	LOCATION	0.99+
New York City	LOCATION	0.99+
20%	QUANTITY	0.99+
John Furrier	PERSON	0.99+
Sheffield	LOCATION	0.99+
Silicon Valley	LOCATION	0.99+
$15 billion	QUANTITY	0.99+
Wood	PERSON	0.99+
United States	LOCATION	0.99+
WANdisco	ORGANIZATION	0.99+
$160 million	QUANTITY	0.99+
last week	DATE	0.99+
$140 million	QUANTITY	0.99+
$10 billion	QUANTITY	0.99+
19 years	QUANTITY	0.99+
San Jose	LOCATION	0.99+
Friday	DATE	0.99+

Joel Horwitz, IBM & David Richards, WANdisco - Hadoop Summit 2016 San Jose - #theCUBE

>> Narrator: From San Jose, California, in the heart of Silicon Valley, it's theCUBE. Covering Hadoop Summit 2016. Brought to you by Hortonworks. Here's your host, John Furrier. >> Welcome back everyone. We are here live in Silicon Valley at Hadoop Summit 2016, actually San Jose. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. Our next guest, David Richards, CEO of WANdisco. And Joel Horwitz, strategy and business development, IBM Analytics. Guys, welcome back to theCUBE. Good to see you guys. >> Thank you for having us. >> It's great to be here, John. >> Give us the update on WANdisco. What's the relationship with IBM and WANdisco? 'Cause, you know. I can just almost see it, but I'm not going to predict. Just tell us. >> Okay, so, I think the last time we were on theCUBE, I was sitting with Re-ti-co who works very closely with Joel. And we began to talk about how our partnership was evolving. And of course, we were negotiating an OEM deal back then, so we really couldn't talk about it very much. But this week, I'm delighted to say that we announced, I think it's called IBM Big Replicate? >> Joel: Big Replicate, yeah. We have a big everything and Replicate's the latest addition. >> So it's going really well. It's OEM'd into IBM's analytics, big data products, and cloud products. >> Yeah, I'm smiling and smirking because we've had so many conversations, David, on theCUBE with you on and following your business through the bumpy road or the wild seas of big data. And it's been a really interesting tossing and turning of the industry. I mean, Joel, we've talked about it too. The innovation around Hadoop and then the massive slowdown and realization that cloud is now on top of it. The consumerization of the enterprise created a little shift in the value proposition, and then a massive rush to build enterprise grade, right? And you guys had that enterprise grade piece of it. IBM, certainly you're enterprise grade. 
You have enterprise everywhere. But the ecosystem had to evolve really fast. What happened? Share with the audience this shift. >> So, it's classic product adoption lifecycle and the buying audience has changed over that time continuum. In the very early days when we first started talking more at these events, when we were talking about Hadoop, we all really cared about whether it was Pig and Hive. >> You once had a distribution. That's a throwback. Today's Thursday, we'll do that tomorrow. >> And the buying audience has changed, and consequently, the companies involved in the ecosystem have changed. So where we once used to really care about all of those different components, we don't really care about the machinations below the application layer anymore. Some people do, yes, but by and large, we don't. And that's why cloud for example is so successful because you press a button, and it's there. And that, I think, is where the market is going to very, very quickly. So, it makes perfect sense for a company like WANdisco who've got 20, 30, 40, 50 sales people to move to a company like IBM that have 4 or 5,000 people selling our analytics products. >> Yeah, and so this is an OEM deal. Let's just get that news on the table. So, you're an OEM. IBM's going to OEM their product and brand it IBM, Big Replication? >> Yeah, it's part of our Big Insights Portfolio. We've done a great job at growing this product line over the last few years, with last year talking about how we decoupled all the value-adds from the core distribution. So I'm happy to say that we're both part of the ODPI. It's an ODPI-certified distribution. That is Hadoop that we offer today for free. But then we've been adding not just in terms of the data management capabilities, but the partnership here that we're announcing with WANdisco and how we branded it as Big Replicate is squarely aimed at the data management market today. But where we're headed, as David points out, is really much bigger, right? 
We're talking about support for not only distributed storage and data, but we're also talking about a hybrid offering that will get you to the cloud faster. So not only does Big Replicate work with HDFS, it also works with the Swift object store, which, as you know, is kind of the underlying storage for our cloud offering. So what we're hoping to see from this great partnership is as you see around you, Hadoop is a great market. But there's a lot more here when you talk about managing data that you need to consider. And I think hybrid is becoming a lot larger of a story than simply distributing your processing and your storage. It's becoming a lot more about okay, how do you offset different regions? How do you think through that there are multiple, I think there's this idea that there's one Hadoop cluster in an enterprise. I think that's factually wrong. I think what we're observing is that there's actually people who are spinning up, you know, multiple Hadoop distributions at the line of business for maybe a campaign or for maybe doing fraud detection, or maybe doing log file, whatever. And managing all those clusters, and they'll have Cloudera. They'll have Hortonworks. They'll have IBM. They'll have all of these different distributions that they're having to deal with. And what we're offering is sanity. It's like give me sanity for how I can actually replicate that data. >> I love the name Big Replicate, fantastic. Big Insights, Big Replicate. And so go to market, you guys are going to have bigger sales force. It's a nice pop for you guys. I mean, it's good deal. >> We were just talking before we came on air about sort of a deal flow coming through. It's coming through, this potential deal flow coming through, which has been off the charts. I mean, obviously when you turn on the tap, and then suddenly you enable thousands and thousands of sales people to start selling your products. I mean, IBM, are doing a great job. 
And I think IBM are in a unique position where they own both cloud and on-prem. There are very few companies that own both the on-prem-- >> They're going to need to have that connection for the companies that are going hybrid. So hybrid cloud becomes interesting right now. >> Well, actually, it's, there's a theory that says okay, so, and we were just discussing this, the value of data lies in analytics, not in the data itself. It lies in you've been able to pull out information from that data. Most CIOs-- >> If you can get the data. >> If you can get the data. Let's assume that you've got the data. So then it becomes a question of, >> That's a big assumption. Yes, it is. (laughs) I just had Nancy Handling on about metadata. No, that's an issue. People have data they store they can't do anything with it. >> Exactly. And that's part of the problem because what you actually have to have is CPU slash processing power for an unknown amount of data any one moment in time. Now, that sounds like an elastic use case, and you can't do elastic on-prem. You can only do elastic in cloud. That means that virtually every distribution will have to be a hybrid distribution. IBM realized this years ago and began to build this hybrid infrastructure. We're going to help them to move data, completely consistent data, between on-prem and cloud, so when you query things in the cloud, it's exactly the same results and the correct results you get. >> And also the stability too on that. There's so many potential, as we've discussed in the past, that sounds simple and logical. To do an enterprise grade is pretty complex. And so it just gives a nice, stable enterprise grade component. >> I mean, the volumes of data that we're talking about here are just off the charts. >> Give me a use case of a customer that you guys are working with, or has there been any go-to-market activity or an ideal scenario that you guys see as a use case for this partnership? 
>> We're already seeing a whole bunch of things come through. >> What's the number one pattern that bubbles up to the top? Use case-wise. >> As Joel pointed out, he doesn't believe that any one company just has one version of Hadoop behind their firewall. They have multiple vendors. >> 100% agree with that. >> So how do you create one, single cluster from all of those? >> John: That's one problem you solved. >> That's of course a very large problem. Second problem that we're seeing in spades is I have to move data to cloud to run analytics applications against it. That's huge. That required completely guaranteed consistent data between on-prem and cloud. And I think those two use cases alone account for pretty much every single company. >> I think there's even a third here. I think the third is actually, I think frankly there's a lot of inefficiencies in managing just HDFS and how many times you have to actually copy data. If I looked across, I think the standard right now is having like three copies. And actually, working with Big Replicate and WANdisco, you can actually have more assurances and actually have to make fewer copies across the cluster and actually across multiple clusters. If you think about that, you have three copies of the data sitting in this cluster. Likely, an analyst has dragged a bunch of the same data into other clusters, so that's another multiple of three. So there's a large amount of waste in terms of the same data living across your enterprise. I think there's a huge cost-savings component to this as well. >> Does this involve anything with Project Atlas at all? You guys are working with, >> Not yet, no. >> That project? It's interesting. We're seeing a lot of opening up the data, but all they're doing is creating versions of it. And so then it becomes version control of the data. You see a master or a centralization of data? Actually, not centralize, pull all the data in one spot, but why replicate it? Do you see that going on?
I guess I'm not following the trend here. I can't see the mega trend going on. >> It's cloud. >> What's the big trend? >> The big trend is I need an elastic infrastructure. I can't build an elastic infrastructure on-premise. It doesn't make economic sense to build massive redundancy maybe three or four times the infrastructure I need on premise when I'm only going to use it maybe 10, 20% of the time. So the mega trend is cloud provides me with a completely economic, elastic infrastructure. In order to take advantage of that, I have to be able to move data, transactional data, data that changes all the time, into that cloud infrastructure and query it. That's the mega trend. It's as simple as that. >> So moving data around at the right time? >> And that's transactional. Anybody can say okay, press pause. Move the data, press play. >> So if I understand this correctly, and just, sorry, I'm a little slow. End of the day today. So instead of staging the data, you're moving data via the analytics engines. Is that what you're getting at? >> You use data that's being transformed. >> I think you're accessing data differently. I think today with Hadoop, you're accessing it maybe through like Flume or through Oozie, where you're building all these data pipelines that you have to manage. And I think that's obnoxious. I think really what you want is to use something like Apache Spark. Obviously, we've made a large investment in that earlier, actually, last year. To me, what I think I'm seeing is people who have very specific use cases. So, they want to do analysis for a particular campaign, and so they may just pull a bunch of data into memory from across their data environment. And that may be on the cloud. It may be from a third-party. It may be from a transactional system. It may be from anywhere. And that may be done in Hadoop. It may not, frankly.
>> Yeah, this is the great point, and again, one of the themes on the show is, this is a question that's kind of been talked about in the hallways. And I'd love to hear your thoughts on this. Is there are some people saying that there's really no traction for Hadoop in the cloud. And that customers are saying, you know, it's not about just Hadoop in the cloud. I'm going to put in S3 or object store. >> You're right. I think-- >> Yeah, I'm right as in what? >> Every single-- >> There's no traction for Hadoop in the cloud? >> I'll tell you what customers tell us. Customers look at what they actually need from storage, and they compare whatever it is, Hadoop or any on-premise proprietary storage array and then look at what S3 and Swift and so on offer to them. And if you do a side-by-side comparison, there isn't really a difference between those two things. So I would argue that it's a fact that functionally, storage in cloud gives you all the functionality that any customer would need. And therefore, the relevance of Hadoop in cloud probably isn't there. >> I would add to that. So it really depends on how you define Hadoop. If you define Hadoop by the storage layer, then I would say for sure. Like HDFS versus an object store, that's going to be a difficult one to find some sort of benefit there. But if you look at Hadoop, like I was talking to my friend Blake from Netflix, and I was asking him so I hear you guys are kind of like replatforming on Spark now. And he was basically telling me, well, sort of. I mean, they've invested a lot in Pig and Hive. So if you think now about Hadoop as this broader ecosystem which you brought up Atlas, we talk about Ranger and Knox and all the stuff that keeps coming out, there's a lot of people who are still invested in the peripheral ecosystem around Hadoop as that central point. My argument would be that I think there's still going to be a place for distributed computing kind of projects.
And now whether those will continue to interface through YARN and then down to HDFS, or whether that'll be YARN on, say, an object store or something and those projects will persist on their own. To me that's kind of more of how I think about the larger discussion around Hadoop. I think people have made a lot of investments in terms of that ecosystem around Hadoop, and that's something that they're going to have to think through. >> Yeah. And Hadoop wasn't really designed for cloud. It was designed for commodity servers, deployment with ease and at low cost. It wasn't designed for cloud-based applications. Storage in cloud was designed for storage in cloud. Right, that's with S3. That's what Swift and so on were designed specifically to do, and they fulfill most of those functions. But Joel's right, there will be companies that continue to use-- >> What's my whole argument? My whole argument is that why would you want to use Hadoop in the cloud when you can just do that? >> Correct. >> There's object store out. There's plenty of great storage opportunities in the cloud. They're mostly shoe-horning Hadoop, and I think that's, anyway. >> There are two classes of customers. There were customers that were born in the cloud, and they're not going to suddenly say, oh you know what, we need to build our own server infrastructure behind our own firewall 'cause they were born in the cloud. >> I'm going to ask you guys this question. You can choose to answer or not. Joel may not want to answer it 'cause he's from IBM and gets his wrist slapped. This is a question I got on DM. Hadoop ecosystem consolidation question. People are mailing in the questions. Now, keep sending me your questions if you don't want your name on it. Hold on, Hadoop system ecosystem. When will this start to happen? What is holding back the M and A? >> So, that's a great question.
First of all, consolidation happens when you sort of reach that tipping point or leveling off, that inflection point where the market levels off, and we've reached market saturation. So there's no more market to go after. And the big guys like IBM and so on come in-- >> Or there was never a market to begin with. (laughs) >> I don't think that's the case, but yes, I see the point. Now, what's stopping that from happening today, and you're a naughty boy by the way for asking this question, is a lot of these companies are still very well funded. So while they still have cash on the balance sheet, of course, it's very, very hard for that to take place. >> You picked up my next question. But that's a good point. The VCs held back in 2009 after the crash of 2008. Sequoia's memo, you know, the good times roll, or RIP good times. They stopped funding companies. Companies are getting funded, continually getting funding. Joel. >> So I don't think you can look at this market as like an isolated market like there's the Hadoop market and then there's a Spark market. And then even there's like an AI or cognitive market. I actually think this is all the same market. Machine learning would not be possible if you didn't have Hadoop, right? I wouldn't say it. It wouldn't have a resurgence that it has had. Mahout was one of the first machine learning languages that caught fire from Ted Dunning and others. And that kind of brought it back to life. And then Spark, I mean if you talk to-- >> John: I wouldn't say it creates it. Incubated. >> Incubated, right. >> And created that Renaissance-like experience. >> Yeah, deep learning, Some of those machine learning algorithms require you to have a distributed kind of framework to work in. And so I would argue that it's less of a consolidation, but it's more of an evolution of people going okay, there's distributed computing.
Do I need to do that on-premise in this Hadoop ecosystem, or can I do that in the cloud, or in a growing Spark ecosystem? But I would argue there's other things happening. >> I would agree with you. I love both areas. My snarky comment there was never a market to begin with, what I'm saying there is that the monetization of commanding the hill that everyone's fighting for was just one of many hills in a bigger field of hills. And so, you could be in a cul-de-sac of being your own champion of no paying customers. >> What you have-- >> John: Or a free open-source product. >> Unlike the dotcom era where most of those companies were in the public markets, and you could actually see proper valuations, most of the companies, the unicorns now, most are not public. So the valuations are really difficult to, and the valuation metrics are hard to come by. There are only few of those companies that are in the public market. >> The cash story's right on. I think to Joel's point, it's easy to pivot in a market that's big and growing. Just 'cause you're in the wrong corner of the market pivoting or vectoring into the value is easier now than it was 10 years ago. Because, one, if you have a unicorn situation, you have cash in the bank. So they're flush with cash. Your runway's so far out, you can still do your thing. If you're a startup, you can get time to value pretty quickly with the cloud. So again, I still think it's very healthy. In my opinion, I kind of think you guys have good analysis on that point. >> I think we're going to see some really cool stuff happen working together, and especially from what I'm seeing from IBM, in the fact that in the IT crowd, there is a behavioral change that's happening that Hadoop opened the door to. That we're starting to see more and more IT professionals walk through. In the sense that, Hadoop has opened the door to not thinking of data as a liability, but actually thinking about data differently as an asset.
And I think this is where this market does have an opportunity to continue to grow as long as we don't get carried away with trying to solve all of the old problems that we solved for on-premise data management. Like if we do that, then we're just, then there will be a consolidation. >> Metadata is a huge issue. I think that's going to be a big deal. And on the M and A, my feeling on the M and A is that, you got to buy something of value, so you either have revenue, which means customers, and or intellectual property. So, in a market of open source, it comes back down to the valuation question. If you're IBM or Oracle or HP, they can pivot too. And they can be agile. Now slower agile, but you know, they can literally throw some engineers at it. So if there's no customers and IP, they can replicate, >> Exactly. >> That product. >> And we're seeing IBM do that. >> They don't know what they're buying. My whole point is if there's nothing to buy. >> I think it depends on, ultimately it depends on where we see people deriving value, and clearly in WANdisco, there's a huge amount of value that we're seeing our customers derive. So I think it comes down to that, and there is a lot of IP there, and there's a lot of IP in a lot of these companies. I think it's just a matter of widening their view, and I think WANdisco is probably the earliest to do this frankly. Was to recognize that for them to succeed, it couldn't just be about Hadoop. It actually had to expand to talk about cloud and talk about other data environments, right? >> Well, congratulations on the OEM deal. IBM, great name, Big Replicate. Love it, fantastic name. >> We're excited. >> It's a great product, and we've been following you guys for a long time, David. Great product, great energy. So I'm sure there's going to be a lot more deals coming your way. Good strategy, this OEM strategy thing, huh? >> Oh yeah. >> It reduces sales cost. >> Gives us tremendous operational leverage.
Getting 4,000, 5,000-- >> You get a great partner in IBM. They know the enterprise, great stuff. This is theCUBE bringing all the action here at Hadoop Summit. IBM OEM deal with WANdisco all happening right here on theCUBE. Be back with more live coverage after this short break.

Published Date : Jul 1 2016


ENTITIES

Entity	Category	Confidence
David	PERSON	0.99+
Joel	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Joe	PERSON	0.99+
David Richards	PERSON	0.99+
Joel Horowitz	PERSON	0.99+
2009	DATE	0.99+
John	PERSON	0.99+
4	QUANTITY	0.99+
WANdisco	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
20	QUANTITY	0.99+
San Jose	LOCATION	0.99+
HP	ORGANIZATION	0.99+
thousands	QUANTITY	0.99+
Joel Horwitz	PERSON	0.99+
Ted Dunning	PERSON	0.99+
Big Replicate	ORGANIZATION	0.99+
last year	DATE	0.99+
Silicon Valley	LOCATION	0.99+
Big Replicate	ORGANIZATION	0.99+
40	QUANTITY	0.99+
30	QUANTITY	0.99+
Silicon Valley	LOCATION	0.99+
third	QUANTITY	0.99+
today	DATE	0.99+
Hadoop	TITLE	0.99+
San Jose, California	LOCATION	0.99+
three	QUANTITY	0.99+
two things	QUANTITY	0.99+
2008	DATE	0.99+
5,000 people	QUANTITY	0.99+
Hortonworks	ORGANIZATION	0.99+
100%	QUANTITY	0.99+
David Richards	PERSON	0.99+
Blake	PERSON	0.99+
4,000, 5,000	QUANTITY	0.99+
S3	TITLE	0.99+
two classes	QUANTITY	0.99+
tomorrow	DATE	0.99+
Second problem	QUANTITY	0.99+
both areas	QUANTITY	0.99+
three copies	QUANTITY	0.99+
Hadoop Summit 2016	EVENT	0.99+
Swift	TITLE	0.99+
both	QUANTITY	0.99+
Big Insights	ORGANIZATION	0.99+
one problem	QUANTITY	0.98+
Today	DATE	0.98+

Jim Campigli, WANdisco - #BigDataNYC 2015 - #theCUBE

>> Live from New York. It's The Cube, covering Big Data NYC 2015. Brought to you by Hortonworks, IBM, EMC, and Pivotal. Now for your hosts, John Furrier and Dave Vellante. >> Hello, everyone. Welcome back to live in New York City for the Cube. A special big data [inaudible 00:00:27] our flagship program will go out to the events. They expect a [Inaudible 00:00:30] We are here live as part of Strata Hadoop Big Data NYC. I'm John Furrier. My co-host, Dave Vellante. Our next guest is Jim Campigli, the Chief Product Officer at WANdisco. Welcome back to The Cube. Great to see you. >> Thanks, great to be here. >> You've been COO of WANdisco, head of marketing, now Chief Product Officer for a few years. You guys have always had the patent. David was on earlier. I asked him specifically, why don't the other guys just do what you do? I wanted you to comment deeper on that because he had a great answer. He said, patents. But you guys do something that's really hard that people can't do. >> Right. >> So let's get into it because Fusion is a big announcement you guys made. Big deal with EMC, lot of traction with that, and it's one of these things that is kind of talked about, but not talked about. It's really a big deal, so what is the reason why you guys are so successful on the product side? >> Well I think, first of all, it starts with the technology that we have patented, and it's this true active active replication capability that we have. Other software products claim to have active active replication, but when you drill down on what they're really doing, typically, what's happening is they'll have a set of servers that they replicate across, and you can write a transaction at any server, but then that server is responsible for propagating it to all of the other servers in the implementation.
There's no mechanism for pre-agreeing to that transaction before it's actually written, so there's no way to avoid conflicts up front, there's no way to effectively handle scenarios where some of the servers in the implementation go down while the replication is in process, and very frequently, those solutions end up requiring administrators to do periodic resynchronization, go back and manually find out what didn't take, and deal with all the deltas, whereas we offer guaranteed consistency. And effectively what happens is with us, you can write at any server as well, but the difference is we go through a peer-to-peer agreement process, and once a quorum of the servers in the implementation agree to the transaction, they all accept it, and we make sure everything is written in the same order on every server. And every server knows the last good transaction it processed, so if it goes down at some point in time, as soon as it comes back up, it can grab all the transactions it missed during that time slice while it was offline, resync itself automatically without an administrator having to do anything. And you can use that feature not only for network and server outages that cause downtime, but even for planned maintenance, which is one of the biggest causes of Hadoop availability issues, because obviously if you've got a global deployment, when it's midnight on Sunday in the U.S., it's the start of the business day on Monday in Europe, and then it's the middle of the afternoon in Asia. So if you take Hadoop clusters down, somebody somewhere in the world is going to be going without their applications and data. >> It's interesting; I want to get your comments on this because this has a great highlight into the next conversation we've been hearing all throughout The Cube this week is analytics, outcomes. These are the kind of things that people talk about because that means there's checks being written. Hadoop is moving into production. People have done the clusters.
It used to be the conversation, hey, x number of clusters, you do this, you do that, replication here and there, YARN, all these different buzz words. Really feeds and speeds. Now, Hadoop is relevant, but it's kind of invisible. It's under the hood. >> Right. >> Yet, it's part of other things in the network, so high availability, non-disruptive operations, are table stakes now. So I want you to talk about that nuance because that's what we're seeing as the things that are powering, as the engine of Hadoop deployments. What is that? Take us through that nuance, because that's one of the things that you guys have been doing a lot of work in that's making it reliable and stable. To actually go out and play with Hadoop, deploy it, make sure it's always on. >> Well, we really come into play when companies are moving Hadoop out of the lab and into production. When they have defined application SLAs, when they can only have so much down time, and it may be business requirements, it may be regulatory compliance issues, for example, financial services. They pretty much always have to have their data available. They have to have a solid back-up of the data. That's a hard requirement for them to put anything into production in their data centers. >> The other use case we've been hearing is okay, I've got Hadoop, I've been playing with it, now I need to scale it up big time. I need to double, triple my clusters. I have to put it with my applications. Then the conversation's, okay, wait, do I need to do more sysadmin work? How do you address that particular piece because I think that's where I think Fusion comes in from how I'm reading it, but is that a Fusion value proposition? Is it a WANdisco thing, and what does the customer, and is that happening? >> Yeah, so there's actually two angles to that, and the first is how do we maintain that up-time? How do we make sure there's performance availability to meet the SLAs, the production SLAs?
The active active replication that we have patents for, that I described earlier, and it's embodied in our DConE distributed coordination engine, is at the core of Fusion, and once a Fusion server's installed with each of your Hadoop clusters, that active active replication capability is extended to them, and we expose that HDFS API so the client applications, Sqoop, Flume, Impala, Hive, anything that would normally run against a Hadoop cluster, would talk through us. If it's been defined for replication, we do the active active replication of it. Pass straight through and process normally on the local cluster. So how does that address the issues you were talking about? What you're getting by default with our active active replication is effectively continuous hot back-up. That means if one cluster or an entire data center goes offline, that data exists elsewhere. Your users can fail over. They can continue accessing the data, running their applications. As soon as that cluster comes back online, it resyncs automatically. Now what's the other >> No user involvement? No admin? >> No user involvement in that. Now the only time, and this gets back into what I was talking about earlier, if I take servers offline for planned maintenance, upgrade the hardware, the operating system, whatever it may be, I can take advantage of that feature, as I was alluding to earlier. I can take the servers or the entire cluster offline, and Fusion knows the last good transactions that were processed on that cluster. As soon as the admin turns it back on, it'll resync itself automatically. So that's how you avoid down time, even for planned maintenance, if you have to take an entire location off. Now, to your other question, how do you scale this stuff up? Think about what we do. We eliminate idle standby hardware, because everything is full read write. You don't have standby read-only back-up clusters and servers when we come into the picture.
So let's say we walk into an existing implementation, and they've got two clusters. One is the active cluster where everything's being written to, read from, actively being accessed by users. The other's just simply taking snapshots or periodic back-ups, or they're using DistCp or something else, but they really can't get full utilization out of that. We come in with our active active replication capability, and they don't have to change anything, but what suddenly happens is, as soon as they define what they want replicated, we'll replicate it for them initially to the other clusters. They don't have to pre-sync it, and the cluster that was formerly for disaster recovery, for back-up, is now live and fully usable. So guess what? I'm now able to scale up to twice my original implementation by just leveraging that formerly read-only back-up cluster that I was >> Is there a lot of configuration involved in that, or is it automatically? >> No, so basically what happens, again, you don't have to synchronize the clusters in advance. The way we replicate is based on this concept of folders, and you can think of a folder as basically a collection of files and subdirectories that roll up into root directories, effectively, that reflect typically particular applications that people are using with Hadoop or groups of users that have data sets that they access for their various sets of applications. And you define the replicated folders, basically a high level directory that consists of everything in it, and as soon as you do that, what we'll do automatically, in a new implementation. Let's keep it simple. Let's say you just have two clusters, two locations. We'll replicate that folder in its entirety to the target you specify, and then from that point on, we're just moving the deltas over the wire. So you don't have to do anything in advance. And then suddenly that back-up hardware is fully usable, and you've doubled the size of your implementations. You've scaled up to 2x.
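The folder behavior Jim describes here (a full copy of the replicated folder when it is first defined, then only deltas over the wire) can be illustrated with a minimal Python sketch. This is not WANdisco Fusion's API or algorithm: the function name `files_to_send`, the folder paths, and the idea of tracking a version number per file are all assumptions made for illustration, and deletions are ignored to keep it short.

```python
# Hypothetical illustration only; not WANdisco Fusion's API or algorithm.

def files_to_send(source: dict, target: dict) -> dict:
    """Return the entries of `source` that `target` lacks or holds at a different version."""
    return {path: version for path, version in source.items()
            if target.get(path) != version}


# A replicated folder on the source cluster: path -> content version (assumed model).
source_folder = {"/sales/2015/q1.csv": 1, "/sales/2015/q2.csv": 1}
target_folder = {}  # the target starts empty; no pre-sync is needed

# Initial replication: the whole folder goes over the wire.
initial = files_to_send(source_folder, target_folder)
target_folder.update(initial)

# Later, one file changes and one file is added on the source.
source_folder["/sales/2015/q2.csv"] = 2
source_folder["/sales/2015/q3.csv"] = 1

# From this point on, only the deltas move.
delta = files_to_send(source_folder, target_folder)
target_folder.update(delta)
```

In this toy model the first transfer carries both files and the second carries only the changed and the new file, which is the "entirety first, deltas afterwards" pattern from the interview.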
>> So, I mean what you're describing before, really strikes me that the way you tell the complexity of a product and the value of a product in this space is what happens when something goes wrong. >> Yep. >> That's the question you always ask. How do you recover, because recovery's a very hard thing, and your patents, you've got a lot of math inside there. >> Right. >> But you also said something that's interesting, which is you're an asset utilization play. >> Right. >> You're being able to go in relatively simply and say, okay, you've got this asset that's underutilized. I'm now going to give you back some capacity that's on the floor and take advantage of that. >> Right, and you're able to scale up without spending any more on hardware and infrastructure. >> So I'm interested in, so another company. You're now with an EMC partnership this week. And they sort of got into this way back in the mainframe days with SRDF. I always thought when I first heard about WANdisco, it's like SRDF for Hadoop, but it's active active. Then they bought that YottaYotta. >> And there's no distance limitations for their active active. >> So what's the nature of the relationship with EMC? >> Okay, so basically EMC, like the other storage vendors that want to play in the Hadoop space, expose some form of an HDFS API, and in fact, if you look at Hortonworks or Cloudera, if you go and look at Cloudera Manager, one of the things it asks you when you're installing it is are you going to run this on regular HDFS storage, effectively a bunch of commodity boxes typically, or are you going to use EMC Isilon or the various other options?
And what we're able to do is replicate across Hadoop clusters running on Isilon, running on EMC ECS, running on standard HDFS, and what that allows these companies to do is without modifying those storage systems, without migrating that data off of them, incorporate it into an enterprise-wide data lake, if that's what they want to do, and selectively replicate across all of those different storage systems. It could be a mix of different Hadoop distributions. You could have replication between CDH, HDP, Pivotal, MapR, all of those things, including EMC Storage that I just mentioned, it was mentioned in the press release, Isilon, and ECS effectively has a Hadoop-compatible API support. And we can create in effect a single virtual cluster out of all of those different platforms. >> So is it a go-to-market relationship? Is it an OEM deal? >> Yeah, it was really born out of the fact that we have some mutual customers that want to do exactly what I just described. They have standard Hortonworks or Cloudera deployments in house. They've got data running on Isilon, and they want to deploy a data lake that includes what they've got stored on Isilon with what they've got in HDFS and Hadoop and replicate across that. >> Like onerous EMC certification process? >> Yeah, we went through that process. We actually set up environments in our labs where we had EMC, Isilon, and ECS running and did demonstration integrations, replication across Isilon to HDP to Hortonworks, Isilon to Cloudera, ECS to Isilon to HDP and Cloudera and so forth. So we did prove it out. They saw that. In fact, they lent us boxes to actually do this in our labs, so they were very motivated, and they're seeing us in some of their bigger accounts.
>> Talk about the aspect of two things: non-disruptive operations, meaning I have to want to deploy stuff because now that Hadoop has a hardened top with some abstraction layer, with analytics to focus, there's a lot of work going on under the hood, and a large scale enterprise might have a zillion versions of Hadoop. They might have little Hortonworks here. They might have something over here, so there might be some diversity in the distributions. That's one thing. The other one is operational disruption. >> Right. >> What do you guys do there? Is it zero disruption, and how do you deal with multiple versions of the distro? >> Okay, so basically what we're doing, the simplest way to describe it is we're providing a common API across all of these different distributions, running on different storage platforms and so forth, so that the client applications are always interacting with us. They're not worrying about the nuances of the particular Hadoop APIs that these different things expose. So we're providing a layer of abstraction effectively. So we're transparent in effect, in that sense, operationally, once we're installed. The other thing is, and I mentioned this earlier, we come in, basically, you don't have to pre-sync clusters, you don't have to make sure they're all the same versions or the same distros or any of that, just install us, select the data that you want to replicate, we'll replicate it over initially to the target clusters, and then from that point on, you just go. It just works, and we talked about the core patent for active active replication. We've got other patents that have been approved, three patents now and seven applications pending, that allow this active active replication to take place while servers are being added and removed from implementations without disrupting user access or running applications and so forth. >> Final question for you, sum up the show this week. What's the vibe here? What's the aroma?
Is it really Hadoop next? What is the overall Big Data NYC story here in Strata Hadoop? What's the main theme that you're seeing coming out of the show? >> I think the main theme that we're starting to see, it's twofold. I think one is we are seeing more and more companies moving this into production. There's a lot of interest in Spark and the whole fast data concept, and I don't think that Spark is necessarily orthogonal to Hadoop at all. I think the two have to coexist. If you think about Spark streaming and the whole fast data concept, basically, Hadoop provides the historical data at rest. It provides the historical context. The streaming data provides the point in time information. What Spark together with Hadoop allows you to do is that real time analysis, do the real time informed decision-making, but do it within historical context instead of a single point in time vacuum. So I think what's happening, and you notice the vendors themselves aren't saying, oh it's all Spark, forget Hadoop. They're really talking about coexisting. >> Alright, Jim, from WANdisco, Chief Product Officer, really in the trenches, talking about what's under the hood and making it all scale in the infrastructure so his analysts can hit the scene. Great to see you again. Thanks for coming and sharing your insight here on The Cube. Live in New York City. We are here, day two of three days of wall-to-wall coverage of Big Data NYC in conjunction with Strata. We'll be right back with more live coverage in a moment here in New York City after this short break.

Published Date : Oct 6 2015

SUMMARY :

Brought to you by Horton New York City for the Cube. You guys have always had the patent. on the product side? and once a quorum of the servers These are the kind of things because that's one of the things back-up of the data. and is that happening? So how does that address the issues and the cluster that was and you can think of a folder really strikes me that the way you tell That's the question you always ask. But you also said that's on the floor and Right, and you're able to scale up in the mainframe days with SRDF. And there's no distance limitations one of the things it asks you born out of the fact and Cloudera and so forth. diversity in the distributions. so that the client applications What is the overall Big Data NYC story and the whole fast data concept, in the infrastructure

ENTITIES

Entity	Category	Confidence
David	PERSON	0.99+
Jim	PERSON	0.99+
Jim Campigli	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Europe	LOCATION	0.99+
WANdisco	ORGANIZATION	0.99+
EMC	ORGANIZATION	0.99+
Asia	LOCATION	0.99+
U.S.	LOCATION	0.99+
New York	LOCATION	0.99+
John Furrier	PERSON	0.99+
Horton Works	ORGANIZATION	0.99+
IBM	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
New York City	LOCATION	0.99+
two locations	QUANTITY	0.99+
Strata Hadoop	TITLE	0.99+
first	QUANTITY	0.99+
Pivotal	ORGANIZATION	0.99+
one	QUANTITY	0.99+
two things	QUANTITY	0.99+
Hortonworks	ORGANIZATION	0.99+
Hadoop	TITLE	0.99+
One	QUANTITY	0.99+
two	QUANTITY	0.99+
two clusters	QUANTITY	0.99+
three days	QUANTITY	0.99+
Monday	DATE	0.99+
three patents	QUANTITY	0.98+
this week	DATE	0.98+
seven pending applications	QUANTITY	0.98+
two angles	QUANTITY	0.98+
two clusters	QUANTITY	0.98+
Spark	TITLE	0.97+
this week	DATE	0.97+
one cluster	QUANTITY	0.97+
00:00:30	DATE	0.95+
ECS	TITLE	0.95+
HDP	ORGANIZATION	0.94+
Cloudera Manager	TITLE	0.94+
single point	QUANTITY	0.94+
#BigDataNYC	EVENT	0.94+
each	QUANTITY	0.94+
Impala	TITLE	0.93+
NYC	LOCATION	0.93+
twofold	QUANTITY	0.93+
Strata	ORGANIZATION	0.92+
Flume	TITLE	0.92+
00:00:27	DATE	0.92+
Sqoop	TITLE	0.92+
Fusion	TITLE	0.91+
Isilon	ORGANIZATION	0.89+
Cloudera	ORGANIZATION	0.89+
midnight	DATE	0.89+
Sunday	DATE	0.88+
Isilon	TITLE	0.88+
single	QUANTITY	0.88+
HIVE	TITLE	0.87+
one thing	QUANTITY	0.83+
double	QUANTITY	0.83+

Kickoff | Veritas Vision Solution Day 2018

(bright, peppy music) >> Announcer: From Chicago, it's theCUBE. Covering Veritas Vision Solution Day 2018. Brought to you by Veritas. >> Hello everyone, welcome to Chicago. We're here covering the Veritas Solution Day. Veritas, last year, had the Veritas Vision Conference and they brought together all their customers. This year they decided to go around the world, I think they have six or seven of these across the globe. And we just were in New York a few weeks ago at Tavern on the Green. We're here at the Palmer House in Chicago. Iconic hotel. About 60 to 70 customers here. Of course Chicago's a big opportunity for companies like Veritas because there's such a good customer base here. But what I want to do now is set up what's going on in the data protection business. According to a number of sources, Gartner, IDC Data, other survey data, certainly anecdotally when we talk to customers, about half of the customers that we talk to are going to replace their data protection platform within the next five years. Why is that? Well, there are a number of factors that are affecting that and I want to talk about the reasons why, the implications to the market, and what that means for customers. So if you look back 10 years ago, there was a similar dynamic going on catalyzed by the ascendancy of virtualization. What was happening is that you had all these servers that were underutilized and so the brilliance of virtualization was we're going to consolidate those servers, virtualize the compute power, dramatically increase the utilization and reduce the physical capacity that's on the floor. So you can get rid of stuff. Get rid of servers, spend less, and get more value out of that asset. Because you had all these underutilized hardware assets. Data protection backup in particular was the one workload that actually could use all that compute power. Why, because at the end of the day, you're backing up this huge stream of data. 
And so as a result, when you had to do a full backup, you didn't have the physical resources. So people had to rethink how they architected backup because of virtualization. So you now have a similar dynamic, but for different reasons. Some of the big trends that are going on here. The first one is of course digital. So digital means data and it's all about how you get value out of your data because data is increasingly an important asset. People are realizing that protecting that data is more and more important. As a result, people are rethinking just the definition of recovery. Recovery has to be faster, you've got to be always on in this digital world. So digital transformation is critical. You can't just bolt on backup as you have for the last 20, 30, 40 years really. Backup has been a bolt on. You've also got cloud. Everybody wants cloud-like. So you're seeing a shift from improving or dealing with resource utilization and allocation, as I explained in the virtualization world, now to automation. Why automation? Because people want a cloud-like experience. They realize they can't just shove all their data into the public cloud. There's data all over the place, and I'll talk about that in a moment in terms of distributed data, but specifically people want a cloud-like experience. What does that mean? That means they want pay-as-you-go, they want simple deployment, they want fast seamless recovery, and they want a lot of automation. While the price of technology comes down year after year, the price of people doesn't. And you can't just keep throwing people at the infrastructure problem, because it's so complex, you have to automate. And you want to shift resources toward higher value activities. Digital transformation, DevOps, application development. So this distributed data world, this multi-cloud world, and I'll talk a little bit more about that in a moment when I discuss the Edge, it's becoming a forcing function. 
Multi-cloud is a forcing function to rethink your backup. Because you've got different infrastructure-as-a-service providers, you've got SaaS providers, you've got all kinds of clouds that are popping up all over the lines of business and within your own data centers. As a result, you need to think about how do I catalog all that data, how do I protect that data, how do I govern that data, how do I deal with things like GDPR and make sure that I'm in compliance. So it becomes a much more complicated equation, and the variables are distinct. For example, I don't really understand what point in time means anymore. If you have distributed data, what does it mean to have a point in time copy? Point in what time? Who's the master? So you need some kind of controls in that multi-cloud world. That's a forcing function to rethink your backup. The other thing is platform. Platform beats products. I'll talk about that in a moment. People for years have looked at backup as purely insurance. Everybody hates buying insurance, we all know that, so you're seeing people trying to get more out of their backup and recovery platforms. For instance, integrating disaster recovery. So that's becoming an integral part of people's strategies. You're also seeing analytics becoming more and more important. People are trying to, because all the data sits in the corpus of the backup, people are saying why don't we analyze that data and get more out of it. Why don't we take snapshots of that data and make it available to DevOps. And what about ransomware, which again I'll talk about in a moment. Could I maybe look at anomalies in that data to determine if there are some problems. Many, many use cases emerging. Data classification, governance, I mentioned GDPR before, so you're seeing backup shift from pure insurance to a higher value business opportunity. And then of course, there's security, there's compliance, there's governance, ransomware is critical. 
Organizations are creating air gaps, meaning disconnecting from the internet, so that if they get hit with a ransomware attack they can isolate their data, but just even that is not enough. People can get through air gaps by physically putting in, whatever. Sticks or malware et cetera. So you still have to be able to use analytics to look at that corpus of backup data and identify anomalies. But again, because of those security risks and because of the importance of digital transformation and data people are rethinking how they do data protection. And finally, there's the Edge. We are living in a distributed world, it's a multi-cloud world, as I said before it's a forcing function, and the Edge is one of those clouds, if you will, which changes the way in which you think about backup. How does it change? Locality of the recovery data. If you've got Edge data, if you've got multi-cloud, you've, as I said before, got to have a global catalog and recover that data locally. Another thing to think about is SLAs. In a cloud world, you, the customer, are responsible for the recovery. While the cloud vendor can get the light back on on the disk system, or the computer, or the compute system, you are responsible for the people and the process to recover your business. That is not the cloud vendor's responsibility so you need to think about that. And think about recovery as recovery at the business level, not just recovery of the data, but recovery, getting your business back online. There's also the three laws of the cloud. We learned this from Pat Gelsinger this August at VMworld. The laws of physics, the laws of economics, and the law of the land. Those will dictate where you put data and how you back up that data. So all of this has created a new landscape in the data protection business. Let's run down that landscape. Who are the leaders? You've got Dell EMC, you've got Veritas, you've got Commvault, and you've got IBM. 
Those guys comprise probably 2/3 or more of the marketplace. And you have startups like Cohesity and Rubrik who have raised hundreds of millions of dollars going after them and challenging them. You've got a whole new set of players that are taking new approaches. Actifio, for example, got the whole copy data management thing going. Datrium is creating end to end, both primary storage and data protection backup in the same platform with a software-based cloud-like, SaaS-like offering. You've got companies like Zerto and Imanis Data that are specialists. You've got companies like WANdisco, again, taking new approaches. And then you have Oracle, with the Oracle recovery appliance, which is totally changing the way in which backup worked for Oracle databases exclusively. Taking a database-led approach to backup. And then of course you've got the storage players that are part of the ecosystem even though they're not directly competing with backup software vendors. Guys like Pure, NetApp, Infinidat. They're partnering with backup vendors. And then of course, there's the cloud guys. AWS, Azure, Google. The thing to think about as customers, really three things. Platform versus product. What's the platform look like? Is it an API-based platform? Because you want to program to that platform, infrastructure as code, you want to support your DevOps infrastructure. The second is cloud-like pricing, and cloud-like deployment. You want a cloud-based operating model to simplify your operations and lower your IT labor costs and shift those costs to more strategic efforts and initiatives such as digital transformation and application development. And the third is ecosystem alignment. Make sure that your backup software vendor and your backup solution vendors, that their ecosystem is aligned with your ecosystem. Because you're going to get more facile integration and problem-solving and flexibility if those systems align. So take a look at that as well. 
Couple of things I want to mention and emphasize. New application development models. Cloud Native, Kubernetes. Function, you know people call it serverless, but function-based programming. Really to support DevOps and infrastructure as code. That is going to have implications on how you protect data. And finally AI. How can you talk about anything today without talking about AI? Anticipatory staging of data for recovery, as an example. Predicting where problems are going to occur. Machine intelligence will increasingly play a role in this whole landscape. So, as you can see, there's a lot going on. This is why data protection is such a hot space. That's why the VCs are getting in. It's why the incumbents like Veritas, Dell EMC, IBM, Commvault, those that I mentioned are trying to re-platform and hang on to their large install bases and ultimately grow them. And it's why companies in the startup and the niche spaces, are tucking in and identifying new opportunities to participate. So that's a quick overview of what's going on here at the Veritas Vision Solution Day from Chicago. We'll be here all day talking to customers, talking to practitioners, technologists, and executives. So keep it right there, you're watching theCUBE. I'm Dave Vellante. Be right back. (bright music)

Published Date : Nov 10 2018

SUMMARY :

Brought to you by Veritas. and the process to recover your business.

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
IBM	ORGANIZATION	0.99+
New York	LOCATION	0.99+
Veritas	ORGANIZATION	0.99+
Pat Gelsinger	PERSON	0.99+
six	QUANTITY	0.99+
Commvault	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Cohesity	ORGANIZATION	0.99+
Zerto	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Chicago	LOCATION	0.99+
last year	DATE	0.99+
Gartner	ORGANIZATION	0.99+
2/3	QUANTITY	0.99+
Imanis Data	ORGANIZATION	0.99+
Rubrik	ORGANIZATION	0.99+
VMworld	ORGANIZATION	0.99+
seven	QUANTITY	0.99+
WANdisco	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
This year	DATE	0.99+
third	QUANTITY	0.99+
both	QUANTITY	0.98+
Veritas Vision Solution Day	EVENT	0.98+
today	DATE	0.98+
Datrium	ORGANIZATION	0.98+
three laws	QUANTITY	0.98+
Dell EMC	ORGANIZATION	0.98+
GDPR	TITLE	0.98+
Veritas Solution Day	EVENT	0.98+
10 years ago	DATE	0.98+
Veritas Vision Solution Day 2018	EVENT	0.97+
70 customers	QUANTITY	0.96+
first one	QUANTITY	0.96+
Infinidat	ORGANIZATION	0.96+
second	QUANTITY	0.96+
SAS	ORGANIZATION	0.95+
About 60	QUANTITY	0.95+
IDC Data	ORGANIZATION	0.94+
hundreds of millions of dollars	QUANTITY	0.94+
30	QUANTITY	0.94+
Edge	TITLE	0.94+
Azure	ORGANIZATION	0.92+
Veritas Vision Conference	EVENT	0.91+
one	QUANTITY	0.91+
three things	QUANTITY	0.91+
Tavern on the Green	LOCATION	0.91+
Actifio	ORGANIZATION	0.9+
Palmer House	LOCATION	0.9+
40 years	QUANTITY	0.89+
Couple	QUANTITY	0.87+
August	DATE	0.86+
few weeks ago	DATE	0.8+
NetApp	ORGANIZATION	0.77+
next five years	DATE	0.76+
Platfo	ORGANIZATION	0.75+
about half of	QUANTITY	0.69+
Cloud Native	TITLE	0.66+
Pure	ORGANIZATION	0.64+
Iconic	LOCATION	0.55+
Kubernetes	TITLE	0.55+
years	QUANTITY	0.49+
last 20	DATE	0.4+

Day Two Kickoff | Big Data NYC

(quiet music) >> I'll open that while he does that. >> Co-Host: Good, perfect. >> Man: All right, rock and roll. >> This is Robin Matlock, the CMO of VMware, and you're watching theCUBE. >> This is John Siegel, VP of Product Marketing at Dell EMC. You're watching theCUBE. >> This is Matthew Morgan, I'm the chief marketing officer at Druva and you are watching theCUBE. >> Announcer: Live from midtown Manhattan, it's theCUBE. Covering BigData New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. (rippling music) >> Hello, everyone, welcome to a special CUBE live presentation here in New York City for theCUBE's coverage of BigData NYC. This is where all the action's happening in the big data world, machine learning, AI, the cloud, all kind of coming together. This is our fifth year doing BigData NYC. We've been covering the Hadoop ecosystem, Hadoop World, since 2010, it's our eighth year really at ground zero for the Hadoop, now the BigData, now the Data Market. We're doing this also in conjunction with Strata Data, which was Strata Hadoop. That's a separate event with O'Reilly Media, we are not part of that, we do our own event, our fifth year doing our own event, we bring in all the thought leaders. We bring all the influencers, meaning the entrepreneurs, the CEOs to get the real story about what's happening in the ecosystem. And of course, we do it with our analyst at Wikibon.com. I'm John Furrier with my cohost, Jim Kobielus, who's the chief analyst for our data piece. Lead analyst Jim, you know the data world's changed. We had commentary yesterday, all up on YouTube.com/SiliconAngle. Day one really set the table. And we kind of get the whiff of what's happening, we can kind of feel the trend, we got a finger on the pulse. 
Two things going on, two big notable stories is the world's continuing to expand around community and hybrid data and all these cool new data architectures, and the second kind of substory is the O'Reilly show has become basically a marketing. They're making millions of dollars over there. A lot of people were, last night, kind of not happy about that, and what's giving back to the community. So, again, the community theme is still resonating strong. You're starting to see that move into the corporate enterprise, which you're covering. What are you finding out, what did you hear last night, what are you hearing in the hallways? What is kind of the tea leaves that you're reading? What are some of the things you're seeing here? >> Well, all things hybrid. I mean, first of all it's building hybrid applications for hybrid cloud environments and there's various layers to that. So yesterday on theCUBE we had, for example, one layer is hybrid semantic virtualization layers are critically important for bridging workloads and microservices and data across public and private clouds. We had, from AtScale, we had Bruno Aziza and one of his customers discussing what they're doing. I'm hearing a fair amount of this venerable topic of semantic data virtualization become even more important now in the era of hybrid clouds. That's a fair amount of the scuttlebutt in the hallway and atrium talks that I participated in. Also yesterday from BMC we had Basil Faruqui basically talking about automating data pipelines. There are data pipelines in hybrid environments. Very, very important for DevOps, productionizing these hybrid applications for these new multi-cloud environments. That's quite important. Hybrid data platforms of all sorts. Yesterday we had from Actian Jeff Veis discussing their portfolio for on-prem, public cloud, putting the data in various places, and speeding up the queries and so forth. So hybrid data platforms are going increasingly streaming in real time. 
What I'm getting is that what I'm hearing is more and more of a layering of these hybrid environments is a critical concern for enterprises trying to put all this stuff together, and future-proof it so they can add on all the new stuff. That's coming along like cirrus clouds, without breaking interoperability, and without having to change code. Just plug and play in a massively multi-cloud environment. >> You know, and also I'm critical of a lot of things that are going on. 'Cause to your point, the reason why I'm kind of critical on the O'Reilly show and particularly the hype factor going on in some areas is two kinds of trends I'm seeing with respect to the owners of some of the companies. You have one camp that are kind of groping for solutions, and you'll see that with they're whitewashing new announcements, this is going on here. It's really kind of-- >> Jim: I think it's AI now, by the way. >> And they're AI-washing it, but you can, the tell sign is they're always kind of doing a magic trick of some type of new announcement, something's happening, you got to look underneath that, and say where is the deal for the customers? And you brought this up yesterday with Peter Burris, which is the business side of it is really the conversation now. It's not about the speeds and feeds and the cluster management, it's certainly important, and those solutions are maturing. That came up yesterday. The other thing that you brought up yesterday I thought was notable was the real emphasis on the data science side of it. And it's that it's still not easy for data scientists to do their job. And this is where you're seeing productivity conversations come up with data science. So, really the emphasis at the end of the day boils down to this. If you don't have any meat on the bone, you don't have a solution that rubber hits the road where you can come in and provide a tangible benefit to a company, an enterprise, then it's probably not going to work out. 
And we kind of had that tool conversation, you know, as people start to grow. And so as buyers out there, they got to look, and kind of squint through it saying where's the real deal? So that kind of brings up what's next? Who's winning, how do you as an analyst look at the playing field and say, that's good, that's got traction, that's winning, mm not too sure? What's your analysis, how do you tell the winners from the losers, and what's your take on this from the data science lens? >> Well, first of all you can tell the winners when they have an ample number of referenced customers who are doing interesting things. Interesting enough to get a jaded analyst to pay attention. Doing something that changes the fabric of work or life, whatever, clearly. Solution providers who can provide that are, they have all the hallmarks of a winner meaning they're making money, and they're likely to grow and so forth. But also the hallmarks of a winner are those, in many ways, who have a vision and catalyze an ecosystem around that vision of something that could be made, possibly be done before but not quite as efficiently. So you know, for example, now the way what we're seeing now in the whole AI space, deep learning, is, you know, AI means many things. The core right now, in terms of the buzzy stuff is deep learning for being able to process real time streams of video, images and so forth. And so, what we're seeing now is that the vendors who appear to be on the verge of being winners are those who use deep learning inside some new innovation that has enough, that appeals to a potential mass market. It's something you put on your, like an app or something you put on your smart phone, or it's something you buy at Walmart, install in your house. You know, the whole notion of clearly Alexa, and all that stuff. 
Anything that takes chatbot technology, really deep learning powers chatbots, and is able to drive a conversational UI into things that you wouldn't normally expect to talk to you and does it well in a way that people have to have that. Those are the vendors that I'm looking for, in terms of those are the ones that are going to make a ton of money selling to a mass market, and possibly, and very much once they go there, they're building out a revenue stream and a business model that they can conceivably take into other markets, especially business markets. You know, like Amazon, 20-something years ago when they got started in the consumer space as the exemplar of web retailing, who expected them 20 years later to be a powerhouse provider of business cloud services? You know, so we're looking for the Amazons of the world that can take something as silly as a conversational UI inside of a, driven by DL, inside of a consumer appliance and 20 years from now, maybe even sooner, become a business powerhouse. So that's what's new. >> Yeah, the thing that comes up that I want to get your thoughts on is that we've seen data integration become a continuing theme. The other thing about the community play here is you start to see customers align with syndicates or partnerships, and I think it's always been great to have customer traction, but, as you pointed out, as a benchmark. But now you're starting to see the partner equation, because this isn't open, decentralized, distributed internet these days. And it is looking like it's going to form differently than the way it was, than the web days and with mobile and connected devices, IoT and AI. A whole new infrastructure's developing, so you're starting to see people align with partnerships. So I think that's something that's signaling to me that the partnership is amping up. I think the people are partnering more. 
We've had Hortonworks on with IBM, people are partnering, some people take a Switzerland approach where they partner with everyone. You had, WANdisco partners with all the cloud guys, I mean, they have unique IP. So you have this model where you got to go out, do something, but you can't do it alone. Open source is a key part of this, so obviously that's part of the collaboration. This is a key thing. And then they're going to check off the boxes. Data integration, deep learning is a new way to kind of dig deeper. So the question I have for you is, the impact on developers, 'cause if you can connect the dots between open source, 90% of the software written will be already open source, 10% differentiated, and then the role of how people going to market with the enterprise of a partnership, you can almost connect the dots and saying it's kind of a community approach. So that leaves the question, what is the impact to developers? >> Well the impact to developers, first of all, is when you go to a community approach, and like some big players are going more community and partnership-oriented in hot new areas like if you look at some of the recent announcements in chatbots and those technologies, we have sort of a rapprochement between Microsoft and Facebook and so forth, or Microsoft and AWS. The impact for developers is that there's convergence among the companies that might have competed to the death in particular hot new areas, like you know, like I said, chatbot-enabled apps for mobile scenarios. And so it cuts short the platform wars fairly quickly, harmonizes around a common set of APIs for accessing a variety of competing offerings that really overlap functionally in many ways. 
For developers, it's simplification around a broader ecosystem where it's not so much competition on the underlying open source technologies, it's now competition to see who penetrates the mass market with actually valuable solutions that leverage one or more of those erstwhile competitors into some broader synthesis. You know, for example, the whole ramp up to the future of self-driving vehicles, and it's not clear who's going to dominate there. Will it be the vehicle manufacturers that are equipping their cars with all manner of computerized everything to do whatnot? Or will it be the up-and-comers? Will it be the computer companies like Apple and Microsoft and others who get real deep and invest fairly heavily in self-driving vehicle technology, and become themselves the new generation of automakers in the future? So, what we're getting is that going forward, developers want to see these big industry segments converge fairly rapidly around broader ecosystems, where it's not clear who will be the dominant player in 10 years. The developers don't really care, as long as there is consolidation around a common framework to which they can develop fairly soon. >> And open source is obviously a key role in this, and how is deep learning impacting some of the contributions that are being made, because we're starting to see the competitive advantage in collaboration on the community side is with the contributions from companies. For example, you mentioned TensorFlow multiple times yesterday from Google. I mean, that's a great contribution. If you're a young kid coming into the developer community, I mean, this is not normal. It wasn't like this before. People just weren't donating massive libraries of great stuff already pre-packaged, So all new dynamics emerging. Is that putting pressure on Amazon, is that putting pressure on AWS and others? >> It is. 
First of all, there is a fair amount of, I wouldn't call it first-mover advantage for TensorFlow, there've been a number of DL toolkits on the market, open source, for the last several years. But they achieved the deepest and broadest adoption most rapidly, and now they are a, TensorFlow is essentially a de facto standard in the way, that we just go back, betraying my age, 30, 40 years ago where you had two companies called SAS and SPSS that quickly established themselves as the go-to statistical modeling tools. And then they got a generation, our generation, of developers, or at least of data scientists, what became known as data scientists, to standardize around you're either going to go with SAS or SPSS if you're going to do data mining. Cut ahead to the 2010s now. The new generation of statistical modelers, it's all things DL and machine learning. And so SAS versus SPSS is ages ago, those companies are, those products still exist. But now, what are you going to get hooked on in school? What are you going to get hooked on in high school, for that matter, when you're just hobby-shopping DL? You'll probably get hooked on TensorFlow, 'cause they have the deepest and the broadest open source community where you learn this stuff. You learn the tools of the trade, you adopt that tool, and everybody else in your environment is using that tool, and you got to get up to speed. So the fact is, that broad adoption early on in a hot new area like DL, means tons. It means that essentially TensorFlow is the new Spark, where Spark, you know, once again, Spark just in the past five years came out real fast. And it's been eclipsed, as it were, on the stack of cool by TensorFlow. But it's a deepening stack of open source offerings. So the new generation of developers with data science workbenches, they just assume that there's Spark, and they're going to increasingly assume that there's TensorFlow in there. 
They're going to increasingly assume that there are the libraries and algorithms and models and so forth that are floating around in the open source space that they can use to bootstrap themselves fairly quickly. >> This is a real issue in the open source community which we talked about, when we were in LA for the Open Source Summit, was exactly that. Is that, there are some projects that become fashionable, so for example, a cloud-native foundation, very relevant but also hot, really hot right now. A lot of people are jumping on board the cloud natives bandwagon, and rightfully so. A lot of work to be done there, and a lot of things to harvest from that growth. However, the boring blocking and tackling projects don't get all the fanfare but are still super relevant, so there's a real challenge of how do you nurture these awesome projects that we don't want to become like a nightclub where nobody goes anymore because it's not fashionable. Some of these open source projects are super important and have massive traction, but they're not as sexy, or flair-ish as some of that. >> DL is not as sexy, or machine learning, for that matter, not as sexy as you would think if you're actually doing it, because the grunt work, John, as we know for any statistical modeling exercise, is data ingestion and preparation and so forth. That's 75% of the challenge for deep learning as well. But also for deep learning and machine learning, training the models that you build is where the rubber meets the road. You can't have a really strongly predictive DL model in terms of face recognition unless you train it against a fair amount of actual face data, whatever it is. And it takes a long time to train these models. That's what you hear constantly. I heard this constantly in the atrium talking-- >> Well that's a data challenge, is you need models that are adapting and you need real time, and I think-- >> Oh, here-- >> This points to the real new way of doing things, it's not yesterday's model. 
It's constantly evolving. >> Yeah, and that relates to something I read this morning or maybe it was last night, that Microsoft has made a huge investment in AI and deep learning machinery. They're doing amazing things. And one of the strategic advantages they have as a large, established solution provider with a search engine, Bing, is that, from what I've read, and I haven't talked to Microsoft in the last few hours to confirm this, Bing is a source of training data that they're using for machine learning and I guess deep learning modeling for their own solutions or within their ecosystem. That actually makes a lot of sense. I mean, Google uses YouTube videos heavily in its deep learning for training data. So there's the whole issue of if you're a pipsqueak developer, some, you know, I'm sorry, this sounds patronizing. Some pimply-faced kid in high school who wants to get real deep on TensorFlow and start building and tuning these awesome kickass models to do face recognition, or whatever it might be. Where are you going to get your training data from? Well, there's plenty of open source training databases out there you can use, but it's what everybody's using. So, there's sourcing the training data, there's labeling the training data, that's human-intensive, you need human beings to label it. There was a funny recent episode, or maybe it was a last-season episode of Silicon Valley that was all about machine learning and building and training models. It was the hot dog, not hot dog episode, it was so funny. On the show, fictionally, they bamboozle a class of college students to provide training data and to label the training data for this AI algorithm, it was hilarious. But where are you going to get the data? Where are you going to label it? >> Lot more work to do, that's basically what you're getting at. >> Jim: It's DevOps, you know, but it's grunt work.
>> Well, we're going to kick off day two here. This is SiliconANGLE Media's theCUBE, our fifth year doing our own event separate from O'Reilly Media but in conjunction with their event in New York City. It's gotten much bigger here in New York City. We call it BigData NYC, that's the hashtag. Follow us on Twitter, I'm John Furrier, Jim Kobielus, we're here all day, we've got Peter Burris joining us later, head of research for Wikibon, and we've got great guests coming up, stay with us, be back with more after this short break. (rippling music)

Published Date : Sep 27 2017

SUMMARY :

This is Robin Matlock, the CMO of VMware, This is John Siegel of VPA Product Marketing This is Matthew Morgan, I'm the chief marketing officer Brought to you by SiliconANGLE Media What is kind of the tea leaves that you're reading? That's a fair amount of the scuttlebutt I'm kind of critical on the O'Reilly show is really the conversation now. Doing something that changes the fabric So the question I have for you is, the impact on developers, among the companies that might have competed to the death and how is deep learning impacting some of the contributions You learn the tools of the trade, you adopt that tool, and a lot of things to harvest from that growth. That's 75% of the challenge for deep learning as well. This points to the in the last few hours to confirm this, that's basically what you're getting at. This is the SiliconeANGLE Media theCUBE,

ENTITIES

Entity	Category	Confidence
Jim Kobielus	PERSON	0.99+
Robin Matlock	PERSON	0.99+
Apple	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
IBM	ORGANIZATION	0.99+
Matthew Morgan	PERSON	0.99+
Basil Faruqi	PERSON	0.99+
Jim	PERSON	0.99+
John Siegel	PERSON	0.99+
O'Reilly Media	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
yesterday	DATE	0.99+
90%	QUANTITY	0.99+
Peter Burris	PERSON	0.99+
two companies	QUANTITY	0.99+
New York City	LOCATION	0.99+
SPS	ORGANIZATION	0.99+
SAS	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
John	PERSON	0.99+
75%	QUANTITY	0.99+
LA	LOCATION	0.99+
Silicon Valley	TITLE	0.99+
Facebook	ORGANIZATION	0.99+
10%	QUANTITY	0.99+
Walmart	ORGANIZATION	0.99+
2010s	DATE	0.99+
YouTube	ORGANIZATION	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
AtScale	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
10 years	QUANTITY	0.99+
WANdisco	ORGANIZATION	0.99+
Jeff Veis	PERSON	0.99+
fifth year	QUANTITY	0.99+
one	QUANTITY	0.99+
Yesterday	DATE	0.99+
Dell EMC	ORGANIZATION	0.99+
VMware	ORGANIZATION	0.99+
eighth year	QUANTITY	0.99+
BigData	ORGANIZATION	0.99+
millions of dollars	QUANTITY	0.99+
Bing	ORGANIZATION	0.99+
BMC	ORGANIZATION	0.98+
Amazons	ORGANIZATION	0.98+
last night	DATE	0.98+
two kinds	QUANTITY	0.98+
Spark	TITLE	0.98+
Hortonworks	ORGANIZATION	0.98+
Day one	QUANTITY	0.98+
20 years later	DATE	0.98+
VPA	ORGANIZATION	0.98+
2010	DATE	0.98+
ActIn	ORGANIZATION	0.98+
Open Source Summit	EVENT	0.98+
one layer	QUANTITY	0.98+
Druva	ORGANIZATION	0.97+
Alexa	TITLE	0.97+
day two	QUANTITY	0.97+
Bruno Aziza	PERSON	0.97+
SPSS	TITLE	0.97+
Switzerland	LOCATION	0.97+
Two things	QUANTITY	0.96+
NYC	LOCATION	0.96+
Wikibon	ORGANIZATION	0.96+
30	DATE	0.95+
Wikibon.com	ORGANIZATION	0.95+
SiliconeANGLE Media	ORGANIZATION	0.95+
O'Reilly	ORGANIZATION	0.95+

Jagane Sundar & Pranav Rastogi | Big Data NYC 2017

>> Announcer: Live from Midtown Manhattan, it's theCUBE, covering Big Data, New York City, 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. >> Okay, welcome back, everyone. Live in Manhattan, this is theCUBE's coverage of our fifth year doing Big Data, NYC; eighth year covering Hadoop World, which is now evolved into Strata Data which is right around the corner. We're doing that in conjunction with that event. This is, again, where we have the thought leaders, we have the experts, we have the entrepreneurs and CEOs come in, of course. The who's who in tech. And my next two guests, is Jagane Sundar, CUBE alumni, who was on yesterday. CTO of WANdisco, one of the hottest companies, most valuable companies in the space for their unique IP, and not a lot of people know what they're doing. So congratulations on that. But you're here with one of your partners, a company I've heard of, called Microsoft, also doing extremely well with Azure Cloud. We've got Pranav Rastogi, who's the program manager of Microsoft Cloud Azure. You guys have an event going on as well at Microsoft Ignite which has been creating a lot of buzz this year again. As usual, they have a good show, but this year the Cloud certainly has taken front and center. Welcome to theCUBE, and good to see you again. >> Thank you. >> Thank you. >> Alright, so talk about the partnership. You guys, Jagane deals with all the Cloud guys. You're here with Microsoft. What's going on with Microsoft? Obviously they've been, if you look at the stock price. From 20-something to a complete changeover of the leadership of Satya Nadella. The company has mobilized. The Cloud has got traction, putting a dent in the universe. Certainly, Amazon feels a little bit of pain there. But, in general, a lot more work to do. What are you guys doing together? Share the relationship. 
>> So, we just announced a product that's a one-click deployment in the Microsoft Azure Cloud, of WANdisco's Fusion Replication technology. So, if you've got some data assets, Hadoop or Cloud object stores on-premise and you want to create a hybrid or a Cloud environment with Azure in the picture, ours is the only way of doing Active/Active. >> Active/Active. And there is some stuff out there that's looking like Active/Active. DataPlane by Hortonworks. But it's not fully Active/Active. We talked a little bit about that yesterday. >> Jagane: Yes. >> Microsoft, you guys, what's interesting about these guys besides the Active/Active? It's a unique thing. It's an ingredient for you guys. >> Yes, the interesting thing for us is, the biggest problem that we think customers have from a big data perspective is, if you look at the landscape of the ecosystem in terms of open source projects that are available, it's very hard to figure out, a, how do I use this software, and b, how do I install it. And so what we have done is created an experience in Azure HDInsight where you can discover these applications within the context of your cluster and you can install these applications by one-click install. Which installs the application, configures it, and then you're good to go. We think that this is going to sort of increase the productivity of users trying to get sense out of big data. The key challenge we think customers have today is setting up some sort of hybrid environment, how do you connect your on-premise data to move it to the Cloud, and there are different use cases that you can have: you can move parts of the data and you can experiment easily in the Cloud. So what we've done is, we've enabled WANdisco as an application on our HDInsight application platform, where customers can install it using a single-click deploy, connect it with the data that's sitting on-prem, and use the Active/Active feature to have both these environments running simultaneously and in sync.
>> So one benefit is the one-click thing, that's on your side, right? You guys are enabling that. So, okay, I get that. That's totally cool. We'll get to that in a second. I want to kind of drill down on that. But, what's the benefit to the customers, that you guys are having? So, I'm a customer, I one-click, I want some WANdisco Active/Active. Why am I doing it? What does the Cloud change? How does your Cloud change from that experience? >> One example you can think about of what's going to change: in an on-premise environment you have a cluster running, but you're kind of limited on what you can do with the cluster, because you've already set up the number of nodes and the workloads you're running are fairly finite. But what's happening in reality today is, lots of users, especially in the machine learning space, and AI space, and the analytics space, are using a lot of open source libraries and technologies and they're using it on top of Hadoop, and they're using it on top of Spark. However, experimenting with these technologies is hard on-prem because it's a locked environment. So we believe, with the Cloud, especially with this offering of WANdisco and HDInsight, once you move the data you can start spinning up clusters, you can start installing more open source libraries, experiment, and you can shut down the clusters when you're done. So it's going to increase your efficiency, it's going to allow you to experiment faster, and it's going to reduce your cost as well, because you don't have to have the cluster running all the time, and once you are done with your experimentation, then you can decide which way you want to go. So, it's going to remove the-- >> Jagane, what's your experience with Azure? A lot of people have been, some people have been critical, and rightfully so. You guys are moving as fast as you can. You can only go as fast as you can, but the success of the Cloud has been phenomenal. You guys have done a great job with the Cloud.
Got to give you props on that. Your customers are benefiting, or Microsoft's customers are benefiting. How's the relationship? Are you getting more customers through these guys? Are you bringing customers from on-prem to Cloud? How's the customer flow going? >> Almost all of our customers who have on-prem instances of Hadoop are considering Cloud in one form or the other. Different Clouds have different strengths, as they've found-- >> Interviewer: And different technologies. >> Indeed. And Azure's strengths appear to be the HDInsight piece of it and, as Pranav just mentioned, the cool thing is, you can replicate into the Cloud, start up a 50-node Spark cluster today to run a query that may return results to you really fast. Now, remember this is data that you can write to both in the Cloud and on-premise. It's kept consistent by our technology. Or tomorrow you may find that somebody tells you Hive with the new Tez enhancements is faster; sure, spin up a hundred-node Hive cluster in the Cloud, HDInsight supports that really well. You're getting consistent data and your queries will respond much faster than your on-premise. >> We've had Oliver Chu on before; with Hortonworks, obviously, they're partnering there. HDInsight's been getting a lot of traction lately. Where's that going? We've seen some good buzz on that. Good people talking about it. What's the latest update on your end? >> HDInsight is doing really well. The customers love the ease of creating a cluster using just a few clicks and the benefits that customers get; clusters are optimized for certain scenarios. So if you're doing data science, you can create a Spark cluster, install open source libraries. We have Microsoft R Server running on Spark, which is an offering unique to Microsoft, which lots of customers have appreciated.
You also have streaming scenarios that you can do using open source technologies, like we have Apache Kafka running on a stack, which is becoming very popular from an ingestion perspective. Folks have been-- >> Has the Kubernetes craze come down to your group yet? Has it trickled down? It seems to be going crazy. You hired an amazing person from Google, Brendan Burns, we've interviewed before. He's part of the original Kubernetes spec; he now works for Microsoft. What's the buzz on the Kubernetes container world there? >> In general, Microsoft Azure has seen great benefits out of it. We are seeing lots of traction in that space. From my role in particular, I focus more on the HDInsight big data space, which is kind of outside of the Kubernetes work. >> And your relationship is going strong with WANdisco? >> Pranav: Yes. >> Right. >> We launched this offering just about yesterday, that's what we announced, and we're looking forward to getting customers on to the stack. >> That's awesome. What's your take on the industry right now? Obviously, the partnerships are becoming clearer as people can see there's (mumbles). You're starting to see the notion of infrastructure and services are changing. More and more people want services and then you've got the classic infrastructure which looks like it's going to be hybrid. That's pretty clear, we see that. Services versus infrastructure, how should customers think about how they architect their environments? So they can take advantage of the Active/Active and also have a robust, clean, not a lot of re-skilling going on, but more of a good organization from a personnel standpoint, but yet get to a hybrid architecture? >> So, it depends, the Cloud gives you lots of options to meet the customers where they are. Different customers have different kinds of requirements.
Customers who have specialized some of their applications will probably want to go more of an infrastructure route, but customers also love to have some of the PaaS benefits where, you know, I have a service running where I don't have to worry about the infrastructure, how does patching happen, how do OS updates happen, how does maintenance happen. They want to sort of rely on the Microsoft Azure Cloud provider to take care of it. So that they can focus on their application specific logic, or business specific logic, or analytical workloads, and worry about optimizing those parts of the application because that is their core-- >> It's been great. I want to get your thoughts real quick. Share some color. What's going on inside Microsoft? Obviously, open source has become a really big part of the culture, even just at Ignite. More Linux news is coming. You guys have been involved in Linux. Obviously, open source with Azure, ton of stuff, I know, is built in the Microsoft Cloud on open source. You're contributing now to Kubernetes, as I mentioned earlier. Seems to be a good cultural shift at Microsoft. What's the vibe on the open source internally at Microsoft? Can you share, just some anecdotal insight into what's the vibe like inside, around open source? >> The vibe has increased quite a lot around open source. You rightly mentioned, just recently we've announced SQL Server on Linux as well, at the Ignite conference. You can also deploy SQL Server in a Docker container, which is quite revolutionary if you think about how far forward we have come. Open source is so pervasive it's used in almost all of these projects. Microsoft employees are contributing back to open source projects in terms of bug fixes, feature requests, or documentation updates.
It's a very, very active community and by and large I think customers are benefiting a lot, because there are so many folks working together on open source projects and making them successful, and especially around the Azure stack, we also ensure that you can run these open source workloads natively in the Cloud. From an enterprise perspective, you get the best of both worlds. You get the latest innovations happening in open source, plus the reliability of the managed platform that Azure provides at an enterprise scale. >> So again, obviously Microsoft partnership is huge, all the Clouds as well. Where do you want to take the relationship with Microsoft? What happens next? You guys are just going to continue to do business, you're like expecting the one-click's nice, I have some questions on that. What happens next? >> So, I see our partnership becoming deeper. We see the value that HDInsight brings to the ecosystem and all of that value is captured by the data. At the end of the day, if you have stale data, if you have data that you can't rely on, the applications are useless. So we see ourselves getting more and more deeply embedded in the system. We see ourselves as an essential part of the data strategy for Azure. >> Yeah, we see continuous integration as a development concept, continuous analytics as a term that's being kicked around. We were talking yesterday about, here in theCUBE, real time, I want some data real time and IT goes back, "Here it is, it's real time!" No, but the data's three weeks old. I mean, real time (laughs) is a word that doesn't mean I got to see it really fast, low latency response. Well, that's not the data I want. I meant the data in real time, not you giving me a real time query. So again, this brings up a mind shift in terms of the new way to do business in the Cloud and hybrid. It's changing the game.
As customers scratch their heads and try to figure out how to make their organizations more DevOps oriented, what do you guys see for advice for those managers, who are really getting behind it, really want to make change, who kind of have to herd the cats a little bit, and maybe break out security and put it in its own group? Or you come and say, okay IT guys we're going to change into our operating model, even on-prem, we'll use some burst into the Cloud, Azure's got 365 on there, lot of coolness developing. What's the advice for the mindset of the change agents out there that are going to do the transformation? >> My advice would be, if you've done the same thing by hand over two times, it's time you automated it, but-- >> Interviewer: Two times?! >> Two times. >> No three rule? Three strikes you're out? >> You're saying two, contrarian. >> That's a careful statement. Because, if you try automating something that you've never actually tried by hand, that's a disaster as well. A couple times, so you know how it's supposed to work. >> Interviewer: Get a good groove on it. >> Right, then you optimize, you automate, and then you turn the knobs. So, you try a hundred node cluster, maybe that's going to be faster. Maybe after a certain point, you don't get any improvements, so you know how to-- >> So take some baby steps, and one easy way to do it is to automate something that you've done. >> Jagane: Yes, exactly. >> That's almost risk-free, relatively speaking. Thoughts, advice to change agents out there. This is your industry hat on. You can take your Microsoft hat off. >> Baby steps. So you start small, you get familiar with the environment and your toolsets are provided so that you get a consistent experience on what you were doing on-prem and sort of in a hybrid space.
And the whole idea is as you get more comfortable the benefits of the Cloud far outweigh any sort of cultural changes that need to happen-- >> Guys, thanks for coming on theCUBE, really appreciate it. Thoughts on the Big Data NYC this week? What do you think? >> I think it's a conference that has a lot of Cloud hanging over it and people are scratching their heads. Including vendors, customers, everybody scratching their head, but there is a lot of Cloud in this conference, although this is not a Cloud conference. >> Yeah, they're trying to make it an AI conference. A lot of AI washing, certainly we're seeing that everywhere. But again, nothing wrong with hyping up AI. It's good for society. It really is cool, but still, talking about baby steps, AI is still not there. It seems like, AI from when I got my CS degree in the 80's, not a lot of innovation, well machine learning is getting better, but, a lot more way to go on AI. Don't you think? >> Yes, you know, a few of the announcements we've made this week are all about making it easier for developers to get started with AI and machine learning, and our whole hope is that these investments that we've made, the Azure machine learning improvements and the companion app and the workbench, allow you to get started very easily with AI and machine learning models, and you can apply and build these models, do a CICD process and deploy these models and be more effective in the space. >> Yeah and also the tooling market has kind of gotten out of control. We were just joking the other day, that there's this tool shed mindset where everything is in the tool shed and people bought a hammer and turned it into a lawnmower. So it's like, you got to be careful which tools you have. Think about a platform. Think holistically, but if you take the baby steps and implement it, certainly it's there. My personal opinion, I think the Cloud is the equalizer. Cloud can bring compute power that changes what a tool was built for.
Even, go back six years, the tools that were out there even six years ago are completely changed by the impact of unlimited, potentially unlimited capacity horsepower. So, okay that resets a little bit. You agree? >> I do. I totally agree. >> Who wins, who loses on the reset? >> The Cloud is an equalizer, but there is a mindset shift that goes with that. Those who can adapt to the mindset shift will win. Those who cannot and are still clinging to their old practices will have a hard time. >> Yeah, it's exciting. If you're still reinventing Hadoop from 2011 then, probably not in good shape right now. >> Jagane: Not a good place to be. >> Using Hadoop is great for batch, but you can't make that be a lawnmower. That's my opinion. Okay, thanks for coming on. I appreciate it (laughs) You're smiling, you got something that you, no? >> Pranav: (laughs) Thank you so much for that comment. >> Yeah, tool sheds are out there, be careful. Guys do your job. Congratulations on your partnership, appreciate it. This is theCUBE, live in New York. More after this short break. We'll be right back.

Published Date : Sep 27 2017

SUMMARY :

Brought to you by SiliconANGLE Media Welcome to theCUBE, and good to see you again. of the leadership of Satya Nadella. and you want to create a hybrid We talked a little bit about that yesterday. It's an ingredient for you guys. and there are different use cases that you can have that you guys are having? and once you are done with your experimentation, Got to give you props on that. in one form or the other. the cool thing is, you can replicate into the Cloud, What's the latest update on your end? You also have streaming scenarios that you can do using Has the Kupernetes craze come down to your group yet? I focus more on the HDInsight big data space, on to the stack. and then you got the classic infrastructure So, it depends, the Cloud gives you lots of options of the culture, even just at Ignite. and especially around the Azure stack, Where do you want to take the relationship with Microsoft? At the end of the day, if you have stale data, in terms of the new way to do A couple times, so you know how it's supposed to work. and then you turn the knobs. and one easy way to do it is to You can take your Microsoft hat off. And the whole idea is as you get more comfortable Thoughts on the Big Data NYC this week? but there is a lot of Cloud in this conference, Don't you think? and you can apply and build these models, So it's like, you got to be careful which tools you have. I totally agree. and are still clinging to their old practices Yeah, it's exciting. but you can't make that be a lawnmower. Congratulations on your partnership, appreciate it.

ENTITIES

Entity	Category	Confidence
Microsoft	ORGANIZATION	0.99+
Brendan Burns	PERSON	0.99+
Two times	QUANTITY	0.99+
2011	DATE	0.99+
Amazon	ORGANIZATION	0.99+
New York	LOCATION	0.99+
Satya Nadella	PERSON	0.99+
Google	ORGANIZATION	0.99+
Hortonworks	ORGANIZATION	0.99+
Jagane Sundar	PERSON	0.99+
three weeks	QUANTITY	0.99+
Jagane	PERSON	0.99+
fifth year	QUANTITY	0.99+
Manhattan	LOCATION	0.99+
yesterday	DATE	0.99+
HDInsight	ORGANIZATION	0.99+
CUBE	ORGANIZATION	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
tomorrow	DATE	0.99+
WANdisco	ORGANIZATION	0.99+
20	QUANTITY	0.99+
Pranav	PERSON	0.99+
one-click	QUANTITY	0.99+
Pranav Rastogi	PERSON	0.99+
two	QUANTITY	0.99+
New York City	LOCATION	0.99+
Midtown Manhattan	LOCATION	0.99+
this year	DATE	0.99+
eighth year	QUANTITY	0.98+
One example	QUANTITY	0.98+
SQL	TITLE	0.98+
both worlds	QUANTITY	0.98+
both	QUANTITY	0.98+
Linux	TITLE	0.97+
one	QUANTITY	0.97+
Spark	TITLE	0.97+
Azure	TITLE	0.97+
NYC	LOCATION	0.97+
two guests	QUANTITY	0.97+
this week	DATE	0.97+
six years ago	DATE	0.97+
today	DATE	0.96+
CTO	PERSON	0.96+
Ignite	EVENT	0.96+
one form	QUANTITY	0.96+
80's	DATE	0.95+
Ignite	ORGANIZATION	0.95+
Hadoop	TITLE	0.95+
Azure	ORGANIZATION	0.95+
single	QUANTITY	0.95+
Oliver Chu	PERSON	0.94+
Azure Cloud	TITLE	0.93+
one easy way	QUANTITY	0.93+
WANdisco	TITLE	0.91+

David Richards | AWS re:Invent 2016

>> Announcer: Live from Las Vegas, Nevada. It's the CUBE, covering AWS re:Invent 2016. Brought to you by AWS and its ecosystem partners. (light techno music) Now, here's your host. >> And we're back, happy to welcome back to the program, regular guest on our program, David Richards, who is the founder and CEO of WANdisco. David, anything interesting happen since last time, you know, we've talked to you? >> David: Well I kind of got, you guys are a bad omen for me. Kind of left the CUBE in New York, got off a plane, got fired, and then four days later got reinstated. Apart from that, virtually nothing's happened actually. >> Hey, you know it's good coverage in The Financial Times, and then lots of press and everything, so lots more people know about WANdisco now, right? >> David: That's right, and I don't have Tourette's, I promise. (laughs) >> Alright, David, AWS re:Invent, I mean, pretty impressive show, you know we see you in a lot of shows, many of them interesting, lots of smart people but I mean, wow this is pretty impressive. They got up on stage lots of things that I'm sure interest you, give us your take of the show so far. >> It's fascinating, I mean, this sort of must have been, I wasn't there when, you know, Steve Jobs was launching the first Mac and so on, but this kind of feels, more than just a small movement. This is a large shift in enterprise, moving from On-premises to Cloud, I think it's unquestionable that's happening. I mean, I'm sure you've covered it this week on The Cube. I've not seen it, but 32,000 people are here. Virtually every single vendor that you could ever think of is exhibiting in this exhibit hall. You can barely move about the people. Our booth traffic has just been phenomenal this week, and it really feels like this is a seismic shift in the marketplace. I know we've been saying that for a while, but it really does feel that way. 
>> Why do you think now, is it just, we just got here, and it's the overnight success that's been ten years in the making, or was there an event or something that really, kind of, tipped it over to where we are, because clearly, it's very different than last year. >> It, sort of, Cloud V1, and you guys have been covering this for a long time, was really companies that were born in the Cloud, it was the Airbnbs, it was the Tinders, it was the Facebooks and so on. Those companies were actually made, born in the Cloud. What's now happening, clearly, is enterprise is moving to the Cloud, and Cloud 2.0 really is about a different set of requirements, a different set of customers. There are customers with massive petabyte-scale data sets that they really can't take advantage of, they can't really scale out, it's too complex for them to build many of the applications they need to build, they now have to move to Cloud, and, you know, 32,000 people are not here just for the sake of it, they're here because they have to be here, because they're moving, obviously, to Cloud, and AWS have such a massive lead, I think, in the Cloud at the moment and Enterprise Cloud, and that's probably why so many people are here. >> David, one of the interesting things to look at at this show is, Amazon has some opinions about where data lives, how it moves, where you process it, you know, all of those kind of things. You guys are kind of opinionated on those kind of things too so, you know, give us your view on those kind of, those guys. I mean, I made a comment on Twitter, it was like, "Hey, what do we call a data lake when it's in the Cloud now?" >> Jeff: Well look, that's what happens to Clouds, they-- >> One of the big reveals in Andy Jassy's talk this morning was a truck coming across the front of the stage, and I've had so many emails saying, is this real, is this a joke, are we now really moving data in a semi from On-premises into the Cloud? 
And, it's kind of interesting, I think it's a little bit of a gimmick to be honest with you, I think Amazon do lots of great things, there were lots of wonderful announcements today, like opening up Alexa and allowing, you know, and some of the things they're doing with serverless computing, just phenomenal, but I think a truck to move data from On-premises to Cloud, kind of feels like we're back in the 1970s to me, whereas I was talking to the CIO of an automotive company a couple of weeks ago. They have a problem where, you know, to move data causes an outage in their organization today of about 30 hours. Their data growth is going to be so vast, the velocity is going to be so great in the next 12 months, that if they use the existing technology that they have today, it would take them in the region of a month to move that data. So, trucks are great for cold, archival data, well they might be great for cold, archival data, I'm sure you could figure out a better way, like the internet to move it, but for active transactional data, data that changes and moves, that's critical to the organization, you simply can't put it on the back of a truck and basically mail it to Amazon with a Snowball, that really doesn't work, and I think the market really needs to be educated a little bit about what's possible. >> Well, and I don't know that Amazon would necessarily disagree with you. I mean, if you look at the Snowball family, they had the Snowball Edge out there, which was a realization, hey I might want compute, and even, we're going to give you that new Greengrass Lambda, serverless type stuff, so that you can do processing where there's no network, or I can't do anything, but, I guess, we know from a physics standpoint, I understand, you know, the internet is great, but, you know, if I want to move, you know, 100 petabytes or more of data, you know, even if I'm a Telco, that's a ton of data that I need to move. So, tell me where there's this disconnect.
>> So, the way that WANdisco's technology works, is we continually replicate data, so where every other form of data replication is time based, it requires the concept of a clock, like, even Google, who've got Google Spanner, which is kind of active/active replication, but relies on a satellite in the sky, on atomic clocks, GPS clocks on every single server. We don't have any of that reliance, we're transactional data replication, which means if something changes, it gets replicated, and that process is continuous, which means that you can basically move data applications without any downtime or interruption to service. And that's absolutely critical for what I called earlier Cloud V2, which is the enterprises moving to Cloud, they have to be able to get there without any interruption to service. Small data, yeah, you can use that kind of technology, or non-strategic data, yeah, you can use this kind of technology. Strategic data and strategic applications, trading systems, you know, you can't be 99.99% correct if somebody's got cancer or not, right? If you're using the Cloud, or machine learning technology to figure that out, you can't be, you know, almost certain, you need to be completely certain, and that requires data to be where it's supposed to be. >> So, Amazon's a partner of yours. What's it like being a partner of Amazon's these days? Give us your point on that. >> Amazon are a phenomenal company. They have to be, right, they've just built, probably the world's most valuable enterprise technology business by a country mile in ten years. I mean, it's just, you know, zero to 10 billion in (snaps fingers) the blink of an eye is just incredible. And part of their secret is, they base everything on data, and I've learned a lot from dealing with Amazon actually, everything is data driven. 
You know, they have this Five Whys, I'm sure you've read about it in the media, where you have to prove, through facts and figures, not sentiment, that something is so, and that's pretty uncomfortable for a lot of people. For us, it's not, and it's, working with Amazon, their requirements, the bar is so high it's made our products much much much better. They have a Well-Architected review that they go through with all their partners. They're actually great to partner with, if you're not a very good company, I would dare say, don't bother because they'll find you out very quickly. But they're a great set of guys, very very good to partner with, it's very black and white, it's very quantitative, but, yeah, they've obviously got a huge market. >> Yeah. One of the things I love about this show is that the quality of people, you know, is phenomenal, and you get such a, I mean, a huge cross-section, not only location, size, industry, but one of the things I think that is across everybody that comes here, is they're trying new things, they're open to, you know, moving forward, iterating, learning, which has been one of the things that, you know, we kind of say what holds companies back is like, oh I'm doing it the old way. So, what's your experience been with the users? Any stories you can tell from that standpoint? >> So, right down to the bottom of the organization, they're prepared to take any idea. I mean, Amazon Web Services, for goodness' sake was basically a paper that was written and presented to Jeff Bezos, right, who said, yeah that's a good idea to Jassy and said yeah, let's go off and do it. But they, virtually every innovation in their organization is somebody coming up with an idea. They have the mechanics and machinery to listen to that idea.
We do it ourselves, so, we're looking at serverless compute and using Lambda so we can have replication literally as a service that you can just call, you can call Paxos, which is our core IP, it's based on Paxos, it's called DConE, so you can call that algorithm and get a replication service. So these concepts, some of the concepts that Amazon are introducing, their ability to move so quickly to introduce new products is because they have this innovative approach where they allow people, right down to the very bottom of the organization, to come up with new ideas and approaches to doing things. And it's perfectly fine for somebody at the bottom of their organization to challenge somebody at the top of the organization. In fact, they expect it. And again, that's not comfortable for a lot of people, but I like the way that they go around their business. >> I'm looking forward to, Alexa, how's my replication doing? (laughs loudly) >> Wouldn't that be great? >> Well, it's interesting you say that, we had Malcolm Gladwell on a month or two ago, and he talked about, the most powerful organizations are the ones that let the fresh ideas bubble up from the bottom because it's the people that have not been tainted by being part of the company, that have new and creative and innovative, and a different way of looking at it, and oftentimes they get squelched, so the fact that they let those ideas come up, and also driven by data, pretty powerful. >> It's interesting being at the show this week, and I have two types of meetings, I have meetings with companies at the forefront of this Cloud revolution, companies at the forefront of building new, innovative applications that were designed for the Cloud, and then I have other meetings with companies, vendors, who have been caught out by this.
They didn't see this coming, they didn't expect, you know, this sea change to happen as quickly as it's happening and they really are fighting and scrambling to know what to do, and this is everything from, you know, the big services companies, the big traditional enterprise storage companies are really struggling to understand what they're going to do with the Cloud, and they don't have those processes and procedures inside their businesses like we do. Like, they can't change and be agile and nimble and take advantage of these new products and markets that are suddenly appearing overnight. >> Yeah, it's funny, the guy from (mumbles) was talking about, they don't want to be a system integrator anymore, right now it's services integration and really changing the way you think about putting this stuff together, it's very different. >> It is very different, and, it used to be the case that you'd get, and I know we've all lived through this, you get the enterprise sales guy that turns up in the $2,000 suit and the Porsche parked outside, and comes in and sells you, you know, a piece of software, and asks you how your wife and kids are doing and all the rest of it. Look at the audience here today. They're not going to put up with, you know, that style of enterprise sales moving forward. People are buying stuff from a marketplace. The expectation is you can choose, select, deploy, and build applications yourself, and that's how many of these companies are operating today. So it's not just the sea change in the technology, the technology's facilitating completely different and new markets. >> Jeff: Behaviors, yeah. >> David, want to give you the final word on, as you leave this show, you know, your takeaways, what you want people to know. >> Clearly we're in an era where, this is going to be an Enterprise Cloud. Cloud 2.0 is all about enterprises that are taking their data from On-premises into the Cloud. It's happening very quickly. 
32,000 people are here this week, they're here for a reason, because they have to be. This is a sea change in the marketplace, and I hope, well I know WANdisco's in the vanguard of moving many of those enterprises from On-premises into the Cloud very quickly. >> Alright, absolutely, definitely agree with the sea change there. David Richards, founder and still CEO of WANdisco, really appreciate you joining us again. We'll be back to wrap up our coverage of today at AWS re:Invent 2016. You're watching the CUBE. (light techno music)

Published Date : Dec 1 2016


ENTITIES

Entity	Category	Confidence
Amazon	ORGANIZATION	0.99+
David	PERSON	0.99+
David Richards	PERSON	0.99+
Jeff Bezos	PERSON	0.99+
Jeff	PERSON	0.99+
$2,000	QUANTITY	0.99+
Andy Jassy	PERSON	0.99+
Telco	ORGANIZATION	0.99+
Steve Jobs	PERSON	0.99+
New York	LOCATION	0.99+
AWS	ORGANIZATION	0.99+
Amazon Web Services	ORGANIZATION	0.99+
ten years	QUANTITY	0.99+
WANdisco	ORGANIZATION	0.99+
zero	QUANTITY	0.99+
Porsche	ORGANIZATION	0.99+
Jassy	PERSON	0.99+
Cloud 2.0	TITLE	0.99+
32,000 people	QUANTITY	0.99+
last year	DATE	0.99+
Lambda	TITLE	0.99+
100 petabytes	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
Las Vegas, Nevada	LOCATION	0.99+
this week	DATE	0.99+
two types	QUANTITY	0.99+
99.99%	QUANTITY	0.98+
Cloud	TITLE	0.98+
1970s	DATE	0.98+
about 30 hours	QUANTITY	0.98+
first	QUANTITY	0.98+
four days later	DATE	0.98+
Malcolm Gladwell	PERSON	0.97+
today	DATE	0.97+
Alexa	TITLE	0.97+
10 billion	QUANTITY	0.97+
One	QUANTITY	0.97+
Twitter	ORGANIZATION	0.94+
one	QUANTITY	0.93+
Mac	COMMERCIAL_ITEM	0.92+
Cloud V1	TITLE	0.9+
Facebooks	ORGANIZATION	0.9+
next 12 months	DATE	0.88+
a month	QUANTITY	0.88+
a month or	DATE	0.88+
Snowball Edge	COMMERCIAL_ITEM	0.88+
couple of weeks ago	DATE	0.83+
two ago	DATE	0.82+
this morning	DATE	0.81+
Tinders	TITLE	0.8+
Tourette	ORGANIZATION	0.8+
ton of data	QUANTITY	0.78+
V2	TITLE	0.77+
Invent 2016	EVENT	0.76+
Invent	EVENT	0.76+

Ritika Gunnar & David Richards - #BigDataSV 2016 - #theCUBE

>> Narrator: From San Jose, in the heart of Silicon Valley, it's The Cube, covering Big Data SV 2016. Now your hosts, John Furrier and Peter Burris. >> Okay, welcome back everyone. We are here live in Silicon Valley for Big Data Week, Big Data SV Strata Hadoop. This is The Cube, SiliconANGLE's flagship program. We go out to the events and extract the signals from the noise. I'm John Furrier, my co-host is Peter Burris. Our next guest is Ritika Gunnar, VP of Data and Analytics at IBM and David Richards is the CEO of WANdisco. Welcome to The Cube, welcome back. >> Thank you. >> It's a pleasure to be here. >> So, okay, IBM and WANdisco, why are you guys here? What are you guys talking about? Obviously, partnership. What's the story? >> So, you know what WANdisco does, right? Data replication, active-active replication of data. For the past twelve months, we've been realigning our products to a market that we could see rapidly evolving. So if you had asked me twelve months ago what we did, we were talking about replicating just Hadoop, but we think the market is going to be a lot more than that. I think Mike Olson famously said that this Hadoop was going to disappear and he was kind of right because the ecosystem is evolving to be a much greater stack that involves applications, cloud, completely heterogeneous storage environment, and as that happens the partnerships that we would need have to move on from just being, you know, the sort of Hadoop-specific distribution vendors to actually something that can deliver a complete solution to the marketplace. And very clearly, IBM has a massive advantage in the number of people, the services, ecosystem, infrastructure, in order to deliver a complete solution to customers, so that's really why we're here. >> If you could talk about the stack comment, because this is something that we're seeing. Mike Olson's kind of being political when he says make it invisible, but the reality is there is more to big data than Hadoop. 
There's a lot of other stuff going on. Call it stack, call it ecosystem. A lot of great things are growing, we just had Gaurav on from SnapLogic, who said, "everyone's winning." I mean, I just love that, it's totally true, but it's not just Hadoop. >> It's about all data and it's about all insight on that data. So when you think about all data, all data is a very powerful thing. If you look at what clients have been trying to do thus far, they've actually been confined to the data that may be in their operational systems. With the advent of Hadoop, they're starting to bring in some structured and unstructured data, but with the advent of IoT systems, systems of engagement, systems of record and trying to make sense of all of that, all data is a pretty powerful thing. When I think of all data, I think of three things. I think of data that is not only on premises, which is where a lot of data resides today, but data that's in the cloud, where data is being generated today and where a majority of the growth is. When I think of all data, I think of structured data, that is in your traditional operational systems, unstructured and semi-structured data from IoT systems et cetera, and when I think of all data, I think of not just data that's on premises for a lot of our clients, but actually external data. Data where we can correlate data with, for example, an acquisition that we just did within IBM with The Weather Company or augmenting with partnerships like Twitter, et cetera, to be able to extract insight from not just the data that resides within the walls of your organization, but external data as well. >> The old expression is if you want to go fast, do it alone, if you want to go deeper and broader and more comprehensive, do it as a team. >> That's right. >> That expression can be applied to data. And you look at the weather data, you think, hmmm, that's an outlier type acquisition, but when you think about the diversity of data, that becomes a really big deal.
And the question I want to ask you guys is, and Ritika, we'll start with you, there's always a few pressure points we've seen in big data. When that pressure is relieved, you've seen growth, and one was big data analytics kind of stalled a little bit, the winds kind of shifted, eye of the storm, whatever you want to call it, then cloud comes in. Cloud is kind of enabling that to go faster. Now, a new pressure point that we're seeing is go faster with digital transformation. So all data kind of brings us to all digital. And I know IBM is all about digitizing everything and that's kind of the vision. So you now have the pressure of I want all digital, I need data driven at the center of it, and I've got the cloud resource, so kind of the perfect storm. What are your thoughts on that? Do you see that similar picture? And then does that put the pressure on, say, WANdisco, say hey, I need replication, so now you're under the hood? Is that kind of where this is coming together? >> Absolutely. When I think about it, it's about giving trusted data and insights to everyone within the organization, at the speed in which they need it. So when you think about that last comment of, "At the speed in which they need it," that is the pressure point of what it means to have a digitally transformed business. That means being able to make insights and decisions immediately and when we look at what our objective is from an IBM perspective, it's to be able to enable our clients to be able to generate those immediate insights, to be able to transform their business models and to be able to provide the tooling and the skills necessary, whether we have it organically, inorganically, or through partnerships, like with WANdisco to be able to do that. And so with WANdisco, we believe we really wanted to be able to activate where that data resides.
When I talk about all data and activation of that data, WANdisco provided to us complementary capabilities to be able to activate that data where it resides with a lot of the capabilities that they're providing through their Fusion product. So, being able to have and enable our end-users to have that digitally infused set of reactive type of applications is absolutely something... >> It's like David, we talk about, and maybe I'm oversimplifying your value proposition, but I always look at WANdisco as kind of the five nines of data, right? You guys make stuff work, and that's the theme here this year, people just want it to work, right? They don't want to have it down, right? >> Yeah, we're seeing, certainly, an uptick in understanding about what high availability, what continuous availability means in the context of Hadoop, and I'm sure we'll be announcing some pretty big deals moving forward. But we've only just got going with IBM. I would, the market should expect a number of announcements moving forward as we get going with this, but here's the very interesting question associated with cloud. And just to give you a couple of quick examples, we are seeing an increasing number of Global 1,000 companies, Fortune 100 companies move to cloud. And that's really important. If you would have asked me 12 months ago, how is the market going to shape up, I'd have said, well, most CIOs want to move to cloud. It's already happening. So, FINRA, the major financial regulator in the United States is moving to cloud, publicly announced it. The FCA in the UK publicly announced they are moving 100% to cloud. So this creates kind of a microcosm of a problem that we solve, which is how do you move transactional data from on-premise to cloud and create a sort of hybrid environment. Because with the migration, you have to build a hybrid cloud in order to do that anyway. So, if it's just archive systems, you can package it on a disk drive and post it, right?
If we're talking about transactional data, i.e., stuff that you want to use, so for example, a big travel company can't stop booking flights while they move their data into the cloud, right? They would take six months to move petabyte scale data into cloud. We solve that problem. We enable companies to move transactional data from on-premise into cloud, without any interruption to services. >> So not six months? >> No, not six months. >> Six hours? >> And you can keep on using the data while it is in transit. So we've been looking for a really simplistic problem, right, to explain this really complex algorithm that we've got that you know does this active-active replication stuff. That's it, right? It's so simple, and nobody else can do it. >> So no downtime, no disruption to their business? >> No, and you can use the cloud or you can use the on-prem applications while the data is in transit. >> So when you say all cloud, now we're on a theme, all data, all digital, all cloud, there's a nuance there because most, and we had Gaurav from SnapLogic talk about it, there's always going to be an on-prem component. I mean, probably not going to see 100% everyone move to the cloud, public cloud, but cloud, you mean hybrid cloud essentially, with some on-prem component. I'm sure you guys see that with Bluemix as well, that you've got some dabbling in the public cloud, but ultimately, it's one resource pool. That's essentially what you're saying. >> Yeah, exactly. >> And I think it's really important. One of the things that's very attractive about the WANdisco solution is that it does provide that hybridness from the on-premises to cloud and that being able to activate that data where it resides, but being able to do that in a heterogeneous fashion. Architectures are very different in the cloud than they are on premises.
When you look at it, your data lake may be as simple as a Swift object store or S3, and you may be using elements of Hadoop in there, but the architectures are changing. So the notion of being able to handle hybrid solutions both on-premises and cloud with the heterogeneous capability in a non-invasive way that provides continuous data is something that is not easily achieved, but it's something that every enterprise needs to take into account. >> So Ritika, talk about the why of the WANdisco partnership, and specifically, what are some of the conversations you have with customers? Because, obviously there's, it sounds like, the need to go faster and have some of this replication active-active and kind of, five nines if you will, of making stuff not go down or non-disruptive operations or whatever the buzzword is, but you know, what's the motivation from your standpoint? Because IBM is very customer-centric. What are some of the conversations and then how does WANdisco fit into those conversations? >> So when you look at the top three use cases that most clients use for even Hadoop environments or just what's going on in the market today, the top three use cases are you know, can I build a logical data warehouse? Can I build areas for discovery or analytical discovery? Can I build areas to be able to have data archiving? And those top three solutions in a hybrid heterogeneous environment, you need to be able to have active-active access to the data where that data resides. And therefore, we believe, from an IBM perspective, that we want to be able to provide the best of breed regardless of where that resides. And so we believe from a WANdisco perspective, that WANdisco has those capabilities that are very complementary to what we need for that broader skills and tooling ecosystem and hence why we have formed this partnership.
>> Unbelievably, in the market, we're also seeing and it feels like the Hadoop market's just got going, but we're seeing migrations from distributions like Cloudera into cloud. So you know, those sort of lab environments, the small clusters that were being set up. I know this is slightly controversial, and I'll probably get darts thrown at me by Mike Olson, but we are seeing pretty large-scale migration from those sort of labs that were set up initially. And as they progress, and as it becomes mission-critical, they're going to go to companies like IBM, really, aren't they, in order to scale up their infrastructure? They're going to move the data into cloud to get hyperscale. For some of these cases that Ritika was just talking about so we are seeing a lot of those migrations. >> So basically, Hadoop, there's some silo deployments of POC's that need to be integrated in. Is that what you're referring to? I mean, why would someone do that? They would say okay, probably integration costs, probably other solutions, data. >> If you do a roll-your-own approach, where you go and get some open-source software, you've got to go and buy servers, you've got to go and train staff. We've just seen one of our customers, a big bank, two years later get servers. Two years to get servers, to get server infrastructure. That's a pretty big barrier, a practical barrier to entry. Versus, you know, I can throw something up in Bluemix in 30 minutes. >> David, you bring up a good point, and I want to just expand on that because you have a unique history. We know each other, we go way back. You were on The Cube when, I think we first started seven years ago at Hadoop World. You've seen the evolution and heck, you had your own distribution at one point. So you know, you've successfully navigated the waters of this ecosystem and you had gray IP and then you kind of found your swim lanes and you guys are doing great, but I want to get your perspective on this because you mentioned Cloudera. 
You've seen how it's evolving as it goes mainstream, as you know, Peter says, "The big guys are coming in and with power." I mean, IBM's got a huge Spark investment and it's not just you know, lip service, they're actually donating a ton of code and actually building stuff so, you've got an evolutionary change happening within the industry. What's your take on the upstarts like Cloudera and Hortonworks and the distro game? Because that now becomes an interesting dynamic because it has to integrate well. >> I think there will always be a market for the distribution of open-source software. As that sort of, that layer in the stack, you know, certainly Cloudera, Hortonworks, et cetera, are doing a pretty decent job of providing a distribution. The Hadoop marketplace, and Ritika laid this on pretty thick as well, is not Hadoop. Hadoop is a component of it, but in cloud we talk about object store technology, we talk about Swift, we talk about S3. We talk about Spark, which can be run stand-alone, you don't necessarily need Hadoop underneath it. So the marketplace is being stretched to such a point that if you were to look at the percentage of the revenue that's generated from Hadoop, it's probably less than one percent. I talked 12 months ago with you about the whale season, the whales are coming. >> Yeah, they're here. >> And they're here right now, I mean... >> (laughs) They're mating out in the water, deals are getting done. >> I'm not going to deal with that visual right now, but you're quite right. And I love the Peter Drucker quote which is, "Strategy is a commodity, execution is an art." We're now moving into the execution phase. You need a big company in order to do that. You can't be a five hundred or a thousand person... >> Is Cloudera holding onto dogma with Hadoop or do they realize that the ecosystem is building around them? >> I think they do because they're focused on the application layer, but there's a lot of competition in the application layer.
There's a little company called IBM, there's a little company called Microsoft and the little company called Amazon that are kind of focused on that as well, so that's a pretty competitive environment and your ability to execute is really determined by the size of the organization to be quite frank. >> Awesome, well, so we have Hadoop Summit coming up in Dublin. We're going to be in Ireland next month for Hadoop Summit with more and more coverage there. Guys, thanks for the insight. Congratulations on the relationship and again, WANdisco, we know you guys and know what you guys have done. This seems like a prime time for you right now. And IBM, we just covered you guys at InterConnect. Great event. Love The Weather Company data, as a weather geek, but also the Apple announcement was really significant. Having Apple up on stage with IBM, I think that is really, really compelling. And that was just not a Barney deal, that was real. And the fact that Apple was on stage was a real testament to the direction you guys are going, so congratulations. This is The Cube, bringing you all the action, here live in Silicon Valley here for Big Data Week, BigData SV, and Strata Hadoop. We'll be right back with more after this short break.

Published Date : Mar 30 2016


ENTITIES

Entity	Category	Confidence
Michiel	PERSON	0.99+
Anna	PERSON	0.99+
David	PERSON	0.99+
Bryan	PERSON	0.99+
John	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Michael	PERSON	0.99+
Chris	PERSON	0.99+
NEC	ORGANIZATION	0.99+
Ericsson	ORGANIZATION	0.99+
Kevin	PERSON	0.99+
Dave Frampton	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Kerim Akgonul	PERSON	0.99+
Dave Nicholson	PERSON	0.99+
Jared	PERSON	0.99+
Steve Wood	PERSON	0.99+
Peter	PERSON	0.99+
Lisa Martin	PERSON	0.99+
NECJ	ORGANIZATION	0.99+
Lisa Martin	PERSON	0.99+
Mike Olson	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Dave	PERSON	0.99+
Michiel Bakker	PERSON	0.99+
FCA	ORGANIZATION	0.99+
NASA	ORGANIZATION	0.99+
Nokia	ORGANIZATION	0.99+
Lee Caswell	PERSON	0.99+
ECECT	ORGANIZATION	0.99+
Peter Burris	PERSON	0.99+
OTEL	ORGANIZATION	0.99+
David Floyer	PERSON	0.99+
Bryan Pijanowski	PERSON	0.99+
Rich Lane	PERSON	0.99+
Kerim	PERSON	0.99+
Kevin Bogusz	PERSON	0.99+
Jeff Frick	PERSON	0.99+
Jared Woodrey	PERSON	0.99+
Lincolnshire	LOCATION	0.99+
Keith	PERSON	0.99+
Dave Nicholson	PERSON	0.99+
Chuck	PERSON	0.99+
Jeff	PERSON	0.99+
National Health Services	ORGANIZATION	0.99+
Keith Townsend	PERSON	0.99+
WANdisco	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
March	DATE	0.99+
Nutanix	ORGANIZATION	0.99+
San Francisco	LOCATION	0.99+
Ireland	LOCATION	0.99+
Dave Vellante	PERSON	0.99+
Michael Dell	PERSON	0.99+
Rajagopal	PERSON	0.99+
Dave Allante	PERSON	0.99+
Europe	LOCATION	0.99+
March of 2012	DATE	0.99+
Anna Gleiss	PERSON	0.99+
Samsung	ORGANIZATION	0.99+
Ritika Gunnar	PERSON	0.99+
Mandy Dhaliwal	PERSON	0.99+


Search Results for WANdisco: