Jagane Sundar, WANdisco | CUBEConversation, January 2019
>> Hello everyone. Welcome to this CUBE Conversation here in Palo Alto, California. I'm John Furrier, host of the CUBE. I'm here with Jagane Sundar, CTO, chief technology officer of WANdisco. Jagane, great to see you again. Thanks for coming on.
>> Thank you for having me, John.
>> So the conversation I want to have is about the technology behind WANdisco, and we've had many conversations. For the folks watching, go to our YouTube channel and you can see the evolution of these conversations over, I think, eight or nine years now that we've been chatting. What a level up. You guys are now in the cloud, with big announcements around multi-cloud and live data in particular. So the technology is the gift that keeps giving for WANdisco. You guys are continuing to take territory, now in a big way with cloud: big growth, a lot of changes, a lot of hires. What's going on?
>> So, as you well know, WANdisco stands for wide area network distributed computing, and the value of the wide area network aspect is really shining through now, because nobody goes to the cloud saying, "I'm going to put it in one data center." It's always multiple regions, and multiple data centers in each region. Suddenly, the problem of keeping your data consistent across multiple cloud vendors, or from on-prem to cloud, becomes a real challenge. We stepped in. We had something that was a good solution for small users and small data, but we developed it into something that's fantastic for large data volumes, and people are running into the problem. The biggest problem that IT providers have is that data scientists do not respect data that's not consistent. If you look at a replica of data and you're not sure whether it's exactly accurate or not, the data scientist who has spent all his time building algorithms to predict some model is going to look at it and go, "That data's not quite right. I'm not going to look at it."
So if you use an inconsistent tool or an inadequate tool to replicate your data, you have the problem that nobody is going to respect the replicas. Everybody's going to go back to the source of truth. We solve that problem elegantly and accurately.
>> State the problem specifically. Is it the integrity of the data? What is the specific problem statement that you guys solve with technology?
>> Let me give you an example. You have notifications that come out of cloud object stores when an object is placed into the store or deleted from the store. That's best-effort delivery. If there are logjams in the mechanism used to deliver them, some notifications may be dropped. The problem with using that notification mechanism to replicate your data is that over a period of time, say you have two or three petabytes of data and you're replicating it over a month or a month and a half, you'll find that maybe 0.1 percent of your data is not quite accurate anymore. So the value of the replica is essentially zero.
>> Like a leaky pipe, basically.
>> Indeed. If you have a leaking pipe, then it's just totally...
>> We need to have integrity end to end. All right, let's get back to some of the things I want to ask, because I think it's fascinating. I've been following your story for years. You had a point solution, wide area, and you had active-active replication, great for data centers. So disaster recovery: not mission critical, but certainly critical. Correct, depending on the mission. Then in comes cloud. You mentioned wide area networks. Go back to the old days, when I was breaking into the business. That's when they had, you know, dial-up modems and pagers, not even cell phones; those were just starting. The wide area network was a really complicated beast, and all the best resources worked on it. Bandwidth was expensive, you had remote offices, and you had campus networking then. So wide area networking went through that phase one. Correct.
Now we're living in the WAN all the time. Cloud is WAN, wide area.
>> Correct, cloud is WAN. But there are subtle aspects that people miss all the time. If you go to store an object in Amazon S3, for example, you pick a region. If it's a completely wide area distributed entity, why do you need to pick a region? The truth is, each cloud vendor hides a number of region-specific or local area network-specific aspects of their service. DynamoDB runs in one data center, or one region, with two or three availability zones in a region. If you want to replicate that data, you don't really have much help from the cloud vendors themselves. So you need to parse the truth from what is offered. What you will find is that the WAN is still a very challenging problem for a lot of these data replication problems.
>> Talk about the wide area network challenges in the modern era we're living in, which is cloud computing. You mentioned some of the nuances around regions and availability zones. Basically, the cloud grew up as building blocks, and the plumbing underneath is essentially a hybrid of certain techniques in networking: local area networks, VLANs, tunneling, all this stuff, NATs, routers. So it's obviously plumbing. Yes. What's different now that's important to take that to the next level? Because, you know, there are arguments saying, hey, with GDPR, I might want to have certain regions be smarter, right? So you're starting to see a level up in what Amazon and others are doing. Google, in particular, talks about this a lot, as does Microsoft. What's that next level of WAN, where the plumbing is upgraded from, basically, those other things?
>> So the problem really has to be stated in terms of your data architecture. If you look at your data and figure out that you need this set of data to be available for your business-critical applications, then the problem turns into:
I need replicas of this data in this region and the other regions, perhaps in two different cloud vendor locations, because you don't want to be tied down to the availability of one cloud vendor. Then the problem turns into: how do you hide the complexity of replicating and keeping this data consistent from the users of the data, the data scientists, the application authors, and so on? Now, that's where we step in. We have a transparent replication solution that fits into the plumbing. It's often offered by the IT folks as part of their cloud offering or as part of the hybrid offering. The application developers don't really need to worry about those things. A specific example would be Hive tables that a user is building in one data center. An IT professional from that organization can buy our replication software. That table will be available in multiple data centers and multiple regions, available for both read and write. The user did not do anything and does not need to be aware. So if you have a problem such as "GDPR requires this data to be here, but this summarized data can be available across all of these regions," then we can solve the problem elegantly for you without any application rewiring or reauthoring.
>> Talk about the technology that makes all this happen. Again, this has been a key part of your success at WANdisco. I've always loved the name, wide area. I was a big wide area network fan; I did that in my early days, configuring router tables. You know how it was, hardcore back then. Distributed systems at large scale are now part of the cloud. So all the large-scale guys like me, when we grew up in our computer science days, had to think about systems architecture at scale. We're actually living it now. Correct. So talk about the technology. What specifically do you guys have that's your technology, and talk about the impact on the scale piece. I think that's a real key technology piece.
>> Indeed.
So the core of our algorithm is enhancements to, and a superior implementation of, an algorithm called Paxos. Now, Paxos itself is the only mathematically proven algorithm for keeping replicas consistent across multiple machines, multiple regions, or multiple data centers. The other alternatives, such as Raft and the ZooKeeper protocol, are all compromises for the sake of ease of implementation. Now, we don't fear the cost of implementation. We spent many years doing the research on it, so we have a fantastic implementation of Paxos that is extended for use over wide area networks without any special hardware. I mention the "without any special hardware" piece because Google Spanner, which is one of our primary competitors, has an implementation that needs its own specific network and hardware. So the value of...
>> Because they tie the clock, the atomic clock, actually, to the infrastructure for their timing, so that's all synchronized. So it's only within Google Cloud?
>> Exactly. It cannot even be made available to Google's customers of Google Cloud. That was a feature that they added recently, but it's rolling out in a very limited way.
>> They inherited that from their large scale. Correct. Google, yes, with Bigtable and Spanner. These are awesome products.
>> These are awesome products, but they're very specific.
>> Tailored for Google.
>> Yes, they're great in the Google environment. They're not so great outside of Google. Now, we have technology that enables you to run this across Google Cloud, Microsoft's cloud, and Amazon's cloud. The value of this is that you have truly cloud-neutral solutions. You don't need to worry about vendor lock-in, you don't need to worry about availability problems in one of the cloud vendors, and then you can scale your solution. You can go in with an approach such that when the virtual machines or the compute resources in one cloud vendor are really inexpensive, we'll use that; when it's very expensive, we'll move our workloads to other locations.
You can think up architectures like that, with our solution underpinning your replication.
>> Right. Again, I'm going to ask you the technical questions. I love these conversations, getting down and dirty under the hood. So Joel Horowitz was on, your new CMO, formerly of Microsoft, a CUBE alumnus. Richard, your CEO, talked about the same thing: moving data around is the key value proposition, and that's tied right into the legacy of your IP and how that value comes with integrity, moving data from point A to point B. But the world's also moving to identify scenarios where I'm going to move compute rather than move the data, because people have recognized that moving data is hard. You've got latency and the cost of bandwidth. So two schools of thought, not mutually exclusive. When do you pick one?
>> Okay, absolutely, they're not mutually exclusive, because there are data availability needs that define some replication scenarios, and there are compute needs that can be more flexible. If you had the ability to, say, have data in Amazon's cloud and in Microsoft's cloud, you may want to use some Amazon-specific tools for specific compute scenarios and, at the same time, use Microsoft tools for other scenarios, or perhaps use open source tools like Hadoop in either one of those clouds. Those are all mechanisms that work perfectly well, but at the core you have to figure out your data architecture. If you can live with your data in one region or in one data center, clearly that's what you should do. But if you cannot have that data be unavailable, you do have to replicate it. At that point, you should consider replicating to a different cloud vendor, because availability is a concern with all these vendors.
>> So two things I hear you saying. One, availability is a driver. The other one is user preference. Yes. Why not have people who know Microsoft tools and Microsoft software work on the Microsoft framework, with someone using something else in another cloud? The same data can live in both places. You guys make that happen?
Is that what you're saying? Exactly. That's a big deal.
>> Absolutely. And we guarantee the consistency. That's a guarantee that you will not get from any other vendor.
>> So this basically debunks the whole lock-in. Yes. You guys are a solution to essentially relieve this notion of lock-in. So me, as a customer, I can say, hey, I'm on Amazon right now, we're all in on Amazon, but, you know, I've got some temptation to go to Azure or Google. Why wouldn't I, if I have the ability to make my data consistent? Exactly. Is that what you're saying?
>> That is exactly what I'm saying. You have this ability to experiment with different cloud vendors. You also have the ability to mitigate some of the cost aspect. If you're going to pay for copies in two different geographic locations, you might as well do it on two different cloud vendors, so you have a richer set of applications and better availability.
>> So for people who say data is the lock-in for cloud, they're kind of right, unless they use WANdisco, because, in a sense, you know, what really moves with it? I mean, if your data's there, you stay there. Yeah, that's kind of common sense. It's not so much technical lock-in, so there's no real technical lock-in. It's more operational lock-in. Correct. With data, if you don't want to... But if you're afraid of lock-in, you go with WANdisco. That's live data multi-cloud, is that...
>> That's live data multi-cloud, and it's this new ability to actually have active data sets that are available in different cloud vendor locations.
>> Well, that's a killer app right there. How do you feel? You must feel pretty good. You know, you and I have talked many times. Yes. But it's like you've been waiting for this moment. This is actually really wide area, a.k.a. cloud. It was a big data problem, which is only getting bigger. Exactly. Replication is now the transport between clouds for anti-lock-in. This is the Holy Grail for...
>> It is the Holy Grail for the industry.
We've been talking about it for years now, and we feel completely redeemed. Now we feel that the industry has gotten to the point where they understand what we've talked about. I feel very excited about the customer traction we're seeing, and watching our customers light up when we describe the attributes we bring, it's...
>> Exciting. And just the risk management alone is a hedge. I mean, if I'm someone dealing with the cybersecurity challenges alone on data, you've got data sovereignty, compliance. Never mind the productivity piece of it, which is pretty amazing. So you guys are changing the data equation.
>> Indeed. Our, you know, most excited customers are CEOs, because it mitigates risk from things like cybersecurity. As you point out, you may have a breach in one cloud vendor. You can turn that off and use your replica on the other cloud vendor's side instantly. Those are comforts you do not get with other solutions.
>> So we're all having a love fest here. I love the whole multi-cloud data, the anti-lock-in. I think that's a killer feature. I think that will sell, baby. But I'm going to say, OK, that's all good, but I'm going to get you on this one: security. No one has solved security yet. So if you solve that, then you've pretty much got it all. So tell me the security story.
>> So I'll start by saying, right, our biggest customer base is the financial industry: banking companies, insurance companies, health care. There is no industry in the world that's more security conscious than banking. And the government? The government, perhaps, I would say. I mean, the banks are really security
>> conscious. Their money's money.
>> Money is money. And they have a fiduciary responsibility both to governments and to their customers. So we've catered to these customers for upwards of a decade now. Every technical decision we make has security as one of the focus items.
>> Yeah, so you're good on security. You feel that way about security when it comes to data. Yes.
>> Encryption.
Is that what this is?
>> It's encrypted on the wire. We support all of the data-at-rest encryption schemes. We support all the Hadoop and cloud vendor security mechanisms. We have a cross-cloud product, so the security problems are multiplied, and we take care of each of those specifically. So you can be confident that your data is secure.
>> And wire-speed security, no overhead involved?
>> No overhead involved at all. It's not measurable.
>> So, well, congratulations on where you guys are. A lot more work to do. You guys are going to staff up. You're hiring a lot of people. Talk about the talent you're hiring real quick, because, you know, attracting large-scale talent is also one indicator, yeah, of a successful opportunity. I think the positioning is phenomenal. Congratulations. Tell us about the hiring.
>> As you know, as David mentioned a few minutes ago, we hired Joel from IBM for our marketing department. He's CMO, a wonderful hire. We've got Ronchi, who's from the University of Denver. He left as the head of that computer science department to come work for us. Another amazing guy, terrific background. We've got another hire, a UT Austin PhD, who's running engineering for us. We're so pleased to be able to hire talent at this level. As you well know, it's the people who make these jobs interesting and the products interesting. We are...
>> So what are some of the things those guys say when they get in and are really exposed to it? I mean, what would it take for someone to quit their tenured professor job at a university, which is pretty much retirement, to engage in a growing opportunity? What do they say?
>> So the single theme that you'll find in all of this is very complex, unique technology that has been refined, and it's on the verge of exploding to probably something ten to one hundred times the size it is today.
People see that when we show them the value of what we've got and the market that we're taking this to. They're just getting excited.
>> Well, congratulations. You guys have certainly worked hard. It has been great to watch the entrepreneurial journey of getting into that growth stream, with the wind at your back after all that hard work on the technology. It's phenomenal. Again, multi-cloud data: not worrying about where your data is is going to give people some ease and let them rest at night, because that's one of the number one worries besides security. Absolutely. Jagane Sundar, CTO, chief technology officer of WANdisco, here inside the CUBE in Palo Alto. I'm John Furrier. Thanks for watching.
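Sundar's point about best-effort notifications can be made concrete with a small simulation. The sketch below is illustrative only and is not WANdisco's method or any cloud vendor's API: the function names, the in-memory dictionaries standing in for object stores, and the assumption that exactly one notification in every thousand is lost are all hypothetical choices made to reproduce the roughly 0.1 percent drift he mentions.

```python
# Illustrative sketch: every name here is hypothetical, not a real vendor API.

def replicate_via_notifications(source, drop_every):
    """Build a replica by replaying one 'object created' notification per object.

    Every `drop_every`-th notification is lost, which models best-effort
    delivery under load. A lost notification means the replica never learns
    about that object.
    """
    replica = {}
    for i, (key, value) in enumerate(source.items(), start=1):
        if i % drop_every == 0:
            continue  # notification dropped; nothing is copied for this object
        replica[key] = value
    return replica


def find_drift(source, replica):
    """Return the keys whose replica copy is missing or differs from the source."""
    return sorted(k for k, v in source.items() if replica.get(k) != v)


# 100,000 objects stand in for the petabytes discussed in the interview.
source = {f"object-{n:06d}": f"payload-{n}" for n in range(100_000)}
replica = replicate_via_notifications(source, drop_every=1000)
drift = find_drift(source, replica)

print(f"{len(drift)} of {len(source)} objects are missing from the replica")
```

With these assumed numbers, 100 of the 100,000 objects never reach the replica. Nothing in the notification stream itself reveals which ones, so the only way to find them is a full comparison against the source, which is why a consumer who cannot trust the replica goes back to the source of truth.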