
Search Results for GoldenGate:

Juan Loaiza, Oracle | CUBE Conversation 2021


 

(upbeat music)

>> The innovation around databases has exploded over the last few years. Not only do organizations continue to rely on database technology to manage their most mission critical business data, but new use cases have emerged that process and analyze unstructured data, share data at scale, protect data, and provide greater heterogeneity. New technologies are being injected into the database equation: not just cloud, which has been a huge force in the space, but also AI to drive better insights and automation, blockchain to protect data and provide better auditability, new file formats to expand the utility of database technology, and more. Debates abound as to who's the best, who's number one, the fastest, the most cloudy, the least expensive, et cetera. But there is no debate when it comes to leadership in mission critical database technologies. That status goes to Oracle. And with me to talk about the developments of database technology in the market is CUBE alum Juan Loaiza, who's executive vice president of Mission Critical Database Technology at Oracle. Juan, always great to see you, thanks for making some time.

>> Thanks, great to see you Dave, always a pleasure to join you.

>> Yeah, and I hope you have some time, because I've got a lot of questions for you. (chuckles) I want to start with-

>> All right, I love questions.

>> Good, I want to start and we'll go deep if you're up for it. I want to start with the GoldenGate announcement. We're covering that recent announcement, the service on OCI. GoldenGate is part of the super high availability capabilities that Oracle is so well known for. What do we need to know about the new service and what it brings for your customers?

>> Yeah, so first of all, GoldenGate is all about creating real time data throughout an enterprise. So it does replication, data integration, moving data into analytic workloads, streaming analytics of data, migration of databases, and making databases highly available. All those are use cases for real-time data movement. And GoldenGate is really the leading product in the market, and has been for many years. We have about 80% of the global Fortune 500 running GoldenGate today, in addition to thousands and thousands of smaller customers. So it is the premier data integration, replication, and high availability platform; anything involving moving data in real time, GoldenGate is the premier platform. And so we've had that available as a product for many years. And what we've just recently done is release it as a cloud service, a fully managed and automated cloud service. So that's kind of the big new thing that's happening right now.

>> So is that what's unique about this, that it's now a service, or are there other attributes that are unique to Oracle?

>> Yeah, so the service is kind of the most basic part of it. But the big thing about the service is it makes this product dramatically easier to use. So traditionally the data integration and replication products, although very powerful, are also very complex to use. And one of the big benefits of the service is we've made it dramatically simpler, so not just super experts can use it, but anyone can use it. And also, as part of releasing it as a cloud service, we've done a number of unique things, including making it completely elastically scalable, pay per use, with dynamic scalability. So just in time, real time scalability. So as your workload increases, we automatically increase the throughput of GoldenGate. So previously you had to figure all this stuff out ahead of time. It was very static. All these products have been very static. Now it's completely dynamic, a native cloud product, and that's very unique in the market.
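To make the change-data-capture idea concrete, here is a minimal sketch of what a GoldenGate-style replication pipeline does conceptually: read committed change records from a source log and apply them to a target. Everything here (the record shape, the trail-file format, the function names) is a hypothetical illustration, not the actual GoldenGate API.

```python
import json
from dataclasses import dataclass
from typing import Iterator

@dataclass
class ChangeRecord:
    """One committed source change; a hypothetical stand-in for a trail-file record."""
    op: str      # "INSERT", "UPDATE", or "DELETE"
    table: str
    key: dict    # primary key columns
    row: dict    # full row image for inserts/updates

def read_changes(trail_path: str) -> Iterator[ChangeRecord]:
    """Read a (hypothetical) JSON-lines change log written by the capture process."""
    with open(trail_path) as trail:
        for line in trail:
            rec = json.loads(line)
            yield ChangeRecord(rec["op"], rec["table"], rec["key"], rec.get("row", {}))

def apply_change(target: dict, change: ChangeRecord) -> None:
    """Apply one change to an in-memory 'target table' keyed by primary key."""
    table = target.setdefault(change.table, {})
    pk = tuple(sorted(change.key.items()))
    if change.op == "DELETE":
        table.pop(pk, None)
    else:                       # INSERT and UPDATE both upsert the latest row image
        table[pk] = change.row

target_state: dict = {}
for change in read_changes("orders.trail.jsonl"):
    apply_change(target_state, change)
```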
>> So, I mean, from an availability standpoint, I guess IBM sort of has this with Db2, but it doesn't offer the heterogeneity that GoldenGate has. But what about AWS, Microsoft, Google; do they provide services like GoldenGate?

>> There's really nothing like the GoldenGate service. When you're talking about people like Google and Azure, they really have do-it-yourself third-party products. So there'll be a third-party data integration and replication product, and it's kind of available in their marketplace, and customers have to do everything. So it's basically a put-it-together-yourself kit, and it's very complicated. I mean, these data integration products have always been complicated, and they're even more complicated in the cloud if you have to do everything yourself. Amazon has a product, but it's really focused on basic data migration to their cloud. It doesn't have the same capabilities as Oracle has. It doesn't have the elasticity, it doesn't have pay per use, so it's really not very cloudy at all.

>> Well, so I mean, the biggest customers have always glommed onto GoldenGate because they need that super ultra high availability, and they're capable of doing it themselves. So tell us how this compares to DIY.

>> Yeah, so you mentioned the big customers, and you're absolutely right, the big customers have been big users of GoldenGate. Smaller customers are users as well; however, it's been challenging because it's complicated. Data integration has been a complicated area of data management, among the most complicated. And so one of the things this does is expand the market. It makes it dramatically easier for smaller companies that don't have as many IT resources to use the product. Also, smaller companies obviously don't have as much data as the really large giants, so they don't have as much data throughput. So traditionally the price has been high for a small customer. But now, with pay per use in the cloud, it eliminates the two big blockers for smaller enterprises, which are the high fixed costs and the complexity of the products. Which, by the way, is helpful for everyone also. And for big customers, they've also struggled with elasticity. So sometimes a huge batch job will kick in, the rate of change increases, and suddenly the replication product doesn't keep up, because on-prem products aren't really very elastic. So it helps large customers as well. The elasticity, pay per use, on-demand nature of it is really helpful for everybody.

>> Well, and because it's delivered as a service, I would imagine for the large customers you're giving them more granularity, so they can apply it maybe for a single application, as opposed to having to justify it across a whole suite. Because the cost was higher, but now you're allowing me to pay by the drink, is that right? I could just apply it at a more granular level.

>> Yes, that's exactly right. It's really pay per use. You can use it as much or as little as you want. You just pay for what you use. And as I mentioned, it's not a static payment either. So if you have a lot of data loads going on right now, you pay a little more; at night when you have less going on, you pay a lot less. So you're really just paying for what you use. It's very easy to set it up for a single application or all your applications.
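The "pay by the drink" point is easy to see with a little arithmetic. A toy example in Python; the rates and volumes are invented for illustration and are not Oracle's pricing.

```python
# Hypothetical metered pricing: charged per GB actually replicated, hour by hour.
RATE_PER_GB = 0.05          # invented rate, not Oracle's pricing

# A small customer's replication volume over a day: mostly quiet, one batch spike.
hourly_gb = [2] * 8 + [40] * 2 + [2] * 14

metered_cost = sum(gb * RATE_PER_GB for gb in hourly_gb)

# A statically sized system has to be provisioned for the 40 GB/hour peak all day.
static_cost = max(hourly_gb) * RATE_PER_GB * len(hourly_gb)

print(f"pay-per-use: ${metered_cost:.2f} vs peak-sized: ${static_cost:.2f}")
# pay-per-use: $6.20 vs peak-sized: $48.00
```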
>> How about for things like continuous replication or real-time analytics, is the service designed to support that?

>> Yes, so that's the heritage of GoldenGate. GoldenGate has been around for decades, and we've worked with some of the most demanding customers in the world on exactly those things. So real time data all over the enterprise is really the goal that everyone wants: real-time data from OLTP to analytics, from one system to another system, and for availability. That is the key benefit of GoldenGate, and that's the key technology that we've been working on for decades. And now we have it very easy to use in the cloud.

>> Well, what would be the overheads associated with that? I mean, for instance, you've got it, you need a second copy, you need the other database copies. Where does it make sense to incur that overhead? Obviously the super high availability apps that can exploit real time; fraud detection is the obvious one, but what else can you add there?

>> Well, GoldenGate itself doesn't require any extra copies of anything. However, it does enable customers that want to create, for example, an analytics system, a data warehouse, to feed data from all their systems in real time into that data warehouse. It also enables high availability: you can get high availability within the cloud with it, between on premises and the cloud, and between clouds. Also, you can migrate data, migrate databases, without having to take them down. So all these capabilities are available now, and they're very easy to use.

>> Okay, thanks for that clarification. What about autonomous? Is that on the roadmap, or what are you thinking?

>> Yeah, GoldenGate is essentially an autonomous service, and it works with the Oracle Autonomous Database. So you can use it both as a source for data and as a sink for data, as a place you're writing data. So for example, you can have an autonomous OLTP database that's replicating to another autonomous OLTP database in real time, and both of them are replicating changes to the autonomous data warehouse. But it doesn't all have to be autonomous. You can have any mix of autonomous and not autonomous, on-prem and in cloud, in anybody's cloud. So that's the beauty of GoldenGate, it's extremely flexible.

>> Well, you mentioned elasticity a couple of times. Why is it so important that GoldenGate on OCI gives you that elastic billing and auto-scaling? Talk to me in terms of what that does for the customer.

>> Yeah, there's really two big benefits. One benefit is it's very difficult to predict workloads. So normally, in an on-prem configuration, you have to say, okay, what is the max possible workload that's going to happen here? And then you have to buy the product, configure the product, get hardware, basically size everything for that. And then if you guess wrong, you're either spending too much because you oversized it, or you have a big real-time data problem; the data can't keep up because you've undersized the configuration. So that's hard to do. So the beauty of elasticity, the dynamic elasticity, the pay per use, is you don't have to figure all this stuff out. If you have more workload, we grow it automatically. If you have less workload, we shrink it automatically. And you don't have to guess ahead of time. You don't have to price it ahead of time. So you just use what you use, right? You don't pay for something that you're not using. So it's a very big change in the whole model of how you use these data replication, integration, and high availability technologies.
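A toy sketch of the "we grow it automatically" idea. This is not Oracle's actual scaling algorithm, just an illustration of a lag-driven policy: add capacity when replication lag exceeds a target, shed it when there is ample headroom. All thresholds are made up.

```python
def scale_decision(lag_events: int, units: int,
                   max_lag: int = 10_000, min_units: int = 1, max_units: int = 64) -> int:
    """Return the new capacity: double while falling behind, halve when mostly idle."""
    if lag_events > max_lag and units < max_units:
        return min(max_units, units * 2)
    if lag_events < max_lag // 4 and units > min_units:
        return max(min_units, units // 2)
    return units

units = 1
for lag in [200, 500, 30_000, 120_000, 8_000, 100]:   # hypothetical lag samples
    units = scale_decision(lag, units)
    print(f"lag={lag:>7} -> {units} capacity unit(s)")
# Capacity follows the workload; nothing is guessed or bought up front.
```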
>> Well, I think I'm correct to say GoldenGate primarily has been for big companies. You mentioned that small companies can now take advantage of this service. We talked about the granularity. And I could definitely see, can they afford it? I guess that's part one, and then the other part of the question is: I can see GoldenGate really satisfying your on-prem customers, and them taking advantage of it, but do you think this will attract new customers beyond your core? So, two-part question there.

>> Yeah, absolutely. So small customers have been challenged by the complexity of data integration, and that's one of the great things about the cloud service: it's dramatically simpler. Oracle manages everything. Oracle does the patching, the upgrades. Oracle does the monitoring. It takes care of the high availability of the product. So all that management complexity, all the configuration and setup, everything like that, that's all automated; that's owned by Oracle. So small customers were always challenged by the complexity of the product, along with everything else that they had to do. And then the other benefit, of course, is small customers were challenged by the large fixed price. So now, with pay per use, they pay only for what they use. It's really easily usable by small customers also. So it really expands the market and makes it more broadly applicable.

>> So kind of the same answer for beyond your existing customer base, beyond the on-prem, that's kind of... You answered

>> Right.

>> my two-part question with one answer, so that was pretty efficient, (chuckles) pun intended. So the bottom line for me, squinting through this announcement, is you've got the heterogeneity piece with GoldenGate on OCI, and as such it's going to give you the capability to create what I'll call an architecturally coherent, decentralized data mesh. I'm big on this data mesh concept these days, decentralized data. With the proviso that I'm going to be able to connect to OCI, which of course you can do with Azure, or I guess you could bring Cloud at Customer on prem. First of all, is this correct? And can we expect you over time to do this with AWS or other cloud providers?

>> It can move data from Amazon or to Amazon. It can actually handle any data wherever it lives. So yeah, it's very flexible. It's really just the automation of all the management that we're running in our public cloud, but the data can be from anywhere to anywhere.

>> Cool, all right, let's switch topics here a little bit. Just talk about some of the things that you've been working on, some of the innovation. I sat through your blockchain announcement, it was very cool. Of course I love anything blockchain and crypto; NFTs are exploding, there's the Coinbase IPO. It's just really an exciting time out there. I think a lot of people don't really appreciate the innovation that's occurring. So you've been making a lot of big announcements the last several months. You've been taking your R&D and bringing it into product. That's great, we always love to see that, because that's where the rubber really meets the road. Just on the database side of the house, you announced 21c, the next generation of the self-driving data warehouse ADW, blockchain tables, and now you've got GoldenGate running on OCI. Take us inside the development organization. What are the underlying drivers, other than your boss?
>> When we talk about our autonomous database, it is the mission critical Oracle database, but dramatically easier to use. So Oracle does all the management, all the automation, but we also use machine learning to tune it, to make it highly available, and to make it highly secure. So that's been one of our biggest products that we've been working on for many years. And recently we enhanced our autonomous data warehouse, taking it beyond being a data warehouse to a complete data analytics platform. So it includes things like ETL; we built ETL into the autonomous data warehouse. We're building our GoldenGate replication into autonomous data warehousing. We built machine learning directly, natively into the database. So now, if someone wants to run some machine learning, they just run machine learning queries. They no longer have to stand up a separate system. So a big move that we've been making is taking it beyond just a database to a full analytic platform. And this goes beyond what anyone else in the industry is doing, because we have a lot more technology. So for example, the machine learning directly in the database, the ETL directly in the database, the data replication directly in the database: all these things are very unique to Oracle, and they dramatically simplify for customers how they manage data. In addition to that, we've also been working on our database product, and we've enhanced it tremendously. So our big goal there is to provide what we call a converged database. So everything you need: all the data types, whether it's JSON, relational, spatial, graph, all the different kinds of data types; all the different kinds of workloads, analytics, OLTP, things like blockchain, microservices, events, all built into the Oracle database, making it dramatically easier to both develop and deploy new applications. So those are some of our big goals. Make it simple, make it integrated. We'll take on the complexity, so developers and customers find it easy to develop and easy to use. And we've made huge strides in all these areas in the last couple of years.

>> That's awesome. I wonder if we could land on blockchain again. Everyone's kind of focused on crypto right now; you're not about crypto, but you are about applying blockchain. Maybe you can help our audience understand what are some of the real use cases where blockchain tech can be used with the Oracle database.

>> Yeah, so that's a very interesting topic. As you mentioned, blockchain is very current. We see a lot of cryptocurrencies and distributed applications for blockchain. So in general, in the past, we've had two worlds. We've had the enterprise data management world, and we've had the blockchain world. And these are very distinct, right? And on the blockchain side, the applications have mostly centered around distributed multi-party applications, right? So where you have multiple parties that all want to reach consensus, and then that consensus is stored in a blockchain. So that's kind of been the focus of blockchain. And what we've done is very innovative; we're the first company to ever do this. We've taken the core architectural ideas, and really a lot of it has to do with the cryptography of blockchain, and we've engineered that natively into the mainstream Oracle database. So now, in the mainstream Oracle database, we have blockchain technology built in, and it's dramatically simpler to use. You asked about the use cases; that's what we've done, and it's taken us about five years to do this. Now it's been released into the market in our mainstream Oracle Database 19c.
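For a sense of what this looks like to a developer: Oracle exposes the feature through a `CREATE BLOCKCHAIN TABLE` statement, after which the table behaves like any other table for inserts. The sketch below uses the python-oracledb driver; the connection details, table definition, and retention clauses are illustrative, so check Oracle's documentation for the exact syntax your release supports.

```python
import oracledb  # pip install oracledb

# Connection details are placeholders.
conn = oracledb.connect(user="app", password="secret", dsn="dbhost/orclpdb1")
cur = conn.cursor()

# Rows in a blockchain table are cryptographically chained and cannot be
# updated; deleting rows and dropping the table are restricted by the
# retention clauses below (values shown are examples only).
cur.execute("""
    CREATE BLOCKCHAIN TABLE bank_ledger (
        txn_id     NUMBER,
        account_id NUMBER,
        amount     NUMBER,
        txn_time   TIMESTAMP
    )
    NO DROP UNTIL 31 DAYS IDLE
    NO DELETE LOCKED
    HASHING USING "SHA2_512" VERSION "v1"
""")

# Inserts work like any other table; the chaining happens transparently.
cur.execute(
    "INSERT INTO bank_ledger VALUES (:1, :2, :3, SYSTIMESTAMP)",
    [1, 1001, 250.00],
)
conn.commit()
```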
So the use case is different from the conventional blockchain use case, which, as I mentioned, was really multi-party, consensus-based apps. We're trying to make blockchain useful for mainstream enterprise and government applications, any kind of mainstream government or enterprise application. And the core concept of blockchain is that it addresses a different kind of security problem. So when you look at conventional security, it's really trying to keep people out. We have things like firewalls, passwords, network encryption, data encryption. It's all about keeping bad people out of the data. And there are really two big problems that it doesn't address well. One problem is that there are always new security exploits being published. So you have hackers out there working overtime, sometimes nation-states, trying to attack data providers, and every week, every month there's a new security exploit discovered; this happens all the time. So that's one big problem. We're building up these elaborate walls of protection around our core data assets, and in the meantime we have basically barbarians attacking on every side. (chuckles) And every once in a while they get over the walls, and this is just what's happening. So that's one big problem. And the second big problem is illicit changes made by people with credentials. So sometimes you have an insider in your company, whether it's an administrator or a salesperson or a support person, that has valid credentials, but then uses those valid credentials in some illicit way. They go out and change somebody's data for their own gain. And even more common than that, because there aren't that many bad guys inside the company, though they do exist, is stolen credentials. So what's happened in many cases is hackers or nation-states will steal, for example, administrative credentials, and then use those administrative credentials to come into a system and steal data. So that's the kind of problem that is not well addressed by security mechanisms. If you have privileges, the security mechanism says, yeah, you're fine. If somebody steals your privileges, again, you get a pass through the gate. And so what we've done with blockchain is take the cryptography elements of blockchain, we call it crypto-secure data management, and build those into the Oracle database. So think of it this way: if someone actually makes it over the walls that we built and into the core data, what we've done with that cryptographic technology of blockchain is make that data immutable, so you can't change it. So even if you make it over the gate, you can't get into the core data assets and change those assets. And that's now built into the Oracle database and is super easy to adopt. And I think it's going to really enhance and expand the community of people that can actually use that blockchain technology.
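The immutability claim rests on a simple cryptographic construction that can be sketched in a few lines. This is a conceptual illustration, not Oracle's actual on-disk format: each row's hash covers the row content plus the previous row's hash, so altering any historical row invalidates every hash after it.

```python
import hashlib
import json

def row_hash(row: dict, prev_hash: str) -> str:
    """Hash the row content chained with the previous row's hash."""
    payload = json.dumps(row, sort_keys=True) + prev_hash
    return hashlib.sha512(payload.encode()).hexdigest()

def append(chain: list, row: dict) -> None:
    prev = chain[-1]["hash"] if chain else ""
    chain.append({"row": row, "hash": row_hash(row, prev)})

def verify(chain: list) -> bool:
    """Recompute every hash; a single altered row breaks the rest of the chain."""
    prev = ""
    for entry in chain:
        if entry["hash"] != row_hash(entry["row"], prev):
            return False
        prev = entry["hash"]
    return True

ledger: list = []
append(ledger, {"txn": 1, "amount": 250})
append(ledger, {"txn": 2, "amount": -40})
assert verify(ledger)

ledger[0]["row"]["amount"] = 9_999_999   # an illicit change made with "valid" credentials
assert not verify(ledger)                # ...is immediately detectable
```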
>> I mean, that's awesome. I could talk all day about blockchain. And I mean, when you think about hackers, it's all there. They're all about ROI, value over cost, and if you can increase the denominator, they're going to go somewhere else, right? Because the value will decline. And this is really the intersection of software engineering and cryptography. And I guess even when you bring cryptocurrency into it, there's sort of the game theory piece. That's really not what you're all about, but the first two pieces are really critical in terms of the next generation of raising that security hurdle. Love it. Now, go ahead.

>> Yeah, I was just going to say, it's a different approach. Because think about trying to keep people out with things like passwords and firewalls: you can have bugs in that software that allow people to exploit it and get in. When you're talking about cryptography, that's math, and it's very difficult; you really can't get past math. Once the data is cryptographically protected on a blockchain, a hacker can't really do anything with that. Math is math; there's nothing you can do to break it, right? It's very different from trying to get through some algorithm that's trying to keep you out.

>> Awesome. As I said, I could talk forever on this topic. But let me go into some competitive dynamics. You recently announced Autonomous Data Warehouse, and you've got service capabilities that are really trying to appeal to the line of business. I want to get your take on that announcement, and specifically how you think it compares. I'm going to name names, you don't have to. Snowflake, obviously a lot of momentum in the marketplace. AWS with Redshift is doing very, very well. Obviously there are others, but those are two prominent ones that we've tracked, and our data shows they have momentum. How do you compare?

>> Yeah, so there are a number of different ways to look at the comparison. The simplest and most straightforward is there's a lot more functionality in Oracle data warehousing. Oracle has been doing this for decades; we have a lot of built-in functionality. For example, machine learning natively built into the database makes it super easy to use. We have mixed workloads, we have spatial capabilities, we have graph capabilities, we have JSON capabilities, we have microservice capabilities. So there's a lot more capability; that's number one. Number two, our cloud service is dramatically more elastic. So with our cloud service, all you really do is move the slider. You say, hey, I want more resources, I want less resources. In fact, we'll do that automatically; that's called auto-scaling. In contrast, when you look at people like Snowflake or Redshift, they want you to stand up a new cluster. Hey, you have some more workload on Monday: stand up another cluster, and then we'll have two sets of clusters, or maybe you want a third cluster, maybe you want a fourth cluster. So you end up with all these different systems, which is how they scale. They say, hey, I can have multiple sets of servers access the same data. With Oracle, you don't even have to think about those things. We auto-scale: you get more workload, we just give it more resources. You don't even have to think about that. And then the other thing is we're looking at the whole data management problem end to end. So starting with capturing the data, moving the data in real time, transforming the data, loading the data, running machine learning and analytics on the data, putting all kinds of data in a single place so that you can do analytics on all of it together, and then having very rich visualization capabilities for viewing the data, graphing the data, modeling the data, all those things. So it's all integrated, and that makes it super easy to use. So: much easier, much more functionality, and much more elastic than any of our competitors in the market.
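On the machine-learning point, the practical effect is that scoring becomes a SQL query against the database itself; Oracle's SQL includes prediction operators for models trained in the database. A sketch of the pattern, where the model name, table, and connection details are hypothetical placeholders:

```python
import oracledb  # pip install oracledb

conn = oracledb.connect(user="app", password="secret", dsn="dbhost/orclpdb1")
cur = conn.cursor()

# Score rows in place with an in-database model; "churn_model" stands in for a
# model previously trained with the in-database machine learning APIs.
cur.execute("""
    SELECT customer_id,
           PREDICTION(churn_model USING *)             AS predicted_churn,
           PREDICTION_PROBABILITY(churn_model USING *) AS churn_probability
    FROM customers
    WHERE region = :region
""", region="WEST")

# No data leaves the database for a separate ML system; results stream back
# like any other query.
for customer_id, label, prob in cur:
    print(customer_id, label, round(prob, 3))
```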
>> Interesting, thank you for those comments. I mean, it's a different world, right? You guys have all the market share, they have all the growth; those things over time, you've been around, you've seen it, they come together and you fight it out, and may the best approach win. So we'll be watching.

>> Yeah, also I forgot to mention the obvious thing, which is Oracle runs everywhere. So you can run Oracle on premises, you can run Oracle in the public cloud, you can run what we call Cloud at Customer. Our competitors really are public cloud only, so their customers don't get the choice of where they want to run their data warehouse.

>> Now Juan, a while ago I sat down with David Floyer and Marc Staimer. We reviewed how Gartner looks at the marketplace, and it wasn't a surprise that when it came to operational workloads, Oracle stood out. I mean, that's kind of an understatement relative to the major competitors. I don't think most of our viewers expected, for instance, Microsoft or AWS to be that far away from you. But at the same time, the database magic quadrant maybe didn't reflect that gap as widely, so there's some dissonance there; the detailed workload drill-downs were dramatic. And I wonder what your take is on the results. Obviously you're happy with them. You came out leading in virtually every category, or you were one and two, and some of that was even in the non-mission critical operational stuff. But what can you add to my narrative there?

>> Yeah, so with Gartner, first of all, we're talking about cloud databases.

>> Right.

>> Right, so this is not on-premises databases, this is pure cloud databases. And what they did is two things; the main thing was a technical rating of the cloud databases. And there are other vendors that have had databases in the cloud for longer than we have. But in the most recent Gartner analyst report, as you mentioned, Oracle came out on top for cloud database technology in almost every single operational use case, including things like Internet of Things, things like JSON data, variable data, analytics, as well as traditional OLTP and mixed workloads. So Oracle was rated the highest technology, which isn't a big surprise. We've been doing this for decades. Over 90% of the global Fortune 500 run Oracle, and there's a reason: because this is what we're good at. This is our core strength: our availability, our security, our scalability, our functionality, both for OLTP and analytics. All the capabilities: built-in machine learning, graph analytics, everything. So even when we compare narrowly, things like Internet of Things or variable data, against niche competitors where that's all they do, we came out dramatically ahead. But what surprised a lot of people is how far ahead of some of the other cloud vendors, like Amazon, like Azure, like Google, Oracle came out in the cloud database category. A lot of people think, well, some of these other pure cloud vendors must be ahead of Oracle in cloud database. But actually not. If you look at the Gartner analyst report, it was very clear: our cloud database was dramatically ahead of their cloud database technologies.

>> So I'm pretty much out of time, but last question.
I've had some interesting discussions lately, and we've pointed out for years in our research that of course you're delivering the entire stack: the database, part of the infrastructure, the applications; you have the whole engineered systems strategy. And for the most part you're kind of unique in this regard. I mean, Dell just announced that it's spinning off VMware, and it could have gone the other direction and become a more integrated hardware and software player for the data center. But look, it's working for Dell, based on the reaction from the street post-announcement. Cisco has a hardware and software model that's sort of integrated, but the company's value peaked back in the dot-com boom and it's been very slow to bounce back. But my point is, for these companies the street doesn't value the integrated model. Oracle is kind of the exception; it's trading at all-time highs. I know you're not going to comment on the stock price, but I guess SAP, until it missed and guided conservatively, was kind of on a good trajectory. So I'm wondering, why do you think Oracle's strategy resonates with investors, but not so much for those companies? Is it because you have the applications piece? Maybe that's kind of my premise with SAP, but what's your take? Why is it working for you?

>> Well, okay, I think it's pretty simple, which is: some of our competitors, for example, might have a software product and a hardware product, but mostly those are acquired and are separate products that just happen to be in a portfolio. They are not a single company with a single vision and joint engineering going on. It's really, hey, I've got the software over here, I've got the hardware over there, but they don't really talk to each other, they don't really work together. They're not trying to develop something where the stack is not just integrated but engineered together. And that is really the key. Oracle focuses on data management top to bottom. So we have everything from our ERP and CRM applications talking to our database, talking to our engineered systems, running in our cloud, and it's all completely engineered together. So Oracle doesn't just acquire these things and kind of glue them together. We actually engineer them, and that's fundamentally the difference. You can buy two things and have them as two separate divisions in your company, but it doesn't really get you a whole lot.

>> Juan, it's always a pleasure. I love these conversations and hope we can do more in the future. Really appreciate your time. Thanks for coming on theCUBE.

>> Pleasure, Dave, nice to talk to you.

>> All right, keep it right there, everybody. This is Dave Vellante for theCUBE, we'll see you next time. (upbeat music)

Published Date : Apr 21 2021


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Amazon | ORGANIZATION | 0.99+
Dave Vellante | PERSON | 0.99+
Juan Loaiza | PERSON | 0.99+
Cisco | ORGANIZATION | 0.99+
Microsoft | ORGANIZATION | 0.99+
Dave | PERSON | 0.99+
Juan | PERSON | 0.99+
Oracle | ORGANIZATION | 0.99+
Google | ORGANIZATION | 0.99+
Dell | ORGANIZATION | 0.99+
AWS | ORGANIZATION | 0.99+
IBM | ORGANIZATION | 0.99+
thousands | QUANTITY | 0.99+
Monday | DATE | 0.99+
two things | QUANTITY | 0.99+
One problem | QUANTITY | 0.99+
Marc Staimer | PERSON | 0.99+
One benefit | QUANTITY | 0.99+
Gartner | ORGANIZATION | 0.99+
OCI | ORGANIZATION | 0.99+
fourth cluster | QUANTITY | 0.99+
One | QUANTITY | 0.99+
two | QUANTITY | 0.99+
both | QUANTITY | 0.99+
one | QUANTITY | 0.99+
one answer | QUANTITY | 0.99+
third cluster | QUANTITY | 0.99+
one big problem | QUANTITY | 0.99+
two big problems | QUANTITY | 0.99+
two sets | QUANTITY | 0.99+
Coinbase | ORGANIZATION | 0.99+
two part | QUANTITY | 0.99+
about five years | QUANTITY | 0.98+
two big benefits | QUANTITY | 0.98+
first company | QUANTITY | 0.97+
two separate divisions | QUANTITY | 0.97+
Over 90% | QUANTITY | 0.97+
GoldenGate | ORGANIZATION | 0.97+
second copy | QUANTITY | 0.97+
David Floyer | PERSON | 0.97+
first two pieces | QUANTITY | 0.96+
single | QUANTITY | 0.96+
two big blockers | QUANTITY | 0.96+
single application | QUANTITY | 0.96+

Jagane Sundar, WANdisco | AWS Summit SF 2018


 

>> Voiceover: Live from the Moscone Center, it's theCUBE. Covering AWS Summit San Francisco 2018. Brought to you by Amazon Web Services.

>> Welcome back, I'm Stu Miniman, and this is theCUBE's exclusive coverage of AWS Summit here in San Francisco. Happy to welcome back to the program Jagane Sundar, who is the CTO of WANdisco. Jagane, great to see you, how have you been?

>> Well, been great Stu, thanks for having me.

>> All right, so every show we go to now, data really is at the center of it. I'm an infrastructure guy, but data is so much of the discussion; here in the cloud, in the keynotes, they were talking about it. IoT of course, data is so much involved in it. We've watched WANdisco from the days when we were talking about big data. Now there's AI, there's ML. Data's involved, but tell us, what is WANdisco's position in the marketplace today, and the updated role on data?

>> So, we have this notion, this brand new industry segment, called live data. Now this is more than just itty-bitty data or big data; in fact, this is cloud-scale data located in multiple regions around the world and changing all the time. So you have East Coast data centers with data, West Coast data centers with data, European data centers with data, and all of this is changing at the same time. Yet your need for analytics and business intelligence based on that is across the board. You want your analytics to be consistent with the data from all these locations. That, in a sense, is the live data problem.

>> Okay, I think I understand it, but we're not talking about, like in the storage world, hot data versus cold data. And we talked about real-time data for streaming and everything like that. But how do you compare and contrast? You said global in scope, talked about multi-region, really talking distributed. From an architectural standpoint, what's enabling that to be kind of the discussion today? Is it the likes of Amazon and their global reach? And where does WANdisco fit into the picture?

>> So Amazon's clearly a factor in this. The fact that you can start up a virtual machine in any part of the world in a matter of minutes, and have data accessible to that VM in an instant, changes the business of globally accessible data. You're not simply talking about a primary data center and a disaster recovery data center anymore. You have multiple data centers, the data's changing in all those places, and you want analytics on all of the data, not part of the data, not just on the primary data center. How do you accomplish that? That's the challenge.

>> Yeah, so drill into it a little bit for us. Is this a replication technology? Is this just a service that I can spin up? When you say live, can I turn it off? How does this work when I think about all the cloud dynamics and levers?

>> So it is indeed based on active-active replication, using a mathematically strong algorithm called Paxos. In a minute I'll contrast that with other replication technologies, but the essence of this is that you use this replication technology as a service. So if you are going up to Amazon Web Services and you're purchasing some analytics engine, be it Hive or Redshift or any analytics engine, and you want that to be accessible from multiple data centers, available in the face of data center or entire region failure, with the data still accessible, then you go with our live data platform.
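To give a flavor of why a consensus algorithm enables active-active writes, here is a toy single-decree Paxos round in Python. It omits networking, failures, and retries, and it is in no way WANdisco's implementation; it just shows the two-phase prepare/accept structure that lets several writers agree on a single value (in practice, on the ordering of changes across data centers).

```python
class Acceptor:
    def __init__(self):
        self.promised = -1          # highest proposal number promised
        self.accepted = (-1, None)  # (proposal number, value) accepted so far

    def prepare(self, n):
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, None

    def accept(self, n, value):
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False

def propose(acceptors, n, value):
    """One proposer round: phase 1 (prepare), then phase 2 (accept)."""
    quorum = len(acceptors) // 2 + 1
    promises = [resp for ok, resp in (a.prepare(n) for a in acceptors) if ok]
    if len(promises) < quorum:
        return None                       # lost to a higher-numbered proposal
    # If any acceptor already accepted a value, that value must be proposed.
    prior = max(promises, key=lambda acc: acc[0])
    chosen = prior[1] if prior[1] is not None else value
    acks = sum(a.accept(n, chosen) for a in acceptors)
    return chosen if acks >= quorum else None

acceptors = [Acceptor() for _ in range(5)]       # e.g. five data centers
print(propose(acceptors, n=1, value="txn-A"))    # -> "txn-A"
print(propose(acceptors, n=2, value="txn-B"))    # -> "txn-A": consensus is stable
```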
>> Yeah, so we want you to compare and contrast. What I think about is, you know, I hear active-active, and the speed of light's always a challenge. Globally you have inconsistency, it's challenging; there are things like Google Spanner out there to look at those problems. How does this fit compared to the way we've thought of things like replication and globally distributed systems in the past?

>> Interesting question. So ours is great for analytics applications, but something like Google Spanner is more like a MySQL database replacement that runs in multiple data centers. We don't cater to that database-transaction type of application. We cater to analytics applications: batch, very fast streaming applications, enterprise data warehouse-type analytics applications, all of those. Now if you take a look inside and see what kind of replication technology would be used, you'll find that we're better than the other two types. There are two different types of existing replication technologies. One is log shipping: the traditional Oracle GoldenGate type, where you ship the log once the change is made to the primary. The second is: take a snapshot and copy differences between snapshots. Both have their deficiencies. Snapshot, of course, is time-based, and it happens once in a while; you'll be lucky if you can get one day RTO with those sorts of things. Also, there's an interesting anecdote that comes to mind when I say that, because the Hadoop folks in their HDFS implemented a version of snapshot and snapdiff. The unfortunate truth is that it was engineered such that, if you have a lot of changes happening, the snapshot and snapdiff code might consume too much memory and bring down your NameNode. That's undesirable: now your backup facility just brought down your main data capability. So snapshot has its deficiencies. Log shipping is always active/passive. Contrast that with our live data technology, where you can have multiple data centers filled with data, and you can write your data to any of these data centers. It makes for a much more capable system.

>> Okay, can you explain, how does this fit with AWS, and can it live in multi-clouds? What about on-premises, the whole multi and hybrid cloud discussion?

>> Interesting, so the answer is yes. It can live in multiple regions within the same cloud, or multiple regions within different clouds. It'll also bridge data that exists on your on-prem Hadoop or other big data systems, or object store systems in the cloud: S3 or Azure, or any of the blob stores available in the cloud. And when I say this, I mean in a live data fashion. That means you can write to your on-prem storage, and you can also write to your cloud buckets at the same time. We'll keep it consistent and replicated.

>> Yeah, what are you hearing from customers when it comes to where their data lives? I know last time I interviewed David Richards, your CEO, he said the data lakes really used to be on premises, and now there's a massive shift moving to the public clouds. Is that continuing, what's kind of the breakdown, what are you hearing from customers?

>> So I cannot name a single customer of ours who is not thinking about the cloud. Every one of them has a presence on premise. They're looking to grow in the cloud. On-prem does not appear to be on a growth path for them. They're looking at growing in the cloud, they're looking at bursting into the cloud, and they're almost all looking at multi-cloud as well. That's been our experience.

>> At the beginning of the conversation we talked about data.
How are customers, you know, exploiting and leveraging data, or making sure that they aren't having data become a liability for them?

>> So there are so many interesting use cases I'd love to talk about, but the one that jumps out at me is a major auto manufacturer. Telematics data coming in from a huge number, hundreds of thousands, of cars on the road. They chose to use our technology because they can feed their West Coast car telematics into their West Coast data center, while simultaneously writing East Coast car data into the East Coast data center. We do the replication, we build the live data platform for them, and they run their standard analytics applications, be it Hadoop-sourced or some other analytics applications, and they get consistent answers. Whether you run the analytics application on the East Coast or the West Coast, you will get the same exact answer. That is very valuable, because if you are doing things like fault detection, you really don't want spurious detection because the data on the West Coast was not quite consistent and your analytics application was led astray. That's a great example. We also have another example with a top-three bank that has a regulatory concern where they need to operate out of their so-called backup data center once every three months or so. Now with live data, there is no notion of active data center and backup data center. All data centers are active, so this particular regulatory requirement is extremely simple for them to implement. They just run their queries on one of the other data centers and prove to the regulators that their data is indeed live. I could go on and on about a number of these. We also have a top-two retailer who has got such a volume of data that they cannot manage it in one Hadoop cluster. They use our technology to create the live data data lake.

>> One of the challenges always: customers love the idea of global, but governance, compliance, things like GDPR pop up. Does that play into your world? Or is that a bit outside of what WANdisco sees?

>> It actually turns out to be an important consideration for us, because if you think about it, when we replicate, the data flows through us. So we can be very careful about not replicating data that is not supposed to be replicated. We can also be very careful about making sure that the data is available in multiple regions within the same country, if that is the requirement. So GDPR does play a big role in the reason why many of our customers, particularly in the financial industry, end up purchasing our software.

>> Okay, so this new term live data: are there any other partners of yours that are involved in this? As always, you want a bit of an ecosystem to help build out a wave.

>> So our most important partners are the cloud vendors, and they're multi-region by nature. There is no idea of a single data center or a single-region cloud. So Microsoft, Amazon with AWS, these are all important partners of ours, and they're promoting our live data platform as part of their strategy of building huge hybrid data lakes.

>> All right, Jagane, give us a little view looking forward. What should we expect to see with live data and WANdisco through the rest of 2018?
>> Looking forward, we expect to see our footprint grow in terms of dealing with a variety of applications: all the way from batch Pig scripts that used to run once a day, to Hive that's maybe once every 15 minutes, to data warehouses that are almost instant and queryable by human beings, to streaming data that pours things into Kafka. We see the whole footprint of analytics databases growing. We see cross-cloud capability, meaning perhaps an Amazon Redshift to an Azure SQL EDW replication. Those things are very interesting to us and to our customers, because some of them have strengths in certain areas and others have strengths in other areas. Customers want to exploit both of those. So we see ourselves as being the glue for all world-scale analytics applications.

>> All right, well, Jagane, I appreciate you sharing with us everything that's happening at WANdisco. This new idea of live data: we look forward to catching up with you and the team in the future and hearing more about the customers and everything going on there. We'll be back with lots more coverage here from AWS Summit in San Francisco. I'm Stu Miniman, you're watching theCUBE. (electronic music)

Published Date : Apr 4 2018


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Amazon | ORGANIZATION | 0.99+
Microsoft | ORGANIZATION | 0.99+
AWS | ORGANIZATION | 0.99+
Amazon Web Services | ORGANIZATION | 0.99+
David Richards | PERSON | 0.99+
Jagane | PERSON | 0.99+
San Francisco | LOCATION | 0.99+
Jagane Sundar | PERSON | 0.99+
Stu Miniman | PERSON | 0.99+
WANdisco | ORGANIZATION | 0.99+
GDPR | TITLE | 0.99+
Stu | PERSON | 0.99+
One | QUANTITY | 0.99+
East Coast | LOCATION | 0.99+
Both | QUANTITY | 0.99+
second | QUANTITY | 0.99+
two | QUANTITY | 0.98+
MySQL | TITLE | 0.98+
West Coast | LOCATION | 0.98+
two different types | QUANTITY | 0.98+
one | QUANTITY | 0.98+
both | QUANTITY | 0.98+
one day | QUANTITY | 0.98+
Kafka | TITLE | 0.98+
S3 | TITLE | 0.97+
Moscone Center | LOCATION | 0.97+
Oracle | ORGANIZATION | 0.96+
once a day | QUANTITY | 0.95+
Google Spanner | TITLE | 0.95+
single data center | QUANTITY | 0.95+
NameNode | TITLE | 0.94+
hundreds of thousands | QUANTITY | 0.94+
today | DATE | 0.93+
theCUBE | ORGANIZATION | 0.92+
Azure | TITLE | 0.91+
WANdisco | TITLE | 0.9+
snapdiff | TITLE | 0.89+
SQL EDW | TITLE | 0.89+
Redshift | TITLE | 0.88+
single customer | QUANTITY | 0.87+
AWS Summit | EVENT | 0.87+
AWS Summit San Francisco 2018 | EVENT | 0.86+
single region | QUANTITY | 0.85+
2018 | DATE | 0.84+
snapshot | TITLE | 0.81+
Jagane | ORGANIZATION | 0.76+
three bank | QUANTITY | 0.74+
once every 15 minutes | QUANTITY | 0.73+
European | LOCATION | 0.73+
AWS Summit SF 2018 | EVENT | 0.71+
once | QUANTITY | 0.7+
Cloud | TITLE | 0.65+
every three months | QUANTITY | 0.64+
GoldenGate | ORGANIZATION | 0.57+
of cars | QUANTITY | 0.55+
minute | QUANTITY | 0.53+
Paxos | ORGANIZATION | 0.53+
HTFS | TITLE | 0.53+
Hive | TITLE | 0.49+
Hadoop | ORGANIZATION | 0.41+
BLOB | TITLE | 0.4+

Steve Wilkes, Striim | Big Data SV 2018


 

>> Narrator: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley. Brought to you by SiliconANGLE Media and its ecosystem partners. (upbeat music)

>> Welcome back to San Jose everybody, this is theCUBE, the leader in live tech coverage, and you're watching BigData SV. My name is Dave Vellante. In the early days of Hadoop everything was batch oriented. About four or five years ago the market really started to focus on real time and streaming analytics, to try to help companies affect outcomes while things were still in motion. Steve Wilkes is here, he's the co-founder and CTO of a company called Striim, a firm that's been in this business for around six years. Steve, welcome to theCUBE, good to see you. Thanks for coming on.

>> Thanks Dave, it's a pleasure to be here.

>> So tell us more about that. You started about six years ago, a little bit before the market really started talking about real time and streaming. So what led you to the conclusion that you should co-found Striim, way ahead of its time?

>> It's partly our heritage. So the four of us that founded Striim were executives at GoldenGate Software; in fact our CEO, Ali Kutay, was the CEO of GoldenGate Software. So when we were acquired by Oracle in 2009, after having to work for Oracle for a couple years, we were trying to work out what to do next. And GoldenGate was replication software, right? So it's moving data from one place to another. But customers would ask us in customer advisory boards: that data seems valuable, it's moving. Can you look at it while it's moving, analyze it while it's moving, get value out of that moving data? And so that was kind of set in our heads. And then when we were thinking about what to do next, that was kind of the genesis of the idea. So the concept around Striim when we first started the company was that we can't just give people streaming data; we need to give them the ability to process that data, analyze it, visualize it, play with it and really truly understand the data, as well as being able to collect it and move it somewhere else. And so the goal from day one was always to build a full end-to-end platform that did everything customers needed to do for streaming integration and analytics out of the box. And that's what we've done after six years.

>> I've got to ask a really basic question. So you're talking about your experience at GoldenGate, moving data from point A to point B, and somebody said, well, why don't we put that to work. But is there change data, or was it static data? Why couldn't I just analyze it in place?

>> GoldenGate works on change data.

>> Okay, so that's why: there were changes going through. Why wait until it hits its target? Let's do some work in real time and learn from that, get greater productivity. And now you guys have taken that to a new level. That new level being what? Modern tools, modern technologies?

>> A platform built from the ground up to be inherently distributed, scalable, and reliable, with exactly-once processing guarantees, and to be a complete end-to-end platform. There's a recognition that the first part of being able to do streaming data integration or analytics is that you need to be able to collect the data, right? And while change data capture from databases is the way to get data out of databases in a streaming fashion, you also have to deal with files and devices and message queues and anywhere else the data can reside. So you need a large number of different data collectors that all turn the enterprise data sources into streaming data.
And similarly, if you want to store data somewhere, you need a large collection of target adapters that deliver to things not just on premise but also in the cloud: things like Amazon S3, or cloud databases like Redshift and Google BigQuery. So the idea was really that we wanted to give customers everything they need, and that everything they need isn't trivial. It's not just, well, we take Apache Kafka and then we stuff things into it and then we take things out. Pretty often, for example, you need to be able to enrich data, and that means you need to be able to join streaming data with additional context information, reference data. And that reference data may come from a database or from files or somewhere else. You can't call out to the database and maintain the speeds of streaming data. We have customers that are doing hundreds of thousands of events per second, so you can't call out to a database for every event and ask for records to enrich it with. And you can't even do that with an external cache, because it's just not fast enough. So we built an in-memory data grid into the platform, so you can join streaming data with the context information in real time without slowing anything down. So when you're thinking about doing streaming integration, it's more than just moving data around. It's the ability to process it and get it in the right form, to be able to analyze it, to be able to do things like complex event processing on that data. And being able to visualize it and play with it is also an essential part of the whole platform.
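The enrichment point maps to a simple pattern: bulk-load the reference data into process memory once, then join each event against it at memory speed instead of querying a database per event. A minimal sketch with hypothetical names; this is not Striim's actual API, and a real data grid would be distributed rather than a local dict.

```python
import sqlite3
from typing import Iterator

def load_reference_cache(db_path: str) -> dict:
    """One upfront bulk load of reference data (stand-in for an in-memory data grid)."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute("SELECT device_id, customer, region FROM devices")
    return {device_id: {"customer": c, "region": r} for device_id, c, r in rows}

def enrich(events: Iterator[dict], cache: dict) -> Iterator[dict]:
    """Join each streaming event with its reference record at memory speed."""
    for event in events:
        context = cache.get(event["device_id"])
        if context is None:
            continue               # or route to a dead-letter stream
        yield {**event, **context}

cache = load_reference_cache("reference.db")   # hypothetical reference database
events = iter([{"device_id": "d-42", "temp": 71.3}])
for enriched in enrich(events, cache):
    print(enriched)   # event plus customer and region context, no per-event DB call
```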
And you're right, the big vendors they have multiple different products and they're very happy to sell you consulting to put them all together. Even if you're trying to build this from open source and you know, organizations try and do that, you need five or six major pieces of open source, a lot of support in libraries, and a huge team of developers to just build a platform that you can start to build applications on. And most organizations aren't software platform companies, they're finance companies, oil and gas companies, healthcare companies. And they really want to focus on solving business problems and not on reinventing the wheel by building a software platform. So we can just go in there and say look; value immediately. And that really, really helps. >> So what are some of your favorite use cases, examples, maybe customer examples that you can share with me? >> So one of the great examples, one of my customers they have a lot of data in our HP non-stop system. And they needed to be able to get visibility into that immediately. And this was like order processing, supply chain, ERP data. And it would've taken a very large amount of time to do analytics directly on the HP nonstop. And finding resources to do that is hard as well. So they needed to get the data out and they need to get it into the appropriate place. And they recognize that use the right technology to ask the right question. So they wanted some of it in Hadoop so they could do some machine learning on that. They wanted some of it to go into Kafka so they could get real time analytics. And they wanted some of it to go into HBase so they could query it immediately and use that for reference purposes. So they utilized us to do change data capture against the HP nonstop, deliver that datastream out immediately into Kafka and also push some of it into HEFS and some of it into HBase. So they immediately got value out of that, because then they could also build some real-time analytics on it. It would sent out alerts if things were taking too long in their order processing system. And allowed them to get visibility directly into their process that they couldn't get before with much fewer resources and more modern technologies than they could have used before. So that's one example. >> Can I ask you a question about that? So you talked about Kafka, HBase, you talk about a lot of different open source projects. You've integrated those or you've got entries and exits into those? >> So we ship with Kafka as part of our product. It's an optional messaging bus. So, our platform has two different ways of moving data around. We have a high-speed, in-memory only message bus and that works almost network speed and it's great for a lot of different use cases. And that is what backs our data streams. So when you build a data flow, you have streams in between each step, that is backed by an in-memory bus. Pretty often though, in use cases, you need to be able to potentially rewind data for recovery purposes or have different applications running at different speeds and that's where a persistent message bus like Kafka comes in but you don't want to use a persistent message bus for everything because it's doing IO and it's slowing things down. So you typically use that at the beginning, at the sources, especially things like IOT where you can't rewind into them. Things like databases and files, you can rewind into them and replay and recover but IOT sources, you can't do that. 
We also have Elastic as part of our product for results storage. You can switch to other results storage, but that's our default. And we have a few other key components that are part of our product, but then on the periphery, we have adapters that integrate with a lot of the other things that you mentioned. So we have adapters to read and write HDFS, Hive, HBase, across Cloudera, Hortonworks, even MapR; so we have the MapR versions of the file system, MapR Streams and MapR-DB. And then there are lots of other, more proprietary connectors, like CDC from Oracle, SQL Server, MySQL and MariaDB, and then database connectors for delivery to virtually any JDBC-compliant database.

>> I took you down a tangent before you had a chance. You were going to give us another example. We're pretty much out of time, but if you can briefly share either that or the last word, I'll give it to you.

>> I think the last word would be that that is one example. We have lots and lots of other types of use cases that we handle, including things like migrating data from on-premise to the cloud, being able to distribute log data and analyze that log data, and being able to do in-memory analytics and get real-time insights immediately and send alerts. It's a very comprehensive platform, but each one of those use cases is very easy to develop on its own, and you can do them very quickly. And of course, as the use case expands within a customer, they build more and more, and so they end up using the same platform for lots of different use cases within the same account.

>> And how large is the company? How many people?

>> We are around 70 people right now.

>> 70 people, and you're looking for funding? What rounds are you in? Where are you at with funding and revenue and all that stuff?

>> Well, I'd have to defer to my CEO for those questions.

>> All right, so you've been around for what, six years you said?

>> Yeah, we've had a number of rounds of funding. We had initial seed funding, then we had the investment by Summit Partners that carried us through for a while, then subsequent investment from Intel Capital, Dell EMC, and Atlantic Bridge. And that's where we are right now.

>> Good, excellent. Steve, thanks so much for coming on theCUBE, really appreciate your time.

>> Great, it's awesome. Thank you Dave.

>> Great to meet you. All right, keep it right there everybody, we'll be back with our next guest. This is theCUBE. We're live from BigData SV in San Jose. We'll be right back. (techno music)

Published Date : Mar 9 2018


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave | PERSON | 0.99+
Dave Vellante | PERSON | 0.99+
Steve Wilks | PERSON | 0.99+
Steve | PERSON | 0.99+
2009 | DATE | 0.99+
Steve Wilkes | PERSON | 0.99+
five | QUANTITY | 0.99+
Intel Capital | ORGANIZATION | 0.99+
GoldenGate Software | ORGANIZATION | 0.99+
Ali Kutay | PERSON | 0.99+
Oracle | ORGANIZATION | 0.99+
hundreds | QUANTITY | 0.99+
GoldenGate | ORGANIZATION | 0.99+
Kafka | TITLE | 0.99+
San Jose | LOCATION | 0.99+
Stream | ORGANIZATION | 0.99+
MySQL | TITLE | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
Atlantic Bridge | ORGANIZATION | 0.99+
six years | QUANTITY | 0.99+
Steam | ORGANIZATION | 0.99+
Amazon | ORGANIZATION | 0.99+
MapR | TITLE | 0.99+
HP | ORGANIZATION | 0.99+
four | QUANTITY | 0.99+
70 People | QUANTITY | 0.99+
Dell EMC | ORGANIZATION | 0.99+
MariaDB | TITLE | 0.99+
Striim | PERSON | 0.99+
SQL | TITLE | 0.99+
one | QUANTITY | 0.98+
each step | QUANTITY | 0.98+
Summit Partners | ORGANIZATION | 0.98+
two different ways | QUANTITY | 0.97+
first part | QUANTITY | 0.97+
around six years | QUANTITY | 0.97+
around 70 people | QUANTITY | 0.96+
HBase | TITLE | 0.96+
one example | QUANTITY | 0.96+
theCUBE | ORGANIZATION | 0.95+
BigData SV | ORGANIZATION | 0.94+
Big Data | ORGANIZATION | 0.92+
Hadoop | TITLE | 0.92+
one product | QUANTITY | 0.92+
each one | QUANTITY | 0.91+
six major pieces | QUANTITY | 0.91+
About four | DATE | 0.91+
CVC | TITLE | 0.89+
first | QUANTITY | 0.89+
about six years ago | DATE | 0.88+
day one | QUANTITY | 0.88+
Elastic | TITLE | 0.87+
Silicon Valley | LOCATION | 0.87+
Windows | TITLE | 0.87+
five years ago | DATE | 0.86+
S3 | TITLE | 0.82+
JDBC | TITLE | 0.81+
Azure | TITLE | 0.8+
CEO | PERSON | 0.79+
one place | QUANTITY | 0.78+
Redshift | TITLE | 0.76+
Autumn | ORGANIZATION | 0.75+
second | QUANTITY | 0.74+
thousands | QUANTITY | 0.72+
Big Data SV 2018 | EVENT | 0.71+
couple years | QUANTITY | 0.71+
Google | ORGANIZATION | 0.69+