Sandeep Lahane and Shyam Krishnaswamy | KubeCon + CloudNativeCon NA 2021
>> Okay, welcome back everyone to theCUBE's coverage here at KubeCon + CloudNativeCon 2021, in person. theCUBE is here. I'm John Furrier, host of theCUBE, with Dave Nicholson, my co-host and cloud analyst. Man, it's great to be back in person — we also have a hybrid event. We've got two great guests here, the founders of Deepfence: Shyam Krishnaswamy, co-founder and CTO, and Sandeep Lahane, founder. It's great to have you on. This is a super important topic. Cloud native has crossed over — everyone's talking about it, it's mainstream — but security is driving the agenda, and you guys are in the middle of it, with a cutting-edge approach and news. >>
Like we were talking about, John, we're operating at the intersection of the awesome, right? Open source, security, and cloud native, essentially. Absolutely. And today is a super exciting day for us. We're launching something called ThreatMapper — Apache v2 licensed, completely open source. Think of it as an x-ray or MRI scan for your cloud: you scan and visualize the cloud at scale, all of the modalities. Essentially, we look at cloud as a continuum. It's not a single modality — it's containers, it's Kubernetes, it's VMs, it's serverless, and all of them co-exist side by side. That's how we look at it. And ThreatMapper essentially allows you to visualize all of this in real time. Think of ThreatMapper as something that takes over the baton from the CI system once shift-left is done — that's when ThreatMapper comes into the picture. So yeah, super excited. >>
It really gives the developer and the ops teams visibility into kind of the health statistics of the cloud. But also, as you said, it's not just software mechanisms. The cloud is evolving, new services are being turned on and off — no one even knows what's going on sometimes. This is a really hidden problem, right? Yeah, >>
Absolutely. The basic problem is — I mean, I was just talking to a gentleman this morning — there's $270 billion-plus of public cloud spend, John, and even $300 billion-plus projected, they're saying, right? And there is not even a single community tool to visualize all the clouds and all the cloud modalities at scale. Let's start there. That's what we decided: you know what, let's start with visualizing everything that's out there, and then look for the known badness — the vulnerabilities, which still remain the biggest attack vector. >>
Sure. Tell us what's under the hood. How does this all work at cloud scale? Is it a cloud service, a managed service, is it code? Take us through the product. >>
Absolutely. But before that, there's one small point that Sandeep mentioned that I'd like to elaborate on here, right? He spoke about the whole cloud spend being such a large volume. If you look at the way people build applications today, it's not just a single cloud anymore — it's multicloud, multi-region, across diverse platforms, right? So what does a solution look like that gives you visibility across all of that? That is the missing piece here, and that is what we're tackling, and taking open source. Coming back to your question — how does this whole thing work? So we have a completely on-prem model, right, where customers can download the code today and install it. They can build it — we give binaries too. And shortly, just like the exciting announcement that came out today, you're going to see some more exciting announcements.
That's going to make it a lot easier for folks out there. >> Yeah, that's great. >>
So how does this all fit into security as a microservice, and your vision of that? >>
Absolutely, absolutely. You know, I'll tell you, this goes back to the conversations I had when I was trying to shape the whole vision, really. Hey, what about security as a microservice? I would go and ask people, and they'd say: that makes sense — everything is becoming a microservice, really. So what you're saying is you're going to deploy one more microservice, just like I deploy all of my other microservices, and that's going to look after my microservices. That logic makes sense, essentially. That was the genesis of the terminology. So Deepfence essentially is deployed as a microservice. As you scale, it's deployed and operated just like your other microservices. So no code changes, no other toolchain changes — it is just yet another microservice that's going to look after your microservices. >>
So there's one point I would like to add here, which is something very interesting, right? The whole concept of microservices came from — if you remember — the memo from Jeff Bezos that famously said everybody's going to expose their services through APIs or be fired. That gave rise to a very unconventional way of thinking about applications. At Deepfence, we believe you should bring the same unconventional way of thinking to security. Today security is all bottom-up; no — it has to start top-down. Your applications are microservices; your security should also be a microservice. >>
So you need a microservice for a microservice — security for the security. You're starting to get into a paradigm shift where the API economy, that Bezos and Amazon philosophy and approach, goes mainstream. So I've got to ask you, because this is a trend we've been watching and reporting on: the actual application development process is changing, from the old-school software-defined life cycle to one where you've got machine learning and bots, you have AI. People are building apps differently, and the speed at which they want to code is high — and then other teams are slowing them down. I've heard developers complain that security teams hold them up for days. Oh my God, I have to wait five days. It used to be five weeks; now it's five days, and they think that's progress. They want five minutes — the developers want it in real time. So this is a real problem. >>
Well, you know what, shift left was a good thing — it's still a good thing. It helps you figure out the issues early on in the development life cycle, essentially, right? And so you start weaving in security early on and it stays with you. The problem is we are iterating so frequently that you end up with a few hundred vulnerabilities every time you scan — oftentimes a few thousand — and then you go to runtime and you can't really fix all those thousand vulnerabilities, you know? So there's a little bit of a gap there. If you look at the CI/CD cycle — the infinity cycle that they show you, right — you've got the far left, which is where you have the SAST tools, Snyk and all of that. Then you've got the center, which is where you hand off to ops. And on the right side, you've got SecOps. Deepfence essentially starts in the middle and says: look, I know you've had a thousand vulnerabilities. Okay.
But at runtime, I see that only one of those packages is loaded in memory, and only that one is getting traffic. Go and fix that one, because that's the one that's going to hurt you. You see what I'm saying? So that gap is what we're closing. You start with the left; we come in in the middle and stay with you throughout the whole life cycle. >> Yeah, well that >>
touches on a subject. What are the changes that we're seeing — what are the new threats that are associated with containerization? And, coupled with that, looking back on traditional security methods, how are traditional security methods failing us against the new requirements that come out of the microservices and containerized world? >>
So, having been at FireEye — I worked on their Windows products — and at Juniper, >> and been very deeply involved in this space — >>
in fact, at one company we even sold a product to Palo Alto. So having been around the space, I think it's a foregone conclusion to say that attackers have become more sophisticated. Of course they have. It's not a single attack vector that gets you down anymore. It's not a script kiddie sitting somewhere just sending one malicious HTTP request and exploiting you. No — these are multi-vector, multi-stage attacks. They evolve over time and space, you know? So you've got attacks evolving over time and space, and vulnerabilities piling up, right? And on the other side, you've got the infrastructure, which is getting fragmented. What I mean by fragmented is that it's not one data center where everything looks and feels and smells similar — it's containers and Kubernetes and serverless, and all of that stuff is hackable, right? So you've got that big shift happening there, and you've got the attackers. How do you build visibility? In fact, initially we would go and speak with DevSecOps practitioners and ask: hey, what is the core problem? Is it that you don't have enough scanners to scan? Is it something at runtime? What is the main problem? It's the lack of visibility — the lack of observability — throughout the life cycle as well as at runtime. That was the real issue. >>
And the fact is, the attackers know that too. They're exploiting the fact that you can't see, that you're blind. It's like trying to land a plane based on where it flew yesterday and thinking it's landing tomorrow — it's all lagging, right? Exactly. So I've got to ask you, because this comes up a lot — and remember, we're in our 11th season with theCUBE, and I remember conversations going back to 2010 saying the cloud's not secure. That was before everyone realized the cloud is better than on-premises if you do it right. So a trend has emerged, and I want to get your thoughts on this. What percentage of the hacks happen because the attackers are lazy, versus the more sophisticated ones? Because you see two buckets: I'm going to work hard to get this, or I'm going to go for the easy, low-hanging fruit. Most people have a setup that's just low-hanging fruit for the hackers, versus some sort of thought-through, programmatic cloud system — because the cloud actually is better if you do it right. So the more sophisticated the environment, the harder it is for the hackers — a.k.a. barbed wire, whatever you want to call it. At what level do we cross over?
>> When does it go from the script kiddies to the sophisticated attacks?
It's kind of like, okay, I want to go get the S3 bucket or whatever — there are levels of laziness, yeah — versus: I'm really going to orchestrate a spear-phish, social-engineer it. The more sophisticated, economy-driven ones. Yeah. >>
I think, you know, the hacks aren't being conducted the way they were ten, five years ago. It's not just that they've been outsourced — there are sophisticated teams building exploits. There's a whole industry out there, even nation-states. It's an economy, really. So for the known badness, the known attacks, I think we have had tools — signature-based tools that look for certain payloads and say: this is that, I know it, right? The stuff really starts getting out of control when you have so many different modalities running side by side, so many moving attack surfaces. They keep evolving, and you never know that you've scanned enough — because you never have; new code just got pushed. >>
Yeah. So we've been covering IronNet, retired General Keith Alexander's company. They have this iron dome concept where there's more collective sharing. How do you see that trend? Because I can almost imagine the open-source community is going to love what you guys have got — you're probably going to feed on it like it's nobody's business. But then you start thinking: okay, we're going to be open, and you have a platform approach, not so much a tool-based approach — not just "give me tools." When do we cross over to the nirvana of real security sharing — real-time telemetry data? >>
I want to answer this in two parts. The first part is: really, a lot of this wisdom lives only in the community. It's tribal knowledge. There are informal feeds, input from GitHub tickets, and, you know, a lot of these things. What we're really doing with ThreatMapper is consolidating that and giving it out as a platform that you can use — all of it for free. This is the part we are never going to monetize, and we are certain about this. What we are monetizing instead is — like I said, the x-ray or MRI scan of the cloud tells you where the pain points are. That is free. That is a public, collective good. That is ThreatMapper. That is for free. >>
It's shocking it took this long to get to that point in this discussion, by the way. >>
Yeah, >>
This timing's perfect. >>
Security is a collective good, right? And if you're doing open-source, community-based programs, it is for the collective good. The whole of ThreatMapper is going to be open source; we're going to make it a platform. And our commercial version, which is called ThreatStryker, is where we have our core IP. Think about it this way, right: you figured out all the pain points using ThreatMapper, which is free, and now you want the remedy for that pain. That's ThreatStryker — targeted defense, targeted quarantining of the affected workloads, and all that stuff. That's where our IP is. What we really do there is say: look, you figured out the attack surface using ThreatMapper; now use ThreatStryker to protect against the attacks in real time. >>
Free. Is that free too, or is that going to be paid-for? >>
Oh, that's paid-for, yeah. >>
That's awesome.
So you bring the goodness to the party — the goods to the party — share that collective good, and see where it goes. And ThreatStryker on top is how you guys monetize. >>
And that's where we do some uniquely novel things, which I do want to talk about, if I may, for 30 seconds or so. The unique things we do in the industry are basically being able to monitor what comes in, what goes out, and what changes, across time and space — because, look, most of the modern attacks evolve over time and space, right? So you've got to be able to see things like this: here's a particular workload that has a vulnerability — ThreatMapper told you that. What ThreatStryker does is tell you that a bunch of hosts have that vulnerability, and now somebody is sending a malicious HTTP request with a malicious payload. And, you know, tomorrow there's a file-system change, and there's an outbound connection going to some funny place. That is the part that we monitor. >>
Yeah. And you give away the tool to identify the threats, and you sell the hammer. >>
That gives you the protection. >>
Yeah, awesome. I love you guys, love this product, love how you're doing it. I've got to ask you to define: what is security as a microservice? >>
So security as a microservice is a deployment modality for us. Deepfence runs as one console. It's currently self-hosted by the customers within their own infrastructure; going forward, we'll also be launching a SaaS version, the cloud version of it. What happens as part of this deployment is that they run the management console, which is the GUI, plus a tiny sensor that collects telemetry — and that sensor is deployed as a microservice, is what I'm saying. So if you've got 10 containers running, the Deepfence sensor is the 11th container, yet another microservice. And it utilizes eBPF, you know, for tracing and all that stuff. >>
Awesome. Well, I think this is the beginning of a shift in the industry. You start to see DevOps and cloud native technologies become the operating model — not just dev; dev and ops are now in play — and infrastructure as code, which is the ethos of the cloud generation. Now security is code — that's true, and that's what you guys are doing. Thanks for coming on, really appreciate it. Breaking news here on theCUBE — obviously great stuff. Open source continues to grow and win in the new model, and collaboration with it. It's theCUBE bringing you all the coverage, day one of three days. I'm John Furrier, your host, with Dave Nicholson. Thanks for watching.
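To make the prioritization idea from this conversation concrete — a build-time scan may report a thousand vulnerabilities, while the one that matters right now sits in a package that is actually loaded in memory and receiving traffic — here is a deliberately simplified Python sketch. Every field name, weight, and data value here is invented for illustration; this is not Deepfence's actual model, code, or API.

```python
# Toy runtime-aware vulnerability triage: re-rank build-time scan findings
# by whether the affected package is loaded in memory and receiving traffic.
# All field names and weights are hypothetical, for illustration only.

def triage(findings):
    def score(f):
        s = f["cvss"]                    # base severity from the scanner
        if f["loaded_in_memory"]:
            s += 4.0                     # code that is actually running...
        if f["receiving_traffic"]:
            s += 4.0                     # ...and reachable by an attacker
        return s
    return sorted(findings, key=score, reverse=True)

findings = [
    {"pkg": "libxml2", "cvss": 9.8, "loaded_in_memory": False, "receiving_traffic": False},
    {"pkg": "openssl", "cvss": 7.5, "loaded_in_memory": True,  "receiving_traffic": True},
    {"pkg": "zlib",    "cvss": 5.3, "loaded_in_memory": True,  "receiving_traffic": False},
]

for f in triage(findings):
    print(f["pkg"])  # openssl first: lower CVSS, but live and exposed
```

The point is only the shape of the idea: runtime signals reorder a static severity list, so the team fixes the live, reachable flaw first instead of the scariest-looking number on the report.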
Maria Colgan & Gerald Venzl, Oracle | June CUBEconversation
(upbeat music) Developers have become the new kingmakers in the world of digital and cloud. The rise of containers and microservices has accelerated the transition to cloud native applications. A lot of people will talk about application architecture, the related paradigms, and the benefits they bring for the process of writing and delivering new apps. But a major challenge continues to be the how and the what when it comes to accessing, processing and getting insights from the massive amounts of data we have to deal with in today's world. And with me are two experts from the data management world, who will share how they think about the best techniques and practices, based on what they see at large organizations working with data and developing so-called data-driven apps. Please welcome Maria Colgan and Gerald Venzl, two distinguished product managers from Oracle. Folks, welcome — thanks so much for coming on. >> Thanks for having us, Dave. >> Thank you very much for having us. >> Okay, Maria, let's start with you. So, we throw around this term data-driven, data-driven applications. What are we really talking about there? >> So data-driven applications are applications that work on a diverse set of data — anything from spatial to sensor data, document data, as well as your usual transaction processing data. And they generate value from that data in very different ways from a traditional application. So, for example, they may use machine learning to do product recommendations in the middle of a transaction. Or we could use graph to identify an influencer within a community, so we can target them with a specific promotion. They could also use spatial data to help find the nearest store to a particular customer. And because these apps are deployed on multiple platforms — everything from mobile devices to standard browsers — they need a data platform that's going to be secure, reliable and scalable. >> Well, so when you think about how the workloads are shifting: it's not anymore a world of just your ERP or your HCM or your CRM — kind of the traditional operational systems. You really are seeing an explosion of these new data-oriented apps. You're seeing modeling in the cloud; you're going to see more and more inferencing, inferencing at the edge. But Maria, maybe you could talk a little bit about the benefits that customers are seeing from developing these types of applications. I mean, why should people care about data-driven apps? >> Oh, for sure, there are massive benefits to them. Probably the most obvious one for any business, regardless of industry, is that they not only allow you to understand what your customers are up to, but they allow you to anticipate those customers' needs. That helps businesses maintain their competitive edge and retain their customers. But it also helps them make data-driven decisions in real time, based on actual data, rather than on somebody's gut feeling or on historical data. So, for example, you can do real-time price adjustments on products based on demand, that kind of thing. It really changes the way people do business today. >> So Gerald, you think about the narrative in the industry: everybody wants to be a platform player; all your customers are becoming software companies, they're becoming platform players.
Everybody wants to be — you know, name a company with a huge, trillion-dollar market cap or whatever — and those are data-driven companies. And so it would seem to me that, when it comes to data-driven applications, no company really shouldn't be data-driven. Do you buy that? >> Yeah, absolutely. I mean, naturally the whole industry is data-driven, right? We all use information technology to process data and derive information out of it. But when it comes to app development, I think there's a big push of "we have to do machine learning in our applications, we have to get insights from data." And when you take a step back, you see that there are of course many different kinds of applications out there as well — that's not to be forgotten, right? There are the usual front-end user interfaces where all the application really does is enter some piece of information that's stored somewhere, or perhaps a microservice that's not attached to a database at all, but just receives or makes calls (indistinct). So I think it's not necessarily so important for every developer to jump on the bandwagon of having to be data-driven. But I think it's equally important for those developers who build the applications that drive the business, that make business-critical decisions, as Maria mentioned before. Those folks should take a really close look into what data-driven apps mean and what the database can actually give them. Because what we also see happening a lot is that things that are well known, already out there, and ready to use are being reimplemented in the applications. And for those applications, developers essentially end up spending more time writing code that already exists — and then having to maintain and debug that code as well — rather than just going to market faster. >> Gerald, can you talk about the prevailing approaches that developers take to build data-driven applications? What are the ones that you see? Let's dig into that a little bit more, and maybe differentiate the different approaches. >> Yeah, absolutely. I think right now the industry is in two camps — there's sort of a religious war going on, as you often see with different architectures and so forth. So we have single-purpose databases, or data management technologies, which are, as the name suggests, built around a single purpose. A typical example would be your ordinary key-value store. All a key-value store does is allow you to store and retrieve a piece of data — whatever that may be — really, really fast, but it doesn't go beyond that. And on the other side of the house, the other camp, we have multimodal databases — multimodal data management technologies. Those are technologies that allow you to store different types and formats of data in the same system, alongside each other. And when you look at the technologies out there, pretty much any relational database — any database, really — has evolved into such a multimodal database. Whether that's MySQL, which allows you to store JSON alongside relational data, or even a MongoDB, which gives you native graph support since (mumbles), alongside the JSON support.
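As a runnable illustration of that multimodal point, the sketch below uses Python's built-in sqlite3 module as a stand-in for any multimodal engine — it is not Oracle Database syntax, and the table and data are made up. It assumes a SQLite build with the JSON1 functions available, which is the default in recent releases.

```python
import sqlite3

# One table holds classic relational columns and a JSON document side by side.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        id       INTEGER PRIMARY KEY,
        customer TEXT NOT NULL,
        doc      TEXT NOT NULL          -- JSON payload stored alongside
    )
""")
conn.executemany(
    "INSERT INTO orders (customer, doc) VALUES (?, ?)",
    [
        ("alice", '{"items": ["keyboard", "mouse"], "total": 72.50}'),
        ("bob",   '{"items": ["monitor"], "total": 249.00}'),
    ],
)

# One SQL query mixes relational predicates with JSON path extraction —
# no second database, driver, or data copy required.
rows = conn.execute("""
    SELECT customer, json_extract(doc, '$.total') AS total
    FROM   orders
    WHERE  json_extract(doc, '$.total') > 100
""").fetchall()
print(rows)  # [('bob', 249.0)]
```

The same knowledge — connection handling, SQL, one driver — covers both the document and the relational use case, which is exactly the learning-curve argument being made here.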
>> Well, it's clearly a trend in the industry. We've talked about this a lot on theCUBE, and we know where Oracle stands on this. I mean, you just mentioned MySQL, but Oracle Database you've been extending: you've mentioned JSON, we've got blockchain now in there, you're infusing ML and AI into the database, graph database capabilities — on and on and on. And we've compared that to Amazon, which takes kind of the right-tool-for-the-right-job approach. So maybe you could talk about your point of view: the benefits for developers of using that converged database, if I can use that word — the approach of being able to store multiple data formats. Why do you feel like that's a better approach? >> Yeah, I think on a high level it comes down to complexity — you are avoiding additional complexity, right? Not every use case you have necessarily warrants yet another data management technology, or a special-built technology for managing that data. Many use cases we see out there just want to store a piece of JSON — a document — in a database and perhaps retrieve it again afterwards, or write some simple queries over it. And you really don't have to bring a new database technology or a NoSQL database into the mix, if you already have one, just to fulfill that exact use case. You could happily store that information in the database you already have. And what it really comes down to is the learning curve for developers, right? If you use the same technology to store other types of data, you don't have to learn a new technology, you don't have to familiarize yourself with new drivers, you don't have to find new frameworks, and you don't have to know how to operate, or how to best model your data for, that new database. You can essentially reuse your knowledge of the technology, as well as the libraries and code you have already built in house — perhaps in another application, perhaps a framework you used against the same technology — because it is still the same technology. So it all comes down again to avoiding complexity rather than fragmenting across the many different technologies we have. If you look at the different data formats out there today, you would end up with many different databases just to store them, if you were to religiously follow the single-purpose, best-built-technology-for-every-use-case paradigm, right? And then you would end up managing many different databases rather than actually focusing on your app and getting value to your business or to your user. >> Okay, so I get that — and I buy that, by the way, especially if you're a larger organization and you've got all these projects going on. But before we go back to Maria — Gerald, I want to push on that a little bit, because the counter to that argument would be an analogy, and I'd love for you to knock this analogy off the blocks. The counter would be: okay, Oracle is the Swiss Army knife — it's got, you know, everything in one. But sometimes I need that specialized long screwdriver, and I go into my toolbox and grab that; it's better than the screwdriver in my Swiss Army knife. Why — are you the Swiss Army knife of databases? Or are you the all-in-one that also has that best-of-breed screwdriver for me? How do you think about that?
And I think it's first of all, you have to separate between Oracle the company that has actually multiple data management technologies and databases out there as you said before, right? And Oracle Database. And I think Oracle Database is definitely a Swiss Army knife has many capabilities of since the last 40 years, you know that we've seen object support coming that's still in the Oracle Database today. We have seen XML coming, it's still in the Oracle Database, graph, spatial, et cetera. And so you have many different ways of managing your data and then on top of that going into the converge, not only do we allow you to store the different data model in there but we actually allow you also to, you apply all the security policies and so forth on top of it something Maria can talk more about the mission around converged database. I would also argue though that for some aspects, we do actually have to or add a screwdriver that you talked about as well. So especially in the relational world people get very quickly hung up on this idea that, oh, if you only do rows and columns, well, that's kind of what you put down on disk. And that was never true, it's the relational model is actually a logical model. What's probably being put down on disk is blocks that align themselves nice with block storage and always has been. So that allows you to actually model and process the data sort of differently. And one common example or one good example that we have that we introduced a couple of years ago was when, column and databases were very strong and you know, the competition came it's like, yeah, we have In-Memory column that stores now they're so much better. And we were like, well, orienting the data role-based or column-based really doesn't matter in the sense that we store them as blocks on disks. And so we introduced the in memory technology which gives you an In-Memory column, a representation of your data as well alongside your relational. So there is an example where you go like, well, actually you know, if you have this use case of the column or analytics all In-Memory, I would argue Oracle Database is also that screwdriver you want to go down to and gives you that capability. Because not only gives you representation in columnar, but also which many people then forget all the analytic power on top of SQL. It's one thing to store your data columnar, it's a completely different story to actually be able to run analytics on top of that and having all the built-in functionalities and stuff that you want to do with the data on top of it as you analyze it. >> You know, that's a great example, the kilometer 'cause I remember there was like a lot of hype around it. Oh, it's the Oracle killer, you know, at Vertica. Vertica is still around but, you know it never really hit escape velocity. But you know, good product, good company, whatever. Natezza, it kind of got buried inside of IBM. ParXL kind of became, you know, red shift with that deal so that kind of went away. Teradata bought a company, I forget which company it bought but. So that hype kind of disapated and now it's like, oh yeah, columnar. It's kind of like In-Memory, we've had a In-Memory databases ever since we've had databases you know, it's a kind of a feature not a sector. But anyway, Maria, let's come back to you. You've got a lot of customer experience. And you speak with a lot of companies, you know during your time at Oracle. 
>> You know, that's a great example, the columnar, 'cause I remember there was a lot of hype around it. Oh, it's the Oracle killer, you know — Vertica. Vertica is still around, but it never really hit escape velocity — good product, good company, whatever. Netezza kind of got buried inside of IBM. ParAccel kind of became, you know, Redshift with that deal, so that kind of went away. Teradata bought a company — I forget which company it bought. So that hype kind of dissipated, and now it's like, oh yeah, columnar — it's kind of like in-memory. We've had in-memory databases ever since we've had databases, you know; it's kind of a feature, not a sector. But anyway, Maria, let's come back to you. You've got a lot of customer experience, and you speak with a lot of companies during your time at Oracle. What else are you seeing in terms of the benefits of this approach that might not be so intuitive and obvious right away? >> I think one of the biggest benefits to having a multimodel, multiworkload — or, as we call it, converged — database is the fact that you can get greater data synergy from it. In other words, you can utilize all these different techniques and data models to get better value out of that data. So, things like being able to do real-time machine-learning fraud detection inside a transaction, or being able to do a product recommendation by accessing three different data models. For example, if I'm trying to recommend a product for you, Dave, I might use graph analytics to figure out your community — not just your friends, but other people on our system who look and behave just like you. Once I know that community, I can go and see what products they bought by looking at our product catalog, which may be stored as JSON. And on top of that, I can then use key-value to see which products inside that catalog those community members gave a five-star rating to. That way I can really pinpoint the right product for you. And I can do all of that in one transaction inside the database, without having to transform the data into different models or, God forbid, access different systems to get all of that information. So it really simplifies how we can generate value from the data. And of course, the other thing our customers love is that deploying data-driven apps on a converged database is much simpler, because it is that standard data platform. You're not having to manage multiple independent single-purpose databases. You're not having to implement the security and high-availability policies across a bunch of diverse platforms. All of that can be done much more simply with a converged database, because the DBA team, of course, just uses that standard set of tools to manage, monitor and secure those systems. >> Thank you for that. And you know, it's interesting — you talk about simplification, and you are in Juan's organization, so you have a big focus on mission critical. One of the things that I think is often overlooked — well, we talk about it all the time — is recovery. If things are simpler, recovery is faster and easier. And that's kind of the hallmark of Oracle: the gold standard for the toughest, most mission-critical apps. But I wanted to get to the cloud, Maria. Everything is going to the cloud, right? Well, not all workloads are going to the cloud, but everybody is talking about the cloud, everybody has a cloud-first mentality — and yes, it's a hybrid world. So the natural next question is: how do you think the cloud fits into this world of data-driven apps? >> I think, just like with any app you're developing, the cloud helps to accelerate the development — and of course the deployment — of these data-driven applications. 'Cause if you think about it, the developer is instantly able to provision a converged database that Oracle will automatically manage and look after for them. And what's great, if you use something like our autonomous database service, is that it comes in different flavors. You can get autonomous transaction processing, data warehousing, or autonomous JSON, so the developer gets a database that's been optimized for their specific use case, whatever they're trying to solve.
And it's also going to contain all of that great functionality and the capabilities we've been talking about. So what that really means to the developer is that as the project evolves and, inevitably, the business needs change a little, there's no need to panic when one of those changes comes in, because your converged database — your autonomous database — has all of those additional capabilities. You can simply utilize them to address the evolving changes in the project. 'Cause let's face it, none of us normally knows exactly what we need to build right at the very beginning. And on top of that, developers also kind of get a built-in buddy in the cloud, especially in the autonomous database. That buddy comes in the form of built-in workload optimizations. With the autonomous database we do things like automatic indexing, where we're using machine learning to be that buddy for the developer. What it does is monitor the workload and see what kinds of queries are being run on that system. Then it determines whether there are indexes that should be built to help improve the performance of that application. And not only does it build those indexes, it verifies that they actually improve performance before publishing them to the application. So by the time the developer is finished with that app and it's ready to be deployed, it's also been optimized by the developer's buddy, the Oracle autonomous database. So, you know, it's a really nice helping hand for developers when they're building any app, especially data-driven apps.
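As a toy model of the monitor-build-verify loop Maria describes, the sketch below times a query, creates a candidate index, re-times the query, and keeps the index only if it actually helped. Oracle's automatic indexing does workload-wide analysis with machine learning; this sqlite3 version only captures the shape of the idea, and the table, query, and index names are made up.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO events (user_id, payload) VALUES (?, ?)",
    ((i % 5000, "x" * 50) for i in range(200_000)),
)

QUERY = "SELECT COUNT(*) FROM events WHERE user_id = 4242"

def timed(sql, runs=20):
    # Crude stand-in for workload monitoring: wall-clock over repeated runs.
    start = time.perf_counter()
    for _ in range(runs):
        conn.execute(sql).fetchone()
    return time.perf_counter() - start

before = timed(QUERY)
conn.execute("CREATE INDEX idx_events_user ON events(user_id)")  # candidate index
after = timed(QUERY)

if after < before:
    print(f"index kept: {before:.3f}s -> {after:.3f}s")
else:
    conn.execute("DROP INDEX idx_events_user")                   # didn't help: roll back
    print("index dropped")
```

The verify-before-publish step is the part worth noticing: an index that doesn't pay for itself is rolled back rather than left to slow down writes.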
>> I like how you sort of gave us the truth here: you don't always know where you're going when you're building an app. You start building it and figure out where it's going to go — with Agile, that's kind of how it works. But so I wonder, can you give some examples — maybe customers, or genericize them if you need to — of data-driven apps in the cloud where customers were able to drive more efficiency, where the cloud buddy allowed the customers to do more with less? >> Oh, we have tons of these, but I'll try and keep it to just a couple. One that comes to mind straight away is Retraced. These folks built a blockchain app in the Oracle Cloud that allows manufacturers to actually share the supply chain with the consumer. So the consumer can see exactly who made their product, using what raw materials, where they were sourced from, how it was done — all of that is visible to the consumer. And in order to share that, they had to work on a very diverse set of data: everything from JSON documents to images, as well as your traditional transactions. They store all of that information inside the Oracle autonomous database, and they were able to build their app and deploy it in the cloud — and do all of that very, very quickly. So that ability to work on multiple different data types in a single database really helped them build that product and get it to market in a very short amount of time. Another customer doing something really interesting is MineSense. These guys operate in the largest mines in Canada, Chile, and Peru. What they do is put x-ray devices on the massive mechanical shovels at the mine face, and those devices sense the contents of the buckets on these mining machines. They analyze that content to see how they can optimize the processing of the ore inside the bucket — looking to minimize the amount of power and water it's going to take to process it, and of course to minimize the amount of waste that comes out of the process. All of that sensor data is sent into an autonomous database, where it's processed by a whole host of different users — everything from the mine engineers to the geoscientists to their own data scientists, who utilize that data to drive the business forward. And what I love about these guys is that they're not happy with building just one app. MineSense actually uses our built-in low-code development environment, APEX, which comes as part of the autonomous database, and they constantly produce applications for different aspects of their business using that technology. It now takes them just a couple of days or weeks to deliver a new app to the business, instead of months or years. >> Great, thank you for that, Maria. Gerald, I'm going to push you again. I said upfront and talked about microservices and the cloud and containers, and anybody in the developer space follows that very closely. But some of the things we've been talking about here — people might look at that and say, well, they're kind of antithetical to microservices; this is Oracle's monolithic approach. When you think about the benefits of microservices, people want freedom of choice — technology choice is seen as a big advantage of microservices and containers. How do you address such an argument?
So, you know, there you have it where you go like, well, maybe not every microservice is actually in fact talking to its own database or its own special purpose database. I think there, you know, well, what we should, the industry should be focusing much more on this argument of which technology to use? What's the right tool for a job? Is more to ask themselves, what business problem actually are we trying to solve? And therefore what's the right approach and the right technology for this. And so therefore, just as I said before, you know multimodal databases they do have strong benefits. They have many built-in functionalities that are already there and they allow you to reduce this complexity of having to know many different technologies, right? And so it's not only to store different data models either you know, treat a multimodal database as a chasing documents store or a relational database but most databases are multimodal since 20 plus years. But it's also actually being able to perhaps if you store that data together, you can perhaps actually derive additional value for somebody else but perhaps not for your application. But like for example, if you were to use Oracle Database you can actually write queries on top of all of that data. It doesn't really matter for our query engine whether it's the data is format that then chase or the data is formatted in rows and columns you can just rather than query over it. And that's actually very powerful for those guys that have to, you know get the reporting done the end of the day, the end of the week. And for those guys that are the data scientists that they want to figure out, you know which product performed really well or can we tweak something here and there. When you look into that space you still see a huge divergence between the guys to put data in kind of the altarpiece style and guys that try to derive new insights. And there's still a lot of ETL going around and, you know we have big data technologies that some of them come and went and some of them came in that are still around like Apache Spark which is still like a SQL engine on top of any of your data kind of going back to the same concept. And so I will say that, you know, for developers when we look at microservices it's like, first of all, is the argument you were making because the vendor or the technology you want to use tells you this argument or, you know, you kind of want to have an argument to use a specific technology? Or is it really more because it is the best technology, to best use for this given use case for this given application that you have? And if so there's of course, also nothing wrong to use a single purpose technology either, right? >> Yeah, I mean, whenever I talk about Oracle I always come back to the most important applications, the mission critical. It's very difficult to architect databases with microservices and containers. You have to be really, really careful. And so and again, it comes back to what we were talking before about with Maria that the complexity and the recovery. But Gerald I want to stay with you for a minute. So there's other data management technologies popping out there. I mean, I've seen some people saying, okay just leave the data in an S3 bucket. We can query that, then we've got some magic sauce to do that. And so why are you optimistic about you know, traditional database technology going forward? >> I would say because of the history of databases. 
So one thing that once struck me when I came to Oracle and then got to meet great people like Juan Luis and Andy Mendelsohn who had been here for a long, long time. I come to realization that relational databases are around for about 45 years now. And, you know, I was like, I'm too young to have been around then, right? So I was like, what else was around 45 years? It's like just the tech stack that we have today. It's like, how does this look like? Well, Linux only came out in 93. Well, databases pre-date Linux a lot rather than as I started digging I saw a lot of technologies come and go, right? And you mentioned before like the technologies that data management systems that we had that came and went like the columnar databases or XML databases, object databases. And even before relational databases before Cot gave us the relational model there were apparently these networks stores network databases which to some extent look very similar to adjacent documents. There wasn't a harder storing data and a hierarchy to format. And, you know when you then start actually reading the Cot paper and diving a little bit more into the relation model, that's I think one important crux in there that most of the industry keeps forgetting or it hasn't been around to even know. And that is that when Cot created the relational model, he actually focused not so much on the application putting the data in, but on future users and applications still being able to making sense out of the data, right? And that's kind of like I said before we had those network models, we had XML databases you have adjacent documents stores. And the one thing that they all have along with it is like the application that puts the data in decides the structure of the data. And that's all well and good if you had an application of the developer writing an application. It can become really tricky when 10 years later you still want to look at that data and the application that the developer is no longer around then you go like, what does this all mean? Where is the structure defined? What is this attribute? What does it mean? How does it correlate to others? And the one thing that people tend to forget is that it's actually the data that's here to stay not someone who does the applications where it is. Ideally, every company wants to store every single byte of data that they have because there might be future value in it. Economically may not make sense that's now much more feasible than just years ago. But if you could, why wouldn't you want to store all your data, right? And sometimes you actually have to store the data for seven years or whatever because the laws require you to. And so coming back then and you know, like 10 years from now and looking at the data and going like making sense of that data can actually become a lot more difficult and a lot more challenging than having to first figure out and how we store this data for general use. And that kind of was what the relational model was all about. We decompose the data structures into tables and columns with relationships amongst each other so therefore between each other. So that therefore if somebody wants to, you know typical example would be well you store some purchases from your web store, right? There's a customer attribute in it. There's some credit card payment information in it, just some product information on what the customer bought. 
Well, in the relational model, if you just want to figure out which products were sold on a given day or week, you just query the purchases and products tables to get that answer — you don't need to touch the customer data and so forth. With the hierarchical model, you have to first sit down and understand the structure: what is the customer, where is the payment? Does the document start with the payment, or does it start with the customer? Where do I find this information? And in the very early days, those databases even struggled to avoid scanning all the documents to get the data out. So, coming back to your question a bit — and I apologize for going on here — relational databases have been around for 45 years, and I'd actually argue they're one of the most successful software technologies we have, when you look at the overall industry, right? Forty-five years in IT terms is like a star being born and going supernova. As we said before, many technologies came and went, right? One more really interesting example, by the way, is Hadoop and HDFS. They gave us this additional promise — you know, in the 2010s, like 2012, 2013, the hype of Hadoop and (mumbles) and HDFS — of: just put everything into HDFS and worry about the data later, right? We can query it and MapReduce it and whatever. And we had customers actually coming to us saying: great, we have half a petabyte of data on an HDFS cluster, and we have no clue what's stored in there. How do we figure this out? What are we going to do now? Now you have a big data-cleansing problem. So I think that is why databases — and also data modeling — are not going away anytime soon, and why database technologies are here to stay for quite a while. Because many people don't think about what's happening to the data five years from now. And many of the niche players — and frankly even Amazon, following this single-purpose thing of "just use the right tool for the job for your application, just put the data in there the way you want" — well, then you use technologies all over the place, and five years from now your data is fragmented everywhere, in different formats, with inconsistencies, and so on. And when you come back to these data-driven, business-critical decision applications, that is the worst-case scenario you can have, right? Because now you need an army of people to do data cleansing. It's not a coincidence that data science has become very, very popular in recent years, as we went on with this proliferation of different database — or data management — technologies; some of those are not even databases. But I think I'll leave it at that.
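Gerald's web-store example translates directly into a small schema. In the decomposed form below, the "which products sold on a given day" question touches only the purchases and products tables; the customer and payment details never have to be read or parsed, which is exactly his point about future users still making sense of the data. sqlite3 again stands in for any relational engine, and the schema and data are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchases (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        product_id  INTEGER REFERENCES products(id),
        sold_on     TEXT    -- ISO date
    );
    CREATE TABLE payments (
        id          INTEGER PRIMARY KEY,
        purchase_id INTEGER REFERENCES purchases(id),
        card_last4  TEXT
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO products  VALUES (1, 'keyboard'), (2, 'monitor');
    INSERT INTO purchases VALUES
        (1, 1, 1, '2021-06-01'),
        (2, 2, 1, '2021-06-01'),
        (3, 2, 2, '2021-06-02');
    INSERT INTO payments  VALUES (1, 1, '4242'), (2, 2, '1881'), (3, 3, '0005');
""")

# Years later, a reporting query needs no knowledge of how the original
# application nested its documents — only the tables and their relationships.
rows = conn.execute("""
    SELECT p.name, COUNT(*) AS sold
    FROM   purchases pu JOIN products p ON p.id = pu.product_id
    WHERE  pu.sold_on = '2021-06-01'
    GROUP  BY p.name
""").fetchall()
print(rows)  # [('keyboard', 2)]
```

In a hierarchical or document layout, answering the same question would mean knowing whether each document starts with the customer or the payment — the structural knowledge that walks out the door with the original developer.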
>> It's an interesting talk track, because you're right: no-schema-on-write was alluring, but it definitely created some problems. It also created — you referenced the hyper-specialized roles — that whole data-cleansing component. Maybe technology will eventually solve that problem, but it hasn't, at least as of tonight. Okay, last question. Maria, maybe you could start off, and Gerald, if you want to chime in as well, that'd be great. It's interesting to watch this industry — Oracle sort of won the top database mantle. I mean, I watched it, I saw it. It was — remember — Informix, and it was (indistinct) too, and of course Microsoft, you've got to give them credit with SQL Server; but Oracle won the database wars. And then everything got kind of quiet for a while; database was sort of boring. And then it exploded — the NoSQL and the key-value stores and the cloud databases — and this is really a hot area now. And when we looked at Oracle, we said: okay, Oracle, it's all about Oracle Database. But we've seen the kind of resurgence in MySQL — which everybody thought, once Oracle bought Sun, they were going to kill — and now we see you investing in HeatWave, TimesTen; we talked about in-memory databases before. So where do those fit, Maria, in the grand scheme? How should we think about Oracle's database portfolio? >> So there are lots of places where you'd use those different things. 'Cause just like in any other industry, there are going to be new and boutique use cases that benefit from a more specialized or single-purpose product. Good examples, off the top of my head, of the kinds of systems that would benefit from that would be things like a stock exchange system or a telephone exchange system. Both of those are latency-critical transaction processing applications that need microsecond response times — and that's going to exceed, perhaps, what you might normally get or deploy with a converged database. And so Oracle's TimesTen database, our in-memory database, is perfect for those kinds of applications. But there's also a host of MySQL applications out there today, and — you said it yourself there, Dave — HeatWave is a great place to provision and deploy those kinds of applications, because it's going to run 100 times faster than AWS (mumbles). So there really is a place in the market — and in our customers' systems and the needs they have — for all of these different members of our database family here at Oracle. >> Yeah, well, the internet is basically running on the LAMP stack, so I don't see MySQL going away. All right, Gerald, we'll give you the final word — bring us home. >> Oh, thank you very much. Yeah, I mean, as Maria said, I think it comes back to what we discussed before. There obviously are still needs for specialized technologies, or different technologies than a relational or multimodal database. Oracle actually has many more databases than people may first think of — not only the three we've already mentioned, but there's also Oracle's NoSQL database, for example. And on a high level, Oracle is a data management company, right? We want to give our customers the best tools and the best technology to manage all of their data. So there has to be a part of the business that also focuses on these highly specialized systems and highly specialized technologies that address those use cases. And I think it makes perfect sense: when a customer comes to Oracle, they're not just getting "take this one product, and if you don't like it, that's your problem" — they actually have choice, right? And choice allows you to make a decision based on what's best for you, and not necessarily what's best for the vendor you're talking to. >> Well guys, really appreciate your time today and your insights. Maria, Gerald, thanks so much for coming on theCUBE. >> Thank you very much for having us. >> And thanks for watching this CUBE Conversation. This is Dave Vellante, and we'll see you next time. (upbeat music)
SUMMARY :
in the world of digital and cloud. and the benefits they bring What are we really talking about there? the nearest stores to kind of the traditional So it really changes the way So Gerald, you think about to you at all but just receives or even a MongoDB that allows you to do ML and AI into the database, in the database you already have. and I buy that by the way. of since the last 40 years, you know the benefits to this approach is the fact that you can get And so one of the things that And that buddy comes in the form of the truth here is you don't and deploy it on the cloud. and the cloud and containers and you know, is the argument you were making that the complexity and the recovery. because the laws require you to. And then it exploded, you and the needs they have in the lamp stack so I and the best technology to and your insights. we'll see you next time.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Vellante | PERSON | 0.99+ |
Gerald Venzl | PERSON | 0.99+ |
Andy Mendelsohn | PERSON | 0.99+ |
Maria | PERSON | 0.99+ |
Chile | LOCATION | 0.99+ |
Peru | LOCATION | 0.99+ |
Maria Colgan | PERSON | 0.99+ |
Canada | LOCATION | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
Gerald | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
Maria Colgan | PERSON | 0.99+ |
seven years | QUANTITY | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Juan Luis | PERSON | 0.99+ |
100 times | QUANTITY | 0.99+ |
five star | QUANTITY | 0.99+ |
Dave | PERSON | 0.99+ |
ORGANIZATION | 0.99+ | |
two experts | QUANTITY | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Sun | ORGANIZATION | 0.99+ |
45 years | QUANTITY | 0.99+ |
MySQL | TITLE | 0.99+ |
three | QUANTITY | 0.99+ |
yesterday | DATE | 0.99+ |
each microservice | QUANTITY | 0.99+ |
Swiss Army | ORGANIZATION | 0.99+ |
early 2010s | DATE | 0.99+ |
Teradata | ORGANIZATION | 0.99+ |
Swiss Army | ORGANIZATION | 0.99+ |
Linux | TITLE | 0.99+ |
10 years later | DATE | 0.99+ |
2012 | DATE | 0.99+ |
two camps | QUANTITY | 0.99+ |
SQL | TITLE | 0.99+ |
Both | QUANTITY | 0.98+ |
Oracle Database | TITLE | 0.98+ |
2010s | DATE | 0.98+ |
TimesTen | ORGANIZATION | 0.98+ |
Hadoop | TITLE | 0.98+ |
first | QUANTITY | 0.98+ |
Oracles | ORGANIZATION | 0.98+ |
Vertica | ORGANIZATION | 0.98+ |
tonight | DATE | 0.98+ |
2013 | DATE | 0.98+ |
Maria Colgan & Gerald Venzl, Oracle | June CUBEconversation
(upbeat music) >> Developers have become the new kingmakers in the world of digital and cloud. The rise of containers and microservices has accelerated the transition to cloud native applications. A lot of people will talk about application architecture and the related paradigms and the benefits they bring for the process of writing and delivering new apps. But a major challenge continues to be the how and the what when it comes to accessing, processing and getting insights from the massive amounts of data that we have to deal with in today's world. And with me are two experts from the data management world who will share with us how they think about the best techniques and practices based on what they see at large organizations who are working with data and developing so-called data-driven apps. Please welcome Maria Colgan and Gerald Venzl, two distinguished product managers from Oracle. Folks, welcome, thanks so much for coming on. >> Thanks for having us Dave. >> Thank you very much for having us. >> Okay, Maria let's start with you. So, we throw around this term data-driven, data-driven applications. What are we really talking about there? >> So data-driven applications are applications that work on a diverse set of data. So anything from spatial to sensor data, document data, as well as your usual transaction processing data. And what they're going to do is they'll generate value from that data in very different ways to a traditional application. So for example, they may use machine learning to do product recommendations in the middle of a transaction. Or we could use graph to be able to identify an influencer within the community so we can target them with a specific promotion. It could also use spatial data to be able to help find the nearest stores to a particular customer. And because these apps are deployed on multiple platforms, everything from mobile devices as well as standard browsers, they need a data platform that's going to be both secure, reliable and scalable.
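To make the spatial piece of Maria's answer concrete, here is a minimal sketch of the kind of nearest-store query such an app might run. It is illustrative only, not from the interview: the STORES table, connect string and coordinates are invented, and it assumes Oracle Spatial's SDO_NN nearest-neighbor operator called through plain JDBC.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Hypothetical schema: STORES(name, location SDO_GEOMETRY) with a spatial index on location.
public class NearestStore {
    public static void main(String[] args) throws Exception {
        String sql =
            "SELECT s.name " +
            "FROM stores s " +
            "WHERE SDO_NN(s.location, " +
            "             SDO_GEOMETRY(2001, 4326, SDO_POINT_TYPE(?, ?, NULL), NULL, NULL), " +
            "             'sdo_num_res=3') = 'TRUE'";   // the three nearest stores
        try (Connection con = DriverManager.getConnection(
                 "jdbc:oracle:thin:@//dbhost:1521/pdb1", "app", "secret"); // placeholder connect string
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setDouble(1, -122.14);  // customer longitude (example value)
            ps.setDouble(2, 37.44);    // customer latitude (example value)
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println("Nearby store: " + rs.getString(1));
                }
            }
        }
    }
}
```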
>> Well, so when you think about how the workloads are shifting, I mean, we're not talking about, you know, it's not anymore a world of just your ERP or your HCM or your CRM, you know, kind of the traditional operational systems. You really are seeing an explosion of these new data oriented apps. You're seeing, you know, modeling in the cloud, you are going to see more and more inferencing, inferencing at the edge. But Maria, maybe you could talk a little bit about sort of the benefits that customers are seeing from developing these types of applications. I mean, why should people care about data-driven apps? >> Oh, for sure, there's massive benefits to them. I mean, probably the most obvious one for any business, regardless of the industry, is that they not only allow you to understand what your customers are up to, but they allow you to be able to anticipate those customers' needs. So that helps businesses maintain that competitive edge and retain their customers. But it also helps them make data-driven decisions in real time based on actual data, rather than on somebody's gut feeling or basing those decisions on historical data. So for example, you can do real-time price adjustments on products based on demand and so forth, that kind of thing. So it really changes the way people do business today. >> So Gerald, you think about the narrative in the industry, everybody wants to be a platform player, all your customers, they are becoming software companies, they are becoming platform players. Everybody wants to be like, you know, name a company that has a huge trillion dollar market cap or whatever, and those are data-driven companies. And so it would seem to me that with data-driven applications, there's really no company that shouldn't be data-driven. Do you buy that? >> Yeah, absolutely. I mean, naturally the whole industry is data-driven, right? It's like, we all have information technologies for processing data and deriving information out of it. But when it comes to app development I think there is a big push to, kind of, like, we have to do machine learning in our applications, we have to get insights from data. And when you actually look back a bit and take a step back, you see that there are of course many different kinds of applications out there as well, that's not to be forgotten, right? So there are the usual front end user interfaces where really all the application does is enter some piece of information that's stored somewhere, or perhaps a microservice that's not attached to a data tier at all but just receives or asks calls (indistinct). So I think it's not necessarily so important for every developer to kind of go on a bandwagon that they have to be data-driven. But I think it's equally important for those applications and those developers that build applications that drive the business, that make business critical decisions, as Maria mentioned before. Those guys should take really a close look into what data-driven apps mean and what the data tier can actually give to them. Because what we see also happening a lot is that a lot of the things that are well known and out there just ready to use are being reimplemented in the applications. And for those applications, they essentially just end up spending more time writing code that is already there, and then have to maintain and debug that code as well, rather than just going to market faster. >> Gerald, can you talk to the prevailing approaches that developers take to build data-driven applications?
What are the ones that you see? Let's dig into that a little bit more and maybe differentiate the different approaches and talk about that? >> Yeah, absolutely. I think right now the industry is like in two camps; it's like sort of a religious war going on, that you'll see often happening with different architectures and so forth. So we have single purpose databases or data management technologies, which are technologies that are, as the name suggests, built around a single purpose. So, you know, a typical example would be your ordinary key-value store. And a key-value store, all it does is it allows you to store and retrieve a piece of data, whatever that may be, really, really fast, but it doesn't really go beyond that. And then the other side of the house, or the other camp, would be multimodal databases, multimodal data management technologies. Those are technologies that allow you to store different types of data, different formats of data, in the same technology, in the same system, alongside each other. And, you know, when you look at the landscape out there of what we have in technology, pretty much any relational database, or any database really, has evolved into such a multimodal database. Whether that's MySQL, which allows you to store JSON alongside relational, or even a MongoDB that gives you native graph support since (mumbles), as well, alongside the JSON support. >> Well, it's clearly a trend in the industry. We've talked about this a lot in The Cube. We know where Oracle stands on this. I mean, you just mentioned MySQL, but I mean, Oracle Database, you've been extending it, you've mentioned JSON, we've got blockchain now in there, you're infusing, you know, ML and AI into the database, graph database capabilities, you know, on and on and on. We've talked a lot about that and compared it to Amazon, which is kind of the right-tool-for-the-right-job approach. So maybe you could talk about, you know, your point of view, the benefits for developers of using that converged database, if I can use that word, approach, being able to store multiple data formats. Why do you feel like that's a better approach? >> Yeah, I think on a high level it comes down to complexity. You are actually avoiding additional complexity, right? So not every use case that you have necessarily warrants having yet another data management technology, or yet another specially built technology for managing that data, right? It's like, many use cases that we see out there simply want to store a piece of JSON, a document, in a database, and then perhaps retrieve it again afterwards, or write some simple queries over it. And you really don't have to get a new database technology or a NoSQL database into the mix if you already have one that fulfills that exact use case. You could just happily store that information as well in the database you already have. And what it really comes down to is the learning curve for developers, right? So it's like, as you use the same technology to store other types of data, you don't have to learn a new technology, you don't have to familiarize yourself with and learn new drivers. You don't have to find new frameworks, and you don't have to know how to operate or best model your data for that database.
You can essentially just reuse your knowledge of the technology, as well as the libraries and code you have already built in house, perhaps, you know, a framework that you used against the same technology, because it is still the same technology. So it kind of all comes down again to avoiding complexity, rather than fragmenting across, you know, the many different technologies we have. If you were to look at the different data formats that are out there today, you know, you would end up with many different databases just to store them, if you were to religiously follow the single-purpose, best-built technology for every use case paradigm, right? And then you would just end up having to manage many different databases, more than actually focusing on your app and getting value to your business or to your user. >> Okay, so I get that, and I buy that, by the way. I mean, especially if you're a larger organization and you've got all these projects going on. But before we go back to Maria, Gerald, I want to just, I want to push on that a little bit, because the counter to that argument would be an analogy. And I wonder if you, I'd love for you to, you know, knock this analogy off the blocks. The counter would be, okay, Oracle is the Swiss Army knife and it's got, you know, all in one. But sometimes I need that specialized long screwdriver, and I go into my toolbox and I grab that. It's better than the screwdriver in my Swiss Army knife. Why, are you the Swiss Army knife of databases? Or are you the all-in-one that has that best-of-breed screwdriver for me? How do you think about that? >> Yeah, that's a fantastic question, right? And I think, first of all, you have to separate between Oracle, the company, which actually has multiple data management technologies and databases out there, as you said before, right, and Oracle Database. And I think Oracle Database is definitely a Swiss Army knife; it has gained many capabilities over the last 40 years. You know, we've seen object support come, and that's still in the Oracle Database today. We have seen XML coming, it's still in the Oracle Database; graph, spatial, et cetera. And so you have many different ways of managing your data, and then, on top of that, going into the converged part, not only do we allow you to store the different data models in there, but we also allow you to apply all the security policies and so forth on top of it, something Maria can talk more about with the mission around the converged database. I would also argue, though, that for some aspects we do actually offer that screwdriver you talked about as well. So, especially in the relational world, people get very quickly hung up on this idea that, oh, if you only do rows and columns, well, that's kind of what you put down on disk. And that was never true; the relational model is actually a logical model. What's actually being put down on disk is blocks that align themselves nicely with block storage, and it always has been. So that allows you to actually model and process the data sort of differently. And one good example that we have, that we introduced a couple of years ago, was when columnar databases were very strong and, you know, the competition came and it's like, yeah, we have In-Memory columnar stores now, they're so much better. And we were like, well, orienting the data row-based or column-based really doesn't matter, in the sense that we store them as blocks on disk.
And so we introduced the In-Memory technology, which gives you an In-Memory columnar representation of your data as well, alongside your relational one. So there is an example where you go like, well, actually, you know, if you have this use case of columnar analytics all In-Memory, I would argue Oracle Database is also that screwdriver you want to go down to, and it gives you that capability. Because it not only gives you a columnar representation, but also, which many people then forget, all the analytic power on top of SQL. It's one thing to store your data columnar; it's a completely different story to actually be able to run analytics on top of that, having all the built-in functionality and the things you want to do with the data on top of it as you analyze it. >> You know, that's a great example, the columnar, 'cause I remember there was a lot of hype around it. Oh, it's the Oracle killer, you know, uh, Vertica. Vertica is still around but, you know, it never really hit escape velocity. But you know, good product, good company, whatever. Netezza kind of got buried inside of IBM. ParAccel kind of became, you know, Redshift with that deal, so that kind of went away. Teradata bought a company, I forget which company it bought, but. So that hype kind of dissipated and now it's like, oh yeah, columnar. It's kind of like In-Memory, we've had In-Memory databases ever since we've had databases, you know, it's kind of a feature, not a sector. But anyway, Maria, let's come back to you. You've got a lot of customer experience. And you speak with a lot of companies, you know, during your time at Oracle. What else are you seeing in terms of the benefits to this approach that might not be so intuitive and obvious right away? >> I think one of the biggest benefits to having a multimodel, multiworkload, or as we call it, a converged database, is the fact that you can get greater data synergy from it. In other words, you can utilize all these different techniques and data models to get better value out of that data. So things like being able to do real-time machine learning, fraud detection inside a transaction, or being able to do a product recommendation by accessing three different data models. So for example, if I'm trying to recommend a product for you Dave, I might use graph analytics to be able to figure out your community. Not just your friends, but other people on our system who look and behave just like you. Once I know that community, then I can go over and see what products they bought by looking up our product catalog, which may be stored as JSON. And then on top of that I can then see, using the key-value, what products inside that catalog those community members gave a five star rating to. So that way I can really pinpoint the right product for you. And I can do all of that in one transaction inside the database, without having to transform that data into different models or, God forbid, access different systems to be able to get all of that information. So it really simplifies how we can generate that value from the data. And of course, the other thing our customers love is when it comes to deploying data-driven apps, when you do it on a converged database it's much simpler, because it is that standard data platform. So you're not having to manage multiple independent single purpose databases. You're not having to implement the security and the high availability policies, you know, across a bunch of different diverse platforms.
All of that can be done much simpler with a converged database, 'cause the DBA team, of course, is going to just use that standard set of tools to manage, monitor and secure those systems. >> Thank you for that. And you know, it's interesting, you talk about simplification, and you are in Juan's organization so you've got a big focus on mission critical. And so one of the things that I think is often overlooked, well, we talk about it all the time, is recovery. And if things are simpler, recovery is faster and easier. And so that's kind of the hallmark of Oracle, like the gold standard for the toughest apps, the most mission critical apps. But I wanted to get to the cloud, Maria. So because everything is going to the cloud, right? Not all workloads are going to the cloud, but everybody is talking about the cloud. Everybody has a cloud first mentality, and so yes, it's a hybrid world. But the natural next question is, how do you think the cloud fits into this world of data-driven apps? >> I think just like any app that you're developing, the cloud helps to accelerate that development. And of course the deployment of these data-driven applications. 'Cause if you think about it, the developer is instantly able to provision a converged database that Oracle will automatically manage and look after for them. But what's great about doing something like that, if you use like our autonomous database service, is that it comes in different flavors. So you can get autonomous transaction processing, data warehousing or autonomous JSON, so that the developer is going to get a database that's been optimized for their specific use case, whatever they are trying to solve. And it's also going to contain all of that great functionality and capabilities that we've been talking about. So what that really means to the developer, though, is as the project evolves and inevitably the business needs change a little, there's no need to panic when one of those changes comes in, because your converged database or your autonomous database has all of those additional capabilities. So you can simply utilize those to be able to address those evolving changes in the project. 'Cause let's face it, none of us normally know exactly what we need to build right at the very beginning. And on top of that they also kind of get a built-in buddy in the cloud, especially in the autonomous database. And that buddy comes in the form of built-in workload optimizations. So with the autonomous database we do things like automatic indexing, where we're using machine learning to be that buddy for the developer. So what it'll do is it'll monitor the workload and see what kind of queries are being run on that system. And then it will actually determine if there are indexes that should be built to help improve the performance of that application. And not only does it build those indexes, but it verifies that they help improve the performance before publishing them to the application. So by the time the developer is finished with that app and it's ready to be deployed, it's actually also been optimized by the developer's buddy, the Oracle autonomous database. So, you know, it's a really nice helping hand for developers when they're building any app, especially data-driven apps.
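A rough sketch of what Maria's "several data models in one query" point can look like in practice follows. It is an assumption-laden illustration, not anything shown in the interview: the ORDERS and PRODUCT_CATALOG tables and the connect string are invented, while JSON_TABLE itself is standard Oracle SQL for projecting JSON documents into relational columns.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Hypothetical schema: ORDERS(order_id, product_id) is relational,
// PRODUCT_CATALOG(product_id, doc) keeps each catalog entry as a JSON document.
public class ConvergedQuery {
    public static void main(String[] args) throws Exception {
        // One statement spans the relational rows and the JSON documents.
        String sql =
            "SELECT o.order_id, jt.product_name, jt.avg_rating " +
            "FROM orders o, product_catalog pc, " +
            "     JSON_TABLE(pc.doc, '$' COLUMNS (" +
            "         product_name VARCHAR2(100) PATH '$.name', " +
            "         avg_rating   NUMBER        PATH '$.avgRating')) jt " +
            "WHERE pc.product_id = o.product_id " +
            "  AND jt.avg_rating >= 4";   // keep only well-rated products
        try (Connection con = DriverManager.getConnection(
                 "jdbc:oracle:thin:@//dbhost:1521/pdb1", "app", "secret"); // placeholder
             PreparedStatement ps = con.prepareStatement(sql);
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                System.out.printf("order %d: %s (rating %.1f)%n",
                    rs.getLong(1), rs.getString(2), rs.getDouble(3));
            }
        }
    }
}
```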
But so I wonder, can you give some examples of maybe customers or maybe genericize them if you need to. Data-driven apps in the cloud where customers were able to drive more efficiency, where the cloud buddy allowed the customers to do more with less? >> No, we have tons of these but I'll try and keep it to just a couple. One that comes to mind straight away is retrace. These folks built a blockchain app in the Oracle Cloud that allows manufacturers to actually share the supply chain with the consumer. So the consumer can see exactly, who made their product? Using what raw materials? Where they were sourced from? How it was done? All of that is visible to the consumer. And in order to be able to share that they had to work on a very diverse set of data. So they had everything from JSON documents to images as well as your traditional transactions in there. And they store all of that information inside the Oracle autonomous database, they were able to build their app and deploy it on the cloud. And they were able to do all of that very, very quickly. So, you know, that ability to work on multiple different data types in a single database really helped them build that product and get it to market in a very short amount of time. Another customer that's doing something really, really interesting is MindSense. So these guys operate the largest mines in Canada, Chile, and Peru. But what they do is they put these x-ray devices on the massive mechanical shovels that are at the cove or at the mine face. And what that does is it senses the contents of the buckets inside these mining machines. And it's looking to see at that content, to see how it can optimize the processing of the ore inside in that bucket. So they're looking to minimize the amount of power and water that it's going to take to process that. And also of course, minimize the amount of waste that's going to come out of that project. So all of that sensor data is sent into an autonomous database where it's going to be processed by a whole host of different users. So everything from the mine engineers to the geo scientists, to even their own data scientists utilize that data to drive their business forward. And what I love about these guys is they're not happy with building just one app. MindSense actually use our built-in low core development environment, APEX that comes as part of the autonomous database and they actually produce applications constantly for different aspects of their business using that technology. And it's actually able to accelerate those new apps to the business. It takes them now just a couple of days or weeks to produce an app instead of months or years to build those new apps. >> Great, thank you for that Maria. Gerald, I'm going to push you again. So, I said upfront and talked about microservices and the cloud and containers and you know, anybody in the developer space follows that very closely. But some of the things that we've been talking about here people might look at that and say, well, they're kind of antithetical to microservices. This is our Oracles monolithic approach. But when you think about the benefits of microservices, people want freedom of choice, technology choice, seen as a big advantage of microservices and containers. How do you address such an argument? >> Yeah, that's an excellent question and I get that quite often. The microservices architecture in general as I said before had architectures, Linux distributions, et cetera. 
It's kind of always a bit of like there's an academic approach and there's a pragmatic approach. And when you look at the microservices the original definitions that came out at the early 2010s. They actually never said that each microservice has to have a database. And they also never said that if a microservice has a database, you have to use a different technology for each microservice. Just like they never said, you have to write a microservice in a different programming language, right? So where I'm going with this is like, yes you know, sometimes when you look at some vendors out there, some niche players, they push this message or they jump on this academic approach of like each microservice has the best tool at hand or I'd use a different database for your purpose, et cetera. Which almost often comes across like us. You know, we want to stay part of the conversation. Nothing stops a developer from, you know using a multimodal database for the microservice and just using that as a document store, right? Or just using that as a relational database. And, you know, sometimes I mean, it was actually something that happened that was really interesting yesterday I don't know whether you follow Dave or not. But Facebook had an outage yesterday, right? And Facebook is one of those companies that are seen as the Silicon Valley, you know know how to do microservices companies. And when you add through the outage, well, what happened, right? Some unfortunate logical error with configuration as a force that took a database cluster down. So, you know, there you have it where you go like, well, maybe not every microservice is actually in fact talking to its own database or its own special purpose database. I think there, you know, well, what we should, the industry should be focusing much more on this argument of which technology to use? What's the right tool for a job? Is more to ask themselves, what business problem actually are we trying to solve? And therefore what's the right approach and the right technology for this. And so therefore, just as I said before, you know multimodal databases they do have strong benefits. They have many built-in functionalities that are already there and they allow you to reduce this complexity of having to know many different technologies, right? And so it's not only to store different data models either you know, treat a multimodal database as a chasing documents store or a relational database but most databases are multimodal since 20 plus years. But it's also actually being able to perhaps if you store that data together, you can perhaps actually derive additional value for somebody else but perhaps not for your application. But like for example, if you were to use Oracle Database you can actually write queries on top of all of that data. It doesn't really matter for our query engine whether it's the data is format that then chase or the data is formatted in rows and columns you can just rather than query over it. And that's actually very powerful for those guys that have to, you know get the reporting done the end of the day, the end of the week. And for those guys that are the data scientists that they want to figure out, you know which product performed really well or can we tweak something here and there. When you look into that space you still see a huge divergence between the guys to put data in kind of the altarpiece style and guys that try to derive new insights. 
And there's still a lot of ETL going around and, you know we have big data technologies that some of them come and went and some of them came in that are still around like Apache Spark which is still like a SQL engine on top of any of your data kind of going back to the same concept. And so I will say that, you know, for developers when we look at microservices it's like, first of all, is the argument you were making because the vendor or the technology you want to use tells you this argument or, you know, you kind of want to have an argument to use a specific technology? Or is it really more because it is the best technology, to best use for this given use case for this given application that you have? And if so there's of course, also nothing wrong to use a single purpose technology either, right? >> Yeah, I mean, whenever I talk about Oracle I always come back to the most important applications, the mission critical. It's very difficult to architect databases with microservices and containers. You have to be really, really careful. And so and again, it comes back to what we were talking before about with Maria that the complexity and the recovery. But Gerald I want to stay with you for a minute. So there's other data management technologies popping out there. I mean, I've seen some people saying, okay just leave the data in an S3 bucket. We can query that, then we've got some magic sauce to do that. And so why are you optimistic about you know, traditional database technology going forward? >> I would say because of the history of databases. So one thing that once struck me when I came to Oracle and then got to meet great people like Juan Luis and Andy Mendelsohn who had been here for a long, long time. I come to realization that relational databases are around for about 45 years now. And, you know, I was like, I'm too young to have been around then, right? So I was like, what else was around 45 years? It's like just the tech stack that we have today. It's like, how does this look like? Well, Linux only came out in 93. Well, databases pre-date Linux a lot rather than as I started digging I saw a lot of technologies come and go, right? And you mentioned before like the technologies that data management systems that we had that came and went like the columnar databases or XML databases, object databases. And even before relational databases before Cot gave us the relational model there were apparently these networks stores network databases which to some extent look very similar to adjacent documents. There wasn't a harder storing data and a hierarchy to format. And, you know when you then start actually reading the Cot paper and diving a little bit more into the relation model, that's I think one important crux in there that most of the industry keeps forgetting or it hasn't been around to even know. And that is that when Cot created the relational model, he actually focused not so much on the application putting the data in, but on future users and applications still being able to making sense out of the data, right? And that's kind of like I said before we had those network models, we had XML databases you have adjacent documents stores. And the one thing that they all have along with it is like the application that puts the data in decides the structure of the data. And that's all well and good if you had an application of the developer writing an application. 
It can become really tricky when 10 years later you still want to look at that data and the application that the developer is no longer around then you go like, what does this all mean? Where is the structure defined? What is this attribute? What does it mean? How does it correlate to others? And the one thing that people tend to forget is that it's actually the data that's here to stay not someone who does the applications where it is. Ideally, every company wants to store every single byte of data that they have because there might be future value in it. Economically may not make sense that's now much more feasible than just years ago. But if you could, why wouldn't you want to store all your data, right? And sometimes you actually have to store the data for seven years or whatever because the laws require you to. And so coming back then and you know, like 10 years from now and looking at the data and going like making sense of that data can actually become a lot more difficult and a lot more challenging than having to first figure out and how we store this data for general use. And that kind of was what the relational model was all about. We decompose the data structures into tables and columns with relationships amongst each other so therefore between each other. So that therefore if somebody wants to, you know typical example would be well you store some purchases from your web store, right? There's a customer attribute in it. There's some credit card payment information in it, just some product information on what the customer bought. Well, in the relational model if you just want to figure out which products were sold on a given day or week, you just would query the payment and products table to get the sense out of it. You don't need to touch the customer and so forth. And with the hierarchical model you have to first sit down and understand how is the structure, what is the customer? Where is the payment? You know, does the document start with the payment or does it start with the customer? Where do I find this information? And then in the very early days those databases even struggled to then not having to scan all the documents to get the data out. So coming back to your question a bit, I apologize for going on here. But you know, it's like relational databases have been around for 45 years. I actually argue it's one of the most successful software technologies that we have out there when you look in the overall industry, right? 45 years is like, in IT terms it's like from a star being the ones who are going supernova. You have said it before that many technologies coming and went, right? And just want to add a more really interesting example by the way is Hadoop and HDFS, right? They kind of gave us this additional promise of like, you know, the 2010s like 2012, 2013 the hype of Hadoop and so forth and (mumbles) and HDFS. And people are just like, just put everything into HDFS and worry about the data later, right? And we can query it and map reduce it and whatever. And we had customers actually coming to us they were like, great we have half a petabyte of data on an HDFS cluster and we have no clue what's stored in there. How do we figure this out? What are we going to do now? Now you had a big data cleansing problem. And so I think that is why databases and also data modeling is something that will not go away anytime soon. And I think databases and database technologies are here for quite a while to stay. 
Because many of those are people they don't think about what's happening to the data five years from now. And many of the niche players also and also frankly even Amazon you know, following with this single purpose thing is like, just use the right tool for the job for your application, right? Just pull in the data there the way you wanted. And it's like, okay, so you use technologies all over the place and then five years from now you have your data fragmented everywhere in different formats and, you know inconsistencies, and, and, and. And those are usually when you come back to this data-driven business critical business decision applications the worst case scenario you can have, right? Because now you need an army of people to actually do data cleansing. And there's not a coincidence that data science has become very, very popular the last recent years as we kind of went on with this proliferation of different database or data management technologies some of those are not even database. But I think I leave it at that. >> It's an interesting talk track because you're right. I mean, no schema on right was alluring, but it definitely created some problems. It also created an entire, you know you referenced the hyper specialized roles and did the data cleansing component. I mean, maybe technology will eventually solve that problem but it hasn't up at least up tonight. Okay, last question, Maria maybe you could start off and Gerald if you want to chime in as well it'd be great. I mean, it's interesting to watch this industry when Oracle sort of won the top database mantle. I mean, I watched it, I saw it. It was, remember it was Informix and it was (indistinct) too and of course, Microsoft you got to give them credit with SQL server, but Oracle won the database wars. And then everything got kind of quiet for awhile database was sort of boring. And then it exploded, you know, all the, you know not only SQL and the key-value stores and the cloud databases and this is really a hot area now. And when we looked at Oracle we said, okay, Oracle it's all about Oracle Database, but we've seen the kind of resurgence in MySQL which everybody thought, you know once Oracle bought Sun they were going to kill MySQL. But now we see you investing in HeatWave, TimesTen, we talked about In-Memory databases before. So where do those fit in Maria in the grand scheme? How should we think about Oracle's database portfolio? >> So there's lots of places where you'd use those different things. 'Cause just like any other industry there are going to be new and boutique use cases that are going to benefit from a more specialized product or single purpose product. So good examples off the top of my head of the kind of systems that would benefit from that would be things like a stock exchange system or a telephone exchange system. Both of those are latency critical transaction processing applications where they need microsecond response times. And that's going to exceed perhaps what you might normally get or deploy with a converged database. And so Oracle's TimesTen database our In-Memory database is perfect for those kinds of applications. But there's also a host of MySQL applications out there today and you said it yourself there Dave, HeatWave is a great place to provision and deploy those kinds of applications because it's going to run 100 times faster than AWS (mumbles). 
So, you know, there really is a place in the market and in our customer's systems and the needs they have for all of these different members of our database family here at Oracle. >> Yeah, well, the internet is basically running in the lamp stack so I see MySQL going away. All right Gerald, will give you the final word, bring us home. >> Oh, thank you very much. Yeah, I mean, as Maria said, I think it comes back to what we discussed before. There is obviously still needs for special technologies or different technologies than a relational database or multimodal database. Oracle has actually many more databases that people may first think of. Not only the three that we have already mentioned but there's even SP so the Oracle's NoSQL database. And, you know, on a high level Oracle is a data management company, right? And we want to give our customers the best tools and the best technology to manage all of their data. Rather than therefore there has to be a need or there should be a part of the business that also focuses on this highly specialized systems and this highly specialized technologies that address those use cases. And I think it makes perfect sense. It's like, you know, when the customer comes to Oracle they're not only getting this, take this one product you know, and if you don't like it your problem but actually you have choice, right? And choice allows you to make a decision based on what's best for you and not necessarily best for the vendor you're talking to. >> Well guys, really appreciate your time today and your insights. Maria, Gerald, thanks so much for coming on The Cube. >> Thank you very much for having us. >> And thanks for watching this Cube conversation this is Dave Vellante and we'll see you next time. (upbeat music)
Miguel Perez Colino & Rich Sharples, Red Hat | KubeCon + CloudNativeCon NA 2020
>>From around the globe, it's theCUBE, with coverage of KubeCon and CloudNativeCon North America 2020, virtual, brought to you by Red Hat, the Cloud Native Computing Foundation and ecosystem partners. >>Hey, welcome back, everybody, Jeff Frick here with theCUBE, coming to you from our Palo Alto studios today with our ongoing coverage of KubeCon CloudNativeCon North America 2020. It's not really North America, it's virtual like everything else, but you know, there was the European show earlier in the summer, and this is the late fall show. So we're excited to welcome in our very next two guests. Uh, first, joining us from Madrid, Spain, is Miguel Perez Colino. He is a principal product manager from Red Hat. Miguel, great to see you. >>Good to see you, happy to be in theCUBE. >>Yes. Great. Well, welcome. And joining us from North Carolina is Rich Sharples. He is a senior director of product management at Red Hat. Rich, great to see you. >>Yeah, likewise, thanks for inviting me again. >>So we're talking about Java today, and before we kind of jump into it, you know, in preparing for this, Rich, I saw an interview that you did, I think earlier, about halfway through the year, uh, celebrating the 25th anniversary of Java. And before we kind of get into the future, I think it's worthwhile to take a look back at, you know, kind of where Java came from and how it's lasted for 25 years as such an important enterprise, you know, kind of application framework, because we always hear jokes about people looking for COBOL programmers, or, you know, all these old language programmers, because they have some old system that needs a little assist. What's special about Java? Why are we 25 years into it? And you guys are still excited about Java yesterday, today and in the future. >>Yeah. And I should add that, um, in terms of languages, uh, twenty-five is actually still pretty young. Java's, uh, kind of middle aged, I guess. Um, you know, things like C and C++ are 45, 50 years old; Python, I think, is about the same age as Java in terms of years. So, you know, languages do tend to stick around, uh, a while. What's made Java really, really important for enterprises building business critical applications is it started off with a very large ecosystem of big vendors supporting it. Um, it was open in a sense from the very start, and it's remained open, as in open source and an open community, as well. So that's really, really helped, um, you know, keep the language innovating and moving along and attracting new developers. And, um, it's still a fairly modern language in terms of some of its new features, advancing with the industry, taking on new kinds of workloads and new kinds of programming paradigms as well. So, you know, it's evolved very well, and it has a huge base of somewhere between 11 and 13 million developers who still use it as a primary development language in professional settings. Yeah. >>What struck me about what you said though in that interview was kind of the evolution, and how Java has been able to continue to adapt based on kind of what the new frameworks are.
So whether it was early days in a machine, like you talked about being in a set top box, or, you know, kind of really lightweight, almost IoT applications, then becoming, you know, really a great platform to deliver enterprise applications via a web browser, and, you know, it continues to morph and change and adapt over time. I thought that was pretty interesting given the vast change in the way applications are delivered today versus what they were 25 years ago. >>Yeah, absolutely. You know, the very early days were around embedded devices, uh, intelligent toasters and, you know, whatever. Um, and then where it really, really took off was building and supporting big backend systems, big transactional workloads, whether you're a bank or an airline, running both at scale, but also running really, really complex transactional systems that were business critical. And that's, for the last, you know, 15 years, been where it's really shone, building backend systems. Now, as we kind of move forward, you know, the idea of a server side application versus a front end has kind of changed. You know, now we're talking microservices, we're talking about running in containers. So really the focus of where we run Java, and the kinds of applications we're building with Java, has radically changed. And as such, the language has to change as well, which is, you know, why I'm pretty excited to talk about Quarkus today. >>So let's jump into it and talk about Quarkus, 'cause the other big trend, you know, along with, with, with obviously, uh, uh, browsers being great enterprise application delivery vehicles, is this thing called containers, right? And, and specifically more recently Kubernetes is the one that's grabbing all the attention and grabbing all the, all the momentum. Um, so I wonder, Miguel, if you could talk about, you know, kind of as, as the popularity of containerized applications and containerizing everything, right, containerized storage, even containerized networking, how that's impacted, uh, what you guys are doing, and the impact on Java, uh, in making it work with kind of a containerized Kubernetes world. >>Well, what we found is that the paradigm of development has shifted. So we have this, uh, paradigm that people are following to be able to do the best with containers, to do the best with Kubernetes, and this has worked quite fine in greenfield; uh, for many cases it has been a way to develop applications faster, to be able to obtain valuable results. And the thing is that for many, uh, users, for many companies that we work with, uh, they also want to bring some of their stuff, the applications that are currently running, into this world. And, uh, I mean, we, we work especially a lot in helping these customers be able to adapt those applications. But we try to do it, uh, as we say, with no pixie dust, you know: we really dig into the code, we review the code, we modernize the application, we help the customer with that application. The tools we provide are open for anyone to be able to review and to be able to take. So we are moving from greenfield into brownfield, or, to be more precise, we are evolving together, you know: all these greenfield applications keep coming, but also the current applications want to be modernized. >>Right. Right. So it's pretty interesting.
'Cause that's always the big conversation. It's all fine and good if you're just building something new, uh, to use the latest tools. But as you mentioned, there's a whole lot of conversation about application modernization, and this is really an opportunity to apply some of these techniques to do that. So, Quarkus. I wonder if you could just, let's just jump into it. What is it at the highest level? Uh, what's it all about? What should people know? >>Yeah. So Quarkus is really an attempt by Red Hat to ensure Java is a first-class citizen in containerized environments, for building reactive applications, uh, cloud native applications, uh, functions. Java is an incredible piece of engineering. It does some incredible things. It can self-optimize as it's running, inlining code; it can do some really amazing things the longer it runs. But in a containerized environment, you're likely not going to be running huge amounts of code. You'd likely be running microservices, and your services are likely to have a kind of limited life cycle, as you're able to deploy more frequently, or in a function environment where, you know, you're invoked once and then you're done. Um, you know, doing all those long optimizations over time doesn't really, um, make a lot of sense. So what we can do is remove a lot of the, um, the weight of Java, a lot of the complexity of Java, and we can optimize for an environment where your code is maybe just running for a few microseconds, as in the case of a function, or something running in native, 'cause you scale up and scale down. So we move a lot of the, um, the efforts within the application, uh, to compile time: we pre-compile all of your config and initialization, so that doesn't have to happen in your, um, your runtime or your production environment. Um, and then we can optimize the code. We can remove dead code, we can remove, you know, whole, uh, trees and class libraries, and really slim down the memory footprint, and radically, um, improve the startup time as well. So, you know, you have less downtime in your applications. Um, and we've recently done a study with IDC that shows some pretty stunning results compared to, you know, some existing frameworks. And, you know, we get, um, you know, overall cost savings of, you know, 64%. Um, we can get eight times better density, you're running more in a, in a cluster, and, um, you know, reduction in memory of up to 90% as well. So these are significant changes. Now, that's all good, you know, saving 60% on your operational costs is significant. But what we find is that most organizations come for the performance and the optimizations, but what they actually stay for is the speed of development. So I think, I think Quarkus' real silver bullet is, um, developer productivity. You know, for organizations, the cost of development is still one of the major costs. I mean, the operational costs, the hosting costs, are significant, but development costs, time to market, will always be top of mind for organizations that are trying to move faster than the competition. And I think that's really where, um, Quarkus, especially coupled with an OpenShift or Kubernetes environment, really, really does shine. Yeah. >>
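For readers who want to see what Rich is describing, here is a minimal sketch of a Quarkus service: a single JAX-RS resource class is essentially the whole application. The endpoint path and greeting are invented for illustration; the build-time optimization and native-image behavior he cites come from how Quarkus compiles this, not from anything special in the code itself.

```java
// Minimal Quarkus REST endpoint (Quarkus used the javax.ws.rs APIs in 2020).
// Live-reloading dev mode:             ./mvnw quarkus:dev
// Fast-starting native binary build:   ./mvnw package -Pnative
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

@Path("/hello")
public class GreetingResource {

    @GET
    @Produces(MediaType.TEXT_PLAIN)
    public String hello() {
        // Framework wiring and config are resolved at build time by Quarkus,
        // which is what keeps startup time and memory footprint small.
        return "Hello from Quarkus";
    }
}
```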
So people can go to corcus.io and see a lot of the statistics that you just referenced in terms of memory usage and speed and, and whole bunch of stuff. But what struck me when I went to the site was that was this big, uh, uh, two words that jumped out developer joy. And it's funny that you talked on that just now about really, um, the benefits that come to the developer directly to make them happier. I mean, really calling out their joy. So they're more productive and ultimately that's what you said. That's where the great value is in terms of speed of deployment, happy developers, and productive developers. You know, Miguel, you get your, you get down into the weeds of this stuff. Again, the presentations on your LinkedIn, everyone needs to go look and you talk a lot about at migration and you lot talk a lot about app modernization. So without going through all 120 some odd slides that I think you have, which is good, phenomenal information, what are some of the top things that people need to think about and consider both for app modernization as well as at migration? >>Um, that's, that's, that's an interesting question. Uh, the thing is that, um, the tolling is important on the current code is, and the thing is that normally when, when we started migration project, we tried to find architects in the applications to be able to find patterns. You know, you find parents is much easier because, uh, once you solve one part on the same part on can be solved in a very similar way. So this is one of the parts of that. We focus a lot, but before getting to that point, it's very important how you stop, you know, so the assessment phase is, is very important to be able to review well, what is the status of the applications, the context of the applications. And with that, I mean, things like, for example, the requirements that they have, there's the maintenance that they take in their resiliency and so on. >>So you have to prepare very well, the project by starting with a good assessment, you have to check which applications makes more, make more sense to start with and see which, how to group them together by similarities. And then you can start with the project that saying, okay, let's go for these set of applications that make more sense that are more likely to be containerized because of the way we are developing them because of the dependencies that they have because of the resiliency that is already embedded into them and so on. So that, that the methodology is important. And we normally, for example, when we, when we help partners do a application migration, one of the things that we stress is that this is the methodology that we follow and in the website for my vision, totally for application, you can find also, um, methodology, uh, part that, uh, could help, uh, people understand, okay, these, these are the stages that we normally follow to be successful with migrating applications. >>Yeah. Let go. You don't, we're not friends. We don't hang out a lot, but if we did, you would know I never ever recommend PowerPoint for anything. So, so the fact that I'm calling out your PowerPoint actually means something. Cause I think it's the worst application ever built, but you got some tremendous, tremendous information in there and people do need to go in and look, and again, it's all from your LinkedIn work, but I wanted to shift gears a little bit, right? We're at CubeCon cloud native con. Um, obviously it's virtual is 2020. That's the way the world today. 
But I'm just curious to get your guys' take on what this event means for you. Obviously it's a really active open source community, and Red Hat has a long open source history. What does KubeCon + CloudNativeCon mean for you guys? What do you hope to get out of it? What should people hope to learn from Red Hat? >> Yeah, it's in our DNA. We're very, very collaborative. We love to learn from our customers, from users of the technologies, in the communities that we support. Speaking as — you know, we're both product guys — there's nothing better than getting with people that actually use the products in anger, in real life, whether it's the products or the upstream technologies: learning what they're doing, understanding where some of the gaps are. Yeah, we just couldn't do our jobs without engaging with developers and users at these kinds of conferences. A lot of the interest we've seen with Quarkus is in the community, you know. I've been part of many, many successful open source projects at Red Hat, and it's great when your customers — like Vodafone Greece or Carrefour in Spain — are openly, publicly talking about how good your technology is and what they're using it for. And that's really good. So there's just no alternative to, whether it be virtually or physically, sitting down with users of your technology. >> How about you, Miguel? What are you hoping to get out of the show this year? >> Um, we are working a lot on Kubernetes at Red Hat, as part of the community, of course. And, I mean, there is so much new stuff coming around Kubernetes, about all the capabilities that we are learning — especially, for example, serverless. You know, serverless is an important topic with Quarkus because, for example, as you make the application start so much faster and react so much faster, you could have none of them running, just waiting for an event to happen, which saves a lot of resources and makes us super efficient. So this is one of the topics, for example, that we wanted to cover in this edition, you know, how we are implementing serverless with Kubernetes and OpenShift, and many other things, like pipelines. I don't know — we just did quite a review, in a video, of what is coming up, and I recommend people take a look at it to get everything that's new, because there's a lot. >> Yeah. You guys are technical people; you've been doing this for a long time. Why is Kubernetes so special? Why — you know, there have been containers in the past, right, and we've seen other kinds of branded open source projects that got a lot of momentum, but Kubernetes just seems to be blowing everybody out of its path. Why? What should people who aren't necessarily developers know about Kubernetes? >> Yeah, there's really nothing interesting about a single container or a single microservice, right? That's not the kind of environment that real organizations live in. They live in organizations where they're going to have hundreds of services and lots of containers, and you need a technology to orchestrate and manage that complex environment. And Kubernetes has just quickly become the de facto standard.
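(Another aside: to ground the serverless point Miguel made above, here is a minimal sketch of a scale-to-zero function written with the Quarkus Funqy extension. The class, function name and payload are our own illustration, not something from the interview.)

```java
package org.example.fn;

import io.quarkus.funqy.Funq;

public class OrderEvents {

    // Exposed as a function endpoint (HTTP or Knative events, depending on the
    // Funqy binding chosen). Because startup is so fast, the platform can keep
    // zero instances running and cold-start one only when an event arrives.
    @Funq
    public String processOrder(String orderId) {
        return "processed " + orderId;
    }
}
```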
Um, yeah, folks at Red Hat jumped on it very, very early. I mean, one of the advantages we have at Red Hat is that we're embedded with developers and open source communities, and that often gives us a pretty good crystal ball. So we're often quick to jump on the emerging technologies that are coming out of open source, and that's exactly what happened with Kubernetes. It was clear it was going to be sophisticated enough for our most sophisticated customers running at scale, but also, you know, great for development environments as well. So it was really a good fit for where we were headed, and it just very, very quickly became the de facto standard. And you've just got to go with the de facto standard, right? >> Right. Well, the other thing that you mentioned, Rich, in that other interview that I was watching — the conversation came up around managing open source projects. At some point, you know, they kind of start, and then — I think with this one, if I go to quarkus.io and look at the bottom of the page, it's sponsored by Red Hat — but you talked about, at some point, do you move it over to a foundation? What are the things that drive that process, that decision? I would imagine that part of it has to do with popularity and scale. Is that something potentially down the road? You said you've been in lots of open source projects — when does it move from kind of a single point of origin to more of a foundational support? >> Yeah. I mean, in fact, a foundation isn't always necessary. You know, if you have a very open project, with clear rules for collaboration and encouragement of others to collaborate, you can move the project forward, and a foundation isn't necessarily what's needed. I've been part of the Node.js world, where, you know, the community rebelled to keep Node.js moving forward. We had to go from what we call a benevolent dictator for life — somebody who's well-intentioned but owns the technology — to a foundation, which is much more inclusive, and with greater collaboration you can move even quicker. So, you know, I think what's required is open governance for open source projects, and where that doesn't happen, maybe a foundation is the right way forward. Right now with Quarkus, the non-Red Hat developers seem pretty happy with the way they can get engaged and contribute. But if we get to a point where the community is demanding a foundation, we'll absolutely consider it; if that's what's best for the project, we'll do it. >> So we're coming to the end of our time. I want to give you each the last word, really with two questions. One, again, kind of a summary of KubeCon + CloudNativeCon: what should people be looking for, where can they find you, and I don't know if you guys are sponsoring any sessions — I'm sure there's a lot of great content — if you want to highlight one or two things. And then most importantly, as we turn the calendar and come to the end of 2020 — thankfully — as you look ahead to 2021, what are some of your priorities as we get ready to turn the calendar? And Miguel, let's start with you.
>> So, um, I mean, we have been working very hard this year on the Migration Toolkit for Applications, to help every user that is using Java to bring it to containers, whether it is Java EE or Quarkus — and we're putting a lot of effort into Quarkus. And now we are bringing in new rules, and by December we expect to have the new version of the Migration Toolkit for Applications that is going to include all the rules to help developers bring their Java code to Quarkus. And this is the main goal for us right now; we are moving forward into next year to include more capabilities in that project. Everything is open source. You can go to the Konveyor project and take a look at the capabilities for the assessment phase. So whenever any partner, or any of our consultants, is working on a migration — or anyone who would like to go and try it themselves, who would like to do these migrations to the cloud native world — they will feel comfortable with this tool. So that is the main goal for my team. >> All right. And how about you, Rich? >> Yeah, I think we're going to see this kind of solidification of this web of microservices — you know, if you hate that term, I'm sorry, I'll just say next generation microservices. It's going to be, as Miguel mentioned, based around native compilation and advancing serverless functions. I think that's really the ideal architecture for building microservices on Kubernetes, and Quarkus plays really, really well there. I think there's a kind of backlog of projects within organizations that, hopefully, next year really does start to crank up. And I think a lot of the migration that Miguel has talked about is going to rise in terms of importance. So app modernization — taking those existing applications, maybe taking aspects of those and doing some kind of decomposition into microservices using Quarkus and native compilation — I think we'll see a lot of that. So I think we'll see a real drive around both the kind of greenfield applications, you know, this next generation of microservices, as well as pulling those existing applications forward into these new environments. It's going to be excellent. >> Awesome. Well, thank you both for taking a few minutes with us and sharing the story of Quarkus, and have a great show. Great to see you, and a really good conversation. All right, he's Miguel, he's Rich, I'm Jeff. You're watching theCUBE's ongoing coverage of KubeCon + CloudNativeCon 2020 North America, virtual. Thanks for watching. We'll see you next time.
ON DEMAND SPEED K8S DEV OPS SECURE SUPPLY CHAIN
>> In this session, we will be reviewing the power and benefits of implementing a secure software supply chain, and how we can gain a cloud-like experience with the flexibility, speed and security of modern software delivery. Hi, I'm Matt Bentley, and I run our technical pre-sales team here at Mirantis. I've spent the last six years working with customers on their containerization journey. One thing almost every one of my customers has focused on is how they can leverage the speed and agility benefits of containerizing their applications while continuing to apply the same security controls. One of the most important things to remember is that we are all doing this for one reason, and that is for our applications. So now let's take a look at how we can provide flexibility to all layers of the stack, from the infrastructure on up to the application layer. When building a secure supply chain for container-focused platforms, I generally see two different mindsets in terms of where responsibilities lie between the developers of the applications and the operations teams who run the middleware platforms. Most organizations are looking to build a secure yet robust service that fits their organization's goals around how modern applications are built and delivered. First, let's take a look at the developer, or application team, approach. This approach follows more of the DevOps philosophy, where developer and application teams are the owners of their applications from development through the whole life cycle, all the way to production. I would refer to this as more of a self-service model of application delivery and promotion when deployed to a container platform. This is fairly common in organizations where full-stack responsibilities have been delegated to the application teams. Even in organizations where full-stack ownership doesn't exist, I see the self-service application deployment model work very well in lab, development, or non-production environments. This allows teams to experiment with newer technologies, which is one of the most effective benefits of utilizing containers. In other organizations, there is a strong separation between responsibilities for developers and IT operations. This is often due to the complex nature of controlled processes related to compliance and regulatory needs. Developers are responsible for their application development. This can either include Docker at the development layer, or be a more traditional, throw-it-over-the-wall approach to application development. There's also quite a common experience around building a center of excellence with this approach, where container platforms can be delivered as a service to other consumers inside of the IT organization. This is fairly prescriptive in the manner in which application teams would consume it. When examining the two approaches, there are pros and cons to each. Process, controls and compliance are often seen as inhibitors to speed. Self-service creation, starting with the infrastructure layer, leads to inconsistency, security and control concerns, which leads to compliance issues. While self-service is great, without visibility into the utilization and optimization of those environments, it continues the cycle of inefficient resource utilization. And a true infrastructure-as-code experience requires DevOps-related coding skills that teams often have in pockets, but that maybe aren't ingrained in the company culture. Luckily for us, there is a middle ground for all of this.
Docker Enterprise Container Cloud provides the foundation for that cloud-like experience on any infrastructure, with all of the out-of-the-box security and controls that our professional services team and your operations teams spend their time designing and implementing. This removes much of the additional work and worry around ensuring that your clusters and experiences are consistent, while maintaining the ideal self-service model — no matter if it is full-stack ownership, or easing the needs of IT operations. We're also bringing the most natural Kubernetes experience today, with Lens, to allow for multi-cluster visibility that is both developer- and operator-friendly. Lens provides immediate feedback on the health of your applications, observability for your clusters, fast context switching between environments, and lets you choose the best tool for the task at hand, whether it is graphical-user-interface or command-line-interface driven. Combining the cloud-like experience with the efficiencies of a secure supply chain that meets your needs brings you the best of both worlds: you get DevOps speed with all the security and controls to meet the regulations your business lives by. We're talking about more frequent deployments, faster time to recover from application issues, and better code quality. As you can see from the customers we have worked with, we're able to tie these processes back to real cost savings, real efficiency and faster adoption. This all adds up to delivering business value to end users and the overall perceived value. Now let's see how we're able to actually build a secure supply chain to help deliver these sorts of initiatives. In our example secure supply chain, we're utilizing Docker Desktop to help with consistency of the developer experience, GitHub for our source control, Jenkins for our CI/CD tooling, the Docker Trusted Registry for our secure container registry, and the Universal Control Plane to provide us with our secure container runtime with Kubernetes and Swarm, providing a consistent experience no matter where our clusters are deployed. We work with your teams of developers and operators to design a system that provides a fast, consistent and secure experience for your developers — one that works for any application, brownfield or greenfield, monolith or microservice. Onboarding teams can be simplified with integrations into enterprise authentication services, access to GitHub repositories, Jenkins access and jobs, Universal Control Plane and Docker Trusted Registry teams and organizations, Kubernetes namespaces with access control, Docker Trusted Registry namespaces with access control, image scanning, and promotion policies. So now let's take a look at what this looks like as a CI/CD process, including Jenkins. Let's start with Docker Desktop. From the Docker Desktop standpoint, we'll actually be utilizing Visual Studio Code and Docker Desktop to provide a consistent developer experience — so no matter if we have one developer or a hundred, we're going to be able to walk through a consistent process of Docker container utilization at the development layer. Once we've made our changes to our code, we'll check those into our source code repository; in this case, we'll be using GitHub. Then Jenkins picks up: it will check out that code from our source code repository, build our Docker image, test the application, and then take the image and push it to our Docker Trusted Registry.
From there, we can scan the image to make sure it doesn't have any vulnerabilities, and then we can sign it. So once we've signed our images and deployed our application to dev, we can actually test our application deployed in our real environment. Jenkins will then test the deployed application, and if all tests show good, it will promote our Docker image to production. So now, let's look at the process, beginning from the developer interaction. First of all, let's take a look at our application as it's deployed today. Here, we can see that we have a change that we want to make on our application: our marketing team says we need to change "containerized NGINX" to something more Mirantis-branded. So let's take a look at Visual Studio Code, which we'll be using as our IDE to change our application. So here's our application. We have our code loaded, and we're going to use Docker Desktop in our local environment, with the Docker plugin for Visual Studio Code, to build our application inside of Docker without needing to run any command-line-specific tools. Here, with our code, we'll be able to interact with Docker, make our changes, see them live, and quickly check whether our changes actually made the impact that we're expecting in our application. So let's find the updated title for our application and change that to "Mirantis-ized NGINX" instead of "containerized NGINX". We'll change it in the title and on the front page of the application. So now that we've saved that change to our application, we can take a look at our code here in VS Code, and, as simple as this, we can right-click on the Dockerfile and build our application. We give it a name for our Docker image, and VS Code takes care of automatically building our application. So now we have a Docker image that has everything our application needs inside of that image. From here, we can just right-click on the image tag that we just created and choose run; this will interactively run the container for us. And once our container is running, we can right-click and open it up in a browser. So here we can see the change to our application as it exists live. Once we've verified that our application is working as expected, we can stop our container, and from here we can make that change live by pushing it to our source code repository. So we'll go ahead and write a commit message to say that we updated to our Mirantis branding, commit that change, and push it to our source code repository — again, in this case, we're using GitHub as our source code repository. So from VS Code, that is pushed to our source code repository, and then we move on to our next environment, which is Jenkins. Jenkins is going to pick up those changes for our application and check them out from our source code repository: GitHub notifies Jenkins that there's a change, Jenkins checks out the code and builds our Docker image using the Dockerfile. So we're getting a consistent experience between the local development environment on our desktop and Jenkins, where we're actually building our application, running our tests, pushing the image into our Docker Trusted Registry, scanning it and signing it in our Docker Trusted Registry, and then deploying to our development environment. So let's actually take a look at that development environment as it's been deployed.
So, here we can see that our title has been updated on our application, and we can verify that it looks good in development. If we jump back to Jenkins, we'll see that Jenkins runs our integration tests for our development environment. Everything worked as expected, so it promoted that image to our production repository in our Docker Trusted Registry. We also sign that image — signing off that, yes, it has made it through our integration tests and is deployed to production. So here in Jenkins, we can take a look at our deployed production environment, where our application is live, in production. We've made a change in an automated and very secure manner. So now, let's take a look at our Docker Trusted Registry, where we can see our namespace for our application and our simple NGINX repository. From here, we'll be able to see information about our application image that we've pushed into the registry, such as the image signature and when it was pushed and by whom, and we'll also be able to see the scan results for our image. In this case, we can see that there are vulnerabilities in our image, and we'll take a look at that. Docker Trusted Registry does binary-level scanning, so we get detailed information about our individual image layers. These image layers give us details about where the vulnerabilities are located and what those vulnerabilities actually are. If we click on a vulnerability, we can see specific information about it, giving us details around the severity and more about what exactly is vulnerable inside of our container. One of the challenges that you often face around vulnerabilities is how exactly to remediate them in a secure supply chain, so let's take a look at that. In the example we were looking at, the vulnerability is actually in the base layer of our image. In order to pull in a new base layer for our image, we need to find the source of that and update it. One of the ways that we can help secure that as part of the supply chain is to take a look at where we get the base layers of our images. Docker Hub provides a great source of content to start from, but opening up Docker Hub within your organization opens up all sorts of security concerns around the origins of that content. Not all images are created equal when it comes to security. The official images from Docker Hub are curated by Docker, open source projects and other vendors. One of the most important use cases is around how you get base images into your environment: it is much easier to consume the base operating system layer images than to build your own and try to maintain them. Instead of just blindly trusting the content from Docker Hub, we can take a set of content that we find useful — such as those base image layers, or content from vendors — and pull that into our own Docker Trusted Registry using our mirroring feature. Once the images have been mirrored into a staging area of our Docker Trusted Registry, we can scan them to ensure that the images meet our security requirements, and then, based off of the scan result, promote the image to a public repository, where we can sign the images and make them available to our internal consumers to meet their needs. This allows us to provide a set of curated content that we know is secure and controlled within our environment.
So from here, we can find our updated Docker image in our Docker Trusted Registry, where we can see that the vulnerabilities have been resolved. From a developer's point of view, that's about as smooth as the process gets. Now, let's take a look at how we can provide that secure content for our developers in our own Docker Trusted Registry. In this case, we're taking a look at the Alpine image that we've mirrored into our Docker Trusted Registry. Here, we're looking at the staging area where the images get temporarily pulled, because we have to pull them in order to be able to scan them. So here we set up mirroring, and we can quickly turn it on by making it active. And then we can see that our image mirroring will pull our content from Docker Hub and make it available in our Docker Trusted Registry in an automatic fashion. From here, we can take a look at the promotions to see how exactly we promote our images. In this case, we created a promotion policy within Docker Trusted Registry so that content gets promoted to a public repository for internal users to consume, based off of the vulnerabilities that are found — or not found — inside of the Docker image. The way our users consume this content is by taking a look at the official images that we've made available to them. Here again, looking at our Alpine image, we can take a look at the tags that exist and see the content that has been made available. So we've pulled in all sorts of content from Docker Hub; in this case, we've even pulled in the multi-architecture images, which we can scan thanks to the binary-level nature of our scanning solution. Now let's take a look at Lens. Lens provides capabilities to give developers a quick, opinionated view that focuses on how they would want to view, manage and inspect applications deployed to a Kubernetes cluster. Lens integrates natively, out of the box, with Universal Control Plane client bundles, so your automatically generated TLS certificates from UCP just work. Inside our organization, we want to give our developers the ability to see their applications in a very easy-to-view manner. So in this case, let's filter down to the application that we just deployed to our development environment. Here, we can see the pod for our application, and when we click on it, we get instant, detailed feedback about the components and information that this pod is utilizing. We can also see here in Lens the ability to quickly switch contexts between the different clusters that we have access to. With that, we also have capabilities to quickly deploy other types of components. One of those is Helm charts. Helm charts are a great way to package up applications — especially those that may be more complex — to make it much simpler to consume and version our applications. In this case, let's take a look at the application that we just built and deployed: our simple NGINX application has been bundled up as a Helm chart and made available through Lens. Here, we can just click on the description of our application to see more information about the Helm chart, so we can publish whatever information may be relevant about our application. And with one click, we can install our Helm chart. Here, it will show us the actual details of the Helm chart, so before we install it, we can look at those individual components.
So in this case, we can see this created an ingress rule, and this tells Kubernetes how to create the specific components of our application. We just have to pick a namespace to deploy it to, and in this case, we're actually going to do a quick test, because here we're trying to deploy the application from Docker Hub. In our Universal Control Plane, we've turned on Docker Content Trust policy enforcement, so this is actually going to fail to deploy: because we're trying to deploy our application from Docker Hub, the image hasn't been properly signed in our environment, and the Docker Content Trust policy enforcement prevents us from deploying that Docker image from Docker Hub. In this case, we have to go through our approved process, through our secure supply chain, to ensure that we know where our image came from and that it meets our quality standards. So if we comment out the Docker Hub repository, comment in our Docker Trusted Registry repository, and click install, it will then install the Helm chart with our Docker image being pulled from our DTR — which has a proper signature. We can see that our application has been successfully deployed through our Helm chart releases view. From here, we can see that simple NGINX application, and we get details around the actual deployed Helm chart. The nice thing is that Lens provides us this capability with Helm to see all of the components that make up our application. From this view, it's giving us that single pane of glass into that specific application, so that we know all of the components it created inside of Kubernetes. There are specific details that can help us access the application, such as that ingress rule we just talked about, but it also gives us the resources — the service, the deployment and the ingress — that have been created within Kubernetes for the application to exist. So to recap, we've covered how we can offer all the benefits of a cloud-like experience and offer flexibility around DevOps and operations-controlled processes through the use of a secure supply chain, allowing our developers to spend more time developing, and our operators more time designing systems that meet our security and compliance concerns.
Rich Gaston, Micro Focus | Virtual Vertica BDC 2020
(upbeat music) >> Announcer: It's theCUBE, covering the Virtual Vertica Big Data Conference 2020, brought to you by Vertica. >> Welcome back to the Vertica Virtual Big Data Conference, BDC 2020. You know, it was supposed to be a physical event in Boston, at the Encore. Vertica pivoted to a digital event, and we're pleased that theCUBE could participate, because we've participated in every BDC since the inception. Rich Gaston is here; he's the global solutions architect for security, risk and governance at Micro Focus. Rich, thanks for coming on, good to see you. >> Hey, thank you very much for having me. >> So you've got a chewy title, man. You got a lot of stuff, a lot of hairy things in there. But maybe you can talk about your role as an architect in those spaces. >> Sure, absolutely. We handle a lot of different requests from the Global 2000 type of organization that is trying to move various business processes, various application systems, databases, into new realms — whether they're looking at opening up new business opportunities, whether they're looking at sharing data with partners securely, they might be migrating to cloud applications, or doing a migration into a hybrid IT architecture. So we will take those large organizations and their existing installed base of technical platforms, data and users, and try to chart a course to the future, using Micro Focus technologies, but also partnering with other third parties out there in the ecosystem. We have large, solid relationships with the big cloud vendors, and also with a lot of the big database vendors. Vertica's our in-house solution for big data and analytics, and we are one of the first integrated data security solutions with Vertica. We've had great success out in the customer base with Vertica as organizations have tried to add another layer of security around their data. So what we try to emphasize is an enterprise-wide data security approach, where you're taking a look at data as it flows throughout the enterprise, from its inception — where it's created, where it's ingested — all the way through the utilization of that data, and then to the other uses, where we might be doing shared analytics with third parties. How do we do that in a secure way that maintains regulatory compliance, and that also keeps our company safe against data breach?
I said, yeah, it's Christmastime, so I need to do some shopping. And so they worked with me to make sure that I could get that cash, and then get the new card and the new PIN. And being a professional on the inside of the industry, I really questioned: how did they get the PIN? Tell me more about this. And they said, well, we don't know the details, but, you know, I'm sure you'll find out. And in fact, we did find out a lot about that breach and what it did to Target: the $250 million immediate impact, CIO gone, CEO gone. This was a big one in the industry, and it really woke a lot of people up to the different types of threats on the data that we're facing with our largest organizations — not just financial data; medical data, personal data of all kinds. Flash forward to the Cambridge Analytica scandal, where Facebook is handing off data; they're making a partnership agreement with someone they think they can trust, and then that data is misused. And who's going to end up paying the cost of that? Well, it's going to be Facebook, to the tune of about five billion on that, plus some other fines that'll come along, and other costs that they're facing. So what we've seen over the course of the past several years has been an evolution from data breach making the headlines, to my customers coming to us and saying: help us neutralize the threat of this breach, help us mitigate and manage this risk. What do we need to be doing? What are the best practices in the industry? Clearly, what we're doing on the perimeter security, the application security and the platform security is not enough — we continue to have breaches — and we are the experts at that answer. The follow-on, fascinating piece has been the regulators jumping in. First in Europe, but now we see California enacting a law just this year that is very stringent and has a lot of deep, far-reaching protections around the personal data of consumers. Look at jurisdictions like Australia, where fiduciary responsibility now goes to the board of directors. That's getting attention. For a regulated entity in Australia, if you're on the board of directors, you had better have a plan for data security, and if there is a breach, you need to follow protocols, or you personally will be liable. And that is a sea change that we're seeing out in the industry. So we're getting a lot of attention on both: how do we neutralize the risk of breach, but also, how can we use software tools to maintain and support our regulatory compliance efforts as we work with, say, the largest money center bank out of New York? I've watched their audit year after year, and it's gotten more and more stringent, more and more specific: tell me more about this aspect of data security, tell me more about encryption, tell me more about key management. The auditors are getting better. And we're supporting our customers in that journey to provide better security for the data, and a better operational environment for them to roll new services out with confidence that they're not going to get breached — and, with that confidence, that they're not going to have a regulatory compliance fine or a nightmare in the press. And these are the major drivers that help us, with Vertica, sell together into large organizations, to say: let's add some defense in depth to your data. And that's really a key concept in the security field, this concept of defense in depth.
We apply that to the data itself by changing the actual data element. "Rich Gaston" — I will change that name into ciphertext, and that then yields a whole bunch of benefits throughout the organization as we deal with the lifecycle of that data. >> Okay, so a couple of things I want to mention there. First of all, totally a board-level topic; every board of directors should really have cyber and security as part of its agenda, and it does, for the reasons that you mentioned. The other is, GDPR got it all started — I guess it was May 2018 that the penalties went into effect — and that just created a whole domino effect. You mentioned California enacting its own laws, which, you know, in some cases are even more stringent. And you're seeing this all over the world. So I think one of the questions I have is: how do you approach all this variability? It seems to me you can't just take a narrow approach; you have to have an end-to-end perspective on governance and risk and security, and the like. So are you able to do that? And if so, how so? >> Absolutely. I think one of the key areas in big data, in particular, has been the concern that we have a schema, we have database tables, we have columns, and we have data, but we're not exactly sure what's in there. We have application developers that have been given sandbox space in our clusters, and what are they putting in there? So, can we discover that data? We have the tools within Micro Focus to discover sensitive data within your data stores, but we can also protect that data, and then we'll track it. And what we really find is that when you protect, let's say, five billion rows of a customer database, we can now know what is being done with that data on a very fine-grained, granular basis — to say that this business process has a justified need to see the data in the clear, we're going to give them that authorization, and they can decrypt the data. SecureData, my product, knows about that and tracks it, and can report on it and say: at this date and time, Rich Gaston did the following thing to be able to pull data in the clear. And that can then be used to support regulatory compliance responses, and the audit, to say who really has access to this, and what really is that data. Then, with GDPR, we're getting down into much more fine-grained decisions around who can get access to the data and who cannot — and organizations are scrambling. One of the funny conversations that I had a couple of years ago, as GDPR came into place, was that a couple of customers seemed to be taking this sort of brute-force approach of: we're going to move our analytics and all of our data to Europe, to European data centers, because we believe that if we do this in the U.S., we're going to violate their law, but if we do it all in Europe, we'll be okay. And that simply was a short-term way of thinking about it. You really can't be moving your data around the globe to try to satisfy a particular jurisdiction. You have to apply the controls and the policies, and put the software layers in place, to make sure that anywhere someone wants to get that data, we have the ability to look at that transaction and say it is or is not authorized, and that we have a rock-solid way of approaching that for audit and for compliance and risk management. And once you do that, then you really open up the organization to go back and use those tools the way they were meant to be used.
We can use Vertica for AI, we can use Vertica for machine learning, and for all kinds of really cool use cases that are being done with IoT, and other kinds of cases we're seeing that require data being managed at scale, but with security. And that's the challenge, I think, in the current era: how do we do this in an elegant way? How do we do it in a way that's future-proof when CCPA comes in? How can I lay this on as another layer of audit responsibility and control around my data, so that I can satisfy those regulators as well as the folks over in Europe and Singapore and China and Turkey and Australia? It goes on and on. Each jurisdiction out there is now requiring audit, and like I mentioned, the audits are getting tougher. And if you read the news, the GDPR example, I think, is classic: they told us in 2016, it's coming; they told us in 2018, it's here; they're telling us in 2020, we're serious about this, here are the fines, and you had better be aware that we're coming to audit you. And when we audit you, we're going to be asking some tough questions, and if you can't answer those in a timely manner, then you're going to be facing some serious consequences. And I think that's what's getting attention. >> Yeah, so the whole big data thing started with Hadoop, and Hadoop is open, it's distributed, and it just created a real governance challenge. I want to talk about your solutions in this space. Can you tell us more about Micro Focus Voltage? I want to understand what it is, then get into sort of how it works, and then I really want to understand how it's applied to Vertica. >> Yeah, absolutely, that's a great question. First of all, we were the originators of format-preserving encryption. We developed some of the core basic research out of Stanford University that then became the company Voltage — that's the brand name that we still apply, even though we're part of Micro Focus. So the lineage still goes back to Dr. Boneh, down at Stanford, one of my buddies there, and he's still at it, doing amazing work in cryptography and moving the industry, and the science of cryptography, forward. It's a very deep science, and we all want to have it peer-reviewed, we all want it to be attacked, we all want it to be proved secure — so that we're not selling something to a major money center bank that is potentially risky because it's obscure and kept private. So we have an open standard. For six years, we worked with the Department of Commerce to get our standard approved by NIST, the National Institute of Standards and Technology. They initially said, well, AES-256 is going to be fine. And we said, well, it's fine for certain use cases, but for your database, you don't want to change your schema, and you don't want this increase in storage costs. What we want is format-preserving encryption. And what that does is turn my name, Rich, into a four-letter ciphertext — and it can be reversed. The mathematics of that are fascinating, and really deep and amazing, but we make it very simple for the end customer, because we produce APIs. These application programming interfaces can be accessed by applications in C or Java, C#, or other languages, but they can also be accessed in a microservice manner, via REST and web service APIs. And that's the core of our technical platform.
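(An aside for readers: here is a sketch, in Java, of the shape such an API takes, going off the description above. These names are hypothetical stand-ins, not the actual SecureData interfaces; the pattern — a protect call that yields same-shaped ciphertext, and an access call gated by policy — is what Rich describes.)

```java
// Hypothetical shape of a format-preserving encryption client, following the
// description in this interview. All names are invented for illustration;
// they are not the vendor's actual API.
public interface FormatPreservingClient {

    // Returns ciphertext with the same shape as the input: "Rich" becomes
    // another four-letter string, "123-45-6789" another 3-2-4 digit string,
    // so database schemas and column widths are untouched.
    String protect(String plaintext, String formatName, String authToken);

    // Reverses protect() only for callers whose identity is authorized by
    // policy, with every access logged for audit. In the Vertica integration
    // discussed below, equivalent protect/access operations surface as SQL
    // extensions.
    String access(String ciphertext, String formatName, String authToken);
}
```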
We have an appliance-based approach: we take a SecureData appliance and put it on-prem — we'll make 50 of them if you're a big company like Verizon and you need to have these co-located around the globe, no problem; we can scale to the largest enterprise needs. But our typical customer will install several appliances and get going with a couple of environments, like QA and prod, to start getting encryption going inside their organization. Once the appliances are set up and installed, it takes just a couple of days of work for a typical technical staff to get done. Then you're up and running and able to plug in the clients. Now, what are the clients? Vertica's a huge one. Vertica's one of our most powerful client endpoints, because you're able to take that API and put it inside Vertica. It's all open on the internet: you can go and look at Vertica.com/securedata, you get all of our documentation on it, and you understand how to use it very quickly. The APIs are super simple; they require three parameter inputs. It's a really basic approach to being able to protect and access data. And then it gets very deep from there, because you have data like credit card numbers — very different from a street address, and we want to take a different approach to that. We have data like birthdate, and we want to be able to do analytics on dates; we have deep approaches to managing analytics on protected data, like dates, without having to put it in the clear. So we've maintained a lead in the industry in terms of being an innovator of the FF1 standard — what we call FF1 is format-preserving encryption. We license that to others in the industry, per our NIST agreement. So we're the owner, we're the operator of it, and others use our technology. And as the original founders of that, we continue to lead the industry by adding additional capabilities on top of FF1 that really differentiate us from our competitors. Then you look at our API presence: we can definitely run in Hadoop, but we also run in open systems; we run on mainframe, we run on mobile. So anywhere in the enterprise or in the cloud — anywhere you want to be able to put secure data and access the protected data — we're going to be there and be able to support you. >> Okay, so let's say — I've talked to a lot of customers this week — let's say I'm running in Eon Mode, and I've got some workload running in AWS and some on-prem. I'm going to take an appliance, or multiple appliances, and put them on-prem, but will that also secure my cloud workloads, as part of a sort of shared-responsibility model, for example? Or how does that work?
And we can now flow that through the organization and decrypt it at will on any platform that you have that you need us to be able to operate on. So let's say you wanted to pick that customer data from the operational transaction system, let's throw it into Eon, let's throw it into the cloud, let's do analytics there on that data, and we may need some decryption. We can place secure data wherever you want to be able to service that use case. In most cases, what you're doing is a simple, tiny little atomic efetch across a protected tunnel, your typical TLS pipe tunnel. And once that key is then cashed within our client, we maintain all that technology for you. You don't have to know about key management or dashing. We're good at that; that's our job. And then you'll be able to make those API calls to access or protect the data, and apply the authorization authentication controls that you need to be able to service your security requirements. So you might have third parties having access to your Vertica clusters. That is a special need, and we can have that ability to say employees can get X, and the third party can get Y, and that's a really interesting use case we're seeing for shared analytics in the internet now. >> Yeah for sure, so you can set the policy how we want. You know, I have to ask you, in a perfect world, I would encrypt everything. But part of the reason why people don't is because of performance concerns. Can you talk about, and you touched upon it I think recently with your sort of atomic access, but can you talk about, and I know it's Vertica, it's Ferrari, etc, but anything that slows it down, I'm going to be a concern. Are customers concerned about that? What are the performance implications of running encryption on Vertica? >> Great question there as well, and what we see is that we want to be able to apply scale where it's needed. And so if you look at ingest platforms that we find, Vertica is commonly connected up to something like Kafka. Maybe streamsets, maybe NiFi, there are a variety of different technologies that can route that data, pipe that data into Vertica at scale. Secured data is architected to go along with that architecture at the node or at the executor or at the lowest level operator level. And what I mean by that is that we don't have a bottleneck that everything has to go through one process or one box or one channel to be able to operate. We don't put an interceptor in between your data and coming and going. That's not our approach because those approaches are fragile and they're slow. So we typically want to focus on integrating our APIs natively within those pipeline processes that come into Vertica within the Vertica ingestion process itself, you can simply apply our protection when you do the copy command in Vertica. So really basic simple use case that everybody is typically familiar with in Vertica land; be able to copy the data and put it into Vertica, and you simply say protect as part of the data. So my first name is coming in as part of this ingestion. I'll simply put the protect keyword in the Syntax right in SQL; it's nothing other than just an extension SQL. Very very simple, the developer, easy to read, easy to write. And then you're going to provide the parameters that you need to say, oh the name is protected with this kind of a format. To differentiate it between a credit card number and an alphanumeric stream, for example. So once you do that, you then have the ability to decrypt. 
Now, on decrypt, let's look at a couple different use cases. First within Vertica, we might be doing select statements within Vertica, we might be doing all kinds of jobs within Vertica that just operate at the SQL layer. Again, just insert the word "access" into the Vertica select string and provide us with the data that you want to access, that's our word for decryption, that's our lingo. And we will then, at the Vertica level, harness the power of its CPU, its RAM, its horsepower at the node to be able to operate on that operator, the decryption request, if you will. So that gives us the speed and the ability to scale out. So if you start with two nodes of Vertica, we're going to operate at X number of hundreds of thousands of transactions a second, depending on what you're doing. Long strings are a little bit more intensive in terms of performance, but short strings like social security number are our sweet spot. So we operate very very high speed on that, and you won't notice the overhead with Vertica, perse, at the node level. When you scale Vertica up and you have 50 nodes, and you have large clusters of Vertica resources, then we scale with you. And we're not a bottleneck and at any particular point. Everybody's operating independently, but they're all copies of each other, all doing the same operation. Fetch a key, do the work, go to sleep. >> Yeah, you know, I think this is, a lot of the customers have said to us this week that one of the reasons why they like Vertica is it's very mature, it's been around, it's got a lot of functionality, and of course, you know, look, security, I understand is it's kind of table sticks, but it's also can be a differentiator. You know, big enterprises that you sell to, they're asking for security assessments, SOC 2 reports, penetration testing, and I think I'm hearing, with the partnership here, you're sort of passing those with flying colors. Are you able to make security a differentiator, or is it just sort of everybody's kind of got to have good security? What are your thoughts on that? >> Well, there's good security, and then there's great security. And what I found with one of my money center bank customers here in San Francisco was based here, was the concern around the insider access, when they had a large data store. And the concern that a DBA, a database administrator who has privilege to everything, could potentially exfil data out of the organization, and in one fell swoop, create havoc for them because of the amount of data that was present in that data store, and the sensitivity of that data in the data store. So when you put voltage encryption on top of Vertica, what you're doing now is that you're putting a layer in place that would prevent that kind of a breach. So you're looking at insider threats, you're looking at external threats, you're looking at also being able to pass your audit with flying colors. The audits are getting tougher. And when they say, tell me about your encryption, tell me about your authentication scheme, show me the access control list that says that this person can or cannot get access to something. They're asking tougher questions. That's where secure data can come in and give you that quick answer of it's encrypted at rest. It's encrypted and protected while it's in use, and we can show you exactly who's had access to that data because it's tracked via a different layer, a different appliance. And I would even draw the analogy, many of our customers use a device called a hardware security module, an HSM. 
Now, these are fairly expensive devices that were invented for military applications and adopted by banks. And now they're really spreading out, and people say, do I need an HSM? Well, with SecureData, we certainly protect your crypto very, very well. We have very, very solid engineering. I'll stand on that any day of the week, but your auditor is going to want to ask a checkbox question. Do you have HSM? Yes or no. Because the auditor understands it's another layer of protection. And it provides me another tamper-evident layer of protection around your key management and your crypto. And we, as professionals in the industry, nod and say, that is worth it. That's an expensive option that you're going to add on, but your auditor's going to want it. If you're in financial services, you're dealing with PCI data, you're going to enjoy the checkbox that says, yes, I have HSMs, and not get into some arcane conversation around, well no, but it's good enough. That's kind of the argument and conversation we get into when folks want to say, Vertica has great security, Vertica's fantastic on security. Why would I want SecureData as well? It's another layer of protection, and it's defense in depth for your data. When you believe in that, when you take security really seriously, and you're really paranoid, like a person like myself, then you're going to invest in those kinds of solutions that get you best-in-class results. >> So I'm hearing a data-centric approach to security. Security experts will tell you, you've got to layer it. I often say, we live in a new world. We used to just build a moat around the queen, but the queen, she's leaving her castle in this world of distributed data. Rich, incredibly knowledgeable guest, and we really appreciate you being on the front lines and sharing with us your knowledge about this important topic. So thanks for coming on theCUBE. >> Hey, thank you very much. >> You're welcome, and thanks for watching everybody. This is Dave Vellante for theCUBE, we're covering wall-to-wall coverage of the Virtual Vertica BDC, Big Data Conference. Remotely, digitally, thanks for watching. Keep it right there. We'll be right back right after this short break. (intense music)
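To make the "insert the word 'access' into the select string" idea above concrete, here is a minimal Python sketch against Vertica using the vertica_python client. The UDF names (VoltageSecureProtect / VoltageSecureAccess), the USING PARAMETERS syntax, and the table are assumptions modeled on typical Voltage SecureData-for-Vertica deployments, not details from the interview; check your installation for the actual signatures.

```python
# Hedged sketch: protect (encrypt) on insert, access (decrypt) on select.
# UDF names and parameters are assumptions; verify against your setup.
import vertica_python

conn_info = {
    "host": "vertica.example.com",  # hypothetical cluster endpoint
    "port": 5433,
    "user": "analyst",
    "password": "secret",
    "database": "warehouse",
}

with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()

    # Format-preserving encryption on the way in: the ciphertext still
    # looks like a social security number, so schemas do not change.
    cur.execute(
        "INSERT INTO customers (id, ssn) "
        "SELECT 42, VoltageSecureProtect('123-45-6789' "
        "USING PARAMETERS format='ssn')"
    )

    # Decryption on the way out, executed per node, in parallel:
    # each node fetches a key, does the work, goes to sleep.
    cur.execute(
        "SELECT VoltageSecureAccess(ssn USING PARAMETERS format='ssn') "
        "FROM customers WHERE id = 42"
    )
    print(cur.fetchall())
```

Because the decrypt call runs on every Vertica node, the work scales out with the cluster, which is the "we scale with you" point Rich makes above.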
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Australia | LOCATION | 0.99+ |
Europe | LOCATION | 0.99+ |
Target | ORGANIZATION | 0.99+ |
Verizon | ORGANIZATION | 0.99+ |
Vertica | ORGANIZATION | 0.99+ |
ORGANIZATION | 0.99+ | |
Dave Vellante | PERSON | 0.99+ |
May 2018 | DATE | 0.99+ |
NIST | ORGANIZATION | 0.99+ |
2016 | DATE | 0.99+ |
Boston | LOCATION | 0.99+ |
2018 | DATE | 0.99+ |
San Francisco | LOCATION | 0.99+ |
New York | LOCATION | 0.99+ |
Target Corporation | ORGANIZATION | 0.99+ |
$250 million | QUANTITY | 0.99+ |
50 | QUANTITY | 0.99+ |
Rich Gaston | PERSON | 0.99+ |
Singapore | LOCATION | 0.99+ |
Turkey | LOCATION | 0.99+ |
Ferrari | ORGANIZATION | 0.99+ |
six years | QUANTITY | 0.99+ |
2020 | DATE | 0.99+ |
one box | QUANTITY | 0.99+ |
China | LOCATION | 0.99+ |
C | TITLE | 0.99+ |
Stanford University | ORGANIZATION | 0.99+ |
Java | TITLE | 0.99+ |
First | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
U.S. | LOCATION | 0.99+ |
this week | DATE | 0.99+ |
National Institute of Science and Technology | ORGANIZATION | 0.99+ |
Each jurisdiction | QUANTITY | 0.99+ |
both | QUANTITY | 0.99+ |
Vertica | TITLE | 0.99+ |
Rich | PERSON | 0.99+ |
this year | DATE | 0.98+ |
Vertica Virtual Big Data Conference | EVENT | 0.98+ |
one channel | QUANTITY | 0.98+ |
one process | QUANTITY | 0.98+ |
GDPR | TITLE | 0.98+ |
SQL | TITLE | 0.98+ |
five billion rows | QUANTITY | 0.98+ |
about five billion | QUANTITY | 0.97+ |
One | QUANTITY | 0.97+ |
C sharp | TITLE | 0.97+ |
Benet | PERSON | 0.97+ |
first | QUANTITY | 0.96+ |
four-letter | QUANTITY | 0.96+ |
Vertica Big Data Conference 2020 | EVENT | 0.95+ |
Hadoop | TITLE | 0.94+ |
Kafka | TITLE | 0.94+ |
Micro Focus | ORGANIZATION | 0.94+ |
Vikram Murali, IBM | IBM Data Science For All
>> Narrator: Live from New York City, it's theCUBE. Covering IBM Data Science For All. Brought to you by IBM. >> Welcome back to New York here on theCUBE. Along with Dave Vellante, I'm John Walls. We're at Data Science For All, IBM's two day event, and we'll be here all day long, wrapping up again with that panel discussion from four to five here Eastern Time, so be sure to stick around all day here on theCUBE. Joining us now is Vikram Murali, who is a program director at IBM, and Vikram, thanks for joining us here on theCUBE. Good to see you. >> Good to see you too. Thanks for having me. >> You bet. So, among your primary responsibilities: The Data Science Experience. So first off, if you would, share with our viewers a little bit about that. You know, the primary mission. You've had two fairly significant announcements, updates if you will, here over the past month or so, so share some information about that too if you would. >> Sure, so my team, we build The Data Science Experience, and our goal is for us to enable data scientists, in their path, to gain insights into data using data science techniques, machine learning, the latest and greatest open source especially, and be able to do collaboration with fellow data scientists, with data engineers, business analysts, and it's all about freedom. Giving freedom to data scientists to pick the tool of their choice, and program and code in the language of their choice. So that's the mission of Data Science Experience, when we started this. The two releases that you mentioned, we had in the last 45 days: there was one in September and then there was one on October 30th. Both of these releases are very significant in the machine learning space especially. We now support Scikit-Learn, XGBoost, TensorFlow libraries in Data Science Experience. We have deep integration with Hortonworks Data Platform, which is a hallmark of our partnership with Hortonworks, something that we announced back in the summer, and this last release of Data Science Experience, two days back, specifically can do authentication via Knox with Hadoop. So now our Hadoop customers, our Hortonworks Data Platform customers, can leverage all the goodies that we have in Data Science Experience. It's more deeply integrated with our Hadoop-based environments. >> A lot of people ask me, "Okay, when IBM announces a product like Data Science Experience... You know, IBM has a lot of products in its portfolio. Are they just sort of cobbling together, you know, existing older products, and putting a skin on them? Or are they developing them from scratch?" How can you help us understand that? >> That's a great question, and I hear that a lot from our customers as well. Data Science Experience started off with a design-first methodology. And what I mean by that is we are using IBM Design to lead the charge here, along with the product and development. And we are actually talking to customers, to data scientists, to data engineers, to enterprises, and we are trying to find out what problems they have in data science today and how we can best address them. So it's not about taking older products and just re-skinning them; Data Science Experience, for example, started off as a brand new product: a completely new slate with completely new code. Now, IBM has done data science and machine learning for a very long time. We have a lot of assets like SPSS Modeler and Stats, and Decision Optimization.
And we are re-investing in those products, and we are investing in such a way, and doing product research in such a way, not to make the old fit with the new, but in a way where it fits into the realm of collaboration. How can data scientists leverage our existing products with open source, and how we can do collaboration. So it's not just re-skinning, but it's building from the ground up. >> So this is really important, because you say architecturally it's built from the ground up. Because, you know, given enough time and enough money, you know, smart people, you can make anything work. So the reason why this is important is you mentioned, for instance, TensorFlow. You know that down the road there's going to be some other tooling, some other open source project that's going to take hold, and your customers are going to say, "I want that." You've got to then integrate that, or you have to choose whether or not to. If it's a super heavy lift, you might not be able to do it, or do it in time to hit the market. If you architected your system to be able to accommodate that... Future-proof is the term everybody uses, so have you done that? How have you done that? I'm sure APIs are involved, but maybe you could add some color. >> Sure. So, our Data Science Experience and machine learning... It is a microservices-based architecture, so we are completely dockerized, and we use Kubernetes under the covers for container orchestration. And all these are tools that are used in The Valley, across different companies, and also in products across IBM as well. So some of these legacy products that you mentioned, we are actually using some of these newer methodologies to re-architect them, and we are dockerizing them, and the microservice architecture actually helps us address issues that we have today, as well as be open to development and taking newer methodologies and frameworks into consideration that may not exist today. So the microservices architecture, for example, TensorFlow is something that you brought in. So we can just spin up a Docker container just for TensorFlow and attach it to our existing Data Science Experience, and it just works. Same thing with other frameworks like XGBoost, and Keras, and Scikit-Learn; all these are frameworks and libraries that have come up in open source within the last, I would say, a year, two years, three years timeframe. Previously, integrating them into our product would have been a nightmare. We would have had to re-architect our product every time something came along, but now with the microservice architecture it is very easy for us to continue with those. >> We were just talking to Daniel Hernandez a little bit about the Hortonworks relationship at a high level. One of the things that I've... I mean, I've been following Hortonworks since day one, when Yahoo kind of spun them out. And know those guys pretty well. And they always make a big deal out of, when they do partnerships, it's deep engineering integration. And so they're very proud of that, so I want to sort of test that a little bit. Can you share with our audience the kind of integrations you've done? What you've brought to the table? What Hortonworks brought to the table? >> Yes, so Data Science Experience today can work side by side with Hortonworks Data Platform, HDP. And we could have actually made that work about two, three months back, but, as part of our partnership that was announced back in June, we set up joint engineering teams. We have multiple touch points every day.
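As a quick aside before the partnership thread continues: the container pattern Vikram describes, spinning up a framework-specific container and attaching it to a running product, can be sketched with the Docker SDK for Python. This is purely illustrative, not DSX internals (DSX orchestrates its containers with Kubernetes), and the image tag, container name, and network below are invented for the example.

```python
# Illustrative sketch of "spin up a container just for TensorFlow and
# attach it". Requires `pip install docker` and a running Docker daemon;
# the service name and network are hypothetical.
import docker

client = docker.from_env()

# Add a TensorFlow service next to an already-running stack without
# re-architecting or restarting anything else.
tf_service = client.containers.run(
    "tensorflow/tensorflow:latest",  # framework-specific image
    detach=True,
    name="dsx-tensorflow",           # hypothetical service name
    network="dsx-net",               # hypothetical shared network
    ports={"8888/tcp": 8888},        # expose the notebook port
)

print(tf_service.status)
# Later upgrades touch only this one container; the rest keep running.
```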
We call it co-development, and they have put resources in. We have put resources in, and today, especially with the release that came out on October 30th, Data Science Experience can authenticate using Knox, which I previously mentioned, and that was a direct example of our partnership with Hortonworks. So that is phase one. Phase two and phase three are going to be deeper integration, so we are planning on making Data Science Experience an Ambari management pack. And so for a Hortonworks customer, if you have HDP already installed, you don't have to install DSX separately. It's going to be a management pack. You just spin it up. And the third phase is going to be... We're going to be using YARN for resource management. YARN is very good at resource management. And for infrastructure as a service for data scientists, we can actually delegate that work to YARN. So, Hortonworks, they are putting resources into YARN, doubling down actually. And they are making changes to YARN where it will act as the resource manager not only for the Hadoop and Spark workloads, but also for Data Science Experience workloads. So that is the level of deep engineering that we are engaged in with Hortonworks. >> YARN stands for Yet Another Resource Negotiator. There you go for... >> John: Thank you. >> The trivia of the day. (laughing) Okay, so... But of course, Hortonworks are big on committers. And obviously a big committer to YARN. Probably wouldn't have YARN without Hortonworks. So you mentioned that's kind of what they're bringing to the table, and you guys primarily are focused on the integration as well as some other IBM IP? >> That is true, as well as the Knox piece that I mentioned. We have a Knox committer. We have multiple Knox committers on our side, and that helps us as well. So all of Knox is part of the HDP package. We need that knowledge on our side to work with Hortonworks developers to make sure that we are contributing and making inroads into Data Science Experience. That way the integration becomes a lot easier. And from an IBM IP perspective... So Data Science Experience already comes with a lot of packages and libraries that are open source, but IBM Research has worked on a lot of these libraries. I'll give you a few examples: Brunel and PixieDust are something that our developers love. These are visualization libraries that were actually cooked up by IBM Research and then open sourced. And these are prepackaged into Data Science Experience, so there is IBM IP involved, and there are a lot of algorithms, machine learning algorithms, that we put in there. So that comes right out of the package. >> And you guys, the development teams, are really both in The Valley? Is that right? Or are you really distributed around the world? >> Yeah, so we are. The Data Science Experience development team is in North America, between The Valley and Toronto. The Hortonworks team, they are situated about eight miles from where we are in The Valley, so there's a lot of synergy. We work very closely with them, and that's what we see in the product. >> I mean, what impact does that have? Is it... You know, you hear today, "Oh, yeah. We're a virtual organization. We have people all over the world: Eastern Europe, Brazil." How much of an impact is that? To have people so physically proximate? >> I think it has major impact. I mean, IBM is a global organization, so we do have teams around the world, and we work very well. With the advent of IP telephony, and screen shares, and so on, yes, we work.
But it really helps being in the same timezone, especially working with a partner just eight miles or ten miles away. We have a lot of interaction with them and that really helps. >> Dave: Yeah. Body language? >> Yeah. >> Yeah. You talked about problems. You talked about issues. You know, customers. What are they now? Before it was like, "First off, I want to get more data." Now they've got more data. Is it figuring out what to do with it? Finding it? Having it available? Having it accessible? Making sense of it? I mean, what's the barrier right now? >> The barrier, I think, for data scientists... The number one barrier continues to be data. There's a lot of data out there. Lots of data being generated, and the data is dirty. It's not clean. So the number one problem that data scientists have is how do I get to clean data, and how do I access data. There are so many data repositories, data lakes, and data swamps out there. Data scientists, they don't want to be in the business of finding out how do I access data. They want to have instant access to data, and-- >> Well, if you would let me interrupt you. >> Yeah? >> You say it's dirty. Give me an example. >> So it's not structured data, so data scientists-- >> John: So unstructured versus structured? >> Unstructured versus structured. And if you look at all the social media feeds that are being generated, the amount of data that is being generated, it's all unstructured data. So we need to clean up the data, and the algorithms need structured data, or data in a particular format. And data scientists don't want to spend too much time cleaning up that data. And access to data, as I mentioned. And that's where Data Science Experience comes in. Out of the box we have so many connectors available. It's very easy for customers to bring in their own connectors as well, and you have instant access to data. And as part of our partnership with Hortonworks, you don't have to bring data into Data Science Experience. The data is becoming so big. You want to leave it where it is. Instead, push analytics down to where it is. And you can do that. We can connect to remote Spark. We can push analytics down through remote Spark. All of that is possible today with Data Science Experience. The second thing that I hear from data scientists is all the open source libraries. Every day there's a new one. It's a boon and a bane as well, and the thing with that is the open source community is very vibrant, and there are a lot of data science competitions, machine learning competitions, that are helping move this community forward. And it's a good thing. The bad thing is data scientists like to work in silos on their laptops. How do you, from an enterprise perspective... How do you take that, and how do you move it? Scale it to an enterprise level? And that's where Data Science Experience comes in, because now we provide all the tools. The tools of your choice: open source or proprietary. You have it in here, and you can easily collaborate. You can do all the work that you need with open source packages and libraries, bring your own, as well as collaborate with other data scientists in the enterprise. >> So, you're talking about dirty data. I mean, with Hadoop and no schema on write, right? We kind of knew this problem was coming. So technology sort of got us into this problem. Can technology help us get out of it? I mean, from an architectural standpoint. When you think about dirty data, can you architect things in to help? >> Yes.
So, if you look at the machine learning pipeline, the pipeline starts with ingesting data and then cleansing or cleaning that data. And then you go into creating a model, training, picking a classifier, and so on. So we have tools built into Data Science Experience, and we're working on tools that will be coming down our roadmap, which will help data scientists do that themselves. I mean, they don't have to be really in-depth coders or developers to do that. Python is very powerful. You can do a lot of data wrangling in Python itself, so we are enabling data scientists to do that within the platform, within Data Science Experience; a minimal sketch of that kind of wrangling follows at the end of this conversation. >> If I look at sort of the demographics of the development teams... We were talking about Hortonworks and you guys collaborating. What are they like? I mean, people picture IBM, you know, like this 100-plus-year-old company. What's the persona of the developers on your team? >> The persona? I would say we have a very young, agile development team, and by that I mean... So we've had six releases this year in Data Science Experience, just for the on-premises side of the product, and the cloud side of the product is continuous delivery. We have releases coming out faster than we can code. And it's not just re-architecting it every time, but it's about adding features, giving features that our customers are asking for, and not making them wait for three months, six months, one year. So our releases are becoming a lot more frequent, and customers are loving it. And that is, in part, because of the team. The team is able to evolve. We are very agile, and we have an awesome team. That's all. It's an amazing team. >> But six releases in... >> Yes. We had our initial release in April, and since then we've had about five revisions of the release where we add a lot more features to our existing releases. A lot more packages, libraries, functionality, and so on. >> So you know what monster you're creating now, don't you? I mean, you know? (laughing) >> I know, we are setting expectations. >> You still have two months left in 2017. >> We do. >> These are not mainframe release cycles. >> They are not, and that's the advantage of the microservices architecture. I mean, when you upgrade, a customer upgrades, right? They don't have to bring that entire system down to upgrade. You can target one particular part, one particular microservice. You componentize it, and just upgrade that particular microservice. It's become very simple, so... >> Well, some of those microservices aren't so micro. >> Vikram: Yeah. Not. Yeah, so it's a balance. >> You're growing, but yeah. >> It's a balance you have to keep. Making sure that you componentize it in such a way that when you're doing an upgrade, it affects just one small piece of it, and you don't have to take everything down. >> Dave: Right. >> But, yeah, I agree with you. >> Well, it's been a busy year for you, to say the least, and I'm sure 2017-2018 is not going to slow down. So continued success. >> Vikram: Thank you. >> Wish you well with that. Vikram, thanks for being with us here on theCUBE. >> Thank you. Thanks for having me. >> You bet. >> Back with Data Science For All. Here in New York City, IBM. Coming up here on theCUBE right after this. >> Cameraman: You guys are clear. >> John: All right. That was great.
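To ground Vikram's point that "you can do a lot of data wrangling in Python itself," here is a minimal pandas sketch of the cleanup step he places between ingesting data and training a model: turning dirty, semi-structured feed records into the structured, typed rows an algorithm expects. The columns and sample records are invented for illustration.

```python
# Hedged illustration of in-notebook data wrangling; records invented.
import pandas as pd

raw = pd.DataFrame({
    "user":   [" Alice ", "BOB", None, "carol"],
    "posted": ["2017-10-30", "2017-10-31", "2017-10-29", "not a date"],
    "text":   ["great product!!", "meh", "", "loved it"],
})

clean = (
    raw
    .dropna(subset=["user"])                    # drop unusable rows
    .assign(
        user=lambda df: df["user"].str.strip().str.lower(),
        posted=lambda df: pd.to_datetime(df["posted"],
                                         errors="coerce"),  # bad dates become NaT
        text_len=lambda df: df["text"].str.len(),
    )
    .dropna(subset=["posted"])                  # drop unparseable dates
)

print(clean)  # structured, typed rows, ready for scikit-learn or XGBoost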
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Vellante | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Dave | PERSON | 0.99+ |
Vikram | PERSON | 0.99+ |
John | PERSON | 0.99+ |
three months | QUANTITY | 0.99+ |
six months | QUANTITY | 0.99+ |
John Walls | PERSON | 0.99+ |
October 30th | DATE | 0.99+ |
2017 | DATE | 0.99+ |
April | DATE | 0.99+ |
June | DATE | 0.99+ |
one year | QUANTITY | 0.99+ |
Daniel Hernandez | PERSON | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
September | DATE | 0.99+ |
one | QUANTITY | 0.99+ |
ten miles | QUANTITY | 0.99+ |
YARN | ORGANIZATION | 0.99+ |
eight miles | QUANTITY | 0.99+ |
Vikram Murali | PERSON | 0.99+ |
New York City | LOCATION | 0.99+ |
North America | LOCATION | 0.99+ |
two day | QUANTITY | 0.99+ |
Python | TITLE | 0.99+ |
two releases | QUANTITY | 0.99+ |
New York | LOCATION | 0.99+ |
two years | QUANTITY | 0.99+ |
three years | QUANTITY | 0.99+ |
six releases | QUANTITY | 0.99+ |
Toronto | LOCATION | 0.99+ |
today | DATE | 0.99+ |
Both | QUANTITY | 0.99+ |
two months | QUANTITY | 0.99+ |
a year | QUANTITY | 0.99+ |
Yahoo | ORGANIZATION | 0.99+ |
third phase | QUANTITY | 0.98+ |
both | QUANTITY | 0.98+ |
this year | DATE | 0.98+ |
first methodology | QUANTITY | 0.98+ |
First | QUANTITY | 0.97+ |
second thing | QUANTITY | 0.97+ |
one small piece | QUANTITY | 0.96+ |
One | QUANTITY | 0.96+ |
XGBoost | TITLE | 0.96+ |
Cameraman | PERSON | 0.96+ |
about eight miles | QUANTITY | 0.95+ |
Horton Data Platform | ORGANIZATION | 0.95+ |
2017-2018 | DATE | 0.94+ |
first | QUANTITY | 0.94+ |
The Valley | LOCATION | 0.94+ |
TensorFlow | TITLE | 0.94+ |
Itamar Ankorion, Attunity & Arvind Rajagopalan, Verizon - #DataWorks - #theCUBE
>> Narrator: Live from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2017, brought to you by Hortonworks. >> Hey, welcome back to theCUBE, live from the DataWorks Summit, day 2. We've been here for a day and a half talking with fantastic leaders and innovators, learning a lot about what's happening in the world of big data, the convergence with Internet of Things, machine learning, artificial intelligence; I could go on and on. I'm Lisa Martin, my co-host is George Gilbert, and we are joined by a couple of guys, one a Cube alumnus: Itamar Ankorion, CMO of Attunity. Welcome back to the Cube. >> Thank you very much, good to be here, thank you Lisa and George. >> Lisa: Great to have you. >> And Arvind Rajagopalan, the Director of Technology Services for Verizon, welcome to the Cube. >> Thank you. >> So we were chatting before we went on, and Verizon, you're actually going to be presenting tomorrow at the DataWorks Summit. Tell us about building... the journey that Verizon has been on building a Data Lake. >> Oh, Verizon, over the last 20 years, has been a large corporation made up of a lot of different acquisitions and mergers; that's how it was formed 20 years back, and as we've gone through the journey of the mergers and the acquisitions over the years, we had data from different companies come together and form a lot of different data silos. So the reason we kind of started looking at this is when our CFO started asking questions around being able to answer One Verizon questions. It's as simple as having Days Payable, or Working Capital Analysis, across all the lines of businesses. And since we have a three-major-ERP footprint, it is extremely hard to get that data out, and there were a lot of manual data prep activities going into bringing together those One Verizon views. So that's really what was the catalyst to get the journey started for us. >> And it was driven by your CFO, you said? >> Arvind: That's right. >> Ah, very interesting, okay. So what are some of the things that people are going to hear tomorrow from your breakout session? >> Arvind: I'm sorry, say that again? >> Sorry, what are some of the things that the people, the attendees from your breakout session, are going to learn about the steps and the journey? >> So I'm going to primarily be talking about the challenges that we ran into, and share some around that, and also talk about some of the factors, such as the catalysts and what drew us to sort of moving in that direction, as well as getting to some architectural components, from a high-level standpoint; talk about certain partners that we work with, the choices we made from an architecture perspective and the tools, as well as to kind of close the loop on user adoption and what users are seeing in terms of business value, as we start centralizing all of the data at Verizon from a back office Finance and Supply Chain standpoint. So that's kind of what I'm looking at talking about tomorrow. >> Arvind, it's interesting to hear you talk about sort of collecting data from, essentially, back office operational systems in a Data Lake. Were there... I assume that the data is sort of more refined and easily structured than the typical stories we hear about Data Lakes. Were there challenges in making it available for exploration and visualization, or were all the early use cases really just production reporting?
>> So standard reporting across the ERP systems is very mature, and those capabilities are there, but then you look across ERP systems, and we have three major ERP systems for each of the lines of businesses; when you want to look at combining all of the data, it's very hard. And to add to that, you pointed at self-service discovery, and visualization across all three data sets, that's even more challenging, because it takes a lot of heavy lifting to normalize all of the data and bring it into one centralized platform. And we started off the journey with Oracle, and then we had SAP HANA; we were trying to bring all the data together, but then we were looking at our non-SAP ERP systems and bringing that data into an SAP kind of footprint: one, the cost was tremendously high; also, there was a lot of heavy lifting and challenges in terms of manually having to normalize the data and bring it into the same kind of data models. And even after all of that was done, it was not very self-service oriented for our users in Finance and Supply Chain. >> Let me drill into two of those things. So it sounds like the ETL process of converting it into a consumable format was very complex, and then it sounds like also, the discoverability, like where a tool, perhaps like Alation, might help, which is very, very immature right now, or maybe not immature, it's still young. Is that what was missing, or why was the ETL process so much more heavyweight than with a traditional data warehouse? >> The ETL processes, there's a lot of heavy lifting there involved, because of the proprietary data structures of the ERP systems; especially SAP is... The data structures, and how the data is used across cluster and pool tables, is very proprietary. And on top of that, bringing the data formats and structures from a PeopleSoft ERP system, which is supporting different lines of businesses, so there's a lot of customization that's gone into place; there are specific things that we use in the ERPs, in terms of the modules and how the processes are modeled in each of the lines of businesses, and that complicates things a lot. And then you try and bring all these three different ERPs together, with the nuances that they have built up over the years, and it actually makes it very complex. >> So tell us then, help us understand how the Data Lake made that easier. Was it because you didn't have to do all the refinement before it got there? And tell us how Attunity helped make that possible. >> Oh absolutely, so I think that's one of the big things why we picked Hortonworks as one of our key partners in terms of building out the Data Lake: it's schema on read, so you aren't necessarily worried about doing a whole lot of ETL before you bring the data in, and it also provides the tools and the technologies, from a lot of other partners. There's a lot of maturity now that provides better self-service discovery capabilities for ad hoc analysis and reporting. So this is helpful to the users, because now they don't have to wait for prolonged IT development cycles to model the data, do the ETL, and build reports for them to consume, which sometimes could take weeks and months.
Now, in a matter of days, they're able to see the data they're looking for and they're able to start the analysis, and once they start the analysis and the data is accessible, it's a matter of minutes and seconds, looking at the different tools, how they want to look at it, how they want to model it; so it's actually been a huge value from the perspective of the users and what they're looking to do. >> Speaking of value, one of the things that was kind of thematic yesterday: we see enterprises are now embracing big data, they're embracing Hadoop, it's got to coexist within our ecosystem, and it's got to inter-operate, but just putting data in a Data Lake or Hadoop, that's not the value there; it's being able to analyze that data, in motion, at rest, structured, unstructured, and start being able to glean or take actionable insights. From your CFO's perspective, where are you now in answering some of the questions that he or she had, from an insights perspective, with the Data Lake that you have in place? >> Yeah, before I address that, I wanted to quickly touch upon and wrap up George's question, if you don't mind, because one of the key challenges there is where Attunity helped, and I was just about to answer the question before we moved on, so I just want to close the loop on that a little bit. So in terms of bringing the data in, the data acquisition or ingestion is a key aspect of it, and again, looking at the proprietary data structures from the ERP systems is very complex, and involves a multi-step process to bring the data into a staging environment, and be able to put it in the swamp and bring it into the Lake. And what Attunity has been able to help us with is, it has the intelligence to look at and understand the proprietary data structures of the ERPs, and it is able to bring all the data from the ERP source systems directly into Hadoop, without any stops or staging databases along the way. So it's been a huge value from that standpoint; I'll get into more details around that. And to answer your question around how it's helping from a CFO standpoint, and the users in Finance: as I said, now all the data is available in one place, so it's very easy for them to consume the data, and be able to do ad hoc analysis. So if somebody's looking, like I said earlier, to calculate Days Payable, as an example, or they want to look at working capital, we are actually moving data using Attunity's Replicate CDC product; we're getting data in real time, into the Data Lake. So now they're able to turn things around, and do that kind of analysis in a matter of hours, versus overnight or in a matter of days, which was the previous environment. >> And that was kind of one of the things this morning, is it's really about speed, right? It's how fast can you move, and it sounds like, together with Attunity, Verizon is really not only making things simpler, as you talked about, in this kind of model that you have, with different ERP systems, but you're also really able to get information into the right hands much, much faster.
>> Absolutely, that's the beauty of the near real-time and the CDC architecture: we're able to get data in very easily and quickly, and Attunity also provides a lot of visibility as the data is in flight; we're able to see what's happening in the source system, how many packets are flowing through, and to a point, my developers are so excited to work with the product, because they don't have to worry about the changes happening in the source systems in terms of DDL, and those changes are automatically understood by the product and pushed to the destination in Hadoop. So it's been a game-changer, because we have not had any downtime; when there are things changing on the source system side, historically we had to take downtime to change those configurations and the scripts, and publish it across environments, so that's been huge from that standpoint as well. >> Absolutely. >> Itamar, maybe, help us understand where Attunity can... It sounds like there's greatly reduced latency in the pipeline between the operational systems and the analytic system, but it also sounds like you still need to essentially reformat the data so that it's consumable. So it sounds like there's an ETL pipeline that's just much, much faster, but at the same time, with Replicate, it sounds like that goes without transformations. So help us sort of understand that nuance. >> Yeah, that's a great question, George. And indeed, in the past few years, customers have been focused predominantly on getting the data to the Lake. I actually think one of the changes in the theme we're hearing here at the show and in the last few months is, how do we move to start using the data, to create applications on the data. So we're kind of moving to the next step; in the last few years we focused a lot on innovating and creating the solutions that facilitate and accelerate the process of getting data to the Lake, from a large scope of systems, including complex ones like SAP, and also making the process of doing that easier, providing real-time data that can both feed streaming architectures as well as batch ones. So once we got that covered, to your question, what happens next? And one of the things we found, and I think Verizon is also looking at it now, and Arvind can comment on it later: what we're seeing is, when you bring data in, and you want to adopt the streaming, or a continuous, incremental type of data ingestion process, you're inherently building an architecture that takes what was originally a database, but you're kind of, in a sense, breaking it apart into partitions, as you're loading it over time. So when you land the data, and Arvind was referring to a swamp, or some customers refer to it as a landing zone, you bring the data into your Lake environment, but at the first stage that data is not structured, to your point, George, in a manner that's easily consumable. Alright, so the next step is, how do we facilitate the next step of the process, which today is still very manually driven, with custom development and dealing with complex structures.
So we actually are very excited; we've introduced, at the show here, we announced a new product by Attunity, Compose for Hive, which extends our Data Lake solutions, and what Compose for Hive is exactly designed to do is address part of the problem you just described, which is, when the data comes in and is partitioned, what Compose for Hive does is it reassembles these partitions, and it then creates analytic-ready data sets back in Hive, so it can create operational data stores, it can create historical data stores, so then the data becomes formatted in a manner that's more easily accessible for users who want to use analytic tools, BI tools, Tableau, Qlik, any type of tool that can easily access a database. >> Would there be, as a next step, whether led by Verizon's requirements or Attunity's anticipation of broader customer requirements, something where there's, if not near real-time, a very low latency landing and transformation, so that data that is time-sensitive can join the historical data? >> Absolutely, absolutely. So what we've done is focus on real-time availability of data. So when we feed the data into the Data Lake, we feed it in two ways: one is directly into Hive, but we also go through a streaming architecture like Kafka, which in the case of Hortonworks also fits very well with HDF. So then the next step in the process is producing those analytic data sets, or data stores, out of it, which we enable, and what we do is design it together with our partners, with our end customers. So again, when we worked on Replicate, and then when we worked on Compose, we worked very closely with Fortune companies trying to deal with these challenges, so we could design a product. In the case of Compose for Hive, for example, we have done a lot of collaboration, at a product engineering level, with Hortonworks, to leverage the latest and greatest in Hive 2.2, Hive LLAP, to be able to push down transformations, so those can be done faster, including in real time, so those datasets can be updated on a frequent basis. >> You talked about kind of customer requirements, either those specific or not; obviously talking to a telecommunications company, are you seeing, Itamar, from Attunity's perspective, more of this need to... Alright, the data's in the Lake, or first it comes to the swamp, now it's in the Lake, to start partitioning it; are you seeing this need driven in specific industries, or is this really pretty horizontal? >> That's a good question, and this is definitely a horizontal need; it's part of the infrastructure needs. So Verizon is a great customer, and we've worked similarly in telecommunications; we've been working with other customers in other industries, from manufacturing, to retail, to health care, to automotive and others, and in all of those cases it's at a foundation level; it's very similar architectural challenges. You need to ingest the data, you want to do it fast, you want to do it incrementally or continuously, even if you're loading directly into Hadoop. Naturally, when you're loading the data through a Kafka or streaming architecture, it's in a continuous fashion, and then you partition the data. So the partitioning of the data is kind of inherent to the architecture, and then you need to help deal with the data for the next step in the process.
And we're doing it both with Compose for Hive, but also, for customers using streaming architectures like Kafka, we provide the mechanisms, from supporting or facilitating things like schema evolution and schema decoding, to be able to facilitate the downstream process of processing those partitions of data, so we can make the data available. That works both for analytics and streaming analytics, as well as for scenarios like microservices, where the way in which you partition the data, or deliver the data, allows each microservice to pick up on the data it needs, from the relevant partition. >> Well guys, this has been a really informative conversation. Congratulations, Itamar, on the new announcement that you guys made today. >> Thank you very much. >> Lisa: Arvind, great to hear the use case and how Verizon really sounds quite pioneering in what you're doing; wish you continued success there, we look forward to hearing what's next for Verizon. We want to thank you for watching theCUBE, we are again live, day two of the DataWorks Summit, #DWS17. With me, my co-host George Gilbert, I am Lisa Martin; stick around, we'll be right back. (relaxed techno music)
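To make the partition-reassembly idea concrete: what Compose for Hive automates can be pictured, in a simplified way, as collapsing a stream of landed CDC change records into a "latest change per key" current-state table. The sketch below, using PyHive, is a conceptual illustration only, not Attunity's implementation; the endpoint, table, and column names are invented, and the real product adds optimizations such as Hive LLAP pushdown.

```python
# Conceptual sketch: build an operational data store from raw CDC
# partitions landed by a replication tool. All names are hypothetical.
from pyhive import hive

conn = hive.Connection(host="hive.example.com", port=10000,
                       username="etl")  # placeholder endpoint
cur = conn.cursor()

# Keep only the newest change per business key; deleted keys fall away.
cur.execute("""
    CREATE TABLE IF NOT EXISTS invoices_ods AS
    SELECT invoice_id, amount, status, op_ts
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY invoice_id
                                  ORDER BY op_ts DESC) AS rn
        FROM invoices_changes t      -- raw CDC change records
    ) latest
    WHERE rn = 1 AND op_type != 'DELETE'
""")
```

A historical data store, by contrast, would keep every change row keyed by its operation timestamp, which is why the same landed partitions can feed both kinds of targets.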
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
George Gilbert | PERSON | 0.99+ |
Arvind Rajagopalan | PERSON | 0.99+ |
Arvind | PERSON | 0.99+ |
Lisa Martin | PERSON | 0.99+ |
Verizon | ORGANIZATION | 0.99+ |
Itamar Ankorion | PERSON | 0.99+ |
Lisa | PERSON | 0.99+ |
George | PERSON | 0.99+ |
Itamar | PERSON | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
San Jose | LOCATION | 0.99+ |
Silicon Valley | LOCATION | 0.99+ |
two | QUANTITY | 0.99+ |
tomorrow | DATE | 0.99+ |
Kafka | TITLE | 0.99+ |
three | QUANTITY | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
Cube | ORGANIZATION | 0.99+ |
Arvin | PERSON | 0.99+ |
DataWorks Summit | EVENT | 0.99+ |
SAP HANA | TITLE | 0.99+ |
One | QUANTITY | 0.99+ |
each | QUANTITY | 0.99+ |
yesterday | DATE | 0.99+ |
#DWS17 | EVENT | 0.99+ |
one | QUANTITY | 0.98+ |
a day and a half | QUANTITY | 0.98+ |
CDC | ORGANIZATION | 0.98+ |
first stage | QUANTITY | 0.98+ |
Tableau | TITLE | 0.98+ |
DataWorks Summit 2017 | EVENT | 0.98+ |
Attunity | ORGANIZATION | 0.98+ |
Hive | TITLE | 0.98+ |
both | QUANTITY | 0.98+ |
Attunity | PERSON | 0.98+ |
DataWorks | EVENT | 0.97+ |
today | DATE | 0.97+ |
Compose for Hive | ORGANIZATION | 0.97+ |
Compose | ORGANIZATION | 0.96+ |
Hive 2.2 | TITLE | 0.95+ |
Qlik | TITLE | 0.94+ |
Hadoop | TITLE | 0.94+ |
one place | QUANTITY | 0.93+ |
day two | QUANTITY | 0.92+ |
each microservice | QUANTITY | 0.9+ |
first | QUANTITY | 0.9+ |
20 years back | DATE | 0.89+ |
#DataWorks | ORGANIZATION | 0.87+ |
three major ERP systems | QUANTITY | 0.83+ |
last 20 years | DATE | 0.82+ |
PeopleSoft | ORGANIZATION | 0.8+ |
Data Lake | COMMERCIAL_ITEM | 0.8+ |
SAP | ORGANIZATION | 0.79+ |
Stephan Ewen | Flink Forward 2017
(click) >> Welcome, everyone, we're back at the Flink Forward user conference, sponsored by the data Artisans folks. This is the first U.S.-based Flink user conference, and we are on the ground at the Kabuki Hotel in San Francisco. We have a special guest, Stephan Ewen, who is one of the founders of data Artisans, and one of the creators of Flink. He is CTO, and he is in a position to shed some unique light on the direction of the company and the product. Welcome, Stephan. >> Yeah, so you were asking about how can stream processing, or how can Flink and data Artisans, help enterprises that want to adopt these kinds of technologies actually do that, despite the fact that, if we look at what the big internet companies that first adopted these technologies had to do, they had to go through all this big process of productionizing these things by integrating them with so many other systems, making sure everything fits together, everything kind of works as one piece. What can we do there? So I think there are a few interesting points to that. Let's maybe start with stream processing in general. So, stream processing by itself actually has the potential to simplify many of these setups and infrastructures, per se. There are multiple dimensions to that. First of all, the ability to just more naturally fit what you're doing to what is actually happening. Let me qualify that a little bit. All these companies that are dealing with big data are dealing with data that is typically continuously produced, from sensors, from user devices, from server logs, from all these things, right? Which is quite naturally a stream. And processing this with systems that give you the abstraction of a stream is a much more natural fit, so you eliminate chunks of the pipeline that, for example, try to do periodic ingestion, then groom that into files and data sets and do periodic processing of that; you can, for example, get rid of a lot of these things. You kind of get a paradigm that unifies the processing of real-time data and also historic data. So this by itself is an interesting development that I think many have recognized, and that's why they're excited about stream processing: because it helps reduce a lot of that complexity. So that is one side to it. The other side to it is that there was always kind of an interplay between the processing of the data and then wanting to do something with the insights, right? You don't process the data just for the fun of processing, right? Usually the outcome informs something. Sometimes it's just a report, but sometimes it's something that immediately affects how certain services react. For example, how they apply their decisions in classifying transactions as fraud, or how to send out alerts, how to trigger certain actions. The interesting thing, then, and we're going to see actually a little more of that later in this conference also, is that in this stream processing paradigm there's a very natural way for these online live applications and the analytical applications to merge together, again reducing a bunch of this complexity. Another thing that is happening that I think is very, very powerful, and is helping (mumbles) in bringing these kinds of technologies to a broader ecosystem, is actually how the whole deployment stack is evolving. So we see actually more and more users converging onto resource management infrastructures.
YARN was an interesting first step to make it really easy to productionize that part, to productionize these systems, but even beyond that, like the uptake of Mesos, the uptake of container engines like (mumbles), and the ability to just prepare more functionality bundled together out of the box, to just pack into a container what you need, put it into a repository, and then various people can bring up these services without having to go through all of the setup and integration work; you can get way better templated integration with systems with this kind of technology. So those seem to be helping a lot toward much broader adoption of these kinds of technologies: both stream processing as an easier paradigm, with fewer moving parts, and developments in (mumbles) technologies. >> So let me see if I can repeat back just a summary version, which is: stream processing is more natural to how the data is generated, and so we want to match the processing to how it originates, how it flows. At the same time, if we do more of that, that becomes a workload or an application pattern that then becomes more familiar to more people who didn't grow up in a continuous processing environment. But also, it has a third capability, of reducing the latency between originating or ingesting the data and getting an analysis that informs a decision, whether by a person or a machine. Would that be a... >> Yeah, you can even go one step further; it's not just about reducing the latency from the analysis to the decision. In many cases you can actually see that the part that does the analysis and the decision just merge and become one thing, which means much fewer moving parts, less integration work, less, yeah, less maintenance and complexity. >> Okay, and this would be like, for example, how application databases are taking on the capabilities of analytic databases to some extent, or how stream processors can have machine learning, whether they're doing online learning, or calling a model that they're going to score in real time, or even a pre-scored model; is that another example of what you mean? >> You can think of those as examples, yeah. A nice way to think about it is that if you look at a lot of what the analytical applications do, versus, let's say, just online services that match offers and trades, or generate alerts, a lot of those are, in some sense, different ways of just reacting to events, right? You are receiving some real-time data and just want to process it, interact with some form of knowledge that you accumulated over the past, or some form of knowledge that you've accumulated from some other inputs, and then react to that. That kind of paradigm, which is at the core of stream processing for (mumbles), is so generic that it covers many of these use cases, both building applications directly, as we have actually seen (we have seen users that directly build a social network on Flink, where the events that they receive are, you know, a user being created, a user joining a group and so on), and it also covers the analytics of just saying, you know, I have a stream of sensor data, and on certain outliers I want to raise alerts. It's so similar once you start thinking about both of them as just handling streams of events, in this flexible fashion, that it helps to just bring together many things.
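Stephan's "receive an event, consult accumulated knowledge, react" pattern maps directly onto Flink's keyed state and process functions. Here is a present-day sketch in Python; note that PyFlink's DataStream API postdates this 2017 conversation (Flink was Java/Scala-only at the time), and the fraud-style threshold, event tuples, and names are invented for illustration.

```python
# Hedged sketch of event-driven, stateful processing on Flink: each
# event consults per-key state (the "accumulated knowledge") and may
# react by emitting an alert. Requires `pip install apache-flink`.
from pyflink.common import Types
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.functions import KeyedProcessFunction, RuntimeContext
from pyflink.datastream.state import ValueStateDescriptor


class SpendAlert(KeyedProcessFunction):
    def open(self, runtime_context: RuntimeContext):
        # State lives inside the application, keyed by card id; no
        # external database sits on the hot path.
        self.total = runtime_context.get_state(
            ValueStateDescriptor("running_total", Types.FLOAT()))

    def process_element(self, event, ctx):
        current = (self.total.value() or 0.0) + event[1]
        self.total.update(current)
        if current > 10_000:                # react to the event
            yield f"alert: {event[0]} has spent {current:.2f}"


env = StreamExecutionEnvironment.get_execution_environment()
events = env.from_collection(
    [("card-1", 6000.0), ("card-1", 5000.0), ("card-2", 40.0)],
    type_info=Types.TUPLE([Types.STRING(), Types.FLOAT()]))

alerts = (events
          .key_by(lambda e: e[0])
          .process(SpendAlert(), output_type=Types.STRING()))
alerts.print()
env.execute("stateful-event-reaction")
```

The same shape covers the social-network example he mentions: swap the running total for follower lists and the alert for a feed update.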
>> So, that sounds like it would play into the notion of microservices, where the service is responsible for its own state, and they communicate with each other asynchronously, so you have a cooperating collection of components. Now, there are a lot of people who grew up with databases out here, sharing the state among modules of applications. What might drive the growth of this new pattern, the microservices, considering that there's millions of people who just know how to use databases to build apps? >> The interesting part that I think drives this new adoption is that it's such a natural fit for the microservice world. So how do you deploy microservices with state, right? You can have a central database with which you work, and every time you create a new service you have to make sure that it fits with the capacities and capabilities of the database; you have to make sure that the group that runs this database is okay with the additional load. Or you can go to the different model, where each microservice comes up with its own database, but then, every time you deploy one, and that may be a new service or it may just be experimenting with a different variation of the service you'd be testing, you'd have to bring up a completely new thing. In this interesting world of stream processing, stateful stream processing as done by Flink, state is embedded directly in the processing application. So, you actually don't worry about this thing separately; you just deploy that one thing, and it brings both together, tightly integrated, and it's a natural fit, right? The working set of your application goes with your application. If it's deployed, if it's (mumbles), if you bring it down, these things go away. The central part in this thing is nothing more than, if you wish, a backup store, where it would take these snapshots of microservices and store them, in order to recover them from catastrophic failures, in order to just have a historic version to look into if you figure out later, you know, something happened, was this introduced in the last week, let me look at what it looked like the week before, or to just migrate it to a different cluster. >> So, we're going to have to cut things short in a moment, but I wanted to ask you one last question: if, like, microservices are a sweet spot, and sort of near real-time decisions are also a sweet spot for Kafka, what might we expect to see in terms of a roadmap that helps make those, either that generalizes those cases, or that opens up new use cases? >> Yes, so, what we're immediately working on in Flink right now is definitely extending the support in this area for the ability to keep much larger state in these applications, so state that really goes into the multiple terabytes per service; functionality that allows us to manage this, to even more easily evolve this, you know. If the application actually starts owning the state, and it's not in a centralized database anymore, you start needing a little bit of tooling around this state, similar to the tooling you need in databases, a (mumbles) and all of that, so things that actually make that part easier.
Handling (mumbles), and we're actually looking into what are the APIs that users actually want in this area. So Flink has, I think, pretty stellar stream processing APIs, and as you've seen in the last release, we've actually started adding more low-level APIs; one could even think of APIs in which you don't think of streams as distributed collections and windows, but just think about the very basic ingredients: events, state, time and snapshots. So, more control and more flexibility, by just taking directly the basic building blocks rather than more high-level abstractions. I think you can expect more evolution on that layer, definitely in the near future. >> Alright, Stephan, we have to leave it at that, and hopefully pick up the conversation not too long in the future. We are at the Flink Forward Conference at the Kabuki Hotel in San Francisco, and we will be back with more just after a few moments. (funky music)
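The "backup store" of snapshots Stephan describes corresponds to Flink's checkpoint and savepoint mechanism. A hedged PyFlink sketch of enabling it is below; as with the previous example, the Python API arrived after this interview, the storage path is a placeholder, and the exact configuration method names should be checked against your Flink version.

```python
# Hedged sketch: periodic, consistent snapshots of all embedded state,
# the basis for failure recovery, "what did it look like last week?",
# and migrating a job to a different cluster.
from pyflink.datastream import CheckpointingMode, StreamExecutionEnvironment

env = StreamExecutionEnvironment.get_execution_environment()

# Snapshot every 10 seconds with exactly-once guarantees.
env.enable_checkpointing(10_000, CheckpointingMode.EXACTLY_ONCE)

# Durable location for the snapshots; placeholder path, and a method
# name per recent PyFlink releases (verify against your version).
env.get_checkpoint_config().set_checkpoint_storage_dir(
    "file:///tmp/flink-checkpoints")

# Operationally, `flink savepoint <job-id>` takes an on-demand snapshot
# that a redeployed or migrated job can resume from.
```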
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Stephan | PERSON | 0.99+ |
Stephan Ewen | PERSON | 0.99+ |
Flink | ORGANIZATION | 0.99+ |
San Francisco | LOCATION | 0.99+ |
one | QUANTITY | 0.99+ |
last week | DATE | 0.99+ |
first step | QUANTITY | 0.99+ |
one piece | QUANTITY | 0.99+ |
both | QUANTITY | 0.98+ |
U.S. | LOCATION | 0.98+ |
one side | QUANTITY | 0.98+ |
first | QUANTITY | 0.98+ |
each microservice | QUANTITY | 0.98+ |
one thing | QUANTITY | 0.97+ |
First | QUANTITY | 0.97+ |
one last question | QUANTITY | 0.95+ |
Both | QUANTITY | 0.94+ |
third | QUANTITY | 0.92+ |
Kabuki Hotel | LOCATION | 0.9+ |
Kafka | TITLE | 0.89+ |
one step | QUANTITY | 0.89+ |
Artisan | ORGANIZATION | 0.85+ |
Flink Forward user | EVENT | 0.85+ |
millions of people | QUANTITY | 0.85+ |
data Artisans | ORGANIZATION | 0.82+ |
Flink Forward | ORGANIZATION | 0.82+ |
2017 | DATE | 0.73+ |
Forward Conference | LOCATION | 0.55+ |