Rahul Pathak, AWS | Inforum DC 2018
>> Live, from Washington, D.C., it's theCUBE! Covering Inforum DC 2018. Brought to you by Infor. >> Well, welcome back. We are here on theCUBE. Thanks for joining us here as we continue our coverage here at Inforum 18. We're in Washington D.C., at the Walter Washington Convention Center. I'm John Walls, with Dave Vellante and we're joined now by Rahul Pathak, who is the G.M. of Amazon Athena and Amazon EMR. >> Hey there. Rahul, nice to see you, sir. >> Nice to see you as well. Thanks for having me. >> Thank you for being with us, um, now you spoke earlier, at the executive forum, and, um, wanted to talk to you about the title of the presentation. It was Data Lakes and Analytics: the Coming Wave of Brilliance. Alright, so tell me about the title, but more about the talk, too. >> Sure. Uh, so the talk was really about a set of components and a set of trends driving data lake adoption and then how we partner with Infor to allow Infor to provide a data lake that's customized for their vertical lines of business to their customers. And I think part of the notion is that we're coming from a world where customers had to decide what data they could keep, because their systems were expensive. Now, moving to a world of data lakes where storage and analytics is a much lower cost and so customers don't have to make decisions about what data to throw away. They can keep it all and then decide what's valuable later. So we believe we're in this transition, an inflection point where you'll see a lot more insights possible, with a lot of novel types of analytics, much more so than we could do, uh, to this point. >> That's the brilliance. That's the brilliance of it. >> Right. >> Right? Opportunity to leverage... >> To do more. >> Like, that you never could before. >> Exactly. >> I'm sorry, Dave. >> No, no. That's okay. So, if you think about the phases of so called 'big data,' you know, the.... We went from, sort of, EDW to cheaper... >> (laughs) Sure. 
>> Data warehouses that were distributed, right? And this guy always joked that the ROI of Hadoop was reduction of investment, and that's what it became. And as a result, a lot of the so-called data lakes just became stagnant, and so then you had a whole slew of companies that emerged trying to, sort of, clean up the swamp, so to speak. Um, you guys provide services and tools, so you're like "Okay guys, here it is. We're going to make it easier for you." One of the challenges that Hadoop and big data generally had was the complexity, and so, what we noticed was the cloud guys--not just AWS, but in particular AWS really started to bring in tooling that simplified the effort around big data. >> Right. >> So fast-forward to today, and now we're at the point of trying to get insights-- data's plentiful, insights aren't. Um, bring us up to speed on Amazon's big data strategy, the status, what customers are doing. Where are we at in those waves? >> Uh, it's a big question, but yeah, absolutely. So... >> It's a John Furrier question. (laughter) So what we're seeing is this transition from sort of classic EDW to S3 based data lakes. S3's our Amazon storage service, and it's really been foundational for customers. And what customers are doing is they're bringing their data to S3 in open data formats. EDWs still have a role to play. And then we offer services that make it easy to catalog and transform the data in S3, as well as the data in customer databases and data warehouses, and then make that available for systems to drive insight. And, when I talk about that, what I mean is, we have the classic reporting and visualization use cases, but increasingly we're seeing a lot more real time event processing, and so we have services like Kinesis Analytics that makes it easy to run real time queries on data as it's moving. And then we're seeing the integration of machine learning into the stacks. 
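The pattern Rahul describes, landing data in S3 in open formats so every analytic service can read it, can be sketched with a short Python helper. The bucket layout, field names, and `put_event` function here are illustrative assumptions, not anything AWS prescribes:

```python
import json
from datetime import datetime, timezone

def event_key(source, ts, event_id):
    # Partition objects by source and date so query engines and catalogs
    # can prune whole S3 prefixes instead of scanning everything.
    return f"raw/{source}/dt={ts:%Y-%m-%d}/{event_id}.json"

def put_event(bucket, source, event):
    # Writes one JSON event into the data lake. Requires AWS credentials;
    # boto3 is imported lazily so event_key stays usable without the SDK.
    import boto3
    s3 = boto3.client("s3")
    ts = datetime.now(timezone.utc)
    key = event_key(source, ts, event["id"])
    s3.put_object(Bucket=bucket, Key=key, Body=json.dumps(event).encode())
    return key
```

Real-time consumers like Kinesis Analytics would tap the stream before it lands; this sketch covers only the at-rest layout.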
Once you've got data in S3, it's available to all of these different analytic services simultaneously, and so now you're able to run your reporting, your real time processing, but also now use machine learning to make predictive analytics and decisions. And then I would say a fourth piece of this is there's really been, with machine learning and deep learning and embedding them in developer services, there's now been a way to get at data that was historically opaque. So, if you had an audio recording of a customer support call, you can now put it through a service that will actually transcribe it, tell you the sentiment in the call and that becomes data that you can then track and measure and report against. So, there's been this real explosion in capability and flexibility. And what we've tried to do at AWS is provide managed services to customers, so that they can assemble sophisticated applications out of building blocks that make each of these components easier, and, that focus on being best of breed in their particular use case. >> And you're responsible for EMR, correct? >> Uh, so I own a few of these, EMR, Athena and Glue. And, uh, really these are... EMR's open source Spark and Hadoop, um, with customized clusters that operate directly against S3 data lakes, so no need to load into HDFS, so you avoid that staleness point that you mentioned. And then, Athena is serverless SQL on S3, so you can let any analyst log in, just get a SQL prompt and run a query. And then Glue is for cataloging the data in your data lake and databases, and for running transformations to get data from raw form into an efficient form for querying, typically. >> So, EMR is really the first service, if I recall, right? The sort of first big data service-- >> That's right. >> -that you offered, right? And, as you say, you really begin to simplify for customers, because the Hadoop complexity was just unwieldy, and the momentum is still there with EMR? Are people looking for alternatives? 
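The Athena flow just mentioned, where any analyst gets a SQL prompt and queries data sitting in S3 with no cluster to manage, looks roughly like this via boto3. The database, bucket, and query here are hypothetical placeholders:

```python
import time

def athena_request(sql, database, output_s3):
    # Pure helper: the parameters Athena's StartQueryExecution call expects.
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_s3},
    }

def run_query(sql, database="my_datalake", output_s3="s3://my-athena-results/"):
    # Requires AWS credentials; Athena reads the S3 files directly.
    import boto3
    athena = boto3.client("athena")
    qid = athena.start_query_execution(**athena_request(sql, database, output_s3))["QueryExecutionId"]
    while True:
        state = athena.get_query_execution(QueryExecutionId=qid)["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)
    return athena.get_query_results(QueryExecutionId=qid) if state == "SUCCEEDED" else None
```

The polling loop is the serverless trade-off: you submit a query and wait for its state, rather than holding a session on a running cluster.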
Sounds like it's still a linchpin of the strategy? >> No, absolutely. I mean, I think what we've seen is, um, customers bring data to S3, they will then use a service, like Redshift, for petabyte scale data warehousing, they'll use EMR for really arbitrary analytics, using open source technologies, and then they'll use Athena for broad data lake query and access. So these things are all very much complementary, uh, to each other. >> How do you define, just the concept of data lakes, uh, versus other approaches to clients? And trying to explain to them, you know, the value and the use for them, uh, I guess ultimately how they can best leverage it for their purposes? How do you walk them through that? >> Yeah, absolutely. So, there's, um. You know, that starts from the principles around how data is changing. So before we used to have, typically, tabular data coming out of ERP systems, or CRM systems, going into data warehouses. Now we're seeing a lot more variety of data. So, you might have tweets, you might have JSON events, you might have log events, real time data. And these don't fit well into the traditional relational tabular model, ah, so what data lakes allow you to do is, you can actually keep both types of the data. You can keep your tabular data directly in your data lake and you can bring in these new types of data, the semi-structured or the unstructured data sets. And they can all live in the data lake. And the key is to catalog that all so you know what you have and then figure out how to get that catalog visible to the analytic layer. And so the value becomes you can actually now keep all your data. You don't have to make decisions a priori about what's going to be valuable or what format it's going to be useful in. And you don't have to throw away data, because it's expensive to store it in traditional systems. 
And this gives you the ability then to replay the past when you develop better ideas in the future about how to leverage that data. Ah, so there's a benefit to being able to store everything. And then I would say the third big benefit is around, um, placing data in data lakes in open data formats, whether that's CSV or JSON or more efficient formats, which allows customers to take advantage of best of breed analytics technology at any point in time without having to replatform their data. So you get this technical agility that's really powerful for customers, because capabilities evolve over time, constantly, and so, being in a position to take advantage of them easily is a real competitive advantage for customers. >> I want to get to Infor, but this is so much fun, I have some other questions, because Amazon's such a force in this space. Um, when you think about things like Redshift, S3, Kinesis, DynamoDB...we're a customer, these are all tools we're using. Aurora. Um, the data pipeline starts to get very complex, and the great thing about AWS is I get, you know, API access to each of those and primitive access. The drawback is, it starts to get complicated, my data pipeline gets elongated and I'm not sure whether I should run it on this service or that service until I get my bill at the end of the month. So, are there things you're doing to help... First of all, is that a valid concern of customers and what are you doing to help customers in that regard? >> Yeah, so, we do provide a lot of capability and I think our core idea is to provide the best tool for the job, with APIs to access them and combine them and compose them. So, what we're trying to do to help simplify this is A) build in more prescriptive guidance into our services about, look, if you're trying to do x, here's the right way to do x, at least the right way to start with x, and then we can evolve and adapt. 
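The open-formats point above, that data stored as CSV or JSON can move between best-of-breed engines without replatforming, can be illustrated with a minimal stdlib sketch converting one open format to another. Column names are made up for the example:

```python
import csv
import io
import json

def csv_to_ndjson(csv_text):
    # CSV and newline-delimited JSON are both open formats that engines
    # like Athena, EMR/Spark, and Redshift Spectrum can read in place,
    # so converting between them never locks the data to one tool.
    reader = csv.DictReader(io.StringIO(csv_text))
    return "\n".join(json.dumps(row) for row in reader)
```

In practice a columnar format like Parquet would be the "more efficient" target, but the agility argument is the same: the bytes on S3 stay readable by whatever engine comes next.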
Uh, we're also working hard with things like blogs and solution templates and CloudFormation templates to automatically stand up environments, and then, the third piece is we're trying to bring in automation and machine learning to simplify the creation of these data pipelines. So, Glue for example. When you put data in S3, it will actually crawl it on your behalf and infer its structure and store that structure in a catalog, and then once you've got a source table and a destination table, you can point those out and Glue will then automatically generate a pipeline for you to go from A to B, that you can then edit or store in version control. So we're trying to make these capabilities easier to access and provide more guidance, so that you can actually get up and running more quickly, without giving up the power that comes from having the granular access. >> That's a great answer. Because the granularity's critical, because it allows you, as the market changes, it allows you... >> To adapt. To move fast, right? And so you don't want to give that up, but at the same time, you're bringing in complexity and you just, I think, answered it well, in terms of how you're trying to simplify that. The strategy's obviously worked very well. Okay, let's talk about Infor now. Here's a big ISV partner. They've got the engineering resources to deal with all this stuff, and they really seem to have taken advantage of it. We were talking earlier, that, I don't know if you heard Charles's keynote this morning, but he said, when we were an on prem software company, we didn't manage customer servers for them. Back then, the server was the server, uh, software companies didn't care about the server infrastructure. Today it's different. It's like the cloud is giving Infor strategic advantage. The flywheel effect that you guys talk about spins off innovation that they can exploit in new ways. 
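The Glue behavior Rahul describes, crawling S3 to infer structure and store it in a catalog, maps to a couple of API calls. The role ARN, names, and S3 path below are placeholders, not a recommended configuration:

```python
def crawler_config(name, role_arn, database, s3_path):
    # Parameters for Glue's CreateCrawler: pointed at an S3 prefix, the
    # crawler infers schemas and writes table definitions to the Data Catalog.
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }

def start_crawl(name, role_arn, database, s3_path):
    # Requires AWS credentials and an IAM role that Glue can assume.
    import boto3
    glue = boto3.client("glue")
    glue.create_crawler(**crawler_config(name, role_arn, database, s3_path))
    glue.start_crawler(Name=name)
```

The generated-pipeline half of the story (source table to destination table) builds on the tables this crawl produces.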
So talk about your relationship with Infor, and kind of the history of where it's come and where it's going. >> Sure. So, Infor's a great partner. We've been a partner for over four years, they're one of our first all-in partners, and we have a great working relationship with them. They're sophisticated. They understand our services well. And we collaborate on identifying ways that we can make our services better for their use cases. And what they've been able to do is take all of the years of industry and domain expertise that they've gained over time in their vertical segments, and with their customers, and bring that to bear by using the components that we provide in the cloud. So all these services that I mentioned, the global footprint, the security capabilities, the, um, all of the various compliance certifications that we offer act as accelerators for what Infor's trying to do, and then they're able to leverage their intellectual property and their relationships and experience they've built up over time to get this global footprint that they can deploy for their customers, that gets better over time as we add new capabilities, they can build that into the Infor platform, and then that rolls out to all of their customers much more quickly than it could before. >> And they seem to be really driving hard, I have not heard an enterprise software company talk so much about data, and how they're exploiting data, the way that I've heard Infor talk about it. So, data's obviously key, it's the lifeblood-- people say it's the new oil--I'm not sure that's the best analogy. I can only put oil in my house or my car, I can't put it in both. Data--I can do so many things with it, so, um... >> I suspect that analogy will evolve. >> I think it should. >> I'm already thinking about it now. >> You heard it here first in the Cube. >> You keep going, I'll come up with something >> Don't use that anymore. >> Scratch the oil. 
>> Okay, so, your perspectives on Infor, its, sort of, use of data and what Amazon's role is in terms of facilitating that. >> So what we're providing is a platform, a set of services with powerful building blocks, that Infor can then combine into their applications that match the needs of their customers. And so what we're looking to do is give them a broad set of capabilities, that they can build into their offerings. So, CloudSuite is built entirely on us, and then Infor OS is a shared set of services and part of that is their data lake, which uses a number of our analytic services underneath. And so, what Infor's able to do for their customers is break down data silos within their customer organizations and provide a common way to think about data and machine learning and IoT applications across data in the data lake. And we view our role as really a supporting partner for them in providing a set of capabilities that they can then use to scale and grow and deploy their applications. >> I want to ask you about--I mean, security-- I've always been comfortable with cloud security, maybe I'm naive--but compliance is something that's interesting and something you said before... I think you said cataloging with Glue allows you to essentially keep all the data, right? And my concern about that is, from a governance perspective, the legal counsel might say, "Well, I don't "want to keep all my data, if it's work in process, "I want to get rid of it "or if there's a smoking gun in there, "I want to get rid of it as soon as I can." Keep data as long as possible but no longer, to sort of paraphrase Einstein. So, what do you say to that? Do you have customers in the legal office that say, "Hey, we don't want to keep data forever, "and how can you help?" >> Yeah, so, just to refine the point on Glue. What Glue does is it gives you essentially a catalog, which is a map of all your data. 
Whether you choose to keep that data or not keep that data, that's a function of the application. So, absolutely >> Sure. Right. We have customers that say, "Look, here are my data sets for "whether it's new regulations, or I just don't want this "set of data to exist anymore, or this customer's no longer with us and we need to delete that," we provide all of those capabilities. So, our goal is to really give customers the set of features, functionality, and compliance certifications they need to express the enterprise security policies that they have, and ensure that they're complying with them. And, so, then if you have data sets that need to be deleted, we provide capabilities to do that. And then the other side of that is you want the audit capabilities, so we actually log every API access in the environment in a service called CloudTrail and then you can actually verify by going back and looking at CloudTrail that only the things that you wanted to have happen, actually did happen. >> So, you seem very relaxed. I have to ask you what life is like at Amazon, because when I was down at AWS's D.C. offices, and you walk in there, and there's this huge-- I don't know if you've seen it-- there's this giant graph of the services launched and announced, from 2006, when EC2 first came out, til today. And it's just this ridiculous set of services. I mean the line, the graph is amazing. So you're moving at this super, hyper pace. What's life like at AWS? >> You know, I've been there almost seven years. I love it. It's been fantastic. I was an entrepreneur and came out of startups before AWS, and when I joined, I found an environment where you can continue to be entrepreneurial and active on behalf of you customers, but you have the ability to have impact at a global scale. So it's been super fun. The pace is fast, but exhilarating. We're working on things we're excited about, and we're working on things that we believe matter, and make a difference to our customers. 
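The audit check Rahul mentioned a moment ago, going back through CloudTrail to verify that only the intended API calls actually happened, can be sketched as follows. The event name and lookback window are assumptions for illustration:

```python
from datetime import datetime, timedelta

def lookup_params(event_name, days_back=7):
    # Filter CloudTrail's API audit log by event name over a recent window.
    end = datetime.utcnow()
    return {
        "LookupAttributes": [
            {"AttributeKey": "EventName", "AttributeValue": event_name}
        ],
        "StartTime": end - timedelta(days=days_back),
        "EndTime": end,
    }

def audit_events(event_name):
    # Requires AWS credentials; e.g. audit_events("DeleteObject") to see
    # which principals deleted data from S3, and when.
    import boto3
    ct = boto3.client("cloudtrail")
    pages = ct.get_paginator("lookup_events").paginate(**lookup_params(event_name))
    return [e for page in pages for e in page.get("Events", [])]
```

This is the verification half of the governance story: deletion is an application decision, but the trail proves it happened.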
So, it's been really fun. >> Well, so you got--I mean, you're right at the heart of what I like to call the innovation sandwich. You've got data, tons of data, obviously, in the cloud. You're a leader and increasingly becoming sophisticated in machine intelligence. So you've got data, machine intelligence, or AI, applied to that data, and you've got cloud for scale, cloud for economics, cloud for innovation, you're able to attract startups--that's probably how you found AWS to begin with, right? >> That's right. >> All the startups, including ours, we want to be on AWS. That's where the developers want to be. And so, again, it's an overused word, but that flywheel of innovation occurs. And that to us is the innovation sandwich, it's not Moore's Law anymore, right? For decades this industry marched to the cadence of Moore's Law. Now it's a much more multi-dimensional matrix and it's exciting and sometimes scary. >> Yeah. No, I think you touched on a lot of great points. It's really fun. I mean, I think, for us, the core is, we want to put things together the customers want. We want to make them broadly available. We want to partner with our customers to understand what's working and what's not. We want to pass on efficiencies when we can and then that helps us speed up the cycle of learning. >> Well, Rahul, I actually was going to say, I think he's so relaxed because he's on theCUBE. >> Ah, could be. >> Right, that's it. We just like to do that with people. >> No, you're fantastic. >> Thanks for being with us. >> It's a pleasure. >> We appreciate the insights, and we certainly wish you well with the rest of the show here. >> Excellent. Thank you very much, it was great to be here. >> Thank you, sir. >> You're welcome. >> You're watching theCUBE. We are live here in Washington, D.C. at Inforum 18. (techno music)
Kunal Agarwal, Unravel Data | Big Data SV 2018
>> Announcer: Live from San Jose, it's theCube! Presenting Big Data: Silicon Valley. Brought to you by SiliconANGLE Media and its ecosystem partners. (techno music) >> Welcome back to theCube. We are live on our first day of coverage at our event BigDataSV. I am Lisa Martin with my co-host George Gilbert. We are at this really cool venue in downtown San Jose. We invite you to come by today, tonight for our cocktail party. It's called Forager Tasting Room and Eatery. Tasty stuff, really, really good. We are down the street from the Strata Data Conference, and we're excited to welcome to theCube a first-time guest, Kunal Agarwal, the CEO of Unravel Data. Kunal, welcome to theCube. >> Thank you so much for having me. >> So, I'm a marketing girl. I love the name Unravel Data. (Kunal laughs) >> Thank you. >> Two year old company. Tell us a bit about what you guys do and why that name... What's the implication there with respect to big data? >> Yeah, we are an application performance management company. And big data applications are just very complex. And the name Unravel is all about unraveling the mysteries of big data and understanding why things are not performing well and not really needing a PhD to do so. We're simplifying application performance management for the big data stack. >> Lisa: Excellent. >> So, so, um, you know, one of the things that a lot of people are talking about with Hadoop, originally it was this cauldron of innovation. Because we had the "let a thousand flowers bloom" in terms of all the Apache projects. But then once we tried to get it into operation, we discovered there's a... >> Kunal: There's a lot of problems. (Kunal laughs) >> There's an overhead, there's a downside to it. >> Maybe tell us, tell us why you both need to know, you need to know how people have done this many, many times. >> Yeah. >> How you need to learn from experience and then how you can apply that even in an environment where someone hasn't been doing it for that long. 
>> Right. So, if I step back a little bit. Big data is powerful, right? It's giving companies an advantage that they never had, and data's an asset to all of these different companies. Now they're running everything from BI, machine learning, artificial intelligence, IOT, streaming applications on top of it for various reasons. Maybe it is to create a new product to understand the customers better, etc. But as you rightly pointed out, when you start to implement all of these different applications and jobs, it's very, very hard. It's because big data is very complex. With that great power comes a lot of complexity, and what we started to see is a lot of companies, while they want to create these applications and provide that differentiation to their company, they just don't have enough expertise in-house to go and write good applications, maintain these applications, and even manage the underlying infrastructure and cluster that all these applications are running on. So we took it upon ourselves where we thought, Hey, if we simplify application performance management and if we simplify ongoing management challenges, then these companies would run more big data applications, they would be able to expand their use cases, and not really be fearful of, Hey, we don't know how to go and solve these problems. Do we actually rely on our system that is so complex and new? And that's the gap that Unravel fills, which is we monitor and manage not only one component of the big data ecosystem, but like you pointed out, it's a, it's a full zoo of all of these systems. You have Hadoop, and you have Spark, and you have Kafka for data ingestion. You may have some NoSQL systems and newer MPP platforms as well. So the vision of Unravel is really to be that one place where you can come in and understand what's happening with your applications and your system overall and be able to resolve those problems in an automatic, simple way. 
>> So, all right, let's start at the concrete level of what a developer might get out of >> Kunal: Right. >> something that's wrapped in Unravel and then tell us what the administrator experiences. >> Kunal: Absolutely. So if you are a big data developer, you've got a business requirement that, Hey, go and make this application that understands our customers better, right? They may choose a tool of their liking, maybe Hive, maybe Spark, maybe Kafka for data ingestion. And what they'll do is they'll write an app first in dev, in their dev environment or the QA environment. And they'll say, Hey, maybe this application is failing, or maybe this application is not performing as fast as I want it to, or even worse, this application is starting to hog a lot of resources, which may slow down my other applications. Now to understand what's causing these kinds of problems, today developers really need a PhD to go and decipher them. They have to look at tons of raw logs, metrics, configuration settings and then try to stitch the story up in their head, trying to figure out what is the effect, what is the cause? Maybe it's this problem, maybe it's some other problem. And then do trial and error, you know, to solve that particular issue. Now what we've seen is big data developers come in a variety of flavors. You have the hardcore developers who truly understand Spark and Hadoop and everything, but then 80% of the people submitting these applications are data scientists or business analysts, who may understand SQL, who may know Python, but don't necessarily know what distributed computing and parallel processing and all of these things really are, and where inefficiencies and problems can really lie. 
So we give them this one view, which will connect all of these different data sources and then tell them in plain English, this is the problem, this is why this problem happened, and this is how you can go and resolve it, thereby getting them unstuck and making it very simple for them to go in and get the performance that they're expecting. >> So, these, these, um, they're the developers up front and you're giving them a whole new, sort of, toolchain or environment to solve the operational issues. >> Kunal: Right. >> So that, if it's DevOps, it's really dev that's much more self-sufficient. >> Yes, yes, I mean, all companies want to run fast. They don't want to be slowed down. If you have a problem today, they'll file a ticket, it'll go to the operations team, you wait a couple of days to get some more information back. That just means your business has slowed down. If things are simple enough where the application developers themselves can resolve a lot of these issues, that'll get the business unstuck and get them moving on further. Now, to the other point which you were asking, which is what about the operations and the app support people? So, Unravel's a great tool for them too because that helps them see what's happening holistically in the cluster. How are other applications behaving with each other? It's usually a multitenant, multiapplication environment that these big data jobs are running on. So, are my apps slowing down George's apps? Am I stealing resources from your applications? More so, it's not just about an individual application issue itself. So Unravel will give you visibility into each app, as well as the overall cluster, to help you understand cluster-wide problems. >> Love to get at, maybe peel apart your target audience a little bit. You talked about DevOps. But also the business analysts, data scientists, and we talk about big data. 
Data has such tremendous power to fuel a company and, you know, like you said, use it to create and deliver new products. Are you talking with multiple audiences within a company? Do you start at DevOps and they bring in their peers? Or do you actually start, maybe, at the Chief Data Officer level? What's that kind of entrance for Unravel? >> So the word I use to describe this is DataOps, instead of DevOps, right? So in the older world you had developers, and you had operations people. Over here you have a data team and operations people, and that data team can comprise the developers, the data scientists, the business analysts, etc., as well. But you're right. Although we first target the operations role, because they have to manage and monitor the system and make sure everything is running like a well-oiled machine, they are now spreading it out to be end-users, meaning the developers themselves, saying, "Don't come to me for every problem. "Look at Unravel, try to solve it here, "and if you cannot, then come to me." This is all, again, improving agility within the company, making sure that people have the necessary tools and insights to carry on with their day. >> Sounds like an enabler, >> Yeah, absolutely. >> That operations would push down to the devs, the developers themselves. >> And even the managers and the CDOs, for example, they want to see the ROI that they're getting from their big data investments. They want to see, they have put in these millions of dollars, have got an infrastructure and these services set up, but how are we actually moving the needle forward? Are there any applications that we're actually putting in business, and is that driving any business value? So we will be able to give them a very nice dashboard helping them understand what kind of throughput you're getting from your system, how many applications you were able to develop last week and onboard to your production environment? 
And what's the rate of innovation that's really happening inside your company on those big data ecosystems?
>> It brings up an interesting question on two prongs. One is the well-known but inexact number about how many big data projects
>> Kunal: Yeah, yeah.
>> I don't know whether they fail or didn't pay off. So there's going in and saying, "Hey, we can help you manage this, because it was too complicated." But then there are also all the folks who decided, "Well, we really don't want to run it all on-prem. We're not going to throw away everything we did there, but we're also going to put a lot of new investment
>> Kunal: Exactly, exactly.
>> in the cloud." Now, Wikibon has a term for that, true private cloud, which is when you take the operational processes you use in the public cloud and apply them on-prem.
>> Right.
>> George: But there aren't many products that help you do that. How can Unravel work...?
>> Kunal: That's a very good question, George. We're seeing the world move more and more to a cloud environment, or I should say an on-demand environment, where you're not so bothered about the infrastructure and the services: you want Spark as a dial tone, you want Kafka as a dial tone, you want a machine-learning platform as a dial tone. You want to come in, put in your data, and just start running. Unravel has been designed from the ground up to monitor and manage any of these environments. So Unravel can solve problems for your applications running on-premise, and similarly for all the applications running in the cloud. Now, in the cloud there are other levels of problems as well. Of course, you'd still have applications that are slow, applications that are failing; we can solve those problems.
But if you look at a cloud environment, a lot of these now provide you an autoscaling capability, meaning: hey, if this app doesn't run in the amount of time we were hoping, let's add extra hardware and run it again. Well, if you just keep throwing machines at the problem, it's not going to solve your issue, and the runtime doesn't decrease linearly with how many servers you throw in there. So what we can help companies understand is: what is the resource requirement of a particular application? How should we intelligently allocate resources to make sure you meet your time SLAs, your constraint of "I need to finish this within x minutes," while at the same time being intelligent about how much you're spending? Do you actually need 500 containers to run this app? You may have needed 200. How do you know that? So Unravel will also help you get efficient with your runs: not just faster, but also a good multi-tenant citizen that uses limited resources to run these applications.
>> So, Kunal, some of the things I'm hearing from a customer's standpoint that are potential positive business outcomes are internal: performance boosts.
>> Kunal: Yeah.
>> It also sounds like productivity improvements internally, and then also the opportunity to have the insight to deliver new products. I'm even thinking of helping a retailer, for example, do more targeted marketing. So the business outcomes and the impact that Unravel can make really seem to have pretty strong internal and external benefits.
>> Kunal: Yes.
>> Is there a favorite customer story, (Kunal laughs) you don't have to mention names, that you really think speaks to your capabilities?
>> So, 100%. Improving performance is a very big factor of what Unravel can do.
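The "500 containers versus 200" point, that runtime doesn't shrink linearly as you add machines, can be made concrete with a simple Amdahl-style model. The serial/parallel split and the SLA below are made-up constants for illustration, not anything Unravel measures.

```python
def runtime_minutes(containers, serial_min=2.0, parallel_min=600.0):
    """Toy model: total runtime = fixed serial part + parallel work / container count.
    The constants are invented for illustration."""
    return serial_min + parallel_min / containers

def containers_needed(sla_min, max_containers=1000):
    """Smallest container count whose modeled runtime meets the SLA, or None."""
    for n in range(1, max_containers + 1):
        if runtime_minutes(n) <= sla_min:
            return n
    return None

# Meeting a 5-minute SLA in this model takes 200 containers...
n = containers_needed(5.0)
print(n, runtime_minutes(n))

# ...while 500 containers (2.5x the cost) only shaves off under two more minutes.
print(runtime_minutes(500))
```

In this sketch, 200 containers already hit the 5-minute SLA; throwing 500 at it buys a marginal improvement at 2.5x the spend, which is exactly the cost-versus-SLA tradeoff Kunal describes.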
Decreasing costs, by improving productivity and by limiting the amount of resources you're using, is a very, very big factor. Now, among all the companies we work with, one key factor is improving reliability. It's fine that we can speed up an application, but I know the latency I expect from an app: maybe it's a second, maybe it's a minute, depending on the type of application. What businesses cannot tolerate is that app taking five times longer today. If it's going to finish in a minute, tell me it'll finish in a minute, and make sure it finishes in a minute. And this is a big use case across the big data vendors, because a lot of customers are moving from Teradata, or from Vertica, or from other relational databases, onto Hortonworks or Cloudera or Amazon EMR. Why? Because it's one tenth the cost of running these workloads. But then customers get frustrated and say, "I don't mind paying 10x more money, because over there it used to work. Over here, there are just so many complications, and I don't have reliability with these applications." So that's a big, big factor in how we actually help these customers get value out of the Unravel product.
>> Okay, so a question I have is: why aren't there many other Unravels?
>> Kunal: Yeah. (Kunal laughs)
>> From what I understood from past conversations,
>> Kunal: Yeah.
>> you can only really build the models at the heart of your capabilities based on tons and tons of telemetry
>> Kunal: Yeah.
>> that cloud providers, or internet-scale service providers, have accumulated, because they all have a well-known set of configurations and a well-known kind of topology. In other words, there aren't a million degrees of freedom on any particular side; you have a well-scoped problem, and you have tons of data. So it's easier to build the models.
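The reliability check Kunal describes, "if it's going to finish in a minute, make sure it finishes in a minute," amounts to flagging runs that blow past their historical baseline. A minimal sketch, with invented run times and a simple median baseline rather than whatever Unravel actually models:

```python
from statistics import median

def flag_unreliable(history_sec, latest_sec, factor=5.0):
    """Flag a run whose duration exceeds `factor` times the historical median.
    A toy stand-in for the reliability checks described above."""
    baseline = median(history_sec)
    return latest_sec > factor * baseline, baseline

# An app that normally finishes in about a minute...
history = [58, 62, 60, 61, 59]

# ...suddenly takes over five minutes: that's the "5x" case businesses can't tolerate.
slow, baseline = flag_unreliable(history, latest_sec=310)
print(slow, baseline)
```

A real system would use percentiles over many runs and account for input-size growth, but the core idea, comparing today's run against the app's own expected latency, is the same.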
So who else could do this?
>> Yeah, so the difference between Unravel and other monitoring products is that Unravel is not a monitoring product. It's an intelligent performance management suite. What that means is we don't just give you graphs and metrics and say, "Here's all the raw information, you go figure it out." Instead, we take it a step further and actually give people answers. In order to develop something like that, you need full-stack information; that's number one. Meaning information from the applications all the way down to the infrastructure and everything in between. Why? Because problems can lie anywhere, and if you don't have that full-stack information, you're blinding yourself, or limiting the scope of the problems you can actually search for. Secondly, as you were rightly pointing out, how do I create answers from all this raw data? You have to think the way a big data expert would think: if there is a problem, what are the checks, the balances, the places that person would look into, and how would that person establish that this is indeed the root cause of the problem today? And then, how would that person actually resolve it? So we have a big team of scientists and researchers. In fact, my co-founder is a professor of computer science at Duke University who has been researching database optimization techniques for the last decade. We have about 80-plus publications in this area, Starfish being one of them. We have a bunch of other publications which talk about how you automate problem discovery, root cause analysis, and resolution to get the best performance out of these different databases. And you're right: a lot of work has gone into the research side, but a lot of work has also gone into understanding the needs of the customers.
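Encoding "how an expert would think" as automated checks over full-stack metrics can be sketched as a small rule base that maps symptoms to causes and fixes. The metric names, thresholds, and remedies below are invented for illustration and are not Unravel's actual rules.

```python
# Each rule: (cause name, check over a metrics dict, suggested fix).
# Thresholds are illustrative guesses, not tuned values.
RULES = [
    ("data skew",
     lambda m: m["max_task_sec"] > 3 * m["median_task_sec"],
     "Repartition or salt the skewed key."),
    ("memory pressure",
     lambda m: m["gc_time_pct"] > 30,
     "Increase executor memory or cache less data."),
    ("under-parallelized",
     lambda m: m["tasks"] < m["cores"],
     "Raise the partition count to use all cores."),
]

def diagnose(metrics):
    """Return (cause, fix) pairs for every rule whose check fires."""
    return [(name, fix) for name, check, fix in RULES if check(metrics)]

# One slow run: the longest task takes 8x the median, a classic skew symptom.
run = {"max_task_sec": 400, "median_task_sec": 50,
       "gc_time_pct": 12, "tasks": 64, "cores": 32}
for cause, fix in diagnose(run):
    print(cause, "->", fix)
```

Unravel's published work goes well beyond static rules into learned models, but a rule base like this illustrates why full-stack metrics matter: each check needs signals from a different layer of the stack.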
So we worked with some of the biggest companies out there, which have some of the biggest big data clusters, to learn from them: what are the everyday, ongoing management challenges you face? Then we took those problems to our datasets and figured out, how can we automate problem discovery? How can we proactively spot a lot of these errors? I joke around and tell people that we're big data for big data. All these companies we serve are gathering all of this data, trying to find patterns, trying to find some sort of insight in their data. Our data is system-generated data: performance data, application data. And we're doing the exact same thing, figuring out inefficiencies, problems, and the cause and effect of things, to solve them in a more intelligent, smart way.
>> Well, Kunal, thank you so much for stopping by theCUBE
>> Kunal: Of course.
>> and sharing how Unravel Data is helping to unravel the complexities of big data. (Kunal laughs)
>> Thank you so much. Really appreciate it.
>> Now you're a Cube alumni. (Kunal laughs)
>> Absolutely. Thanks so much for having me.
>> Kunal, thanks.
>> Yeah, and we want to thank you for watching theCUBE. I'm Lisa Martin with George Gilbert. We are live at our own event, BigData SV, in downtown San Jose, California. Stick around. George and I will be right back with our next guest.
(quiet crowd noise) (techno music)