Shruthi Murthy, St. Louis University & Venkat Krishnamachari, MontyCloud | AWS Startup Showcase


 

(gentle music) >> Hello and welcome to today's session, theCUBE presentation of the AWS Startup Showcase powered by theCUBE. I'm John Furrier, your host of theCUBE. This is a session on breaking through with DevOps data analytics tools, cloud management tools with MontyCloud, and cloud management and migration. Thanks for joining me, I've got two great guests. Venkat Krishnamachari, who's the co-founder and CEO of MontyCloud, and Shruthi Sreenivasa Murthy, solution architect, Research Computing Group, St. Louis University. Thanks for coming on to talk about transforming IT, day one, day two operations. Venkat, great to see you. >> Great to see you again, John. So in this session, I really want to get into this cloud powerhouse theme you guys were talking about before on our previous Cube Conversations and what it means for customers, because there is a real market shift happening here. And I want to get your thoughts on what the solution to the problem is, basically, that you guys are targeting. >> Yeah, John, cloud migration is happening rapidly. It's not an option. It is the current and the immediate future of many IT departments and any type of computing workloads. And applications and services these days are better served by cloud adoption. This rapid acceleration is where we are seeing a lot of challenges, and we've been helping customers with our platform so they can go focus on their business. So happy to talk more about this. >> Yeah, and Shruthi, if you can just explain your relationship with these guys, because you're a cloud architect, you can try to put this together. You're MontyCloud's customer; talk about your solution. >> Yeah, I work at St. Louis University as the solutions architect for the office of the Vice President of Research. We can address St. Louis University as SLU, just to keep it easy. SLU is a 200-year-old university with a strong focus on research. And our goal at the Research Computing Group is to help researchers by providing the right infrastructure and computing capabilities that help them advance their research. Here at SLU the research portfolio is quite diverse, right? We do research on vaccines, economics, geospatial intelligence, and many other really interesting areas, and, you know, it involves really large data sets. So one of the Research Computing Group's ambitious plans is to move as many high-end computation applications from on-prem to AWS. And I lead all the cloud initiatives for St. Louis University. >> Yeah, Venkat and I, we've been talking many times on theCUBE, previous interviews, about, you know, the rapid agility that's happening with serverless and functions, and, you know, microservices. We're starting to see massive acceleration of how fast cloud apps are being built. It's put a lot of pressure on companies to hang on and manage all this. And whether your security group is trying to lock down something, or it's just, it's so fast, the cloud development scene is really fun and you're implementing it at a large scale. What's it like these days from a development standpoint? You've got all this greatness in the cloud. What's the DevOps mindset right now? >> SLU is slowly evolving itself as the AWS Center of Excellence here in St. Louis, and most of the workflows that we are trying to implement on AWS use DevOps and, you know, CI/CD pipelines. And basically we want it ready and updated for the researchers, where they can use it and not have to wait on any of the resources. So it has a lot of importance.
>> Research as code. It's like the internet; infrastructure as code is DevOps' ethos. Venkat, let's get into where this all leads to, because you're seeing a culture shift in companies as they start to realize if they don't move fast, and the blockers that get in the way of the innovation, you really can't get your arms around this growth as an opportunity to operationalize all the new technology. Could you talk about the transformation goals that are going on with your customer base? What's going on in the market? Can you explain and unpack the high-level market around what you guys are doing? >> Sure thing, John. Let's bring up slide one; we have some content on that. John, every legacy application, commercial application, even internal IT departments, they're all transforming fast. Speed has never been more important than in the era we are in today. For example, COVID research, you know, analyzing massive data sets to come up with some recommendations. They demand a lot from the IT departments so that researchers and developers can move fast. And IT departments are not only moving current workloads to the cloud, they're also ensuring the cloud is being consumed the right way, so researchers can focus on what they do best. What we're learning, working closely with customers, is that there are three steps, or three major, you know, milestones that they like to achieve. I would start with the outcome, right? The important milestone IT departments are trying to get to is transforming such that they're directly tied to the key business objectives. Everything they do has to be connected to the business objective, which means the time and, you know, budget and everything's aligned towards what they want to deliver. IT departments we talk with have one common goal. They want to be experts in cloud operations. They want to deliver cloud operations excellence so that researchers and developers can move fast. But they're almost always, you know, time poor, right? And there are budget gaps, and there are talent and tooling gaps. A lot of that is what's causing the, you know, challenges on their journey. And we have taken a methodical and deliberate position in helping them get there. >> Shruthi, what's your reaction to that? Because, I mean, you want it faster, cheaper, better than before. You don't want to have all the operational management hassles. You mentioned that you guys want to do this turnkey. Is that the use case that you're going after? Just researchers having access at their fingertips to all these resources? What's the mindset there, what's your expectation? >> Well, one of the main expectations is to be able to deliver it to the researchers on demand as they need it, and, you know, moving from a traditional on-prem HPC to cloud would definitely help, because, you know, we are able to give the right resources to the researchers and able to deliver projects in a timely manner, and, you know, with some additional help from MontyCloud's data platform, we are able to do it even better. >> Yeah, I like the onboarding thing, and to get in easy and get value quickly, that's the cloud business model. Let's unpack the platform, let's go under the hood. Venkat, if you can take us through some of the moving parts under the platform. You guys have it up at the high level; the market's obvious for everyone out there watching: cloud ops, speed, stability. But let's go look at the platform.
Let's unpack that, do you mind picking up on slide two, and let's go look at what's going on in the platform. >> Sure. Let's talk about what comes out of the platform, right? They are directly tied to what the customers would like to have, right? Customers would like to fast-track their day one activities. Solution architects such as Shruthi, their role is to try and help get out of the way of the researchers while delivering cloud solutions, right? Our platform acts like a seasoned cloud architect. It's as if you've instantly turned on a cloud solution architect that they can bring online and say, hey, I want help here to go faster. Our platform then has capabilities that help customers provision with a set of governance controls and drive consumption in the right way. One of the key things about driving consumption the right way is to ensure that we prevent security, cost, or compliance issues from happening in the first place, which means you're shifting a lot of the operational burden to the left and making sure that when provisioning happens, you have guardrails in place. We help with that; the platform solves the problem without writing code. And an important takeaway here, John, is that it was built for architects and administrators who want to move fast without having to write a ton of code. And it is also a platform where they can bring online autonomous bots that can solve problems. For example, when it comes to post-provisioning, everybody is in the business of ensuring security because it's a shared model. Everybody has to keep an eye on compliance; that is also a shared responsibility, and so is cost optimization. So we thought, wouldn't it be awesome to have architects such as Shruthi turn on a compliance bot on the platform that gives them the peace of mind that somebody else, an autonomous bot, is watching over it 24 by 7 and making sure that these day two operations don't throw curveballs at them, right? That's important for agility. So the platform solves that problem with an automation approach. Going forward on an ongoing basis, right, the operational burden is what gets IT departments. We've seen that happen repeatedly. You know this, John, maybe you have some thoughts on this; if you have some comments on how IT can face this, then maybe that's better to hear from you. >> No, well, first I want to unpack that platform, because I think one of the advantages I see here, and that people are talking about in the industry, is not only the collision between security postures and rapid cloud development, because DevOps and cloud folks are moving super fast. They want things done at the point of coding and in the CI/CD pipeline, as well as any kind of changes; they want it fast, not in weeks. They don't want to have someone blocking it like a security team, so automation with compliance is beautiful, because now the security teams can provide policies. Those policies can then go right into your platform. And then everyone's got the rules of the road, and then anything that comes up gets managed through the policy. So I think this is a big trend that nobody's talking about, because this allows the cloud to go faster. What's your reaction to that? Do you agree? >> No, precisely right. I'll let Shruthi jump on that, yeah. >> Yeah, you know, I just wanted to bring up one of the case studies where we ran on cloud and used their compliance bot.
So REDCap, the Research Electronic Data Capture, also known as REDCap, is a web application. It's a HIPAA-compliant web application and one of the flagship projects for the research group at SLU. REDCap was running on traditional on-prem infrastructure, so maintaining the servers and updating the application to its latest version was definitely a challenge. And also granting access to the researchers had long lead times because of the rules and security protocols in place. So we wanted to be able to build a secure and reliable environment on the cloud where we could just provision on demand, and in turn ease the job of updating the application to its latest version without disturbing the production environment. Because this is a really important application, most of the doctors and researchers at St. Louis University and the School of Medicine and St. Louis University Hospital use it. So given this challenge, we wanted to bring in MontyCloud's cloud ops and, you know, security expertise to simplify the provisioning. And that's when we implemented this compliance bot. Once it is implemented, it's pretty easy to understand, you know, what is compliant, what is noncompliant with the HIPAA standards, and where it needs remediation efforts and what we need to do. And again, that can also be automated. It's nice and simple, and you don't need a lot of cloud expertise to go through the compliance bot and come up with your remediation plan. >> What's the change in the outcome in terms of the speed, the turnaround time, the before and after? So before, you're dealing with obviously provisioning stuff and lead time, but just on the compliance closed loop, to ask a question, you know, there's a lot of manual work and maybe some workflows in there, but not as cool as an instant bot that solves a yes-or-no decision. And after MontyCloud, what are some of the times? Can you share any data there, just an order of magnitude? >> Yeah, definitely. So the provisioning was never simpler, I mean, we are able to provision with just one or two clicks, and then we have a better governance guardrail, like Venkat says, and I think, you know, to give you specific data, the compliance bot does more than 160 checks and it's all automated, so when it comes to security, definitely we have been able to save a lot of effort on that. And I can tell you that our researchers are able to be 40% more productive with the infrastructure. And our Research Computing Group is able to kind of save time on, you know, the security measures and the remediation efforts, because we get customized alerts and notifications and you just need to go in and, you know, act on them. >> So people are happier, right? People are getting along at the office or virtually, you know, no one is yelling at each other on Slack, hey, where's this? 'Cause that's really the harmony here then, okay. Joking aside, this is a real cultural issue between speed of innovation and what could be viewed as a blocker, or just the time that, say, security teams or other teams might take to get back to you, to make sure things are compliant. So that could slow things down; that tension is real and there are some disconnects within companies. >> Yeah, John, that's spot on, and that means we have to do a better job, not only solving the traditional problems and making them simple, but also for the modern work culture of integrations. You know, it's not uncommon, like you called out, for researchers and architects to talk in a Slack channel often.
You say, hey, I need this resource, or I want to reconfigure this. How do we make that collaboration better? How do you make the platform intelligent so that it can take some of the burden off of people, so that the platform can monitor, react, notify in a Slack channel, or have the administrator say, hey, next time this happens, automatically go create a ticket for me. If it happens next time in this environment, automatically go run a playbook that remediates it. That gives a lot of time back, and it puts peace of mind into the process and the operating model that you have inherited, where you're trying to deliver excellence with more help, particularly because it is a very dynamic footprint. >> Yeah, I think this whole guardrail thing is a really big deal. I think it's like a feature, but it's a super important outcome, because if you can have policies that map into these bots that can check rules really fast, then developers will have the freedom to drive as fast as they want, and literally go hard, and then shift left and do the coding and do all their stuff on the hygiene side from day one on security. It is really a big deal. Can we go back to this slide again for the other project? There's another project on that slide. You talked about RED, was it REDCap, was that one? >> Yeah. >> Yeah, so REDCap; what's the other project? >> So SCAER, the Sinfield Center for Applied Economic Research at SLU, is also known as SCAER. They're pretty data intensive, and they're into some really sophisticated research. The Center gets daily dumps of sensitive, de-identified geo data from various sources, and it's a terabyte or so every day, which becomes petabytes. So, you know, we don't get the data in workable formats for the researchers to analyze. So the first process is to convert this data into a workable format and keep it analysis-ready, and doing this at a large scale has many challenges. We had to make this data available to a group of users too, and some external collaborators, which adds, you know, more challenges again, because we also have to do this without compromising on the security. So to handle this large-size data, we had to deploy compute-heavy instances, such as, you know, R5 12xlarge, multiple 12xlarge instances, and optimizing the cost and the resources deployed on the cloud again was a huge challenge. So that's when we took MontyCloud's help in automating the whole process of ingesting the data into the infrastructure and then converting it into a workable format. And this was all automated. And after automating most of the efforts, we were able to bring down the data processing time from two weeks or more to three days, which really helped the researchers. MontyCloud's data platform also helped us with automating the, you know, the resource optimization process, and that in turn helped bring the costs down, so it's been pretty helpful. >> That's impressive, weeks to days. I mean, this is the theme, Venkat: speed, speed, speed, hybrid, hybrid. A lot of stuff happening. I mean, this is the new normal; this is going to make companies more productive if they can get the apps built faster. What do you see, as the CEO and founder of the company? You're out there, you know, you're forging new ground with this great product. What do you see as the blockers from customers? Is it cultural, is it lack of awareness? Why aren't people jumping all over this? >> Oh, people are, right.
They go at it in so many different ways, whether, you know, they're a one-person IT team or a massively well-funded IT team. Everybody wants to excel at what they're delivering in cloud operations; the path to that is the challenging part, right? What we are seeing is customers trying to build their own operating model and writing custom code, and there's a lot of need for provisioning, governance, security, compliance, and monitoring. So they start integrating point tools, and then suddenly the IT department now has what they call a tax, right? They have to maintain the technical debt while cloud services are moving fast. It's not uncommon for one of the developers or one of the projects to suddenly consume a brand new resource. And as you know, AWS throws up a lot more services every month, right? So suddenly you're not keeping up with that service. So we've looked at this from a point of view of, how do we get customers to focus on what they want to do and automate the things that we can help them with? >> Let me, let me rephrase the question if you don't mind, 'cause I didn't want to give the impression that you guys aren't, you guys have a great solution, but I think when I see enterprises, you know, they're transforming, right? So it's not so much the cloud innovators like you guys, it's really the mainstream enterprise. So I have to ask you, from a customer standpoint, what are some of the cultural things or technical reasons why they're not going faster? 'Cause maybe it's the pandemic forcing projects to be doubled down on, or some are going to be cut; this common theme of making things available faster, cheaper, stronger, more secure is what cloud does. What are some of the enterprise challenges that they have? >> Yeah, you know, it might be money, right, there are some cultural challenges, like Andy Jassy says, sometimes it's leadership, right? You want top-down leadership that takes a deterministic step towards transformation, then adequately funds the team with the right skills and the tools; a lot of that plays into it. And there's inertia typically in an existing process. And when you go to cloud, you can do 10X better; people see that, but it doesn't always percolate down to how you get there. So those challenges are compounded, and digital transformation leaders have to, you know, make that deliberate bet there, be more KPI-driven. One of the things we are seeing in companies that do well is that the leadership decides that here are our top business objectives and KPIs, and we want the software and the services and the cloud division to support those objectives. When they take that approach, transformation happens. But that is a lot easier said than done. >> Well, you're making it really easy with your solution. And we've done multiple interviews. I've got to say you're really onto something with this provisioning and the compliance bots. That's really strong, and it only gets stronger from there, with the trends of security being built in. Shruthi, got to ask you, since you're the customer, what's it like working with MontyCloud? It sounds so awesome; you're the customer, you're using it. What's your review, what's your take on them? >> Yeah, they are doing a pretty good job in helping us automate most of our workflows.
And when it comes to keeping a tab on the resources, the utilization of the resources, so we can keep a tab on the cost in turn, you know, their compliance bots and their cost optimization tab are pretty helpful. >> Yeah, well, you're knocking projects down from three weeks to days, looking good, I mean, looking real strong. Venkat, this is the track record you want to see with successful projects. Take a minute to explain what else is going on with MontyCloud, other use cases that you see that are really primed for MontyCloud's platform. >> Yeah, John, quick minute there. Autonomous cloud operations is the goal. It's never done, right? There's always some work that you do hands-on. But if you set a goal such that customers have a solution that automates most of the routine operations, then they can focus on the business. So we are going to relentlessly focus on the fact that autonomous operations will help digital transformation happen faster, and we can create a lot more value for customers if they deliver to their KPIs and objectives. So our investments in the platform are going more towards that. Today we already have a fully automated compliance bot, a security bot, a cost optimization recommendation engine, and a provisioning and governance engine. Where we're going is we are enhancing all of this and providing customers a lot more fluidity in how they can use our platform: click to perform your routine operations, click to set up rules-based automatic escalation or remediation. Cut down the number of hops a particular process will take, and foster collaboration. All of this is where our platform is going, enhancing more and more. We intend to learn more from our customers and deliver better for them as we move forward. >> That's a good business model: make things easier, reduce the steps it takes to do something, and save money. And you're doing all those things with the cloud; awesome stuff. It's really great to hear your success stories and the work you're doing over there. Great to see researchers getting resources and doing their jobs faster. And tons of data, you've got petabytes coming in. It's pretty impressive; thanks for sharing your story. >> Sounds good, and, you know, one quick call out is customers can go to MontyCloud.com today. Within 10 minutes, they can get an account. They get very actionable and valuable recommendations on where they can save costs and what security and compliance issues they can fix. There's a ton of out-of-the-box reports: one click to find out whether you have some data that is not encrypted, or if any of your servers are open to the world. A lot of value that customers can get in under 10 minutes. And we believe in that model: give the value to customers. They know what to do with that, right? So customers can go sign up for a free trial at MontyCloud.com today and get the value. >> Congratulations on your success and great innovation. A startup showcase here with theCUBE coverage of the AWS Startup Showcase, breaking through in DevOps, Data Analytics and Cloud Management with MontyCloud. I'm John Furrier, thanks for watching. (gentle music)
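The 'one click to find out whether you have some data that is not encrypted, or if any of your servers are open to the world' idea maps to checks that can be scripted directly against AWS APIs. The sketch below is not MontyCloud's implementation, just a minimal illustration of two such checks using boto3; it assumes AWS credentials are already configured in the environment.

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
ec2 = boto3.client("ec2")

def buckets_without_default_encryption():
    """Flag S3 buckets that have no default encryption configuration."""
    findings = []
    for bucket in s3.list_buckets()["Buckets"]:
        try:
            s3.get_bucket_encryption(Bucket=bucket["Name"])
        except ClientError as err:
            if err.response["Error"]["Code"] == "ServerSideEncryptionConfigurationNotFoundError":
                findings.append(bucket["Name"])
    return findings

def security_groups_open_to_world():
    """Flag security groups with an ingress rule that allows 0.0.0.0/0."""
    findings = []
    for sg in ec2.describe_security_groups()["SecurityGroups"]:
        for rule in sg.get("IpPermissions", []):
            if any(r.get("CidrIp") == "0.0.0.0/0" for r in rule.get("IpRanges", [])):
                findings.append(sg["GroupId"])
                break
    return findings

if __name__ == "__main__":
    print("Buckets without default encryption:", buckets_without_default_encryption())
    print("Security groups open to the world:", security_groups_open_to_world())
```

A compliance bot of the kind described in the interview would run hundreds of checks like these on a schedule and feed the findings into alerts, tickets, or automated remediation playbooks.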

Published Date : Sep 22 2021

Arun Murthy, Hortonworks | theCUBE NYC 2018


 

>> Live from New York, it's theCUBE, covering theCUBE New York City 2018, brought to you by SiliconANGLE Media and its ecosystem partners. >> Okay, welcome back everyone, here live in New York City for CubeNYC, formerly Big Data NYC, now called CubeNYC. The topic has moved beyond big data. It's about cloud, it's about data, it's also about potentially blockchain in the future. I'm John Furrier with Dave Vellante. We're happy to have a special guest here, Arun Murthy. He's the cofounder and chief product officer of Hortonworks, been in the ecosystem from the beginning, at Yahoo, already been on theCUBE many times, but great to see you, thanks for coming in, >> My pleasure, >> appreciate it. >> thanks for having me. >> Super smart to have you on here, because a lot of people have been squinting through the noise of the marketplace. You guys have been on this DataPlane idea for a few years now; you guys actually launched Hadoop with Cloudera, they were first, you came after out of Yahoo, the second of two big players. You evolved it quickly, you guys saw early on that this is bigger than Hadoop. And now all the conversations are on what you guys were talking about three years ago. Give us the update, what's the product update? How is hybrid a big part of that, what's the story? >> We started off being the Hadoop company, and Rob, our CEO, who was here on theCUBE a couple of hours ago, he calls it sort of the phase one of the company, where we were a Hadoop company. We very quickly realized we had to help enterprises manage the entire life cycle of data, all the way from the edge to the data center, to the cloud, and in between, right. Which is why we did the acquisition of Onyara, we've been talking about it, which kind of became the basis of our Hortonworks DataFlow product. And then as we went through that phase of the journey, it was quickly obvious to us that enterprises had to manage data and applications in a hybrid manner, right, which is both on-prem and public cloud and increasingly edge, which is where we spend a lot of time these days, with IoT and everything from autonomous cars to video monitoring to all these aspects coming in. Which is why we wanted to get to the DataPlane architecture; it allows you to get to a consistent security and governance model. There's a lot of, I'll call it a lot of fight about the cloud being insecure and so on; I don't think there's anything inherently insecure about the cloud. The issue that we see is lack of skills. Our enterprises know how to manage the data on-prem, they know how to do LDAP groups and Kerberos and AAD and what have you; they just don't have the skill sets yet to be able to do it on the public cloud, which leads to mistakes occasionally. >> Um-hm. >> And data breaches and so on. So we recognized really early that part of DataPlane was to get that consistent security and governance model, so you don't have to worry about how you set up IAM roles on Amazon versus LDAP on-prem versus something else on Google. >> It's operating consistency. >> It's operating consistency, exactly. I've talked about this in the past. So getting to DataPlane was that journey, and what we announced this week was that we wanted to take that a step further; we've been able to kind of allow enterprises to manage this hybrid architecture on-prem and on multiple public clouds. >> And the edge. >> In a connected manner. The issue we saw early on, and it's something we've been working on for a long while,
is how we connect the architectures. Hadoop, when it started, was more of an on-premise architecture, right, and I was there in 2005, 2006 when it started. When Hadoop was built, we had a gigabit of ethernet up to the rack, and from the rack on we had only eight gigs, so if you have a 2,000-node cluster you're dealing with eight gigs of connection. >> Bottleneck. >> Huge bottleneck. Fast forward to today, you have at least ten if not one hundred gigabits, moving from one-hundred-gigabit to terabit architectures from that standpoint, and then what's happening is, in that world, you have the opportunity to rethink the assumptions we had in Hadoop. And then the good news is that when cloud came along, cloud already had decoupled storage and compute architectures. As we've sort of helped customers navigate the two worlds with DataPlane, it's been a journey that's been reasonably successful, and I think we have an opportunity to kind of provide identical, consistent architectures both on-prem and on cloud. So it's almost like we took Hadoop and adapted it to cloud; I think we can adapt the cloud architecture back on-prem too, to have consistent architectures. >> So talk about the cloud native architecture. So you have a post that just got published, cloud native architecture for big data and the data center. No, cloud native architecture for big data in the data center. That's hybrid; explain the hybrid model, how do you define that? >> Like I said, for us it's really important to be able to have consistent architectures, consistent security, consistent governance, a consistent way to manage data, and a consistent way to actually develop and port applications. So portability for data is important, which is why having security and governance consistently is key. And then portability for the applications themselves is important, which is why we are so excited to kind of be first to embrace the whole containerization of the ecosystem initiative. We've announced the Open Hybrid Architecture Initiative, which is about decoupling storage and compute and then leveraging containers for all the big data apps, for the entire ecosystem. And this is where we are really excited to be working with both IBM and Red Hat, especially Red Hat given their sort of investments in Kubernetes and OpenShift. We see that much like you'll have S3 and EC2, S3 for storage, EC2 for compute, and the same thing with ADLS and Azure compute, you'll actually have the next-gen HDFS and Kubernetes. >> So is this a massive architectural rewrite, or is it more sort of management around the core? >> Great question. So part of it is evolution of the architecture. We have to get, whether it's Spark or Kafka or any of these open source projects, we need to do some evolution in the architecture to make them work in the ecosystem, in the containerized world. So we are containerizing every one of the 28, 30 animals in the zoo, right. That's a lot of work; we kind of, you know, sort of know how to do it, we've done it in the past. And to your point, it's not enough to just have the architecture; you need to have a consistent fabric to be able to manage and operate it, which is really where DataPlane comes in again. That was really the point of DataPlane all along. This is a multi-year roadmap; you know, when we sit down we are thinking about what we'll do in '22 and '23. But we really have to execute on a multi-year roadmap.
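The decoupled storage-and-compute point is easiest to see in code. Here is a minimal PySpark sketch of the idea Murthy describes: the same application logic runs against an on-prem HDFS path or a cloud object store, with only the storage URI changing. The host name and bucket are placeholders, and reading from S3 assumes the S3A connector and credentials are configured on the cluster.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hybrid-storage-demo").getOrCreate()

# Same application code either way; only the storage URI differs.
events_onprem = spark.read.parquet("hdfs://namenode.example.com:8020/warehouse/events/")
events_cloud = spark.read.parquet("s3a://example-bucket/warehouse/events/")

# Identical transformations regardless of where the bytes live.
daily_counts = events_cloud.groupBy("event_date").count()
daily_counts.show()
```

Containerizing the compute side, for example running Spark executors as pods on Kubernetes or OpenShift, pushes the same separation further: storage stays in HDFS or an object store, while compute is scheduled wherever capacity exists.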
>> And DataPlane was a linchpin. >> Well, it was just like the sharp edge of the sword, right, it was the tip of the spear, but really the idea was always that we have to get DataPlane in to kind of get that hybrid product out there. And then we can sort of get to an intergenerational DataPlane which would work with the next generation of the big data ecosystem itself. >> Do you see Kubernetes and things like Kubernetes, you've got Istio, a few service meshes up the stack, >> Absolutely. >> are going to play a pretty instrumental role around orchestrating workloads and providing new stateless and stateful applications with data? So now you've got more data being generated there. So this is a new dynamic; it sounds like that's a fit for what you guys are doing. >> Which is something we've seen for a while now. Like, containers are something we've tracked for a long time, and we're really excited to see Docker and Red Hat and all the work that they are doing with containers to get the security and so on right. It's the maturing of that ecosystem. And now, the ability to build and port applications. And the really cool part for me is that we will definitely see Kubernetes and OpenShift on-prem, but even if you look at the cloud, the really nice part is that each of the cloud providers themselves provides a Kubernetes service. Whether it's GKE on Google or Fargate on Amazon or AKS on Microsoft, we will be able to take identical architectures and leverage them. When we containerize Hive or Kafka or Spark, we will be able to do this with Kubernetes and OpenShift, which is available in the public cloud, but also with GKE and Fargate and AKS. >> What's interesting about the Red Hat relationship, and I think you guys are smart to do this, is that by partnering with Red Hat, customers can run their workloads, analytical workloads, in the same production environment that Red Hat is in, but with kind of a differentiation, if you will. >> Exactly, with DataPlane. >> DataPlane is just a wonderful thing there. So again, good move there. Now, around the ecosystem: who else are you partnering with? What else do you see out there? Who is in your world that is important? >> You know, again, our friends at IBM, we've had a long relationship with them. We are doing a lot of work with IBM to integrate DataPlane and also ICPD, which is IBM Cloud Private for Data, which brings along all of the IBM ecosystem, whether it's DBT or IGC, the Information Governance Catalog; all of that kind of works back into this world. What we also believe this will give a fillip to is the whole continued standardization of security and governance. So you guys remember the old ODPi, it caused a bit of a flutter a few years ago. (anxious laughing) >> We know how that turned out. >> What we did was we kind of said, the old ODPi was based on the old distributions; now it's ODPi's turn to be more about metadata and governance. So we are collaborating with IBM on ODPi more on metadata and governance, because again we see that as being very critical in this sort of multi-cloud, on-prem, edge world. >> Well, the narrative was always, why do you need it, but it's clear that these three companies have succeeded dramatically when you look at the financials; there have been statements made about IBM's contribution of seven-figure deals to you guys. We had Red Hat on, and you guys are birds of a feather. [Murthy] Exactly. >> It certainly worked for you three, which presumably means it confers value to your customers.
>> Which is really important, right. From a customer standpoint, what is something we really focus on is that the benefit of the bargain is that now they understand that some of their key vendor partners, that's us and IBM and Red Hat, have a shared roadmap, so now they can be much more sure about the fact that they can go to containers and Kubernetes and so on. Because all of the tools that they depend on and all the partners they depend on are working together. >> So they can place bets. >> So they can place bets, and the important thing is that they can place longer-term bets. Not a one-quarter bet; we hear about customers talking about building the next-gen data centers with Kubernetes in mind. >> They have to. >> They have to, right, and it's more than just building machines up, because what happens in this world, we talked about things like networking; the way you do networking in this world with Kubernetes is different than the way you did it before. So now they have to place longer-term bets, and they can do this now with the guarantee that the three of us will work together to deliver on the architecture. >> Well, Arun, great to have you on theCUBE, great to see you. Final question for you: you guys have a good long plan, which is very cool. Short term, customers are realizing the set-up phase is over, okay, now they're in usage mode. So the data has got to deliver value, so there is a real pressure for ROI. We would give people a little bit of a pass earlier on, because you had to set up everything, set up the data lakes, do all this stuff, get it all operationalized, but now, with AI and machine learning front and center, that's a signal that people want to start putting this to work. What have you seen customers gravitate to from the product side? Where are they going, is it the streaming, is it the Kafka, what products are they gravitating to? >> Yeah, definitely. I look at these, in my role, in terms of use cases, right; we are certainly seeing a continued push towards the real-time analytics space, which is why we placed a longer-term bet on HDF and Kafka and so on. What's been really heartening, kind of back to your sentiment, is we are seeing a lot of push right now on security and governance. That's why, for GDPR, we introduced a bunch of capabilities in DataPlane, with DSS, and James Kobielus wrote about this earlier in the year; we are seeing customers really push us for key aspects like GDPR. This is a reflection for me of the maturing of the ecosystem. It means that it's no longer something on the side that you play with; the whole ecosystem is now more a system of record instead of a system of augmentation, so that is really heartening, but it also brings a sharper focus and more sort of responsibility onto our shoulders. >> Awesome, well, congratulations, you guys have stock prices at a 52-week high. Congratulations. >> Those things take care of themselves. >> Good products, and stock prices take care of themselves. >> Okay, theCUBE coverage here in New York City. I'm John Furrier with Dave Vellante; stay with us for more live coverage, all things data, happening here in New York City. We will be right back after this short break. (digital beat)

Published Date : Sep 12 2018

Arun Murthy, Hortonworks | DataWorks Summit 2018


 

>> Live from San Jose in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2018, brought to you by Hortonworks. >> Welcome back to theCUBE's live coverage of DataWorks here in San Jose, California. I'm your host, Rebecca Knight, along with my cohost, Jim Kobielus. We're joined by Aaron Murphy, Arun Murthy, sorry. He is the co-founder and chief product officer of Hortonworks. Thank you so much for returning to theCUBE. It's great to have you on. >> Yeah, likewise. It's been a fun time getting back, yeah. >> So you were on the main stage this morning in the keynote, and you were describing the journey, the data journey, that so many customers are on right now, and you were talking about the cloud, saying that the cloud is part of the strategy but it really needs to fit into the overall business strategy. Can you describe a little bit about your approach to that? >> Absolutely, and the way we look at this is we help customers leverage data to actually deliver better capabilities, better services, better experiences to their customers, and that's the business we are in. Now, with that, obviously we look at cloud as a really key part of it, of the overall strategy, in terms of how you want to manage data on-prem and on the cloud. We kind of joke that we ourselves live in a world of real-time data. We just live in it and data is everywhere. You might have trucks on the road, you might have drones, you might have sensors, and you have it all over the world. We've kind of got to a point where enterprises understand that they could manage all the infrastructure, but in a lot of cases it will make a lot more sense to actually lease some of it, and that's the cloud. It's the same way, if you're delivering packages, you don't go buy planes and lay out roads, you go to FedEx and actually let them handle that for you. That's kind of what the cloud is. So that is why we really fundamentally believe that we have to help customers leverage infrastructure however it makes sense pragmatically, both from an architectural standpoint and from a financial standpoint, and that's kind of why we talk about how your cloud strategy is part of your data strategy, which is actually fundamentally part of your business strategy. >> So how are you helping customers to leverage this? What is on their minds and what's your response? >> Yeah, it's really interesting. Like I said, cloud is cloud, and infrastructure management is certainly something that's at the forefront, at the top of the mind for every CIO today. And what we've consistently heard is they need a way to manage all this data and all this infrastructure in a hybrid, multi-tenant, multi-cloud fashion. Because in some geos you might not have your favorite cloud vendor. You know, parts of Asia is a great example. You might have to use one of the Chinese clouds. You go to parts of Europe, especially with things like GDPR, the data residency laws and so on, and you have to be very, very cognizant of where your data gets stored and where your infrastructure is present. And that is why we fundamentally believe it's really important to give enterprises a fabric with which they can manage all of this, and hide the details of all of the underlying infrastructure from them as much as possible. >> And that's DataPlane Services. >> And that's DataPlane Services, exactly. >> The Hortonworks DataPlane Services we launched in October of last year. Actually I was on theCUBE talking about it back then too.
We see a lot of interest, a lot of excitement around it, because now they understand that, again, this doesn't mean that we drive it down to the least common denominator. It is about helping enterprises leverage the key differentiators of each of the cloud vendors' products. For example, Google, with which we announced a partnership, they are really strong on AI and ML. So if you are running TensorFlow and you want to deal with things like Kubernetes, GKE is a great place to do it. And, for example, you can now go to Google Cloud and get TPUs, which work great for TensorFlow. Similarly, a lot of customers run on Amazon for a bunch of the operational stuff, Redshift as an example. So in the world we live in, we want to help the CIO leverage the best pieces of the cloud but then give them a consistent way to manage and govern that data. We were joking on stage that IT has just about learned how to deal with Kerberos and Hadoop, and now we're telling them, "Oh, go figure out IAM on Google," which is also IAM on Amazon, but they are completely different. The only thing that's consistent is the name. So I think we have a unique opportunity, especially with the open source technologies like Atlas, Ranger, Knox and so on, to be able to draw a consistent fabric over this for security and governance, and help the enterprise leverage the best parts of the cloud to put a best-fit architecture together, which also happens to be a best-of-breed architecture. >> So the fabric is everything you're describing; all the Apache open source projects in which Hortonworks is a primary committer and contributor are able to share schemas and policies and metadata and so forth across this distributed, heterogeneous fabric of public and private cloud segments within a distributed environment. >> Exactly. >> That's increasingly being containerized, in terms of the applications, for deployment to edge nodes. Containerization is a big theme in HDP 3.0, which you announced at this show. >> Yeah. >> So, if you could give us a quick sense for how that containerization capability plays into more of an edge focus for what your customers are doing. >> Exactly, great point, and again, the core parts of the fabric are the open source projects, but we've also done a lot of net new innovation with DataPlane which, by the way, is also open source. It's a new product and a new platform that you can actually leverage to layer over the open source ones you're familiar with. And again, like you said, containerization is what is actually driving the fundamentals of this. The details matter; at the scale at which we operate, we're talking about thousands of nodes, terabytes of data, the details really matter, because a 5% improvement at that scale leads to millions of dollars in optimization for capex and opex. So that's why all of that, the details, are being fueled and driven by the community, which is kind of what we deliver with HDP 3. And the key ones, like you said, are containerization, because now we can actually get complete agility in terms of how you deploy the applications. You get isolation not only at the resource management level with containers, but you also get it at the software level, which means, if two data scientists want to use different versions of Python or Scala or Spark or whatever it is, they get that consistently and holistically, and now they can actually go from the test-dev cycle into production in a completely consistent manner.
So that's why containers are so big, because now we can actually leverage them across the stack, and then things like MiNiFi are showing up. We can actually-- >> Define MiNiFi before you go further. What is MiNiFi for our listeners? >> Great question. Yeah, so we've always had NiFi-- >> Real-time. >> Real-time data flow management, and NiFi was still sort of within the data center. What MiNiFi is, is actually a really, really small layer, a small thin library if you will, that you can throw on a phone, a doorbell, a sensor, and that gives you all the capabilities of NiFi but at the edge. >> Mmm. >> Right? And it's actually not just data flow; what is really cool about NiFi is it's actually command and control. So you can actually do bidirectional command and control, so you can actually change in real time the flows you want, the processing you do, and so on. So what we're trying to do with MiNiFi is actually not just collect data from the edge but also push the processing as much as possible to the edge, because we really do believe a lot more processing is going to happen at the edge, especially with the ASICs and so on coming out. There will be custom hardware that you can throw in and essentially leverage that hardware at the edge to actually do this processing. And we believe, you know, we want to do that even at the cost of data not actually landing up at rest, because at the end of the day we're in the insights business, not in the data storage business. >> Well, I want to get back to that. You were talking about innovation and how so much of it is driven by the open source community, and you're a veteran of the big data open source community. How do we maintain that? How does that continue to be the fuel? >> Yeah, and a lot of it starts with just being consistent. From day one, James was around back then, in 2011 when we started, we've always said, "We're going to be open source," because we fundamentally believed that the community is going to out-innovate any one vendor regardless of how much money they have in the bank. So we really do believe that's the best way to innovate, mostly because there is a sense of shared ownership of that product. It's not just one vendor throwing some code out there, trying to shove it down the customer's throat. And we've seen this over and over again, right. A lot of the DataPlane stuff we talk about comes from Atlas and Ranger and so on; three years ago, none of these existed. These actually came from the fruits of the collaboration with the community, with actually some very large enterprises being a part of it. So it's a great example of how we continue to drive it, because we fundamentally believe that that's the best way to innovate, and we continue to believe so. >> Right. And the community, the Apache community as a whole, has so many different projects; for example, in streaming, there is Kafka, >> Okay. >> and there are others that address a core set of common requirements but in different ways, >> Exactly. >> supporting different approaches, for example, doing streaming with stateless transactions and so forth, or stateless semantics and so forth. It seems to me that Hortonworks is shifting towards being more of a streaming-oriented vendor, away from data at rest. Though, I should say, HDP 3.0 has got great scalability and storage efficiency capabilities baked in.
I wonder if you could just break down a little bit what the innovations or enhancements are in HDP 3.0 for your core customers, which is most of them, who are managing massive multi-terabyte, multi-petabyte, distributed, federated big data lakes. What's in HDP 3.0 for them? >> Oh, lots. Again, like I said, we obviously spend a lot of time on the streaming side because that's where we see things going. We live in a real-time world. But again, we don't do it at the cost of our core business, which continues to be HDP. And as you can see, the community continues to drive it; we talked about containerization, a massive step up for the Hadoop community. We've also added support for GPUs. Again, if you think about at-scale machine learning. >> Graphics processing units. >> Graphical-- >> AI, deep learning. >> Yeah, it's huge. Deep learning, TensorFlow and so on really, really need custom hardware, sort of GPUs, if you will. So that's coming; that's in HDP 3. We've added a whole bunch of scalability improvements with HDFS. We've added federation, because now you can go over a billion files, a billion objects in HDFS. We also added capabilities for-- >> But you indicated yesterday when we were talking that very few of your customers need that capacity yet, but you think they will, so-- >> Oh, for sure. Again, part of this is, as we enable more sources of data in real time, that's the fuel which drives it, and that was always the strategy behind the HDF product. It was about, can we leverage the synergies between the real-time world, feed that into what you do today in your classic enterprise with data at rest, and that is what is driving the necessity for scale. >> Yes. >> Right. We've done that. We spent a lot of work, again, lowering the total cost of ownership, the TCO, so we added erasure coding. >> What is that exactly? >> Yeah, so erasure coding is a classic sort of storage concept. You know, HDFS has always had three replicas, for redundancy, fault tolerance and recovery. Now, it sounds okay having three replicas because it's cheap disk, right. But when you start to think about our customers running 70, 80 petabytes of data, those three replicas add up, because you've now gone from 80 petabytes of effective data to actually a quarter of an exabyte in terms of raw storage. So now what we can do with erasure coding is, instead of storing three copies of each block, we actually store parity. We store the encoding of it, which means we can actually go down from three to, like, two, one and a half, whatever we want to do. So, if we can get from three copies to one and a half, especially for your cold data, >> Yeah. >> the data you're not accessing every day, it results in massive savings in terms of your infrastructure costs. And that's kind of the business we're in: helping customers do better with the data they have, whether it's on-prem or on the cloud. We want to help customers be comfortable getting more data under management, secured, and at a lower TCO. The other sort of big piece I'm really excited about in HDP 3 is all the work that's happened in the Hive community for what we call the real-time database. >> Yes. >> As you guys know, you follow the whole SQL-on-Hadoop space. >> And Hive has changed a lot in the last several years; this is very different from what it was five years ago.
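Picking up the erasure coding point from a moment ago, the storage arithmetic is easy to sketch. The figures below are illustrative only, not customer data; the 6+3 layout is the common Reed-Solomon policy that HDFS erasure coding supports, which is where the "one and a half" overhead comes from.

```python
def raw_storage(effective_pb: float, data_units: int, parity_units: int) -> float:
    """Raw petabytes on disk for a given amount of effective (logical) data."""
    overhead = (data_units + parity_units) / data_units
    return effective_pb * overhead

# Classic 3x replication: every block is stored three times (overhead 3.0).
print(raw_storage(80, 1, 2))   # 240.0 PB raw, roughly a quarter of an exabyte

# Reed-Solomon 6+3 erasure coding: six data units plus three parity units (overhead 1.5).
print(raw_storage(80, 6, 3))   # 120.0 PB raw, half the footprint of replication
```

Replication tolerates the loss of two copies of a block, while RS 6+3 tolerates the loss of any three units in a block group, at the price of extra CPU and network work on reads and reconstruction, which is why erasure coding is typically applied to cold data.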
>> The only thing that's the same from five years ago is the name (laughing) >> So again, the community has done a phenomenal job, kind of really taking sort of, what we used to call like a SQL engine on HDFS. From there, to where we've driven it with 3.0, it's now, with Hive 3 which is part of HDP 3, a full-fledged database. It's got full ACID support. In fact, the ACID support is so good that writing ACID tables is at least as fast as writing non-ACID tables now. And you can do that not only on-- >> Transactional database. >> Exactly. Now not only can you do it on-prem, you can do it on S3. So you can actually drive the transactions through Hive on S3. We've done a lot of work to actually, you were there yesterday when we were talking about some of the performance work we've done with LLAP and so on, to actually give consistent performance both on-prem and in the cloud, and this is a lot of effort simply because the performance characteristics you get from the storage layer with HDFS versus S3 are significantly different. So now we have been able to bridge those with things like LLAP. We've done a lot of work and sort of enhanced the security model around it, governance and security. So now you get things like column-level masking, row-level filtering, all the standard stuff that you would expect and more from an enterprise warehouse. We talked to a lot of our customers, they're doing literally tens of thousands of views because they don't have the capabilities that exist in Hive now. >> Mmm-hmm. And I'm sitting here kind of being amazed that for an open source set of tools to have the best security and governance at this point is pretty amazing coming from where we started off. >> And it's absolutely essential for GDPR compliance, and compliance with HIPAA and every other mandate and sensitivity that requires you to protect personally identifiable information, so very important. So in many ways Hortonworks has one of the premier big data catalogs for all manner of compliance requirements that your customers are chasing. >> Yeah, and James, you wrote about it in the context of Data Steward Studio which we introduced >> Yes. >> You know, things like consent management, having-- >> A consent portal >> A consent portal >> In which the customer can indicate the degree to which >> Exactly. >> they require controls over their management of their PII, possibly to be forgotten and so forth. >> Yeah, it's the right to be forgotten, and it's consent even for analytics. Within the context of GDPR, you have to allow the customer to opt out of analytics, of them being part of an analytic itself, right. >> Yeah. >> So things like those are now something we enable through the enhanced security models that are done in Ranger. So now, sort of the really cool part of what we've done with GDPR is that we can get all these capabilities on existing data and existing applications by just adding a security policy, not rewriting them. It's a massive, massive, massive deal which I cannot tell you how much customers are excited about, because they now understand. They were sort of freaking out that, I have to go to 30, 40, 50 thousand enterprise apps and change them to take advantage, to actually provide consent, and the right to be forgotten. The fact that you can do that now by changing a security policy with Ranger is huge for them. >> Arun, thank you so much for coming on theCUBE. It's always so much fun talking to you. >> Likewise. Thank you so much. >> I learned something every time I listen to you. >> Indeed, indeed.
I'm Rebecca Knight for James Kobielus, we will have more from theCUBE's live coverage of DataWorks just after this. (Techno music)
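For a rough sense of what "add a security policy instead of rewriting tens of thousands of applications" means in the GDPR and Ranger discussion above, the sketch below applies a consent-aware row filter and a column mask centrally, at the data access layer. It is generic Python, not Ranger's actual API; the policy shape, field names, and masking rule are illustrative assumptions.

```python
# Illustrative sketch of centralized, policy-driven masking and row filtering,
# in the spirit of what Ranger does for Hive tables. This is generic Python,
# not Ranger's API; the policy structure and field names are assumptions.

POLICY = {
    "mask_columns": {"ssn": lambda v: "***-**-" + v[-4:]},            # column-level masking
    "row_filter": lambda row: row.get("analytics_consent") is True,   # consent-aware filter
}

def apply_policy(rows, policy):
    """Apply row filters and column masks centrally, so apps need no code changes."""
    for row in rows:
        if not policy["row_filter"](row):
            continue                      # e.g. this user opted out of analytics
        out = dict(row)
        for col, mask in policy["mask_columns"].items():
            if col in out:
                out[col] = mask(out[col])
        yield out

customers = [
    {"name": "Alice", "ssn": "123-45-6789", "analytics_consent": True},
    {"name": "Bob",   "ssn": "987-65-4321", "analytics_consent": False},
]
print(list(apply_policy(customers, POLICY)))
# Only Alice comes back, with her SSN masked; Bob is excluded without touching any app.
```

The design point is the one made in the conversation: the consuming applications stay unchanged, and consent or masking rules change in exactly one place.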

Published Date : Jun 19 2018


Murthy Mathiprakasam, Informatica | Big Data SV 2018


 

>> Narrator: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to theCUBE, we are live in San Jose, at Forger Eatery, super cool place. Our first day of our two days of coverage at our event called Big Data SV. Down the street is the Strata Data Conference, and we've got some great guests today that are going to share a lot of insight and different perspectives on Big Data. This is our 10th big data event on theCUBE, our fifth in San Jose. We invite you to come on down to Forger Eatery and we also invite you to come down this evening. We've got a party going on and we've got a really cool breakfast presentation on the analyst side in the morning. Our first guest needs no introduction to theCUBE, he's a Cube alumni, Murthy Mathiprakasam, did I get that right? >> Murthy: Absolutely. >> Murthy, awesome, as we're going to call him. The director of product marketing for Informatica, welcome back to theCUBE, it's great to have you back. >> Thanks for having me back, and congratulations on the 10 year anniversary. >> Yeah! So, interesting, exciting news from Informatica in the last two days, tell us about a couple of those big announcements that you guys just released. >> Absolutely, yes. So this has been a very exciting year for us, lots of, you know, product innovations and announcements. So just this week alone, actually there's one announcement that's probably going out right now as we speak, around API management. So one of the things, we were probably talking about it before we started the interview, you know, around the trend toward cloud, lots of people doing a lot more data integration and application integration in the cloud space. But they face all the challenges that we've always seen in the data management space, around developer productivity, and hand coding, just a lot of complexity that organizations have around maintenance. So one of the things that Informatica always brought to every domain that we cover is this ability to kind of abstract the underlying complexity, use a graphical user interface, make things at the logical level instead of the physical level. So we're bringing that entire kind of paradigm to the API management space. That's going to be very exciting, very game changing on the kind of app-to-app integration side of things. Back in the data world of course, which is what we're, you know, mainly talking about here today, we're doing a lot there as well. So we announced kind of a next generation of our data management platforms for the big data world, and part of that is also a lot of cloud capabilities. 'Cause again, one of the bigger trends. >> Peter: Have you made a big bet there? >> Absolutely, and I mean this is the investment, return on investments over like 10 years, right? We started in the kind of cloud game about 10 years ago with our platform-as-a-service offering. So that has been continuously innovated on and we've been engineering, re-imagining that, to now include more of the big data stuff in it too, because more and more people are building data lakes in the cloud. So it's actually quite surprising, you know, the rate at which the data lake kind of projects are now either migrating or just starting in the cloud environments. So that being the trend, we were kind of innovating against that as well. So now our platform-as-a-service offering supports the ability to connect to data sources in the cloud natively. You can do processing and ingestion in the cloud.
So there's a lot of really cool capabilities, again it's kind of bringing the Informatica ease of use, and kind of acceleration that comes with the platform approach, to the cloud environment. And there's a whole bunch of other announcements too, I mean I could spend 20 minutes just on different innovations, but you know, bringing artificial intelligence into the platform, so we can talk more about that. >> Well I want to connect what you just announced with the whole notion of the data lake, 'cause really Informatica's strength has always been in between. And it turns out that's where a lot of the enterprise problems have been. So the data lake has been there, but it's been big, it's been large, it was big data and the whole notion is make this as big as you can and we'll figure out what to do with it later. >> Murthy: Right. >> And now you're doing the API, which is just an indication that we're seeing further segmentation and a specificity, a targeting of how we're going to use data, the value that we create out of data and apply it to business problems. But really Informatica's strength has been in between. >> Murthy: Absolutely. >> It's been in knowing where your data is, it's been in helping to build those pipelines and managing those pipelines. How have the investments that you've made over the last few years made it possible for you to actually deliver an API orientation that will actually work for enterprises? >> Yeah, absolutely, and I would actually phrase it as sort of platform orientation, but you're exactly right. So what's happening is, I view this as sort of the maturation of a lot of these new technologies. You know, Hadoop was a very very, as you were saying, kind of experimental technology four or five years ago. And we had customers too who were kind of in that experimental phase. But what's happening now is, big data isn't just a conversation with data engineers and developers, we're talking to CDOs, Chief Data Officers, and VPs of data infrastructure about using Hadoop for enterprise-scale projects. Now the minute you start having a conversation with a Chief Data Officer, you're not just talking about simple tools for ingestion and stuff like that. You're talking about security, you're talking about compliance, you're talking about GDPR if you're in Europe. So there's a whole host of sort of data management challenges that are now relevant for the big data world, just because the big data world has become mainstream. And so this is exactly to your point, where the investments that I think Informatica has been making, and bringing our kind of comprehensive platform-oriented approach to this space, are paying off. Because for a Chief Data Officer, they can't really do big data without those features. They can't not deal with security and compliance, they can't not deal with not knowing what the data is. 'Cause they're accountable for knowing what the data is, right? And so, there's a number of things that, by virtue of the maturation of the industry, I think the trends are pointing toward the enterprises kind of going more toward that platform approach. >> On that platform approach, Informatica's really one of the only vendors that's talking about that, and delivering it. So that clearly is an area of differentiation. Why do you think that's nascent, this platform approach versus a kind of fit-for-purpose approach? >> Yeah, absolutely.
And we should be careful with even the phrase fit-for-purpose too, 'cause I think that word gets thrown around a lot, it's one of those buzzwords in the industry. Because it's sort of the positive way of saying incomplete, you know? And so, I think there are vendors who have tried to kind of address, you know, one aspect, sort of one feature of the entire problem that a Chief Data Officer would care about. They might call it fit-for-purpose, but you have to actually solve a problem at the end of the day. The Chief Data Officers are trying to build enterprise data pipelines. You know, you've got raw information from all sorts of data sources, on premise, in the cloud. You need to push that through a process, like a manufacturing process, of being able to ingest it, repair it, cleanse it, govern it, secure it, master it. All this stuff has to happen in order to serve all the various communities that a Chief Data Officer has to serve. And so you're either doing all that or you're not. You know, that's the problem, that's the way we see the problem. And so the platform approach is a way of addressing the comprehensive set of problems that a Chief Data Officer, or these kind of data executives, care about, but also doing it in a way that fosters productivity and re-usability. Because the more you sort of build things in a kind of infrastructure-level way, as soon as the infrastructure changes you're hosed, right? So you're seeing a lot of this in the industry now too, where somebody built something in MapReduce three years ago, and as soon as Spark came out, they're throwing all that stuff away. And it's not just, you know, major changes like that, even versions of Spark, or versions of Hadoop, can sometimes trigger a need to recode and throw away stuff. And organizations can't afford this when you're talking about 40 to 50% growth in the data overall. The last thing you want to do is make an investment that you're going to end up throwing away. And so, the platform approach, to go back to your question, is the sort of most efficient pathway from an investment standpoint that an enterprise can take, to build something now that they can actually reuse and maintain and kind of scale in a very very pragmatic way. >> Well, let me push you on that a little bit. >> Murthy: Yeah. >> 'Cause what we would say is that the fit-to-purpose is okay so long as you're true about the purpose, and you understand what it means to fit. What a lot of the open source, a lot of companies have done, is they've got a fit-to-purpose but then they do make promises, they say, oh this is fit-to-purpose, but it's really a platform. And as a consequence you get a whole bunch of, you know, duck-like solutions, (laughing) that are, you know, are they swimming, or are they flying, kind of problems. So, I think that what we see clients asking for, and this is one of my questions, what we see clients asking for is, I want to invest in technologies that allow me to sustain my investments, including perhaps some of my mistakes, if they are generating business value. >> Murthy: Right. >> So it's not a rip and replace, that's not what you're suggesting. What you're suggesting I think is, you know, use what you got, if it's creating value continue to use it, and then over time, invest in the platform approach that's able to generate additional returns on top of it. Have I got that right? >> Absolutely. So it goes back to flexibility, that's the key word, I think that's kind of on the minds of a lot of Chief Data Officers.
I don't want to build something today that I know I'm going to throw away a year from now. >> Peter: I want to create options for the future. >> Create options. >> When I build them today. >> Exactly. So even the cloud, you were bringing it up earlier on, right? Not everybody knows exactly what their cloud strategy is. And it's changing extremely rapidly, right? We were seeing very few big data customers in the cloud maybe even a year or two ago. Now close to almost 50% of our big data business is people deploying off premise, I mean that's amazing, you know, in a period of just a year or two. So Chief Data Officers are having to operate in these extreme kind of high-velocity environments. The last thing you want to do is make a bet today with the knowledge that you're going to end up having to throw away that bet in six months or a year. So the platform approach is sort of like your insurance policy, because it enables you to design for today's requirements, but then very very quickly migrate or modify for new requirements that may be six months, a year or two down the line. >> On that front, I'd love for you to give us an example of a customer that has maybe in the last year, since you've seen so much velocity, come to you. But also had other technologies in their environment that, from a cost perspective, I mean, to Peter's point, are still generating value, business value. How do you help customers that have multiple different products, maybe exploring different multi-calibers, how do they come and start working with Informatica and not have to rip out other stuff, but be able to move forward and achieve ROI? >> So, it's really interesting kind of how people think about the whole rip and replace concept. So we actually had a customer dinner last night and I'm sitting next to a guy, and I was kind of asking a very similar question. Tell me about your technology landscape, you know, where are things going, where have things gone in the past, and he basically said there's a whole portfolio of technologies that they plan to obsolete. 'Cause they just know that, like they're probably, they don't even bother thinking about sustainability, to your point. They just want to use something just to kind of try it out. It's basically like a series of like three-month trials of different technologies. And that's probably why we see such a proliferation of different technologies, 'cause people are just kind of trying stuff out, but it's like, I know I'm going to throw this stuff out. >> Yeah but that's, I mean, let me make sure I got that. 'Cause I want to reconcile a point. That's if they're in pilot and the pilot doesn't work. But the minute it goes into production and value's being created, they want to be able to sustain that stream of value. >> This is a production environment. I'm glad you asked that question. So this is a customer that, and I'll tell you where I'm going with the point. So they've been using Informatica for over four years, for big data, which is essentially almost the entire time big data's been around. So the reason this customer's making the point is, Informatica's the only technology that has actually sustained, precisely for the point that you're bringing up, because their requirements have changed wildly during this time. Even the internal politics of who needs access to data, all of that has changed radically over these four years. But the platform has enabled them to actually make those changes, and it's, you know, been able to give them that flexibility.
Everything else as far as, you know, developer tools, you know, visualization tools, like every year there's some kind of new thing that sort of comes out. And I don't want to be terribly harsh, there's probably one or two kind of vendors that have also persisted in those other areas. But the point that they were trying to make, to your original point, is the point about sustainability. Like, at some point, to avoid complete and utter chaos, you got to have like some foundation in the data environment. There actually has to be something you can invest in today, knowing that as these changes internally and externally are happening, you can kind of count on it, and you can go to cloud, you can be on premise, you can have structured data, unstructured data, you know, for any type of data, any type of user, any type of deployment environment. I need something that I can count on, that's actually existed for four or more years. And that's where Informatica fits in. And meanwhile there's going to be a lot of other tools that, like this guy was saying, they're going to try out for three months or six months and that's great, but they're almost using it with the idea that they're going to throw it away. >> Couple questions here. What are some of the business values that you were stating, like this gentleman that you were talking to last night? What's the industry that he's in and also, are there any, like, stats or ranges you can give us? Like, reduction in TCO, or new business models opening up. What's the business impact that Informatica is helping these customers achieve? >> Yeah, absolutely, I'll use this example, he's, I can't mention the name of the company but it's an insurance company. >> Lisa: Lots of data. >> Lots of data, right. Not only do they have a lot of data, but there's a lot of sensitivity around the data. Because basically the only way they grow is by identifying patterns in consumers, and they want to look at it, if somebody's been using car insurance for maybe so long that they're ready to get married, they need home insurance. They have these like really really sophisticated models around human behavior, so they know when to go and position new forms of insurance. There's also obviously security and governance types of issues that are at play as well. So the sensitivity around data is very very important. So for them, the business value is increased revenue, and, you know, the ability to meet kind of regulatory pressure. I think that's generally, I mean every industry has some variant of that. >> Right. >> Cost reduction, increased revenue, you know, meeting regulatory pressures. And so Informatica facilitates that, because instead of having to hire armies of people, and having to change them out maybe every three months or six months 'cause the underlying infrastructure's changing, there's this one team, the Informatica team, that's actually existed for this entire journey. They just keep changing use cases and projects, and new data sets, new deployment models, but the platform is sort of fixed and it's something that they can count on, it's robust, it enables that kind of-- >> Peter: It's an asset. >> It's an asset that delivers that sustainable value that you were talking about. >> Last question, we've got about a minute left. In terms of delivering value, Informatica's not the only game in town, your competitors are kind of going with this M&A partnership approach. What makes Informatica stand out, why should companies consider Informatica?
>> So they say, like, what's the quote about it? Imitation is the most sincere form of flattery. Yeah! (laughing) I guess we should feel a little bit flattered, you know, by what we're seeing in the industry, but why, from a customer's standpoint, should they, you know, continue to rely on Informatica? I mean we keep pushing the envelope on innovations, right? So, one of the other areas that we innovated on is machine learning within the platform, because ultimately, if one of the goals of the platform is to eliminate manual labor, a great way to do that is to just not have people doing it in the first place. Have machines doing it. So we can automatically understand the structure of data without any human intervention, right? We can understand if there's a file and it's got customer names and, you know, cost and SKUs, it must be an order. You don't actually have to say that it's an order. We can infer all this, because of the machine learning that we have. We can give recommendations to people as they're using our platform, if you're using a data set and you work with another person, we can go to you and say hey, maybe this is a data set that you would be interested in. So those types of recommendations, predictions, discovery, totally change the economic game for an organization. 'Cause the last thing you want is to have 40 to 50% growth in data translate into 40 to 50% growth in labor. Like you just can't afford it. It's not sustainable, again, to go back to your original point. The only sustainable approach to managing data for the future is to have a machine learning-based approach, and so that's why, to your question, I think just gluing a bunch of stuff together still doesn't actually get to the nut of sustainability. You actually have to have, the glue has to have something in it, you know? And in our case it's the machine learning approach that ties everything together, that brings a data organization together, so they can actually deliver the maximum business value. >> Literally creates a network of data that delivers business value. >> You got it. >> Well Murthy, Murthy Awesome, thank you so much for coming back to theCUBE. >> Thank you! >> And sharing what's going on at Informatica and what's differentiating you guys. We wish you a great rest of the Strata Conference. >> Awesome, you as well. Thank you. >> Absolutely, we want to thank you for watching theCUBE. I'm Lisa Martin with Peter Burris, we are live in San Jose at the Forger Eatery, come down here and join us, we've got a really cool space, we've got a part-tay tonight, so come join us. And we've got a really interesting breakfast presentation tomorrow morning, so stick around and we'll be right back with our next guest after this short break. (fun upbeat music)
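A toy sketch of the kind of pattern-based inference described above, where a record containing a name, an address, and a SKU is flagged as a probable order without anyone declaring it. The regular expressions, signal names, and classification rules are simplified assumptions for illustration, not Informatica's actual CLAIRE logic.

```python
# Toy sketch of inferring what a record probably represents from the fields it
# contains, in the spirit of the pattern matching described above. The rules
# and signal names are simplified assumptions, not Informatica's actual logic.
import re

SIGNALS = {
    "name":    re.compile(r"^[A-Z][a-z]+ [A-Z][a-z]+$"),
    "address": re.compile(r"\d+ .+ (St|Ave|Rd|Blvd)\b"),
    "sku":     re.compile(r"^[A-Z]{3}-\d{4}$"),
    "amount":  re.compile(r"^\$?\d+(\.\d{2})?$"),
}

def detect_signals(record):
    """Collect which kinds of values appear anywhere in the record."""
    found = set()
    for value in record.values():
        for label, pattern in SIGNALS.items():
            if isinstance(value, str) and pattern.search(value):
                found.add(label)
    return found

def classify(record):
    signals = detect_signals(record)
    if {"name", "address", "sku"} <= signals:
        return "order"        # customer details plus a product SKU
    if {"name", "address"} <= signals:
        return "customer"
    return "unknown"

row = {"c1": "Jane Doe", "c2": "42 Main St", "c3": "ABC-1234", "c4": "$19.99"}
print(classify(row))   # -> "order", inferred without any schema being declared
```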

Published Date : Mar 7 2018


Murthy Mathiprakasam, Informatica | Big Data NYC 2017


 

>> Narrator: Live from midtown Manhattan, it's theCUBE. Covering BigData New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. >> Welcome back everyone, we're here live in New York City for theCUBE's coverage of BigData NYC, our event we've been running for five years. We've been covering the BigData space for eight years, since 2010 when it was Hadoop World, Strata Conference, Strata Hadoop, Strata Data, soon to be called Strata AI, just to name a few. We've been theCUBE for all eight years. Here, live in New York, I'm John Furrier. Our next guest is Murthy Mathiprakasam, who is the Director of Product Marketing at Informatica. Cube alumni, has been on many times, we cover Informatica World every year. Great to see you, thanks for coming by and coming in. >> Great to see you. >> You guys do data, so there's not a lot of recycling going on in the data, because we've been talking about it all week, total transformation, but the undercurrent has been a lot of AI, AI this, and you guys have the CLAIRE product, doing a lot of things there. But outside of the AI, the undertone is cloud, cloud, cloud. Governance, governance, governance. There's two kinds of drivers I'm seeing as the force of this week: a lot of people trying to get their act together on those two fronts, and you can kind of see the scabs on the industry, people, some people haven't been paying attention. And they're weak in the area. Cloud is absolutely going to be driving the BigData world, 'cause data is horizontal. Cloud's the power source, and you guys have been on that. What's your thoughts, what other drivers encourage you? (mumbles) what I'm saying and what else did I miss? Security is obviously in there, but-- >> Absolutely, no, so I think you're exactly right on. So obviously governance, security is a big deal, largely being driven by the GDPR regulation that's happening in Europe. But, I mean, every company today is global, so everybody's essentially affected by it. So, I think data until now has always been a kind of opportunistic thing, that there's a couple guys in their organizations who were looking at it as, oh, let's do some experimentation. Let's do something interesting here. Now, it's becoming government-mandated, so I think there's a lot of organizations who are, like, to your point, getting their act together, and that's driving a lot of demand for data management projects. So now, people say, well, if I've got to get my act together, and I don't want to have to hire armies of people to do it, let me look for automated, machine learning-based ways of doing it, so that they can actually deliver on the audit reports that they need to deliver on, and ensure the compliance that they need to ensure, but do it in a very scalable way. >> I've been kind of joking all week, and I kind of had this meme in my head, so I've been pounding on it all week, calling it the tool shed problem. The tool shed problem is, everyone's got these tools. They throw them into the tool shed. They bought a hammer and the company that sells them the hammer is trying to turn it into a lawnmower, right? You can't mow your lawn with a hammer, it's not going to work, and so these tools are great, but a tool defines the work you do. The platforming issue is a huge one. And you start to see people who took that view. You guys were one of them, because in a platform-centric view, the tools are enabled to be highly productive.
You don't have to worry about new things like a government policy, the GDPR, that might pop up, or the next Equifax that's around the corner. There's probably two or three of them going on right now. So, that's an impact, the data, who uses it, how it's used, and who's at fault or whatever. So, how does a company deal with that? And machine learning has proven to be a great horse that a lot of people are riding right now. You guys are doing it, how does a customer deal with that tsunami of potential threats? Architecture challenges, what is your solution, how do you talk about that? >> Well, I think machine learning, you know, up until now has been seen as the kind of nice-to-have, and I think that very quickly, it's going to become a must-have. Because, exactly like you're saying, it really is a tsunami. I mean, you could see people who are nervous about the fact that, I mean, there's different estimates, it's like 40% growth in data assets for most organizations every year. So, you can try to get around this somehow with one of these (mumbles) tools or something. But at some point, something is going to break, either you just, you run out of manpower, you can't train the manpower, people start leaving, whatever the operational challenges are, it just isn't going to scale. Machine learning is the only approach. It is absolutely the only approach that actually ensures that you can maintain data for these kinds of defensive reasons like you're saying, the structure and compliance, but also the kind of offensive, opportunistic reasons, and do it scalably, 'cause there's just no other way, mathematically speaking, that when the data is growing 40% a year, just throwing a bunch of tools at it just doesn't work. >> Yeah, I would just amplify and look right in the camera, say, if you're not on machine learning, you're out of business. That's a straight-up obvious trend, 'cause that's a precursor to AI, real AI. Alright, let's get down to data management, so when people throw around data management, it's like, oh yeah we've got some data management. There are challenges with that. You guys have been there from day one. But now if you take it out into the future, how do you guys provide the data management in a totally cloud world where now the customer certainly has public and private, or on premise, but they might have multi-cloud? So now comes a land grab for the data layer, how do you guys play in that? >> Well, I think it's a great opportunity for these kind of middleware platforms that actually do span multiple clouds, that can span the internal environments. So, I'll give you an example. Yesterday we actually had a customer speaking at Strata here, and he was talking about how, for him, the cloud is really just a natural extension of what they're already doing, because they already have a sophisticated data practice. This is a large financial services organization, and he's saying, well, now the data isn't all inside, some of it's outside, you've got partners who've got data outside. How do we get to that data? Clearly, the cloud is the path for doing that. So, the fact that the cloud is a natural extension of what a lot of organizations were already doing internally means they don't want to have a completely different approach to the data management. They want to have a consistent, simple, systematic, repeatable approach to the data management that spans, as you said, on premise and in the cloud.
That's why I think the opportunity is for a very mature and sophisticated platform, because you're not rewriting and re-platforming for every new, is it AWS, is it Azure? Is it something on premise? You just want something that works, that shields you from the underlying infrastructure. >> So I put my skeptic hat on for a second and challenge you on this, because this I think is fundamental. Whether it's real or not, it's perceived, maybe in the back of the mind of the CXO or the CDO, whoever is enabled to make these big calls. If they hand the keys to the kingdom to Informatica, I'm going to get locked in. So, this is a deep fear. People wake up with nightmares in the enterprise, they've seen lock-in before. How do you explain to a customer that you're going to be an enabling opportunity for them, not a lock-in foreclosing future benefits? Especially if I have an unknown scenario called multi-cloud. I mean, no one's really doing multi-cloud, let's face it. I mean, I have multiple clouds with stuff on it, >> At least not intentionally. Sometimes you've got lines of business doing things, but absolutely, I get it. >> No one's really moving workloads dynamically between clouds in real time. Maybe a few people doing some hacks, but for the most part of course, not a standard practice. >> Right. >> But they want it to be. >> Absolutely. >> So that's the future. From today, how do you preserve that position with the customer where you say, hey, we're going to add value, but we're not going to lock you in? >> So the whole premise again of, I mean, this goes back to classic three-tier models of how you think about technology stacks, right? There's an infrastructure layer, there's a platform layer, there's an analytics layer, and the whole premise of the middle layer, the platform layer, is that it enables flexibility in the other two layers. It's precisely when you don't have something that's kind of intermediating the data and the use of the data that you run into challenges with flexibility and with data being locked in the data store. But you're absolutely right. We had dinner with a bunch of our customers last night. They were talking about how they'd essentially evaluated every version of sort of BigData platform and data infrastructure platform, right? And why? It was because they were a large organization and their different teams start stuff and they had to try them out and stuff. And I was like, that must have been pretty hard for you guys. And he said, well, we were using Informatica, so it didn't really matter where the data was, we were still doing everything as far as the data management goes from a consistent layer, and we integrate with all those different platforms. >> John: So you didn't get in the way? >> We didn't get in the way. >> You've actually facilitated. >> We are facilitating increased flexibility. Because without a layer like that, a fabric, or whatever you want to call it, a data platform that's facilitating this, the complexity's going to get very, very crazy very soon, if it hasn't already. The number of infrastructure platforms that are available, like you said, on premise and on the cloud now, keeps growing. The number of analytical tools that are available is also growing. And all this is amazing innovation by the way. This is all great stuff, but to your point about it, if you're the chief data officer of an organization going, I got to get this thing figured out somehow.
I need some sanity, that's really the purpose of-- >> They just don't want the tool for the tool's sake, they need to have it be purposeful. >> And that's why this machine learning aspect is very, very critical, because I was thinking about an analogy just like you were, and I was thinking, in a way you can think of data management as sort of cleaning stuff up, and there are people that have brooms and mops and all these different tools. Well, we are bringing a Roomba to market, right? Because you don't want to just create tools that transfer the labor around, which is a little bit of what's going on. You want to actually get the labor out of the equation, so that the people are focused on the context, business strategy, and the data management is sort of cleaning itself. It's doing the work for you. That's really what Informatica's vision is. It's about being a kind of enterprise cloud data management vendor that is leveraging AI under the hood so that you can sort of set it and forget it. A lot of this ingestion and the cleansing, telling analysts what data they should be looking for. All this stuff is just happening in an automated way and you're not in this total chaos. >> And some tools will be sitting in the back for a long time. In my tool shed, when I had one, back on a big enough property back east. No one has tool sheds by the way. No one does any gardening. The issue is, at the end of the day, I need to have a reliable partner. So I want you to take a minute and explain to the folks who aren't yet Informatica customers why they should be, and the Informatica customers why they should stay with Informatica. >> Absolutely, so certainly the ones we have, a very loyal customer base. In fact the guy who was presenting with us yesterday, he said he's been with Informatica since 1999, going through various versions of our products and adopting new innovations. So we have a very loyal customer base, and I think that loyalty speaks for itself as well. As far as net new customers, I think that in a world of this increasing data complexity, it's exactly what you were saying, you need to find an approach that is going to scale. I keep hearing this from the chief data officers, I kind of got something going on today, I don't know how I scale it. How is this going to work in 2018 and 2019, in 2025? And it's just daunting for some of these guys. Especially going back to your point about compliance, right? So it's one thing if you have data sitting around, data, so to speak, that you're not using. But god forbid now, you got legal and regulatory concerns around it as well. So you have to get your arms around the data, and that's precisely where Informatica can help, because we've actually thought through these problems and we've talked about them. >> Most of them are problems you've solved, because at the end of the day, we're talking about problems that have massive importance, big-time consequences people can actually quantify. >> That's right. >> So what specific problem, at the highest level, do you solve that is the most important, has the most consequences? >> Everything from ingestion of raw data sets from wherever, like you said, in the cloud, on premise, all the way through all the processes you need to make it fully usable. And we view that as one problem. There's other vendors who think that one aspect of that is a problem and it is worth solving. We really think, look, at the end of the day, you got raw stuff and you have to turn it into useful stuff.
Everything in there has to happen, so we might as well just give you everything and be very, very good at doing all those things. And so that's what we call enterprise cloud data management. It's everything from raw material to finished goods of insights. We want to be able to provide that in a consistent, integrated, and machine learning-driven way. >> Well you guys have a loyal customer base, but to be fair, and you kind of have to acknowledge there was a point in time, and I'm not taking away Informatica's big customers, big engagements. But there was a time in Informatica's history where you went private. Some new management came in. There was a moment where the boat was taking on water, right? And you could almost look at it and say, hmm, you know, we're in this space. You guys retooled around that. Success to the team. Took it to another dimension. So that's the key thing. You know, a lot of companies become big and it's hard to change. So, the question is, well, that's a statement. I think you guys have done a great job. Yet, the boat might have taken on water, that's my opinion, but you can probably debate that. But I think as you got mature, and you were public, you just went private. But here's the thing, you guys have had good product chops at Informatica, so I got to ask you the question. What cool things are you doing? Because remember, cool shiny new toys help put a little flash and glam on the nuts and bolts that scale. What are you guys doing? I know you just announced CLAIRE, some AI stuff. What's the hot stuff you're doing that's adding value? >> Yeah, absolutely, first of all, this kind of addresses your water comment as well. So we are probably one of the few vendors that spends almost about $200 million in R&D. And that hasn't changed through the acquisition. If anything, I think it actually increased a little bit, because now our investors are even more committed to innovation. >> Well you're more nimble in private. A lot more nimble. >> Absolutely, a lot more ideas that are coming to the forefront. So there's never been any water, just to be clear. But to answer your follow-on question about some examples of this innovation. So I think Ahmed yesterday talked about some of our recent releases as well, but we really just keep pushing on this idea of, I know I keep saying this, but it's this whole machine learning approach here of how can we learn more about the data? So one of the features, I'll give you an example, is if we can actually go look at a file and we spot like a name and an address and some order information, that probably is a customer, right? And we know that, right, because we've seen past data sets. So, there's examples of this pattern matching where you don't even have to have data that's filled out. And this is increasingly the way the data looks, we are not dealing with relational tables anymore, it's JSON files, it's web logs, XML files, all of that data that you had to have data scientists go through and parse and sift through, we just automatically recognize it now. If we can look for the data and understand it, we can match it. >> Put that in context in terms of the order of benefits, from the old way versus the current way, what are the pain levels? One versus the other, can you put context around that? In terms of, it's pretty significant.
So, this is a customer that presented at Informatica World a couple months ago. It's Jewelry TV, I can actually tell you the name. So they're one of these online kind of shopping sites, and they've got a TV program that goes with the online site. So what they do is, obviously when you promote something on TV, your orders go up online, right? They wanted to flip it around and they said, look, let's look at the web logs of the traffic that's on the website and then go promote that on the TV program. Because then you get a closed loop and start to have this explosion of sales. So they used Informatica, didn't have to do any of this hand coding. They just built this very quickly with the graphical user interface that we provide, and it leverages Spark Streaming under the hood. So they are using all these technologies under the hood, they just didn't have to do any of the manual coding. Got this thing out in a couple days and it works. And they have been able to measure it, and they're actually driving increased sales by taking the data and just getting it out to the people that need to see the data very, very quickly. So that's an example of a use case where this isn't, to your point, just a small, incremental type of thing. No, there is a lot of money behind data if you can actually put it to good use. >> The consequences are grave, and I think you've seen more and more, I mean the hacks just amplify it over and over again. It's not a cost center when you think about it. It has to be somehow configured differently, as a profit center, even though it might not drive top-line revenue directly like an app or anything else. It's not a cost center. If anything it will be treated as a profit center, because if you get hacked or someone's data is misused, you can be out of business. There is no profit. Look at the results of these hacks. >> The defensive argument is going to become very, very strong as these regulations come out. But, let's be clear, we work with a lot of the most advanced customers. There are people making money off of this. It can be a top-line driver-- >> No it should be, it should be. That's exactly the mindset. So the final question for you before we break. I know we're out of time here. There are some chief data officers that are enabled, some aren't, and that's just my observation. I don't want to pigeonhole anyone, but some are enabled to really drive change, some are just figureheads that are just managing the compliance risk and work for the CFO and say no to everything. I'm over-generalizing. But that's essentially how I see it. What's the problem with that? Because the cost center issue, we've seen this movie before in the security business. Security should not be part of IT. That's its own deal. >> Exactly. >> So we're kind of, this is kind of smoke, but we're coming out of the jungle here. Your thoughts on that. >> Yeah, you're absolutely right. We see a variety of models. We can see the evolution of those models, and it's also very contextual to different industries. There are industries that are inherently more regulated, so that's why you're seeing the data people maybe more in those cost center areas that are focused on regulations and things like that. There's other industries that are a lot more consumer oriented. So for them, it makes more sense to have the data people be in a department that seems more revenue-facing. So it's not entirely random.
There are some reasons, that's not to say that's not the right model moving forward, but someday, you never know. There is a reason why this role became a CXO in the first place. Maybe it is somebody who reports to the CEO and they really view the data department as a strategic function. And it might take a while to get there, but I don't think it's going to take a long time. Again, we're talking about 40% growth in the data, and these guys are realizing that now, and I think we're going to see very quickly people moving out of the whole tool shed model, and moving to very systematic, repeatable practices. Sophisticated middleware platforms and-- >> As we say, don't be a tool, be a platform. Murthy, thank you so much for coming on theCUBE, we really appreciate it. What's going on at Informatica, real quick. Things good? >> Things are great. >> Good, awesome. Live from New York, this is theCUBE here at BigData NYC, more live coverage continuing day three after this short break. (digital music)
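For a sense of what a closed loop like the Jewelry TV example might look like under the hood, here is a minimal PySpark Structured Streaming sketch that counts product views from web logs in short windows and surfaces the hottest products. The paths, schema, and threshold are illustrative assumptions, and the pipeline described in the interview was built through Informatica's graphical tooling rather than hand-coded like this.

```python
# Minimal sketch of the kind of Spark Streaming job described above: read web
# logs as they land, count product views per short window, and surface the
# hottest products so they can be promoted. Paths, schema, and the threshold
# are assumptions for illustration only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("weblog-trending-products").getOrCreate()

schema = StructType([
    StructField("ts", TimestampType()),
    StructField("product_id", StringType()),
    StructField("event", StringType()),        # e.g. "view", "add_to_cart"
])

logs = (spark.readStream
        .schema(schema)
        .json("/data/weblogs/"))                # placeholder landing directory

trending = (logs
            .where(F.col("event") == "view")
            .withWatermark("ts", "10 minutes")
            .groupBy(F.window("ts", "5 minutes"), "product_id")
            .count()
            .where(F.col("count") > 1000))      # arbitrary "hot product" threshold

query = (trending.writeStream
         .outputMode("update")
         .format("console")                     # in practice this would feed the TV side
         .start())
query.awaitTermination()
```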

Published Date : Sep 29 2017


Arun Murthy, Hortonworks | BigData NYC 2017


 

>> Host: Live from midtown Manhattan, it's theCUBE, covering BigData New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. (upbeat electronic music) >> Welcome back, everyone. We're here, live, on day two of our three days of coverage of BigData NYC. This is our event that we put on every year. It's our fifth year doing BigData NYC in conjunction with Hadoop World, which evolved into Strata Conference, which evolved into Strata Hadoop, now called Strata Data. Probably next year it will be called Strata AI, but we're still theCUBE, we'll always be theCUBE, and this is our BigData NYC, our eighth year covering the BigData world since Hadoop World. And then as Hortonworks came on we started covering Hortonworks' data summit. >> Arun: DataWorks Summit. >> DataWorks Summit. Arun Murthy, my next guest, Co-Founder and Chief Product Officer of Hortonworks. Great to see you, looking good. >> Likewise, thank you. Thanks for having me. >> Boy, what a journey. Hadoop, years ago, >> 12 years now. >> I still remember, you guys came out of Yahoo, you guys put Hortonworks together and then since, gone public, first to go public, then Cloudera just went public. So, the Hadoop World is pretty much out there, everyone knows where it's at, it's got a nice use case, but the whole world's moved around it. You guys have been, really the first of the Hadoop players, before even Cloudera, on this notion of data in flight, or, as I call it, real-time data, but I think you guys call it data-in-motion. Batch, we all know what Batch does, a lot of things to do with Batch, you can optimize it, it's not going anywhere, it's going to grow. Real-time data-in-motion's a huge deal. Give us the update. >> Absolutely, you know, we've obviously been in this space, personally, I've been in this for about 12 years now. So, we've had a lot of time to think about it. >> Host: Since you were 12? >> Yeah. (laughs) Almost. Probably look like it. So, back in 2014 and '15 when we, sort of, went public and we started looking around, the thesis always was, yes, Hadoop is important, we're going to help you manage lots and lots of data, but a lot of the stuff we've done since the beginning, starting with YARN and so on, was really to enable the use cases beyond the whole traditional transactions and analytics. And Rob, our CEO, calls it, his vision's always been, we've got to get into a pre-transactional world, if you will, rather than the post-transactional analytics and BI and so on. So that's where it started. And increasingly, the obvious next step was to say, look, enterprises want to be able to get insights from data, but they also want, increasingly, they want to get insights and they want to deal with it in real-time. You know, while you're in your shopping cart. They want to make sure you don't abandon your shopping cart. If you were sitting at a retailer and you're in an aisle and you're about to walk away from a dress, you want to be able to do something about it. So, this notion of real-time is really important because it helps the enterprise connect with the customer at the point of action, if you will, and provide value right away rather than having to try to do this post-transaction. So, it's been a really important journey.
We went and bought this company called Onyara, which is a bunch of geeks like us who started off with the government, built this batching NiFi thing, huge community. Its just, like, taking off at this point. It's been a fantastic thing to join hands and join the team and keep pushing in the whole streaming data style. >> There's a real, I don't mean to tangent but I do since you brought up community I wanted to bring this up. It's been the theme here this week. It's more and more obvious that the community role is becoming central, beyond open-source. We all know open-source, standing on the shoulders before us, you know. And Linux Foundation showing code numbers hitting up from $64 million to billions in the next five, ten years, exponential growth of new code coming in. So open-source certainly blew me. But now community is translating to things you start to see blockchain, very community based. That's a whole new currency market that's changing the financial landscape, ICOs and what-not, that's just one data point. Businesses, marketing communities, you're starting to see data as a fundamental thing around communities. And certainly it's going to change the vendor landscape. So you guys compare to, Cloudera and others have always been community driven. >> Yeah our philosophy has been simple. You know, more eyes and more hands are better than fewer. And it's been one of the cornerstones of our founding thesis, if you will. And you saw how that's gone on over course of six years we've been around. Super-excited to have someone like IBM join hands, it happened at DataWorks Summit in San Jose. That announcement, again, is a reflection of the fact that we've been very, very community driven and very, very ecosystem driven. >> Communities are fundamentally built on trust and partnering. >> Arun: Exactly >> Coding is pretty obvious, you code with your friends. You code with people who are good, they become your friends. There's an honor system among you. You're starting to see that in the corporate deals. So explain the dynamic there and some of the successes that you guys have had on the product side where one plus one equals more than two. One plus one equals five or three. >> You know IBM has been a great example. They've decided to focus on their strengths which is around Watson and machine learning and for us to focus on our strengths around data management, infrastructure, cloud and so on. So this combination of DSX, which is their data science work experience, along with Hortonworks is really powerful. We are seeing that over and over again. Just yesterday we announced the whole Dataplane thing, we were super excited about it. And now to get IBM to say, we'll get in our technologies and our IP, big data, whether it's big Quality or big Insights or big SEQUEL, and the word has been phenomenal. >> Well the Dataplane announcement, finally people who know me know that I hate the term data lake. I always said it's always been a data ocean. So I get redemption because now the data lakes, now it's admitting it's a horrible name but just saying stitching together the data lakes, Which is essentially a data ocean. Data lakes are out there and you can form these data lakes, or data sets, batch, whatever, but connecting them and integrating them is a huge issue, especially with security. >> And a lot of it is, it's also just pragmatism. We start off with this notion of data lake and say, hey, you got too many silos inside the enterprise in one data center, you want to put them together. 
But then increasingly, as Hadoop has become more and more mainstream, I can't remember the last time I had to explain what Hadoop is to somebody. As it has become mainstream, couple things have happened. One is, we talked about streaming data. We see all the time, especially with HTF. We have customers streaming data from autonomous cars. You have customers streaming from security cameras. You can put a small minify agent in a security camera or smart phone and can stream it all the way back. Then you get into physics. You're up against the laws of physics. If you have a security camera in Japan, why would you want to move it all the way to California and process it. You'd rather do it right there, right? So with this notion of a regional data center becomes really important. >> And that talks to the Edge as well. >> Exactly, right. So you want to have something in Japan that collects all of the security cameras in Tokyo, and you do analysis and push what you want back here, right. So that's physics. The other thing we are increasingly seeing is with data sovereignty rules especially things like GDPR, there's now regulation reasons where data has to naturally stay in different regions. Customer data from Germany cannot move to France or visa versa, right. >> Data governance is a huge issue and this is the problem I have with data governance. I am really looking for a solution so if you can illuminate this it would be great. So there is going to be an Equifax out there again. >> Arun: Oh, for sure. >> And the problem is, is that going to force some regulation change? So what we see is, certainly on the mugi bond side, I see it personally is that, you can almost see that something else will happen that'll force some policy regulation or governance. You don't want to screw up your data. You also don't want to rewrite your applications or rewrite you machine learning algorithms. So there's a lot of waste potential by not structuring the data properly. Can you comment on what's the preferred path? >> Absolutely, and that's why we've been working on things like Dataplane for almost a couple of years now. We is to say, you have to have data and policies which make sense, given a context. And the context is going to change by application, by usage, by compliance, by law. So, now to manage 20, 30, 50 a 100 data lakes, would it be better, not saying lakes, data ponds, >> [Host} Any Data. >> Any data >> Any data pool, stream, river, ocean, whatever. (laughs) >> Jacuzzis. Data jacuzzis, right. So what you want to do is want a holistic fabric, I like the term, you know Forrester uses, they call it the fabric. >> Host: Data fabric. >> Data fabric, right? You want a fabric over these so you can actually control and maintain governance and security centrally, but apply it with context. Last not least, is you want to do this whether it's on frame or on the cloud, or multi-cloud. So we've been working with a bank. They were probably based in Germany but for GDPR they had to stand up something in France now. They had French customers, but for a bunch of new reasons, regulation reasons, they had to sign up something in France. So they bring their own data center, then they had only the cloud provider, right, who I won't name. And they were great, things are working well. Now they want to expand the similar offering to customers in Asia. It turns out their favorite cloud vendor was not available in Asia or they were not available in time frame which made sense for the offering. 
So they had to go with cloud vendor two. So now although each of the vendors will do their job in terms of giving you all the security and governance and so on, the fact that you are to manage it three ways, one for OnFrame, one for cloud vendor A and B, was really hard, too hard for them. So this notion of a fabric across these things, which is Dataplane. And that, by the way, is based by all the open source technologies we love like Atlas and Ranger. By the way, that is also what IBM is betting on and what the entire ecosystem, but it seems like a no-brainer at this point. That was the kind of reason why we foresaw the need for something like a Dataplane and obviously couldn't be more excited to have something like that in the market today as a net new service that people can use. >> You get the catalogs, security controls, data integration. >> Arun: Exactly. >> Then you get the cloud, whatever, pick your cloud scenario, you can do that. Killer architecture, I liked it a lot. I guess the question I have for you personally is what's driving the product decisions at Hortonworks? And the second part of that question is, how does that change your ecosystem engagement? Because you guys have been very friendly in a partnering sense and also very good with the ecosystem. How are you guys deciding the product strategies? Does it bubble up from the community? Is there an ivory tower, let's go take that hill? >> It's both, because what typically happens is obviously we've been in the community now for a long time. Working publicly now with well over 1,000 customers not only puts a lot of responsibility on our shoulders but it's also very nice because it gives us a vantage point which is unique. That's number one. The second one we see is being in the community, also we see the fact that people are starting to solve the problems. So it's another elementary for us. So you have one as the enterprise side, we see what the enterprises are facing which is kind of where Dataplane came in, but we also saw in the community where people are starting to ask us about hey, can you do multi-cluster Atlas? Or multi-cluster Ranger? Put two and two together and say there is a real need. >> So you get some consensus. >> You get some consensus, and you also see that on the enterprise side. Last not least is when went to friends like IBM and say hey we're doing this. This is where we can position this, right. So we can actually bring in IGSC, you can bring big Quality and bring all these type, >> [Host} So things had clicked with IBM? >> Exactly. >> Rob Thomas was thinking the same thing. Bring in the power system and the horsepower. >> Exactly, yep. We announced something, for example, we have been working with the power guys and NVIDIA, for deep learning, right. That sort of stuff is what clicks if you're in the community long enough, if you have the vantage point of the enterprise long enough, it feels like the two of them click. And that's frankly, my job. >> Great, and you've got obviously the landscape. The waves are coming in. So I've got to ask you, the big waves are coming in and you're seeing people starting to get hip with the couple of key things that they got to get their hands on. They need to have the big surfboards, metaphorically speaking. They got to have some good products, big emphasis on real value. Don't give me any hype, don't give me a head fake. You know, I buy, okay, AI Wash, and people can see right through that. Alright, that's clear. But AI's great. 
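The Dataplane idea discussed above, defining security and governance once and applying it with context across on-prem and multiple clouds, can be pictured with a tiny tag-based policy check. This is only a conceptual sketch, not the Atlas or Ranger APIs; the tags, regions, and rules are made up for illustration.

```python
# Hypothetical tag-based policy check applied across regions and clouds;
# not Atlas/Ranger APIs. Tags, regions, and rules are invented.
POLICIES = [
    # (classification tag, rule) -- defined centrally once, applied everywhere with context
    ("PII",       lambda ds, req: ds["region"] == req["region"]),   # GDPR-style: PII stays in-region
    ("PUBLIC",    lambda ds, req: True),
    ("FINANCIAL", lambda ds, req: req["role"] in {"analyst", "auditor"}),
]

def allowed(dataset, request):
    """A request is allowed only if every policy matching the dataset's tags passes."""
    checks = [rule for tag, rule in POLICIES if tag in dataset["tags"]]
    return all(rule(dataset, request) for rule in checks)

german_customers = {"name": "customers_de", "tags": {"PII"}, "region": "eu-de",
                    "location": "on-prem"}          # could equally live in cloud vendor A or B
print(allowed(german_customers, {"region": "eu-de", "role": "analyst"}))  # True
print(allowed(german_customers, {"region": "us-ca", "role": "analyst"}))  # False
```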
We all cheer for AI but the reality is, everyone knows that's pretty much b.s. except for core machine learning is on the front edge of innovation. So that's cool, but value. [Laughs] Hey I've got the integrate and operationalize my data so that's the big wave that's coming. Comment on the community piece because enterprises now are realizing as open source becomes the dominant source of value for them, they are now really going to the next level. It used to be like the emerging enterprises that knew open source. The guys will volunteer and they may not go deeper in the community. But now more people in the enterprises are in open source communities, they are recruiting from open source communities, and that's impacting their business. What's your advice for someone who's been in the community of open source? Lessons you've learned, what is the best practice, from your standpoint on philosophy, how to build into the community, how to build a community model. >> Yeah, I mean, the end of the day, my best advice is to say look, the community is defined by the people who contribute. So, you get advice if you contribute. Which means, if that's the fundamental truth. Which means you have to get your legal policies and so on to a point that you can actually start to let your employees contribute. That kicks off a flywheel, where you can actually go then recruit the best talent, because the best talent wants to stand out. Github is a resume now. It is not a word doc. If you don't allow them to build that resume they're not going to come by and it's just a fundamental truth. >> It's self governing, it's reality. >> It's reality, exactly. Right and we see that over and over again. It's taken time but it as with things, the flywheel has changed enough. >> A whole new generation's coming online. If you look at the young kids coming in now, it is an amazing environment. You've got TensorFlow, all this cool stuff happening. It's just amazing. >> You, know 20 years ago that wouldn't happen because the Googles of the world won't open source it. Now increasingly, >> The secret's out, open source works. >> Yeah, (laughs) shh. >> Tell everybody. You know they know already but, This is changing some of the how H.R. works and how people collaborate, >> And the policies around it. The legal policies around contribution so, >> Arun, great to see you. Congratulations. It's been fun to watch the Hortonworks journey. I want to appreciate you and Rob Bearden for supporting theCUBE here in BigData NYC. If is wasn't for Hortonworks and Rob Bearden and your support, theCUBE would not be part of the Strata Data, which we are not allowed to broadcast into, for the record. O'Reilly Media does not allow TheCube or our analysts inside their venue. They've excluded us and that's a bummer for them. They're a closed organization. But I want to thank Hortonworks and you guys for supporting us. >> Arun: Likewise. >> We really appreciate it. >> Arun: Thanks for having me back. >> Thanks and shout out to Rob Bearden. Good luck and CPO, it's a fun job, you know, not the pressure. I got a lot of pressure. A whole lot. >> Arun: Alright, thanks. >> More Cube coverage after this short break. (upbeat electronic music)

Published Date : Sep 28 2017



Arun Murthy, Hortonworks | DataWorks Summit 2017


 

>> Announcer: Live from San Jose, in the heart of Silicon Valley, it's theCUBE covering DataWorks Summit 2017. Brought to you by Hortonworks. >> Good morning, welcome to theCUBE. We are live at day 2 of the DataWorks Summit, and have had a great day so far, yesterday and today, I'm Lisa Martin with my co-host George Gilbert. George and I are very excited to be joined by a multiple CUBE alumni, the co-founder and VP of Engineering at Hortonworks Arun Murthy. Hey, Arun. >> Thanks for having me, it's good to be back. >> Great to have you back, so yesterday, great energy at the event. You could see and hear behind us, great energy this morning. One of the things that was really interesting yesterday, besides the IBM announcement, and we'll dig into that, was that we had your CEO on, as well as Rob Thomas from IBM, and Rob said, you know, one of the interesting things over the last five years was that there have been only 10 companies that have beat the S&P 500, have outperformed, in each of the last five years, and those companies have made big bets on data science and machine learning. And as we heard yesterday, these four meta-trains IoT, cloud streaming, analytics, and now the fourth big leg, data science. Talk to us about what Hortonworks is doing, you've been here from the beginning, as a co-founder I've mentioned, you've been with Hadoop since it was a little baby. How is Hortonworks evolving to become one of those big users making big bets on helping your customers, and yourselves, leverage machine loading to really drive the business forward? >> Absolutely, a great question. So, you know, if you look at some of the history of Hadoop, it started off with this notion of a data lake, and then, I'm talking about the enterprise side of Hadoop, right? I've been working for Hadoop for about 12 years now, you know, the last six of it has been as a vendor selling Hadoop to enterprises. They started off with this notion of data lake, and as people have adopted that vision of a data lake, you know, you bring all the data in, and now you're starting to get governance and security, and all of that. Obviously the, one of the best ways to get value over the data is the notion of, you know, can you, sort of, predict what is going to happen in your world of it, with your customers, and, you know, whatever it is with the data that you already have. So that notion of, you know, Rob, our CEO, talks about how we're trying to move from a post-transactional world to a pre-transactional world, and doing the analytics and data sciences will be, obviously, with me. We could talk about, and there's so many applications of it, something as similar as, you know, we did a demo last year of, you know, of how we're working with a freight company, and we're starting to show them, you know, predict which drivers and which routes are going to have issues, as they're trying to move, alright? Four years ago we did the same demo, and we would say, okay this driver has, you know, we would show that this driver had an issue on this route, but now, within the world, we can actually predict and let you know to take preventive measures up front. Similarly internally, you know, you can take things from, you know, mission-learning, and log analytics, and so on, we have a internal problem, you know, where we have to test two different versions of HDP itself, and as you can imagine, it's a really, really hard problem. 
We have to support 10 operating systems, seven databases, like, if you multiply that matrix, it's, you know, tens of thousands of options. So, if you do all that testing, we now use machine learning internally to look through the logs and kind of predict where the failures were, and help our own, sort of, software engineers understand where the problems were, right? An extension of that has been, you know, the work we've done in Smartsense, which is a service we offer our enterprise customers. We collect logs from their Hadoop clusters, and then we can actually help them understand where they can either tune their applications, or even tune their hardware, right? They might have a, you know, we have this example I really like where at a really large enterprise Financial Services client, they had literally, you know, hundreds and, you know, and thousands of machines on HDP, and we, using Smartsense, we actually found that there were 25 machines which had bad NIC configuration, and we proved to them that by fixing those, we got 30% throughput back on their cluster. At that scale, it's a lot of money, it's a lot of capex, it's a lot of opex. So, as a company, we try it ourselves, as much as we, kind of, try to help our customers adopt it, does that make sense? >> Yeah, let's drill down on that even a little more, 'cause it's pretty easy to understand what's the standard telemetry you would want out of hardware, but as you, sort of, move up the stack, the metrics, I guess, become more custom. So how do you learn, not just from one customer, but from many customers, especially when you can't standardize what you're supposed to pull out of them? >> Yeah so, we're sort of really big believers in, sort of, dogfooding our own stuff, right? So, we talk about the notion of data lake, we actually run a Smartsense data lake where we actually get data across, you know, the hundreds of our customers, and we can actually do predictive machine learning on that data in our own data lake. Right? And to your point about how we go up the stack, this is, kind of, where we feel like we have a natural advantage because we work on all the layers, whether it's the SQL engine, or the storage engine, or, you know, above and beyond the hardware. So, as we build these models, we understand that we need more, or different, telemetry, right? And we put that back into the product so the next version of HDP will have the metrics that we wanted. And, now we've been doing this for a couple of years, which means we've done three, four, five turns of the crank, obviously something we always get better at, but I feel like, compared to where we were a couple of years ago when Smartsense first came out, it's actually matured quite a lot, from that perspective. >> So, there's a couple different paths you can add to this, which is customers might want, as part of their big data workloads, some non-Hortonworks, you know, services or software when it's on-prem, and then can you also extend this management to the Cloud if they want a hybrid setup where, in the not too distant future, the Cloud vendor will also be a provider for this type of management. >> So absolutely, in fact it's true today when, you know, we work with, you know, Microsoft's a great partner of ours. We work with them to enable Smartsense on HDI, which means we can actually get the same telemetry back, whether you're running the data on an on-prem HDP, or you're running this on HDI. 
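As a very rough sketch of the "machine learning over the logs" idea described above, the snippet below trains a toy classifier that maps log lines to a likely failure category, so an engineer starts from a ranked guess instead of reading everything. It assumes scikit-learn is available, and the log lines and labels are invented; Smartsense obviously operates at a very different scale on real telemetry.

```python
# Toy illustration of classifying log lines into likely failure categories.
# Requires scikit-learn; the log lines and labels are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_logs = [
    "java.net.SocketTimeoutException: connect timed out on port 50010",
    "checksum mismatch while reading block blk_1073741825",
    "GC overhead limit exceeded in container attempt_0042",
    "connection reset by peer during shuffle fetch",
]
labels = ["network", "disk", "memory", "network"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_logs, labels)

new_line = "SocketTimeoutException while fetching shuffle data"
print(model.predict([new_line])[0])   # likely "network" -> points the engineer somewhere first
```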
Similarly, we shipped a version of our Cloud product, our Hortonworks Data Cloud, on Amazon, and again Smartsense is plugged in there, so whether you're on Amazon, or Microsoft, or on-prem, we get the same telemetry, we get the same data back. We can actually, if you're a customer using many of these products, we can actually give you that telemetry back. Similarly, if you guys probably know this, we have, you were probably there at the analyst event when they announced the Flex Support subscription, which means that now we can actually take the support subscription you have to get from Hortonworks, and you can actually use it on-prem or on the Cloud. >> So in terms of transforming, HDP for example, just want to make sure I'm understanding this, you're pulling in data from customers to help evolve the product, and that data can be on-prem, it can be in Microsoft Azure, it can be in AWS? >> Exactly. The HDP can be running in any of these, we will actually pull all of them into our data lake, and we actually do the analytics and then present it back to the customers. So, in our support subscription, the way this works is we do the analytics in our lake, and it pushes it back, in fact into our support team's tickets, and our Salesforce, and all the support mechanisms. And they get a set of recommendations saying, Hey, we know these are the workloads you're running, we see these are the opportunities for you to do better, whether it's tuning the hardware, tuning an application, tuning the software, we sort of send the recommendations back, and the customer can go and say, Oh, that makes sense, they accept that and we'll, you know, we'll update the recommendation for you automatically. Then you can have, or you can say, Maybe I don't want to change my kernel parameters, let's have a conversation. And if the customer, you know, is going through with that, then they can go and change it on their own. We do that, sort of, back and forth with the customer. >> One thing that just pops into my mind is, we talked a lot yesterday about data governance, are there particular, and also yesterday on stage were-- >> Arun: With IBM. >> Yes exactly, when we think of, you know, really data-intensive industries, retail, financial services, insurance, healthcare, manufacturing, are there particular industries where you're really leveraging this, kind of, bi-directional, because there's no governance restrictions, or maybe I shouldn't say none, but. Give us a sense of which particular industries are really helping to fuel the evolution of Hortonworks data lake. 
So, I think that's what we're really excited about the portion with IBM, because we feel like the two of us can help a lot of customers, especially in countries where they're significantly, highly regulated, than the United States, to actually get leverage our, sort of, giant portfolio of products. And IBM's been a great company to atlas, they've adopted wholesale as you saw, you know, in the announcements yesterday. >> So, you're doing a Keynote tomorrow, so give us maybe the top three things, you're giving the Keynote on Data Lake 3.0, walk us through the evolution. Data Lakes 1.0, 2.0, 3.0, where you are now, and what folks can expect to hear and see in your Keynote. >> Absolutely. So as we've, kind of, continued to work with customers and we see the maturity model of customers, you know, initially people are staying up a data lake, and then they'd want, you know, sort of security, basic security what it covers, and so on. Now, they want governance, and as we're starting to go to that journey clearly, our customers are pushing us to help them get more value from the data. It's not just about putting the data lake, and obviously managing data with governance, it's also about Can you help us, you know, do mission-learning, Can you help us build other apps, and so on. So, as we look to there's a fundamental evolution that, you know, Hadoop legal system had to go through was with advance of technologies like, you know, a Docker, it's really important first to help the customers bring more than just workloads, which are sort of native to Hadoop. You know, Hadoop started off with MapReduce, obviously Spark's went great, and now we're starting to see technologies like Flink coming, but increasingly, you know, we want to do data science. To mass market data science is obviously, you know, people, like, want to use Spark, but the mass market is still Python, and R, and so on, right? >> Lisa: Non-native, okay. >> Non-native. Which are not really built, you know, these predate Hadoop by a long way, right. So now as we bring these applications in, having technology like Docker is really important, because now we can actually containerize these apps. It's not just about running Spark, you know, running Spark with R, or running Spark with Python, which you can do today. The problem is, in a true multi-tenant governed system, you want, not just R, but you want specifics of a libraries for R, right. And the libraries, you know, George wants might be completely different than what I want. And, you know, you can't do a multi-tenant system where you install both of them simultaneously. So Docker is a really elegant solution to problems like those. So now we can actually bring those technologies into a Docker container, so George's Docker containers will not, you know, conflict with mine. And you can actually go to the races, you know after the races, we're doing data signs. Which is really key for technologies like DSX, right? Because with DSX if you see, obviously DSX supports Spark with technologies like, you know, Zeppelin which is a front-end, but they also have Jupiter, which is going to work the mass market users for Python and R, right? So we want to make sure there's no friction whether it's, sort of, the guys using Spark, or the guys using R, and equally importantly DSX, you know, in the short map will also support things like, you know, the classic IBM portfolio, SBSS and so on. 
So bringing all of those things in together, making sure they run with data in the data lake, and also the computer in the data lake, is really big for us. >> Wow, so it sounds like your Keynote's going to be very educational for the folks that are attending tomorrow, so last question for you. One of the themes that occurred in the Keynote this morning was sharing a fun-fact about these speakers. What's a fun-fact about Arun Murthy? >> Great question. I guess, you know, people have been looking for folks with, you know, 10 years of experience on Hadoop. I'm here finally, right? There's not a lot of people but, you know, it's fun to be one of those people who've worked on this for about 10 years. Obviously, I look forward to working on this for another 10 or 15 more, but it's been an amazing journey. >> Excellent. Well, we thank you again for sharing time again with us on theCUBE. You've been watching theCUBE live on day 2 of the Dataworks Summit, hashtag DWS17, for my co-host George Gilbert. I am Lisa Martin, stick around we've got great content coming your way.

Published Date : Jun 14 2017



Murthy Mathiprakasam, Informatica | Big Data SV 17 - #BigDataSV - #theCUBE


 

(electronic music) >> Announcer: Live from San Jose, California, it's The Cube, covering Big Data Silicon Valley 2017. >> Okay, welcome back everyone. We are live in Silicon Valley for Big Data Silicon Valley. Our companion showed at Big Data NYC in conjunction with Strata Hadoop, Big Data Week. Our next guest is Murthy Mathiprakasam, with the director of product marketing Informatica. Did I get it right? >> Murthy: Absolutely (laughing)! >> Okay (laughing), welcome back. Good to see you again. >> Good to see you! >> Informatica, you guys had a AMIT on earlier yesterday, kicking off our event. It is a data lake world out there, and the show theme has been, obviously beside a ton of machine learning-- >> Murthy: Yep. >> Which has been fantastic. We love that because that's a real trend. And IOT has been a subtext to the conversation and almost a forcing function. Every year the big data world is getting more and more pokes and levers off of Hadoop to a variety of different data sources, so a lot of people are taking a step back, and a protracted view of their landscape inside their own companies and, saying, Okay, where are we? So kind of a checkpoint in the industry. You guys do a lot of work with customers, your history with Informatica, and certainly over the past few years, the change in focus, certainly on the product side, has been kind of interesting. You guys have what looks like to be a solid approach, a abstraction layer for data and metadata, to be the keys to the kingdom, but yet not locking it down, making it freely available, yet provide the governance and all that stuff. >> Murthy: Exactly. >> And my interview with AMIT laid it all out there. But the question is what are the customers doing? I'd like to dig in, if you could share just some of the best practices. What are you seeing? What are the trends? Are they taking a step back? How is IOT affecting it? What's generally happening? >> Yeah, I know, great question. So it has been really, really exciting. It's been kind of a whirlwind over the last couple years, so many new technologies, and we do get the benefit of working with a lot of very, very, innovative organizations. IOT is really interesting because up until now, IOT's always been sort of theoretical, you're like, what's the thing? >> John: Yeah. (laughing) What's this Internet of things? >> But-- >> And IT was always poo-pooing someone else's department (laughing). >> Yeah, exactly. But we have actually have customers doing this now, so we've been working with automative manufacturers on connected vehicle initiatives, pulling sensor data, been working with oil and gas companies, connected meters and connected energy, manufacturing, logistics companies, looking at putting meters on trucks, so they can actually track where all the trucks are going. Huge cost savings and service delivery kind of benefits from all this stuff, so you're absolutely right IOT, I think is finally becoming real. And we have a streaming solution that kind of works on top of all the open source streaming platforms, so we try to simplify everything, just like we have always done. We did that MapReduce, with Spark, now with all the streaming technologies. You gave a graphical approach where you can go in and say, Well, here's what the kind of processing we want. You'd lay it out visually and it executes in the Hadoop cluster. >> I know you guys have done a great job with the product, it's been very complimentary you guys, and it's almost as if there's been an transformation within Informatica. 
And I know you went private and everything, but a lot of good product shops there. You guys got a lot good product guys, so I got to ask you the question, I don't see IOT sometimes as an operational technology component, usually running their own stacks, not even plugged into IT, so that's the whole another story. I'll get to that in a second. But the trend here is you have the batch world, companies that have been in this ecosystem here that are on the show floor, at O'Reilly Media, or talking to us on The Cube. Some have been just pure play batch-related! Then the fashionable steaming technologies have come out, but what's happened with Spark, you're starting to see the collision between batch and realtime-- >> Umm-hmm. >> Called streaming or what not. And at the center of that's the deep learning, it's the IOT, and it's the AI, that's going to be at the intersection of these two colliding forces, so you can't have a one-trick pony here and there. You got to kind of have a blended, more of a holistic, horizontal, scalable approach. >> Murthy: Yes. >> So I want to get your reaction to that. And two, what product gaps and organizational gaps and process gaps emerge from this trend? And what do you guys do? So, three-part question. >> Murthy: Yeah (laughing). >> Go ahead. Go ahead. >> I'll try to cover all three. >> So, first, the collision and your reaction to that trend. >> Murthy: Yeah, yeah. >> And then the gaps. >> Absolutely. So basically, you know Informatica, we've supported every type of kind of variation of these type of environments, and so we're not really a believer in it's this or that. It's not on premise or cloud, it's not realtime or batch. We want to make it simple and no matter how you want to process the data, or where you want to process it. So customers who use our platform for their realtime or streaming solutions, are using the same interface, as if they were doing it batched. We just run it differently under the hood. And so, that simplifies and makes a lot of these initiatives more practical because you might start with a certain latency, and you think maybe it's okay to do it at one speed. Maybe you decide to change. It could be faster or slower, and you don't have to go through code rewrites and just starting completely from scratch. That's the benefit of the abstraction layer, like you were saying. And so, I think that's one way that organizations can shield themselves from the question because why even pose that question in the first... Why is it either this or that? Why not have a system that you can actually tune and maybe today you want to start batch, and tomorrow you evolve it to be more streaming and more realtime. Help me on the-- >> John: On the gaps-- >> Yes. >> Always product gaps because, again, you mentioned that you're solving it, and that might be an integration challenge for you guys. >> Yep. >> Or an integration solution for you guys, challenge, opportunity, whatever you guys want to call it. >> Absolutely! >> Organizational gaps maybe not set up for and then processed. >> Right. I think it was interesting that we actually went out to dinner with a couple of customers last night. And they were talking a lot about the organizational stuff because the technology they're using is Informatica, so that's part's easy. So, they're like, Okay, it's always the stuff around budgeting, it's around resourcing, skills gap, and we've been talking about this stuff for a long time, right. >> John: Yeah. 
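Earlier in this exchange Murthy notes that customers define a pipeline once, "using the same interface, as if they were doing it batched," and the platform runs it differently under the hood. The toy sketch below illustrates that abstraction in the roughest terms; it is not Informatica's API, and every name in it is invented.

```python
# Conceptual sketch: define a transformation once, run it batch or streaming.
# Not a real product API; all names are invented for illustration.
import time

def pipeline(record):
    """The logic the user defines once: filter bad rows, enrich the rest."""
    if record.get("amount", 0) <= 0:
        return None
    return {**record, "amount_usd": round(record["amount"] * record.get("fx", 1.0), 2)}

def run_batch(records):
    return [out for r in records if (out := pipeline(r)) is not None]

def run_streaming(source, sink, poll_seconds=1.0):
    while True:                         # unbounded: same pipeline, different runner
        for r in source():
            if (out := pipeline(r)) is not None:
                sink(out)
        time.sleep(poll_seconds)

print(run_batch([{"amount": 12.5, "fx": 1.1}, {"amount": -3}]))
```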
>> But it's fascinating, even in 2017, it's still a persistent issue, and part of what their challenge was is that even the way IT projects have been funded in the past. You have this kind of waterfall-ish type of governance mechanism where you're supposed to say, Oh, what are you going to do over the next 12 months? We're going to allocate money for that. We'll allocate people for that. Like, what big data project takes 12 months? Twelve months you're going to have a completely (laughing) different stack that you're going to be working with. And so, their challenge is evolving into a more agile kind of model where they can go justify quick-hit projects that may have very unknown kind of business value, but it's just getting by in that... Hey, sometime might be discovered here? This is kind of an exploration-use case, discovery, a lot of this IOT stuff, too. People are bringing back the sensor data, you don't know what's going to coming out of that or (laughing)-- >> John: Yeah. >> What insights you're going to get. >> So there's-- >> Frequency, velocity, could be completely dynamic. >> Umm-hmm. Absolutely! >> So I think part of the best practice is being able to set outside of this kind of notion of innovation where you have funding available for... Get a small cross-functional team together, so this is part of the other aspect of your question, which is organizationally, this isn't just IT. You got to have the data architects from IT, you got to have the data engineers from IT. You got to have data stewards from the line of business. You got business analysts from the line of business. Whenever you get these guys together-- >> Yeah. >> Small core team, and people have been talking about this, right. >> John: Yeah. >> Agile development and all that. It totally applies to the data world. >> John: And the cloud's right there, too, so they have to go there. >> Murthy: That's right! Exactly. So you-- >> So is the 12-month project model, the waterfall model, however you want... maybe 24 months more like it. But the problem on the fail side there is that when they wake up and ship the world's changed, so there's kind of a diminishing return. Is that kind of what you're getting out there on that fail side? >> Exactly. It's all about failing fast forward and succeeding very quickly as well. And so, when you look at most of the successful organizations, they have radically faster project lifecycles, and this is all the more reason to be using something like Informatica, which abstracts all the technology away, so you're not mired in code rewrites and long development cycles. You just want to ship as quickly as possible, get the organization by in that, Hey, we can make this work! Here's some new insights that we never had before. That gets you the political capital-- >> John: Yeah. >> For the next project, the next project, and you just got to keep doing that over and over again. >> Yeah, yeah. I always call that agile more of a blank check in a safe harbor because, in case you fail forward, (laughing) I'm failing forward. (laughing) You keep your job, but there's some merit to that. But here's the trick question for you: Now let's talk about hybrid. >> Umm-hmm. >> On prem and cloud. Now, that's the real challenge. What are you guys doing there because now I don't want to have a job on prem. I don't want to have a job on the cloud. That's not redundancy, that's inefficient, that's duplicates. >> Yes. >> So that's an issue. So how do you guys tee it up there for the customer? 
And what's the playbook for them, and people who are trying to scratching their heads saying, I want on prem. And Oracle got this right. Their earnings came out pretty good, same code on prem, off prem, same code base. So workloads can move depending upon the use cases. >> Yep. >> How do you guys compare? >> Actually that's the exact same approach that we're taking because, again, it's all about that customer shouldn't have to make the either or-- >> So for you guys, interfacing code same on prem and cloud. >> That's right. So you can run our big data solutions on Amazon, Microsoft, any kind of cloud Hadoop environment. We can connect to data sources that are in the cloud, so different SAAS apps. >> John: Umm-hmm. >> If you want to suck data out of there. We got all the out-of-the-box connectivity to all the major SAAS applications. And we can also actually leverage a lot of these new cloud processing engines, too. So we're trying to be the abstraction layer, so now it's not just about Spark and Spark streaming, there's all these new platforms that are coming out in the cloud. So we're integrating with that, so you can use our interface and then push down the processing to a cloud data processing system. So there's a lot of opportunity here to use cloud, but, again, we don't want to be... We want to make things more flexible. It's all about enabling flexibility for the organization. So if they want to go cloud, great. >> John: Yep. >> There's plenty of organizations that if they don't want to go cloud, that's fine, too. >> So if I get this right, standard interface on prem and cloud for the usability, under the hood it's integration points in clouds, so that data sources, whatever they are and through whatever could be Kinesis coming off Amazon-- >> Exactly! >> Into you guys, or Ah-jahs got some stuff-- >> Exactly! >> Over there, That all works under the hood. >> Exactly! >> Abstracts from the user. >> That's right! >> Okay, so the next question is, okay, to go that way, that means it's a multicloud world. You probably agree with that. Multicloud meaning, I'm a customer. I might have multiple workloads on multiple clouds. >> That's where it is today. I don't know if that's the endgame? And obviously all this is changing very, very quickly. >> Okay (laughing). >> So I mean, Informatica we're neutral across multiple vendors and everything. So-- >> You guys are Switzerland. >> We're the Switzerland (laughing), so we work with all the major cloud providers, and there's new one that we're constantly signing up also, but it's unclear how the market rule shipped out. >> Umm-hmm. >> There's just so much information out there. I think it's unlikely that you're going to see mass consolidation. We all know who the top players are, and I think that's where a lot of large enterprises are investing, but we'll see how things go in the future, too. >> Where should customers spend their focus because this you're seeing the clouds. I was just commenting about Google yesterday, with AMIT, AI, and others. That they're to be enterprise-ready. You guys are very savvy in the enterprising, there's a lot of table stakes, SLAs to integration points, and so, there's some clouds that aren't ready for prime time, like Google for the enterprise. Some are getting there fast like Amazon Ah-jahs super enterprise-friendly. They have their own problems and opportunities. But they are very strong on the enterprise. What do you guys advise customers? What are they looking at right now? 
Where should they be spending their time, writing more code, scripts, or tackling the data? How do you guys help them shift their focus? >> Yeah, yeah! >> And where-- >> And definitely not scripts (laughing). >> It's about the worst thing you can do because... And it's all for all the reasons we understand. >> Why is that? >> Well, again, we we're talking about being agile. There's nothing agile about manually sitting there, writing Java code. Think about all the developers that were writing MapReduce code three or four years ago (laughing). Those guys, well, they're probably looking for new jobs right now. And with the companies who built that code, they're rewriting all of it. So that approach of doing things at the lowest possible level doesn't make engineering sense. That's why the kind of abstraction layer approach makes so much better sense. So where should people be spending their time? It's really... The one thing technology cannot do is it can't substitute for context. So that's business context, understanding if you're in healthcare there's things about the healthcare industry that only that healthcare company could possibly know, and know about their data, and why certain data is structured the way it is. >> John: Yeah. >> Or financial services or retail. So business context is something that only that organization can possibly bring to the table, and organizational context, as you were alluding to before, roles and responsibilities, who should have access to data, who shouldn't have access to data, That's also something that can be prescribed from the outside. It's something that organizations have to figure out. Everything else under the hood, there's no reason whatsoever to be mired in these long code cycles. >> John: Yeah. >> And then you got to rewrite it-- >> John: Yeah. >> And you got to maintain it. >> So automation is one level. >> Yep. >> Machine learning is a nice bridge between the taking advantage of either vertical data, or especially, data for that context. >> Yep. >> But then the human has to actually synthesize it. >> Right! >> And apply it. That's the interface. Did I get that right, that progression? >> Yeah, yeah. Absolutely! And the reason machine learning is so cool... And I'm glad you segway into that. Is that, so it's all about having the machine learning assist the human, right. So the humans don't go away. We still have to have people who understand-- >> John: Okay. >> The business context and the organizational context. But what machine learning can do is in the world of big data... Inherently, the whole idea of big data is that there's too much data for any human to mentally comprehend. >> John: Yeah. >> Well, you don't have to mentally comprehend it. Let the machine learning go through, so we've got this unique machine learning technology that will actually scan all the data inside of Hadoop and outside of Hadoop, and it'll identify what the data is-- >> John: Yeah. >> Because it's all just pattern matching and correlations. And most organizations have common patterns to their data. So we figured up all this stuff, and we can say, Oh, you got credit card information here. Maybe you should go look at that, if that's not supposed to be there (laughing). Maybe there's a potential violation there? So we can focus the manual effort onto the places where it matters, so now you're looking at issues, problems, instead of doing the day-to-day stuff. 
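The "it's all just pattern matching and correlations" point about spotting credit card data can be made concrete with a small scanner: a regex to find candidate values plus a Luhn checksum to cut false positives, run over samples from each column. This is a generic illustration of the technique, not Informatica's discovery engine; the column names and sample data are invented.

```python
# Generic illustration of sensitive-data discovery: regex + Luhn check per column.
# Not a vendor engine; the sample columns and values are invented.
import re

CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def luhn_ok(number):
    digits = [int(d) for d in re.sub(r"\D", "", number)][::-1]
    total = sum(d if i % 2 == 0 else (d * 2 - 9 if d * 2 > 9 else d * 2)
                for i, d in enumerate(digits))
    return len(digits) >= 13 and total % 10 == 0

def flag_columns(table):
    """Return column names where sampled values look like real card numbers."""
    flagged = []
    for col, values in table.items():
        hits = [v for v in values if CARD_RE.search(v) and luhn_ok(v)]
        if hits:
            flagged.append(col)
    return flagged

sample = {"customer_note": ["call back monday", "paid with 4111 1111 1111 1111"],
          "order_id": ["A-1009", "A-1010"]}
print(flag_columns(sample))   # ['customer_note'] -> route to a data steward for review
```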
The day-to-day stuff is fully automated and that's not what organizations-- >> So the guys that are losing their jobs, those Java developers writing scripts, to do the queries, where should they be focusing? Where should they look for jobs? Because I would agree with you that their jobs would be because the the MapReduce guys and all the script guys and the Java guys... Java has always been the bulldozer of the programming language, very functional. >> Murthy: Yep. >> But where those guys go? What's your advice for... We have a lot of friends, I'm sure you do, too. I know a lot of friends who are Java developers who are awesome programmers. >> Yeah. >> Where should they go? >> Well, so first, I'm not saying that Java's going to go away, obviously (laughing). But I think Java-- >> Well, I mean, Java guys who are doing some of the payload stuff around some of the deep--- >> Exactly! >> In the bowels of big data. >> That's right! Well, there's always things that are unique to the organization-- >> Yeah. >> Custom applications, so all that stuff is fine. What we're talking about is like MapReduce coding-- >> Yeah, what should they do? What should those guys be focusing on? >> So it's just like every other industry you see. You go up the value stack, right. >> John: Right. >> So if you can become more of the data governor, the data stewards, look at policy, look at how you should be thinking about organizational context-- >> John: And governance is also a good area. >> And governance, right. Governance jobs are just going to explode here because somebody has to define it, and technology can't do this. Somebody has to tell the technology what data is good, what data is bad, when do you want to get flagged if something is going wrong, when is it okay to send data through. Whoever decides and builds those rules, that's going to be a place where I think there's a lot of opportunities. >> Murthy, final question. We got to break, we're getting the hook sign here, but we got Informatica World coming up soon in May. What's going to be on the agenda? What should we expect to hear? What's some of the themes that you could tease a little bit, get people excited. >> Yeah, yeah. Well, one thing we want to really provide a lot of content around the journey to the cloud. And we've been talking today, too, there's so many organizations who are exploring the cloud, but it's not easy, for all the reasons we just talked about. Some organizations want to just kind of break away, take out, rip out everything in IT, move all their data and their applications to the cloud. Some of them are taking more of a progressive journey. So we got customers who've been on the leading front of that, so we'll be having a lot of sessions around how they've done this, best practices that they've learned. So hopefully, it's a great opportunity for both our current audience who's always looked to us for interesting insights, but also all these kind of emerging folks-- >> Right. >> Who are really trying to figure out this new world of data. >> Murthy, thanks so much for coming on The Cube. Appreciate it. Informatica World coming up. You guys have a great solution, and again, making it easier (laughing) for people to get the data and put those new processes in place. This is The Cube breaking it down for Big Data SV here in conjunction with Strata Hadoop. I'm John Furrier. More live coverage after this short break. (electronic music)

Published Date : Mar 15 2017


SENTIMENT ANALYSIS :

ENTITIES

EntityCategoryConfidence
JohnPERSON

0.99+

MicrosoftORGANIZATION

0.99+

AmazonORGANIZATION

0.99+

Murthy MathiprakasamPERSON

0.99+

2017DATE

0.99+

Silicon ValleyLOCATION

0.99+

MurthyPERSON

0.99+

OracleORGANIZATION

0.99+

AMITORGANIZATION

0.99+

John FurrierPERSON

0.99+

Twelve monthsQUANTITY

0.99+

JavaTITLE

0.99+

InformaticaORGANIZATION

0.99+

O'Reilly MediaORGANIZATION

0.99+

12 monthsQUANTITY

0.99+

San Jose, CaliforniaLOCATION

0.99+

24 monthsQUANTITY

0.99+

MayDATE

0.99+

tomorrowDATE

0.99+

yesterdayDATE

0.99+

GoogleORGANIZATION

0.99+

SparkTITLE

0.99+

firstQUANTITY

0.99+

last nightDATE

0.99+

todayDATE

0.98+

MurthPERSON

0.98+

Informatica WorldORGANIZATION

0.98+

SwitzerlandLOCATION

0.98+

twoQUANTITY

0.98+

three-partQUANTITY

0.98+

threeQUANTITY

0.98+

bothQUANTITY

0.97+

threeDATE

0.96+

NYCLOCATION

0.96+

Big Data WeekEVENT

0.96+

one levelQUANTITY

0.96+

oneQUANTITY

0.96+

one speedQUANTITY

0.96+

two colliding forcesQUANTITY

0.95+

one-trickQUANTITY

0.93+

MapReduceTITLE

0.93+

one wayQUANTITY

0.93+

four years agoDATE

0.92+

#BigDataSVTITLE

0.91+

KinesisORGANIZATION

0.87+

The CubeORGANIZATION

0.86+

MapReduceORGANIZATION

0.85+

agileTITLE

0.84+

Big DataORGANIZATION

0.81+

Arun Murthy, Hortonworks - Spark Summit East 2017 - #SparkSummit - #theCUBE


 

>> [Announcer] Live, from Boston, Massachusetts, it's the Cube, covering Spark Summit East 2017, brought to you by Databricks. Now, your hosts, Dave Vellante and George Gilbert. >> Welcome back to snowy Boston everybody, this is The Cube, the leader in live tech coverage. Arun Murthy is here, he's the founder and vice president of engineering at Hortonworks, father of YARN, can I call you that, godfather of YARN, is that fair, or? (laughs) Anyway. He's so, so modest. Welcome back to the Cube, it's great to see you. >> Pleasure to have you. >> Coming off the big keynote, (laughs) you ended the session this morning, so that was great. Glad you made it into Boston, and uh, lot of talk about security and governance, you know we've been talking about that for years, it feels like it's truly starting to come into the mainstream, Arun, so. >> Well I think it's just a reflection of what customers are doing with the tech now. Now, three, four years ago, a lot of it was pilots, a lot of it was, you know, people playing with the tech. But increasingly, it's about, you know, people actually applying stuff in production, having data, system of record, running workloads both on prem and on the cloud, cloud is sort of becoming more and more real at mainstream enterprises. So a lot of it means, as you take any of the examples today, any interesting app will have some sort of real time data feed, it's probably coming out from a cell phone or sensor, which means that data is actually not, in most cases not coming on prem, it's actually getting collected in a local cloud somewhere, it's just more cost effective, why would you put up 25 data centers if you don't have to, right? So then you've got to connect that data, production data you have or customer data you have or data you might have purchased, and then join them up, run some interesting analytics, do geo-based real time threat detection, cyber security. A lot of it means that you need a common way to secure data, govern it, and that's where we see the action, I think it's a really good sign for the market and for the community that people are pushing on these dimensions of the broader, because, getting pushed in this dimension because it means that people are actually using it for real production workloads. >> Well in the early days of Hadoop you really didn't talk that much about cloud. >> Yeah. >> You know, and now, >> Absolutely. >> It's like, you know, duh, cloud. >> Yeah. >> It's everywhere, and of course the whole hybrid cloud thing comes into play, what are you seeing there, what are things you can do in a hybrid, you know, or on prem that you can't do in a public cloud, and what's the dynamic look like? >> Well, it's definitely not an either or, right? So what we're seeing is increasingly interesting apps need data which are born in the cloud and they'll stay in the cloud, but they also need transactional data which stays on prem, you might have an EDW for example, right? >> Right. >> There's not a lot of, you know, people want to solve business problems and not just move data from one place to another, right? Or back from one place to another, so it's not interesting to move an EDW to the cloud, and similarly it's not interesting to bring your IOT data or sensor data back into on-prem, right? Just makes sense. So naturally what happens is, you know, at Hortonworks we talk of a kind of modern app, or a modern data app, which means a modern data app has to span, has to sort of, you know, it can pass both on-prem data and cloud data.
>> Yeah, you talked about that in your keynote years ago. Furio said that the data is the new development kit. And now you're seeing the apps are just so dang rich, >> Exactly, exactly. >> And they have to span >> Absolutely. >> physical locations, >> Yeah. >> But then this whole thing of IOT comes up, we've been having a conversation on The Cube, last several Cubes of, okay, how much stays out, how much stays in, there's a lot of debates about that, there's reasons not to bring it in, but you talked today about some of the important stuff will come back. >> Yeah. >> So the way this is, this all is going to be, you know, there's a lot of data that should be born in the cloud and stay there, the IOT data, but then what will happen increasingly is, key summaries of the data will move back and forth, so key summaries of your EDW will move to the cloud, sometimes key summaries of your IOT data, you know, you want to do some sort of historical training and analytics, that will come back on-prem, so I think there's a bi-directional data movement, but it just won't be all the data, right? It'll be key interesting summaries of the data but not all of it. >> And a lot of times, people say well it doesn't matter where it lives, cloud should be an operating model, not a place where you put data or applications, and while that's true and we would agree with that, from a customer standpoint it matters in terms of performance and latency issues and cost and regulation, >> And security and governance. >> Yeah. >> Absolutely. >> You need to think those things through. >> Exactly, so I mean, so that's what we're focused on, to make sure that you have a common security and governance model regardless of where data is, so you can think of it as, infrastructure you own and infrastructure you lease. >> Right. >> Right? Now, the details matter of course, when you go to the cloud you use S3 for example, or ADLS from Microsoft, but you've got to make sure that there's a common sort of security governance front and top of it, in front of it, as an example one of the things that, you know, in the open source community, Ranger's a really sort of key project right now from a security authorization and authentication standpoint. We've done a lot of work with our friends at Microsoft to make sure, you can actually now manage data in WASB, which is their object store, data stream, natively with Ranger, so you can set a policy that says only Dave can access these files, you know, George can access these columns, that sort of stuff is natively done on the Microsoft platform thanks to the relationship we have with them. >> Right. >> So that's actually really interesting for the open source communities. So you've talked about sort of commodity storage at the bottom layer and even if they're different sort of interfaces and implementations, it's still commodity storage, and now what's really helpful to customers is that they have a common security model, >> Exactly. >> Authorization, authentication, >> Authentication, lineage, provenance, >> Oh okay. >> You want to make sure all of these are common services across. >> But you've mentioned all of the different data patterns, like the stuff that might be streaming in on the cloud, what, assuming you're not putting it into just a file system or an object store, and you want to sort of merge it with >> Yeah. >> Historical data, so what are some of the data stores other than the file system, in other words, newfangled databases to manage this sort of interaction?
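To make the Ranger example above a little more concrete, here is a rough sketch of defining a "George can access these columns" policy through Apache Ranger's public REST API. The admin host, service name, credentials, and table are assumptions made up for illustration, and the JSON shape should be checked against the Ranger version in use; a file-level policy for Dave would be a second, similarly shaped call.

```python
# Rough sketch: a column-level access policy created through Apache Ranger's
# public REST API. Host, service name, and credentials are placeholders; the
# field names follow the v2 policy API but should be verified for your version.
import requests

RANGER_ADMIN = "http://ranger-admin.example.com:6080"   # hypothetical host

policy = {
    "service": "cluster1_hive",                # assumed Hive service registered in Ranger
    "name": "sensor_summary_columns_for_george",
    "resources": {
        "database": {"values": ["iot"]},
        "table":    {"values": ["sensor_summary"]},
        "column":   {"values": ["device_id", "reading"]},   # only these columns are visible
    },
    "policyItems": [
        {"users": ["george"],
         "accesses": [{"type": "select", "isAllowed": True}]},
    ],
}

resp = requests.post(
    f"{RANGER_ADMIN}/service/public/v2/api/policy",
    json=policy,
    auth=("admin", "admin-password"),          # placeholder credentials
    timeout=30,
)
resp.raise_for_status()
print("created policy id:", resp.json().get("id"))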
>> So I think what you're saying is, we certainly have the raw data, the raw data is going to land in whatever cloud native storage, >> Yeah. >> It's going to be Amazon, WASB, ADLS, Google Storage. But then increasingly you want, so now the patterns change, so you have raw data, you have some sort of an ETL process, what's interesting in the cloud is that even the processed data or, if you take the unstructured raw data and structure it, that structured data also needs to live on the cloud platform, right? The reason that's important is because A, it's cheaper to use the native platform rather than set up your own database on top of it. The other one is you also want to take advantage of all the native services that the cloud storage provides, so for example, linking your application. So automatically, data in WASB, you know, if you can set up a policy and easily say this structured data table that I have, which is a summary of all the IOT activity in the last 24 hours, you can, using the cloud provider's technologies, actually make it show up easily in Europe, like you don't have to do any work, right? So increasingly what we at Hortonworks focus a lot on is to make sure that all of the compute engines, whether it's Spark or Hive or, you know, or MapReduce, it doesn't really matter, they're all natively working on the cloud provider's storage platform. >> [George] Okay. >> Right, so, >> Okay. >> That's a really key consideration for us. >> And the follow up to that, you know, there's a bit of a misconception that Spark replaces Hadoop, but it actually can be a processing, a compute engine for, >> Yeah. >> That can complement or replace some of the compute engines in Hadoop, help us frame how you talk about it with your customers. >> For us it's really simple, like in the past, the only option you had on Hadoop to do any computation was MapReduce, that was, I started working on MapReduce 11 years ago, so as you can imagine, it's a pretty good run for any technology, right? Spark is definitely the interesting sort of engine for sort of the, anything from machine learning to ETL for data on top of Hadoop. But again, what we focus a lot on is to make sure that every time we bring in, so right now, when we started on HDP, the first HDP had about nine open source projects, literally just nine. Today, the last one we shipped was 2.5, HDP 2.5 had about 27 I think, like it's a huge sort of explosion, right? But the problem with that is not just that we have 27 projects, the problem is that you've got to make sure each of the 27 work with all the 26 others. >> It's a QA nightmare. >> Exactly. So that integration is really key, so same thing with Spark, we want to make sure you have security and YARN (mumbles), like you saw in the demo today, you can now run Spark SQL but also make sure you get low level (mumbles) masking, all of the enterprise capabilities that you need, and I was at a financial services firm three or four weeks ago in Chicago. Today, to do the equivalent of what I showed today on demo, they need literally, they have a classic EDW, and they have to maintain anywhere between 1500 to 2500 views of the same database, that's a nightmare as you can imagine. Now the fact that you can do this on the raw data using Hive or Spark or Pig or MapReduce, it doesn't really matter, it's really key, and that's the thing we push to make sure things like YARN security work across all the stacks, all the open source techs.
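The pattern described here, raw data landing in cloud object storage and a compute engine structuring it in place, looks roughly like the following with Spark. The bucket names, paths, and summary logic are invented; the point is only that the engine reads and writes the cloud provider's storage directly (s3a://, wasb://, adl://, gs://) rather than copying data into a separate database first.

```python
# Minimal sketch, assuming Spark is already configured with credentials for
# the object store. Paths and column names are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("iot-summary").getOrCreate()

# Raw events born in the cloud (e.g. JSON files dropped by devices).
raw = spark.read.json("s3a://example-iot-landing/events/2017/02/")

# Structure it: keep the useful fields and summarise the last 24 hours per device.
summary = (
    raw.select("device_id", "reading", F.col("event_time").cast("timestamp").alias("ts"))
       .where(F.col("ts") >= F.expr("current_timestamp() - INTERVAL 24 HOURS"))
       .groupBy("device_id")
       .agg(F.avg("reading").alias("avg_reading"), F.count("reading").alias("events"))
)

# The structured result also stays on the cloud platform, as a queryable table.
summary.write.mode("overwrite").parquet("s3a://example-iot-curated/sensor_summary/")

spark.stop()
```

The same job could write to a wasb:// or adl:// path instead; only the URI scheme and the credentials configuration change.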
>> So that makes life better, a simplification use case if you will, >> Yeah. >> What are some of the other use cases that you're seeing things like Spark enable? >> Machine learning is a really big one. Increasingly, every product is going to have some, people call it, machine learning and AI and deep learning, there's a lot of techniques out there, but the key part is you want to build a predictive model, in the past (mumbles) everybody wants to build a model and score what's happening in the real world against the model, but equally important, make sure the model gets updated as more data comes in, and actually as the model's scores do get smaller over time. So that's something we see all over, so for example, even within our own product, it's not just us enabling this for the customer, for example at Hortonworks we have a product called SmartSense which allows you to optimize how people use Hadoop. Where the, what are the opportunities for you to explore deficiencies within your own Hadoop system, whether it's Spark or Hive, right? So we now put machine learning into SmartSense. And show you that customers who are running queries like you are running, Mr. Customer X, other customers like you are tuning Hadoop this way, they're running this sort of config, they're using these sort of features in Hadoop. That allows us to actually make the product itself better all the way down the pipe. >> So you're improving the scoring algorithm or you're sort of replacing it with something better? >> What we're doing there is just helping them optimize their Hadoop deploys. >> Yep. >> Right? You know, configuration and tuning and kernel settings and network settings, we do that automatically with SmartSense. >> But the customer, you talked about scoring and trying to, >> Yeah. >> They're tuning that, improving that and increasing the probability of its accuracy, or is it? >> It's both. >> Okay. >> So the thing is what they do is, you initially come with a hypothesis, you have some amount of data, right? I'm a big believer that over time, more data, you're better off spending more, getting more data into the system than to tune that algorithm, financially, right? >> Interesting, okay. >> Right, so you know, for example, you know, talk to any of the big guys like Facebook, because they'll do the same, what they'll say is it's much better to spend your time getting 10x data into the system and improving the model rather than spending 10x the time and improving the model itself on day one. >> Yeah, but that's a key choice, because you got to >> Exactly. >> Spend money on doing either, >> One of them. >> And you're saying go for the data. >> Go for the data. >> At least now. >> Yeah, go for data, what happens is the good part of that is it's not just the model, it's the, what you've got to really get through is the entire end to end flow. >> Yeah. >> All the way from data aggregation to ingestion to collection to scoring, all that aspect, you're better off sort of walking through the paces, like building the entire end to end product, rather than spending time in a silo trying to make a lot of change.
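The "build a model, score the real world against it, and keep updating it as more data comes in" loop can be sketched generically in a few lines. This is an illustration with scikit-learn and synthetic data, not a description of how SmartSense is implemented.

```python
# Generic sketch of score-then-update: an incremental learner is scored on new
# records as they arrive, and the same records are then folded back in to keep
# the model current. Synthetic data; not tied to any specific product.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
model = SGDClassifier()

# Initial hypothesis from a small starting batch.
X0 = rng.normal(size=(200, 4))
y0 = (X0[:, 0] + 0.5 * X0[:, 1] > 0).astype(int)
model.partial_fit(X0, y0, classes=[0, 1])

# As new data arrives: score the current model on it first, then update with it.
for _ in range(5):
    X_new = rng.normal(size=(100, 4))
    y_new = (X_new[:, 0] + 0.5 * X_new[:, 1] > 0).astype(int)
    print("accuracy on the new batch:", model.score(X_new, y_new))
    model.partial_fit(X_new, y_new)   # "more data" usually beats more tuning
```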
>> We've talked to a lot of machine learning tool vendors, application vendors, and it seems like we got to the point with Big Data where we put it in a repository, then we started doing better at curating it and understanding it, then starting to do a little bit of exploration with business intelligence, but with machine learning, we don't have something that does this end to end, you know, from acquiring the data, building the model to operationalizing it, where are we on that, who should we look to for that? >> It's definitely very early, I mean if you look at, even the EDW space, for example, what is EDW? EDW is ingestion, ETL, and then sort of fast query layer, OLAP, BI, on and on and on, right? So that's the full EDW flow, I don't think as a market, I mean, it's really early in this space, not only as an overall industry, we have that end to end sort of industrialized design concept, it's going to take time, but a lot of people are ahead, you know, the Googles of the world are ahead, over time a lot of people will catch up. >> We got to go, I wish we had more time, I had so many other questions for you but I know time is tight in our schedule, so thanks so much Arun, >> Appreciate it. For coming on, appreciate it, alright, keep right there everybody, we'll be back with our next guest, it's The Cube, we're live from Spark Summit East in Boston, right back. (upbeat music)

Published Date : Feb 9 2017

Sanjay Poonen, VMware | VMworld 2019


 

>> live from San Francisco, celebrating 10 years of high tech coverage. It's the Cube covering Veum World 2019. Brought to you by IBM Wear and its ecosystem partners. >> Welcome back to the cubes Live coverage Of'em World 2019 in San Francisco, California We're here at Mosconi North Lobby. Two sets. Jumper of my Coast. David wanted Dave 10 years. Our 10th season of the cue coming up on our 10 year anniversary May of 2020. But this corner are 10 years of the Cube. Our next guest is Sanjay Putting Chief Operating Officer Of'em where who took the time out of his busy schedule to help us do a commemorative look back. Thanks for coming to our studio. Hello, John. That was great. Fans of yours was really regulations on the 10 year mark with the, um well, we really appreciate your partnership. We really appreciate one. Things we love doing is covering as we call that thing. David, I coined the term tech athletes, you know, kind of the whole joke of ESPN effect that we've been called and they're really tech athlete is just someone who's a strong in tech always fighting for that extra inch. Always putting in the hard work discipline, smart, competitive. You get all that above. Plus, you interviewed athletes today on state real athletes. Real athletes, Tech show. So I guess they would qualify as Tech athlete Steve Young. That's pretty funny. It was a >> great time. We've been trying to, you know, Veum World is now the first time was 2004. So it's 1/16 season here, and traditionally many of these tech conference is a really boring because it's just PowerPoint dead by power point lots of Tec Tec Tec Tec breakout sessions. And we're like, You know, last year we thought, Why don't we mix it up and have something that's inspirational education We had Malala was a huge hit. People are crying at the end of the session. Well, let's try something different this year, and we thought the combination of Steve Young and Lyndsey one would be great. Uh, you know, Listen, just like you guys prepped for these interviews, I did a lot of prep. I mean, I'm not I'm a skier, but I'm nowhere close to an avid skier that watch in the Olympics huge fan of Steve Young so that part was easy, but preparing for Lindsay was tough. There were many dynamics of that interview that I had to really think through. You want to get both of them to converse, you know, he's She's 34 he's 55. You want to get them to really feel like it's a good and I think it kind of played out well. >> You were watching videos. A great prep. Congratulations >> trying t o show. It's the culture of bringing the humanization aspect of your team about tech for good. Also, you believe in culture, too, and I don't get your thoughts on that. You recently promoted one of your person that she has a chief communications Johnstone Johnstone about stars you promote from within. This >> is the >> culture you believe it. Talk about the ethos. Jones is a rock star. We love her. She's just >> hardworking, credible, well respected. Inside VM where and when we had a opening in that area a few months ago, I remember going to the her team meeting and announcing, and the team erupted in cheers. I mean that to me tells me that somebody was well liked from within, respected within and pure level and you know the organization's support for a promotion of that kind of battlefield promotion. It's great big fan of hers, and this is obviously her first show at Vienna. Well, along with Robin, Matt, look. 
So we kind of both of them as the chief marketing officer, Robin and Jones >> and Robinson story. Low Crawl made her interim first, but they then she became Steve Made it Permanent way. >> Want them to both do well. They have different disciplines. Susan, uh, national does our alliances, you know, if you include my chief of staff for the six of my direct reports are women, and I'm a big believer in more women. And take why? Because I want my Sophia, who's 13 year old do not feel like the tech industry is something that is not welcome to women in tech. So, you know, we really want to see more of them. And I hope that the folks who are reporting to me in senior positions senior vice president is an example can be a role model to other women who are aspiring, say, one day I wanna be like a Jones Stone or Robin. Madam Local Susan Nash, >> John and I both have daughters, so we're passionate about this. Tech is everywhere, so virtually whatever industry they go into. But I've asked this question Sanjay of women before on the Cube. I've never asked him in. And because you have a track record of hiring women, how do you succeed in hiring women? Sometimes way have challenges because way go into our little network. Convenient. What? What's your approach? Gotta >> blow off that network and basically say First off, if that network is only male or sometimes unfortunately white male or just Indian male, which is sometimes the nature of tech I mean, if you're looking for a new position, tell the recruiters to find you something that's different. Find me, Ah woman. Find me on underrepresented minority like an African American Latino and those people exist. You just have a goal. Either build a network yourself. So you've got those people on your radar. We'll go look, and that's more work on us, says leaders. But we should be doing that work. We should be cultivating those people because the more you promote capable. First off, you have to be capable. This is not, you know, some kind of affirmative action away. We want capable people. Someone shouldn't get the job just because they're a woman just because the minority, that's not the way we work. We want capable people to do it. But if we have to go a little further to find them, we'll go do it. That's okay. They exist. So part of my desires to cultivate relationships with women and underrepresented minorities in the world that can actually in the world of tech and maintain those relationships because you never know you're not gonna hire them immediately. But at some point in time, you might need to have them on your radar. >> Sanjay, I wanna ask you a big picture question. I didn't get a chance to ask path this morning. I was at the bar last night just having a little dinner, and I was checking out Twitter. And he said that the time has never been. It's never been a greater time arm or important time to be a technologist. Now I saw that I went interesting. What does that mean? Economic impact, social impact? And I know we often say that, and I don't say this to disparage the comment. It's just to provide historical context and get a get it open discussion about what is actually achievable with tech in this era and what we actually believe. So I started to do some research and I started right down. First of all, I presume you believe that right on your >> trusty napkin at the >> bar. So there has never been a more important time to be a technologist. You know, it's your company at your league. You know, Pat, I presume you agree with it. Yeah, absolutely. 
I slipped it back to the 1900. Electricity, autos, airplanes, telephones. So you we, as an industry are up against some pretty major innovations. With that historical context, Do you feel as though we can have a similar greater economic and social impact? >> Let's start with economic first and social. Next time. Maybe we should do the opposite, but economic? Absolutely. All those inventions that you >> have are all being reinvented. The technology the airplanes all been joined by software telephones are all driving through, you know, five g, which is all software in the future. So tech is really reinventing every industry, including the mundane non tech industries like agriculture. If you look at what's happening. Agriculture, I ot devices are monitoring the amount of water that should go to particular plant in Brazil, or the way in which you're able to use big data to kind of figure out what's the right way to think about health care, which is becoming very much tech oriented financial service. Every industry is becoming a tech industry. People are putting tech executives on their boards because they need an advice on what is the digital transformations impact on them cybersecurity. Everyone started by this. Part of the reason we made these big moves and security, including the acquisition of carbon black, is because that's a fundamental topic. Now social, we have to really use this as a platform for good. So just the same way that you know a matchstick could help. You know, Warm house and could also tear down the house. Is fire good or bad? That's been the perennial debate since people first discovered fire technology. Is this the same way it can be used? Reboot. It could be bad in our job is leaders is to channel the good and use examples aware tech is making a bit force for good. And then listen. Some parts of it may not be tech, but just our influence in society. One thing that pains me about San Francisco's homelessness and all of the executives that a partner to help rid this wonderful city of homeless men. They have nothing to attack. It might be a lot of our philanthropy that helps solve that and those of us who have much. I mean, I grew up in a poor, uh, bringing from Bangla, India, but now I have much more than I have. Then I grew up my obligations to give back, and that may have nothing to do with Tech would have to do all with my philanthropy. Those are just principles by which I think when you live with your a happier man, happier woman, you build a happier >> society and I want to get your thoughts on common. And I asked a random set of college students, thanks to my son that the network is you said your daughter to look at the key to Pat's King Pat's commentary in The Cube here this morning that was talking about tech for good. And here's some of the comments, but I liked the part about tech for good and humanity. Tech with no purpose is meaningless tech back by purposes. More impactful is what path said then the final comments and Pat's point quality engineering backing quality purpose was great. So again, this is like this is Gen Z, not Millennials. But again, this is the purpose where it's not just window dressing on on industry. It's, you know, neutral fire. I like that argument. Fire. That's a good way Facebook weaponizing Facebook could be good or bad, right? Same thing. But the younger generation. You're new demographics that are coming into cloud. Native. Yeah, what do you think? >> No. And I think that's absolutely right. 
We have to build a purpose driven company whose purpose is much more than just being the world's best software infrastructure company or being the most profitable. We have to obviously deliver results to our shareholders. But I think if you look at the Milton Friedman quote, you know, the paper that was written that said the sole purpose of a company is just making profits, and every business school student is made to read that, I >> think even he >> would probably agree that, listen, today, while that's important, the modern company has to also have an appropriate good that they are focused on, you know, with social good or not. And I don't think it's a trade off, being able to have a purpose driven culture that makes an impact on society and being profitable.
So part of a diet is making security intrinsic to the platform. So the more that we could make security intrinsic to the platform, we avoid the bloatware of agents, the number of different consuls, all of this pleasure of tools that led to this morass. And what happens at the end of that is you about these point vendors, Okay, Who get gobbled up by hardware companies that's happening spattered my hardware companies and sold to private equity companies. What happens? The talent they all leave, we look at the landscape is that's ripe for disruption, much the same way we saw things with their watch. And, you know, we had only companies focusing VD I and we revitalize and innovative that space. So what we're gonna do in securities make it intrinsic and take a modern cloud security company carbon black, and make that part of our endpoint Security and Security Analytics strategy? Yes, they're one of two companies that focus in the space. And when we did air watch, they were number three. Good was number one. Mobile line was number two and that which was number three and the embers hands. We got number one. The perception in this space is common. Lacks number two and crowdstrike number one. That's okay, you know, that might be placed with multiple vendors, but that's the state of it today, and we're not going point against Crowdstrike. Our competition's not just an endpoint security point to a were reshaping the entire security industry, and we believe with the integration that we have planned, like that product is really good. I would say just a cz good upper hand in some areas ahead of common black, not even counting the things we're gonna integrate with it. It's just that they didn't have the gold market muscle. I mean, the sales and marketing of that company was not as further ahead that >> we >> change Of'em where we've got an incredible distribution will bundle that also with the Dell distribution, and that can change. And it doesn't take long for that to take a lot of customers here. One copy black. So that's the way in which we were old. >> A lot of growth there. >> Yeah, plenty of >> opportunity to follow up on that because you've obviously looked at a lot of companies and crowdstrike. I mean, huge valuation compared to what you guys paid for carbon black. I mean, >> I'm a buyer. I mean, if I'm a buyer, I liked what we paid. >> Well, I had some color to it. Just when you line up the Was it really go to market. I mean some functions. Maybe not that there >> was a >> few product gaps, but it's not very nominal. But when you add what we announced in a road map app, defensive alderman management, the integration of works based one this category is gonna be reshaped very quickly. Nobody, I mean, the place. We're probably gonna compete more semantic and McAfee because most of those companies that kind of decaying assets, you know, they've gotten acquired by the companies and they're not innovating. So I'd say the bulk of the market will be eating up the leftover fossils of those sort of companies as as companies decided they want to invest in legacy. Technology is a more modern, but I think the differentiation from Crowdstrike very clear is we integrate these, these technology and the V's fear. Let me give an example. With that defense, we can make that that workload security agent list. Nobody can do that. Nobody, And that's apt defense with carbon black huge innovation. I described on stage workspace one plus carbon black is like peanut butter and jelly management. 
Security should go together. Nobody could do that as good as us. Okay, what we do inside NSX. So those four areas that I outlined in our plans with carbon black pending the close of the transaction into V sphere Agent Lis with workspace one unified with NSX integrated and into secure state, You know, in the cloud security area we take that and then send it through the V m. Where the devil and other ecosystem channels like you No idea. Security operative CDW You know, I think Dimension data, all the security savvy partners here. I think the distribution and the innovation of any of'em were takes over long term across strike may have a very legitimate place, but our strategy is very different. We're not going point tool against 0.0.2 wish reshaping the security industry. Yeah, What platform? >> You're not done building that platform. My obvious question is the other other assets inside of Arcee and secureworks that you'd like to get your hands on. >> I mean, listen, at this point in time, we are good. I mean, it's the same thing like asking me when we acquired air watching. Nice Here. Are you gonna do more networking and mobility? Yeah, but we're right now. We got enough to Digest in due course you. For five years later, we did acquire Arkin for network Analytics. We acquired fellow Cloud for SD when we're cloud recently, Avi. So the approach we take a hammer to innovations first. You know, if you're gonna have an anchor acquisition, make sure it's got critical mass. I mean, buying a small start up with only 35 people 10 people doesn't really work for us. So we got 1100 people would come back, we're gonna build on it. But let's build, build, build, build, partner and then acquire. So we will partner a lot with a lot of players. That compliment competition will build a lot around this. >> And years from now, we need >> add another tuck in acquisition. But we feel we get a lot in this acquisition from both endpoint security and Security Analytics. Okay, it's too early to say how much more we will need and when we will need that. But, you know, our goal would be Let's go plot away. I have a billion dollar business and then take it from there. >> One more security question, if I may say so. I'm not trying to pit you against your friends and AWS. But there are some cleared areas where your counter poise >> Stevens just runs on eight of us comin back. >> That part about a cloud that helps your class ass business. I like the acquisition. But Steven Schmidt, it reinforced the cloud security conference, said, You know, this narrative in the industry that security is broken is not the right one. Now, by the way, agree with this. Security's a do over pat kill singer. And we talked about that for five years ago. Um, but then in eight of you says the shared security model, when you talk to the practitioners like, yeah, they they cover, that's three and compute. But we have the the real work to d'oh! So help me square that circle. >> Yeah, I think if aws bills Security Service is that our intrinsic to their platform and they open up a prize, we should leverage it. But I don't think aws is gonna build workload security for azure compute or for Gogol compute. That's against the embers or into the sphere. Like after finishing third accordion. And they're like, That's not a goal. You go do it via more So from my perspective. Come back to hydrogen. 80. If there's a workload security problem that's going to require security at the kernel of the hyper visor E C to azure compute containers. Google Compute. 
>> Who's gonna do >> that? Jammer? Hopefully, hopefully better than because we understand the so workloads. Okay, now go to the client site. There's Windows endpoints. There's Mac. There's Lennox. Who should do it? We've been doing that for a while on the client side and added with workspace one. So I think if you believe there is a Switzerland case for security, just like there was a Switzerland case for management endpoint management I described in Point management in Point Security going together like peanut butter and jelly, Whatever your favorite analogy is, if we do that well, we will prove to the market just like we did with their watch An endpoint management. There is a new way of doing endpoint security. Dan has been done ever before. Okay, none >> of these >> guys let me give an example. I've worked at Semantic 15 years ago. I know a lot about the space. None of these guys built a really strategic partnership with the laptop vendors. Okay, Del was not partnering strategically on their laptops with semantic micro. Why? Because if this wasn't a priority, then they were, you know, and a key part of what we're doing here is gonna be able to do end point management. And in point security and partner Adult, they announced unified workspace integrated into the silicon of Dell laptops. Okay, we can add endpoint security that capability next. Why not? I mean, if you could do management security. So, you know, we think that workspace one, we'll get standing toe work space security with the combination of workspace one and security moving and carbon black. >> Sanjay, we talked about this on our little preview and delivery. Done us. We don't need to go into it. The Amazon relationship cleared the way for the strategy in stock price since October 2016 up. But >> one of the >> things I remember from that announcement that I heard from the field sales folks that that were salespeople for VM wear as well as customers, was finally clarity around. What the hell? We're doing the cloud. So I bring up the go to market In the business side, the business results are still strong. Doing great. You guys doing a great job? >> How do you >> keep your field troops motivated? I know Michael Dell says these are all in a strategy line. So when we do these acquisitions, you >> had a lot >> of new stuff coming in. I mean, what's how do you keep him trained? Motivated constantly simplifying whenever >> you get complex because you add into your portfolio, you go back and simplify, simplify, simplify, make it Sesame Street simple. So we go back to that any cloud, any app, any device diagram, if you would, which had security on the side. And we say Now, let's tell you looking this diagram how the new moves that we've made, whether it's pivotal and what we're announcing with tanz ou in the container layer that's in that any Apple air carbon black on the security there. But the core strategy of the emer stays the same. So the any cloud strategy now with the relevance now what, what eight of us, Who's our first and preferred partner? But if you watched on stage, Freddie Mac was incredible. Story. Off moving 600 absent of the N word cloud made of us Fred and Tim Snyder talked about that very eloquently. The deputy CTO. They're ratty Murthy. CTO off Gap basically goes out and says, Listen, I got 800 APS. I'm gonna invest a lot on premise, and when I go to the cloud, I'm actually going to Azure. >> Thanks for joining you. Keep winning. 
Keep motivated through winning >> and you articulate a strategy that constantly tells people Listen. It's their choice of how they run in the data center in the cloud. It's their choice, and we basically on top of all of those in the any cloud AP world. That's how we play on the same with the device and the >> security. A lot of great things having Sanjay. Thanks >> for you know what a cricket fan I am. Congratulations. India won by 318 goals. Is that >> what they call girls run against the West Indies? I think you >> should stay on and be a 40 niner fan for when you get Tom baseball get Tom Brady's a keynote will know will be in good Wasn't Steve Young and today love so inspirational and we just love them? Thank you for coming on the Cube. 10 years. Congratulations. Any cute moments you can point out >> all of them. I mean, I think when I first came to, I was Who's the d? I said ASAP, like these guys, John and Dave, and I was like, Man, they're authentic people. What I like about you is your authentic real good questions. When I came first year, you groomed me a lot of their watch like, Hey, this could be a big hat. No cattle. What you gonna do? And you made me accountable. You grilled me on eight of us. You're grilling me right now on cloud native and modern, absent security, which is good. You keep us accountable. Hopefully, every you're that we come to you, we want to show as a team that we're making progress and then were credible back with you. That's the way we roll. >> Sanjay. Thanks for coming. Appreciate. Okay, we're live here. Stay with us for more of this short break from San Francisco v emerald 2019

Published Date : Aug 27 2019

John Kreisa, Hortonworks | DataWorks Summit 2018


 

>> Live from San José, in the heart of Silicon Valley, it's theCUBE! Covering DataWorks Summit 2018. Brought to you by Hortonworks. (electro music) >> Welcome back to theCUBE's live coverage of DataWorks here in sunny San José, California. I'm your host, Rebecca Knight, along with my co-host, James Kobielus. We're joined by John Kreisa. He is the VP of marketing here at Hortonworks. Thanks so much for coming on the show. >> Thank you for having me. >> We've enjoyed watching you on the main stage, it's been a lot of fun. >> Thank you, it's been great. It's been great general sessions, some great talks. Talking about the technology, we've heard from some customers, some third parties, and most recently from Kevin Slavin from The Shed which is really amazing. >> So I really want to get into this event. You have 2,100 attendees from 23 different countries, 32 different industries. >> Yep. This started as a small, >> That's right. tiny little thing! >> Didn't Yahoo start it in 2008? >> It did, yeah. >> You changed names a few year ago, but it's still the same event, looming larger and larger. >> Yeah! >> It's been great, it's gone international as you've said. It's actually the 17th total event that we've done. >> Yeah. >> If you count the ones we've done in Europe and Asia. It's a global community around data, so it's no surprise. The growth has been phenomenal, the energy is great, the innovations that the community is talking about, the ecosystem is talking about, is really great. It just continues to evolve as an event, it continues to bring new ideas and share those ideas. >> What are you hearing from customers? What are they buzzing about? Every morning on the main stage, you do different polls that say, "how much are you using machine learning? What portion of your data are you moving to the cloud?" What are you learning? >> So it's interesting because we've done similar polls in our show in Berlin, and the results are very similar. We did the cloud poll pole and there's a lot of buzz around cloud. What we're hearing is there's a lot of companies that are thinking about, or are somewhere along their cloud journey. It's exactly what their overall plans are, and there's a lot of news about maybe cloud will eat everything, but if you look at the pole results, something like 75% of the attendees said they have cloud in their plans. Only about 12% said they're going to move everything to the cloud, so a lot of hybrid with cloud. It's how to figure out which work loads to run where, how to think about that strategy in terms of where to deploy the data, where to deploy the work loads and what that should look like and that's one of the main things that we're hearing and talking a lot about. >> We've been seeing that Wikiban and our recent update to the recent market forecast showed that public cloud will dominate increasingly in the coming decade, but hybrid cloud will be a long transition period for many or most enterprises who are still firmly rooted in on-premises employment, so forth and so on. Clearly, the bulk of your customers, both of your custom employments are on premise. >> They are. >> So you're working from a good starting point which means you've got what, 1,400 customers? >> That's right, thereabouts. 
>> Predominantly on premises, but many of them here at this show want to sustain their investment in a vendor that provides them with that flexibility as they decide they want to use Google or Microsoft or AWS or IBM for a particular workload that their existing investment to Hortonworks doesn't prevent them from facilitating. It moves that data and those workloads. >> That's right. The fact that we want to help them do that, a lot of our customers have, I'll call it a multi-cloud strategy. They want to be able to work with an Amazon or a Google or any of the other vendors in the space equally well and have the ability to move workloads around and that's one of the things that we can help them with. >> One of the things you also did yesterday on the main stage, was you talked about this conference in the greater context of the world and what's going on right now. This is happening against the backdrop of the World Cup, and you said that this is really emblematic of data because this is a game, a tournament that generates tons of data. >> A tremendous amount of data. >> It's showing how data can launch new business models, disrupt old ones. Where do you think we're at right now? For someone who's been in this industry for a long time, just lay the scene. >> I think we're still very much at the beginning. Even though the conference has been around for awhile, the technology has been. It's emerging so fast and just evolving so fast that we're still at the beginning of all the transformations. I've been listening to the customer presentations here and all of them are at some point along the journey. Many are really still starting. Even in some of the polls that we had today talked about the fact that they're very much at the beginning of their journey with things like streaming or some of the A.I. machine learning technologies. They're at various stages, so I believe we're really at the beginning of the transformation that we'll see. >> That reminds me of another detail of your product portfolio or your architecture streaming and edge deployments are also in the future for many of your customers who still primarily do analytics on data at rest. You made an investment in a number of technologies NiFi from streaming. There's something called MiNiFi that has been discussed here at this show as an enabler for streaming all the way out to edge devices. What I'm getting at is that's indicative of Arun Murthy, one of your co-founders, has made- it was a very good discussion for us analysts and also here at the show. That is one of many investments you're making is to prepare for a future that will set workloads that will be more predominant in the coming decade. One of the new things I've heard this week that I'd not heard in terms of emphasis from you guys is more of an emphasis on data warehousing as an important use case for HDP in your portfolios, specifically with HIVE. The HIVE 3.0 now in- HDP3.0. >> Yes. >> With the enhancements to HIVE to support more real time and low latency, but also there's ACID capabilities there. I'm hearing something- what you guys are doing is consistent with one of your competitors, Cloudera. They're going deeper into data warehousing too because they recognize they've got to got there like you do to be able to absorb more of your customers' workloads. I think that's important that you guys are making that investment. You're not just big data, you're all data and all data applications. Potentially, if your customers want to go there and engage you. >> Yes. 
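The ACID capability mentioned here can be made concrete with a short sketch of a transactional Hive table, the kind of workload HIVE 3 in HDP 3.0 is aimed at. The connection details and table are hypothetical, and the exact DDL defaults should be verified against the Hive version in use; PyHive is used purely as a convenient client.

```python
# Hypothetical sketch of a transactional (ACID) Hive table, run through PyHive.
# Host, database, and table are placeholders; verify DDL details against the
# Hive/HDP version you actually run.
from pyhive import hive

conn = hive.connect(host="hiveserver2.example.com", port=10000, username="etl")
cur = conn.cursor()

# A managed ORC table with ACID semantics: rows can be updated and deleted,
# not just appended, while Spark or other engines read the same data lake.
cur.execute("""
    CREATE TABLE IF NOT EXISTS sales_curated (
        order_id BIGINT,
        status   STRING,
        amount   DOUBLE
    )
    STORED AS ORC
    TBLPROPERTIES ('transactional' = 'true')
""")

# Classic warehouse-style correction, now possible directly on the lake.
cur.execute("UPDATE sales_curated SET status = 'refunded' WHERE order_id = 1042")

cur.close()
conn.close()
```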
>> I think that was a significant, subtle emphasis that I, as an analyst, noticed. >> Thank you. There were so many enhancements in 3.0 that were brought from the community that it was hard to talk about everything in depth, but you're right. The enhancements to HIVE in terms of performance have really enabled it to take on a greater set of workloads and interactivity that we know that our customers want. The advantage being that you have a common data layer in the back end and you can run all this different work. It might be data warehousing, high speed query workloads, but you can do it on that same data with Spark and data-science related workloads. Again, it's that common pool backend of the data lake and having that ability to do it with common security and governance. It's one of the benefits our customers are telling us they really appreciate. >> One of the things we've also heard this morning was talking about data analytics in terms of brand value and brand protection importantly. FedEx, exactly. Talking about, the speaker said, we've all seen these apology commercials. What do you think- is it damage control? What is the customer motivation here? >> Well a company can have billions of dollars of market cap wiped out by breaches in security, and we've seen it. This is not theoretical, these are actual occurrences that we've seen. Really, they're trying to protect the brand and the business and continue to be viable. They can get knocked back so far that it can take years to recover from the impact. They're looking at the security aspects of it, the governance of their data, regulations like GDPR. These things you've mentioned have real financial impact on the businesses, and I think it's brand and the actual operations and finances of the businesses that can be impacted negatively. >> When you're thinking about Hortonworks's marketing messages going forward, how do you want to be described now, and then how do you want customers to think of you five or 10 years from now? >> I want them to think of us as a partner to help them with their data journey, on all aspects of their data journey, whether they're collecting data from the edge, you mentioned NiFi and things like that. Bringing that data back, processing it in motion, as well as processing it at rest, regardless of where that data lands. On premise, in the cloud, somewhere in between, the hybrid, multi-cloud strategy. We really want to be thought of as their partner in their data journey. That's really what we're doing. >> Even going forward, one of the things you were talking about earlier is the company's sort of saying, "we want to be boring. We want to help you do all the stuff-" >> There's a lot of money in boring. >> There's a lot of money, right! Exactly! As you said, a partner in their data journey. Is it "we'll do anything and everything"? Are you going to do niche stuff? >> That's a good question. Not everything. We are focused on the data layer. The movement of data, the processing and storage, and truly the analytic applications that can be built on top of the platform. Right now we've stuck to our strategy. It's been very consistent since the beginning of the company in terms of taking these open source technologies, making them enterprise viable, developing an ecosystem around it and fostering a community around it. That's been our strategy since before the company even started. We want to continue to do that and we will continue to do that.
There's so much innovation happening in the community that we quickly bring that into the products and make sure that's available in a trusted, enterprise-tested platform. That's really one of the things we see our customers- over and over again they select us because we bring innovation to them quickly, in a safe and consumable way. >> Before we came on camera, I was telling Rebecca that Hortonworks has done a sensational job of continuing to align your product roadmaps with those of your leading partners. IBM, AWS, Microsoft. In many ways, your primary partners are not them, but the entire open source community. 26 open source projects that Hortonworks represents and has incorporated in your product portfolio, in which you are a primary player and committer. You're a primary ingester of innovation from all the communities in which you operate. >> We do. >> That is your core business model. >> That's right. We both foster the innovation and we help drive the innovation ourselves with our engineers and architects. You're absolutely right, Jim. It's the ability to get that innovation, which is happening so fast in the community, into the product, and companies need to innovate. Things are happening so fast. Moore's Law was mentioned multiple times on the main stage, you know, and how it's impacting different parts of the organization. It's not just the technology, but business models are evolving quickly. We heard a little bit about Trimble, and if you've seen Tim Leonard's talk that he gave around what they're doing in terms of logistics and the ability to go all the way out to the farmer and impact what's happening at the farm and tracking things down to the level of a tomato or an egg all the way back and just understand that. It's evolving business models. It's not just the tech but the evolution of business models. Rob talked about it yesterday. I think those are some of the things that are kind of key. >> Let me stay on that point really quick. Industrial internet like precision agriculture and everything it relates to, is increasingly relying on visual analysis, parts and eggs and whatever it might be. That is convolutional neural networks, that is A.I., it has to be trained, and it has to be trained increasingly in the cloud where the data lives. The data lives in HDP clusters and whatnot. In many ways, no matter where the world goes in terms of industrial IoT, there will be massive clusters of HDFS and object storage driving it and also embedded A.I. models that have to follow a specific DevOps life cycle. You guys have a strong orientation in your portfolio towards that degree of real-time streaming, as it were, of tasks that go through the entire life cycle. From preparing the data, to modeling, to training, to deploying it out, to Google or IBM or wherever else they want to go. So I'm thinking that you guys are in a good position for that as well. >> Yeah. >> I just wanted to ask you finally, what is the takeaway? We're talking about the attendees, talking about the community that you're cultivating here, theme, ideas, innovation, insight. What do you hope an attendee leaves with? >> I hope that the attendee leaves educated, understanding the technology and the impacts that it can have so that they will go back and change their business and continue to drive their data projects. The whole intent is really, and we even changed the format of the conference for more educational opportunities.
For me, I want attendees to- a satisfied attendee would be one that learned about the things they came to learn so that they could go back to achieve the goals that they have when they get back. Whether it's business transformation, technology transformation, some combination of the two. To me, that's what I hope that everyone is taking away and that they want to come back next year when we're in Washington, D.C. and- >> My stomping ground. >> His hometown. >> Easy trip for you. They'll probably send you out here- (laughs) >> Yeah, that's right. >> Well John, it's always fun talking to you. Thank you so much. >> Thank you very much. >> We will have more from theCUBE's live coverage of DataWorks right after this. I'm Rebecca Knight for James Kobielus. (upbeat electro music)
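A rough sketch can make concrete the common data layer John Kreisa describes above: warehouse-style SQL queries and Spark data-science jobs running against the same data-lake tables, under one security and governance layer. This is only an illustration, assuming a PySpark session with Hive support enabled and an existing metastore; the database, table, and column names are hypothetical, not anything specific to HDP.

    from pyspark.sql import SparkSession

    # One session with Hive support: SQL (warehouse-style) and DataFrame
    # (data-science-style) workloads share the same catalog and tables,
    # and whatever security/governance layer sits underneath them.
    spark = (SparkSession.builder
             .appName("shared-data-layer-sketch")
             .enableHiveSupport()      # reuse the shared Hive metastore
             .getOrCreate())

    # Warehouse-style query over a hypothetical shared table.
    daily_revenue = spark.sql("""
        SELECT order_date, SUM(amount) AS revenue
        FROM sales.orders
        GROUP BY order_date
    """)

    # The same table, consumed as a DataFrame for a data-science workload.
    orders = spark.table("sales.orders")
    per_customer = orders.groupBy("customer_id").count()

    daily_revenue.show()
    per_customer.show()

The point of the sketch is only that both workloads read the same governed tables; nothing is copied out into a separate warehouse.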

Published Date : Jun 20 2018

Jennifer Tejada, PagerDuty | PagerDuty Summit 2017


 

>> Hey, welcome back, everybody. Jeff Frick here with theCUBE. We're at PagerDuty Summit. It's our first time at PagerDuty Summit and Pier 27, our first time to this cool venue. It's right on the water between the Bay Bridge and Pier 39, beautiful view outside. Unfortunately, the fire smoke's a little over-the-top. But we're excited to have one of our favorite guests, Jennifer Tejada. She's the CEO at PagerDuty. Jennifer, great to see you. >> Thank you. It's so great to be back, Jeff. >> Absolutely. So this is, what, your second PagerDuty Summit? >> This is our second PagerDuty Summit. >> 500-some-odd people? >> I think we've had 700 through the door already. We've got a few hundred streaming online. Almost twice what we did last year. So we're really excited. We're still in the infancy stages of sponsoring an industry event, and we've been really focused on trying to make it a little different to ensure that people walk away with actionable insights, and best practices and learnings they can take immediately back to their teams, and to their companies. So we've had just some awesome guest speakers and panelists here today, and it's been a lot of fun. The PagerDuty band played live at lunch. >> That's right, I saw them at lunchtime. >> Yeah, which was great. So we're having a good time. >> What are they called? The On-Calls. >> The On-Calls. I let them name themselves. >> And so, you've been here a year now. So, how are things moving, how are you moving the company along since you got here? What are some of the strategic things that you've been able to execute, and now you're looking forward? >> So, it's just been an incredible year, honestly. You always hope for a number of things when you come into a new role. You hope that the team rallies around the business. You hope that the opportunity is as significant as you thought it would be. You hope that there aren't more bad surprises than you think there are going to be. PagerDuty's been so unique, in that there have been more good surprises than bad surprises. There's so much potential to unlock in the business. But probably the thing that's most amazing about it is the people, the community, and the culture around PagerDuty, and just the sense of alliance towards making the engineering world work better to ensure that customer experience and employee experience is better. There's just a real sense of duty there, and there's a sense that the community is there with you trying to make it happen, as opposed to working against you. So a lot of our innovation this year, and I mean, we've released tons of new technology products, including machine learning and analytics, and going from reactive and responsive to proactive. There's a lot of stuff happening. So much of that has come from input from our practitioner community and our customer base. You just don't always have that kind of vocal engagement, that proactive, constructive engagement from your customer base, so that's just been amazing. And the team's awesome. We've expanded into the UK and western Europe over this summer. We opened an office in Sydney recently. We've shifted from being a single-product company to a platform company. We've more than doubled in size, 150 people to over 350 people. We're in 130 countries now, in terms of where our customer base lives, and just around 10 thousand customers, so really, really amazing progress. Sometimes I feel like we're a little bit of a teenage prodigy, you know?
We're growing super fast, other kids are starting to learn how to play the piano. It's a little awkward, but we're still really good at what we do. I think the thing that keeps us out in front is our commitment, and all of our efforts being in service to making both the lives better of the practitioners in our community, and creating quantifiable value for our enterprise customers. >> It's interesting to focus on the duty, because that kind of came with the old days of when you were the person that had to wear the pager, right? Whether you're a doctor on call, or you were the IT person. So it's an interesting metaphor, even though probably most of the kids here have never seen a pager. >> No, I remember as a kid, my dad was in healthcare, and he had a pager, and you knew that when the pager went off, it was time. You were on-duty, you were out. And there's an honor in duty, and it is a service to the organization. Adrian Cockcroft was here this morning, VP of architecture from AWS, and known for cloud architecture that he built out at Netflix. And he said something really interesting, which is, he believes all people should be on-call, because you need the pain to go where it's most useful. And if everybody's on-call, it also creates this kind of self-fulfilling cycle. If you know you're going to be on-call, you build better code. If you know you're going to be on-call on the weekend, you don't ship something stupid on Friday night. If you know you're going to be on-call and you're a non-technical person, you align yourselves with people who are technical that can help you when that happens. So there's something sort of magical that happens when you do have that culture of being available on the spot when things don't go as planned. >> And now you've got a whole new rash of technology that you can apply to this, in the area of artificial intelligence and machine learning. Wonder if you could share a little bit, where is that now taking you for the next step? >> I think the biggest opportunity with machine learning for us is that, over the last eight years, we've been collecting a tremendous amount of data. And AI and machine learning are only as good as the data they sit on top of. So we have three really interesting data sets. We have the events and the signals that come from all of the machine instrumentation, the applications, the monitoring environment, the ticketing platforms that we integrate directly to. We have information around the workflow, what works best for most of our customers, what doesn't work. What's the best agile-centric DevOps related workflow that enables ultimate response and ultimate availability and resilience for customers. And then finally, what's going on with the people? Who are the people that work the hardest for you? Who are the people that have the subject matter expertise to be the most useful when things aren't working the way they should? You bring all of that together, and you build a model that starts to learn, which immediately means you can automate a lot of manual process. You can improve the quality of decisions, because you're making those decisions in context. An example would be, if an incident pops up, we see it in the form of a signal or a set of events. And our machine learning will recognize that we've actually seen those events before. And the last time this happened, here's what the outcome was, here's what went well and not so well, here's how you fixed it, and here's the person who was on top of it, here's the expert you need to call. 
So I've immediately shortened the distance between signal and action. The people, now, that are going to come in to that process to respond to either a problem or an opportunity are already much more prepared to be successful quickly, efficiently, and effectively. >> So you've shortened it and you've increased the probability of success dramatically. >> Exactly. And maybe you don't even need a person. That person can go off and do other more important proactive work. >> But you're all about people. And we first met when you were at Keynote and we brought you out for a Women in Tech interview. So you had a thing on Tuesday night that I want you to share. What did you do Tuesday night? >> I was just super moved and inspired and excited. I've had the opportunity to attend lots of diversity events, lots of inclusion events, a lot of support groups, I'm asked to speak a lot on behalf of women and under-represented minorities, and I appreciate that, and I see that as my own civic duty to help lead the way and set an example, and reach back for other people and help develop younger women and minorities coming up. But I've found that a lot of these events, it's a bunch of women sitting in a room talking about all the challenges that we're facing. And I don't need to spend more time identifying the problem. I understand the problem. What I really wanted to do was bring together a group of experts who have seen success, who have a demonstrable track record for overcoming some of these barriers and challenges, and have taken that success and applied it into their own organizations, and sort of beating the averages in terms of building inclusive, diverse teams and companies. So Tuesday was all about one, creating a fun environment, we had cocktails, we had entertainment, it was in a great venue at Dirty Habit, where we could have a proactive, constructive, action-oriented conversation about things that are working. Things that you can hear from a female leader who's a public company executive, and take that directly back to your teams. Expert career advice, how some of these women have achieved what they have. And we just had a phenomenal lineup. Yvonne Wassenaar, who's the CEO of Airware, an Andreessen Horowitz company, theCUBE alumni, previously CIO at New Relic. We had Merline Saintil, who's the head of operations for all of product and technology for Intuit. Sheila Jordan, the CIO of Symantec. We had Alvina Antar, who's the CIO at Zuora. And, I'm missing one ... Oh, Rathi Murthy, the CTO at the Gap. And so, just quite an incredible lineup of executives in their own right. The fact that they happen to be a diverse group of women was just all the more interesting. And then we surprised the organization. After about 45 minutes of this discussion, sharing key learning, sharing best practices, we brought in the San Francisco Gay Men's Chorus, who are just embarking, in the next 10 days, on a trip called the Lavender Pen Tour, where they're looking to spread love, hope, and social justice, and proof that diversity delivers results, in the southern states, where equality equals gender equality, and I think challenges for equal opportunity for the LGBTQ community are really significant. And Mikkel Svane, who's the CEO of Zendesk, introduced me to Chris, the director there, about a week before, and I was so inspired by what they're doing.
This is a group of 450 volunteers, who have day jobs, who perform stunning shows, beautiful music together, that are going to go on four buses for 11 days around the Deep South, and I think, make a big difference. And they're taking the Oakland Interfaith Gospel Choir with them. So just really cool. So they came, and I mean, when's the last time you went to a diversity event and people were singing, and dancing, and toasting? It was just really different, and everybody walked away learning something new, including the number of male executives, champions that I asked to come as my special guest, to support people in building sponsorship, to support these women and these under-represented minorities in finding connections that can help them build their own careers, they learned a lot at the event. It was incredible. I'm really proud of it, and it's the start of something special. >> I love it. I mean, you bring such good energy, both at your day job, and also in this very, very important role that you play, and it's great that you've embraced that, and not only take it seriously, but also have some fun. >> What's the point if you're not going to have fun? You apply the growth mindset to one of the biggest problems in the industry, and you hack it the same way you would a deeply technical problem, or a huge business problem. And when we get constructive and focused like that, amazing things happen. And so I now have people begging to be on the next panel, and we're trying to find the next venue, and got to come up with a name for it, but this is a thing. >> And oh, by the way, there's better business outcomes as well. >> I mean, I did a ton of business that night. Half that panel were customers that are continuing to invest and partner with PagerDuty, and we're excited about the future. And some of those women happen to be machine learning experts, for instance. So, great opportunity for me to partner and get advice on some of the new innovation that we've undertaken. >> Well, Jennifer, thanks for inviting us to be here. We love to keep up with you and everything that you're doing, both before and in your current journey. And congrats on a great event. >> My pleasure. Absolutely. Thanks for having me. >> She's Jennifer Tejada, I'm Jeff Frick. You're watching theCUBE from PagerDuty Summit. Thanks for watching. (upbeat music)
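The machine-learning loop Jennifer describes above (recognize that an incoming set of events looks like something seen before, then surface how it was resolved and who resolved it) can be sketched very roughly in a few lines of Python. This is only an illustration of the idea, assuming a toy nearest-neighbour match over hand-built event signatures; the data, field names, and similarity measure are hypothetical, and this is not PagerDuty's actual implementation.

    # Toy history of past incidents: an event signature, how it was resolved,
    # and the responder who had the subject matter expertise.
    PAST_INCIDENTS = [
        {"signature": {"db", "timeout", "payments"},
         "resolution": "Fail over to the read replica and recycle the connection pool",
         "expert": "dana"},
        {"signature": {"disk", "full", "logs"},
         "resolution": "Rotate logs and expand the volume",
         "expert": "arjun"},
    ]

    def jaccard(a, b):
        """Overlap between two event signatures (0 = nothing shared, 1 = identical)."""
        return len(a & b) / len(a | b) if (a | b) else 0.0

    def suggest(new_signature):
        """Return the most similar past incident so the responder starts with context."""
        return max(PAST_INCIDENTS, key=lambda inc: jaccard(new_signature, inc["signature"]))

    incoming = {"db", "timeout", "checkout"}
    match = suggest(incoming)
    print("Looks like a repeat:", match["resolution"], "- loop in", match["expert"])

In practice the signatures would be learned from the event, workflow, and people data sets she lists, but the shape of the decision is the same: match the signal to history, then route to the right expert.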

Published Date : Sep 8 2017


Kickoff - Spark Summit East 2017 - #sparksummit - #theCUBE


 

>> Narrator: Live from Boston, Massachusetts, this is theCUBE covering Spark Summit East 2017. Brought to you by Databricks. Now, here are your hosts, Dave Vellante and George Gilbert. >> Everybody, the euphoria is still palpable here, we're in downtown Boston at the Hynes Convention Center. For Spark Summit East, #SparkSummit, my co-host and I, George Gilbert, will be unpacking what's going on for the next two days. George, it's good to be working with you again. >> Likewise. >> I always like working with my man, George Gilbert. We go deep, George goes deeper. Fantastic action going on here in Boston, actually quite a good crowd here, it was packed this morning in the keynotes. The rave is streaming. Everybody's talking about streaming. Let's sort of go back a little bit though George. When Spark first came onto the scene, you saw these projects coming out of Berkeley, it was the hope of bringing real-timeness to big data, dealing with some of the memory constraints that we found going from batch to real-time interactive and now streaming, you're going to talk about that a lot. Then you had IBM come in and put a lot of dough behind Spark, basically giving it a stamp, IBM's imprimatur-- >> George: Yeah. >> Much in the same way it did with Linux-- >> George: Yeah. >> Kind of elbowing its way in-- >> George: Yeah. >> The marketplace and sort of gaining a foothold. Many people at the time thought that Hadoop needed Spark more than Spark needed Hadoop. A lot of people thought that Spark was going to replace Hadoop. Where are we today? What's the state of big data? >> Okay so to set some context, when Hadoop V1, classic Hadoop came out it was file system, commodity file system, keep everything really cheap, don't have to worry about shared storage, which is very expensive, and the processing model, the execution of munging through data was MapReduce. We're all familiar with those-- >> Dave: Complicated but dirt cheap. >> Yes. >> Dave: Relative to a traditional data warehouse. >> Yes. >> Don't buy a big Oracle Unix box or Linux box, buy this new file system and figure out how to make it work and you'll save a ton of money. >> Yeah, but unlike the traditional RDBMSs, it wasn't really that great for doing interactive business intelligence and things like that. It was really good for big batch jobs that would run overnight or periods of hours, things like that. The irony is when Matei Zaharia, the co-creator of Spark or actually the creator and co-founder of Databricks, which is the steward of Spark. When he created the language and the execution environment, his objective was to do a better MapReduce than Hadoop's MapReduce, make it faster, take advantage of memory, but he did such a good job of it, that he was able to extend it to be a uniform engine not just for MapReduce type batch stuff, but for streaming stuff. >> Dave: So originally they started out thinking that if I get this right-- >> Yeah. >> It was sort of a microbatch leveraging memory more effectively and then it extended beyond-- >> The microbatch is their current way to address the streaming stuff. >> Dave: Okay. >> It takes MapReduce, which would be big long running jobs, and they can slice them up and so each little slice turns into an element in the stream. >> Dave: Okay, so the point is it was an improvement upon these big long batch jobs-- >> George: Yeah. >> They're making it batch to interactive in real-time, so let's go back to big data for a moment here. >> George: Yeah.
>> Big data was the hottest topic in the world three or four years ago and now it's sort of waned as a buzz word, but big data is now becoming more mainstream. We've talked about that a lot. A lot of people think it's done. Is big data done? >> George: No, it's more that it's sort of-- it's boring for us, kind of pundits, to talk about because it's becoming part of the fabric. The use cases are what's interesting. It started out as a way to collect all data into this really cheap storage repository and then once you did that, this was the data you couldn't afford to put into your Teradata data warehouse at $25,000 per terabyte or with running costs a multiple of that. Here you put all your data in here, your data scientists and data engineers started munging with the data, you started taking workloads off your data warehouse, like ETL things that didn't belong there. Now people are beginning to experiment with business intelligence sort of exploration and reporting on Hadoop, so taking more workloads off the data warehouse. The limitations, there are limitations there that will get solved by putting MPP SQL back-ends on it, but the next step after that. So we're working on that step, but the one that comes after that is make it easier for data scientists to use this data, to create predictive models-- [Dave] Okay, so I often joke that the ROI on big data was reduction on investment and lowering the denominator-- >> George: Yeah. >> In the expense equation, which I think it's fair to say that big data and Hadoop succeeded in achieving that, but then the question becomes, what's the real business impact. Clearly big data has not, except in some edge cases and there are a number of edge cases and examples, but it's not yet anyway lived up to the promise of real-time, affecting outcomes before, you know taking the human out of the decision, bringing transaction and analytics together. Now we're hearing a lot of that talk around AI and machine learning, of course, IoT is the next big thing, that's where streaming fits in. Is it same wine, new bottle? Or is it sort of the evolution of the data meme? >> George: It's an evolution, but it's not just a technology evolution to make it work. When we've been talking about big data as efficiency, like low cost, cost reduction for the existing type of infrastructure, but when it starts going into machine learning you're doing applications that are more strategic and more top line focused. That means your c-level execs actually have to get involved because they have to talk about the strategic objectives, like growth versus profitability or which markets you want to target first. >> So has Spark been a headwind or tailwind to Hadoop? >> I think it's very much been a tailwind because it simplified a lot of things that took many, many engines in Hadoop. That's something that Matei, creator of Spark, has been talking about for a while. >> Dave: Okay something I learned today and actually I had heard this before, but the way I phrased it in my tweet, genomics is kicking Moore's Law's ass. >> George: Yeah. >> That the price performance of sequencing a gene improves three x every year, compared to what is essentially a doubling every 18 months for Moore's Law. The amount of data that's being created is just enormous, I think we heard from Broad Institute that they create 17 terabytes a day-- >> George: Yeah. >> As compared to YouTube, which is 24 terabytes a day. >> And in a few years it will be-- >> It will be dwarfing YouTube >> Yeah.
>> Of course Twitter you couldn't even see-- >> Yeah. >> So what do you make of that? Is that just a fun fact, is that a new use case, is that really where this whole market is headed? >> It's not a fun fact because we've been hearing for years and years about this study about data doubling every 18 to 24 months, that's coming from the legacy storage guys who can only double their capacity every 18 to 24 months. The reality is that when we take what was analog data and we make it digitally accessible, the only thing that's preventing us from capturing all this data is the cost to acquire and manage it. The available data is growing much, much faster than 40% every 18 months. >> Dave: So what you're saying is that-- I mean this industry has marched to the cadence of Moore's Law for decades and what you're saying is that linear curve is actually reshaping and it's becoming exponential. >> George: For data-- >> Yes. >> George: So the pressure is on for compute, which is now the bottleneck, to get cleverer and cleverer about how to process it-- >> So that says innovation has to come from elsewhere, not just Moore's Law. It's got to come from a combination of-- Thomas Friedman talks a lot about Moore's Law being one of the fundamentals, but there are others. >> George: Right. >> So from a data perspective, what are those combinatorial effects that are going to drive innovation forward? >> George: There was a big meetup for Spark last night and the focus was this new database called SnappyData that spun out of Pivotal and it's being mentored by Paul Maritz, ex-head of Development at Microsoft in the 90s and former head of VMware. The interesting thing about this database, and we'll start seeing it in others, is you don't necessarily want to be able to query and analyze petabytes at once, it will take too long, sort of like munging through data of that size on Hadoop took too long. You can do things that approximate the answer and get it much faster. We're going to see more tricks like that. >> Dave: It's interesting you mention Maritz, I heard a lot of messaging this morning that talked about essentially real-time analysis and being able to make decisions on data that you've never seen before and actually affect outcomes. This narrative I first heard from Maritz many, many years ago when they launched Pivotal. He launched Pivotal to be this platform for building big data apps and now you're seeing Databricks and others sort of usurp that messaging and actually seeming to be at the center of that trend. What's going on there? >> I think there's two, what would you call it, two centers of gravity and our CTO David Floyer talks about this. The edge is becoming more intelligent because there's a huge bandwidth and latency gap between these smart devices at the edge, whether the smart device is like a car or a drone or just a bunch of sensors on a turbine. Those things need to analyze and respond in near real-time or hard real-time, like how to tune themselves, things like that, but they also have to send a lot of data back to the cloud to learn about how these things evolve. In other words it would be like sending the data to the cloud to figure out how the weather patterns are changing. >> Dave: Um-hmm. >> That's the analogy. You need them both. >> Dave: Okay. >> So Spark right now is really good in the cloud, but they're doing work so that they can take a lighter weight version and put it at the edge. We've also seen Amazon put some stuff at the edge and Azure as well. >> Dave: I want you to comment.
We're going to talk about this later, we have a-- George and I are going to do a two-part series at this event. We're going to talk about the state of the market and then we're going to release a glimpse into our big data numbers, our Spark forecast, our streaming forecast-- I say I mention streaming because that is-- we talk about batch, we talk about interactive/real-time, you know you're at a terminal-- anybody who's as old as I am remembers that. But now you're talking about streaming. Streaming is a new workload type, you call these things continuous apps, like streams of events coming into a call center, for example, >> George: Yeah. >> As one example that you used. Add some color to that. Talk about that new workload type and the role of streaming, and really potentially how it fits into IoT. >> Okay, so for the last 60 years, since the birth of digital computing, we've had either one of two workloads, they were either batch, which is jobs that ran offline, you put your punch cards in and sometime later the answer comes out. Or we've had interactive, which originally was green screens and now we have PCs and mobile devices. The third one coming up now is continuous or streaming data that you act on in near real-time. It's not that those apps will replace the previous ones, it's that you'll have apps that have continuous processing, batch processing, interactive as a mix. An example would be today all the information about how your applications and data center infrastructure are operating, that's a lot of streams of data that Splunk first took aim at and did very well with-- so that you're looking in real-time and able to figure out if something goes wrong. That type of stuff, all the telemetry from your data center, that is a training wheel for the Internet of Things, where you've got lots of stuff out at the edge. >> Dave: It's interesting you mention Splunk, Splunk doesn't actually use the big data term in its marketing, but they actually are big data and they are streaming. They're actually not talking about it, they're just doing it, but anyway-- Alright George, great thanks for that overview. We're going to break now, bring back our first guest, Arun Murthy, coming in from Hortonworks, co-founder at Hortonworks, so keep it right there everybody. This is theCUBE, we're live from Spark Summit East, #SparkSummit, we'll be right back. (upbeat music)
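George's description of Spark's streaming model, long-running jobs sliced into small pieces with each slice becoming an element in the stream, is the micro-batch execution model. Here is a minimal sketch of what that looks like, assuming a current PySpark install and Structured Streaming's built-in rate source; the trigger interval and aggregation are illustrative only, not anything discussed in the interview.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import window

    spark = SparkSession.builder.appName("microbatch-sketch").getOrCreate()

    # A built-in source that emits rows continuously; it stands in for real events.
    events = spark.readStream.format("rate").option("rowsPerSecond", 100).load()

    # A continuous aggregation over the stream, recomputed micro-batch by micro-batch.
    counts = events.groupBy(window(events.timestamp, "10 seconds")).count()

    # Each 5-second trigger slices the stream into a small batch job --
    # the "each little slice turns into an element in the stream" idea.
    query = (counts.writeStream
             .outputMode("complete")
             .format("console")
             .trigger(processingTime="5 seconds")
             .start())

    query.awaitTermination()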
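Dave's genomics-versus-Moore's-Law aside is also easy to make concrete: tripling every year compounds much faster than doubling every eighteen months. A few lines of arithmetic, using only the illustrative rates quoted in the conversation, show the gap after five years.

    years = 5

    # Sequencing price/performance: roughly 3x per year (the rate quoted above).
    genomics_gain = 3 ** years            # 243x after five years

    # Moore's Law: roughly 2x every 18 months, i.e. 2^(years / 1.5).
    moore_gain = 2 ** (years / 1.5)       # about 10x after five years

    print(f"Genomics: {genomics_gain}x, Moore's Law: {moore_gain:.1f}x")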

Published Date : Feb 8 2017
