

Haseeb Budhani & Anant Verma | AWS re:Invent 2022 - Global Startup Program



>> Well, welcome back here to the Venetian. We're in Las Vegas. It is Wednesday, Day 2 of our coverage here of AWS re:Invent '22. I'm your host, John Walls, on theCUBE, and it's a pleasure to welcome in two more guests as part of our AWS startup showcase, which is again part of the startup program globally at AWS. I've got Anant Verma, who is the Vice President of Engineering at Alation. Anant, good to see you, sir. >> Good to see you too. >> Good to be with us. And Haseeb Budhani, who is the CEO and co-founder of Rafay Systems. Good to see you, sir. >> Good to see you again. >> Thanks for joining us, yeah. A cuber, right? You've been on theCUBE? >> Once or twice. >> Many occasions. But a first timer here, as a matter of fact, glad to have you aboard. All right, tell us about Alation. First, for those who might not be familiar with what you're up to these days, just give it a little 30,000 foot level. >> Sure, sure. So, yeah, Alation is a startup and a leader in the enterprise data intelligence space. That really includes a lot of different things, including data search, data discovery, metadata management, data cataloging, data governance, data policy management, a lot of different things that companies want to do with the hoards of data that they have, and Alation, our product, is the answer to solve some of those problems. We've been doing pretty good. Alation has been running for about 10 years now. We are a Series E startup now, we just raised a round a couple of months ago. We are already a hundred million plus in revenue. So. >> John: Not shabby. >> Yeah, it's a big benchmark for companies, startup companies, to cross that milestone. So, yeah. >> And what's the relationship? I know Rafay and you have worked together, in fact, the two of you have, which I find interesting, you've been meeting on Zoom for a number of months, as many of us have, and are meeting here for the first time. But talk about that relationship with Rafay. >> Yeah, so I actually joined Alation in January, and this is part of the move of Alation to a more cloud native solution. So, we have been running on AWS since last year, and as part of making our solution more cloud native, we have been looking to containerize our services and run them on Kubernetes. So, that's the reason why I joined Alation in the first place, to kind of make sure that this migration or move to cloud native actually works out really well for us. This is a big move for companies. A lot of companies that have done it in the past, including, you know, Confluent or MongoDB, when they did that, they actually reaped great benefits out of that. So to do that, of course, you know, as we were looking at Kubernetes as a solution, I was personally more looking for a way to speed up things and get things out in production as fast as possible. And that's where I think, Janeb introduced us... >> That's right. >> The two of us. I think we share the same investor actually, so that's how we found each other. And yeah, it was a pretty simple decision in terms of, you know, getting the solution, figuring out if it's useful for us and then of course, putting it out there. >> So you've hit the keyword, Kubernetes, right? And so, Haseeb, if you would jump in here, there are challenges, right? That you're trying to help them solve, and you're working on the Kubernetes platform. So, you know, just talk about that and how that's influenced the work that the two of you are doing together. >> Absolutely. 
So, the business we're in is to help companies who adopt Kubernetes as an orchestration platform do it easier, faster. It's a simple story, right? Everybody is using Kubernetes, but it turns out that Kubernetes is actually not that easy to operationalize. Playing in a sandbox is one thing. Operationalizing this at a certain level of scale is not easy. Now, we have a lot of enterprise customers who are deploying their own applications on Kubernetes, and we've had many, many of them. But when it comes to a company like Alation, it's a more complicated problem set, because they're taking a very complex application, their application, but then they're providing that as a service to their customers. So then we have a chain of customers we have to make happy: Anant's team, the platform organization, his internal customers who are the developers who are deploying applications, and then the company's customers, we have to make sure that they get a good experience as they consume this application that happens to be running on Kubernetes. So that presented a really interesting challenge, right? How do we make this partnership successful? So I will say that we've learned a lot from each other, right? And, end of the day, the goal is, my customer, Anant specifically, right? He has to feel that this investment, 'cause he has to pay us money, we would like to get paid. >> John: Sure. (John laughs) >> It reduces his internal expenditure because otherwise he'd have to do it himself. And most importantly, it's not the money part, it's that he can get to a certain goalpost significantly faster, because the invention time for Kubernetes management, the platform that you have to build to run Kubernetes, is a very complex exercise. It took us four and a half years to get here. You want to do that again, as a company, right? Why? Why do you want to do that? We, as Rafay, the way I think about what we deliver, yes, we sell a product, but to what end? The product is the what. The why is that every enterprise, every ISV is building a Kubernetes platform in house. They shouldn't, they shouldn't need to. They should be able to consume that as a service. They consume the Kubernetes engine, EKS being Amazon's Kubernetes, they consume that as an engine. But the management layer was a gap in the market. How do I operationalize Kubernetes? And what we are doing is we're going to, you know, the Anants of the world and saying, "Hey, your team is technical, you understand the problem set. Would you like to build it, or would you rather consume this as a service so you can go faster?" And resoundingly the answer is, "I don't want to do this anymore. I would rather buy." >> Well, you know, as Haseeb is saying, speed is again, when we started talking, it only took us like a couple of months to figure out if Rafay is the right solution for us. And so we ended up purchasing Rafay in April. We launched our product based on Rafay and Kubernetes, on EKS, in August. >> August. >> So that's about four months. I've done some things like this before. It takes a couple of years just to sort of figure out how do you really work with Kubernetes, right? In production at a large scale. Right now, we are running about a 600 node cluster on Rafay and that's serving our customers. Like, one of the biggest things that's actually happening on December 8th is we are running what we call a virtual hands on lab. >> A virtual? >> Hands on lab. >> Okay. >> For Alation. 
And there are probably going to be about 500 people attending it. It's like a webinar style. But what we do in that hands on lab is we will spin up an Alation instance for each attendee, right on the spot. Okay? Now, think about this enterprise software running, and people just sign up for it and it's there for you, right on the spot. And that's the beauty of the software that we have been building, and the beauty of the work that Rafay has helped us to do over the last few months. >> Okay. >> I think we need to charge them more money, is what I'm getting from this conversation. I'm going to go work on that. >> I'm going to let the two of you work that out later. All right. I don't want to get in the way of a big deal. But you mentioned that, we heard about it earlier, that it's you that would offer these services to your clients. I assume they have their different levels of tolerance and their different challenges, right? They've got their own complexities and their own organizational barriers. So how are you juggling that end of it? Because you're kind of learning as, well, not learning, but you're experiencing some of the things. >> Right. Same things. And yet you've got this other client base that has a multitude of experiences that they're going through. >> Right. So I think, you know, a lot of our customers, they are large enterprise companies. They've got a whole bunch of data that they want to work with us on. So one of the things that we have learned over the past few years is that we used to actually ship our software to the customers, and then they would manage it for their privacy and security reasons. But now, since we're running in the cloud, they're really happy about that, because they don't need to juggle with the infrastructure and the software management and upgrades and things like that; we do it for them, right? And that's the speed for them, because now they are only interested in solving the problems with the data that they're working with. They don't need to deal with all these software management issues, right? So that frees our customers up to do the thing that they want to do. Of course, it makes our job harder, and I'm sure in turn it makes his job harder. >> We get the short end of the stick, for sure. >> That's why he is going to get more money. >> Exactly. >> Yeah, this is a great conversation. >> No, no, no. We'll talk about that. >> So, let's talk about the cloud then. How, in terms of being the platform where all this is happening, and AWS, about your relationship with them as part of the startup program and what kind of value that brings to you, what does that do for you when you go out and are looking for work, and what kind of cachet that brings to you? >> Talk about the AWS? >> Yes, sir. >> Okay. Well, so, the thing is really, like, of course AWS has a lot of programs in terms of making sure that as we move our customers into AWS, they can give us some, I wouldn't call it a discount, but there are some credits that you can get as you move your workloads onto AWS. So that's a really great program. Our customers love it. They want us to do more things with AWS. It's a pretty seamless way for us to, as we were talking about or thinking about moving into the cloud, AWS was our number one choice, and that's the only cloud that we are in today. We're not going to go to any other place. >> That's it. >> Yeah. >> How would you characterize? I mean, we've already heard from one side of the fence here, but. >> Absolutely. So for us, AWS is a make or break partner, frankly. 
As the EKS team knows very well, we support Azure's Kubernetes and Google's Kubernetes and the community Kubernetes as well. But the number of customers on our platform who are AWS native, either a hundred percent or a large percentage, you know, that's the majority of our customer base. >> John: Yeah. >> And AWS has made it very easy for us in a variety of ways to make us successful and our customers successful. So Anant mentioned the credit program they have, which is very useful 'cause we can, you know, readily kind of bring a customer to try things out and they can do that at no cost, right? So they can spin up infrastructure, play with things, and AWS will cover the cost, as one example. So that's a really good thing. Beyond that, there are multiple programs at AWS, ISV Accelerate, et cetera. You know, over time you kind of keep getting to higher and higher tiers, and you keep taking on bigger and bigger things. And as you make progress, what I'm finding is that there's a great ecosystem of support that they provide us. They introduce us to customers, they help us, you know, think through architecture issues. We get access to their roadmap. We work very, very closely with the EKS team, for example. Like, the GM for Kubernetes at AWS is a gentleman named Barry Cooks, who was my sponsor, right? So, we spend a lot of time together. In fact, right after this, I'm going to be spending time with him because, look, they take us seriously as a partner. They spend time with us because, end of the day, they understand that if they make their partners, in this case Rafay, successful, it at the end of the day helps the customer, right? Anant's customer, my customer, their AWS customers, also. So they benefit because we are collectively helping them solve a problem faster. The goal of the cloud is to help people modernize, right? Reduce operational costs, because data centers are expensive, right? But then, with these complex solutions, this is an enterprise product; Kubernetes at the enterprise level is a complex problem. If we don't collectively work together to save the customer effort, essentially, right? Reduce their TCO for whatever it is they're doing, right? Then the cost of the cloud is too high. And AWS clearly understands and appreciates that, and that's why they are going out of their way, frankly, to make us successful and make other companies successful in the startup program. >> Well. >> I would just add a couple of things there. Yeah, so, you know, cloud is not new. It's been there for a while. You know, people used to build things on their own. And so what AWS has really done is they have advanced technology enough where everything is really as simple as just turning on a switch and using it, right? So, just a recent example, and, by the way, I love managed services, right? The reason is really because I don't need to put my own people to build and manage those things, right? So, if you want to use search, they've got OpenSearch; if you want to use caching, they've got ElastiCache, and stuff like that. So it's really simple and easy to just pick and choose which services you want to use, and they're ready to be consumed right away. And that's the beauty, and that's how we can move really fast and get things done. >> Ease of use, right? Efficiency, saving money. It's a winning combination. Thanks for sharing this story, appreciate it. Anant, Haseeb, thanks for being with us. >> Yeah, thank you so much for having us. >> We appreciate it. >> Thank you so much. 
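Haseeb's point earlier in the conversation about consuming the Kubernetes engine as a service can be made concrete. Below is a minimal sketch, using the real boto3 EKS API, of what that looks like: AWS runs the control plane, so creating a cluster is a single call. The cluster name, role ARN, and subnet IDs are placeholder assumptions, and the management layer Rafay sells would sit above this call, not inside it.

```python
# A minimal sketch of consuming EKS "as an engine": AWS operates the control
# plane, so there are no masters to install or patch. All identifiers below
# are placeholders.
import boto3

eks = boto3.client("eks", region_name="us-west-2")

eks.create_cluster(
    name="catalog-prod",  # hypothetical cluster name
    version="1.23",
    roleArn="arn:aws:iam::123456789012:role/eks-cluster-role",  # placeholder
    resourcesVpcConfig={"subnetIds": ["subnet-aaa111", "subnet-bbb222"]},
)

# Block until the managed control plane is active, then read its endpoint.
eks.get_waiter("cluster_active").wait(name="catalog-prod")
endpoint = eks.describe_cluster(name="catalog-prod")["cluster"]["endpoint"]
print(f"EKS control plane ready at {endpoint}")
```

Everything above this engine, such as fleet-wide policy, upgrades, and multi-cluster operations, is the management layer being discussed, whether built in house or bought.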
>> You have been a part of the global startup program at AWS and startup showcase. Proud to feature this great collaboration. I'm John Walls. You're watching theCUBE, which is of course the leader in high tech coverage.
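Anant's managed-services point from the conversation above ("as simple as turning on a switch and using it") also translates directly to code. This is a hedged sketch using the boto3 clients for the two services he names, OpenSearch and ElastiCache; the domain and cluster identifiers, instance types, and node counts are illustrative assumptions, not a recommended configuration.

```python
# Managed services via API calls: AWS provisions, patches, and scales the
# underlying hosts. Identifiers and sizes here are illustrative only.
import boto3

# Managed search (OpenSearch): a three-node domain in one call.
opensearch = boto3.client("opensearch", region_name="us-west-2")
opensearch.create_domain(
    DomainName="catalog-search",  # hypothetical
    EngineVersion="OpenSearch_1.3",
    ClusterConfig={"InstanceType": "r6g.large.search", "InstanceCount": 3},
)

# Managed caching (ElastiCache): a single-node Redis cluster.
elasticache = boto3.client("elasticache", region_name="us-west-2")
elasticache.create_cache_cluster(
    CacheClusterId="catalog-cache",  # hypothetical
    Engine="redis",
    CacheNodeType="cache.r6g.large",
    NumCacheNodes=1,
)
```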

Published Date : Nov 30 2022



Satyen Sangani, Alation | Cube Conversation



(upbeat electronic music) >> As we've previously reported on theCUBE, Alation was an early pioneer in the data, data governance, and data management space, which is now rapidly evolving with the help of AI and machine learning into what's often referred to as data intelligence. Many companies, you know, they didn't make it through the last era of data. They failed to find the right product market fit or scale beyond their close circle of friends, or some ran out of money or got acquired. Alation is a company that did make it through, and has continued to attract investor support, even in a difficult market where tech IPOs have virtually dried up. Back with me on theCUBE is Satyen Sangani, who's the CEO and co-founder of Alation. Satyen, good to see you again. Thanks for coming on. >> Great to see you, Dave. It's always nice to be on theCUBE. >> Hey, so remind our audience why you started Alation 10 years ago, you and your co-founders, and what you're all about today. >> Alation's vision is to empower a curious and rational world, which sounds like a really, I think, presumptuous thing to say. But I think it's something that we really need, right? If you think about how people make decisions, often it's still with bias or ideology, and we think a lot of that happens because people are intimidated by data, or often don't know how to use it, or don't know how to think scientifically. And we, at the core, started Alation because we wanted to demystify data for people. We wanted to help people find the data they needed and allow them to use it and to understand it better. And all of those core consumption values around information were what led us to start the company, because we felt like the world of data could be a little easier to use and manage. >> Your founding premise was correct. I mean, just getting the technology to work was so hard, and as you well know, it takes seven to 10 years to actually start a company and get traction, let alone hit escape velocity. So as I said in the open, you continue to attract new investors. What's the funding news? Please share with us. >> So we're announcing that we raised 123 million from a cohort of investors led by Thoma Bravo, Sanabil Investments, and Costanoa. Databricks Ventures is a participant in that round, along with many of our other existing investors, which would also include Salesforce amongst others. And so, super excited to get the round done in this interesting market. We were able to do that because of the business performance, and it was an up round, and all of that's great and gives our employees and our customers the fuel they need to get the product that they want. >> So why the E round? Explain that. >> So, we've been accelerating growth over the last five quarters since our Series D. We've basically increased our growth rate to almost double since the time we raised our last round. And from our perspective, the data intelligence market, which is the market that we think we have the opportunity to continue to be the leading platform in, is growing super fast. And when faced with the decision of decelerating growth in the face of what might be, what could be, a challenging macroeconomic environment, and accelerating when we're seeing customers increase the size of their commitments, more new customers signing on than ever, our growth rate increasing, we and the board basically chose to take the latter approach, and we sort of said, "Look, this is an amazing time in this category. This is an amazing time in this company. 
It's time to invest and it's time to be aggressive when a lot of other folks are fearful, and a lot of other folks aren't seeing the traction that we're seeing in our business. >> Why do you think you're seeing that traction? I mean, we always talk about digital transformation, which was a buzzword before the pandemic, but now it's become a mandate. Is that why? Is it just more data related? Explain that if you could. >> I think there's this potentially, you know, somewhat confusing thing about data. There's a, maybe it's a dirty secret of data, which is there's the sense that if you have a lot of data, and you're using data really well, and you're producing a ton of data, that you might be good at managing it. And the reality of it is that as you have more people using data and as you produce more data, it just becomes more and more confusing because more and more people are trying to access the same information to answer different questions, and more workloads are produced, and more applications are produced. And so the idea of getting more data actually means that it's really hard to manage and it becomes harder to manage at scale. And so, what we're seeing is that with the advent of platforms like AWS, like Snowflake, like Databricks, and certainly with all of the different on-premise applications that are getting born every single day, we're just seeing that data is becoming really much more confusing, but being able to navigate it is so much more important because it's the lifeblood for any business to build differentiation and satisfy their customers. >> Yeah, so last time we talked, we talked about the volume and velocity bromide from the last decade, but we talked about value and how hard it is to get value. So that's really the issue is the need and desire for more organizations to get more value out of that data is actually a stronger tailwind than the headwinds that you're seeing in the macroeconomic environment. >> Right. Because I think in good times you need data in order to be able to capitalize off all the opportunities that you've got, but in bad times you've got to make hard choices. And when you need to make hard choices, how do you do that? Well, you've got to figure out what the right decisions are, and the best way to do that is to have a lot of data and a lot of people who understand that data to be able to capitalize on it and make better insights and better decisions. And so, you don't see that just, by the way, theoretically. In the last quarter, we've seen three companies that have had cost reductions and force reductions where they are increasing at the same time their investment with Alation. And it's because they need the insight in order to be able to navigate these challenging times. >> Well, congratulations on the up round. That's awesome. I got to ask you, what was it like doing a raise in this environment? I mean, sellers are in control in the public markets. Late stage SaaS companies, that had to be challenging. How did you go about this? What were the investor conversations like? >> It certainly was a challenging fundraise. And I would say even though our business is doing way better and we were able to attract evaluation that would put us in the top quartile of public companies were we trading as a public company, which we aspire to do at some point, it was challenging because there was a whole slew of investors who were basically sitting on their hands. 
I had one investor conversation where an investor said to me, "Look, we think you're a great business, but we have companies that are able to give us 2.5 liquidation preference, and that gives us 70%, 75% of our return day one. So we're just going to go do those companies that may have been previously overvalued, but are willing to give us these terms because they want to keep their face valuation." Other investors said, "Look, we'd really rather that you ran a lower growth plan but with a potentially lower burn plan. But we think the upside is really something that you can capitalize on." From our perspective, we were pretty clear about the plan that we wanted to run and didn't want to necessarily totally accommodate to the fashion of the current market. We've always run a historically efficient business. The company has not burned as much as many of the data peers that we've seen to grow to get to our scale, but our general view was, look, we've got a really clear plan. The board, and the company, and the management team know exactly what we'd like to do. We've got customers that know exactly what they want from us, so we really just have to go execute. And the luck is that we found investors who were willing to do that. Many investors, and we picked one in Thoma Bravo that we felt could be the best partner for the coming phase of the company. >> So I love that because you see the opportunity, you've had a very efficient business. You're punching above your weight in terms of your use of capital. So you don't want to veer off. You know your business better than anybody. You don't want to veer off that plan. The board's very supportive. I could see you, you hear it all the time, we're going to dial down the growth, dial up the EBIT, and that's what markets want today. So congratulations on sticking to your beliefs and your vision. How do you plan to use the funds? >> We are planning to invest in sales and marketing globally. So we've expanded in Asia-Pacific over the most recent year, and also in (indistinct) and we plan to continue to do that. We're going to continue to expand in public sector with fed. And so, you would see us basically just increase our presence globally in all of the markets that you might expect. In particular, you're going to see us lean in heavily to many of the partners Databricks invested alongside this particular round. But you would have seen previously that Snowflake was a fabulous, and has been a fabulous partner of ours, and we are going to continue to invest alongside these leading data platforms. What you would also expect to see from us, though, is a lot of investment in R&D. This is a really nascent category. It's a really, really hard space. People would call it a crowded market because there are a lot of players. I think from our perspective, our aspirations to be the leading data intelligence platform, platform being a really key word there because it's not like we can do it all ourselves. We have a lot of different use cases in data intelligence, things like data quality and data observability, things like data privacy and data access control. And we have some really great partners that we walk alongside in order to make the end customer successful. I think a lot of folks in this market think, "Oh, we can just be master of all. Sort of jack of all trades, master of none." That is not our strategy. 
Our strategy is to really focus on getting all our customers super successful, really focused on engagement and adoption, because the really hard thing with these platforms is to get people to use them, and that is not a problem Alation has had historically. >> You know, it's really interesting, Satyen, you talk about, I mean, Thoma Bravo, obviously, very savvy investors, deep pockets, they've been making some moves. Certainly we've seen that in cyber security and data. So you got some quasi patient capital there. But the interesting thing to me is that the previous Snowflake investment last year and now Databricks, a lot of people think of them as sort of battling it out, but my view is it's not a zero sum game, meaning, yes, there's overlap, but they're filling a lot of gaps in the marketplace, and I think there's room, there's so much opportunity, and there's such a large TAM, that partnering with both is a really, really smart idea. I'll give you the last word. Going forward, what can we expect from Alation? >> Well, I think that's absolutely true, and I think that the biggest boogeyman with all of this is that people don't use data. And so, our ability to partner together is really just a function of making customers successful and continuing to do that. And if we can do that, all companies will grow. We ended up ultimately partnering with Databricks and deepening our partnership, really, 'cause we had one already, primarily because of the fact that we have over a hundred customers that are jointly using the products today. And so, it certainly made sense for us to continue to make that experience better 'cause customers are demanding it. From my perspective, we just have this massive opportunity. We have the ability and the insight to run a really efficient, very, very high growth business at scale. And we have this tremendous ability to get so many more companies and people to use data much more efficiently and much better. Which broadly is, I think, a way in which we can impact the world in a really positive way. And so that's a once in a lifetime opportunity for me and for the team. And we're just going to get after it. >> Well, it's been fun watching Alation over the years. I remember mid last decade talking about this thing called data lakes and how they became data swamps, and you were helping clean that up. And now, the next 10 years of data are not going to be like the last, you know; simplifying things and really democratizing data is the big theme. Satyen, thanks for making time to come back on theCUBE, and congratulations on the raise. >> Thank you, Dave. It's always great to see you. >> And thank you for watching this conversation with the CEO in theCUBE, your leader in enterprise and emerging tech coverage. (gentle electronic music)

Published Date : Nov 2 2022



Raj Gossain, Alation | Cube Conversation



>> Hey everyone. Welcome to this CUBE conversation. I'm your host, Lisa Martin. Raj Gossain joins me now, the chief product officer at Alation. Raj, great to have you on theCUBE. Welcome. >> It's great to be here, Lisa. I've been a fan for a while and excited to have a chance to talk with you live. >> And we've got some exciting stuff to talk about: Alation, in terms of the success in the enterprise market. I see more than 25% of the Fortune 100 are customers, you're doing great. Alation and Snowflake, before we get into your exciting news, talk to me a little bit about the evolution of the partnership. >> Yeah, no, absolutely. So, you know, we've always been a close partner and integrator with Snowflake, and last year Snowflake became an investor in Alation and they participated in our Series D round. And the thing I'm most excited about beyond that is we were announced at the Snowflake Summit back in June to be their data governance partner of the year for the second year running. And so we've always had a close relationship with Snowflake, both at the go-to-market level and at the product level. And you know, the stuff that we're about to talk about is a testament to that. >> Absolutely, it is. So talk to us, before we get into the announcement, about what you're seeing in the market as organizations are really becoming much more serious about being data driven and building a data culture. What are you seeing with respect to enterprises as well as those smaller folks? >> Yeah, no, it's a great question. I mean, you hear the tropes: data is the new oil, data is like water, it's essential. And we're seeing that very consistently across every customer, every segment, every geo that we talk to. I think the challenges that organizations are seeing, that are leading to the amazing growth that we've seen at Alation, are that there's so much data, they don't know where it resides. You've got silos or islands of knowledge that exist across the enterprise. And they need a data intelligence platform to bring it all together, to help them make sense of it and ultimately build a data culture that, you know, lets their employees make data-driven decisions as opposed to relying on gut. And so those are some of the macro trends that we're seeing, and with the migration of data to the cloud, and in particular Snowflake, it seemed like a huge opportunity for us to partner even more closely with Snowflake. And we're excited about the progress that we've seen with them thus far. >> All right, let's get right into it. So first of all, define a data culture, and then talk to us about how Alation and Snowflake are helping organizations to really achieve that. >> Yeah. You know, it's interesting. The company vision that we have at Alation is to empower a curious and rational world. And you know, what that really means is we want to deliver solutions that drive curiosity and drive rational behavior. So making decisions based on data and insights, as opposed to gut, or, you know, the highest paid person's opinion, or what have you. And so delivering a data culture, building a data culture, which is something we hear from all the CDOs that we talk to: Hey, Alation, help us drive data literacy across the organization, provide that single source of reference. So if anybody has a question about, do we have data that answers this, or, you know, what kind of performance are we seeing in this product area? 
Give me a starting point for my data exploration journey. And that's really where Alation and our data intelligence solutions kind of come into play. >> So unpack Alation Cloud Service for Snowflake. Talk to us about what it is, why you're doing it, and what the significance of this partnership and this solution is delivering. >> Absolutely. So the Alation Cloud Service for Snowflake is a brand new offering that we just brought to market. And the intent really was, you know, we've had massive success in the Global 2000. You mentioned the progress that we've had with Fortune 100 customers. We see the need for data culture and data literacy and governance in organizations, you know, that are massive global multinational enterprises, all the way down to divisions of an organization, or even, you know, mid-market and SMB companies. And so we thought there was a huge opportunity to really drive data culture for those organizations that are adopting Snowflake, but still need that data intelligence overlay across the data that's in the Snowflake cloud. And so what we did is we launched the Alation Cloud Service for Snowflake as a free trial, and then, you know, a low-cost purchase solution that, you know, can be adopted for less than a hundred thousand dollars a year. >> Got it. So from a target market perspective, that lower end of the market, because of course, you know, these days, Raj, as we talk about, every company, regardless of size, regardless of industry and location, has to be a data company, and getting there, and really defining and going on a journey to get there, is really complex. So you're going now down market to meet those customers where they are. How will Alation Cloud Service for Snowflake help those customers, those smaller customers, really become data driven and adopt a data culture? >> Yeah. It's a great question. I think the biggest goal that we had was making it really simple and easy for them to begin this journey. So, you know, we are now live in the Snowflake Partner Connect portal. And if someone wants to experience the power of Alation Cloud Service for Snowflake, they just need to go to that portal, click the Alation tile, and literally within less than two minutes, a brand new instance of Alation is spun up. Their Snowflake data is automatically being cataloged as part of this trial. And they have 14 days to go through this experience and get a sense of the power of Alation to give them insights into what's in their Snowflake platform, what governance options they can layer on top of their Snowflake data cloud, and how the data is transforming across their organization. >> So talk to me about who you're talking to within a customer. I was looking at some data that Alation provided to me, and I see that according to Gartner, data culture is priority number one for chief data officers. But for those smaller organizations, do they have chief data officers? Does that responsibility still lie with the CIO? Who are you engaging with? >> Yeah, it's a really great question. I think the larger organizations that we sell to definitely have a CDO, and, you know, the CDO sometimes is the chief data and analytics officer. In smaller organizations, or even in divisions of big companies that, you know, might be target customers for ACS for Snowflake, it could be a VP of analytics, could be a head of marketing operations, could be a data engineering function that might roll up into IT. 
And so I think what's interesting is we wanted to take the friction out of the experience process and the trial process, and whoever is responsible for the Snowflake instance and leveraging Snowflake for data and analytics, they can explore and understand what the power of Alation layered on top of Snowflake can provide for them. >> Okay. So another thing that I uncovered in researching for this segment is McKinsey says data culture is decision culture. I thought that was a really profound statement, but it's also such a challenge to get there, as organizations of all sizes are at various points in their journey to become data driven. What does that mean? How do Alation and Snowflake help customers really achieve that data culture, so that they can really have that decision culture, so they can make faster, better data-based decisions? >> Yeah, so I think a huge part of it, like if we think about our big area of focus: how do we enable users to find, understand, trust, govern, and use data, within Snowflake in this instance? And so step one to drive data culture is, how do you provide a single source of reference, a search box, frankly, you know, Google for your data environment, so that you can actually find data? Then how do you understand it? You know, what's in there? What does it mean? What are the relationships between these data objects? Can I trust this? Is this sandbox data, or is this production data that can be used for reporting and analytics? How do I govern the data, so I know who's using it, who should use it, what policies are there? And so if we go through the set of features that we've built into Alation Cloud Service for Snowflake, it enables us to deliver on that promise, resulting at the very end in the ability to explore the data that exists in the Snowflake platform as well. >> Let's go ahead and unpack that. Now, talk to me about some of the key capabilities of the solution and what it's enabling organizations to achieve. >> Yeah, so, you know, it starts with cataloging the data itself. You know, we are the data catalog company. We basically defined that category. And so step one is, how do we connect to Snowflake and automatically ingest all the metadata that exists within that Snowflake cloud, as well as extract the lineage relationships between tables, so you can understand how the data is transforming within the Snowflake data cloud. And so that provides visibility to begin that find journey, you know, how do I actually discover data. On the understand and trust front, I think where things get really interesting is we've integrated deeply with Snowflake's new data governance features. So they've got data policies that provide things like row-level security and data masking. We integrate directly with those policies, extract them, ingest them into Alation so that they can be discovered, can be easily applied or added to other data sets within Snowflake, directly from the Alation UI. So now you've got policies layered on top of your data environment. Snowflake's introduced tagging and classification capabilities. We automatically extract and ingest those tags. They're surfaced in Alation. So if somebody looks for a data set that they're not familiar with, they can see, oh, here are the policies that this data set is applied to, here are the tags that are applied. 
And so Alation actually becomes almost like a user interface to the data that exists within that Snowflake platform. And then maybe just two other things. With the lineage that we extract, one of the most important things that you can deliver for users is impact analysis. Hey, if I'm gonna deprecate this table, or if I'm gonna make a change to what this table definition is, what are the downstream objects and users that should know about that? So, hey, if this table's going away and my Tableau report over here is gonna stop working, boy, it'd be great to be able to get visibility into that before that change is made. We can do that automatically within the Alation UI, and really just make it easier for somebody to govern and manage the data that exists within the Snowflake data cloud. >> So, easier to govern and manage the data. Let's go up a level or two. Talk to me about some of the business outcomes that this solution is gonna help organizations to achieve. We talked about how every company these days has to be a data company. Consumers expect this very personalized, relevant experience. What are you thinking some of the outcomes are gonna be that this technology and this partnership are gonna unlock? >> Yeah, no, I think step one, and this has always been a huge area of focus for us, is just simply driving business productivity. So, you know, the data that we see in talking to CDOs and CIOs is that the time it takes to onboard and get a data analyst productive can be nine to 12 months. And, you know, we all know the battle for talent these days is significant. And so if we can provide a solution, and this is exactly what we do, that enables an organization to get a data analyst productive in weeks instead of months, or, you know, potentially even a year, the value that that analyst can deliver to the organization goes up dramatically, because they're spending less time looking for data and figuring out who knows what about the data. They can go to Alation, get those insights and start answering business questions, as opposed to trying to wrangle or figure out, does the data exist, and where does it exist? So that's one key dimension. I'd say the other one that I'd highlight is just being able to have a governance program that is monitored, managed, and well understood, so that, you know, whether it's dealing with CCPA or GDPR, you know, some of the regulatory regimes, the ability for an organization to feel like they have control over their data, and they understand where it is, who's using it and how it's being used. Those are hugely important business outcomes that CIOs and CDOs tell us they need. And that's why we built the Alation Cloud Service for Snowflake. >> On the first front, one of the things that popped into my mind, in terms of really enabling workforce productivity, workforce efficiency: getting analysts ramped up dramatically faster also seems to me to be something that your customers can leverage from a talent attraction and retention perspective, which in today's market is critical. >> I'm so glad you mentioned that. That's actually one of the key pillars that we highlight as well: if you give great tools to employees, they're gonna be happier. 
And you'll be a preferred employer, and people are gonna feel like, oh, this is an organization that I wanna work at, because they're making my job easier and they're making it easier for me to deliver value and be productive to the organization. And that's absolutely critical in this war for talent that everybody talks about. It's real, and great self-service tools that are empowering to employees are the things that are gonna differentiate companies and allow them to unleash the power of data. >> Unleash the power of data, really use it to the competitive advantage that it can and should be used for. When you look at customers that are on that journey, that data catalog journey, you probably see such a variety of locations about where they are in that journey. Do you see a common thread when you're in customer conversations? Is there kind of a common denominator that you speak to where you really know Alation and Snowflake here is absolutely the right thing? >> Yeah, no, it's a good question. I would actually say the fact that a customer is on Snowflake means they're already, you know, a step up on that maturity curve. You know, one of the big use cases that we see with customers that is leading to the need for data intelligence solutions like Alation can deliver is digital transformation and cloud migration: you know, we've got legacy data on-prem, we know we need to move to the cloud to get better agility, better scaling, you know, perhaps reduced costs, et cetera. And so I think step one on that qualification criteria, or that maturity journey, is, hey, if you're already in Snowflake, that's a great sign, because you're recognizing the power of a data cloud platform and warehouse like Snowflake. And so that's a great signal to us that this is a customer that wants to, you know, really better understand how they can get value out of their solution. I think the next step on that journey is a recognition that they're not utilizing the data that they have as effectively as they can and should be, and their employees are still struggling with, you know, where does the data exist? Can I trust it? You know, "who do I know" tends to be more important than "do I have a tool that will help me understand the data." And so customers that are asking those sorts of questions are ideal customers for the Alation Cloud Service for Snowflake solution. >> So enabling those customers to get their hands on it, there's a free trial. Talk to us about that. And where can the audience go to actually click and try? >> Absolutely. So, you know, we'll be doing our usual marketing and promotion of this, but what I'm super excited about, you know, again, I mentioned earlier, this is part of our cloud native multi-tenant architecture. We are live in the Snowflake Partner Connect portal. And so if you are logged into Snowflake and are an admin, you can go to the Partner Connect portal and you will see a tile. I think it's alphabetically sorted, and Alation starts with an A, so pretty easy to find. I don't think you'll have to do too much searching. And literally all you have to do is click on that tile, answer a couple quick questions, and in the background, in about two minutes, your Alation instance will get spun up. We will have sample data sets in there. 
There are some guided tours that you can walk through to kind of get a feel for the power of Snowflake. So, policy center, lineage, you know, tags, our intelligent SQL tool that allows you to smartly query the Snowflake data cloud and publish queries, share queries with others, collaborate on them for greater insights. And, you know, as you would expect with any online free trial, we've got a built-in chat bot. So if you have a question, wanna get a better sense of how a particular feature works, or are curious about how Alation might work in other areas, you can, you know, ask a question to the chat bot, and we've got product specialists on the back end that can answer questions. So we really wanna make that journey as seamless and easy as possible. And hopefully that results in enough interest that the trial user wants to become a customer. And that's where our great sales organization will kind of take the baton from there. >> And there's the objective there, and I'm sure, Raj, folks can find out about the free trial and access it. You mentioned through the marketplace; more information on alation.com. I imagine they can go there to access it as well? >> A hundred percent, alation.com. We're on Twitter, we're on LinkedIn, but yeah, if you have any questions, you know, you can just search for Alation Cloud Service for Snowflake, or just go to the alation.com website. Absolutely. >> All right. Alation Cloud Service for Snowflake. Congratulations on the launch to you and the entire Alation team. We look forward to hearing customer success stories and really seeing those business outcomes realized in the next few months. Raj, thanks so much for your time. >> Thank you so much, Lisa. It's great to talk to you. >> Likewise, Raj Gossain. I'm Lisa Martin. Thank you for watching this CUBE conversation. Stay right here for more great action on theCUBE, the leader in live tech coverage.
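The Snowflake governance features Raj describes in this conversation (row access policies, masking policies, and tags) are plain SQL on the Snowflake side, which Alation then extracts and surfaces. Below is a minimal sketch of that Snowflake side, run through the snowflake-connector-python package; the connection parameters, table, column, and role names are hypothetical, and how Alation itself ingests these objects is not shown.

```python
# Hypothetical Snowflake-side governance objects of the kind a catalog can
# extract and surface. All names and credentials are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="myorg-myaccount", user="admin", password="...",  # placeholders
    warehouse="GOVERNANCE_WH", database="ANALYTICS", schema="PUBLIC",
)
cur = conn.cursor()

# Row-level security: non-admins only see rows matching their own role.
cur.execute("""
    CREATE OR REPLACE ROW ACCESS POLICY brand_rows AS (brand STRING)
    RETURNS BOOLEAN ->
      CURRENT_ROLE() = 'ACCOUNTADMIN' OR brand = CURRENT_ROLE()
""")
cur.execute("ALTER TABLE viewership ADD ROW ACCESS POLICY brand_rows ON (brand)")

# Dynamic data masking: hide emails from all but a privileged role.
cur.execute("""
    CREATE OR REPLACE MASKING POLICY email_mask AS (val STRING)
    RETURNS STRING ->
      CASE WHEN CURRENT_ROLE() = 'PII_READER' THEN val ELSE '***MASKED***' END
""")
cur.execute(
    "ALTER TABLE viewership MODIFY COLUMN email SET MASKING POLICY email_mask"
)

# Classification tags on a column, discoverable alongside the policies.
cur.execute("CREATE TAG IF NOT EXISTS pii_level")
cur.execute("ALTER TABLE viewership MODIFY COLUMN email SET TAG pii_level = 'high'")
```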

Published Date : Aug 31 2022



Mitesh Shah, Alation & Ash Naseer, Warner Bros Discovery | Snowflake Summit 2022



(upbeat music) >> Welcome back to theCUBE's continuing coverage of Snowflake Summit '22, live from Caesars Forum in Las Vegas. I'm Lisa Martin, with my cohost Dave Vellante. We've been here the last day and a half unpacking a lot of news, a lot of announcements, talking with customers and partners, and we have another great session coming for you next. We've got a customer and a partner talking tech and data mesh. Please welcome Mitesh Shah, VP of market strategy at Alation. >> Great to be here. >> And Ash Naseer, great to have you, senior director of data engineering at Warner Bros. Discovery. Welcome, guys. >> Thank you for having me. >> It's great to be back in person and to be able to really get to see and feel and touch this technology, isn't it? >> Yeah, it is. I mean, two years or so. Yeah. Great to feel the energy in the conference center. >> Yeah. >> Snowflake was virtual, I think, for two years, and now it's great to kind of see the excitement firsthand. So it's wonderful. >> The excitement, but also the boom in the number of customers and partners and people attending. They were saying the first summit in 2019 had about 1900 attendees, and this is around 10,000. So a huge jump in a short time period. Talk a little bit about the Alation-Snowflake partnership and probably some of the acceleration that you guys have been experiencing as a Snowflake partner. >> Yeah. As a Snowflake partner, I mean, Snowflake became an investor in Alation early last year, and we've been a partner for longer than that. And good news: we have been awarded Snowflake partner of the year for data governance just earlier this week. And that's in fact our second year in a row winning that award. So, great news on that front as well. >> Repeat, congratulations. >> Repeat. Absolutely. And we're going to hope to make it a three-peat as well. And we've also been awarded industry competency badges in five different industries, those being financial services, healthcare, retail, technology, and media and telecom. >> Excellent. Okay. Going to get right into it. Data mesh. You guys actually have a data mesh and you've presented at the conference. So, take us back to the beginning. Why did you decide that you needed to implement something like data mesh? What was the impetus? >> Yeah. So when people think of Warner Bros., you always think of, like, the movie studio, but we're more than that, right? I mean, you think of HBO, you think of TNT, you think of CNN; we have 30 plus brands in our portfolio, and each have their own needs. So the idea of a data mesh really helps us, because what we can do is we can federate access across the company so that, you know, CNN can work at their own pace. You know, when there's election season, they can ingest their own data, and they don't have to, you know, bump up against, as an example, HBO if Game of Thrones is going on. >> So, okay. So the impetus was to serve those lines of business better. Actually, given that you've got these different brands, it was probably easier than for most companies. 'Cause if you're, let's say, a big financial services company, now you have to decide who owns what. CNN owns its own data products, HBO. Now, do they decide within those different brands how to distribute even further? Or, really, how deep have you gone in that decentralization? >> That's a great question. It's a very close partnership, because there are a number of data sets which are used by all the brands, right? 
You think about people browsing websites, right? You know, CNN has a website, Warner Brothers has a website. So for us to ingest that data, for each of the brands to ingest that data separately, that means five different ways of doing things and, you know, a big environment, right? So that is where our team comes into play. We ingest a lot of the common data sets, but like I said, any unique data sets, data sets regarding theatrical as an example, you know, Warner Brothers does it themselves; you know, for streaming, HBO Max does it themselves. So we kind of operate in partnership. >> So do you have a centralized data team and also decentralized data teams, right? >> That's right. >> So I love this conversation, because that was heresy 10 years ago, five years ago, even, cause that's inefficient. But you've, I presume you've found that it's actually more productive in terms of the business output, explain that dynamic. >> You know, you bring up such a good point. So I, you know, I consider myself as one of the dinosaurs who started, like, 20 plus years ago in this industry. And back then, we were all taught to think of the data warehouse as, like, a monolithic thing. And the reason for that is the technology wasn't there. The technology didn't catch up. Now, 20 years later, the technology is way ahead, right? But, like, our mindset's still the same because we think of data warehouses and data platforms still as a monolithic thing. But if you really sort of remove that sort of mental barrier, if you will, and if you start thinking about, well, how do I sort of, you know, federate everything and make sure that you let folks who are building, or are closest to the customer or are building their products, let them own that data and have a partnership. The results have been amazing. And if we were only sort of doing it as a centralized team, we would not be able to do a 10th of what we do today. So it's that massive scale in our company as well. >> And I should have clarified, when we talk about data mesh, are we talking about implementing, in practice, the Dehghani sort of framework, or is this sort of your own sort of terminology? >> Well, so the interesting part is, four years ago we didn't have- >> It didn't exist. >> Yeah. It didn't exist. And so we, our principle was very simple, right? When we started out, we said we want to make sure that our brands are able to operate independently, with some oversight and guidance from our technology teams, right? That's what we set out to do. We did that with Snowflake by design, because Snowflake allows us to, you know, separate those brands into different accounts. So that was done by design. And then the magic, I think, is the Snowflake data sharing, which allows us to sort of bring data in here once and then share it with whoever needs it. So think about HBO Max. On HBO Max, you not only have HBO Max content, but content from CNN, from Cartoon Network, from Warner Brothers, right? All the movies, right? So to see how The Batman movie did in theaters and then on streaming, you don't need, you know, Warner Brothers doesn't need to ingest the same streaming data. HBO Max does it. HBO Max shares it with Warner Brothers, you know, store once, share many times, and everyone works at their own pace. >> So they're building data products. Those data products are discoverable APIs, I presume, or I guess maybe just, I guess the Snowflake cloud, but very importantly, they're governed. And that's where Alation comes in, correct?
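The "store once, share many times" pattern Ash describes maps onto Snowflake's secure data sharing between accounts: a producer publishes a share, and a consumer mounts it as a read-only database without copying anything. The sketch below shows the general shape in Python with the snowflake-connector-python package; every account, database, and table name is a hypothetical stand-in, not Warner Bros Discovery's actual setup.

# Hypothetical sketch of Snowflake secure data sharing ("store once, share many times").
# Requires the snowflake-connector-python package; all names are made up.
import snowflake.connector

# Producer side: the brand that ingested the data publishes a share
# instead of copying tables into other accounts.
producer = snowflake.connector.connect(
    account="wbd_streaming",   # hypothetical producer account identifier
    user="DATA_ENGINEER",
    password="...",            # use key-pair auth or a secrets manager in practice
    role="ACCOUNTADMIN",
)
cur = producer.cursor()
cur.execute("CREATE SHARE IF NOT EXISTS viewership_share")
cur.execute("GRANT USAGE ON DATABASE streaming_db TO SHARE viewership_share")
cur.execute("GRANT USAGE ON SCHEMA streaming_db.analytics TO SHARE viewership_share")
cur.execute("GRANT SELECT ON TABLE streaming_db.analytics.title_viewership TO SHARE viewership_share")
# Make the share visible to another brand's account (the identifier format
# depends on your organization's account naming).
cur.execute("ALTER SHARE viewership_share ADD ACCOUNTS = wbd_theatrical")

# Consumer side: the other brand mounts the share as a read-only database.
# No data moves; everyone reads the single governed copy at their own pace.
consumer = snowflake.connector.connect(
    account="wbd_theatrical", user="ANALYST", password="...", role="ACCOUNTADMIN",
)
consumer.cursor().execute(
    "CREATE DATABASE shared_viewership FROM SHARE wbd_streaming.viewership_share"
)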
>> That's precisely where Alation comes in, as sort of this central, flexible foundation for data governance. You know, you mentioned data mesh. I think what's interesting is that it's really an answer to the bottlenecks created by centralized IT, right? There's this notion of decentralizing to the data engineers and making the data domain owners, the people that know the data the best, be in control of publishing the data to the data consumers. There are other popular concepts actually happening right now, as we speak, around the modern data stack, around data fabric, that are also in many ways underpinned by this notion of decentralization, right? These are concepts that are underpinned by decentralization, and as the pendulum swings, sort of between decentralization and centralization, as we go back and forth in the world of IT and data, there are certain constants that need to be centralized over time. And one of those, I believe, is very much a centralized platform for data governance. And that's certainly, I think, where we come in. Would love to hear more about how you use Alation. >> Yeah. So, I mean, Alation helps us sort of map, as you guys say, the treasure map of the data, right? So for consumers to find where their data is, that's where Alation helps us. It helps us with the data cataloging, you know, storing all the metadata, and, you know, users can go in, they can sort of find, you know, the data that they need, and they can also find how others are using data. So there's a little bit of a crowdsourcing aspect that Alation helps us with, whereby, you know, you can see, okay, my peer in the other group, well, that's how they use this piece of data. So I'm not going to spend hours trying to figure this out. You're going to use the query that they use. So yeah. >> So you have a master catalog, I presume. And then each of the brands has their own sub catalogs, is that correct? >> Well, for the most part, we have that master catalog and then the brands sort of use it, you know, separately themselves. The key here is that catalog isn't maintained by a centralized group, right? It's, again, maintained by the individual teams, and not only the individual teams, but the folks that are responsible for the data, right? So I talked about the concept of crowdsourcing; whoever sort of puts the data in has to make sure that they update the catalog and make sure that the definitions are there and everything's sort of in line. >> So HBO, CNN, each have their own sort of access to their catalog, but they feed into the master catalog. Is that the right way to think about it? >> Yeah. >> Okay. And they have their own virtual data warehouses, right? They have ownership over that? They can spin 'em up, spin 'em down as they see fit? Right? And they're governed. >> They're governed. And what's interesting is it's not just governed, right? Governance is a big word. It's a bit nebulous, but what's really being enabled here is this notion of self-service as well, right? There's two big sort of rockets that need to happen at the same time in any given organization. There's this notion that you want to put trustworthy data in the hands of data consumers, while at the same time mitigating risk. And that's precisely what Alation does. >> So I want to clarify this for the audience. So there's four principles of data mesh. This came after you guys did it. And I wonder how it aligns.
Domain ownership, give data, as you were saying, to the domain owners who have context; data as product, you guys are building data products; and that creates two problems. How do you give people self-service infrastructure, and how do you automate governance? So the first two, great. But then it creates these other problems. Does that align with your philosophy? Where's alignment? What's different? >> Yeah. Data products is exactly where we're going. And that sort of, that domain-based design, that's really key as well. In our business, you think about who the customer is, as an example, right? Depending on who you ask, the answer might be different. You know, to the movie business, it's probably going to be the person who watches a movie in a theater. To the streaming business, to HBO Max, it's the streamer, right? To others, someone watching live CNN on their TV, right? There's yet another group. Think about all the franchising we do. So you see Batman action figures and T-shirts, and Warner Brothers branded stuff in stores; that's yet another business unit. But at the end of the day, it's not a different person, it's you and me, right? We do all these things. So the domain concept makes sure that you ingest data and you bring data relevant to the context, however, not sort of making it so stringent where it cannot integrate, and then you integrate it at a higher level to create that 360. >> And it's discoverable. So the point is, I don't have to go tap Ash on the shoulder, say, how do I get this data? Is it governed? Do I have access to it? Give me the rules of it. Just, I go grab it, right? And the system computationally automates whether or not I have access to it. And it's, as you say, self-service. >> In this case, exactly right. It enables people to just search for data and know, when they find the data, whether it's trustworthy or not, through trust flags and the like. It's doing both of those things at the same time. >> How is it an enabler of solving some of the big challenges that the media and entertainment industry is going through? We've seen so much change the last couple of years. The rising consumer expectations aren't going to go back down. They're only going to come up. We want you to serve us up content that's relevant, that's personalized, that makes sense. I'd love to understand from your perspective, Mitesh, from an industry challenges perspective, how does this technology help customers like Warner Brothers Discovery meet business customers where they are and reduce the volume on those challenges? >> It's a great question. And as I mentioned earlier, we had five industry competency badges that were awarded to us by Snowflake. And one of those is media and telecom. And the reason for that is we're helping media companies understand their audiences better, and ultimately serve up better experiences for their audiences. But we've got Ash right here that can tell us how that's happening in practice. >> Yeah, tell us. >> So I'll share a story. I always like to tell stories, right? Once upon a time, before we had Alation in place, it was like, who you knew was how you got access to the data. So if I knew you, and I knew you had access to a certain kind of data, your access to the right kind of data was based on the network you had at the company- >> I had to trust you. >> Yeah. >> I might not want to give up my data. >> That's it.
And so that's where Alation sort of helps us democratize it, but, you know, puts the governance and controls in place, right? There are certain sensitive things as well, such as viewership, such as subscriber accounts, which are very important. So making sure that the right people have access to it, that's the other problem that Alation helps us solve. >> That's precisely part of our integration with Snowflake in particular: being able to define and manage policies within Alation, saying, you know, certain people should have access to certain rows, doing column-level masking. And having those policies actually enforced at the Snowflake data layer is precisely part of our value proposition. >> And that's automated. >> And all that's automated. Exactly. >> Right. So I don't have to think about it. I don't have to go through the tap on the shoulder. What has been the impact, Ash, on data quality as you've pushed it down into the domains? >> That's a great question. So it has definitely improved, but data quality is a very interesting subject, because back to my example of, you know, when we started doing things, the centralized IT team always said, well, it has to be like this, right? And if it doesn't fit in this, then it's bad quality. Well, sometimes context changes. Businesses change, right? You have to be able to react to it quickly. So making sure that a lot of that quality is managed at the decentralized level, at the place where you have that business context, that ensures you have the most up-to-date quality. We're talking about the media industry changing so quickly. I mean, would we have thought three years ago that people would watch a lot of these major movies on streaming services? But here's the reality, right? You have to react and, you know, having it at that level just helps you react faster. >> So data, if I play that back, data quality is not a static framework. It's flexible based on the business context, and the business owners can make those adjustments, cause they own the data. >> That's it. That's exactly it. >> That's awesome. Wow. That's amazing progress that you guys have made. >> On quality, if I could just add, it also just changes depending on where you are in your data pipeline stage, right? Data quality, data observability, this is a very fast-evolving space at the moment, and if I look to my left right now, I bet you I can probably see a half-dozen data quality and observability vendors right now. And so given that, and given the fact that Alation still is sort of a central hub to find trustworthy data, we've actually announced an open data quality initiative, allowing best-of-breed data quality vendors to integrate with the platform. So whoever they are, whatever tool folks want to use, they can use that particular tool of choice. >> And this all runs in the cloud, or is it a hybrid sort of? >> Everything is in the cloud. We're all in the cloud. And you know, again, helps us go faster. >> Let me ask you a question. I could go on forever on this topic. One of the concepts that was put forth is, whether it's a Snowflake data warehouse or a Databricks data lake or an Oracle data warehouse, they should all be inclusive. They should just be a node on the mesh. Like, wow, that sounds good. But I haven't seen it yet. Right? I'm guessing that Snowflake and Alation enable all the self-serve, all this automated governance, and that including those other items, it's got to be a one-off at this point in time.
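The row access and column masking Mitesh mentions correspond to native Snowflake policy objects, so enforcement happens in the data layer no matter which tool issues the query. A hedged sketch of what defining such policies can look like; the roles, tables, and columns are hypothetical, and a catalog layer such as Alation would manage the policy intent on top of this.

# Hypothetical sketch: row access and masking policies enforced by Snowflake.
# Assumes snowflake-connector-python; every name here is made up.
import snowflake.connector

conn = snowflake.connector.connect(
    account="wbd_central", user="GOV_ADMIN", password="...", role="POLICY_ADMIN",
)
cur = conn.cursor()

# Row access: a brand role only sees rows for its own brand, while a
# central data team role sees everything.
cur.execute("""
CREATE OR REPLACE ROW ACCESS POLICY brand_rows AS (brand STRING)
RETURNS BOOLEAN ->
    CURRENT_ROLE() = 'CENTRAL_DATA_TEAM'
    OR brand = CURRENT_ROLE()   -- e.g., role HBO_MAX sees rows tagged HBO_MAX
""")
cur.execute("ALTER TABLE analytics.viewership ADD ROW ACCESS POLICY brand_rows ON (brand)")

# Column masking: sensitive subscriber identifiers are masked for
# everyone except privileged roles.
cur.execute("""
CREATE OR REPLACE MASKING POLICY subscriber_mask AS (val STRING)
RETURNS STRING ->
    CASE WHEN CURRENT_ROLE() IN ('CENTRAL_DATA_TEAM') THEN val
         ELSE '***MASKED***' END
""")
cur.execute(
    "ALTER TABLE analytics.viewership MODIFY COLUMN subscriber_id "
    "SET MASKING POLICY subscriber_mask"
)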
Do you ever see yourselves expanding that scope, or is it better off to just kind of leave it in the Snowflake data cloud? >> It's a good question. You know, I feel like where we're at today, especially in terms of sort of technology giving us so many options, I don't think there's a one-size-fits-all. Right? Even though we are very heavily invested in Snowflake and we use Snowflake consistently across the organization, you could, theoretically, have an architecture that blends those two, right? Have different types of data platforms, like a Teradata or an Oracle, and sort of bring it all together today. We have the technology, you know, that and all sorts of things that can make sure that you can query across different databases. So I don't think the technology is the problem, I think it's the organizational mindset. I think that that's what gets in the way. >> Oh, interesting. So I was going to ask you, will hybrid tables help you solve that problem? And maybe not, from what you're saying; it's the organization that owns the Oracle database saying, hey, we have our system. It processes, it works, you know, go away. >> Yeah. Well, you know, hybrid tables, I think, is a great sort of next step in Snowflake's evolution. In my opinion, I think it's a game changer, but yeah. I mean, they can still exist. You could do hybrid tables right on Snowflake, or you could, you know, you could kind of coexist as well. >> Yeah. But, do you have a thought on this? >> Yeah, I do. I mean, we're always going to live in a time where you've got data distributed throughout the organization and around the globe. And that could be, even if you're all in on Snowflake, you could have data in Snowflake here, you could have data in Snowflake in EMEA and Europe somewhere. It could be anywhere. By the same token, you might be using on-premises systems; every organization is. They have data, they naturally have data everywhere. And so, you know, the one solution to this is really centralizing, as I mentioned, not just governance, but also metadata about all of the data in your organization, so that you can enable people to search and find and discover trustworthy data no matter where it is in your organization. >> Yeah. That's a great point. I mean, if you have the data about the data, then you can treat these as independent nodes. That's just that. Right? And maybe there's some advantages of putting it all in the Snowflake cloud, but to your point, organizationally, that's just not feasible. The whole, unfortunately, sorry, Snowflake, all the world's data is not going to go into Snowflake, but they play a key role in accelerating, what I'm hearing, your vision of data mesh. >> Yeah, absolutely. I think going forward in the future, we have to stop thinking about data platforms as just one place where you sort of dump all the data. That's where the mesh concept comes in. It is going to be a mesh. It's going to be distributed, and organizations have to be okay with that. And they have to embrace the tools. I mean, you know, Facebook developed a tool called Presto many years ago that helps them solve exactly the same problem. So I think the technology is there. I think the organizational mindset needs to evolve. >> Yeah. Definitely. >> Culture. Culture is one of the hardest things to change. >> Exactly. >> Guys, this was a masterclass in data mesh, I think. Thank you so much for coming on talking. >> We appreciate it. Thank you so much. >> Of course.
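Ash's closing point about Presto is the federated query pattern: one engine plans a single SQL statement across several systems and reads each dataset in place, which is what lets a warehouse, a lake, and an operational database all act as nodes on the mesh. A minimal sketch using the Trino Python client (Trino is the community successor to Presto); the host, catalogs, and tables are hypothetical.

# Hypothetical federated query: one engine joins two different backends in place.
# Assumes a running Trino cluster with "hive" and "postgresql" catalogs configured
# and the trino Python package installed.
from trino.dbapi import connect

conn = connect(host="trino.example.internal", port=8080, user="analyst")
cur = conn.cursor()

# Join clickstream data in a data lake with account data in an operational
# database, without copying either dataset anywhere.
cur.execute("""
    SELECT a.account_name, count(*) AS page_views
    FROM hive.web.page_views v
    JOIN postgresql.crm.accounts a
      ON v.account_id = a.account_id
    GROUP BY a.account_name
    ORDER BY page_views DESC
    LIMIT 10
""")
for account_name, page_views in cur.fetchall():
    print(account_name, page_views)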
It's great to hear what Alation is doing with Snowflake and with Warner Brothers Discovery. Keep that content coming. I've got a lot of stuff I've got to catch up on watching. >> Sounds good. Thank you for having us. >> Thanks guys. >> Thanks, you guys. >> For Dave Vellante, I'm Lisa Martin. You're watching theCUBE live from Snowflake Summit '22. We'll be back after a short break. (upbeat music)
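A footnote on the idea the interview closes on, centralize the metadata rather than the data: a catalog records where each asset lives, who owns it, whether it can be trusted, and even how peers query it, while the data itself stays wherever it is. The toy sketch below illustrates the concept only; it is not Alation's actual data model or API.

# Toy illustration of "centralize the metadata, not the data": a tiny
# in-memory catalog of assets that physically live in different systems.
# Teaching sketch only; all entries are made up. Requires Python 3.9+.
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    name: str
    system: str              # where the data physically lives
    owner: str               # the domain team responsible for it
    trusted: bool            # e.g., a trust flag set by the owning team
    tags: set[str] = field(default_factory=set)
    popular_query: str = ""  # crowdsourced: how peers actually use it

catalog = [
    CatalogEntry("title_viewership", "snowflake:streaming_account",
                 "streaming data team", True, {"streaming", "viewership"},
                 "SELECT title, sum(hours) FROM title_viewership GROUP BY title"),
    CatalogEntry("box_office_daily", "oracle:theatrical_db",
                 "theatrical data team", True, {"theatrical", "revenue"}),
    CatalogEntry("tmp_export_v2", "s3://scratch/", "unknown", False),
]

def search(term: str) -> list[CatalogEntry]:
    """Find trusted assets matching a name or tag, wherever they live."""
    return [e for e in catalog if e.trusted and (term in e.name or term in e.tags)]

for entry in search("viewership"):
    print(entry.name, "->", entry.system, "| try:", entry.popular_query)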

Published Date : Jun 30 2022



Ritika Gunnar, IBM | IBM Think 2018


 

>> Narrator: Live from Las Vegas, it's theCUBE! Covering IBM Think 2018. Brought to you by IBM. >> Hello, I'm John Furrier. We're here in theCUBE studios at Think 2018, IBM Think 2018 in Mandalay Bay, in Las Vegas. We're extracting the signal from the noise, talking to all the executives, customers, thought leaders, inside the community of IBM and theCUBE. Our next guest is Ritika Gunnar, who is the VP of Product for Watson and AI, cloud data platforms, all the goodness of the product side. Welcome to theCUBE. >> Thank you, great to be here again. >> So, we love talking to the product people because we want to know what the product strategy is. What's available, what's the hottest features. Obviously, we've been talking about, these are our words, Ginni introduced the innovation sandwich. >> Ritika: She did. >> The data's in the middle, and you have blockchain and AI on both sides of it. This is really the future. This is where they're going to see automation. This is where you're going to see efficiencies being created, inefficiencies being abstracted away. Obviously blockchain's got more of an infrastructure, futuristic piece to it. AI in play now, machine learning. You got Cloud underneath it all. How has the product morphed? What is the product today? We've heard of World of Watson in the past. You got Watson for this, you got Watson for IoT, you got Watson for that. What is the current offering? What's the product? Can you take a minute, just to explain what, semantically, it is? >> Sure. I'll start off by saying, what is Watson? Watson is AI for smarter business. I want to start there. Because Watson is equal to how do we really get AI infused in our enterprise organizations, and that is the core foundation of what Watson is. You heard a couple of announcements at the conference this week about what we're doing with Watson Studio, which is about providing that framework for what it means to infuse AI in our clients' applications. And you talked about machine learning. It's not just about machine learning anymore. It really is about how do we pair what machine learning is, which is about tweaking and tuning single algorithms, with what we're doing with deep learning. And that's one of the core components of what we're doing with Watson Studio: how do we make AI truly accessible. Not just machine learning but deep learning, to be able to infuse those in our client environments really seamlessly, and so the deep learning as a service piece of what we're doing in the Studio was a big part of the announcements this week, because deep learning as a service allows our clients to really have it in a very accessible way. And there were a few things we announced with deep learning as a service. We said, look, just like with predictive analytics, we have capabilities that easily allow you to democratize that to knowledge workers and to business analysts by adding drag-and-drop capabilities. We can do the same thing with deep learning and deep learning capabilities. So we have taken a lot of things that have come from our research area and started putting those into the product to really bring about enterprise capabilities for deep learning, but in a really de-skilled way. >> Yeah, and also to remind the folks, there's a platform involved here. Maybe you can say it's been re-platformed, I don't know. Maybe you can answer that. Has it been re-platformed, or is it just the platformization of existing stuff? Because there's certainly demand.
TensorFlow at Google showed that there's a demand for machine learning libraries, and then deep learning behind it. You got Amazon Web Services with SageMaker touting an as-a-service model for AI; it's definitely in demand. So talk about the platform piece underneath. What is it? How does it get rendered? And then we'll come back and talk about the user consumption side. >> So it definitely is not a re-platformization. You recall what we have done with a focus initially on what we did on data science and what we did on machine learning. And the number one thing that we did was we were about supporting open source and open frameworks. So it's not just one framework, like a TensorFlow framework, but it's about what we can do with TensorFlow, Keras, PyTorch, Caffe, and be able to use all of our builders' favorite open-source frameworks, and be able to use that in a way where then we can add additional value on top of that and help them accelerate what it means to actually have that in the enterprise and what it means to actually de-skill that for the organization. So we started there. But really, if you look at where Watson has focused on the APIs and the API services, it's bringing together those capabilities of what we're doing with unstructured, pre-trained services, and then allowing clients to be able to bring together the structured and unstructured together on one platform, and adding the deep learning as a service capabilities, which is truly differentiating. >> Well, I think the important point there, just to amplify, and for the people to know, is it's not just your version of the tools for the data; you're looking at bringing data in from anywhere the customer, your customer, wants it. And that's super critical. You don't want to ignore data. You can't. You got to have access to the data that matters. >> Yeah, you know, I think one of the other critical pieces that we're talking about here is, data without AI is meaningless, and AI without data is really not useful or very accurate. So, having both of them in a yin yang and then bringing them together, as we're doing in the Watson Studio, is extremely important. >> The other thing I want to get to now is the user side, the consumption side. You mentioned making it easier, but one of the things we've been hearing, that's been a theme in the hallways and certainly in theCUBE here, is: bad data equals bad AI. >> Bad data equals bad AI. >> It's not just about bolting AI on, you really got to take a holistic approach and a hygiene approach to the data, and understand where the data is contextually relevant to the application. Talk about that; it's kind of nuanced, but break it down. What's your reaction to that, and how do you talk to customers saying, okay, look, you want to do AI, here's the playbook? >> Well, you heard of the AI ladder, making your data ready for AI. This is a really important concept, because you need to be able to have trust in the data that you have, relevancy in the data that you have, and so it is about not just the connectivity to that data, but can you start having curated and rich data that is really valuable, that's accurate, that you can trust, that you can leverage. It becomes not just about the data, but about the governance and the self-service capabilities that you can have around that data, and then it is about the machine learning and the deep learning characteristics that you can put on there. But all three of those components are absolutely essential.
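To make "deep learning as a service" concrete: below is the kind of hand-built Keras model, one of the open frameworks Ritika lists, that drag-and-drop builders and managed training services aim to abstract away. It is a generic, framework-level sketch with made-up stand-in data, not Watson Studio's actual API.

# A generic deep learning model in Keras, the hand-built code that
# "deep learning as a service" tooling abstracts behind visual builders
# and managed training. Stand-in data only; not Watson Studio's API.
import numpy as np
from tensorflow import keras

# Toy data: 1,000 samples with 20 features each, binary labels.
x_train = np.random.rand(1000, 20).astype("float32")
y_train = np.random.randint(0, 2, size=(1000,))

model = keras.Sequential([
    keras.layers.Input(shape=(20,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# The tweaking and tuning (epochs, layer sizes, optimizers) is the part
# a managed service can sweep automatically, at scale, on GPUs.
model.fit(x_train, y_train, epochs=5, batch_size=32, validation_split=0.2)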
What we're seeing is it's not even about the data that you have within the firewall of your organization; it's about what you're doing to really augment that with external data. That's another area where having pre-trained, enriched data sets, with what we're doing with the Watson data kits, is extremely important; industry-specific data. >> Well, you know, my pet peeve is always, I love data. I'm a data geek, I love innovation, I love data driven, but you can't have data without good human interaction. The human component is critical, and certainly we're seeing trends where startups like Alation, that we've interviewed, are taking this social approach to data, where they're looking at it like you don't need to be a data geek or data scientist. The average business person's creating the value, and especially in blockchain, we were just talking in theCUBE, it's the business model innovations, it's universal property, and the technology can be enabled and managed appropriately. This is where the value is. What's the human component? Is there like... You want to know who's using the data? >> Well-- >> Why are they using data? It's like, do I share the data? Can you leverage other people's data? This is kind of a melting pot. >> It is. >> What's the human piece of it? >> It truly is about enabling more people access to what it means to infuse AI into their organization. When I said it's not about re-platforming, it's about expanding. We started with the data scientists, and we're adding to that the application developer. The third piece of that is, how do you get the knowledge worker? The subject matter expert? The person who understands the actual machine or equipment that needs to be inspected? How do you get them to start customizing models without having to know anything about the data science element? That's extremely important, because I can auto-tag and auto-classify stuff and use AI to get them started, but there is that human element of not needing to be a data scientist but still having input into that AI, and that's a very beautiful thing. >> You know what's interesting is, in the security industry you've seen groups, birds of a feather flock together, where they share hacks, and it's a super important community aspect of it. Data has that now, and now with AI you get the AI ladder, but this points to AI literacy within the organizations. >> Exactly. >> So you're seeing people saying, hey, we need AI literacy. Not coding per se, but how do we manage data? But it's also understanding who within your peer group is evolving. So you're seeing now a whole formation of a user base out there, users who want to know; the birds of the other feather flocking together. This is now a social gamification opportunity, because they're growing together. >> There're-- >> What's your thought on that? >> There're two things there I would say. First, is we often go to the technology, and as a product person I just spoke to you a lot about the technology. But what we find in talking to our clients is that it really is about helping them with the skills, the culture, the process transformation that needs to happen within the organization to break down the boundaries and the silos that exist to truly get AI into an organization. That's the first thing. The second is, when you think about AI and what it means to actually infuse AI into an enterprise organization, there's an ethics component of this.
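The ethics component Ritika raises has a measurable starting point: bias detection often begins by comparing outcome rates across groups. A toy sketch of one common check, the disparate impact ratio; the data is made up, and real work relies on dedicated toolkits and domain review rather than a ten-line script.

# Toy bias check: the disparate impact ratio compares positive-outcome
# rates between groups. Ratios well below 1.0 (0.8 is a common rule of
# thumb) flag potential bias to investigate. Made-up data for illustration.
import pandas as pd

scored = pd.DataFrame({
    "group":    ["A", "A", "A", "A", "B", "B", "B", "B"],
    "approved": [ 1,   1,   1,   0,   1,   0,   0,   0 ],  # model decisions
})

rates = scored.groupby("group")["approved"].mean()
disparate_impact = rates["B"] / rates["A"]   # unprivileged vs. privileged rate

print(rates)
print(f"disparate impact ratio: {disparate_impact:.2f}")
if disparate_impact < 0.8:
    print("potential bias: group B is approved far less often than group A")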
There's ethics and bias, and those are components which you need to mitigate and detect, and those are real problems, and by the way, IBM, especially with the work that we're doing within Watson, with the work that we're doing in research, we're taking this on front and center, and it's extremely important to what we do. >> You guys used to talk about that as cognitive, but I think you're so right on. I think this is such a progressive topic; love to do a deeper dive on it, but really you nailed it. Data has to have a consensus algorithm built into it. Meaning you need to have, that's why I brought up this social dynamic, because I'm seeing people within organizations address regulatory issues, legal issues, ethical, societal issues all together, and it requires a group. >> That's right. >> Not just algorithms, people to synthesize. >> Exactly. >> And that's either diversity, diverse groups from different places and experiences, whether it's an expert here, user there, all coming together. This is not really talked about much. How are you guys-- >> I think it will be more. >> John: It will, you think so? >> Absolutely it will be more. >> What do you see from customers? You've done a lot of client meetings. Are they talking about this? Or are they still more in the how-do-I-stand-up-AI, literacy phase? >> They are starting to talk about it because, look, imagine if you train your model on bad data. You actually have bias then in your model, and that means that the accuracy of that model is not where you need it to be if you're going to run it in an enterprise organization. So, being able to do things like detect it and proactively mitigate it are at the forefront, and by the way, this is where our teams are really focusing on what we can do to further the AI practice in the enterprise, and it is where we really believe that the ethics part of this is so important for that enterprise or smarter business component. >> Iterating through the quality of the data, that's really good. Okay, so now, I was talking to Rob Thomas about data containers. We were kind of nerding out on Kubernetes and all that good stuff. You can almost imagine Kubernetes and containers making data really easy to move around and manage effectively with software, but I mentioned consensus on understanding the quality of the data and understanding the impact of the data. When you say consensus, the first thing that jumps in my mind is blockchain, cryptocurrency. Is there a tokenization economics model in data somewhere? Because all the best stuff going on in blockchain and cryptocurrency that's technically more impactful is the changing of the economics. Changing of the technical architectures. You almost can say, hmm.
The first self-driving car fatality. You're seeing Facebook really get handed huge negative press on the fact that they mismanaged the data, that it was optimized for advertising, not user experience. You're starting to see a shift, an evolution, where people are starting to recognize the role of the human and their data and other people's data. This is a big topic. >> It's a huge topic, and I think we'll see a lot more of it in the weeks, and months, and years ahead. I think it becomes a really important point as to how we start to really innovate in and around not just the data, but the AI we apply to it, and then the implications of it and what it means if the data's not right, if the algorithms aren't right, if the bias is there. There are big implications for society and for the environment as a whole. >> I really appreciate you taking the time to speak with us. I know you're super busy. My final question is much more to share some color commentary on IBM Think this week, the event, your reaction to it, obviously it's massive, and also the customer conversations you've had. You've told me that you're in client briefings and meetings. What are they talking about? What are they asking for? What are some of the things that are low-hanging-fruit use cases? Where's the starting point? Where are people jumping in? Can you just share any data you have on-- >> Oh, I can share. That's a fully loaded question; that's like 10 questions all in one. But the Think conference has been great in terms of, when you think about the problems that we're trying to solve with AI, it's not AI alone, right? It actually is integrated in with things like data, with the systems, with how we actually integrate that in terms of a hybrid way of what we're doing on premises and what we're doing in private cloud, what we're doing in public cloud. So, actually having a forum where we're talking about all of that together in a unified manner has actually been great feedback that I've heard from many customers, many analysts, and in general, from an IBM perspective, I believe it has been extremely valuable. I think the types of questions that I'm hearing, and the types of inputs and conversations we're having, are ones where clients want to be able to innovate and really do things that are Horizon three type things. What are the things they should be doing in Horizon one, Horizon two, and Horizon three when it comes to AI and when it comes to AI and how they treat their data. This is really important because-- >> What's Horizon one, two and three? >> You think about Horizon one, those are things you should be doing immediately to get immediate value in your business. Horizon two are kind of mid-term, 18 to 24 months. 24-plus months out is Horizon three. So when you think about an AI journey, what does your AI journey really look like in terms of what you should be doing in the immediate term? Small, quick wins. >> Foundational. >> What are the things you can do, kind of projects that will pan out in a year, and what are the two to three year projects that we should be doing? These are the most frequent conversations that I've been having with a lot of our clients: in terms of what is that AI journey we should be thinking about, what are the projects right now, how do we work with you on the projects right now on H1 and H2, and what are the things we can start incubating that are longer term. And these are extremely transformational in nature.
It's kind of like, what do we do to really automate self-driving, not just cars, but what do we do for trains, and what do we do to really revolutionize certain industries and professions. >> How does your product roadmap map to your Horizons? Can you share a little bit about the priorities on the roadmap? I know you don't want to share a lot of data, competitive information. But can you give an anecdotal, or at least a trajectory of, what the priorities are and some guiding principles? >> I hinted at some of it, but I only talked about the Studio during this discussion, and the Studio is just one of a three-pronged approach that we have in Watson. The Studio really is about laying the foundation for how we get AI into our enterprises for the builders, and it's like a place where builders go to be able to create, build, and deploy those models, machine learning and deep learning models, and be able to do so in a de-skilled way. Well, on top of that, as you know, we've done thousands of engagements, and we know the most comprehensive ways that clients are trying to use Watson and AI in their organizations. So taking our learnings from that, we're starting to harden those in applications so that clients can easily infuse that into their businesses. We have capabilities like Watson Assistant, which was announced this week at the conference, that really help clients with pre-existing skills, like how do you have a customer care solution, and then how can you extend it to other industries like automotive, or hospitality, or retail. So we're working not just within Watson but within broader IBM to bring solutions like that. We also have talked about compliance. Every organization has a regulatory, or compliance, or legal department that deals with SOWs, legal documents, technical documents. How do you then start making sure that you're adhering to the types of regulations or legal requirements that you have on those documents? Compare and Comply actually uses a lot of the Watson technologies to be able to do that. And scaling this out in terms of how clients are really using the AI in their business is the other point of where Watson will absolutely focus going forward. >> That's awesome, Ritika. Thank you for coming on theCUBE, sharing the awesome work cutting across IBM and also outside in the industry. The more data, the better the potential. >> Absolutely. >> Well, thanks for sharing the data. We're putting the data out there for you. theCUBE is one big data machine, we're data driven. We love doing these interviews, and of course getting the experts and the product folks on theCUBE is super important to us. I'm John Furrier, more coverage for IBM Think after this short break. (upbeat music)
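One simple, standard instance of the bias detection Ritika describes earlier in this interview is to compare a model's accuracy across subgroups of the data and flag large gaps. A hedged sketch in Python; the group names, threshold, and numbers below are invented for illustration:

    def accuracy_by_group(examples):
        # examples: (group, predicted_label, true_label) triples
        totals, correct = {}, {}
        for group, pred, truth in examples:
            totals[group] = totals.get(group, 0) + 1
            correct[group] = correct.get(group, 0) + (pred == truth)
        return {g: correct[g] / totals[g] for g in totals}

    def flag_bias(examples, max_gap=0.10):
        # Flag the model if accuracy differs too much across groups.
        acc = accuracy_by_group(examples)
        gap = max(acc.values()) - min(acc.values())
        return acc, gap > max_gap

    sample = [("A", 1, 1), ("A", 0, 0), ("A", 1, 1),
              ("B", 1, 0), ("B", 0, 0), ("B", 1, 0)]
    print(flag_bias(sample))  # group B is far less accurate, so it's flagged

A per-group accuracy gap is only one of many fairness measures, but it illustrates how "detect and proactively mitigate" can start as a routine check rather than a research project.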

Published Date : Mar 21 2018


Wikibon | Action Item, Feb 2018


 

>> Hi, I'm Peter Burris, welcome to Action Item. (electronic music) There's an enormous net new array of software technologies available to businesses and enterprises to tend to some new classes of problems, and that means that there's an explosion in the number of problems that people perceive could be addressed, or could be solved, with software approaches. The whole world of how we're going to automate things differently in artificial intelligence and any number of other software technologies are all being brought to bear on problems in ways that we never envisioned or never thought possible. That leads ultimately to a comparable explosion in the number of approaches to how we're going to solve some of these problems. That means new tooling, new models, new any number of other structures, conventions, and artifacts that are going to have to be factored by IT organizations and professionals in the technology industry as they conceive and put forward plans and approaches to solving some of these problems. Now, George, that leads to a question. Are we going to see an ongoing, ever-expanding array of approaches, or are we going to see some new kind of steady state that kind of starts to simplify what happens, or how enterprises conceive of the role of software in solving problems? >> Well, we've had... probably four decades of packaged applications being installed and defining really the systems of record, which first handled the order-to-cash process and then layered around that. Once we had more CRM capabilities, we had the sort of opportunity-to-lead capability added in there. But systems of record fundamentally are backward looking; they're tracking the performance of the business. >> Peter: Recording what has happened? >> Yes, recording what has happened. The opportunity we have now is to combine what the big Internet companies pioneered, with systems of engagement, where you had machine learning anticipating and influencing interactions. You can now combine those sorts of analytics with systems of record to inform and automate decisions in the form of transactions. And the question is now, how are we going to do this? Is there some way to simplify, or not completely standardize, but can we make it so that we have at least some conventions and design patterns for how to do that? >> And David, we've been working on this problem for quite some time, but the notion of convergence has been extant in the hardware and the services, or in the systems business, for quite some time. Take us through what convergence means and how it is going to set up new ways of thinking about software. >> So there's a hardware convergence, and it's useful to define a few terms. There are converged systems; those are systems which have some management software that has been brought into them, and then on top of that they have traditional SANs and networks. There are hyper-converged systems, which started off in the cloud systems and have now come to the enterprise as well. And those bring software networking, software storage, software-- >> Software-defined, so it's a virtualizing of those converged systems. >> David: Absolutely, and in the future it's going to bring automated operational stuff as well, AI on the operational side. And then there's full-stack convergence, where we start to put in the software, the application software, beginning with the database side of things and then the application itself on top of the database.
And finally there's what you are talking about, the systems of intelligence, where we can combine both the systems of record, the systems of engagement, and the real-time analytics as a complete stack. >> Peter: Let's talk about this for a second, because ultimately what I think you're saying is that we've got hardware convergence in the form of converged infrastructure, hyper-converged in the form of virtualization of that, new ways of thinking about how the stack comes together, and new ways of thinking about application components. But what seems to be the common thread, through all of this, is data. >> David: Yes. >> So basically what we're seeing is a convergence or a rethinking of how software elements revolve around the data. Is that kind of the centerpiece of this? >> David: That's the centerpiece of it, and we had very serious constraints about accessing data. Those will improve with flash, but there's still a lot of room for improvement. And the architecture that we are saying is going to come forward, which really helps this a lot, is the UniGrid architecture, where we offload the networking and the storage from the processor. This is already happening in the hyperscale clouds; they're putting a lot of effort into doing this. But we're at the same time allowing any processor to access any data in a much more fluid way, and we can grow that to thousands of processors. Now that type of architecture gives us the ability to converge the traditional systems of record, and there are a lot of them obviously, and the systems of engagement and the real-time analytics for the first time. >> But the focal point of that convergence is not the licensing of the software; the focal point is convergence around the data. >> The data. >> But that has some pretty significant implications when we think about how software has always been sold, how organizations that run software have been structured, the way that funding is set up within businesses. So George, what does it mean to talk about converging software around data from a practical standpoint over the next few years? >> Okay, so let me take that and interpret it as converging the software around data in the context of adding intelligence to our existing application portfolio, and then the new applications that follow on. And basically, when we want to inject enough intelligence to anticipate and inform interactions, or inform or automate transactions, we have a bunch of steps that need to get done, where we're ingesting essentially contextual or ambient information. Often this is information about a user or the business process. And this data, it's got to go through a pipeline where there's both a Design Time and a Run Time. In addition to ingesting it, you have to sort of enrich it and make it ready for analysis. Then the analysis is essentially picking out of all that data and calculating the features that you plug into a machine learning model. And then that produces essentially an inference based on all that data, that says well, this is the probable value. It sounds like it's in the weeds, but the point is it's actually a standardized set of steps. Then the question is, do you put that all together in one product across that whole pipeline? Can one piece of infrastructure software manage that? Or do you have a bunch of pieces, each handing off to the next? And-- >> Peter: But let me stop you, because I want to make sure that we kind of follow this thread.
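A minimal Python sketch of the pipeline George describes (ingest, enrich, feature computation, inference); every function, field, and weight below is invented for illustration, not any vendor's API:

    from dataclasses import dataclass

    @dataclass
    class Event:
        user_id: str
        payload: dict

    def ingest(raw: dict) -> Event:
        # Ingest: capture the contextual or ambient record as it arrives.
        return Event(user_id=raw["user_id"], payload=raw)

    def enrich(event: Event, profile_store: dict) -> dict:
        # Enrich: join the event with stored context about the user.
        return {**event.payload, **profile_store.get(event.user_id, {})}

    def compute_features(record: dict) -> list:
        # Analysis: pick out and calculate the features the model needs.
        return [record.get("visits", 0), record.get("avg_spend", 0.0)]

    def infer(features: list, weights: list, bias: float) -> float:
        # Inference: produce the probable value from the features.
        return sum(w * x for w, x in zip(weights, features)) + bias

    profiles = {"u1": {"visits": 12, "avg_spend": 33.5}}
    event = ingest({"user_id": "u1", "channel": "web"})
    score = infer(compute_features(enrich(event, profiles)), [0.4, 0.01], -1.0)
    print(round(score, 3))

Whether those four stages ship as one product or as separate pieces handing off to each other is exactly the convergence question the panel is debating.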
So we've argued that hardware convergence and the ability to scale the role the data plays, or how data is used, is happening, and that opens up new opportunities to think about data. Now what we've got is we are centering a lot of the software convergence around the use of data, through copies and other types of mechanisms for handling snapshots and whatnot, and things like UniGrid. So let's start with this. It sounds like what you're saying is we need to think of new classes of investments in technologies that are specifically set up to handle the processing of data in a more distributed application way, right? If I've got that right, that's kind of what we mean by pipelines? >> George: Yes. >> Okay, so once we do that, once we establish those conventions, once we establish organizationally, institutionally, how that's going to work, now we take the next step of saying, are we going to default to a single set of products or are we going to do best-of-breed, and what kind of convergence are we going to see there? >> And there's no-- >> First of all, have I got that right? >> Yes, but there's no right answer. And I think there's a bunch of variables that we have to play with that depend on who the customer is. For instance, the very largest and most sophisticated tech companies are more comfortable taking multiple pieces, each of which is very specialized, and putting them together in a pipeline. >> Facebook, Yahoo, Google-- >> George: LinkedIn. >> Got it. >> George: Those guys. And the knobs that they're playing with, that everyone's playing with, are three, basically, on the software side. There's your latency budget, which is how much time you have to produce an answer. So that drives the transaction or the interaction. And that itself is not just a single answer, because the goal isn't to get it as short as possible. The goal is to get as much information into the analysis within the budgeted latency. >> Peter: So it's packing the latency budget with data? >> George: Yes, because the more data that goes into making the inference, the better the inference. >> Got it. >> The example that someone used actually on Fareed Zakaria GPS, one show about it was, if he had 300 attributes describing a person, he could know more about that person than that person did (laughs) in terms of inferring other attributes. So the point is, once you've got your latency budget, the other two knobs that you can play with are development complexity and admin complexity. And the idea is, on development complexity, there's a bunch of abstractions that you have to deal with. If it's all one product, you're going to have one data model, one address and namespace convention, one programming model, one way of persisting data, a whole bunch of things. That's simplicity. And that makes it more accessible to mainstream organizations. Similarly, and let me just add that, there are probably two or three times as many constructs that admins would have to deal with. So again, if you're dealing with one product, it's a huge burden off the admin, and we know they struggled with Hadoop. >> So convergence, or decisions about how to enact convergence, is going to be partly or strongly influenced by those three issues: latency budget, development complexity or simplicity, and administrative, David-- >> I'd like to add one more to that, and that is location of data. Because you want to be able to look at the data that is most relevant to solving that particular problem.
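One way to picture George's "packing the latency budget" knob is as a selection problem: run the feature lookups with the best information value per millisecond until the budget is spent. A small hedged sketch in Python; all costs and values below are invented:

    def pack_latency_budget(features, budget_ms):
        # Greedy: take features in order of value per millisecond of cost.
        chosen, spent = [], 0.0
        for name, cost_ms, value in sorted(
                features, key=lambda f: f[2] / f[1], reverse=True):
            if spent + cost_ms <= budget_ms:
                chosen.append(name)
                spent += cost_ms
        return chosen, spent

    candidate_features = [
        ("precomputed_profile", 2.0, 5.0),    # cheap: already materialized
        ("recent_clickstream", 10.0, 8.0),
        ("external_credit_data", 40.0, 9.0),  # rich but slow to fetch
    ]
    chosen, spent = pack_latency_budget(candidate_features, budget_ms=15.0)
    print(chosen, spent)  # -> ['precomputed_profile', 'recent_clickstream'] 12.0

A greedy ratio rule like this is only a heuristic, but it shows why cheap precomputed attributes tend to get packed in first.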
Now, today a lot of the data is inside the enterprise. There's a lot of data outside that, but still, you will want to, in the best possible way, combine that data one way or another. >> But isn't that a variable on the latency budget? >> David: Well, I would think it's very useful to split the latency budget, which is to do with inference mainly, from development, which is to do with the machine learning. So there is a development cycle with machine learning that is much longer. That is days, could be weeks, could be months. >> It would still be done in batch. >> It is or will be done, wait a second. It will be done in batch, it is done in batch. You need to test it and then deliver it as an inference engine to the applications that you're talking about. Now that's going to be very close together; that inference, and the rest of it, has to be all physically very close together. But the data itself is spread out, and you want to have mechanisms that can combine that data, move applications to that data, bring those together in the best possible way. That is still a batch process. It can run where the data is, in the cloud, locally, wherever it is. >> George: And I think you brought up a great point, which I would tend to include in latency budget because... no matter what kind of answers you're looking for, some of the attributes are going to be precomputed, and those could be-- >> David: Absolutely. >> External data. >> David: Yes. >> And you're not going to calculate everything in real time, there's just-- >> You can't. >> Yes, you can't. >> But is the practical reality that the convergence of, so again, the argument: we've got all these new problems, all kinds of new people claiming that they know how to solve the problems, each of them choosing different classes of tools to solve the problem, an explosion across the board in the approaches, which can lead to enormous downstream integration and complexity costs. You've used the example of Cloudera, for example, some of the distro companies who claim that 50-plus percent of their development budget is dedicated to just integrating these pieces. That's a non-starter for a lot of enterprises. Are we fundamentally saying that the degree of complexity, or the degree of simplicity and convergence that's possible in software, is tied to the degree of convergence in the data? >> You're honing in on something really important, give me-- >> Peter: Thank you! (laughs) >> George: Give an example of the convergence of data that you're talking about. >> Peter: I'll let David do it because I think he's going to jump on it. >> David: Yes, so let me take some examples. If you have a small business, there's no way that you want to invest yourself in any of the normal levels of machine learning and applications like that. You want to outsource that. So big software companies are going to do that for you, and they're going to do it especially for the specific business processes which are unique to them, which give them digital differentiation of some sort or another. So for all of those types of things, software will come in from vendors, from SAP or son of SAP, which will help you solve those problems. And having data brokers which are collecting the data, putting them together, helping you with that. That seems to me the way things are going. In the same way, there are a lot of inference engines which will be out at the IoT level. Those will have very rapid analytics given to them.
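David's split, slow model development in batch wherever the data lives versus fast inference physically close to the application, maps onto a very small sketch. This assumes scikit-learn is available and uses made-up training data:

    import pickle
    from sklearn.linear_model import LogisticRegression

    # --- Batch side (hours or days, runs where the training data is) ---
    X_train = [[0, 1], [1, 0], [1, 1], [0, 0]]
    y_train = [0, 1, 1, 0]
    model = LogisticRegression().fit(X_train, y_train)
    artifact = pickle.dumps(model)  # ship this to the serving tier

    # --- Online side (milliseconds, physically close to the app) ---
    inference_engine = pickle.loads(artifact)
    print(inference_engine.predict_proba([[1, 0]])[0][1])

The serialized artifact is the hand-off point: the batch side can take days and run anywhere, while the serving side only ever loads and scores.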
Again, not by yourself, but by companies that specialize in facial recognition or specialize in making warehouse-- >> Wait a minute, are you saying that my customers aren't special, that require special facial recognition? (laughs) So I agree with David, but I want to come back to this notion because-- >> David: The point I was getting at is, there's going to be lots and lots of room for software to be developed, to help in specific cases. >> Peter: And large markets to sell that software into. >> Very large markets. >> Whether it's software, but increasingly also with services. But I want to come back to this notion of convergence, because we talked about hardware convergence and we're starting to talk about the practical limits on software convergence. But somewhere in between, I would argue, and I think you guys would agree, that really the catalyst for, or the thing that's going to determine the rate of change and the degree of convergence, is going to be how we deal with data. Now you've done a lot of research on this, so I'm going to put something out there and you tell me if I'm wrong. But at the end of the day, when we start thinking about UniGrid, when we start thinking about some of these new technologies, and the ability to have single copies or single sources of data, multiple copies, in many respects what we're talking about is the virtualization of data without loss. >> David: Yes. >> Not loss of the characteristics, the fidelity of the data, or the state of the data. Have I got that right? >> Knowing the state of the data. >> Peter: Or knowing the state of the data. >> If you take a snapshot, that's a point in time; you know what that point of time is, and you can do a lot of analytics on it, for example, and you want to do them on a certain time of day or whatever-- >> Peter: So is it wrong to say that we're seeing, we've moved through the virtualization of hardware and we're now in a hyperscale or hyper-converged world, which is very powerful stuff. We're seeing this explosion in the amount of software, the way we approach problems and whatnot. But a forcing function, something that's going to both constrain how converged that can be, but also force or catalyze some convergence, is the idea that we're moving into an era where we can start to think about virtualized data through some of these distributed file systems-- >> David: That's right, and the metadata that goes with it. The most important thing about the data, and it's increasing much more rapidly than the data itself, is the metadata around it. But I just want to make one point on this: all data isn't useful. There's a huge amount of data that we capture that we're just going to have to throw away. The idea that we can look at every piece of data for every decision is patently false. There's a lovely example of this in... fluid mechanics. >> Peter: Fluid dynamics. >> David: Fluid dynamics. If you're trying to have simulation at a very, very low level, the amount of-- >> Peter: High fidelity. >> High fidelity, you run out of capacity very, very quickly indeed. So you have to make trade-offs about everything, and all of that data that you're using in that simulation, you're not going to keep. All the data from IoT, you can't keep that. >> Peter: And that's not just a statement about the performance or the power or the capabilities of the hardware; there are some physical realities-- >> David: Absolutely, yes. >> That are going to limit what you can do with the simulation.
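The point-in-time snapshot idea David raises above is easy to make concrete: a snapshot can be just a pinned version number over shared state, so analytics read "the data as of time T" without a physical copy. A toy in-memory sketch, not how any particular product implements it:

    class VersionedStore:
        def __init__(self):
            self.versions = [{}]            # list of immutable states

        def write(self, key, value):
            head = dict(self.versions[-1])  # copy-on-write of the map
            head[key] = value
            self.versions.append(head)

        def snapshot(self):
            return len(self.versions) - 1   # just a version id, no copy

        def read(self, key, snap_id):
            return self.versions[snap_id].get(key)

    store = VersionedStore()
    store.write("balance", 100)
    eod = store.snapshot()             # end-of-day point in time
    store.write("balance", 250)        # the next day's activity
    print(store.read("balance", eod))  # -> 100, and that state is known

Real systems pin block or file versions rather than copying whole maps, but the contract is the same: the snapshot's state is known and immutable.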
But, and we've talked about this in other Action Items, there is this notion of options on data value, where the value of today's data is maybe-- >> David: Is much higher. >> Peter: Well, it's higher from a time standpoint for the problems that we understand and are trying to solve now, but there may be future problems where we still want to ensure that we have some degree of data so we can be better at attending to those future problems. But I want to come back to this point, because in all honesty, I haven't heard anybody else talking about this, and maybe it's because I'm not listening. But this notion, again from your research, of virtualized data inside these new architectures being a catalyst for a simplification of a lot of the sharing subsystem. >> David: It's essentially sharing of data. So instead of having the traditional way of doing it within a data center, which is I have my systems of record, I make a copy, it gets delivered to the data warehouse, for example. That's the way it's being done. That is too slow; moving data is incredibly slow. So another way of doing it is to share that data, make a virtual copy of it, and technology is allowing you to do that because the access density has gone up by thousands of times-- >> Peter: Because? >> Because. (laughs) Because of flash, because of new technologies at that level. >> Peter: High-performance interfaces, high-performance networks. >> David: All of that stuff is now allowing things which just couldn't even be conceived. However, there is still a constraint there. It may be a thousand times bigger, but there is still an absolute constraint to the amount of data that you can actually process. >> And that constraint is provided by latency. >> Latency. >> Peter: Speed of light. >> Speed of light and speed of the processors themselves. >> George: Let me add something that may help explain the sort of virtualization of data and how it ties into the convergence or non-convergence of the software around it. Which is, when we're building these analytic pipelines, essentially we've disassembled what used to be a DBMS. And so out of that we've got a storage engine, we've got query optimizers, we've got data manipulation languages which have grown into full-blown analytic languages, and a data definition language. Now, the system catalog used to be just a way to virtualize all the tables in the database and tell you where all the stuff was, and the indexes and things like that. Now, what we're seeing is, since data is now spread out over so many places and products, we're seeing the emergence of a new kind of catalog. Whether that's from Elation or Dremio, or on AWS it's the Glue catalog, and I think there's something equivalent coming on Azure. But the point is, those are beginning to get useful enough to be the entry point for analytic products, and maybe eventually even for transactional products to update, or at least to analyze the data in these pipelines that we're putting together out of these components of what was a disassembled database. Now, we could be-- >> I would make a distinction there between the development of analytics and, again, the real-time use of those analytics within systems of intelligence. >> George: Yeah, but when you're using them-- >> David: There are different problems they have to solve. >> George: But there's a Design Time and a Run Time; there are actually four pipelines for the sort of analytic pipeline itself.
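As one concrete instance of the shared catalog George mentions becoming the entry point, here is a hedged sketch against the AWS Glue Data Catalog via boto3; it assumes AWS credentials are configured, and the database name is invented:

    import boto3

    glue = boto3.client("glue", region_name="us-east-1")

    # Ask the catalog what tables exist in the analytics database,
    # where each one lives, and what columns it carries.
    resp = glue.get_tables(DatabaseName="analytics_lake")
    for table in resp["TableList"]:
        cols = [c["Name"] for c in table["StorageDescriptor"]["Columns"]]
        print(table["Name"], table["StorageDescriptor"]["Location"], cols)

Tools that start from a catalog like this discover where the data lives instead of hard-coding paths, which is what lets the disassembled pieces of the former DBMS interoperate.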
There's Design Time and Run Time, and then for the inference engine and the modeling that goes behind it, there's also a Design Time and Run Time. But I guess, I'm not disagreeing that you could have one converged product to manage the Run Time analytic pipeline. I'm just saying that the pieces that you assemble could come from one vendor. >> Yeah, but I think David's point, I think it's accurate, and this has been true since the beginning of time. (laughs) Certainly it predates UNIVAC. That at the end of the day, read/write ratios and the characteristics of the data are going to have an enormous impact on the choices that you make. And high write-to-read ratios almost dictate the degree of convergence, and we used to call that SMP, or you know, scale-up database managers. And for those types of applications, with those types of workloads, it's not necessarily obvious that that's going to change. Now we can still find ways to relax that, but you're talking about, George, the new characteristics-- >> Injecting the analytics. >> Injecting the analytics, where we're doing more reading as opposed to writing. We may still be writing into an application that has these characteristics-- >> That's a small amount of data. >> But a significant portion of the new function is associated with these new pipelines. >> Right. And it's actually... what data you create is generally derived data, so you're not stepping on something that's already there. >> All right, so let me get some action items here. David, I want to start with you. What's the action item? >> David: So for me, about convergence, there are two levels of convergence. First of all, converge as much as possible and give the work to the vendor, would be my action item. The more that you can go full stack, the more that you can get the software services from a single point, single throat to choke, single hand to shake, the more you can outsource your problems to them. >> Peter: And that has a speed implication, time to value. >> Time to value, and you don't have to do undifferentiated work. So that's the first level of convergence, and then the second level of convergence is to look hard at how you can bring additional value to your existing systems of record by putting in automation or real-time analytics. Which leads to automation; that is the second one, for me, where the money is: automation, reduction in the number of things that people have to do. >> Peter: George, action item. >> So my action item is that you have to evaluate, you the customer have to evaluate sort of your skills as much as your existing application portfolio. And if more of your greenfield apps can start in the cloud, and you're not religious about open source but you're more religious about the admin burden and development burden and your latency budget, then start focusing on the services that the cloud vendors originally created that were standalone, but that they are increasingly integrating, because the customers are leading them there. And then for those customers who, you know, have decades and decades of infrastructure and applications on-prem and need a pathway to the cloud, some of the vendors formerly known as Hadoop vendors, but for that matter any on-prem software vendor, are providing customers a way to run workloads in a hybrid environment or to migrate data across platforms. >> All right, so let me give this a final action item here. Thank you, David Floyer, George Gilbert. Neil Raden and Jim Kobielus and the rest of the Wikibon team are with customers today.
We talked today about convergence at the software level. What we've observed over the course of the last few years is an expanding array of software technologies, specifically AI, big data, machine learning, etc., that are allowing enterprises to think differently about the types of problems that they can solve with technology. That's leading to an explosion in the number of problems that folks are looking at, the number of individuals participating in making those decisions and thinking those issues through, and, very importantly, an explosion in the number of vendors with piecemeal solutions about what they regard as their best approach to doing things. However, that is going to carry a significant burden that could have enormous implications for years, and so the question is, will we see a degree of convergence in the approach to doing software, in the form of pipelines and applications and whatnot, driven by a combination of what the hardware is capable of doing, what the skills make possible, and, very importantly, the natural attributes of the data? And we think that there will be. There will always be tension in the model if you try to invent new software, but one of the factors that's going to bring it all back to a degree of simplicity will be a combination of what the hardware can do, what people can do, and what the data can do. And so we believe, pretty strongly, that ultimately the issues surrounding data, whether it be latency or location, as well as the development complexity and administrative complexity, are going to be the range of factors that dictate ultimately how some of these solutions start to converge and simplify within enterprises. As we look forward, our expectation is that we're going to see an enormous net new investment over the next few years in pipelines, because pipelines are a first-level set of investments in how we're going to handle data within the enterprise. And they'll look, in certain respects, like how a DBMS used to look, just in a disaggregated way, conceptually and administratively, and then from a product selection and service selection standpoint, the expectation is that they themselves will have to come together so the developers can have a consistent view of the data that's going to run inside the enterprise. Want to thank David Floyer, want to thank George Gilbert. Once again, this has been Wikibon Action Item, and we look forward to seeing you on our next Action Item. (electronic music)

Published Date : Feb 16 2018


Santhosh Mahendiran, Standard Chartered Bank | BigData NYC 2017


 

>> Announcer: Live, from Midtown Manhattan, it's theCUBE, covering Big Data New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. (upbeat techno music) >> Okay, welcome back, we're live here in New York City. It's theCUBE's presentation of Big Data NYC, our fifth year doing this event in conjunction with Strata Data, formerly Strata Hadoop, formerly Strata Conference, formerly Hadoop World; we've been there from the beginning. Eight years covering Hadoop's ecosystem, now Big Data. This is theCUBE, I'm John Furrier. Our next guest is Santhosh Mahendiran, who is the global head of technology analytics at Standard Chartered Bank. A practitioner in the field, here getting the data, checking out the scene, giving a presentation on your journey with data at a bank, which is big financial, obviously an adopter. Welcome to theCUBE. >> Thank you very much. >> So we always want to know what the practitioners are doing, because at the end of the day there are a lot of vendors selling stuff here, so everyone's got their story. At the end of the day you've got to implement. >> That's right. >> And one of the themes is data democratization, which sounds warm and fuzzy: collaborating with data, this is all good stuff, you feel good and you move into the future, but at the end of the day it's got to have business value. >> That's right. >> And as you look at that, how do you look at the business value? 'Cause you want to be on the bleeding edge, you want to provide value and get that edge operationally. >> That's right. >> Where's the value in data democratization? How did you guys roll this out? Share your story. >> Okay, so let me start with the journey first before I come to the value part of it, right? So, data democratization is an outcome, but the journey is something we started three years back. So what did we do, right? So we had some guiding principles to start our journey. The first was to say that we believed in the three S's, which is speed and scale, and it should be really, really flexible and super fast. So one of the challenges that we had was that our historical data warehouses were entirely becoming redundant. And why was that? Because they were RDBMS-centric and extremely disparate, so we weren't able to scale up to meet the demands of managing huge chunks of data. So the first step that we took was to re-pivot and say, okay, let's embrace Hadoop. And what I mean by embracing is not just putting data in the lake; we said that all our data will land into the data lake. And this journey started in 2015, so we have close to 80% of the bank's data in the lake. It is end-of-day data right now, this data flows in on a daily basis, and we have consumers who feed off that data. Now coming to your question about-- >> So the data lake's working? >> The data lake is working, up and running. >> People like it, you just got a good spot, batch them all, you throw everything in the lake. >> So it is not real time, it is end of day. There is some data that is real-time, but the data lake is not entirely real-time, that I have to tell you. But one part is that the data lake is working. Second part, to your question of how I actually monetize it, are you getting some value out of it? That's where tools like Paxata have actually enabled us to accelerate this journey. So we call it data democratization. So the best part is, it's not about having the data. We want the business users to actually use the data.
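A hedged sketch of the end-of-day landing pattern Santhosh describes, with each day's extract dropped into the lake under a date partition so downstream consumers can feed off a known, complete slice; the paths, source name, and schema below are invented:

    import csv
    import datetime
    import pathlib

    def land_end_of_day(records, business_date, lake_root="datalake"):
        # One folder per source per business date, HDFS-style layout.
        part = pathlib.Path(lake_root) / "erp_payables" / f"dt={business_date}"
        part.mkdir(parents=True, exist_ok=True)
        with open(part / "data.csv", "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=["invoice_id", "amount"])
            writer.writeheader()
            writer.writerows(records)
        return part

    rows = [{"invoice_id": "INV-1", "amount": 1200.50}]
    print(land_end_of_day(rows, datetime.date(2017, 9, 28)))

Because each partition is closed once the day's feed lands, consumers can tell a complete end-of-day slice from one still in flight.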
Typically, data has always been either delayed or denied, in most cases, to end-users, and we have end-users waiting for the data but not getting access to it. That was primarily because the size of the data was too huge and it wasn't flexible enough to be shared. So how did tools like Paxata and the data lake help us? What we did with data democratization is basically to say, hey, we'll get end-users to access the data first, in a fast manner, in a self-service manner, and in something that gives operational assurance to the data, so you don't hold the data and then say that they're going to get a subset of data to play with. We'll give you the entire set of data and we'll give you the right tools which you can play with. Most importantly, from an IT perspective, we'll be able to govern it. So that's the key about democratization. It's not about just giving them a tool, giving them all the data and then saying "go figure it out." It's about ensuring that, okay, you've got the tools, you've got the data, but we'll also govern it, so that you obviously have control over what they're doing. >> So now you govern it, they don't have to get involved in the governance, they just have access? >> No, they don't need to. Yeah, they have access. So governance works both ways. We establish the boundaries. Look at it as a referee, and then say, okay, there are guidelines that you don't cross, and within the datasets that key people have access to, you can further set rules. Now, coming back to specific use cases, I can talk about two specific cases which actually helped us move the needle. The first is on stress testing. Being a financial institution, we typically have to report various numbers to our regulators, etc. The turnaround time was extremely huge. These kinds of stress tests typically involve taking huge amounts-- >> What were some of the turnaround times? >> Normally it was two to three weeks, in some cases a month-- >> Wow. >> So we were able to narrow it down to days. What we essentially did was, as with any stress testing or reporting, it involved taking huge amounts of data, crunching it, running some models and then showing the output, with basically a number of transformations involved. Earlier, you couldn't access the entire dataset in the first place, so that we solved-- >> So check, that was a good step one-- >> That was step one. >> But was there automation involved in that, the Paxata piece? >> Yeah, I wouldn't say it was fully automated end-to-end, but there was definitely automation, given the fact that now you've got Paxata to work off the data rather than someone extracting the data and then going off and figuring out what needs to be done. The ability to work off the entire dataset was a big plus. So stress testing, bringing down the cycle time. The second use case I can talk about is anti-money laundering, in our financial crime compliance space. We had processes that took time to report, given the clunkiness in the various handoffs that we needed to do. But again, empowering the users, giving the tool to them and then saying "hey, this"-- >> How about know your user? Because with anti-money laundering, you have to know your user base, that's all set there too? >> Yeah. So the good part is, know the user, know your customer, KYC, all that part is set, but the key part is making sure the end-users are able to access the data much earlier in the life cycle and are able to play with it.
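The "give them all the data, but govern it" pattern Santhosh outlines can be pictured as a thin policy layer that every self-service read passes through. A minimal sketch; the roles, dataset names, and rules are all invented:

    ACCESS_RULES = {
        # role -> datasets it may read, plus an optional row filter
        "aml_analyst": {
            "transactions": lambda row: row["flagged"],  # only flagged rows
        },
        "finance_user": {
            "payables": None,                            # full dataset
        },
    }

    def governed_read(role, dataset, rows):
        # IT sets the boundaries once; every read is filtered through them.
        allowed = ACCESS_RULES.get(role, {})
        if dataset not in allowed:
            raise PermissionError(f"{role} may not read {dataset}")
        row_filter = allowed[dataset]
        return [r for r in rows if row_filter is None or row_filter(r)]

    txns = [{"id": 1, "flagged": True}, {"id": 2, "flagged": False}]
    print(governed_read("aml_analyst", "transactions", txns))  # one row back

The point of the design is that users still work against the whole governed dataset through the tool; the referee lives in the access layer, not in manual data hand-offs.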
In the case of anti-money laundering, again, a process of three weeks to four weeks was shortened down to a question of days by giving users tools like Paxata, again in a structured manner and one which we're able to govern. >> You control this, so you knew what you were doing, but you let their tools do the job? >> Correct. So look at it this way. Typically, the data journey has always been IT-led. It has never been business-led. If you look at what happens, you source the data, which is IT-led; then you model the data, which is IT-led; then you prepare and massage the data, which is again IT-led; and then you have tools on top of it, which is again IT-led, so the end-users get it only after the fourth stage. Now look at the generations within. In all these life cycles, apart from the fact that you source the data, which is typically an IT issue, the rest needs to be done by the actual business users, and that's what we did. That's the progression of the generations, in which we're now in the third generation, as I call it, where our role is just to source the data and then say, yeah, we'll govern it in that manner, and the preparation-- >> It's really an operating system, and we were talking with Aaron, Elation's co-founder; we used the analogy of a car, how this show was like a car show, an engine show, what's in the engine and the technology, and then it evolved every year. Now it's like we're talking about the cars, now we're talking about driver experience-- >> That's right. >> At the end of the day, you just want to drive. You don't really care what's under the hood, you do but you don't, but there are those people who do care what's under the hood, so you can have the best of both worlds. You've got the engines, you set up the infrastructure, but ultimately, you on the business side, you just want to drive. That's what you're getting at? >> That's right. The time-to-market and speed to empower the users to play around with the data, rather than IT trying to churn the data and confine access to data; that's a thing of the past. So we want more users to have faster access to data, but at the same time govern it in a seamless manner. The word governance is still important, because it's not about just giving the data. >> And seamless is key. >> Seamless is key. >> 'Cause if you have democratization of data, you're implying that it is community-oriented, meaning that it's available, with access privileges all transparent or abstracted away from the users. >> Absolutely. >> So here's the question I want to ask you. There's been talk, I've been saying it for years, going back to 2012, that an abstraction layer, a data layer, will evolve, and that'll be the real key. And then here at this show, I heard things like "intelligent information fabric" that is business, consumer-friendly. Okay, it's a mouthful, but intelligent information fabric in essence talks about an abstraction layer-- >> That's right. >> That doesn't really compromise anything but gives some enablement, creates some enabling value-- >> That's right. >> For software. How do you see that? >> As the word suggests, the earlier model was trying to build something for the end-users, but not something which was end-user friendly. Meaning to say, let me just give you a simple example. You had a data model that existed. Historically, the way that we have approached using data is to say "hey, I've got a model, and then let's fit that data into this model," without actually asking, does this model actually serve the purpose?
You abstracted the model to a higher level. The whole point about intelligent data is about saying that, I'll give you a very simple analogy. Take zipcode. A zipcode in the US is very different from a zipcode in India, which is very different from a zipcode in Singapore. So if I had the ability for my data to come in, to say that "I know it's a zipcode, but this zipcode belongs to the US, this zipcode belongs to Singapore, and this zipcode belongs to India," and more importantly, if I can further rev it up a notch, if I say that "this belongs to India, and this zipcode is valid." Look at where I'm going with intelligence. So that's what's up. If you look at the earlier model, you have to say that "yeah, this is a placeholder for zipcode." Now that makes sense, but what are you doing with it? >> Being a relational database model, it's just a field in a schema; you're taking it and abstracting it and creating value out of it. >> Precisely. So what I'm actually doing is accelerating the adoption, I'm making it simpler for users to understand what the data is. So as a user, I don't need to figure out "I've got a zipcode, now is it a Singapore zipcode, an India zipcode, or what." >> So all this automation, Paxata's got a good system, we'll come back to the Paxata question in a second, I do want to drill down on that. But the big thing that I've been seeing at the show, and again Dave Vellante, my partner, co-CEO of SiliconANGLE, we talk about this all the time. He's less bullish on Hadoop than I am. Although I love Hadoop, I think it's great, but it's not the end-all, be-all. It's a great use case. We were critical early on, and the thing we were critical on was that too much time was being spent on the engine and how things are built, not on the business value. So there was like a lull period in the business where it was just too costly-- >> That's right. >> Total cost of ownership was a huge, huge problem. >> That's right. >> So now today, how did you deal with that, and are you measuring the TCO, or total cost of ownership? Because at the end of the day it's time to value, which is, can you be up and running in 90 days with value, can you continue to do that, and then what's the overall cost to get there. Thoughts? >> So look, I think TCO always underpins any technology investment. If someone said "I'm doing a technology investment without thinking about TCO," I don't think he's a good technology leader, so TCO is obviously a driving factor. But TCO has multiple components. One is the TCO of the solution. The other aspect is the TCO of the value I'm going to get out of this system. So talking from an implementation perspective, what I look at as TCO is my whole ecosystem, which is my hardware and software; so you spoke about Hadoop, you spoke about RDBMS, is Hadoop cheaper, etc.? I don't want to get into that debate of cheaper or not, but what I know is the ecosystem is becoming much, much cheaper than before. And when I talk about ecosystem, I'm talking about RDBMS tools, I'm talking about Hadoop, I'm talking about BI tools, I'm talking about governance, I'm talking about this whole framework becoming cheaper. And it is also underpinned by the fact that hardware is becoming cheaper. So the reality is all components in the whole ecosystem are becoming cheaper, and given the fact that software is also becoming more open-sourced and people are open to using open-source software, I think the whole question of TCO becomes a much more pertinent question. Now coming to your point, do you measure it regularly?
I think the honest answer is I don't think we are doing a good job of measuring it that well, but we do have that as one of the criteria for us to actually measure the success of our projects. The way that we do it is, at the time of writing out our PEDs, we call them PEDs, which is the Project Execution Document, we talk about cost. We say, "what's the implementation cost? What are the business cases that are going to be an outcome of this?" I'll give you an example with our anti-money laundering. I told you we reduced our cycle time from a few weeks to a few days, and that in turn means the number of people involved in this whole process goes down; you're reducing the overheads and the operational folks involved in it. That itself tells you how much we're able to save. So definitely, TCO is there, and to say that-- >> And you are mindful of it, it's what you look at, it's key. TCO is on your radar, 100%, you evaluate that in your deals? >> Yes, we do. >> So Paxata, what's so great about Paxata? Obviously you've had success with them. You're a customer, what's the deal? Was it the tech, was it the automation, the team? What was the key thing that got you engaged with them, or specifically, why Paxata? >> Look, I think with a partnership there cannot be one ingredient that makes it successful; I think there are multiple ingredients that make a partnership successful. We were one of the earliest adopters of Paxata. Given that we're a bank, and we have multiple different systems and a lot of manual processing involved, we saw Paxata as a good fit to govern these processes and ensure at the same time users don't lose their experience. The good thing about Paxata that we liked was obviously the simplicity and the look and feel of the tool. That's number one. Simplicity was a big point. The second one is about scale: the fact that it can take in millions of rows, it's not about just working off a sample of data. It can work on the entire dataset. That's very key for us. The third is that it leverages our ecosystem, so it's not about saying "okay, you give me this data, let me go figure out what to do, and then..." Paxata works off the data lake. The fact that it can leverage the lake that we built, the fact that it's a simple, self-service preparation tool which doesn't require a lot of time to bootstrap, so end-users, people like you-- >> So it makes it usable. >> It's extremely user-friendly and usable in a very short period of time. >> And that helped with the journey? >> That really helped with the journey. >> Santhosh, thanks so much for sharing. Santhosh Mahendiran, the global head of technology analytics at Standard Chartered Bank. Again, financial services, always a great early adopter, and you've got success under your belt, congratulations. Data democratization is huge, and again, it's an ecosystem; you've got all that anti-money laundering to figure out, you've got to get those reports out, a lot of heavy lifting? >> That's right. >> So thanks so much for sharing your story. >> Thank you very much. >> We'll give you more coverage after this short break. I'm John Furrier, stay tuned. More live coverage in New York City, it's theCUBE.
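Returning to Santhosh's zipcode example from earlier in this interview: the "intelligent data" idea, a field that knows which country's format it follows and whether its value is valid, fits in a few lines. The patterns below are deliberately simplified for illustration:

    import re

    ZIP_PATTERNS = {
        "US": re.compile(r"^\d{5}(-\d{4})?$"),  # 94105 or 94105-1804
        "IN": re.compile(r"^[1-9]\d{5}$"),      # 6 digits, no leading zero
        "SG": re.compile(r"^\d{6}$"),           # 6 digits
    }

    def classify_zipcode(value, country=None):
        # If the country is known, validate against it; otherwise infer
        # every country whose format the value satisfies.
        if country:
            pattern = ZIP_PATTERNS[country]
            return {"country": country, "valid": bool(pattern.match(value))}
        matches = [c for c, p in ZIP_PATTERNS.items() if p.match(value)]
        return {"possible_countries": matches, "valid": bool(matches)}

    print(classify_zipcode("94105", country="US"))  # valid US zipcode
    print(classify_zipcode("560001"))               # could be IN or SG

The difference from the old "placeholder for zipcode" schema field is exactly the one Santhosh draws: the data arrives already tagged with country and validity, so the user never has to work that out.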

Published Date : Sep 29 2017

SUMMARY :

Brought to you by SiliconANGLE Media. John Furrier talks with Santosh Mahendiran, Global Tech Lead for Analytics at Standard Chartered Bank, about the bank's data democratization journey: opening governed access to entire datasets rather than samples, cutting anti-money-laundering report cycle times from weeks to days, weighing TCO across the whole ecosystem, and choosing Paxata for simple, scalable self-service data preparation on the bank's data lake.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave Vellante | PERSON | 0.99+
Standard Chartered Bank | ORGANIZATION | 0.99+
three weeks | QUANTITY | 0.99+
John Furrier | PERSON | 0.99+
New York City | LOCATION | 0.99+
2012 | DATE | 0.99+
2015 | DATE | 0.99+
Santosh Mahendiran | PERSON | 0.99+
two | QUANTITY | 0.99+
Aaron | PERSON | 0.99+
US | LOCATION | 0.99+
Singapore | LOCATION | 0.99+
Santosh | PERSON | 0.99+
four weeks | QUANTITY | 0.99+
TCO | ORGANIZATION | 0.99+
100% | QUANTITY | 0.99+
90 days | QUANTITY | 0.99+
India | LOCATION | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
fifth year | QUANTITY | 0.99+
today | DATE | 0.99+
Midtown Manhattan | LOCATION | 0.99+
Paxata | ORGANIZATION | 0.99+
one ingredient | QUANTITY | 0.99+
third | QUANTITY | 0.99+
theCUBE | ORGANIZATION | 0.99+
one part | QUANTITY | 0.99+
millions | QUANTITY | 0.99+
first | QUANTITY | 0.99+
Eight years | QUANTITY | 0.99+
Silicon Angle | ORGANIZATION | 0.99+
Second part | QUANTITY | 0.98+
third generation | QUANTITY | 0.98+
fourth stage | QUANTITY | 0.98+
two specific cases | QUANTITY | 0.98+
both ways | QUANTITY | 0.98+
one | QUANTITY | 0.98+
BigData | ORGANIZATION | 0.98+
NYC | LOCATION | 0.98+
both worlds | QUANTITY | 0.98+
first step | QUANTITY | 0.97+
three years back | DATE | 0.97+
second one | QUANTITY | 0.97+
One | QUANTITY | 0.97+
2017 | DATE | 0.96+
Hadoop | TITLE | 0.96+
Strata Data | ORGANIZATION | 0.96+
Strata Hadoop | ORGANIZATION | 0.94+
step one | QUANTITY | 0.94+
first question | QUANTITY | 0.93+
a month | QUANTITY | 0.92+
Elation | ORGANIZATION | 0.9+
Data | EVENT | 0.89+
2017 | EVENT | 0.89+
80% | QUANTITY | 0.88+
Paxata | TITLE | 0.88+
Big Data | EVENT | 0.84+
theCube | ORGANIZATION | 0.83+

Itamar Ankorion, Attunity & Arvind Rajagopalan, Verizon - #DataWorks - #theCUBE


 

>> Narrator: Live from San Jose in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2017, brought to you by Hortonworks. >> Hey, welcome back to theCUBE, live from the DataWorks Summit, day 2. We've been here for a day and a half talking with fantastic leaders and innovators, learning a lot about what's happening in the world of big data, the convergence with Internet of Things, machine learning, artificial intelligence; I could go on and on. I'm Lisa Martin, my co-host is George Gilbert, and we are joined by a couple of guys, one a CUBE alumnus: Itamar Ankorion, CMO of Attunity, welcome back to theCUBE. >> Thank you very much, good to be here, thank you Lisa and George. >> Lisa: Great to have you. >> And Arvind Rajagopalan, the Director of Technology Services for Verizon, welcome to theCUBE. >> Thank you. >> So we were chatting before we went on, and Verizon, you're actually going to be presenting tomorrow at the DataWorks Summit. Tell us about building... the journey that Verizon has been on, building a Data Lake. >> Verizon, over the last 20 years, has been a large corporation made up of a lot of different acquisitions and mergers; that's how it was formed, 20 years back. As we've gone through the journey of those mergers and acquisitions over the years, we had data from different companies come together and form a lot of different data silos. So the reason we started looking at this is that our CFO started asking questions around being able to answer One Verizon questions, something as simple as Days Payable or Working Capital Analysis across all the lines of business. And since we have a three-major-ERP footprint, it is extremely hard to get that data out, and there were a lot of manual data prep activities going into bringing together those One Verizon views. So that's really what was the catalyst to get the journey started for us. >> And it was driven by your CFO, you said? >> Arvind: That's right. >> Ah, very interesting, okay. So what are some of the things that people are going to hear tomorrow from your breakout session? >> Arvind: I'm sorry, say that again? >> Sorry, what are some of the things that the attendees of your breakout session are going to learn about the steps and the journey? >> So I'm going to primarily be talking about the challenges that we ran into, and share some thoughts around that, and also talk about some of the factors, such as the catalysts and what drew us toward moving in that direction, as well as getting into some architectural components from a high-level standpoint: certain partners that we work with, the choices we made from an architecture perspective, and the tools. And then I'll close the loop on user adoption and what users are seeing in terms of business value, as we start centralizing all of the data at Verizon from a back-office Finance and Supply Chain standpoint. So that's what I'm looking at talking about tomorrow. >> Arvind, it's interesting to hear you talk about collecting data from essentially back-office operational systems in a Data Lake. I assume that the data is more refined and easily structured than in the typical stories we hear about Data Lakes. Were there challenges in making it available for exploration and visualization, or were all the early use cases really just production reporting?
>> So standard reporting across the ERP systems is very mature, and those capabilities are there. But then you look across ERP systems, and we have three major ERP systems, one for each of the lines of business, and when you want to combine all of the data, it's very hard. To add to that, you pointed at self-service discovery and visualization across all three datasets; that's even more challenging, because it takes a lot of heavy lifting to normalize all of the data and bring it into one centralized platform. We started off the journey with Oracle, and then we had SAP HANA; we were trying to bring all the data together, but then, looking at our non-SAP ERP systems and bringing that data into an SAP kind of footprint: one, the cost was tremendously high, and there was also a lot of heavy lifting and challenge in manually normalizing the data into the same kind of data models. And even after all of that was done, it was not very self-service oriented for our users in Finance and Supply Chain. >> Let me drill into two of those things. So it sounds like the ETL process of converting the data into a consumable format was very complex, and then it sounds like also the discoverability, where a tool, perhaps like Elation, might help, which is still young. Is that what was missing, or why was the ETL process so much more heavyweight than with a traditional data warehouse? >> The ETL processes involve a lot of heavy lifting because of the proprietary data structures of the ERP systems, especially SAP. The data structures and how the data is used across clustered and pool tables are very proprietary. And on top of that, bringing in the data formats and structures from a PeopleSoft ERP system, which is supporting different lines of business, means there's a lot of customization in place; there are specific things we use in the ERPs in terms of the modules and how the processes are modeled in each of the lines of business, and that complicates things a lot. Then you try and bring all three of these different ERPs, with the nuances they've accumulated over the years, together, and it actually makes it very complex. >> So tell us then, help us understand: how did the Data Lake make that easier? Was it because you didn't have to do all the refinement before the data got there? And tell us how Attunity helped make that possible. >> Oh, absolutely. I think that's one of the big reasons we picked Hortonworks as one of our key partners in building out the Data Lake: it's schema on read, so you aren't necessarily worried about doing a whole lot of ETL before you bring the data in, and it also works with tools and technologies from a lot of other partners. There's a lot of maturity now in providing self-service discovery capabilities for ad hoc analysis and reporting. So this is helpful to the users, because now they don't have to wait for prolonged IT development cycles to model the data, do the ETL, and build reports for them to consume, which sometimes could take weeks and months.
Now, in a matter of days, they're able to see the data they're looking for and start the analysis, and once the analysis has started and the data is accessible, it's a matter of minutes and seconds to look at the different tools and decide how they want to model it. So it's actually been a huge value from the perspective of the users and what they're looking to do. >> Speaking of value, one of the things that was kind of thematic yesterday: we see enterprises are now embracing big data, they're embracing Hadoop, it's got to coexist within their ecosystem and it's got to interoperate. But just putting data in a Data Lake or Hadoop, that's not the value; the value is being able to analyze that data in motion, at rest, structured, unstructured, and start being able to glean actionable insights. From your CFO's perspective, where are you now in answering some of the questions that he or she had, from an insights perspective, with the Data Lake that you have in place? >> Yeah, before I address that, I want to quickly touch upon and wrap up George's question, if you don't mind, because one of the key challenges, and I was just about to answer it before we moved on, is how Attunity helped. In terms of bringing the data in, the data acquisition or ingestion is a key aspect of it, and again, dealing with the proprietary data structures from the ERP systems is very complex and involves a multi-step process to bring the data into a staging environment, put it in the swamp, and bring it into the Lake. What Attunity has been able to help us with is this: it has the intelligence to look at and understand the proprietary data structures of the ERPs, and it is able to bring all the data from the ERP source systems directly into Hadoop, without any stops or staging databases along the way. It's been a huge value from that standpoint. And to answer your question, around how it's helping from a CFO standpoint and for the users in Finance: as I said, now all the data is available in one place, so it's very easy for them to consume the data and do ad hoc analysis. So if somebody's looking to, as I said earlier, calculate Days Payable, as an example, or look at working capital, we are actually moving data using Attunity's CDC Replicate product, getting data in real time into the Data Lake. So now they're able to turn things around and do that kind of analysis in a matter of hours, versus overnight or in a matter of days, which was the previous environment.
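As a rough sketch of the change-data-capture pattern Arvind describes, where change events flow continuously to the lake instead of full re-extracts, the toy Python below applies an insert/update/delete stream to a target table. The event shape and field names are hypothetical assumptions; Attunity Replicate's actual mechanics are far more involved.

```python
# Minimal sketch of the CDC (change data capture) pattern: rather than
# re-extracting whole ERP tables, only the change events flow downstream.
# Event format and names here are hypothetical, not Replicate's actual format.

target = {}  # stand-in for a table in the lake, keyed by primary key

def apply_change(event: dict) -> None:
    op, key, row = event["op"], event["key"], event.get("row")
    if op in ("insert", "update"):
        target[key] = row          # upsert the latest image of the row
    elif op == "delete":
        target.pop(key, None)      # remove the row if present

change_stream = [
    {"op": "insert", "key": 1, "row": {"invoice": 1, "amount": 100}},
    {"op": "update", "key": 1, "row": {"invoice": 1, "amount": 120}},
    {"op": "delete", "key": 1},
]
for event in change_stream:
    apply_change(event)

print(target)  # {} -- the row was created, revised, then removed
```

The appeal of the pattern is that only deltas move, which is what makes near real-time latency feasible against large ERP tables.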
>> And that was one of the themes this morning: it's really about speed, right? It's how fast you can move, and it sounds like together with Attunity, Verizon is not only making things simpler, in this model you have with different ERP systems, but you're also really able to get information into the right hands much, much faster. >> Absolutely, that's the beauty of the near real-time CDC architecture. We're able to get data in very easily and quickly, and Attunity also provides a lot of visibility as the data is in flight: we're able to see what's happening in the source system and how many packets are flowing through. And to a point, my developers are so excited to work with the product, because they don't have to worry about changes happening in the source systems in terms of DDL; those changes are automatically understood by the product and pushed to the destination in Hadoop. So it's been a game-changer, because we have not had any downtime. Historically, when things changed on the source system side, we had to take downtime to change those configurations and scripts and publish them across environments, so that's been huge from that standpoint as well. >> Absolutely. >> Itamar, maybe help us understand where Attunity goes from here. It sounds like there's greatly reduced latency in the pipeline between the operational systems and the analytic system, but it also sounds like you still need to essentially reformat the data so that it's consumable. So it sounds like there's an ETL pipeline that's just much, much faster, but at the same time, with Replicate, it sounds like that goes without transformations. Help us understand that nuance. >> Yeah, that's a great question, George. Indeed, in the past few years customers have been focused predominantly on getting the data to the Lake. I actually think one of the changes in the game we're hearing here at the show, and over the last few months, is how we move to start using the data, creating great applications on the data. So we're moving to the next step. In the last few years we focused a lot on innovating and creating the solutions that facilitate and accelerate the process of getting data to the Lake, from a large scope of systems, including complex ones like SAP, and also on making that process easier, providing real-time data that can feed both streaming architectures and batch ones. Once we got that covered, to your question, what happens next? One of the things we found, and I think Verizon is also looking at it now, and Arvind can comment on that later, is that when you bring data in and you adopt a streaming, or continuous, incremental type of data ingestion process, you're inherently building an architecture that takes what was originally a database and, in a sense, breaks it apart into partitions as you're loading it over time. So when you land the data, and Arvind referred to a swamp, or some customers refer to it as a landing zone, you bring the data into your Lake environment, but at that first stage the data is not structured, to your point, George, in a manner that's easily consumable. So the next step is: how do we facilitate that next stage of the process, which today is still very manually driven, involves custom development, and means dealing with complex structures?
So we're actually very excited: we've introduced here at the show, we announced, a new product by Attunity, Compose for Hive, which extends our Data Lake solutions. What Compose for Hive is designed to do is address part of the problem you just described: when the data comes in and is partitioned, Compose for Hive reassembles these partitions and then creates analytic-ready data sets back in Hive. It can create operational data stores, it can create historical data stores, so the data becomes formatted in a manner that's more easily accessible for users who want to use analytic tools, BI tools, Tableau, Qlik, any type of tool that can easily access a database.
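Compose for Hive's internals aren't spelled out here, but the "reassembling partitions" idea can be sketched conceptually: incremental loads land as time-ordered partitions of change records, and an analytic-ready view folds them into the latest state per key while a historical store keeps every version. A toy Python illustration with hypothetical names, not Compose for Hive's actual implementation:

```python
# Conceptual sketch: replay time-ordered change partitions into a
# current-state (ODS-style) store plus a full historical trail.

partitions = {
    "2017-06-12": [{"key": 42, "status": "open"}],
    "2017-06-13": [{"key": 42, "status": "pending"}],
    "2017-06-14": [{"key": 42, "status": "closed"}],
}

operational_store = {}   # latest image per key
historical_store = []    # every version, with its load date

for load_date in sorted(partitions):        # replay partitions in order
    for row in partitions[load_date]:
        operational_store[row["key"]] = row
        historical_store.append({**row, "load_date": load_date})

print(operational_store[42]["status"])  # 'closed' -- the current state
print(len(historical_store))           # 3 -- full history retained
```

The same replay logic yields either an operational view (keep only the last image) or a historical one (keep them all), which matches the two store types described above.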
>> Would there be, as a next step, whether led by Verizon's requirements or Attunity's anticipation of broader customer requirements, something where there's, if not near real-time, then very low-latency landing and transformation, so that time-sensitive data can join the historical data? >> Absolutely, absolutely. What we've done is focus on real-time availability of data. When we feed the data into the Data Lake, we feed it in two ways: one is directly into Hive, but we also go through a streaming architecture like Kafka, which in the case of Hortonworks also fits very well into HDF. The next step in the process is producing those analytic data sets, or data stores, out of it, which we enable, and we design that together with our partners and with our customers. So when we worked on Replicate, and then on Compose, we worked very closely with Fortune companies trying to deal with these challenges, so we could design the product well. In the case of Compose for Hive, for example, we have done a lot of collaboration at a product engineering level with Hortonworks, to leverage the latest and greatest in Hive 2.2, Hive LLAP, to be able to push down transformations so they can be done faster, including in real time, so those datasets can be updated on a frequent basis. >> You talked about customer requirements, and obviously we're talking to a telecommunications company. Are you seeing, Itamar, from Attunity's perspective, more of this need to, alright, the data's in the Lake, or first it comes to the swamp, now it's in the Lake, start partitioning it? Is this need driven by specific industries, or is it really pretty horizontal? >> That's a good question, and this is definitely a horizontal need; it's part of the infrastructure needs. Verizon is a great customer, and we've worked similarly in telecommunications, and we've been working with customers in other industries, from manufacturing to retail to health care to automotive and others, and in all of those cases, at a foundation level, the architectural challenges are very similar. You need to ingest the data, you want to do it fast, you want to do it incrementally or continuously, even if you're loading directly into Hadoop. Naturally, when you're loading the data through Kafka or a streaming architecture, it's a continuous fashion, and then you partition the data. So the partitioning of the data is inherent to the architecture, and then you need to help deal with the data for the next step in the process. And we're doing it both with Compose for Hive and, for customers using streaming architectures like Kafka, by providing the mechanisms, supporting or facilitating things like schema evolution and schema decoding, to facilitate the downstream processing of those partitions of data. So we can make the data available in a way that works for analytics and streaming analytics, as well as for scenarios like microservices, where the way in which you partition or deliver the data allows each microservice to pick up the data it needs from the relevant partition. >> Well guys, this has been a really informative conversation. Congratulations, Itamar, on the new announcement that you guys made today. >> Thank you very much. >> Lisa: Arvind, great to hear the use case, and Verizon really sounds quite pioneering in what you're doing; we wish you continued success there and look forward to hearing what's next for Verizon. We want to thank you for watching theCUBE. We are again live, day two of the DataWorks Summit, #DWS17, with my co-host George Gilbert. I'm Lisa Martin, stick around, we'll be right back. (relaxed techno music)
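One way to picture the schema-evolution point made in the interview above, downstream readers surviving upstream DDL changes, is a consumer that projects each event onto the columns it knows and ignores newly added ones. This is a conceptual Python sketch with hypothetical field names, not Attunity's actual mechanism.

```python
# Conceptual sketch of tolerating schema evolution on a streaming feed:
# each event carries a schema version, and the consumer projects whatever
# arrives onto the columns it knows, so a column added upstream (a DDL
# change) doesn't break downstream readers. Shapes here are hypothetical.

KNOWN_COLUMNS = ["order_id", "amount", "currency"]

def project(event: dict) -> dict:
    # Keep known columns, default missing ones, and silently ignore
    # columns added upstream that this consumer doesn't use yet.
    return {col: event.get(col) for col in KNOWN_COLUMNS}

stream = [
    {"schema_version": 1, "order_id": 7, "amount": 250},
    # version 2 added "currency" and "region" upstream; no code change needed
    {"schema_version": 2, "order_id": 8, "amount": 99,
     "currency": "USD", "region": "EMEA"},
]

for event in stream:
    print(project(event))
# {'order_id': 7, 'amount': 250, 'currency': None}
# {'order_id': 8, 'amount': 99, 'currency': 'USD'}
```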

Published Date : Jun 14 2017

SUMMARY :

Brought to you by Hortonworks. Lisa Martin and George Gilbert talk with Itamar Ankorion of Attunity and Arvind Rajagopalan of Verizon at DataWorks Summit 2017 about building Verizon's finance and supply-chain data lake: ingesting data from three ERP systems in near real time with Attunity's CDC replication, cutting cross-ERP analysis from days to hours, and Attunity's newly announced Compose for Hive, which reassembles partitioned data into analytic-ready datasets.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
George Gilbert | PERSON | 0.99+
Arvind Rajagopalan | PERSON | 0.99+
Arvind | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
Verizon | ORGANIZATION | 0.99+
Itamar Ankorion | PERSON | 0.99+
Lisa | PERSON | 0.99+
George | PERSON | 0.99+
Itamar | PERSON | 0.99+
Oracle | ORGANIZATION | 0.99+
San Jose | LOCATION | 0.99+
Silicon Valley | LOCATION | 0.99+
two | QUANTITY | 0.99+
tomorrow | DATE | 0.99+
Kafka | TITLE | 0.99+
three | QUANTITY | 0.99+
Hortonworks | ORGANIZATION | 0.99+
Cube | ORGANIZATION | 0.99+
DataWorks Summit | EVENT | 0.99+
SAP HANA | TITLE | 0.99+
One | QUANTITY | 0.99+
each | QUANTITY | 0.99+
yesterday | DATE | 0.99+
#DWS17 | EVENT | 0.99+
one | QUANTITY | 0.98+
a day and a half | QUANTITY | 0.98+
CDC | ORGANIZATION | 0.98+
first stage | QUANTITY | 0.98+
Tableau | TITLE | 0.98+
DataWorks Summit 2017 | EVENT | 0.98+
Attunity | ORGANIZATION | 0.98+
Hive | TITLE | 0.98+
both | QUANTITY | 0.98+
Attunity | PERSON | 0.98+
DataWorks | EVENT | 0.97+
today | DATE | 0.97+
Compose for Hive | ORGANIZATION | 0.97+
Compose | ORGANIZATION | 0.96+
Hive 2.2 | TITLE | 0.95+
Qlik | TITLE | 0.94+
Hadoop | TITLE | 0.94+
one place | QUANTITY | 0.93+
day two | QUANTITY | 0.92+
each microservice | QUANTITY | 0.9+
first | QUANTITY | 0.9+
20 years back | DATE | 0.89+
#DataWorks | ORGANIZATION | 0.87+
three major ERP systems | QUANTITY | 0.83+
last 20 years | DATE | 0.82+
PeopleSoft | ORGANIZATION | 0.8+
Data Lake | COMMERCIAL_ITEM | 0.8+
SAP | ORGANIZATION | 0.79+

Stephanie McReynolds - HP Big Data 2015 - theCUBE


 

>> Narrator: Live from Boston, Massachusetts, extracting the signal from the noise, it's theCUBE, covering HP Big Data Conference 2015, brought to you by HP Software. Now your hosts, John Furrier and Dave Vellante. >> Okay, welcome back everyone, we are here live in Boston, Massachusetts for HP's Big Data Conference. This is a special presentation of theCUBE, our flagship program, where we go out to the events and extract the signal from the noise. I'm John Furrier with Dave Vellante of Wikibon research. Our next guest is Stephanie McReynolds, VP of Marketing at Elation, a hot new startup that's been coming out of stealth. Stephanie, welcome to theCUBE. >> Great to be here. >> Tell us about the startup first of all, because there's good buzz going on. It's kind of stealth buzz, but it's really with the thought leaders, the people in the industry who know what they're talking about, who like what you guys are doing. So introduce the company, tell us what you're doing and your relationship with Vertica; exciting stuff. >> Absolutely. Elation is an exciting company. We just started to come out of stealth in March of this year, and we came out with some great production customers. eBay is a customer; they have hundreds of analysts using our systems. We also have Square as a customer, a smaller analytics team, but the value their analytics teams are getting out of this product is really being able to access their data in human context. We do some machine learning to look at how individuals are using data in an organization, and we take that machine learning, gather some of the human insights about how that data is being used by experts, and surface all of that in line with the work. >> So what kind of data? Because Stonebraker was talking yesterday about the three V's, which we all know, but the one that's really coming mainstream in terms of a problem space is variety. You have a variety of schema sources, and then you have a lot of unstructured exhaust, or data flying around. Can you be specific on what you guys do? >> Yeah, it's interesting, because there are several definitions of data and big data going around, right? We connect to a lot of database systems and we also connect to a lot of Hadoop implementations, so we deal with both structured data and what I consider unstructured data. I think the third part of what we do is bring in context from human-created data, or human information, which Robert was talking about a little bit yesterday. What happens in a lot of analytic organizations is that there's a very manual process of documenting some of the data being used in these projects, and that's done on wiki pages or spreadsheets floating around the organization, or Slack, Basecamp, all these collaboration platforms. What you realize when you really get into the work of using that information to write your queries is that trying to reference a wiki page and then write your SQL, flipping back and forth between maybe ten different documents, is not very productive for the analyst. So what our customers are seeing is that by consolidating all of that data and information in one place, where the tables are actually referenced side by side with the annotations, their analysts can get twenty to fifty percent savings in productivity, and, maybe more importantly, new analysts can get up to speed quite a bit quicker. At Square the other day, I was talking to one of the data scientists, and he was telling me about his process for finding data in the organization. Prior to using Elation, it would take about 30 minutes, going to two, maybe three or four people, to find the data he needed for his analysis. With Elation, in five seconds he can run a search for the data he wants, get it back with all that expert annotation already around that base data, and he's ready to roll; he can start his analysis.
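The machine-learned usage context Stephanie describes can be approximated, very crudely, by mining query logs: count which tables analysts actually touch and who touches them most, the "experts" to surface next to the data. A toy Python sketch follows; the log format and the naive FROM/JOIN regex are illustrative assumptions, nothing like Elation's actual algorithms.

```python
import re
from collections import Counter, defaultdict

# Toy usage-mining sketch: which tables get queried most, and by whom.
# Both the log contents and the simple FROM/JOIN pattern are made up
# for illustration.

query_log = [
    ("aaron", "SELECT * FROM sales.customers c JOIN sales.orders o ON ..."),
    ("maya",  "SELECT zip FROM sales.customers WHERE country = 'US'"),
    ("maya",  "SELECT count(*) FROM finance.invoices"),
]

table_refs = re.compile(r"(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)

usage = Counter()
experts = defaultdict(Counter)
for user, sql in query_log:
    for table in table_refs.findall(sql):
        usage[table] += 1
        experts[table][user] += 1

print(usage.most_common(1))        # [('sales.customers', 2)]
print(experts["sales.customers"])  # Counter({'aaron': 1, 'maya': 1})
```

Counts like these, surfaced next to the table itself, are one plausible ingredient in the five-second search experience described in the Square anecdote.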
>> So it's a platform, right? And you said you work with a lot of databases, right? >> It's tightly integrated with the database, in this use case. It's interesting: we see databases as a source of information, so we don't create copies of the data on our platform; we go out and point to the data where it lies and surface that data to the end user. Now, in the case of our relationship with Vertica, we've also integrated Vertica in our stack to support what we call data forensics, which is built not for the analyst who's using the system day to day, but for an IT individual to understand the behaviors around this data and the types of analysis being done. Vertica's a great high-performance platform for dashboarding and business intelligence, the back end of that, providing quick access to aggregates. >> So how do you work with Vertica? You use just the engine, what specifically? >> So we use the Vertica engine underneath our forensics product, and that's one portion of our platform; the rest of our platform is built out on other technologies. >> So Vertica is part of your solution. >> It's part of our solution; it's one application that we deliver. >> We've been talking all week, and Colin Mahoney in his talk yesterday gave a little history of ERP: how it was initially highly customized and became packaged apps, and he sort of pointed to a similar track with analytics, although he said it's not going to be the same, it's going to be more composable sorts of applications. Historically, the analytics and the database have been closely aligned, I'll say maybe not integrated. Do you see that model continuing? Do you see more packaged apps, or what Colin's calling composable apps? What's the relationship between your platform and the application? >> So our platform is really more tooling for the individuals who are building or creating those applications. We're helping data scientists and analysts find what algorithms they want to use as a foundation for those applications, so a little bit more on the discovery side, where folks are doing a lot of experimentation. They may be having to prepare data in different ways to figure out what might work for those applications, and that's where we fit in as a vendor. >> And what's your license model? >> So we're on a subscription model. We have customers that have data teams in the hundreds, at a place like eBay; smaller implementations could be teams of five or ten analysts. It's a seat-based subscription, and we can run in the cloud or on premise. We also do some interesting things around securing the data, securing columns and data sets for financial services organizations and other customers that have security concerns, and most of those are on-premise implementations.
>> Talk about the inspiration of the company and about the company itself. It's been three years, and you've now come out of stealth. What are the founders like, what's the DNA of the company, what do you guys do differently, and what was the inspiration behind this? >> What's really interesting about the founding of the company is that the technical founders come from both Google and Apple, and both individuals had independently made an interesting observation, one a hardcore algorithmic guy and the other focused on relevance. Both observed how Google and Apple, two of the most data-driven companies on the planet, were struggling: their analytics teams were struggling to share queries and share data sets, and there was a lot of replication of work happening. Both of these folks, from different angles, came together at Elation and said, look, there are a lot of machine learning algorithms that could help with this process, and there are also a lot of good ways, with natural language processing, to let people interact with their data in more natural ways. The founder from Apple, Aaron, was on the Siri team, so he had a lot of experience designing products for navigability, ease of use, and natural language learning. Those two perspectives coming together created the technology fundamentals in our product. >> And some scar tissue from large-scale implementations of data. >> Yeah, very large-scale implementations of data, and also a really deep awareness of what the human equation brings to the table. Machine learning algorithms aren't enough in and of themselves, and I think Ken Rudin had some interesting comments this morning where he pushed it one step further and said it's not just about finding insight; data science is about having impact. And you can't have impact unless you create human context and you have communication and collaboration around the data. So we give analysts a query tool through which we surface the machine-learning context we have about the data being used in the organization and what queries have been run on that data, but we surface it in a way where the human can get recommendations about how to improve their SQL and drive toward impact, and then share that understanding with other analysts in the organization, so you get an innovation community that's started. >> So who do you guys target? Let's step back to the go-to-market. You're launched, you've got some funding. Can you share the amount, or is it private and confidential? How much did you raise, who are you targeting, what's your go-to-market, what's the value proposition? Give us the data. >> So the initial value proposition is really about analyst productivity; that's where we're targeted. How can you take your teams of analysts, and everyone knows it's hard to hire these days, so you're not going to grow those teams overnight, and make the analysts, the data scientists, the PhDs you have on staff much more productive? How do you take the eighty to ninety percent of their time that goes into just finding data in the organization and preparing it, get them out of that tedium, and let them really innovate and use that to drive value back to the organization?
>> So we're often selling to individual analysts and analytics teams; the go-to-market starts there, but the value proposition extends much further into the organization. You find teams and organizations that have been trying to document their data through traditional data governance means or ETL tools for a very long time, and a lot of those projects have stalled out, and the way we crawl systems and use machine learning automation to automate some of that documentation really gives those projects new life. >> Enterprise data has always been elusive. I mean, you go back decades: structured data, all these pre-built databases. It's been hard. So if you can crack that nut, that's going to be a very lucrative opportunity. You've got the Hadoop clusters now storing everything; some clients we talk to here, key customers of HP or IBM, big companies, are storing everything just because they don't know when they'll need it again. >> Yeah, if the past has been hard, it's in part because we in some cases over-managed the modeling of the data, and I think what's exciting now about storing all your data in Hadoop, storing first and asking questions later, is that you're able to take a more discovery-oriented, hypothesis-testing, iterative approach. If you think about how true innovation works, you build insights on top of one another to get to the big breakthrough concepts. So I think we're at an interesting point in the market for a solution like this that can help with the increasing complexity of the data environment. >> So you just raised your Series A, nine million, and you maybe did some seed round before that, so pretty early days for you guys. You mentioned natural language processing before, one of your founders; are you using NLP in your solution in any way? >> So we have a search interface that allows you to look for that technical data, for metadata and data objects, by entering simple natural-language search terms. So we are using that as part of our interface and solution. >> And any early customer successes you can talk about, any examples? >> Yeah, there are some great examples. Jointly with Vertica, Square is a customer, and their analytics team is using us on a day-to-day basis, not only to find data sets in the organization but to document those data sets. And eBay has hundreds of analysts using Elation today in a day-to-day manner, and they've seen quite a bit of productivity from the new analysts coming onto their systems. It used to take analysts about 18 months to really get their feet under them in the eBay environment, because of the complexity of all the different systems at eBay and understanding where to go for that customer table they needed to use; now analysts are up and running in about six months. And their data governance team has found that Elation has really automated and prioritized the process around documentation for them, so it's a great foundation for their data curators and data stewards to go in and enrich the data and collaborate more with the analysts, the actual data users, to get to a point of cataloged data.
>> So what's next? You guys are going to be on the road in New York: Strata Hadoop World, Big Data NYC is coming up, a big event in New York. >> We're getting the word out about Elation, and we have customers that are starting to speak about their use cases and the value they're seeing; they'll be in New York, and MarketShare, I believe, will be speaking on our behalf there to share their story. And then we're also going to a couple of other conferences after that; the fall is an exciting time. >> Which are your big ones? >> So we'll be at Strata in New York in late September, early October, and then mid-October we're going to be at both Teradata Partners and Tableau's conference as well. We connect not only to databases of all different sorts, but also to the tools users go to. >> Awesome. Anything else you'd like to add or share about the company? We've heard some great things about you guys checking around, that's how we found out about you, and a lot of people like the company, a lot of insiders. And you didn't raise too much cash; that's not the zillion-dollar round. Why did you guys take nine million? >> Yeah, you know, I think we're building this company in a traditional, value-oriented way: staying lean, bringing in revenue, and trying to balance that out with the venture capital investment. It's not that we won't take money, but we want to build this company in a very durable way. >> So the vision is to build a durable company. >> Absolutely, absolutely, and that may be different from some of our competitors out there these days. >> We've not taken any financing at SiliconANGLE at all, so we believe in that. You might pass up some things, but you have control. And you guys have some good partners, so congratulations. Final word: what's this conference like? You go to a lot of events; what's your take on this event? >> Yeah, I do end up going to a lot of events, that's part of the marketing role. I think what's interesting about this conference is that there are a lot of great conversations happening, and not just from a technology perspective but also between business people, deep thinking about how to innovate. Vertica's customers, I think, are some of the most loyal customers I've seen in the market, so it's great. They're advanced, too; they're talking about some pretty big problems they're solving. It's not little point solutions, it's more re-architecting. >> There's a DevOps vibe here. I got trashed on Twitter, in private messages, all last night about me calling this a DevOps show. It's not really a DevOps cloud show, but there's a DevOps vibe here. The people working on the solutions, there's a real vibe: people are solving real problems, and they're talking about them and sharing their opinions. I think that's similar to what you see in DevOps; the DevOps guys are on the front line, the real engineers, and they're engineering because they have to. No pretenders here, that's for sure. >> And it's not a big sales conference, right? There's a lot of customer content; they're engineering solutions, talking to peers. Nobody wants bullshit, they want the real thing: "I've got a lot on the table, I'm doing some serious work, and I want serious conversations." That's refreshing for us; we love it. >> Alright, Stephanie, thanks so much for coming on theCUBE and sharing your insight. Congratulations, and good luck with the new startup; hot startups here in Boston at the Vertica HP Software show. We'll be right back with more on theCUBE after this short break.

Published Date : Aug 12 2015

**Summary and Sentiment Analysis are not shown because of an improper transcript**

ENTITIES

Entity | Category | Confidence
Colin Mahoney | PERSON | 0.99+
Stephanie McReynolds | PERSON | 0.99+
Apple | ORGANIZATION | 0.99+
Google | ORGANIZATION | 0.99+
twenty | QUANTITY | 0.99+
Peter | PERSON | 0.99+
eBay | ORGANIZATION | 0.99+
New York | LOCATION | 0.99+
Boston | LOCATION | 0.99+
three | QUANTITY | 0.99+
John Furrier | PERSON | 0.99+
Stephanie | PERSON | 0.99+
five seconds | QUANTITY | 0.99+
Vertica square | ORGANIZATION | 0.99+
IBM | ORGANIZATION | 0.99+
Ken Rudin | PERSON | 0.99+
three years | QUANTITY | 0.99+
Dave Vellante | PERSON | 0.99+
Vertica | ORGANIZATION | 0.99+
nine million | QUANTITY | 0.99+
ninety percent | QUANTITY | 0.99+
yesterday | DATE | 0.99+
Cuba | LOCATION | 0.99+
both individuals | QUANTITY | 0.99+
hundreds | QUANTITY | 0.99+
Boston Massachusetts | LOCATION | 0.99+
ten different documents | QUANTITY | 0.99+
two | QUANTITY | 0.99+
fifty percent | QUANTITY | 0.98+
both | QUANTITY | 0.98+
mid-October | DATE | 0.98+
HP | ORGANIZATION | 0.98+
Robert | PERSON | 0.98+
one | QUANTITY | 0.98+
nine million | QUANTITY | 0.98+
about six months | QUANTITY | 0.98+
Collins | PERSON | 0.97+
2015 | DATE | 0.97+
Hadoop | TITLE | 0.97+
two perspectives | QUANTITY | 0.97+
Aaron | PERSON | 0.97+
Siri | TITLE | 0.97+
four people | QUANTITY | 0.97+
eighty | QUANTITY | 0.97+
this week | DATE | 0.97+
Neelix | ORGANIZATION | 0.96+
about 30 minutes | QUANTITY | 0.96+
Radek | PERSON | 0.95+
CHP | ORGANIZATION | 0.95+
HP Big Data | ORGANIZATION | 0.95+
one place | QUANTITY | 0.95+
about 18 months | QUANTITY | 0.95+
March of this year | DATE | 0.95+
NYC | LOCATION | 0.95+
five | QUANTITY | 0.94+
hundreds of analysts | QUANTITY | 0.94+
eBay | ORGANIZATION | 0.93+
eight | QUANTITY | 0.93+
third part | QUANTITY | 0.92+
Boston Massachusetts | LOCATION | 0.91+
one application | QUANTITY | 0.91+
70 | QUANTITY | 0.9+
this morning | DATE | 0.9+
Twitter | ORGANIZATION | 0.89+
today | DATE | 0.89+
SiliconANGLE | ORGANIZATION | 0.88+
big data | EVENT | 0.88+
one step | QUANTITY | 0.88+
September early October | DATE | 0.87+
last night | DATE | 0.84+
lot of events | QUANTITY | 0.84+
NLP | ORGANIZATION | 0.83+
a million | QUANTITY | 0.79+
lot of events | QUANTITY | 0.78+
lot | QUANTITY | 0.78+