Search Results for Alation:

Raj Gossain, Alation


 

(upbeat electronic music) >> Hello, and welcome to this Cube Conversation. My name is Dave Vellante, and we're here with Raj Gossain, who's the Chief Product Officer at Alation. We have some news. Hello, Raj. Thanks for coming on. >> Dave, it's great to be with you on theCUBE again. >> Yeah, good to see you. So, okay, we're going to talk about Alation Connected Sheets. You know, what is that? Talk to us about what it is, what it does, what it brings to customers. >> So we recognize, spreadsheets are really the dark matter of the data universe. And they're used by, over 78 million people use spreadsheets on a regular basis to drive critical business analysis. But there's a lot of challenges with spreadsheet usage. It brings risk to the organization. There's no visibility into where data comes from. And so we wanted to bring the power of the Alation Data Intelligence Platform to business users where they spend most of their time. And that's in a tool that they love, and that's spreadsheets. And so we're launching a brand new product next week called Alation Connected Sheets. >> So talk more about that. So yes, I get the lineage issue, like where did-- who did this, where's this data come from? I got different data. But talk more about the problems that Alation Connected Sheets solves, specifically for customers. >> Yeah, so the big challenges that we see when we talk to data organizations is how do they understand where the data came from? Is it trusted? Is it reusable? Should it be used in this format? And if you look at where most users that use spreadsheets get the data to power their spreadsheets, maybe it's a CSV download from a database, and then you have no idea where the data came from and where it's going. Or even worse, it's copying and pasting data from other spreadsheets. 
And so if you take those problems, how can we bring trusted data from governed sources like Snowflake and Redshift and put it in the hands of spreadsheet users, and give them the power and flexibility of Google Sheets or Microsoft Excel, but use trusted, reliable, well-governed data so that the data office feels great about them using spreadsheets and the end users, the business users, can take advantage of the tool that they know and love and do the work that they need to do quickly. >> So, okay. So I'm inferring from your comments there that you've got the ability to take data from you mentioned a couple, Snowflake and Redshift, other popular data warehouses. >> Yep. >> So talk about the key capabilities that you have, any specific features that we should know about. >> Sure. So, we built the leading data intelligence platform and the leading data catalog. And one of the benefits of that catalog is where you have visibility into all of the trusted, governed data sources that a data organization cares about, whether it's enterprise warehouses like Snowflake or Redshift, databases like SQL Server, Google BigQuery, what have you. So what we've done is we've brought the power of that data catalog directly into both Google Sheets as well as Excel. And the idea there is a user can log into their application, authenticate to Alation using the Alation Connected Sheets plugin into their spreadsheet tool, and browse those trusted data sets that are surfaced in the Alation catalog. They get trust signals, they get visibility into where this data came from. So lineage, insights, descriptive information. And then with one or two clicks, they can choose a data set from their warehouse, basically apply filtering conditions. So let's say I'm looking for customer data in Snowflake. I can find the right customer table. If I only want it for say, 2022, I can apply some filter conditions, I can reorder columns, push one button, authenticate to that data source. 
We want to maintain and ensure security is being applied, so only those users that have access to the warehouse can actually download that data set. But once they've authenticated, that data gets downloaded into their spreadsheet and there's a live connection that's maintained to that spreadsheet. So anytime you need to refresh the data, one push of a button and that data set gets updated. I can schedule the updates. So, you know, if I have to produce a report every Monday morning, I could have that data set refreshed at 8:00 a.m. Monday morning, or whatever schedule the user wants. And so it gives the user the data set they need, but the data organization, they can see where that data came from and they understand the lineage of the data as it is used in analysis in those spreadsheets themselves. >> So Raj, I know you're at the Super Bowl this week, a.k.a. re:Invent. >> Yes. >> And I know you got very close relationships with Snowflake, you've mentioned them a couple times with the data summit last spring. And I know you've done some integration work with those platforms and I'm sure others. So should we think of this as you're extending that sort of trust and governance out to spreadsheets, is that right? And stretching that out? >> That's exactly right. The way we talk about it is how do we bring data intelligence to business users in the tool that they know and love, which is the spreadsheet. And so, the data catalog and data intelligence platforms in general have really primarily been focused on servicing the needs of data users: data analysts, data scientists, data engineers. But you know, our vision, our aspiration at Alation is to really bring data intelligence to any business user. And so it's a big part of our strategy to make sure that the insights from the Alation catalog and platform can find their way into tools like Excel and Google Sheets. And so that's, what you highlighted, Dave, is exactly correct. 
We want to maximize the likelihood that a business user can have self-service access to trusted, governed data, do the work that they need to do, and ensure that the organization has a set of data assets in spreadsheets, frankly as opposed to liabilities, which is the way most data organizations look at spreadsheets is it's almost like a risk factor. We want to convert that risk, that liability, into an asset so that people can reuse data sets and they understand where this analysis is actually coming from. >> It's something that we've talked about for well over a decade on theCUBE. Is data an asset or is it a liability? >> Yeah, yeah. >> You obviously want to get value out of it, but if you can't share it, it's not trusted. So what people do is they lock it down and then that constricts value creation. >> Exactly. >> My understanding is this tech came out of an acquisition from a company, Kloudio. >> That's correct. >> Tell us about Kloudio. Why Kloudio? What's the fit there? >> Yeah, so Kloudio is a company, it's about five years old. We closed the acquisition of the company in March of this past year. And they had about 20 customers, 10 engineers. And we saw an opportunity with the spreadsheet tool that they'd created to really complement our data intelligence strategy. And as you said, Dave, extend the value of data intelligence to business users. And so, we brought the Kloudio team into the fold. The thing I'm most excited about as a product guy, is within seven months of them joining Alation, we're actually shipping a brand new product that's going to drive revenue and meet the needs of tens of millions of users, ultimately. Like that's really our aspiration. And so, the tech they had was extremely modern. It reinforces the platform position that we have. 
You know, this microservices architecture that we've built Alation around, made it easy for that new team to come in and leverage existing APIs and capabilities from our platform and the tech that they brought into Alation to essentially connect the dots and deliver a brand new set of capabilities to an entirely new audience, to help our customers achieve their business objectives, which is really creating a data culture across their entire organization, inclusive of business users, not just, like I said, the data X users that are already taking advantage of solutions like Alation and cloud warehouses, et cetera. >> So I have two questions, follow up questions by me, and I think you might have answered the second one. The first one is what's the secret sauce behind Kloudio? How does the tech work? The second question is how does it fit into the Alation portfolio? How were you able to integrate it so quickly? Maybe that's the microservices architecture. But start with the secret sauce. What is it, what can you share with me? >> I think the thing that we saw with Kloudio that got us excited, and the fact that they, even though it was a small company, they had 20 customers, they were generating revenue, and they were delivering real value to business users, by really enabling business users to tap into the value of trusted, governed data, and frankly, get IT out of the way. You know, we almost refer to it as like smart self-service, which is, they could find a data asset and connect to that source, and just with a couple quick clicks, almost a low-code, no-code type of an experience, bring that sort of data into their spreadsheet so they could do the work that they needed to do. That opportunity, that tech that the Kloudio team had built out, the big gap that they had is, my goodness, what does it take to actually be aware of all the data sources that exist across an organization and connect to them? And that's what Alation does, right? 
That's why we built the platform that we built, so that we can basically understand all of a customer's data assets, whether they're on-prem or in the cloud. And so it was a little bit of, you know, that Reese's Peanut Butter Cup analogy. The chocolate and the peanut butter coming together. The Alation platform, the Alation catalog, coupled with the technology that Kloudio brought to us really was sort of a match made in heaven. And it's allowed us to bring this new capability to market that really is value-add on top of the platform and catalog investments that our customers have already made. >> Yeah, so they had this magic pixie dust, but it was sort of isolated, and then you've integrated it into your catalog. And that's the second part of my question. How were you able to do that so quickly? >> So, we've been on this evolution, enhancing the Alation data intelligence platform. We've moved to a microservices architecture, we're fully multi-tenant in the cloud. And the fact that we'd made those investments over the past few years gave us the opportunity to make it easy for an acquired business like Kloudio, or you know, perhaps a future acquisition, or third party developers leveraging APIs that we expose to make it easy for them to integrate into the Alation platform. And so, I think it's a bit of foresight. We recognize that in starting with the catalog, the opportunity was much bigger than just providing a data catalog. We've added data governance, we've built out this platform and we recognize that more and more users can and should be benefiting from data intelligence. And so I think those platform investments have paid significant dividends and accelerated our ability to deliver Alation Connected Sheets as quickly as we have. >> Sounds like a great acquisition, like a diamond in the rough. I mean, I love these big mega acquisitions 'cause the media company can write about 'em, but I really love the high, high return. 
You know, low denominator, high value. So, congratulations. >> Thank you. >> Where can people learn more about this? Maybe play around a little bit with it? >> Yeah, so we're going to be demoing Alation Connected Sheets at AWS re:Invent next week. And it's going to be available starting next week, so the 28th of November. And obviously you'll see it online, on social media, on our website as well. But folks that are going to be in Las Vegas next week, come to the Alation booth and you'll get a chance to see it directly. >> Awesome. Okay, Raj. Hey, thanks for spending some time with us today. Really appreciate it. >> Great, thanks so much, Dave. Great to see you. >> Hey, you're very welcome. And thank you for watching. This is Dave Vellante for theCUBE, your leader in enterprise and emerging tech coverage.
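Editor's illustration, not part of the interview: the workflow Raj describes (pick a table, filter it to 2022, reorder columns, then refresh every Monday at 8:00 a.m.) can be sketched in a few lines of Python. This is only a sketch under stated assumptions. It does not use or represent the Alation Connected Sheets API, and the function names, the signup_date column, and the sample rows are all hypothetical.

```python
from datetime import date, datetime, timedelta


def select_rows(rows, year, columns):
    """Keep rows whose hypothetical 'signup_date' falls in `year`,
    returning only `columns`, in the order requested."""
    return [
        {col: row[col] for col in columns}
        for row in rows
        if row["signup_date"].year == year
    ]


def next_monday_8am(now):
    """Return the first Monday 08:00 strictly after `now`."""
    days_ahead = (0 - now.weekday()) % 7  # Monday is weekday 0
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=8, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # It is already Monday 08:00 or later, so move to next week.
        candidate += timedelta(days=7)
    return candidate


# Hypothetical customer rows standing in for a warehouse table.
customers = [
    {"id": 1, "name": "Acme", "signup_date": date(2021, 6, 1)},
    {"id": 2, "name": "Globex", "signup_date": date(2022, 3, 15)},
]

filtered = select_rows(customers, 2022, ["name", "id"])
refresh_at = next_monday_8am(datetime(2022, 11, 22, 10, 0))
```

In the product as described, the filtering and the scheduled refresh run against the governed source under the user's own credentials; the sketch only mirrors the shape of those two steps.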

Published Date : Nov 22 2022


ENTITIES

Entity | Category | Confidence
Raj Gossain | PERSON | 0.99+
Dave Vellante | PERSON | 0.99+
Dave | PERSON | 0.99+
Kloudio | ORGANIZATION | 0.99+
Raj | PERSON | 0.99+
two questions | QUANTITY | 0.99+
one | QUANTITY | 0.99+
10 engineers | QUANTITY | 0.99+
Alation | ORGANIZATION | 0.99+
Excel | TITLE | 0.99+
20 customers | QUANTITY | 0.99+
Las Vegas | LOCATION | 0.99+
second part | QUANTITY | 0.99+
next week | DATE | 0.99+
second question | QUANTITY | 0.99+
8:00 a.m. Monday morning | DATE | 0.99+
28th of November | DATE | 0.99+
seven months | QUANTITY | 0.99+
Super Bowl | EVENT | 0.99+
two clicks | QUANTITY | 0.99+
both | QUANTITY | 0.99+
last spring | DATE | 0.98+
second one | QUANTITY | 0.98+
today | DATE | 0.98+
over 78 million people | QUANTITY | 0.98+
Snowflake | TITLE | 0.98+
Google Sheets | TITLE | 0.97+
AWS | ORGANIZATION | 0.97+
SQL Server | TITLE | 0.97+
one push | QUANTITY | 0.96+
Monday morning | DATE | 0.96+
this week | DATE | 0.95+
about 20 customers | QUANTITY | 0.94+
first one | QUANTITY | 0.92+
Redshift | TITLE | 0.92+
about five years old | QUANTITY | 0.92+
Google | ORGANIZATION | 0.91+
Alation Connected Sheets | TITLE | 0.91+
one button | QUANTITY | 0.9+
Microsoft | ORGANIZATION | 0.87+
Reese | ORGANIZATION | 0.86+
tens of millions of users | QUANTITY | 0.83+
March of this past year | DATE | 0.78+
couple quick clicks | QUANTITY | 0.77+
Snowflake | ORGANIZATION | 0.77+
Conversation | EVENT | 0.75+
Alation Data | ORGANIZATION | 0.75+
theCUBE | ORGANIZATION | 0.73+
2022 | DATE | 0.71+
over a decade | QUANTITY | 0.68+
couple times | QUANTITY | 0.66+
Invent | EVENT | 0.64+
a button | QUANTITY | 0.64+
Sheets | COMMERCIAL_ITEM | 0.63+

Satyen Sangani, Alation | Cube Conversation


 

(upbeat electronic music) >> As we've previously reported on theCUBE, Alation was an early pioneer in the data, data governance, and data management space, which is now rapidly evolving with the help of AI and machine learning, into what's often referred to as data intelligence. Many companies, you know, they didn't make it through the last era of data. They failed to find the right product market fit or scale beyond their close circle of friends, or some ran out of money or got acquired. Alation is a company who did make it through, and has continued to attract investor support, even in a difficult market where tech IPOs have virtually dried up. Back with me on theCUBE is Satyen Sangani, who's the CEO and co-founder of Alation. Satyen, good to see you again. Thanks for coming on. >> Great to see you, Dave. It's always nice to be on theCUBE. >> Hey, so remind our audience why you started Alation 10 years ago, you and your co-founders, and what you're all about today. >> Alation's vision is to empower a curious and rational world, which sounds like a really, I think, presumptuous thing to say. But I think it's something that we really need, right? If you think about how people make decisions, often it's still with bias or ideology, and we think a lot of that happens because people are intimidated by data, or often don't know how to use it, or don't know how to think scientifically. And we, at the core, started Alation because we wanted to demystify data for people. We wanted to help people find the data they needed and allow them to use it and to understand it better. And all of those core consumption values around information were what led us to start the company, because we felt like the world of data could be a little easier to use and manage. >> Your founding premise was correct. I mean, just getting the technology to work was so hard, and as you well know, it takes seven to 10 years to actually start a company and get traction, let alone hit escape velocity. 
So as I said in the open, you continue to attract new investors. What's the funding news? Please share with us. >> So we're announcing that we raised 123 million from a cohort of investors led by Thoma Bravo, Sanabil Investments, and Costanoa. Databricks Ventures is a participant in that round, along with many of our other existing investors, which would also include Salesforce amongst others. And so, super excited to get the round done in this interesting market. We were able to do that because of the business performance, and it was an up round, and all of that's great and gives our employees and our customers the fuel they need to get the product that they want. >> So why the E Round? Explain that. >> So, we've been accelerating growth over the last five quarters since our Series D. We've basically increased our growth rate to almost double since the time we raised our last round. And from our perspective, the data intelligence market, which is the market that we think we have the opportunity to continue to be the leading platform in, is growing super fast. And when faced with the decision of decelerating growth in the face of what might be, what could be a challenging macroeconomic environment, and accelerating when we're seeing customers increase the size of their commitments, more new customers sign on than ever, our growth rates increasing. We and the board basically chose to take the latter approach and we sort of said, "Look, this is an amazing time in this category. This is an amazing time in this company. It's time to invest and it's time to be aggressive when a lot of other folks are fearful, and a lot of other folks aren't seeing the traction that we're seeing in our business." >> Why do you think you're seeing that traction? I mean, we always talk about digital transformation, which was a buzzword before the pandemic, but now it's become a mandate. Is that why? Is it just more data related? Explain that if you could. 
>> I think there's this potentially, you know, somewhat confusing thing about data. There's a, maybe it's a dirty secret of data, which is there's the sense that if you have a lot of data, and you're using data really well, and you're producing a ton of data, that you might be good at managing it. And the reality of it is that as you have more people using data and as you produce more data, it just becomes more and more confusing because more and more people are trying to access the same information to answer different questions, and more workloads are produced, and more applications are produced. And so the idea of getting more data actually means that it's really hard to manage and it becomes harder to manage at scale. And so, what we're seeing is that with the advent of platforms like AWS, like Snowflake, like Databricks, and certainly with all of the different on-premise applications that are getting born every single day, we're just seeing that data is becoming really much more confusing, but being able to navigate it is so much more important because it's the lifeblood for any business to build differentiation and satisfy their customers. >> Yeah, so last time we talked, we talked about the volume and velocity bromide from the last decade, but we talked about value and how hard it is to get value. So that's really the issue is the need and desire for more organizations to get more value out of that data is actually a stronger tailwind than the headwinds that you're seeing in the macroeconomic environment. >> Right. Because I think in good times you need data in order to be able to capitalize off all the opportunities that you've got, but in bad times you've got to make hard choices. And when you need to make hard choices, how do you do that? 
Well, you've got to figure out what the right decisions are, and the best way to do that is to have a lot of data and a lot of people who understand that data to be able to capitalize on it and make better insights and better decisions. And so, you don't see that just, by the way, theoretically. In the last quarter, we've seen three companies that have had cost reductions and force reductions where they are increasing at the same time their investment with Alation. And it's because they need the insight in order to be able to navigate these challenging times. >> Well, congratulations on the up round. That's awesome. I got to ask you, what was it like doing a raise in this environment? I mean, sellers are in control in the public markets. Late stage SaaS companies, that had to be challenging. How did you go about this? What were the investor conversations like? >> It certainly was a challenging fundraise. And I would say even though our business is doing way better and we were able to attract a valuation that would put us in the top quartile of public companies were we trading as a public company, which we aspire to do at some point, it was challenging because there was a whole slew of investors who were basically sitting on their hands. I had one investor conversation where an investor said to me, "Look, we think you're a great business, but we have companies that are able to give us 2.5 liquidation preference, and that gives us 70%, 75% of our return day one. So we're just going to go do those companies that may have been previously overvalued, but are willing to give us these terms because they want to keep their face valuation." Other investors said, "Look, we'd really rather that you ran a lower growth plan but with a potentially lower burn plan. But we think the upside is really something that you can capitalize on." 
From our perspective, we were pretty clear about the plan that we wanted to run and didn't want to necessarily totally accommodate to the fashion of the current market. We've always run a historically efficient business. The company has not burned as much as many of the data peers that we've seen to grow to get to our scale, but our general view was, look, we've got a really clear plan. The board, and the company, and the management team know exactly what we'd like to do. We've got customers that know exactly what they want from us, so we really just have to go execute. And the luck is that we found investors who were willing to do that. Many investors, and we picked one in Thoma Bravo that we felt could be the best partner for the coming phase of the company. >> So I love that because you see the opportunity, you've had a very efficient business. You're punching above your weight in terms of your use of capital. So you don't want to veer off. You know your business better than anybody. You don't want to veer off that plan. The board's very supportive. I could see you, you hear it all the time, we're going to dial down the growth, dial up the EBIT, and that's what markets want today. So congratulations on sticking to your beliefs and your vision. How do you plan to use the funds? >> We are planning to invest in sales and marketing globally. So we've expanded in Asia-Pacific over the most recent year, and also in (indistinct) and we plan to continue to do that. We're going to continue to expand in public sector with fed. And so, you would see us basically just increase our presence globally in all of the markets that you might expect. In particular, you're going to see us lean in heavily to many of the partners Databricks invested alongside this particular round. But you would have seen previously that Snowflake was a fabulous, and has been a fabulous partner of ours, and we are going to continue to invest alongside these leading data platforms. 
What you would also expect to see from us, though, is a lot of investment in R&D. This is a really nascent category. It's a really, really hard space. People would call it a crowded market because there are a lot of players. I think from our perspective, our aspiration is to be the leading data intelligence platform, platform being a really key word there because it's not like we can do it all ourselves. We have a lot of different use cases in data intelligence, things like data quality and data observability, things like data privacy and data access control. And we have some really great partners that we walk alongside in order to make the end customer successful. I think a lot of folks in this market think, "Oh, we can just be master of all. Sort of jack of all trades, master of none." That is not our strategy. Our strategy is to really focus on getting all our customers super successful, really focused on engagement and adoption, because the really hard thing with these platforms is to get people to use them, and that is not a problem Alation has had historically. >> You know, it's really interesting, Satyen, you talk about, I mean, Thoma Bravo, obviously, very savvy investors, deep pockets, they've been making some moves. Certainly we've seen that in cyber security and data. So you got some quasi patient capital there. But the interesting thing to me is that the previous Snowflake investment last year and now Databricks, a lot of people think of them as sort of battling it out, but my view is it's not a zero sum game, meaning, yes, there's overlap, but they're filling a lot of gaps in the marketplace, and I think there's room, there's so much opportunity, and there's such a large TAM, that partnering with both is a really, really smart idea. I'll give you the last word. Going forward, what can we expect from Alation? >> Well, I think that's absolutely true, and I think that the biggest boogeyman with all of this is that people don't use data. 
And so, our ability to partner together is really just a function of making customers successful and continuing to do that. And if we can do that, all companies will grow. We ended up ultimately partnering with Databricks and deepening our partnership, really, 'cause we had one already, primarily because of the fact that we have over a hundred customers that are jointly using the products today. And so, it certainly made sense for us to continue to make that experience better 'cause customers are demanding it. From my perspective, we just have this massive opportunity. We have the ability and the insight to run a really efficient, very, very high growth business at scale. And we have this tremendous ability to get so many more companies and people to use data much more efficiently and much better. Which broadly is, I think, a way in which we can impact the world in a really positive way. And so that's a once in a lifetime opportunity for me and for the team. And we're just going to get after it. >> Well, it's been fun watching Alation over the years. I remember mid last decade talking about this thing called data lakes and how they became data swamps, and you were helping clean that up. And now, the next 10 years, and data's not going to be like the last, you know, simplifying things and really democratizing data is the big theme. Satyen, thanks for making time to come back on theCUBE, and congratulations on the raise. >> Thank you, Dave. It's always great to see you. >> And thank you for watching this conversation with the CEO in theCUBE, your leader in enterprise and emerging tech coverage. (gentle electronic music)

Published Date : Nov 2 2022


ENTITIES

Entity | Category | Confidence
Alation | ORGANIZATION | 0.99+
Satyen | PERSON | 0.99+
Dave | PERSON | 0.99+
seven | QUANTITY | 0.99+
70% | QUANTITY | 0.99+
75% | QUANTITY | 0.99+
Databricks | ORGANIZATION | 0.99+
Sanabil Investments | ORGANIZATION | 0.99+
last year | DATE | 0.99+
Satyen Sangani | PERSON | 0.99+
Databricks Ventures | ORGANIZATION | 0.99+
both | QUANTITY | 0.99+
10 years ago | DATE | 0.99+
Costanoa | ORGANIZATION | 0.99+
123 million | QUANTITY | 0.99+
last quarter | DATE | 0.99+
three companies | QUANTITY | 0.98+
Snowflake | ORGANIZATION | 0.98+
10 years | QUANTITY | 0.98+
mid last decade | DATE | 0.98+
over a hundred customers | QUANTITY | 0.98+
one | QUANTITY | 0.97+
today | DATE | 0.97+
one investor | QUANTITY | 0.96+
AWS | ORGANIZATION | 0.94+
pandemic | EVENT | 0.93+
Thoma Bravo | ORGANIZATION | 0.91+
fed | ORGANIZATION | 0.9+
single day | QUANTITY | 0.87+
last decade | DATE | 0.87+
Series D. | OTHER | 0.87+
next 10 years | DATE | 0.85+
Alation | PERSON | 0.8+
Elation | ORGANIZATION | 0.8+
Asia-Pacific | LOCATION | 0.79+
double | QUANTITY | 0.78+
last five quarters | DATE | 0.76+
2.5 liquidation | QUANTITY | 0.75+
theCUBE | ORGANIZATION | 0.74+
Salesforce | ORGANIZATION | 0.73+
recent year | DATE | 0.72+
Thoma Bravo | PERSON | 0.69+
Snowflake | TITLE | 0.66+
t | DATE | 0.65+
Cube | ORGANIZATION | 0.53+
more | QUANTITY | 0.5+
data | QUANTITY | 0.49+

Mitesh Shah, Alation & Ash Naseer, Warner Bros Discovery | Snowflake Summit 2022


 

(upbeat music) >> Welcome back to theCUBE's continuing coverage of Snowflake Summit '22 live from Caesars Forum in Las Vegas. I'm Lisa Martin, my cohost Dave Vellante, we've been here the last day and a half unpacking a lot of news, a lot of announcements, talking with customers and partners, and we have another great session coming for you next. We've got a customer and a partner talking tech and data mesh. Please welcome Mitesh Shah, VP in market strategy at Alation. >> Great to be here. >> And Ash Naseer, great to have you, senior director of data engineering at Warner Brothers Discovery. Welcome guys. >> Thank you for having me. >> It's great to be back in person and to be able to really get to see and feel and touch this technology, isn't it? >> Yeah, it is. I mean two years or so. Yeah. Great to feel the energy in the conference center. >> Yeah. >> Snowflake was virtual, I think for two years and now it's great to kind of see the excitement firsthand. So it's wonderful. >> The excitement, but also the boom and the number of customers and partners and people attending. They were saying the first, or the summit in 2019 had about 1900 attendees. And this is around 10,000. So a huge jump in a short time period. Talk a little bit about the Alation-Snowflake partnership and probably some of the acceleration that you guys have been experiencing as a Snowflake partner. >> Yeah. As a Snowflake partner. I mean, Snowflake is an investor of us in Alation early last year, and we've been a partner for, for longer than that. And good news. We have been awarded Snowflake partner of the year for data governance, just earlier this week. And that's in fact, our second year in a row for winning that award. So, great news on that front as well. >> Repeat, congratulations. >> Repeat. Absolutely. And we're going to hope to make it a three-peat as well. 
And we've also been awarded industry competency badges in five different industries, those being financial services, healthcare, retail, technology, and media and telecom. >> Excellent. Okay. Going to get right into it. Data mesh. You guys actually have a data mesh and you've presented at the conference. So, take us back to the beginning. Why did you decide that you needed to implement something like data mesh? What was the impetus? >> Yeah. So when people think of Warner Brothers, you always think of like the movie studio, but we're more than that, right? I mean, you think of HBO, you think of TNT, you think of CNN, we have 30 plus brands in our portfolio and each have their own needs. So the idea of a data mesh really helps us because what we can do is we can federate access across the company so that, you know, CNN can work at their own pace. You know, when there's election season, they can ingest their own data and they don't have to, you know, bump up against as an example, HBO, if Game of Thrones is going on. >> So, okay. So the, the impetus was to serve those lines of business better. Actually, given that you've got these different brands, it was probably easier than most companies. Cause if you're, let's say you're a big financial services company, and now you have to decide who owns what. CNN owns its own data products, HBO. Now, do they decide within those different brands, how to distribute even further? Or is it really, how deep have you gone in that decentralization? >> That's a great question. It's a very close partnership, because there are a number of data sets, which are used by all the brands, right? You think about people browsing websites, right? You know, CNN has a website, Warner Brothers has a website. So for us to ingest that data for each of the brands to ingest that data separately, that means five different ways of doing things and you know, a big environment, right? So that is where our team comes into play. 
We ingest a lot of the common data sets, but like I said, any unique data sets, data sets regarding theatrical as an example, you know, Warner Brothers does it themselves. You know, for streaming, HBO Max does it themselves. So we kind of operate in partnership. >> So you have a centralized data team and also decentralized data teams, right? >> That's right. >> So I love this conversation because that was heresy 10 years ago, five years ago even, 'cause that's inefficient. But I presume you've found that it's actually more productive in terms of the business output. Explain that dynamic. >> You know, you bring up such a good point. So, you know, I consider myself one of the dinosaurs who started, like, 20-plus years ago in this industry. And back then, we were all taught to think of the data warehouse as, like, a monolithic thing. And the reason for that is the technology wasn't there. The technology didn't catch up. Now, 20 years later, the technology is way ahead, right? But our mindset's still the same, because we think of data warehouses and data platforms still as a monolithic thing. But if you really sort of remove that mental barrier, if you will, and if you start thinking about, well, how do I sort of, you know, federate everything, and make sure that you let folks who are closest to the customer, or are building their products, own that data and have a partnership, the results have been amazing. And if we were only sort of doing it as a centralized team, we would not be able to do a 10th of what we do today. So it's that massive scale in our company as well. >> And I should have clarified, when we talk about data mesh, are we talking about implementing in practice the Dehghani sort of framework, or is this sort of your own terminology? >> Well, so the interesting part is four years ago, we didn't have- >> It didn't exist. >> Yeah. It didn't exist.
And so our principle was very simple, right? When we started out, we said, we want to make sure that our brands are able to operate independently with some oversight and guidance from our technology teams, right? That's what we set out to do. We did that with Snowflake by design, because Snowflake allows us to, you know, separate those brands into different accounts. So that was done by design. And then the magic, I think, is the Snowflake data sharing, which allows us to sort of bring data in here once and then share it with whoever needs it. So think about HBO Max. On HBO Max, you not only have HBO Max content, but content from CNN, from Cartoon Network, from Warner Brothers, right? All the movies, right? So to see how The Batman movie did in theaters and then on streaming, you know, Warner Brothers doesn't need to ingest the same streaming data. HBO Max does it. HBO Max shares it with Warner Brothers. You know, store once, share many times, and everyone works at their own pace. >> So they're building data products. Those data products are discoverable, through APIs, I presume, or I guess maybe just the Snowflake cloud, but very importantly, they're governed. And that's where Alation comes in, correct? >> That's precisely where Alation comes in, as sort of this central, flexible foundation for data governance. You know, you mentioned data mesh. I think what's interesting is that it's really an answer to the bottlenecks created by centralized IT, right? There's this notion of decentralizing the data engineers and making the data domain owners, the people that know the data the best, have them be in control of publishing the data to the data consumers. There are other popular concepts actually happening right now, as we speak, around modern data stack, around data fabric, that are also in many ways underpinned by this notion of decentralization, right?
These are concepts that are underpinned by decentralization, and as the pendulum swings sort of between decentralization and centralization, as we go back and forth in the world of IT and data, there are certain constants that need to be centralized over time. And one of those, I believe, is very much a centralized platform for data governance. And that's certainly, I think, where we come in. Would love to hear more about how you use Alation. >> Yeah. So, I mean, Alation helps us sort of, as you guys say, map, the treasure map of the data, right? So for consumers to find where their data is, that's where Alation helps us. It helps us with the data cataloging, you know, storing all the metadata, and, you know, users can go in, they can sort of find the data that they need, and they can also find how others are using data. So there's a little bit of a crowdsourcing aspect that Alation helps us to do, whereby, you know, you can see, okay, my peer in the other group, well, that's how they use this piece of data. So I'm not going to spend hours trying to figure this out. You're going to use the query that they use. So yeah. >> So you have a master catalog, I presume. And then each of the brands has their own sub-catalogs, is that correct? >> Well, for the most part, we have that master catalog and then the brands sort of use it, you know, separately themselves. The key here is that catalog isn't maintained by a centralized group as well, right? It's, again, maintained by the individual teams, and not only by the individual teams, but by the folks that are responsible for the data, right? So I talked about the concept of crowdsourcing: whoever sort of puts the data in has to make sure that they update the catalog and make sure that the definitions are there and everything's sort of in line. >> So HBO, CNN, each have their own sort of access to their catalog, but they feed into the master catalog.
Is that the right way to think about it? >> Yeah. >> Okay. And they have their own virtual data warehouses, right? They have ownership over that? They can spin 'em up, spin 'em down as they see fit, right? And they're governed. >> They're governed. And what's interesting is it's not just governed, right? Governance is a big word. It's a bit nebulous, but what's really being enabled here is this notion of self-service as well, right? There's two big sort of rockets that need to happen at the same time in any given organization. There's this notion that you want to put trustworthy data in the hands of data consumers, while at the same time mitigating risk. And that's precisely what Alation does. >> So I want to clarify this for the audience. So there's four principles of data mesh. This came after you guys did it. And I wonder how it aligns. Domain ownership: give data, as you were saying, to the domain owners who have context. Data as product: you guys are building data products. And that creates two problems. How do you give people self-service infrastructure, and how do you automate governance? So the first two, great. But then it creates these other problems. Does that align with your philosophy? Where's the alignment? What's different? >> Yeah. Data products is exactly where we're going. And that domain-based design, that's really key as well. In our business, you think about who the customer is, as an example, right? Depending on who you ask, the answer might be different. You know, to the movie business, it's probably going to be the person who watches a movie in a theater. To the streaming business, to HBO Max, it's the streamer, right? To others, it's someone watching live CNN on their TV, right? There's yet another group. Think about all the franchising we do. So you see Batman action figures and T-shirts and Warner Brothers branded stuff in stores. That's yet another business unit.
But at the end of the day, it's not a different person, it's you and me, right? We do all these things. So the domain concept makes sure that you ingest data and you bring data relevant to the context, however, not sort of making it so stringent that it cannot integrate, and then you integrate it at a higher level to create that 360. >> And it's discoverable. So the point is, I don't have to go tap Ash on the shoulder and say, how do I get this data? Is it governed? Do I have access to it? Give me the rules of it. I just go grab it, right? And the system computationally automates whether or not I have access to it. And it's, as you say, self-service. >> In this case, exactly right. It enables people to just search for data and know, when they find the data, whether it's trustworthy or not, through trust flags and the like. It's doing both of those things at the same time. >> How is it an enabler of solving some of the big challenges that the media and entertainment industry is going through? We've seen so much change the last couple of years. The rising consumer expectations aren't going to go back down. They're only going to go up. We want you to serve us up content that's relevant, that's personalized, that makes sense. I'd love to understand from your perspective, Mitesh, from an industry challenges perspective, how does this technology help customers like Warner Brothers Discovery meet business customers where they are and reduce the volume on those challenges? >> It's a great question. And as I mentioned earlier, we had five industry competency badges that were awarded to us by Snowflake. And one of those was for media and telecom. And the reason for that is we're helping media companies understand their audiences better, and ultimately serve up better experiences for their audiences. But we've got Ash right here, who can tell us how that's happening in practice. >> Yeah, tell us. >> So I'll share a story. I always like to tell stories, right?
Once upon a time, before we had Alation in place, it was like, who you knew was how you got access to the data. So if I knew you and I knew you had access to a certain kind of data, and your access to the right kind of data was based on the network you had at the company- >> I had to trust you. >> Yeah. >> I might not want to give up my data. >> That's it. And so that's where Alation sort of helps us democratize it, but, you know, puts the governance and controls in place, right? There are certain sensitive things as well, such as viewership, such as subscriber accounts, which are very important. So making sure that the right people have access to it, that's the other problem that Alation helps us solve. >> That's precisely part of our integration with Snowflake in particular: being able to define and manage policies within Alation, saying, you know, certain people should have access to certain rows, doing column-level masking. And having those policies actually enforced at the Snowflake data layer is precisely part of our value proposition. >> And that's automated. >> And all that's automated. Exactly. >> Right. So I don't have to think about it. I don't have to go through the tap on the shoulder. What has been the impact, Ash, on data quality as you've pushed it down into the domains? >> That's a great question. So it has definitely improved, but data quality is a very interesting subject, because back to my example of, you know, when we started doing things, the centralized IT team always said, well, it has to be like this, right? And if it doesn't fit in this, then it's bad quality. Well, sometimes context changes. Businesses change, right? You have to be able to react to it quickly. So making sure that a lot of that quality is managed at the decentralized level, at the place where you have that business context, that ensures you have the most up-to-date quality. We're talking about the media industry changing so quickly.
I mean, would we have thought three years ago that people would watch a lot of these major movies on streaming services? But here's the reality, right? You have to react, and, you know, having it at that level just helps you react faster. >> So if I play that back, data quality is not a static framework. It's flexible based on the business context, and the business owners can make those adjustments, 'cause they own the data. >> That's it. That's exactly it. >> That's awesome. Wow. That's amazing progress that you guys have made. >> On quality, if I could just add, it also just changes depending on where you are in your data pipeline stage, right? Data quality, data observability, this is a very fast-evolving space at the moment, and if I look to my left right now, I bet you I can probably see a half-dozen quality and observability vendors. And so given that, and given the fact that Alation still is sort of a central hub to find trustworthy data, we've actually announced an open data quality initiative, allowing for best-of-breed data quality vendors to integrate with the platform. So whoever they are, whatever tool folks want to use, they can use that particular tool of choice. >> And this all runs in the cloud, or is it a hybrid sort of? >> Everything is in the cloud. We're all in the cloud. And, you know, again, it helps us go faster. >> Let me ask you a question. I could go on forever on this topic. One of the concepts that was put forth is that whether it's a Snowflake data warehouse or a Databricks data lake or an Oracle data warehouse, they should all be inclusive. They should just be a node on the mesh. Like, wow, that sounds good. But I haven't seen it yet, right? I'm guessing that Snowflake and Alation enable all the self-serve, all this automated governance, and that including those other items, it's got to be a one-off at this point in time.
Do you ever see yourselves expanding that scope, or is it better off to just kind of leave it in the Snowflake Data Cloud? >> It's a good question. You know, I feel like where we're at today, especially in terms of sort of technology giving us so many options, I don't think there's a one-size-fits-all, right? Even though we are very heavily invested in Snowflake and we use Snowflake consistently across the organization, you could, theoretically, have an architecture that blends those two, right? Have different types of data platforms, like a Teradata or an Oracle, and sort of bring it all together. Today we have the technology, you know, that and all sorts of things that can make sure that you query on different databases. So I don't think the technology is the problem. I think it's the organizational mindset. I think that that's what gets in the way. >> Oh, interesting. So I was going to ask you, will hybrid tables help you solve that problem? And maybe not. What you're saying is, it's the organization that owns the Oracle database saying, hey, we have our system, it processes, it works, you know, go away. >> Yeah. Well, you know, hybrid tables, I think, is a great sort of next step in Snowflake's evolution. In my opinion, I think it's a game changer. But yeah, I mean, they can still exist. You could do hybrid tables right on Snowflake, or you could, you know, kind of coexist as well. >> Yeah. But, do you have a thought on this? >> Yeah, I do. I mean, we're always going to live in a time where you've got data distributed throughout the organization and around the globe. And even if you're all in on Snowflake, you could have data in Snowflake here, you could have data in Snowflake in EMEA, in Europe somewhere. It could be anywhere. By the same token, you might be using, every organization is using, on-premises systems. They naturally have data everywhere.
And so, you know, one solution to this is really centralizing, as I mentioned, not just governance, but also metadata about all of the data in your organization, so that you can enable people to search and find and discover trustworthy data no matter where it is in your organization. >> Yeah. That's a great point. I mean, if you have the data about the data, then you can treat these as independent nodes. That's just that, right? And maybe there's some advantages of putting it all in the Snowflake cloud, but to your point, organizationally, that's just not feasible. Unfortunately, sorry, Snowflake, all the world's data is not going to go into Snowflake, but they play a key role in accelerating, from what I'm hearing, your vision of data mesh. >> Yeah, absolutely. I think going forward, we have to stop thinking about data platforms as just one place where you sort of dump all the data. That's where the mesh concept comes in. It is going to be a mesh. It's going to be distributed, and organizations have to be okay with that. And they have to embrace the tools. I mean, you know, Facebook developed a tool called Presto many years ago that helps them solve exactly the same problem. So I think the technology is there. I think the organizational mindset needs to evolve. >> Yeah. Definitely. >> Culture. Culture is one of the hardest things to change. >> Exactly. >> Guys, this was a masterclass in data mesh, I think. Thank you so much for coming on, talking. >> We appreciate it. Thank you so much. >> Of course. What Alation is doing with Snowflake and with Warner Brothers Discovery. Keep that content coming. I got a lot of stuff I got to catch up on watching. >> Sounds good. Thank you for having us. >> Thanks, guys. >> Thanks, you guys. >> For Dave Vellante, I'm Lisa Martin. You're watching theCUBE, live from Snowflake Summit '22. We'll be back after a short break. (upbeat music)
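
The policy enforcement described in the conversation (certain people seeing only certain rows, with column-level masking on sensitive fields such as subscriber numbers) can be pictured with a small sketch. This is only an illustration of the idea, not how Alation or Snowflake implement it: the `TablePolicy` class, the `apply_policy` function, the role names, and the sample rows are all hypothetical and made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TablePolicy:
    """Hypothetical policy for one shared table (not an Alation or Snowflake API)."""
    owner_column: str           # column that records which brand owns a row
    masked_columns: frozenset   # sensitive columns, e.g. subscriber numbers
    unmasked_roles: frozenset   # roles allowed to read sensitive columns in clear


def apply_policy(rows, policy, role, brand):
    """Return the rows `brand` may see, masking sensitive columns unless `role` is exempt."""
    visible = []
    for row in rows:
        # Row-level rule: a consumer only sees rows owned by their own brand.
        if row[policy.owner_column] != brand:
            continue
        shown = dict(row)  # copy so the stored data is never modified
        # Column-level rule: mask sensitive values for roles that are not exempt.
        if role not in policy.unmasked_roles:
            for column in policy.masked_columns:
                if column in shown:
                    shown[column] = "***"
        visible.append(shown)
    return visible


# Made-up sample data, for illustration only.
viewership = [
    {"brand": "CNN", "title": "Election Night", "subscribers": 1200},
    {"brand": "HBO Max", "title": "The Batman", "subscribers": 3400},
]
policy = TablePolicy(
    owner_column="brand",
    masked_columns=frozenset({"subscribers"}),
    unmasked_roles=frozenset({"finance_analyst"}),
)

print(apply_policy(viewership, policy, role="marketing", brand="CNN"))
print(apply_policy(viewership, policy, role="finance_analyst", brand="HBO Max"))
```

In this sketch the policy is defined once and applied wherever the data is read, which mirrors the point made in the interview that consumers do not have to ask anyone for access rules. In a real deployment the rules would be enforced by the data platform itself rather than in application code.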

Published Date : Jun 30 2022


SENTIMENT ANALYSIS :

ENTITIES

Entity                       Category       Confidence
Dave Vellante                PERSON         0.99+
Lisa Martin                  PERSON         0.99+
CNN                          ORGANIZATION   0.99+
HBO                          ORGANIZATION   0.99+
Mitesh Shah                  PERSON         0.99+
Ash Naseer                   PERSON         0.99+
Europe                       LOCATION       0.99+
Facebook                     ORGANIZATION   0.99+
Mitesh                       PERSON         0.99+
Elation                      ORGANIZATION   0.99+
TNT                          ORGANIZATION   0.99+
Warner brothers              ORGANIZATION   0.99+
EMEA                         LOCATION       0.99+
second year                  QUANTITY       0.99+
Oracle                       ORGANIZATION   0.99+
2019                         DATE           0.99+
two years                    QUANTITY       0.99+
one                          QUANTITY       0.99+
Cartoon Network              ORGANIZATION   0.99+
Game of Thrones              TITLE          0.99+
two problems                 QUANTITY       0.99+
two                          QUANTITY       0.99+
Warner Brothers              ORGANIZATION   0.99+
10th                         QUANTITY       0.99+
first                        QUANTITY       0.99+
Snowflake                    ORGANIZATION   0.99+
Snowflake Summit '22         EVENT          0.99+
Warner brothers              ORGANIZATION   0.99+
each                         QUANTITY       0.99+
four                         QUANTITY       0.99+
Las Vegas                    LOCATION       0.99+
Median Telcom                ORGANIZATION   0.99+
20 years later               DATE           0.98+
both                         QUANTITY       0.98+
five different industries    QUANTITY       0.98+
10 years ago                 DATE           0.98+
30 plus brands               QUANTITY       0.98+
Alation                      PERSON         0.98+
four years ago               DATE           0.98+
today                        DATE           0.98+
20 plus years ago            DATE           0.97+
Warner Brothers Discovery    ORGANIZATION   0.97+
One                          QUANTITY       0.97+
five years ago               DATE           0.97+
Snowflake Summit 2022        EVENT          0.97+
three years ago              DATE           0.97+
five different ways          QUANTITY       0.96+
earlier this week            DATE           0.96+
Snowflake                    TITLE          0.96+
Max                          TITLE          0.96+
early last year              DATE           0.95+
about 1900 attendees         QUANTITY       0.95+
Snowflake                    EVENT          0.94+
Ash                          PERSON         0.94+
three-peat                   QUANTITY       0.94+
around 10,000                QUANTITY       0.93+

Mitesh Shah, Alation & Ash Naseer, Warner Bros Discovery | Snowflake Summit 2022


 

(upbeat music) >> Welcome back to theCUBE's continuing coverage of Snowflake Summit '22 live from Caesar's Forum in Las Vegas. I'm Lisa Martin, my cohost Dave Vellante, we've been here the last day and a half unpacking a lot of news, a lot of announcements, talking with customers and partners, and we have another great session coming for you next. We've got a customer and a partner talking tech and data mash. Please welcome Mitesh Shah, VP in market strategy at Elation. >> Great to be here. >> and Ash Naseer great, to have you, senior director of data engineering at Warner Brothers Discovery. Welcome guys. >> Thank you for having me. >> It's great to be back in person and to be able to really get to see and feel and touch this technology, isn't it? >> Yeah, it is. I mean two years or so. Yeah. Great to feel the energy in the conference center. >> Yeah. >> Snowflake was virtual, I think for two years and now it's great to kind of see the excitement firsthand. So it's wonderful. >> Th excitement, but also the boom and the number of customers and partners and people attending. They were saying the first, or the summit in 2019 had about 1900 attendees. And this is around 10,000. So a huge jump in a short time period. Talk a little bit about the Elation-Snowflake partnership and probably some of the acceleration that you guys have been experiencing as a Snowflake partner. >> Yeah. As a snowflake partner. I mean, Snowflake is an investor of us in Elation early last year, and we've been a partner for, for longer than that. And good news. We have been awarded Snowflake partner of the year for data governance, just earlier this week. And that's in fact, our second year in a row for winning that award. So, great news on that front as well. >> Repeat, congratulations. >> Repeat. Absolutely. And we're going to hope to make it a three-peat as well. 
And we've also been awarded industry competency badges in five different industries, those being financial services, healthcare, retail technology, and Median Telcom. >> Excellent. Okay. Going to right get into it. Data mesh. You guys actually have a data mesh and you've presented at the conference. So, take us back to the beginning. Why did you decide that you needed to implement something like data mesh? What was the impetus? >> Yeah. So when people think of Warner brothers, you always think of like the movie studio, but we're more than that, right? I mean, you think of HBO, you think of TNT, you think of CNN, we have 30 plus brands in our portfolio and each have their own needs. So the idea of a data mesh really helps us because what we can do is we can federate access across the company so that, you know, CNN can work at their own pace. You know, when there's election season, they can ingest their own data and they don't have to, you know, bump up against as an example, HBO, if Game of Thrones is going on. >> So, okay. So the, the impetus was to serve those lines of business better. Actually, given that you've got these different brands, it was probably easier than most companies. Cause if you're, let's say you're a big financial services company, and now you have to decide who owns what. CNN owns its own data products, HBO. Now, do they decide within those different brands, how to distribute even further? Or is it really, how deep have you gone in that decentralization? >> That's a great question. It's a very close partnership, because there are a number of data sets, which are used by all the brands, right? You think about people browsing websites, right? You know, CNN has a website, Warner brothers has a website. So for us to ingest that data for each of the brands to ingest that data separately, that means five different ways of doing things and you know, a big environment, right? So that is where our team comes into play. 
We ingest a lot of the common data sets, but like I said, any unique data sets, data sets regarding theatrical as an example, you know, Warner brothers does it themselves, you know, for streaming, HBO Max, does it themselves. So we kind of operate in partnership. >> So do you have a centralized data team and also decentralized data teams, right? >> That's right. >> So I love this conversation because that was heresy 10 years ago, five years ago, even, cause that's inefficient. But you've, I presume you've found that it's actually more productive in terms of the business output, explain that dynamic. >> You know, you bring up such a good point. So I, you know, I consider myself as one of the dinosaurs who started like 20 plus years ago in this industry. And back then, we were all taught to think of the data warehouse as like a monolithic thing. And the reason for that is the technology wasn't there. The technology didn't catch up. Now, 20 years later, the technology is way ahead, right? But like, our mindset's still the same because we think of data warehouses and data platforms still as a monolithic thing. But if you really sort of remove that sort of mental barrier, if you will, and if you start thinking about, well, how do I sort of, you know, federate everything and make sure that you let folks who are building, or are closest to the customer or are building their products, let them own that data and have a partnership. The results have been amazing. And if we were only sort of doing it as a centralized team, we would not be able to do a 10th of what we do today. So it's that massive scale in, in our company as well. >> And I should have clarified, when we talk about data mesh are we talking about the implementing in practice, the octagon sort of framework, or is this sort of your own sort of terminology? >> Well, so the interesting part is four years ago, we didn't have- >> It didn't exist. >> Yeah. It didn't exist. 
And, and so we, our principle was very simple, right? When we started out, we said, we want to make sure that our brands are able to operate independently with some oversight and guidance from our technology teams, right? That's what we set out to do. We did that with Snowflake by design because Snowflake allows us to, you know, separate those, those brands into different accounts. So that was done by design. And then the, the magic, I think, is the Snowflake data sharing where, which allows us to sort of bring data in here once, and then share it with whoever needs it. So think about HBO Max. On HBO Max, You not only have HBO Max content, but content from CNN, from Cartoon Network, from Warner Brothers, right? All the movies, right? So to see how The Batman movie did in theaters and then on streaming, you don't need, you know, Warner brothers doesn't need to ingest the same streaming data. HBO Max does it. HBO Max shares it with Warner brothers, you know, store once, share many times, and everyone works at their own pace. >> So they're building data products. Those data products are discoverable APIs, I presume, or I guess maybe just, I guess the Snowflake cloud, but very importantly, they're governed. And that's correct, where Elation comes in? >> That's precisely where Elation comes in, is where sort of this central flexible foundation for data governance. You know, you mentioned data mesh. I think what's interesting is that it's really an answer to the bottlenecks created by centralized IT, right? There's this notion of decentralizing that the data engineers and making the data domain owners, the people that know the data the best, have them be in control of publishing the data to the data consumers. There are other popular concepts actually happening right now, as we speak, around modern data stack. Around data fabric that are also in many ways underpinned by this notion of decentralization, right? 
These are concepts that are underpinned by decentralization and as the pendulum swings, sort of between decentralization and centralization, as we go back and forth in the world of IT and data, there are certain constants that need to be centralized over time. And one of those I believe is very much a centralized platform for data governance. And that's certainly, I think where we come in. Would love to hear more about how you use Elation. >> Yeah. So, I mean, elation helps us sort of, as you guys say, sort of, map, the treasure map of the data, right? So for consumers to find where their data is, that's where Elation helps us. It helps us with the data cataloging, you know, storing all the metadata and, you know, users can go in, they can sort of find, you know, the data that they need and they can also find how others are using data. So it's, there's a little bit of a crowdsourcing aspect that Elation helps us to do whereby you know, you can see, okay, my peer in the other group, well, that's how they use this piece of data. So I'm not going to spend hours trying to figure this out. You're going to use the query that they use. So yeah. >> So you have a master catalog, I presume. And then each of the brands has their own sub catalogs, is that correct? >> Well, for the most part, we have that master catalog and then the brands sort of use it, you know, separately themselves. The key here is all that catalog, that catalog isn't maintained by a centralized group as well, right? It's again, maintained by the individual teams and not only in the individual teams, but the folks that are responsible for the data, right? So I talked about the concept of crowdsourcing, whoever sort of puts the data in, has to make sure that they update the catalog and make sure that the definitions are there and everything sort of in line. >> So HBO, CNN, and each have their own, sort of access to their catalog, but they feed into the master catalog. 
Is that the right way to think about it? >> Yeah. >> Okay. And they have their own virtual data warehouses, right? They have ownership over that? They can spin 'em up, spin 'em down as they see fit? Right? And they're governed. >> They're governed. And what's interesting is it's not just governed, right? Governance is a, is a big word. It's a bit nebulous, but what's really being enabled here is this notion of self-service as well, right? There's two big sort of rockets that need to happen at the same time in any given organization. There's this notion that you want to put trustworthy data in the hands of data consumers, while at the same time mitigating risk. And that's precisely what Elation does. >> So I want to clarify this for the audience. So there's four principles of database. This came after you guys did it. And I wonder how it aligns. Domain ownership, give data, as you were saying to the, to the domain owners who have context, data as product, you guys are building data products, and that creates two problems. How do you give people self-service infrastructure and how do you automate governance? So the first two, great. But then it creates these other problems. Does that align with your philosophy? Where's alignment? What's different? >> Yeah. Data products is exactly where we're going. And that sort of, that domain based design, that's really key as well. In our business, you think about who the customer is, as an example, right? Depending on who you ask, it's going to be, the answer might be different, you know, to the movie business, it's probably going to be the person who watches a movie in a theater. To the streaming business, to HBO Max, it's the streamer, right? To others, someone watching live CNN on their TV, right? There's yet another group. Think about all the franchising we do. So you see Batman action figures and T-shirts, and Warner brothers branded stuff in stores, that's yet another business unit. 
But at the end of the day, it's not a different person, it's you and me, right? We do all these things. So the domain concept makes sure that you ingest data and you bring data relevant to the context, however, not sort of making it so stringent where it cannot integrate, and then you integrate it at a higher level to create that 360. >> And it's discoverable. So the point is, I don't have to go tap Ash on the shoulder, say, how do I get this data? Is it governed? Do I have access to it? Give me the rules of it. Just, I go grab it, right? And the system computationally automates whether or not I have access to it. And it's, as you say, self-service. >> In this case, exactly right. It enables people to just search for data and know, when they find the data, whether it's trustworthy or not, through trust flags and the like. It's doing both of those things at the same time. >> How is it an enabler of solving some of the big challenges that the media and entertainment industry is going through? We've seen so much change the last couple of years. The rising consumer expectations aren't going to go back down. They're only going to come up. We want you to serve us up content that's relevant, that's personalized, that makes sense. I'd love to understand from your perspective, Mitesh, from an industry challenges perspective, how does this technology help customers like Warner Brothers Discovery meet business customers where they are and reduce the volume on those challenges? >> It's a great question. And as I mentioned earlier, we had five industry competency badges that were awarded to us by Snowflake. And one of those is for Media and Telecom. And the reason for that is we're helping media companies understand their audiences better, and ultimately serve up better experiences for their audiences. But we've got Ash right here that can tell us how that's happening in practice. >> Yeah, tell us. >> So I'll share a story. I always like to tell stories, right? 
Once upon a time, before we had Alation in place, it was like, who you knew was how you got access to the data. So if I knew you and I knew you had access to a certain kind of data, and your access to the right kind of data was based on the network you had at the company- >> I had to trust you. >> Yeah. >> I might not want to give up my data. >> That's it. And so that's where Alation sort of helps us democratize it, but, you know, puts the governance and controls, right? There are certain sensitive things as well, such as viewership, such as subscriber accounts, which are very important. So making sure that the right people have access to it, that's the other problem that Alation helps us solve. >> That's precisely part of our integration with Snowflake in particular, being able to define and manage policies within Alation. Saying, you know, certain people should have access to certain rows, doing column level masking. And having those policies actually enforced at the Snowflake data layer is precisely part of our value proposition. >> And that's automated. >> And all that's automated. Exactly. >> Right. So I don't have to think about it. I don't have to go through the tap on their shoulder. What has been the impact, Ash, on data quality as you've pushed it down into the domains? >> That's a great question. So it has definitely improved, but data quality is a very interesting subject, because back to my example of, you know, when we started doing things, we, you know, the centralized IT team always said, well, it has to be like this, right? And if it doesn't fit in this, then it's bad quality. Well, sometimes context changes. Businesses change, right? You have to be able to react to it quickly. So making sure that a lot of that quality is managed at the decentralized level, at the place where you have that business context, that ensures you have the most up to date quality. We're talking about media industry changing so quickly. 
I mean, would we have thought three years ago that people would watch a lot of these major movies on streaming services? But here's the reality, right? You have to react and, you know, having it at that level just helps you react faster. >> So data, if I play that back, data quality is not a static framework. It's flexible based on the business context and the business owners can make those adjustments, 'cause they own the data. >> That's it. That's exactly it. >> That's awesome. Wow. That's amazing progress that you guys have made. >> In quality, if I could just add, it also just changes depending on where you are in your data pipeline stage, right? Data quality, data observability, this is a very fast evolving space at the moment, and if I look to my left right now, I bet you I can probably see a half-dozen quality observability vendors right now. And so given that, and given the fact that Alation still is sort of a central hub to find trustworthy data, we've actually announced an open data quality initiative, allowing for best-of-breed data quality vendors to integrate with the platform. So whoever they are, whatever tool folks want to use, they can use that particular tool of choice. >> And this all runs in the cloud, or is it a hybrid sort of? >> Everything is in the cloud. We're all in the cloud. And you know, again, helps us go faster. >> Let me ask you a question. I could go on forever in this topic. One of the concepts that was put forth is whether it's a Snowflake data warehouse or a Databricks data lake, or an Oracle data warehouse, they should all be inclusive. They should just be a node on the mesh. Like, wow, that sounds good. But I haven't seen it yet. Right? I'm guessing that Snowflake and Alation enable all the self-serve, all this automated governance, and that including those other items, it's got to be a one-off at this point in time. 
Do you ever see yourself expanding that scope or is it better off to just kind of leave it in the Snowflake data cloud? >> It's a good question. You know, I feel like where we're at today, especially in terms of sort of technology giving us so many options, I don't think there's a one size fits all. Right? Even though we are very heavily invested in Snowflake and we use Snowflake consistently across the organization, you could, theoretically, have an architecture that blends those two, right? Have different types of data platforms like a Teradata or an Oracle and sort of bring it all together. Today, we have the technology, you know, that and all sorts of things that can make sure that you query on different databases. So I don't think the technology is the problem, I think it's the organizational mindset. I think that that's what gets in the way. >> Oh, interesting. So I was going to ask you, will hybrid tables help you solve that problem? And, maybe not, what you're saying, it's the organization that owns the Oracle database saying, Hey, we have our system. It processes, it works, you know, go away. >> Yeah. Well, you know, hybrid tables I think, is a great sort of next step in Snowflake's evolution. In my opinion, I think it's a game changer, but yeah. I mean, they can still exist. You could do hybrid tables right on Snowflake, or you could, you know, you could kind of coexist as well. >> Yeah. But, do you have a thought on this? >> Yeah, I do. I mean, we're always going to live in a time where you've got data distributed throughout the organization and around the globe. And that could be even if you're all in on Snowflake, you could have data in Snowflake here, you could have data in Snowflake in EMEA and Europe somewhere. It could be anywhere. By the same token, you might be using, every organization is using, on-premises systems. They have data, they naturally have data everywhere. 
And so, you know, the one solution to this is really centralizing, as I mentioned, not just governance, but also metadata about all of the data in your organization so that you can enable people to search and find and discover trustworthy data no matter where it is in your organization. >> Yeah. That's a great point. I mean, if you have the data about the data, then you can treat these as independent nodes. That's just that. Right? And maybe there's some advantages of putting it all in the Snowflake cloud, but to your point, organizationally, that's just not feasible. The whole, unfortunately, sorry, Snowflake, all the world's data is not going to go into Snowflake, but they play a key role in accelerating, what I'm hearing, your vision of data mesh. >> Yeah, absolutely. I think going forward in the future, we have to stop thinking about data platforms as just one place where you sort of dump all the data. That's where the mesh concept comes in. It is going to be a mesh. It's going to be distributed and organizations have to be okay with that. And they have to embrace the tools. I mean, you know, Facebook developed a tool called Presto many years ago that helps them solve exactly the same problem. So I think the technology is there. I think the organizational mindset needs to evolve. >> Yeah. Definitely. >> Culture. Culture is one of the hardest things to change. >> Exactly. >> Guys, this was a masterclass in data mesh, I think. Thank you so much for coming on talking. >> We appreciate it. Thank you so much. >> Of course. What Alation is doing with Snowflake and with Warner Brothers Discovery, keep that content coming. I got a lot of stuff I got to catch up on watching. >> Sounds good. Thank you for having us. >> Thanks guys. >> Thanks, you guys. >> For Dave Vellante, I'm Lisa Martin. You're watching theCUBE live from Snowflake Summit '22. We'll be back after a short break. (upbeat music)

Published Date : Jun 15 2022


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave Vellante | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
CNN | ORGANIZATION | 0.99+
HBO | ORGANIZATION | 0.99+
Mitesh Shah | PERSON | 0.99+
Ash Naseer | PERSON | 0.99+
Europe | LOCATION | 0.99+
Facebook | ORGANIZATION | 0.99+
Mitesh | PERSON | 0.99+
Elation | ORGANIZATION | 0.99+
TNT | ORGANIZATION | 0.99+
Warner brothers | ORGANIZATION | 0.99+
EMEA | LOCATION | 0.99+
second year | QUANTITY | 0.99+
Oracle | ORGANIZATION | 0.99+
2019 | DATE | 0.99+
two years | QUANTITY | 0.99+
one | QUANTITY | 0.99+
Cartoon Network | ORGANIZATION | 0.99+
Game of Thrones | TITLE | 0.99+
two problems | QUANTITY | 0.99+
two | QUANTITY | 0.99+
Warner Brothers | ORGANIZATION | 0.99+
10th | QUANTITY | 0.99+
first | QUANTITY | 0.99+
Snowflake | ORGANIZATION | 0.99+
Snowflake Summit '22 | EVENT | 0.99+
Warner brothers | ORGANIZATION | 0.99+
each | QUANTITY | 0.99+
four | QUANTITY | 0.99+
Las Vegas | LOCATION | 0.99+
Median Telcom | ORGANIZATION | 0.99+
20 years later | DATE | 0.98+
both | QUANTITY | 0.98+
five different industries | QUANTITY | 0.98+
10 years ago | DATE | 0.98+
30 plus brands | QUANTITY | 0.98+
Alation | PERSON | 0.98+
four years ago | DATE | 0.98+
today | DATE | 0.98+
20 plus years ago | DATE | 0.97+
Warner Brothers Discovery | ORGANIZATION | 0.97+
One | QUANTITY | 0.97+
five years ago | DATE | 0.97+
Snowflake Summit 2022 | EVENT | 0.97+
three years ago | DATE | 0.97+
five different ways | QUANTITY | 0.96+
earlier this week | DATE | 0.96+
Snowflake | TITLE | 0.96+
Max | TITLE | 0.96+
early last year | DATE | 0.95+
about 1900 attendees | QUANTITY | 0.95+
Snowflake | EVENT | 0.94+
Ash | PERSON | 0.94+
three-peat | QUANTITY | 0.94+
around 10,000 | QUANTITY | 0.93+

Satyen Sangani, CEO, Alation


 

(tranquil music) >> Alation was an early pioneer in solving some of the most challenging problems in so-called big data. Founded early last decade, the company's metadata management and data catalog have always been considered leading examples of modern tooling by customers and analysts alike. Governance is one area that customers identified as a requirement to extend their use of Alation's platform. And it became an opportunity for the company to expand its scope and total available market. Alation is doing just that today, announcing new data governance capabilities, and partner integrations that align with the market's direction of supporting federated governance. In other words, a centralized view of policy to accommodate distributed data in this world of an ever expanding data cloud, which we talk about all the time in theCUBE. And with me to discuss these trends and this announcement is Satyen Sangani, who's the CEO and co-founder of Alation. Satyen, welcome back to theCUBE. Good to see you. >> Thank you, Dave. It's great to be back. >> Okay, so you heard my open, please tell us about the patterns that you were seeing in the market and what you were hearing from customers that led you in this direction and then we'll get into the announcement. >> Yeah, so I think there are really two patterns, right? I mean, when we started building this notion of a data catalog, as you said a decade ago, there was this idea that metadata management broadly classified was something that belonged in IT, lived in IT and was essentially managed by IT, right? I always liken it to kind of an inventory management system within a warehouse relative to Amazon.com, which is obviously broadly published for the business. And so, with the idea of bringing all of this data directly to the business and allowing people arbitrarily, depending on their role, to use the data, you know, you saw one trend, which was just this massive shift in how much data was available at any given time. 
I think the other thing that happened was that at the same time, data governance went through a real transitionary phase where there was a lot of demand often spurred by regulations. Whether that's GDPR, CCPA or more recently than that, certainly the Basel accord. And if you think about all of those regulations, people had to get something in place. Now what we ended up finding out was when we were selling into accounts, people would say, well guess what? I've got this data governance thing going on, but nobody's really using it. I built this business glossary, it's been three years. Nothing's been really very effective. And we were never able to get the value and we need to get value because there are so many more people now accessing and using and leveraging the data. And so with that, we started really considering whether or not we needed to build a formal capability in the market. And that's what we're doing today. >> I liked the way you framed that. And if you think back, we were there as you were in the early big data days. And all the talk was about volume, variety and velocity. And those are sort of IT concepts. How do you deal with all these technical challenges? And then the fourth V which you just mentioned was value. And that's where the line of business really comes in. So let's get into the news. What are you announcing today? >> So we're announcing a new application on top of Alation's Catalog platform, which is Alation's data governance application. That application will be released with our 2021.3 release on September 14th. And what's exciting about that is that we are going to now allow customers to discretely and elegantly and quickly consume a new application to get data governance regimes off the ground and initiatives off the ground, much more quickly than they've ever been able to do. This app is really all about time to value. 
It's about allowing customers to be able to consume what they need when they need it in order to be able to get successful governance initiatives going. And so that's what we're trying to deliver. >> So maybe you could talk a little bit about how you think about data governance and specifically your data governance approach. And maybe what's different about Alation's solution. >> Yeah, I think there's a couple of things that are different. I think the first thing that's most critically different is that we move beyond this notion of sort of policy declaration into the world of policy application and policy enforcement, right? I think a lot of data governance regimes basically stand up and say, look you know, it's all about people and then process and then technology. And what we need to do is declare who all the governors are and who all the stewards are. And then we're going to get all our policies in the same place and then the business will follow them. And the reality is people don't change their workflows to go off and arbitrarily follow some data governance policy that they don't know exists, or they don't want to actually have to follow up. And so really what you've got to do is make sure that the policy and the knowledge exist where the data exists. And that's why it's so critical to build governance into the catalog. And so what we're doing here is we're basically saying, look, you could declare policies with a new policy center inside of Alation. Those policies will get automatically created in some cases by integrating with technologies like Snowflake. But beyond that, what we're also doing is we're saying, look, we're going to move into the world of taking those policies and applying them to the data on an automated basis using ML and AI and basically saying that now it doesn't have to be some massive boil the ocean three-year regime to get very little value in a very limited business glossary. 
Rather, all of your data sets, all of your terms can be put into a single place on an automated basis. That's constantly being used by people and constantly being updated by the new systems that are coming online. And that's what's exciting about it. >> So I just want to follow up on that. So if I'm hearing you correctly, it's the humans are in the loop, but it's not the only source of policy, right? The machines are assisting. And in some cases managing end-to-end that policy. Is that right? >> You've got it. I think the biggest challenge with data governance today is that it's basically a little bit like the Golden Gate Bridge. You know, you start painting it and by the time you're done painting it, you've got to go back and start painting it again, because it relies upon people. And there's just too much change in the weather and there's too much traffic and there's just too much going on in the world of data. And frankly in today's world, that's not even an apt analogy because often what happens is midway through, you've got to restart painting from the very beginning because everything's changed. And so there's so much change in the IT landscape that the traditional way of doing data governance just doesn't work. >> Got it, so in reading through the press release, three things kind of stood out. I wonder if we could unpack them, there were multi-cloud, governance and security. And then of course the AI or what I like to call machine intelligence in there. And what you call the people centric approach. So I wonder if we could dig into these and help us understand how they fit together. So thinking about multi-cloud governance, how do you think about that? Why is that so challenging and how are you solving that problem? >> Yeah, well every cloud technology provider has its own set of capabilities and platforms. And often those slight differences are causing differences in how those technologies are adopted. 
And so some teams optimize for certain capabilities and certain infrastructure over others. And that's true even within businesses. And of course, IT teams are also trying to diversify their IT portfolios. And that's another reason to go multi-cloud. So being able to have a governance capability that spans, certainly all of the big three so-called megascalers, but also these new, huge emerging platforms like Snowflake of course and others. Those are really critical capabilities that are important for our customers to be able to get a handle on top of. And so this idea of being cloud agnostic and being able to sort of have a single control plane for all of your policies, for all of your data sets, that's a critical must have in a governance regime today. So that's point number one. >> Okay and then the machine learning piece, the AI, you're obviously injecting that into the application, but maybe tell us what that means both maybe technically and from a business standpoint. >> Yeah, so this idea of a data policy, right? Can be sometimes by operational teams, but basically it's a set of rules around how one should and should not be able to use data, right? And so those are great rules. It could be that people who are in one country shouldn't be able to access the data of another country, very simple rule, right? But how do you actually enforce that? Like you can declare it, but if there is an endpoint on a server that allows you to access the data, the policy is effectively moot. And so what you got to go do is make sure that at the point of leverage or at the point of usage, people know what the policy happens to be. And that's where AI comes in. You can say let's document all the data sets that happen to be domiciled in Korea or in China. And therefore make sure that those are arbitrarily segregated so that when people want to use those datasets, they know that the policy exists and they know that it's been applied to that particular dataset. 
That's somewhere where AI and ML can be super valuable rather than a human being trying to document thousands of databases or tens of thousands of data sets, which is really kind of a (mumbles) exercise. And so, that application of automation is really critical in being able to do governance at the scale that most enterprises have to do it. >> You got it 'cause humans just can't do that at scale. Now what do you mean by people-centric approach? Can you explain that? >> Yeah, often what I find with governance is that there's this heavy notion of how one should deal with the data, right? So often what I find is that there are certain folks who think, oh well, we're going to declare the rules and people are just going to follow them. And if you've ever been, well, a parent, or in some cases seen government operate, you realize that that actually isn't how things work. You have to involve people in how things are run. And if you do that, right? You're going to get a lot more success in how you apply rules and procedures because people will understand that and people know why they exist. And so what we do within this governance regime is we basically say, look, we want to make sure that the people who are using the data, leveraging the data are also the people who are stewarding the data. There shouldn't be a separate role of data steward that is arbitrarily defined off, just because you've been assigned to a job that you never wanted to do. Rather it should be a part of your day job. And it should be something that you do because you really want to do it. And it's a part of your workflow. And so this idea of being people centric is all about how do you engage the analyst, the product managers, the sales operation managers, to document those sales data sets and those product data sets. So that in fact, those people can be the ones who are answering the questions, not somebody off to the side who knows nothing about the data. 
>> Yeah, I think you've talked in previous CUBE interviews about context and that really fits to this discussion. So these capabilities are part of an application, which is what? It's a module onto your existing platform. And it's sort of a single platform, right? I mean, we're not talking bespoke products. Maybe you can talk about that. >> Yeah, that's exactly right. I mean, it's funny because we've evolved and built Alation with a lot of capability. I mean, interestingly we're launching this data governance application but I would say 60% of our almost 300 customers would say they do a form or a significant part of data governance, leveraging Alation. So it's not like we're new to this market. We've been selling in this market for years. What's different though, is that we've talked a lot about the catalog as a platform over the last year. And we think that that's a really important concept because what is a platform? It's a capability that has multiple applications built on top of it, definitionally. And it's also a capability where third party developers can leverage APIs and SDKs to build applications. And thirdly, it has all of the requisite capabilities and content. So that those application developers, whether it's first party from Alation or third party can really build those applications efficiently, elegantly and economically well. And the catalog is a natural platform because it contains all of the knowledge of the datasets. And it has all of the people who might be leveraging data in some fundamental way. And so this idea of building this data governance module allows a very specialized audience of people in governance to be able to leverage the full capabilities of the platform, to be able to do their work faster, easier, much more simply and easily than they ever could have. 
And that's why we're so excited about this launch, because we think it's one example of many applications, whether it's ourselves building it or third parties that could be done so much more elegantly than it previously could have been. Because we have so much knowledge of the data and so much knowledge of how the company operates. >> Irrespective of the underlying cloud platform is what I heard before. >> Irrespective of the underlying cloud platform, because the data as you know, lives everywhere. It's going to live in AWS, it's going to live in Snowflake. It's going to live on-premise inside of an Oracle database. That's not going to be changed. It's going to live in Teradata. It's going to live all over the place. And as a consequence of that, we've got to be able to connect to everything and we've got to be able to know everything. >> Okay, so that leads me to another big part of the announcement, which is the partnership and integration with Snowflake. Talk about how that came about. I mean, why Snowflake? How should customers think about the future of data management in the context of this relationship? Obviously Snowflake talks about the data cloud. I want to understand that better and where you fit. >> Yeah, so interestingly, this partnership like most great partnerships was born in the field. We at the late part of last year had observed with Snowflake that we were in scores of their biggest accounts. And we found that when you found a really, really large Snowflake engagement, often you were going to be complementing that with a reasonable engagement with Alation. And so seeing that pattern as we were going out and raising our funding round at the beginning of this year, we basically found that Snowflake obviously with their Snowflake Ventures Investment arm realized how strategic having a great answer in the governance market happened to be. Now there are other use cases that we do with Snowflake. We can certainly get into those. 
But what we realized was that if you had a huge scale Snowflake engagement, governance was a rate limiter to customers' ability to grow faster. And therefore also Snowflake's ability to grow faster within that account. And so we worked with them to not only develop a partnership but much more critically a roadmap that was really robust. And so we're now starting to deliver on that roadmap and are super excited to share a lot of those capabilities in this release. And so that means that we're automatically ingesting policies and controls from Snowflake into Alation, giving full transparency into both setting and also modifying and understanding those policies for anybody. And so that gives you another control plane through which to be able to manage all of the data inside of your enterprise, irrespective of how many instances of Snowflake you have and irrespective of how many controls you have available to you. >> And again, irrespective of which cloud it runs on. So I want to follow up with that, really interesting, because Snowflake's promise of the data cloud is that it essentially abstracts the underlying complexity of the cloud. And I'm trying to understand, okay, how much of this is vision, how much is real? And it's fine to have a Northstar, but sometimes you get lost in the marketing. And then the other part of the promise, and of course, big value proposition is data sharing. I mean, I think they've nailed that use case, but the challenge when you start sharing data is federated governance. And as well, I think you mentioned Oracle, Teradata, that stuff's not all in the cloud, a lot of that stuff is on-prem and you guys can deal with that as well. So help us sort of square those circles, if you can. Where do you fit into that equation? >> I think, so look, Snowflake is a magical technology in the sense that if you look at the data, I mean, it reveals a very, very clear story of the ability to adopt Snowflake very quickly. 
So any data team within an organization can get up and running with the Snowflake instance with extraordinary speed and capability. Now that means that you could have scores, hundreds of instances of Snowflake within a single institution. And to the extent that you want to be able to govern that data to your point, you've got to have a single control plane through which you can manage all of those various instances. Whether they're combined or merged or completely federated and distinct from each other. Now, the other problem that comes up on governance is also discoverability. If you have all these instances, how do you know what the right hand is doing if the left hand is working independently of it? You need some way to be able to coordinate that effort. And so that idea of discoverability and governance is really the value proposition that Alation brings to the table. And the idea there is that people can then get up and running much more quickly because, hey, what if I want to spin up a Snowflake instance, but there's somebody else, two teams over, who's already solved the problem or has the data that I need? Well, then maybe I don't even need to do that anymore. Or maybe I can build on top of that work to be able to get to an even better outcome even faster. And so that's the sort of one plus one equals three equation that we're trying to build with them. >> So that makes sense and that leads me to one of my favorite topics, which is this burgeoning movement around the concept of a data mesh. In other words, the notion that increasingly organizations are going to push to decentralize their data architectures and at the same time support a centralized policy. What do you think about this trend? How do you see Alation fitting in? >> Yeah, maybe in a different CUBE conversation. 
We can talk a little bit about my sort of stylized history of data, but I've got this basic theory that like everybody started out what sort of this idea of a single source of truth. That was a great term back in the 90s where people were like, look, we just need to build a single source of truth and we can take all of our data and physically land it up in a single place. And when we do that, it's going to all be clean, available and perfect. And we'll get back to the garden of Eden, right? And I think that idea has always been sort of this elusive thing that nobody's ever been able to really accomplish, right? Because in any data environment, what you're going to find is that if people use data, they create more data, right? And so in that world, you know, like that notion of centralization is always going to be fighting this idea of data sprawl. And so this concept of data mesh I think is, you know, there's formal technical definitions. But I'll stick with maybe a very informal one, which is the one that you offered. Which is just sort of this decentralized mode of architecture. You can't have decentralization if nobody knows how to access those different data points, 'cause otherwise they'll just have copies and sprawl and rework. And so you need a catalog and you need centralized policies so that people know what's available to them. And people have some way of being able to get conformed data. Like if you've got data spread out all over the place, how do you know which is the right master? How do you know what's the right customer record? How do you know what's your right chart of accounts? You've got to have services that exist in order to be able to find that stuff and to be able to leverage them consistently. And so, to me the data mesh is really a continuation of this idea, which the catalog really enabled. 
Which is if you can build a single source of reference, not a single source of truth, but a single place where people can find and discover the data, then you can govern a single plane and you can build consistent architectural rules so that different services can exist in a decentralized way without having to sort of bear all the costs of centralization. And I think that's a super exciting trend 'cause it gives power back to people who want to use the data more quickly and efficiently. >> And I think as we were talking about before, it's not about just the IT technical aspects, hey, it works. It's about putting power in the hands of the lines of business. And a big part of the data mesh conversation is around building data products and putting context or putting data in the hands of the people who have the context. And so it seems to me that Alation, okay, so you could have a catalog that is of the line of businesses catalog, but then there's an Uber catalog that sort of rolls up. So you've got full visibility. It seems that you've fit perfectly into that data mesh. And whether it's a data hub, a data warehouse, data lake, I mean, you don't care. I mean, that's just another node that you can help manage. >> That's exactly right. I mean, it's funny because we all look at these market scapes where people see these vendor landscapes of 500 or 800 different data and AI and ML and data architecture vendors. And often I get asked, well, why doesn't somebody come along and like consolidate all this stuff? And the reality is that tools are a reflection of how people think. And when people have different problems and different sets of experiences, they're going to want to use the best tool in order to be able to solve their problem. And so the nice thing about having a mesh architecture is you can use whatever tool you want. You just have to expose your data in a consistent way. 
And if you have a catalog, you can have different teams using different infrastructure, different tools, different fundamental methods of building the software. But as long as they're exposing it in a consistent way, it doesn't matter. You don't necessarily need to care how it's built. You just need to know that you've got good data available to you. And that's exactly what a catalog does. >> Well, at least your catalog. I think with the data mesh, it should be tools that are agnostic. And I think there are certain tools that are. I think you guys started with that principle. Not every data catalog is going to enable that, but I think that is the trend, Satyen. And I think you guys have always fit into that. It's just that I think you were ahead of the time. Hey, we'll give you the last word. Give us the closing thoughts and bring us home. >> Well, I mean that's exactly right. Like, not all catalogs are created equal and certainly not all governance is created equal. And I think most people say these words and think that they're simple to get into. And then it's a death by a thousand cuts. I was literally on the phone with a chief data officer yesterday of a major distributor. And they basically said, look, like we've got sprawl everywhere. We've got data everywhere. We've got it in every type of system. And so having that sophistication turned into something that's actually easy to use is a super hard problem. And it's the one that we're focused on every single day that we wake up and every single night when we go to sleep. And so, that's kind of what we do. And we're here to make governance easy, to make data discovery easy. Those are the things that we hang our hats on. And we're super excited to put this release out 'cause we think it's going to make customers so much more capable of building on top of the problems that they've already solved. And that's what we're here to do. >> Good stuff, Satyen. 
Thanks so much, congratulations on the announcement and great to see you again. >> You too, Dave. Great talking. >> All right, thanks for watching this CUBE conversation. This is Dave Vellante, we'll see you next time. (tranquil music)

Published Date : Sep 14 2021



Satyen Sangani, Alation | CUBE Conversation, June 2021


 

(upbeat music) >> Announcer: From theCUBE studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE Conversation. >> Lisa Martin here with a CUBE Conversation. One of our alumni is joining me. Satyen Sangani, the CEO and Co-Founder of Alation, is back. Satyen, it's great to see you this morning. >> I know, it's so great to see you, especially so soon after we last talked. >> Yeah, we only spoke a couple of months ago when you guys launched the Alation Cloud Service, and now big news: raising 110 million in Series D led by Riverwood Capital, with participation from some new investors, including Snowflake Ventures. Talk to us about this new funding raise. >> Yeah, it's so funny. I mean, we've seen market demand pick up ever since the sort of tail end of last year. And it's just been incredible. And quarter after quarter we keep on hitting and exceeding our numbers and we keep on hiring faster and faster and faster, and it just doesn't seem like it's ever been fast enough. And so we've been aggressive since the beginning of the year, and even actually before that, in spending and taking the company from roughly 275 people at the end of the year to now, by the end of this year, 525 people. So with that kind of growth we definitely wanted to have the capital to carry us through this year and then certainly beyond. And so we went out and raised a round and, obviously, we were able to do that on great terms and to find a phenomenal partner in Riverwood. And so super excited about the outcome. >> Exactly. You saw a lot of demand, as you and I talked about just a couple of months ago, the acceleration of the business during the pandemic. Talk to me about, as you mentioned, the demand has never been higher. Let's talk about the demand for the data intelligence platform and how the funding is going to help. What are some of the things that you're specifically going to do? >> Yeah, so, you know, we're going to grow the business in a pretty balanced way. 
And so from our perspective, that means a couple of things, right? So starting with sales and marketing, we've got just a need for more feet on the street. Everybody understands generally that they've got problems in data governance, data management, data search and discovery, enablement of people around data. These are things that people are now starting to understand, but they don't always necessarily know how to solve the problem in the most efficient and best way. And many of the traditional approaches, that sort of command-and-control, top-down, you know, let's go hire an army of consultants to figure this stuff out, tend to be the first thing that comes to mind. And so building our sales organization is one thing that we're going to do. The second thing that we're going to do is invest in our customer success and customer journey, because everybody's looking for best practice. And last but not least, we're investing in product and R&D. And so we're going to be growing the R&D organization by almost a factor of two, and that's going to be globally. And just being the best in the market means you've got to still solve all these unsolved problems. And we're going to do that. >> Sounds like a tremendous amount of momentum kind of igniting this next era for Alation. When we talk about customers, I love that you're doubling down on customer success. That's absolutely critical. That's why you're in business. But one of the things that we talk about with customers in every industry is being data-driven. And as we see data intelligence emerging as a very, very critical technology investment to enable an enterprise to become more data-driven or actually data-driven, what are some of the things that you're seeing that those customers are saying, Alation, help us with XYZ? >> Yeah, so I think everybody feels like they need to be on this. So let's first of all talk about data intelligence. Like, what is this category? 
So historically there has been these sort of data management categories where the general approach has been let's curate or manage or clean the data in this manual way in order to be able to get good data in front of people so they can start to use it, right. And that data cleaning, that data work that data stewardship has lived often in IT sometimes with very technical people in the business. And it just doesn't scale. There's just too much data out there and there's too much demand for data. So the demand for data is increasing, the supply for data is increasing. So now there's this category of data intelligence. And basically what it's doing it's saying, look all these things that we're talking about machine learning, AI, all of that can be applied to actually the management of data. People can be way more intelligent about how they do this work. They can be more intelligent how they search. They can be more intelligent about how they curate the data. And so what we're seeing is that people are saying, look, I've got so much data. My entire business relies upon data, and now I need you Alation or somebody to help me do this better to do this faster, to do this more efficiently. And all of these really traditional approaches where you use, you know, predominantly workflows and all this stuff it's just not working. And so that's why people are coming after us. >> Well, that need for data in real time is something that we saw during the pandemic. It's for many industries and many different types of situations. It's no longer a nice to have. It's really going to be the defining element between those businesses that succeed in really kind of leveraged COVID as an accelerant versus those that don't succeed. But I'm curious where your conversations are going within the customer base. As we see the need for data across an organization, but the need to access data that they can trust quickly, data that tells the truth, data that can be shared. 
Are you seeing this elevate up to C-suite in terms of your customer conversations? >> Yeah, and it is and it is because of one really critical reason because a lot of these data projects both fail and under exceed expectations and they do it for reasons that the C-suite doesn't understand. And so now the C-suite is getting forced to say, well, why is this happening? Why are these not going like, wow, you know the boardroom is saying like, well, we need to do more AI. Well, why aren't we doing more AI? Well, it's 'cause your data isn't really clean 'cause you don't actually have the data that you think you have. Because people don't share your data because people are, you know, your data is locked in some on-premise instance in, some access database that nobody's ever heard of. And so all of these reasons are things that now because they're impeding the business or getting to more senior levels in the organization >> That's kind of what I was thinking. I want to talk now about the investment this particular Series D that we talked about. So you've got investment, as I mentioned from a couple of new partners, but talk to me about the Snowflake and the Salesforce Ventures and how that is helping to catalyze what Alation is doing. >> Yeah, so we've, you know had a long time relationship with Salesforce but we found in the last year in particular that our relationship with Snowflake has just taken off in a way that I have seen few partnerships taking off in in certainly in my career. And, you know, it started really with just scores of customers. I mean, literally scores of customers that are all global to 1000s and fortune 500s where we would often just say, hey, what's your data source. And, you know, let's start with Alation and they'd be like, yeah we are either about to invest in Snowflake or we're invested in Snowflake or, something like that. So we'd often see customers on the journey with Snowflake and Alation at the exact same time. 
And then the next order conversation became, well, you know, if we're expanding and rolling out with Snowflake, which customers, you know, everybody looks at Snowflake's 168 percent net expansion rate, where every customer is spending a dollar 68 more than they were last year on average, and, you know, says, wow, if I'm going to scale that much, we need to govern all of that data. And so Snowflake customers came to Snowflake and to Alation at the same time, and we've been the natural solution of choice. And so that kind of marriage has been quite symbiotic and we're super excited to partner with them. You know, they think exclusively about data consumption. We think about, you know, finding, discovering, understanding data. So it's a really natural marriage. And so we're really excited to partner with them and you're going to see a lot from the two companies moving forward. >> So it sounds like that really was driven from joint customers in terms of facilitating maybe an expansion of the partnership that Alation and Snowflake have. Talk to me a little bit more about what some of the things are that we can expect in the next year. >> Yeah, so I won't take away from the stories that we're about to release, but you are going to see really exciting innovations in product between Snowflake and Alation over the course of the next couple of months. And in particular, you're going to see, you know, some fun announcements at the Snowflake Summit coming up next week. So stay tuned for that. Not surprisingly, data governance is going to be a big topic for us. Data search and discovery is going to be a big topic for us. Data privacy and security is going to be a big topic for us. And so those are all areas where you're going to see lots of fun product innovation. And then on the other side, you're going to see a lot of go-to-market innovation. 
So customers are moving data to the cloud, obviously and that's going to be a big place of discussion just enabling all of the field sales forces getting the stories and the customer stories to market. You're going to see a lot of that from us. >> In the last year, I'm curious if you saw any verticals in particular that really have pivoted with fuel from Alation. I think healthcare, life sciences, manufacturing anything that you, that really stood out to you in the last year >> You know, it's, I mean I think there's been the pandemic certainly hurt certain industries more than others transportation, travel and hospitality. And so we definitely saw a trend where there were dips in some of those industries but those were really temporary. And what we're finding is in a lot of those industries are now coming back bigger than ever. And the other industries in manufacturing and pharma in financial services, you know those are just as strong as they've ever been. And interestingly through the pandemic, what we found is that our user account within the company doubled. So even though the customer base itself didn't double the number of users on the platform across all of our customers, literally doubled on an active basis. And so, it's just been, interestingly enough it's just that across the board the growth has been consistent. And I think, really speaks to the fact that everybody's working from home and needs more data to do their job. >> Well, hopefully that's something that's going to be temporary. This, I was telling you, this is my first day back in the studio and not sitting in the home office. So in terms of the demand we talked about the demand we're customers, you're more than 250 customers now, big names, including one of the I think last year's most used terms household terms of Pfizer. Talk to me about the customer perspective on the funding and in terms of the things that you're going to be able to do to go to market. 
What are you hearing from your customer? >> Yeah I mean, literally the first thing I hear from 80 to 90% of my customers is go faster. You know, like there's this fun story, right? Where there's two people, they meet in the forest, they start walking together and then all of a sudden they both see a big bear. And the bear is, right about to come right after them. One person sits down and like puts on their running shoes. And they're like, well, you know, the other guy says, oh, there's no way you're going out run the bear. And they're like, well, I (indistinct)the bear. I've got to out run you. Right, and our customers are basically saying to us, look the bear of the data problem is gigantic. And yeah, you might be better than everything else out there, but I still have to as a customer contend with this massive data problem. And you know, if I have to do that, I need you to go faster because data's coming after me faster than ever. And I've got to contend with all of that work. And so they just want us to go faster and they want us to go faster in product. And they want us to go faster in developing the customer journey. And they want us to go faster in developing the ecosystem because many of our customers are you leveraging us as a platform. They want to see data on top of Alation. They want to see data privacy on top of Alation. They want to see data migration on top of Alation. So building out all these capabilities with our partners in our ecosystem and with partners like Snowflake and Salesforce, I mean, they just want us to move faster >> Moving faster, I think we all want that in certain senses but in any industry, consumers, users are getting more and more demanding as you're helping customers achieve their desire of going faster. How do you do that and help them foster a data culture that's, that supports that speed. 
>> Yeah, it's so interesting because cultural transformation, as you all know, like as we all know, that's like that's certainly slow work, right? Like you're not going to show up at an enterprise and say, hey, I installed Alation. You know what? You're going to have a totally different area culture. Everybody's going to start asking questions with data and the world's going to change, right. And so that, that, you know I'd love for that to be the eventual vision that we achieve. But it's certainly not where we are at today. I think, one of the things that I believe is that you can't go fast and big things you've got to break up big problems and turn them into small problems. And so one of the habits that we've seen within the organization, and one of the things that I talked to our team about every single day is look, you know make small promises and deliver on them. If you got to connect to data source, do that faster. If you're going to train a set of employees do that more quickly because customers have intent with data, but if they don't get the data in front of themselves quickly then they're just going to go to their gut decision. And so capturing that moment of intent and building a sort of velocity is where we see our best customer engagements go. And so that sort of incremental success approach, as opposed to the boil the ocean three month engagement, you know never see the finish line approach is really what I think makes us special and different. >> Tell me a little bit about speaking of culture, about Alations culture. What are some of the things that have changed in the last year? And it sounds like with the Series D round that you've just raised a lot of growth opportunities you mentioned that. Talk to me about the culture, how it's transformed in the last year and what you are excited for going forward. >> Yeah, it's so funny 'cause I always think about culture. 
You know, people think about culture and they say, companies (indistinct) culture, and they think of that culture as being a fixed thing. And it's totally true that, yeah, there's got to be some shared vision, shared values, shared ideals within a company in order for it to grow at the pace that we're growing, right. Adding 250 people in a 12-month period is not easy. But it's also the case that, you know, what we found is that there's a lot more specialization within the company. And so people now really, you know, where you found the company on generalists, you scale a company on specialists. And so getting those specialists inside of the company and respecting them and letting them do their jobs and really kind of building that expertise in the company is something that's been really fabulous, and it's just wonderful to see the team work that way. I think the other thing that's been really interesting, obviously, is just the remote-first work. I mean, we've seen zero loss in productivity, and I've talked to CEOs who were like, yeah, we need to get people back in the office. I don't really care where my team works. They're getting the job done and they're doing it fabulously for customers. And so if customers want them in front of them, totally great. Obviously love to see the team all the time, but it is so wonderful to see how productive people can be when they don't have to spend two hours in a car every day. And so those have been two small things. I mean, at the core, there are other aspects of our culture that have been more permanent, but those two have been slightly different. >> That's great to hear that about the productivity. I was actually very excited to commute this morning for the first time. Although there was no traffic to navigate. As we look at the current market valuation, 1.2 billion, the growth rate, the demand for the technologies. What are some, you mentioned some of the events that you're going to be at, you mentioned Snowflake's event. 
Where can folks go to hear more information about this? >> Yeah, absolutely. You can come to our website, of course, at alation.com. There's a ton of information there. Anybody who's watching this interview obviously is an experienced and thoughtful enterprise IT buyer. So certainly, you know, this is a fairly expert audience, but we do have tons of field resources that are available. The Alation Cloud instance allows you to get up and running super quickly. And you're going to see that speed increase further over the coming 12 months, but, you know, start with alation.com and go from there. And then there's a whole bunch of people who are sitting behind that front door waiting to help you. >> Excellent, alation.com. Well, Satyen, congratulations on the funding announcement. Thank you for joining me today, helping us unpack what it means, the impact, the demand from the customers and how we're going to see Alation go even faster. I'm excited to see what happens next in the next couple of months. I'm sure I'll see you again. >> I know. Me too. Thank you, Lisa, it's always great to talk. >> Likewise. For Satyen Sangani, I'm Lisa Martin. You're watching this CUBE conversation. (upbeat music)

Published Date : Jun 4 2021



Satyen Sangani, Alation | CUBEconversation


 

(soft music) >> Hey, welcome to this "CUBE Conversation". I'm Lisa Martin today talking to a CUBE alumni who's been on many times talking about data, all things data. Please welcome Satyen Sangani the Co-Founder and CEO of Alation. Satyen, it's great to have you back on theCUBE. >> Hi Lisa, it's great to see you too. It's been a while. >> It has been a while. And of course in the last year we've been living in this virtual world. So, I know you've gotten to be on theCUBE during this virtual world. Hopefully someday soon, we'll get to actually sit down together again. There's some exciting news that's coming out of Alation. Talk to us about what's going on. What are you announcing? >> So we're announcing that we are releasing our Alation Cloud Service which actually comes out today, and is available to all of our customers. And as a consequence are going to be the fastest, easiest deploy and easiest to use data catalog on the Marketplace, and using this release to really double down on that core differentiation. >> So the value prop for Alation has always been about speed to deployment, time to value. Those have really been, what you've talked about as the fundamental strengths of the platform. How does the cloud service double down on that value prop? >> Well, if you think about data, our basic premise and the reason that we exist is that, people could use data with so many of their different decisions. People could use data to inform their thinking. People can use data in order to figure out what decision is the best decision at any given point in time. But often they don't. Often gut instinct, or whatever's most fast or easy to access is the basis off of which people decide to do what they do. And so if you want to get people to use data more often you've got to make sure that the data is available that the data is correct, and that the data is easy to find and leverage. 
And so everything that we can do at Alation to make data more accessible, to allow people to be more curious, is what we get excited about. Because unlike, paying your payables or unlike, figuring out whether or not you want to be able to have greater or lesser inventory, those are all things that a business absolutely has to do but people don't have to use data. And to get people to use data, the best thing you can do is to make it easy and to make it fast. >> And speaking of fast, that's one of the things I think the last year has taught us is that, real-time access to data is no longer a nice to have. It's really a competitive differentiator. Talk to me about how you enable customers to get access to the right data fast enough, to be able to do what so many companies say, and that is actually make data-driven decisions. >> Yeah, that's absolutely right. So, it really is a entire continuum. The first and most obvious thing that we do is we start with the user. So, if you're a user of data, you might have to hunt through a myriad of reports, thousands of tables in a database, hundreds of thousands of files in a data lake, and you might not know where to find your answer and you might have the best of intentions but if you don't have the time to go through all of those sources, the first thing you might do is, go ask your buddy down the hall. Now, if your buddy down the hall or your colleague over Zoom can't give you the time of day or can't answer your question quickly enough then you're not going to be able to use that data. So the first thing, and the most obvious thing that we do is we have the industry's best search experience and the industry's best browse experience. And if you think about that search experience, that's really fueled by our understanding of all of the data patterns in your data environment. We basically look at every search. 
We look at every log within a company's data environment to understand what it is that people are actually doing with the data. And that knowledge just like Google has page rank to help it inform which are the best results for a given webpage. We do the exact same thing with data. And so great search is the basis of what we do. Now, above and beyond that, there's a couple of other things that we do, but they all get to the point of getting to that end search experience and making that perfect so that people can then curate the data and leverage the data as easily as possible. >> Sounds like that's really kind of personalized based on the business, in terms of the search, looking at what's going on. Talk to me a little bit more about that, and how does that context help fuel innovation? >> Yeah. So, to build that context, you can't just do, historically and traditionally what's been done in the data management space. Lots of companies come to the data management world and they say, "Well, what we're going to do is we're going to hire... "We've got this great software. "But setting the software up is a journey. "It takes two to three to four years to set it up "and we're going to get an army of consultants "and everybody's going to go and assert quality of data assets "and measure what the data assets do "and figure out how the data assets are used. "And once we do all of that work, "then in four years we're going to get you to a response." The real key is not to have that context to be built, sort of through an army of consultants and an army of labor that frankly nine times out of 10 never gets to the end of the road. But to actually generate that context day one, by understanding what's going on inside of those systems and learning that by just observing what's happening inside of the company. And we can do that. >> Excellent. 
And as we've seen the acceleration in the last year of digital transformation, how much of that was an accelerator for Alation putting this service forward, and what are customers saying so far? >> Yeah, it's been incredible. I mean, what we've seen in our existing accounts is that our expansions have gone up by over 100% year over year with the kind of crisis in place. Obviously, you would hypothesize that these catalogs, these accessibility and search tools, and data in general would be leveraged more when all of us are virtual and all of us can't talk to each other. But it's been amazing to see that that's actually what's happening. People are actually using data more. People are actually searching for data more. And bringing that experience to our customers has been a huge focus of what we're trying to do. So the pandemic has obviously been bad for many people, but for us it's been a huge accelerant of customers using our product. >> Talk to me about Alation with AWS. What does that enable your customers to achieve that they maybe couldn't necessarily do on-prem? >> Yeah, so, customers obviously don't really care anymore, or as much as they used to, about managing the software internally. They just want to be able to get whatever they need to get done and move forward with their business. And so by leveraging our partnership with AWS, one, we've got elastic compute capability. I think that's obviously something that they bring to the table better than perhaps any other in the market. But much more fundamentally, the ability to stand up Alation and get it going now means that all you have to do is go to the AWS Marketplace or call up an Alation rep. And you can, within a matter of minutes, get an Alation instance that's up and running and fit for purpose for what you need. 
And that capability is really quite powerful because, now that we have that elasticity and the speed of deployment, customers can realize the value, so much more quickly than they otherwise might've. >> And that speed is absolutely critical as we saw a lot last year that was the difference between the winners and those that were not going to make it. Talk to me a little bit about creating a data culture. We talk about that a lot. It's one thing to talk about it, it's a whole other thing to put it in place, especially for legacy institutions that have been around for a while. How do you help facilitate the actual birth of a data culture? >> Yeah, I mean, I think we view ourselves as a technology, as a catalyst, to our best customers and our best customer champions. And when we talk to chief data officers and when we talk to data leaders within various organizations that we service, organizations like Pfizer, organizations like Salesforce, organizations like Cisco, what they often tell me is, "Look, we've got to build products faster. "We've got to move at the speed and the scale "of all of the startups that are nipping at our heels. "And how do we do that? "Well, we've got to empower our people "and the way that we empower our people "is by giving them context. "And we need to give them the data "to make the right decisions, "so that they can build those products "and move faster than they ever might've." Now those are amazing intentions but those same leaders also come and say, "I've just been mired in risk "and I've been mired in compliance, "and I've been mired in "doing all of these data janitorial projects. "And it's really hard for me to get "on the offense with data. "It's really hard for me to get proactive with data." 
And so the biggest thing that we do is we just help companies be more proactive, much more easily, because what they're able to do is leave a lot of that janitorial work, leave a lot of that discovery work, leave a lot of that curation work to the software. And so what they get to focus on is, how is it that I can then drive change and drive behavioral change within my organizations so that people have the right data at their disposal. And that's really the magic of the technology. >> So I was reading the "Alation State of Data Culture Report" that was just published a few weeks ago. This is the quarterly assessment that Alation does, looking at the progress that enterprises have made in creating this data culture. And the number that really stuck out to me was that 87% of respondents say data quality issues are a barrier to successful implementation of AI in their organizations. How can Alation help them solve that problem? >> Yeah, I think the first is, whenever you've got a problem, the first thing you've got to do is acknowledge that you've got a problem. And a lot of the time leaders will jump to AI and say, "Well, hey, everybody's talking about AI. "The board level conversation is AI. "McKinsey is talking about AI, let's go do some AI." And that sounds great in theory. And of course we all want to do that more, but the reality is that many of these projects are stymied by the basic plumbing. You don't necessarily know where the data's coming from. You don't know if people have entered it properly in the source systems or in the systems that are online. Those data often get corrupted in the transformation processes, or the processes themselves don't run appropriately. And so you don't have transparency. You don't have any awareness of what people are doing, what people are using, how the data is actually being manipulated from step to step, what that data lineage is. 
And so that's really where we certainly help many of our customers, by giving them transparency and an understanding of their data landscape. Ironically, what we find is that data leaders are super excited to get data to the business, but they often don't themselves have the data to understand how to manage the data itself. >> Wow, that's a conundrum. Let's talk about customers, because I was looking on the website and there are some pretty big metrics-based business outcomes that Alation is helping customers drive. I wanted to kind of pick through some examples from your perspective. First one is 364% ROI. Second one is 70% less time for analysts to complete projects. Workforce productivity is huge. Talk to me about how Alation is helping customers achieve business outcomes like that. >> Yeah, so if you think about a typical analytical project, you would think that most of the time is spent inside of the analytical tool, inside of your Excel, inside of your Tableau, that that's where you're thinking about the data and you're analyzing it, you're thinking deep thoughts. And you're trying to hypothesize, you're trying to understand. But the reality, going back to the data quality issue, is that most of the time is spent figuring out which are the right datasets. Because at one of our customers, for example, there were 4,000 different types of customer transaction datasets that spoke to the exact same data. Which dataset do I actually use out of a particular database? And then once I've figured out which ones to use, how do I construct the appropriate query and assumptions in order to be able to get the data into a format that makes sense to me? Those are the kinds of things that most analysts and data scientists struggle with. And what we do is we help them by not having them reinvent the wheel. We allow them to figure out what the right dataset is fast, how to manipulate it fast, so that they can focus most of their time on doing that end analytical work. 
And that's where all the ROI, or a lot of the ROI, is coming from, because they don't have to reinvent the wheel. They can do the work and they can move on to a much faster business decision, which means that the business moves significantly faster. And so what we find is that for these very highly priced resources, some data scientists who make 200, 300, $400,000 fully loaded for a company, those people can do their job 74% faster, which means they can get not only the answer faster but they can get many more tasks done over a given period of time. >> Well, that just opens up a potential suite of benefits that the organization will achieve, not just getting the analyst productivity cranked up in a big way, but also allowing your organization to be more agile, which many organizations are striving to be, to be able to identify new products, new services, what's happening, especially in a changing, chaotic environment like we've been living in the last year. >> Yeah, absolutely. And they also can learn... Not only can they help themselves figure out what new products to launch, but they can also help themselves figure out where their risks happen to be, and where they need to comply, because it could be the case that analysts are using datasets that they ought not to be using, or the business is using the data incorrectly. And so you can find both the patterns and the anti-patterns, which means that you're not only moving faster, but you're moving forward with less risk. And so we've seen so many failures with data governance regimes, where people have tried to assert the quality of data and figure out the key data elements and develop a business glossary. And there's that great quote, "I wanted data governance but all I got is a data glossary." That all happens because they just don't have enough time in the day to do the value-added work. 
They only have enough time in the day to start doing the data cleaning and all of the janitorial work that we, as a company, really strive to allow them to completely eliminate. >> So wrapping things up here, Alation Cloud Service. Tell me about when it's available, how can customers get it? >> So it's available today, which is super exciting. Customers can get it either through the AWS Marketplace or by calling their Alation representative. You can do that by coming to our website. And it's super easy to get a demo and move forward. We try to make it as easy as possible. And we really want to get out of the way and allow people to have a seamless, frictionless experience, and we are super excited to have this cloud service that allows them to do that even faster than they were able to do before. >> And we all know how important that speed is. Well, Satyen, congratulations on the announcement of Alation Cloud Service. We appreciate you coming on here and sharing with us the news and really what's in it for the customers. >> Thank you, Lisa. It's been a phenomenal catch-up and great seeing you. >> Likewise. For Satyen Sangani, I'm Lisa Martin. You're watching this "CUBE Conversation." (soft music)
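
Earlier in this conversation, Satyen describes ranking catalog search results by looking at the logs of what people actually query, much as a web search engine ranks pages by how often they are linked. The transcript does not describe how Alation implements this, so the following is only a minimal sketch of the general idea: the log entries, table names, regular expression, and scoring rule are all assumptions made for illustration.

```python
import re

# Hypothetical query log: (user, SQL text). A real system would read
# something like this from a warehouse's query history.
QUERY_LOG = [
    ("ana", "SELECT * FROM sales.customers WHERE year = 2020"),
    ("bob", "SELECT id FROM sales.customers"),
    ("bob", "SELECT * FROM sales.customers_backup"),
    ("cy", "SELECT c.id FROM sales.customers c "
           "JOIN sales.orders o ON c.id = o.customer_id"),
]

# Naive table extraction: the name that follows FROM or JOIN.
# This is an illustration, not a SQL parser.
TABLE_PATTERN = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)


def table_usage(log):
    """Return {table: number of distinct users who queried it}."""
    users_by_table = {}
    for user, sql in log:
        for table in TABLE_PATTERN.findall(sql):
            users_by_table.setdefault(table.lower(), set()).add(user)
    return {table: len(users) for table, users in users_by_table.items()}


def search(term, log):
    """Return tables whose name contains `term`, most widely used first."""
    usage = table_usage(log)
    matches = [table for table in usage if term.lower() in table]
    # Sort by descending user count, then by name for a stable order.
    return sorted(matches, key=lambda table: (-usage[table], table))


print(search("customers", QUERY_LOG))
```

In this toy log, the table queried by three different people is listed ahead of the backup copy that only one person touched, which is the "who else uses it" signal the conversation describes.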

Published Date : Apr 7 2021


ENTITIES

Entity | Category | Confidence
Cisco | ORGANIZATION | 0.99+
Pfizer | ORGANIZATION | 0.99+
Lisa Martin | PERSON | 0.99+
200 | QUANTITY | 0.99+
AWS | ORGANIZATION | 0.99+
Lisa | PERSON | 0.99+
Satyen Sangani | PERSON | 0.99+
Alation | ORGANIZATION | 0.99+
Satyen | PERSON | 0.99+
two | QUANTITY | 0.99+
74% | QUANTITY | 0.99+
Excel | TITLE | 0.99+
Salesforce | ORGANIZATION | 0.99+
87% | QUANTITY | 0.99+
first | QUANTITY | 0.99+
Google | ORGANIZATION | 0.99+
$400,000 | QUANTITY | 0.99+
four years | QUANTITY | 0.99+
10 | QUANTITY | 0.99+
Tableau | TITLE | 0.99+
last year | DATE | 0.99+
CUBE | ORGANIZATION | 0.99+
three | QUANTITY | 0.99+
nine times | QUANTITY | 0.99+
today | DATE | 0.98+
Second one | QUANTITY | 0.98+
both | QUANTITY | 0.98+
First one | QUANTITY | 0.98+
hundreds of thousands of files | QUANTITY | 0.98+
one | QUANTITY | 0.97+
first thing | QUANTITY | 0.97+
364% ROI | QUANTITY | 0.97+
thousands of tables | QUANTITY | 0.97+
over 100% | QUANTITY | 0.97+
Alation State of Data Culture Report | TITLE | 0.96+
pandemic | EVENT | 0.93+
300 | QUANTITY | 0.88+
4,000 different types | QUANTITY | 0.87+
few weeks ago | DATE | 0.86+
70% less | QUANTITY | 0.82+
Alation Cloud Service | ORGANIZATION | 0.82+
CUBE Conversation | EVENT | 0.77+
theCUBE | ORGANIZATION | 0.71+
one thing | QUANTITY | 0.69+
Zoom | ORGANIZATION | 0.67+
day one | QUANTITY | 0.57+
McKinsey | ORGANIZATION | 0.54+
of reports | QUANTITY | 0.53+
Cloud Service | TITLE | 0.51+

Satyen Sangani, Alation | CUBEConversation


 

>> Narrator: From theCUBE studios in Palo Alto, in Boston, connecting with thought leaders all around the world. This is a CUBE Conversation. >> Hey, welcome back everybody, Jeff Frick here with theCUBE. We're coming to you today from our Palo Alto studios with a CUBE Conversation, talking about data, and we're excited to have our next guest. He's been on a number of times, many times, a CUBE alum, really at the forefront of helping companies and customers be more data-centric in their activities. So we'd like to welcome onto the show Satyen Sangani. He is the co-founder and CEO of Alation. Satyen, great to see you. >> Great to see you, Jeff. It's good to see you again in this new world, a new format. >> It is a new world, a new format, and what's crazy is, in March and April we were talking about this light switch moment, and now we've just turned the calendar to October and it seems like we're going to be doing this thing for a little bit longer. So, it is kind of the new normal, and even when it's over, I don't think everything's going to go back to the way it was. So here we are, but you guys have some exciting news to announce, so let's just jump to the news and then we'll get into a little bit more of the nitty gritty. So what do you got coming out today? >> Yeah, so what we are announcing today is basically Alation 2020, which is probably one of the biggest releases that we've had since I've been with the company. With it, we are releasing three things. So in some sense, there's a lot of simplicity to the release. The first thing that we're releasing is a new experience that we call the business user experience, which will bring a whole new set of users into the catalog. The second thing that we're announcing is basically around Alation Analytics, and the third is around what we would describe as a cloud-native architecture. 
In total, it brings a fully transformative experience, basically lowering the total cost of getting to a data management and data intelligence experience to much lower than had previously been the case. >> And you guys have a really simple mission, right? You're just trying to help your customers be more, what's the right word? Data-centric, use data more often, and to help people actually make that decision. And you had an interesting quote in another interview, you talked about trying to be the Yelp for information, which is such a nice kind of humanizing way to think about it, because data isn't necessarily that way. And I think you mentioned before we turned on the cameras that for a lot of people, maybe it's just easier to ignore the data, if I can just get the decision through on gut and intuition and get on to my next decision. >> Yeah, you know it's funny. I mean, we live in a time where people talk a lot about fake news and alternative facts, and our vision is to empower a curious and rational world. And I always smile a little bit when I say that, because it's such a crazy vision, right? Like how do you get people to be curious and how do you get people to think rationally? But you know, to us, it's about, one, making the data really accessible, just allowing people to find the data they need when and as they want it. And the second is for people to be able to think scientifically, teaching people to take the facts at their disposal and interpret them correctly. And we think that if those two skills existed, just the ability to find information and interpret it correctly, people could make a lot better decisions. And so the Yelp analogy is a perfect one, because if you think about it, Yelp did that for local businesses, just like Amazon did it for really complicated products on the web, and what we're trying to do at Alation is, in some sense, very simple, which is to just take information and make it super usable for people who want to use it. 
>> Great, but I'm sure there are critics out there, right? Who say, yeah, we've heard this before, the promise of BI has been around forever, and I think a lot of people think it just didn't work, whether the data was too hard to get access to, whether it was too hard to manipulate, whether it was too hard to pull insights out, whether there's just too much scrubbing and manipulating. So, what is some of the secret sauce to take what is a very complex world? And again, you've got some very large customers with some giant data sets. And to, I don't want to say humanize it, but kind of humanize it and make it easier, more accessible for that business analyst, not just generally, but more specifically when I need it to make a decision. >> Yeah, I mean, it's so funny because data, like a lot of software, is death by 1,000 cuts. I mean, you look at something from the outside and it looks really, really simple, but then you kind of delve into any problem, and that can be CRM, something like Salesforce, or it can be something like ServiceNow with ITSM, but these are all really, really complicated spaces, and getting into the depths and the detail of it is really hard. And data is really no different. Data is just the sort of exhaust from all of those different systems that exist inside of your company. So the detail around the data in your company is exhaustingly minute. And so, how do you make something like that simple? I think really the biggest challenge there is progressively revealing complexity, right? Giving people the right amount of information at the right time. So, one of the really clever things that we do in this business user experience is we allow people to search for and receive the information that's most relevant to them. And we determine that relevance based upon the other people in the enterprise that happen to be using that data. 
And we know what other people are using in that company, because we look at the logs to understand which data sources are used most often, and which reports are used most often. So right after that, when you get something, you just see the name of the report and it could be around the revenues of a certain product line. But the first thing that you see is who else uses it. And that's something that people can identify with, you may not necessarily know what the algorithm was or what the formula might be, how the business glossary term relates to some data model or data artifact, but you know the person and if you know the person, then you can trust the information. And so, a lot of what we do is spend time on design to think about what is it that a person expects to see and how do they verify what's true. And that's what helps us really understand what to serve up to somebody so that they can navigate this really complicated, relevant data. >> That's awesome, cause there's really a signal to noise problem, right? And I think I've heard you speak before. >> Yeah >> And of course this is not new information, right? There's just so much data, right? The increasing proliferation of data. And it's not that there's that much more data, we're just capturing a lot more of it. So your signal to noise problem just gets worse and worse and worse. And so what you're talking about is really kind of helping filter that down to get through a lot of that, a lot of that noise, so that you can find the piece of information within the giant haystack. That is what you're looking for at this particular time in this particular moment. >> Yeah and it's a really tough problem. I mean, one of the things that, it's true that we've been talking about this problem for such a long time. 
And in some sense, if we're lucky, we're going to be talking about it for a lot longer, because it used to be that the problem was, back when I was growing up, you were doing research on a topic and you'd go to the card catalog and you'd go to the Dewey Decimal System. And in your elementary school or high school library, you'd be lucky to find one, two or three books that mapped to the topic that you were looking for. Now, you go to Google and you find 10,000 books. Now you go inside of an enterprise and you find 4,000 relational database tables and 200 reports about an artifact that you happened to be looking for. And so really the problem is, what do I trust? And what's correct? Getting to that level of accuracy around information when there's so much information out there is really the big problem of our time, and for me it's a real privilege to be able to work on it, because I think if we can teach people to use information better and better, then they can make better decisions, and that can help the world in so many different ways. >> Right, right, my other favorite example that everybody knows is photographs, right? Back when you only got 24 on a roll and it cost you six bucks to develop it, those were pretty special, and now you go buy a fancy camera. You can shoot 11 frames a second. You go out and shoot the kids at the soccer game. You come home with 5,000 photos. How do you find the good photo? It's a real, >> Yeah. >> It's a real problem. If you've ever faced something like that, it's kind of a splash of water in the face. Like where do I even begin? But the other piece that you talk about a lot, which is slightly different but related, is context, and a favorite example is, like, 55, right? That's a number, but if you don't have any context for that number, is it a temperature? Is it cold inside the building? Is it a speed? Is it too slow on I-5? 
Or is it fast because I'm on a bicycle going down a hill? Without context, data is just a number. It doesn't mean anything. So you guys, by adding this metadata around the data, are really adding a lot more contextual information to help figure out kind of what that signal is from the noise. >> Yep, you'll get facts from anywhere, right? Like in "The Hitchhiker's Guide," you've got a 55 or 42, and you can figure out what the meaning of the universe is, and apparently the answer is 42, and what does that mean? It might mean a million different things, and to me, that context is the difference between suspecting and knowing. It's the difference between having confidence and basically guessing. And I think to the extent that we can provide more of that over time, that's what's going to make us an ever more valuable partner to the customers that we satisfy today. >> Right, well, I do know why 42 is always the answer, 'cause that's Ronnie Lott and that's always the answer. So, that one I know, that's an easy one. (both chuckle) But it is really interesting, and then you guys just came out. I heard Aaron Kalb, one of your co-founders, on the other day, and we talked about this new report that you guys have sponsored, the Data Culture Report, and really putting some granularity on a Data Culture Index. I thought it was pretty interesting, and I'm excited that you guys are going to be doing this longitudinally, because whether you do or do not necessarily agree with the method, it does give you a number, it does give you a score. It's a relatively simple formula. And at least you can compare yourself over time to see how you're tracking. 
I wonder if you could share, I mean, the thing that jumps out right off the top of that report is something we were talking about before we turned the cameras on: people's perception of where they are on this path doesn't necessarily map out when you go bottoms-up and add up the score versus top-down when I'm just making an assessment. >> Yeah, it's funny, it's kind of the equivalent of everybody thinks they're an above-average driver or everybody thinks they're above average in terms of intelligence. And obviously that mathematically is not possible or true, but I think in the world of data management, we all talk about data, we all talk about how important it is to use data. And if you're a data management professional, you want people in your company to use more data. But ironically, the discipline of data management doesn't actually use a lot of data itself. It tends to be a very slow, methodical, process-driven, gut-oriented process to develop things like, what data models exist, and how do I use my infrastructure, and where do I put my data, and which data quality is best? All of those things tend to be somewhat heuristic-driven or gut-driven, and they don't have to be. A big part of our release actually is around this product called Alation Analytics. And what we do with that product is really quite interesting. We start measuring elements of how your organization uses data by team, by data source, by use case. And then we give you transparency into what's going on with the data inside of your landscape and ecosystem. So you can start to actually score yourself both internally and also, as we reveal in our customer success methodology, against other customers, to understand what it is that you're doing well and what it is that you're doing badly. And so you don't necessarily need to have a ton of gut instinct anymore. You can look at the data of yourselves and others to figure out where you need to improve. 
And so that's a pretty exciting thing, and I think this notion that says, look, you think you're good, but are you really good? I mean, that's fundamental to improvement in business process and improvement in data management, improvement in data culture, fundamentally, for every company that we work with. >> Right, right, and if you don't know there's a problem, and if you're not measuring it, then there's no way to improve on it, right? 'Cause you don't know what you're measuring. >> Right. >> But I'm curious about the three buckets that you guys measured. So you measured data search and discovery, that was bucket number one; data literacy, you know, what you do once you find it; and then data governance in terms of managing. It feels like the search and discovery, which sounds like what you're primarily focused on, is the biggest gap, because you can't get to those other two buckets unless you can find and understand what you're looking for. So does that jibe, or is that really not the problem, and it's more the manipulation of the data once you get it? >> Yeah, I mean, we really focus on all three, and I think that certainly it's the case that it's a virtuous cycle. So if you think about search and discovery of data, if you have very little context, then it's really hard to guide people to the right bit of information. But if I know, for example, that certain data is used by a certain team and then a new member of that team comes on board, then I can go ahead and serve them with exactly that bit of data, because I know that the human relationships are quite tight in the context graph on the back end. And so that comes from basically building more context over time. Now that context can come from a stewardship process implemented by a data governance framework. It can come from building better data literacy through having more analytics. 
But however, that context is built and revealed, there tends to be a virtuous cycle, which is you get more, people searching for data. Then once they've searched for the data, you know how to necessarily build up the right context. And that's generally done through data governance and data stewardship. And then once that happens, you're building literacy in the organization. So people then know what data to search for. So that tends to be a cycle. Now, often people don't recognize that cycle. And so they focus on one thing thinking that you can do one to the exclusion of the others, but of course that's not the case. You have to do all three. >> Great and I would presume you're using some good machine, Machine Learning and Artificial Intelligence in that process to continue to improve it over time as you get more data, the metadata around the data in terms of the usage and I think, again I saw in another interview there talking about, where should people invest? What is the good data? What's the crap data? what's the stuff we shouldn't use 'cause nobody ever uses it or what's the stuff, maybe we need to look and decide whether we want to keep it or not versus, the stuff that's guiding a lot of decisions with Bob, Mary and Joe, that seems to be a good investment. So, it's a great application of applied AI Machine Learning to a very specific process to again get you in this virtuous cycle. That sounds awesome. >> Yeah, I know it is and it's really helpful to, I mean, it's really helpful to think about this, I mean the problem, one of the biggest problems with data is that it's so abstract, but it's really helpful to think about it in just terms of use cases. Like if I'm using a customer dataset and I want to join that with a transaction dataset, just knowing which other transaction datasets people joined with that customer dataset can be super helpful. 
If I'm an analyst coming in to try to answer a question or ask a question, context can come in different ways, just in the same way that Amazon has "people who bought this product also bought this product." You can have all of the same analogies exist. People who use this data also use that data. And so being able to generate all that intelligence from the back end to serve up a simple-seeming experience on the front end is the fun part of the problem. >> Well I'm just curious, 'cause there's so many pieces of this thing going on. What's kind of the aha moment when you're in with a new customer and you finish the install and you've done all the crawling and you know where all the datasets are, and you've got some baseline information about who's using what? I mean, what is kind of the "Oh, my goodness" when they see this thing suddenly delivering results that they've never had at their fingertips before? >> Yeah, it's so funny 'cause you can show Alation as a demo and you can show it to people with data sets that are fake. And so we have this medical provider data set that we've got in there, and we've got a whole bunch of other data sets that are in there, and people look at it and, interestingly enough, a lot of the time they're like, "Oh yeah, I can kind of see it work and I can kind of understand that." And then you turn it on against their own data, the data they have been using every single day, and literally their faces change. They look at the data and they say, "Oh my God, this is a dataset that Steven uses, I didn't even know that Steven thought that this data existed," and, "Oh my God, people are using this data in this particular way. They shouldn't be using that data at all. I thought I deprecated that dataset two years ago." And so people have all of these interesting insights, and it's interesting how much more real it gets when you turn it on against the company's systems themselves. 
And so that's been a really fun thing that I've just seen over and over again, over the course of multiple years, where people just turn on the product and all of a sudden it just changes their view of how they've been doing it all along. And that's been really fun and exciting. >> That's great, yeah, 'cause it means something to them, right? It's not numbers on a page. It's people, it's customers, it's relationships, it's a lot of things. That's a great story, and I'm curious too, in that process, is it more often that they just didn't know that there were these other buckets of reports and other buckets of data, or was it more that they just didn't have access to it? Or if they did, they didn't really know how to manipulate it or to integrate it into their own workflow? >> Yeah, it's kind of funny and it's somewhat role dependent, but it's kind of all of the above. So, if you think about it, if you're a data management professional, often you kind of know what data sources might exist in the enterprise, but you don't necessarily know how people are using the data. And so you look at data and you're like, oh my God, I can't believe this team is using this data for this particular purpose. They shouldn't be doing that. They should be using this other data set. I deprecated that data set like two years ago. And then sometimes if you're a data scientist, you find, oh my gosh, there's this new database that I otherwise didn't realize existed. And so now I can use that data and I can process that for building some new machine learning algorithms. In one case we've had a customer where they had the same data set procured five different times. So it was a data set that cost multiple hundreds of thousands of dollars. They were spending $2 million overall on a data set where they could have been spending literally one fifth of that amount.
And then you had sort of another case, finally, where you're basically just looking at it and saying, hey, I remember that data set. I knew I had that dataset, but I just don't remember exactly where it was. Where did I put that report? And so it's exactly the same way that you would use Google. Sometimes you use it for knowledge discovery, but sometimes you also use it for just remembering the thing you forgot. >> Right, but the thing, like, I remember when people were trying to put Google search in companies just to find records, not necessarily to support data efforts, and the knock was always, you didn't have enough traffic to drive the algorithm to really have effective search, say, across a large enterprise that has a lot of records but not necessarily a lot of activity. So, that's a similar type of problem that you must have. So is it really extracting that extra context of other people's usage that helps you get around kind of the fact that you just don't have big numbers? >> Yeah, I mean, that kind of is fundamentally the special sauce. I mean, I think a lot of data management has been this sort of manual brute force effort where I get a whole bunch of consultants or a whole bunch of people in the room and we do this big documentation session. And all of a sudden we hope that we've kind of painted the Golden Gate Bridge, as it were. But knowing that three to six months later, you're going to have to go back and repaint the Golden Gate Bridge all over again, if not immediately, depending on the size and scale of your company. The one thing that Google did to sort of crawl the web was to really understand, oh, if a certain webpage was linked to super often, then that webpage is probably a really useful webpage. And when we crawl the logs, we basically do the exact same thing. And that's really informed getting a really, really specific day one view of your data without having to have a whole bunch of manual effort.
And that's been really just dramatic. I mean, it's allowed people to really see their data very quickly and in new and different ways, and I think a big part of this is just friction reduction, right? We'd all love to have an organized data world. We'd love to organize all the information in a company, but for anybody who has an email inbox, organizing your own inbox, let alone organizing every database in your company, just seems like a Sisyphean effort. And so being able to focus people on what's the most important thing has been the most important thing. And that's kind of why we've been so successful. >> I love it, and I love just kind of the human factors overlay that you've done to add the metadata with the knowledge of who is accessing these things and how they are accessing it. And the other thing I think is so important, Satyen, is, we talk about innovation all the time. Everybody wants more innovation and they've got DevOps so they can get software out faster, et cetera, et cetera. But I fundamentally believe in my heart of hearts that it's much more foundational than that, right? That if you just get more people access to more information, and then the ability to manipulate and glean knowledge out of that information, and then actually take action and have the power and the authority to take action, and you have that across everyone in the company or an increasing number of people in the company, now suddenly you're leveraging all those brains, right? You're leveraging all that insight. You're leveraging all that kind of first-line experience to drive kind of a DevOps type of innovation with each individual person, as opposed to kind of classic waterfall with the Chief Innovation Officer doing PowerPoints in his office, on his own time, and then coming down from the mountain and handing it out to everybody to go build.
So it's really a kind of paradox that by adding more human factors to the data, you're actually making it so much more usable and so much more accessible and ultimately more valuable. >> Yeah, it's funny, there's this new term of art called data intelligence. And it's interesting because there's lots of people who are trying to define it, and I think IDC has got a definition and you can go look it up, but if you think about the core word of intelligence, it basically boils down to the ability to acquire information or skills, right? And so if you then apply that to companies and data, data intelligence then stands to reason. It's sort of the ability for an organization to acquire information or skills leveraging their data. And that's not just for the company, but it's for every individual inside of that company. And we talk a lot about how much change is going on in the world with COVID and with wildfires here in California. And then obviously with the elections and then with new regulations and with preferences, 'cause now that COVID happened everybody's at home. So what products and what services do you have to deliver to them? And all of this change is basically what every company has to keep up with to survive, right? If capitalism is creative destruction, the world's getting destroyed, like, unfortunately more often than we'd like it to be. >> Right. >> And so then you're sat there going, oh my God, how do I deal with all of this? And it used to be the case that you could just build a company off of being really good at one thing. Like you could just be the best logistics delivery company, but that was great yesterday when you were delivering to restaurants. But since there are no restaurants in business, you would just have to change your entire business model and be really good at delivering to homes. And how do you go do that?
Well, the only way to really go do that is to be really, really intelligent throughout your entire company. And that's a function of data. That's a function of your ability to adapt to a world around you. And that's not just some CEO, 'cause literally by the time it gets to the CEO, it's probably too late. Innovation's got to be occurring on the ground floor. And people have got to repackage things really quickly. >> I love it, I love it. And I love the other human factor that we talked about earlier. It's just, people are curious, right? So if you can make it easy for them to fulfill their curiosity, they're going to naturally seek out the information and use it, versus if you make it painful, like a no-fun lesson, then people's eyes roll and they don't pay attention. So I think that it's such an insightful way to address the problem and really the opportunity. And the other piece I think that's so different, when you were going down the card catalog analogy earlier, right, is there was a day when all the information was in that library. And if you went to the UCLA psych library, every single reference that you could ever find is in that library. I know, I've been there. It was awesome, but that's not the way anymore, right? You can't have all the information, and it's pulling your own information along with public information and as much information as you can where you start to build that competitive advantage. So I think it's a really great way to kind of frame this thing, where information in and of itself is really not that valuable. It's about the context, the usability, the speed, the accessibility, and that democratization is where you really start to get these force multipliers in using data as opposed to just talking about data. >> Yeah, and I think that that's the big insight, right? Like if you're a CEO and you're kind of looking at your Chief Data Officer or Chief Data and Analytics Officer.
The real question that you're trying to ask yourself is, how often do my people use data? How measurable is it? Like, what is the level at which people are making decisions leveraging data? And that's something that you can talk about in a board room and you can talk about in a management meeting, but that's not where the question gets answered. The question gets really answered in the actual behaviors of individuals. And the only way to answer that question, if you're a Chief Analytics Officer or somebody who's responsible for data usage within the company, is by measuring it and managing it and training it and making sure it's a part of every process and every decision by building habit, and building those habits is just super hard. And that's, I think, the thing that we've chosen to be sort of the best in the world at, and it's really hard. I mean, we're still learning about how to do it, but from our customers, and then taking that knowledge and kind of learning about it over time. >> Right, well, that's fantastic. And if it wasn't hard, it wouldn't be valuable. So those are always the best problems to solve. So Satyen, really enjoyed the conversation. Congratulations to you and the team on the new release. I'm sure there's lots of sweat, blood and tears that went into that effort. So congrats on getting that out, and really great to catch up. Look forward to our next catch up. >> You too, Jeff. It's been great to talk. Thank you so much. >> All right, take care. All right, he's Satyen and I'm Jeff, you're watching theCUBE. We'll see you next time. Thanks for watching. (ethereal music)

Published Date : Oct 6 2020

ENTITIES

Entity | Category | Confidence
Jeff Frick | PERSON | 0.99+
Satyen | PERSON | 0.99+
Jeff | PERSON | 0.99+
Amazon | ORGANIZATION | 0.99+
11 | QUANTITY | 0.99+
Palo Alto | LOCATION | 0.99+
$2 million | QUANTITY | 0.99+
one | QUANTITY | 0.99+
Ronnie Lot | PERSON | 0.99+
Steven | PERSON | 0.99+
October | DATE | 0.99+
24 | QUANTITY | 0.99+
200 reports | QUANTITY | 0.99+
Google | ORGANIZATION | 0.99+
Aaron Kalb | PERSON | 0.99+
Yelp | ORGANIZATION | 0.99+
California | LOCATION | 0.99+
six bucks | QUANTITY | 0.99+
March | DATE | 0.99+
10,000 books | QUANTITY | 0.99+
two | QUANTITY | 0.99+
third | QUANTITY | 0.99+
Satyen Sangani | PERSON | 0.99+
Boston | LOCATION | 0.99+
April | DATE | 0.99+
second thing | QUANTITY | 0.99+
Alation | ORGANIZATION | 0.99+
both | QUANTITY | 0.99+
two skills | QUANTITY | 0.99+
Bob | PERSON | 0.99+
theCUBE | ORGANIZATION | 0.98+
two years ago | DATE | 0.98+
today | DATE | 0.98+
second | QUANTITY | 0.98+
hundreds of thousands of dollars | QUANTITY | 0.98+
yesterday | DATE | 0.98+
two buckets | QUANTITY | 0.98+
Data Culture Report | TITLE | 0.98+
1000 cuts | QUANTITY | 0.98+
Joe | PERSON | 0.97+
Alation | PERSON | 0.97+
5,000 photos | QUANTITY | 0.97+
first thing | QUANTITY | 0.97+
five different times | QUANTITY | 0.97+
55 | QUANTITY | 0.97+
three buckets | QUANTITY | 0.97+
one thing | QUANTITY | 0.97+
three | DATE | 0.96+
one case | QUANTITY | 0.96+
Alation 2020 | TITLE | 0.95+
six months later | DATE | 0.94+
each individual person | QUANTITY | 0.94+
CUBE | ORGANIZATION | 0.93+
COVID | EVENT | 0.92+
three books | QUANTITY | 0.91+
Mary | PERSON | 0.91+
one fifth | QUANTITY | 0.91+
three | QUANTITY | 0.91+
IDC | ORGANIZATION | 0.88+
Alation Analytics | ORGANIZATION | 0.88+
4,000 relational database | QUANTITY | 0.86+
First Line | QUANTITY | 0.85+
42 | QUANTITY | 0.85+
Hitchcock | PERSON | 0.84+
three things | QUANTITY | 0.82+
11 frames a second | QUANTITY | 0.82+
42 | OTHER | 0.81+
UCLA psych | ORGANIZATION | 0.75+

Aaron Kalb, Alation | CUBEConversation, September 2020


 

>> Announcer: From theCUBE studios in Palo Alto, in Boston, connecting with thought leaders all around the world. This is theCUBE conversation. >> Hey, welcome back, everybody. Jeff Frick here with theCUBE. We're in our Palo Alto studios today for theCUBE conversation. We're talking about data. We're always talking about data and it's really interesting. You know we like to go out and get you the first-person insight from the people that start the companies, run the companies, the practitioners, and get the insight directly from them. We also like to go out and get original research and hear from original research. And this is a great opportunity to hear from both. So we're excited to have, and welcome back into the studio, he's Aaron Kalb. He's the co-founder of Alation, many-time CUBE alumni. Aaron, great to see you. >> Yeah, thanks for having me. It's good to be here. >> Yeah, it's very cool. But today it's a special thing. We've never done this before with you. You guys are releasing a brand new report called the Alation State of Data Culture Report. So really interesting report. A lot of great information that we're going to dig in here for the next few minutes. But before we do, tell us kind of the history of this report. This is kind of the inaugural release. What was kind of behind it, why did you guys do this? And give us a little background before we get into the details. >> Absolutely. So, yes, that's exactly right. It's debuting today, and we plan to kind of update this research quarterly; we're going to see the trends over time. And this emerged because, you know, as part of my job, I talk to chief data officers and chief analytics officers across our customer base and prospects. And I keep hearing anecdotally over and over that establishing a data culture is often the number one priority for these data leaders and for these organizations. And so we wanted to really say, can we quantify that?
Can we agree upon a definition of data culture? And can we create sort of a simple yardstick to more objectively measure where organizations are on this sort of data maturity curve to get it into culture. >> Right. I love it. So you created this data index, right? The data culture index. And I think it's important to look at methodology. I think people a lot of times go right to the results on reports before talking about the methodologies. And let's talk about the methodologies 'cause we're supposed to be talking about data, right? So you talked to 300-some-odd executives, correct? And I think it's really interesting, and you broke it down into three kind of buckets of data literacy, if you will. Data search and discovery, number one; two, kind of literacy in terms of their ability to work with the data; and then the third bucket is really data governance. And then in the form ABCD, you gave them a four point score and basically, are they doing it well? Are they doing it the majority of the time? Are they doing it about half? They got one or they got a zero, and you get this four point scale and you end up with a 12 point scale, which we're all familiar with from school, from an A to an A minus and B, et cetera. Just dig in a little bit on those three categories and how you chose those. So the first one again is kind of the data search and discovery, you know, can they find it, and then their competency, if you will, and then governance and compliance. Kind of dig into each of those three buckets a little bit. >> For sure. So the end goal in data culture is to have an organization in which data is valued and decisions are made based on data and evidence, right? Versus a culture in which we go with the highest paid person's opinion or what we did last quarter or any of these other ways things get done. And so the idea is to make that possible, as you said, you have to be able to find the data when you need it.
That's the data search and discovery. You have to be able to interpret that data correctly and draw valid conclusions from it. And that's data literacy, excuse me. And both of those are contingent upon having data governance in place, so that data is well-defined and has high data quality, as well as other aspects, so that it is possible to find it and understand it properly. >> Right. And one of the things too that I think is really important that we call out, and again, we're going to dive into the details, is your perceived execution versus the reported execution by the people that are actually providing data. And I think you've found, and you've highlighted on specific slides, that you know, there's not necessarily a match there. And sometimes, you know, what you perceive is happening isn't necessarily what's happening when you go down and query the people in the field. So really important to come up with a number. And I think you said this is going to be an ongoing thing over a period of time. So you kind of start to see longitudinal changes in these organizations. >> Absolutely. And we're very excited to see those trends over time. But even at the outset, this, you know, very striking effect emerges, which is, as you said, if we ask one of these, you know, 300 data leaders, you know, all around the world actually, how is the data culture at your company overall, in this very broad, general, top-down way, and have them grade it on the sort of same scale, you know, we get results where there's a large gap between kind of that level of maturity and what emerges in a bottom-up methodology, excuse me, in which you ask about, you know, governance and literacy and such kind of by department and in a more bottom-up way. And so we do see that, you know, it can be helpful, even for data people, to have a more granular metric and framework for quantifying their progress. >> Right? Let's jump into some of the results.
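
The index arithmetic described above can be sketched in a few lines, along with the split into thirds that comes up later in the conversation. This is only a minimal sketch under stated assumptions: it assumes each of the three pillars is graded from 0 to 4 and the grades are simply summed to a 0 to 12 score, and that "thirds" means ranking respondents by score and cutting the ranking into three groups. The survey's actual scoring rules are not spelled out in the conversation, and the function names, pillar keys, and organizations below are hypothetical.

```python
PILLARS = ("search_and_discovery", "literacy", "governance")


def dci_score(pillar_scores):
    """Sum three pillar grades (each assumed to be 0-4) into a 0-12 index."""
    total = 0
    for pillar in PILLARS:
        grade = pillar_scores[pillar]
        if not 0 <= grade <= 4:
            raise ValueError(f"{pillar} grade must be between 0 and 4, got {grade}")
        total += grade
    return total


def split_into_thirds(scores_by_org):
    """Rank organizations by score and cut the ranking into three groups."""
    ranked = sorted(scores_by_org, key=scores_by_org.get)
    n = len(ranked)
    return {
        "bottom": ranked[: n // 3],
        "middle": ranked[n // 3 : 2 * n // 3],
        "top": ranked[2 * n // 3 :],
    }


# Hypothetical survey responses, one grade per pillar.
responses = {
    "org_a": {"search_and_discovery": 4, "literacy": 3, "governance": 4},
    "org_b": {"search_and_discovery": 2, "literacy": 2, "governance": 3},
    "org_c": {"search_and_discovery": 1, "literacy": 0, "governance": 2},
}

scores = {org: dci_score(grades) for org, grades in responses.items()}
tiers = split_into_thirds(scores)
print(scores)
print(tiers)
```

A per-pillar breakdown like this is also what makes the top-down versus bottom-up gap visible: an executive's single overall grade can be compared against the sum of the three pillar grades for the same organization.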
It's fascinating, they're kind of all over the map, but there's some definite trends. One of the trends you talked about is that there's a lot of questions on the quality of the data. But that's a real inhibitor to people. Whether that suspicion is because it's not good data, and I don't know, this is a question for you, do they think it's not relevant to the decision that's being made? Is it an incomplete data set or the wrong data set? It seems to be that keeps coming up over and over about decision-makers not necessarily having confidence in the data. Can you share a little bit more color around that? >> Yeah, it's quite interesting actually. So what we find is that 90%, so nine out of 10 executives, (indistinct) to question the data sometimes, often or always. But the part that's maybe disappointing or concerning is that two thirds of executives are believed to ignore the data and make a decision kind of pushing the data aside, which is really quite striking when you think about it. Why have all this data if more often than not you're sort of disregarding it to make your final answer? And so you're absolutely correct, when we dug into why, what are the reasons behind pushing it aside, data quality was number one. And I think it is a question of, oh, is the data inaccurate? Is it out of date? These sort of concerns we hear from customers and prospects. But as we dig in deeper in the survey results, excuse me, we see some other reasons behind that. One is a lack of collaboration between the data analytics folks and the business folks. And so there's a question of, I don't know exactly where this data came from, or to your point, kind of how it was produced. What was the methodology? How was it sourced? And maybe because of that disconnect is a lack of trust. So trust really is the ultimate, I think, failure to having data culture really take root. >> Right?
And it's trust, as you said, not only in the data per se, the source of the data, the quality of the data, the relevance of the data, but also the people who are providing you with the data. And obviously you get some data sets. Sometimes you didn't get other data sets. So, that's really a little bit disconcerting. The other thing I thought was kind of interesting is, it seems to be consistent that the primary reason that people are using big data projects is around operations and operations efficiency, a little bit about compliance, but, you know, it's interesting, we had you on at the MIT CDOIQ, the Chief Data Officer and Information Quality Symposium, and you talked about the goodness of people moving from kind of a defensive posture to an offensive posture, you know, using data in terms of product development and innovation. And what comes across in this survey is that's kind of down the list behind, you know, kind of operational efficiency. We're seeing a little bit of governance and regulation, but the quest for data as a tool for innovation didn't really shine through in this report. >> Well, you know, it's very interesting. It depends whether you look at the aggregate level or you break things down a little bit more. So one thing we did after we got that zero to 12 scale on the data culture index, or DCI, is we were actually able to break it down into thirds. And among the sort of bottom third, which has the least well-established data culture by this yardstick, we found that governance and regulatory compliance was the number one application of data. But among the top third of respondents, we actually found the opposite, where things like providing a great customer experience, doing product innovation, those sort of things actually came to the fore and governance fell behind. So I think there is this curve where it's table stakes to get the sort of defense side of data figured out.
And then you can move on to offense in using data to make your organization meet its other goals. >> Right. Right. And then I wanted to get your take on kind of the democratization of data, right? This is a trend that's been going on, and really, I think you said before, you know, your guys' whole mission is to empower a curious and rational world, to give people the ability to ask the right questions, have the right data and get the right answer. So, you know, we've seen democratization in terms of the access to the data, the access to the tools, the ability to do something with the data and the tool, and then the actual authority to execute a business decision based on that. The results on that seem a little bit split here, because a lot of the problems seem to be focused on leadership not necessarily taking a data-based decision move, but on the good hand, a lot of people are trying to break down data silos and make data more accessible for a larger group of people, so that more people in the organization are making data-based decisions. This seems kind of like a little bit of a bifurcation between the C suite and everybody else trying to get their job done. >> Absolutely. There's always this question of, you know, sort of that organization-wide initiative and then what's happening on the ground. One thing we saw that was very heartening, and aligns with our customers index success, is a real emphasis being placed on having data governance and data context and data literacy factors sort of be embedded at the point of use. Not expecting people to just, like, take a course and look things up and kind of end up with their workflow to be able to use data quickly and accurately and interpret it in varied ways. So that was really exciting to see as an initiative. It sort of bridges that gap, along with initiatives to have more collaboration and integration between the data people and the business people.
because really, you know, they exist to serve one another. But in terms of the disconnect between the C suite and other parts of the org, there was a really interesting inverse correlation. Well, or maybe it's not interesting, depending how you look at it, but basically, you know, when we talk to C-level executives and ask, you know, does the C suite ignore data? Do they question data, et cetera, those numbers came in lower than when we talked to, you know, senior directors about the C suite, right? It's sort of the farther you get, and there's a difference there. You know, from my perspective, I almost wonder whether that distance actually is a more objective viewpoint. And when you're in that role, it's hard to even see your cognitive biases and your tendency to ignore data when it doesn't suit you. >> Right. Right. So there's some other interesting things here. So one of them is, you know, kind of predictors, right? One of the whole reasons to do studies and collect data is so that we can have some predictive ability. And it comes out here that the reporting structure is a strong predictor of a company's data tier structure. So, you know, there's the whole rise of the chief data officers and the chief analytics officer and the chief data and analytics officer, and lots of conversations about those roles and what exactly are those roles and who do they report to. Your study finds a pretty compelling leading indicator that if that role is reporting to either the CEO or the executive board, which is often one and the same person, that that's actually a terrific indicator of success in moving to a more data-centric culture. >> That's absolutely correct. So we found that that top third of organizations on the data culture index were much more likely to have a chief data executive, a CDO, CAO or CDAO.
In fact, they're more likely to have folks with analytics in their title, because in some organizations data is thought to mean sort of raw data, infrastructural defense, and analytics is sort of where it gets, you know, infused into business processes and value. But certainly that top third is much more likely to have the chief data executive reporting into the executive board or CEO. When the highest ranking data executive is under the CIO or some other part of the organization, those orgs tend to score far lower on the DCI. >> Right. Right. So it's interesting, you know, you're a really interesting guy, been doing this for a while. You were at Siri before you were at Alation. So you have a really good feel for kind of what data can do and can't do, and natural language processing and human voice interaction with these devices is a really interesting case study, and they can do a really good job within a small defined data set and instruction set, but they don't do necessarily so well once you kind of get outside how they're trained. And you've talked a lot about how metaphors shape the way that we think, and I know you and Dave talked about data oil and data lakes. I don't want to necessarily go down that whole path, but I do think it's important, and what came out of the study and the way people think about data. You know, there's a lot of conversation. How do you value data? Is data, you know, it used to just be an expense that we had to buy servers to store the stuff we weren't sure what we ever did with it. So I wonder if there's any, you know, kind of top level metaphors level, kind of a thought or process or framing in the companies that you study that came out.
maybe not necessarily in the top line data, but maybe in some of the notes, that help define why some people, you know, are being successful at making this transition and putting, you know, kind of data out front of their decision processing, versus data either behind as a supporting thing, or maybe data, I just don't have time with it, or I don't trust it, or God knows where you got that, and this is not the data that I wanted. You know, was there any, you know, kind of tangential or anecdotal stuff that came out of this study that's more reflective of the softer parts of a data culture versus the harder parts in terms of titles and roles and job responsibilities? >> Yeah. It's a really interesting place to explore. I don't want to make this overly simplistic or binary, but at the end of the day, you know, like anything else within an organization, you can view data as a liability and say, okay, we have, for example, you know, customers' names and phone numbers and passwords, and we just need to prevent an adverse event in which there's a leak or some sort of InfoSec problem that could cause, you know, bad press and fines and other negative consequences. And I think the issue there is if data's a liability, you know, the best case is that it's worth zero as opposed to some huge negative on your company's balance sheet. And I think, you know, intuitively, if you really want to prevent data misuse and data problems, one fail-safe, but I think ultimately in its own way risky, way to do that is just not collect any data, right, and not store it. So I think that the transition is to say, look, data must be protected and taken care of, that's step zero.
But you know, it's really just the beginning, and data is this asset that can be used to inform the huge company-level strategic decisions that are made in annual planning at the board level, down to the millions of little decisions every day in the work of people in customer support and in sales and in product management and in, you know, various roles just across industries. And I think once you have that shift, you know, the upside is potentially, you know, unbounded. >> Right. And it just changes the way you think. And suddenly instead of saying, oh, data needs to be kind of hidden away, it's more like, oh, people need to be trained on data use and empowered with data. And it's all about not if it's used or if it's misused, but really how it's used and why it's used, what it's being used for to make a real impact. >> Right. Right. And it's funny, I just remember being back in business school, one of the great things that they teach is to think in terms of data, right? And you always have the infamous center consulting interview question, how many manhole covers are in Manhattan? Right. So, you know, to start to think about that problem from a data-centric point of view really gives you a leg up and even, you know, where to start and how to attack those types of problems. And I thought it was interesting, you know, talking about challenges for people to have a more data-centric point of view. The report says basically everybody said there's all kinds of challenges around data quality and compliance, and they had democratization. But the bottom companies said that the biggest challenge was lack of buy-in from company leadership.
So I guess the good-news, bad-news is that there's a real opportunity to make a significant change and get your company from the bottom third to a middle third or a top third, simply by making a change in attitude about putting data in a much more central role in your decision-making process. 'Cause all the other stuff's kind of operational, execution challenges that we all have: not enough people, blah, blah, blah. But in terms of attitude of leadership and prioritization, that's something that's very easy to change if you so choose. And it really seems to be the key to unlock this real journey, as opposed to the minutiae of a lot of the little details that are a challenge for everybody. >> Absolutely. And changing attitudes might be the easiest thing or the hardest thing, depending on (indistinct). But I think you're absolutely right. The first step, which could, maybe should, be easy, is admitting that you have a problem or, maybe to put it more positively, realizing you have an opportunity. >> I love that. And then, just again looking at the top-tier companies, the other thing that I thought was pretty interesting in this study, I'm looking at it here, is getting champions in each of the operational segments. So, I mean, a chief data officer is important, and, you know, somebody kind of at the high level to shepherd it in the executive suite, as we just discussed, but within each of the individual tasks and functions and roles, whether that's operations or customer service or product development or operational efficiency, you need some type of champion, some type of person, you know, banging the gavel, collecting the data, smoothing out the complexities, helping people get their thing together. And again, that's another way to really elevate your position on the score. >> Absolutely.
And I think this idea of, again, bridging between, you know, if data is centralized, you have a chance to try to really get excellent practices within the data org. But then it becomes even more essential to have those ambassadors, people who are in the business and understand all the business context, who can sort of make the data relevant, identify the key areas where data can really help, maybe demystify data, and pick the right metaphors and the right examples to make it real for the people in their function. >> Right. Right. So, Aaron, there's a lot of great stuff. People can go to the website at alation.com. I'm sure you'll have a link to this very prominently displayed, and they should check it out and really think about it and think about how it applies to their own situation, their own department, company, et cetera. I just wanted to give you the last word before we sign off. You know, kind of what was the most, you know, kind of positive affirmation, or not the most, but one or two of the most affirming outcomes of this exercise? And what were one or two of the things that were a little concerning or, you know, kind of surprises on the downside that came out of this research? >> Yeah. So I think one thing that was maybe surprising or concerning, the biggest one, is sort of where we started, with that disconnect between, you know, what people would say as an off-the-cuff overall assessment and what emerges when we go department by department and (indistinct) to be pillars of data culture, from search and discovery to data literacy to data governance. I think that disconnect, you know, should give one pause. I think certainly it should make one think, hmm, maybe I shouldn't look from 10,000 feet, but actually be a little more systematic in considering the framework I use to assess data culture, which is the most important thing to my organization.
I think, though, there's this quote that you move what you measure. Just having this hopefully simple but not simplistic yardstick to measure data culture, the data culture index, should help people be a little bit more realistic in their quantification as they track their progress, you know, quarter over quarter. So I think that's very promising. I think another thing is that, you know, sometimes we ask, how long have you had this initiative? How much progress have you made? And it can sometimes seem like pushing a boulder uphill. Obviously the COVID pandemic and the economic impacts of that have been really tragic and really hard. You know, a tiny silver lining in that is the survey results showed that organizations have really observed a shift in how much they're using data, because sometimes things are changing but it's like a frog in boiling water: you don't realize it. And so you just assume that the future is going to look like the recent past, and you don't look at the data, or you ignore the data, or you miss parts of the data. And a lot of organizations said, you know, COVID was this really troubling wake-up call, but that it could, even after this crisis is over, produce enduring change in which people are consulting data more and making decisions in a more data-driven way. >> Yeah, certainly an accelerant, that is for sure, whether you wanted it, didn't want it, thought you had the time, didn't have the time. You know, COVID is definitely a digital transformation accelerant, and data is certainly the thing that powers that. Well, again, it's the Alation State of Data Culture Report, available at alation.com; go check it out. Aaron, always great to catch up, and again, thank you for doing the work and supporting this research. I think it's really important stuff. And it's going to be interesting to see how it changes over time, 'cause that's really when these types of reports really start to add value.
>> Thanks for having me, Jeff, and I really look forward to discussing some of those trends as the research is completed. >> All right. Thanks a lot, Aaron, take care. All right. He's Aaron and I'm Jeff. You're watching theCUBE, Palo Alto. Thanks for watching. We'll see you next time. (upbeat music)

Published Date : Oct 1 2020

ENTITIES

Entity | Category | Confidence
Aaron | PERSON | 0.99+
Dave | PERSON | 0.99+
Jeff | PERSON | 0.99+
Jeff Frick | PERSON | 0.99+
Aaron Kalb | PERSON | 0.99+
Palo Alto | LOCATION | 0.99+
one | QUANTITY | 0.99+
10 executives | QUANTITY | 0.99+
12 point | QUANTITY | 0.99+
September 2020 | DATE | 0.99+
Siri | TITLE | 0.99+
90% | QUANTITY | 0.99+
90 people | QUANTITY | 0.99+
Manhattan | LOCATION | 0.99+
two | QUANTITY | 0.99+
CUBE | ORGANIZATION | 0.99+
10,000 feet | QUANTITY | 0.99+
One | QUANTITY | 0.99+
both | QUANTITY | 0.99+
Boston | LOCATION | 0.99+
each | QUANTITY | 0.99+
today | DATE | 0.99+
zero | QUANTITY | 0.99+
first step | QUANTITY | 0.99+
theCUBE | ORGANIZATION | 0.99+
four point | QUANTITY | 0.98+
alation.com | OTHER | 0.98+
Alation State of Data Culture Report | TITLE | 0.98+
one thing | QUANTITY | 0.98+
COVID pandemic | EVENT | 0.97+
millions | QUANTITY | 0.96+
third bucket | QUANTITY | 0.96+
Alation | ORGANIZATION | 0.95+
first one | QUANTITY | 0.94+
two thirds | QUANTITY | 0.94+
last quarter | DATE | 0.92+
300 data leaders | QUANTITY | 0.91+
about half | QUANTITY | 0.91+
three categories | QUANTITY | 0.9+
three buckets | QUANTITY | 0.89+
MIT CDOIQ | ORGANIZATION | 0.89+
third | QUANTITY | 0.89+
InfoSec | ORGANIZATION | 0.88+
step zero | QUANTITY | 0.86+
first person | QUANTITY | 0.85+
three kind | QUANTITY | 0.84+
thirds | QUANTITY | 0.83+
Alation | PERSON | 0.82+
12 scale | QUANTITY | 0.74+
C suite | TITLE | 0.73+
C | TITLE | 0.71+
300 | OTHER | 0.71+
One thing | QUANTITY | 0.7+
bottom | QUANTITY | 0.67+
Alation State of Data Culture Report | TITLE | 0.65+
minutes | DATE | 0.58+
Officer | EVENT | 0.56+
top third | QUANTITY | 0.56+
middle | QUANTITY | 0.51+

Aaron Kalb, Alation | CUBEConversation, September 2020


 

>> Announcer: From theCUBE studios in Palo Alto, in Boston, connecting with thought leaders all around the world. This is theCUBE conversation. >> Hey, welcome back, everybody. Jeff Frick here with theCUBE. We're in our Palo Alto studios today for theCUBE conversation. We're talking about data. We're always talking about data and it's really interesting. You know we like to go out and get you the first person insight from the people that start the companies, run the companies, the practitioners and, and, and get the insight directly from them. We also like to go out and get original research and hear from original research. And this is a great opportunity to hear from both. So we're excited to have, and welcome back into the studio. He's Aaron Kalb. He's the co founder of Alation, many time CUBE alumni. Aaron. Great to see you. >> Yeah, thanks for having me. It's good to be here. >> Yeah, it's very cool. But today it's a special, a special thing. We've never done this before with you. You guys are releasing a brand new report called, the Alation State of Data Culture Report. So really interesting report. A lot of great information that we're going to dig in here for the next few minutes. But before we do, tell us kind of the history of this report. This is a, the kind of the inaugural release. What was kind of behind it, why did you guys do this? And give us a little background before we get into the details. >> Absolutely. So, yes, that's exactly right. It's debuting today that we plan to kind of update this research quarterly we going to see the trends over time. And this emerged because, you know, I, part of my job, I talk to chief data officers and chief analytics officers across our customer base and prospects. And I keep hearing anecdotally over and over that establishing a data culture, is often the number one priority for these data leaders and for these organizations. And so we wanted to really say, can we quantify that? 
Can we agree upon a definition of data culture? And can we create sort of a simple yardstick to more objectively measure where organizations are on this sort of data maturity curve to get it into culture. >> Right. I love it. So you created this data, data index right? The data culture index. And, and I think it's important to look at methodology. I think people, a lot of times go right to the results on reports before talking about the methodologies. And let's talk about the methodologies cause we're supposed to be talking about data, right? So you talked to 300, some odd executives, correct. And I think it's really interesting and you broke it down into three kind of buckets of data literacy, if you will. Data search and discovery, number one, data, two kind of literacy in terms of their ability to work with the data. And then the third bucket is really data governance. And then in, in the form ABCD, you gave him a four point score and basically, are they doing it well? Are they doing it in the majority of the time? Are they doing it about half, they got one or they got a zero and you get this four point scale and you end up with a 12 point scale which we're all familiar with from, from school, from an A to an, A minus and B, et cetera. Just dig it a little bit on those three categories and how you chose those. So the first one again is kind of the data search and discovery, you know can they find it and then their competency, if you will and then a governance and compliance. Kind of dig into each of those three buckets a little bit. >> For sure. So, so the, the end goal in data culture, is to have an organization in which data is valued and decisions are made based on data and evidence, right? Versus a culture in which we go with the highest paid person's opinion or what we did last quarter or any of these other ways things get done. And so the idea is to make that possible, as you said you've to be able to find the data when you need it. 
That's the data search and discovery. You've to be able to interpret that data correctly and draw valid conclusions from it. And that's a data literacy, excuse me. And both of those are contingent upon having data governance in place. So that data is well-defined and has high data quality, as well as other aspects, so that it is possible to find it and understand it properly. >> Right. And what are the things too that I think is really important that we call that, and again, we're going to dive into the details, is your perceived execution versus the reported execution by the people that are actually providing data. And I think you've found and you've highlighted on specific slides that you know, there's not necessarily a match there. And sometimes that you know, what you perceive is happening, isn't necessarily what's happening when you go down and query the people in the field. So really important to come up with a number. And I think a, I think you said this is going to be an ongoing thing over a period of time. So you kind of start to see longitudinal changes in these organizations. >> Absolutely. And we're very excited to see those, those trends over time. But even at the outset is this you know, very striking effect emerges which is, as you said, if we ask one of these you know, 300 data leaders, you know, all around the world actually, you know, if we ask, how is the data culture at your company overall, and this is very broad general top down way and have them graded on the sort of SaaS scale. You know, we get results where there's a large gap between kind of that level of maturity and what emerges in a bottom up methodology excuse me, in which you ask about, you know governance and literacy and, and such kind of by department and in a more bottom up way. And so we do see that that, you know, it can be helpful, even for data people to have a, a more granular metric and framework for quantifying their progress. >> Right? Let's jump into some of the results. 
It's, it's a fascinating, they're kind of all over the map, but there's some definite trends. One of the trends you talked about is that there's a lot of questions on the quality of the data. But that's a real inhibitor to people. Whether that suspicion is because it's not good data. And I don't know, this question for you, is, is, do they think it's not relevant to the decision that's being made? Is it an incomplete data set or the wrong data set? It seems to be that keeps coming up over and over about, decision-makers not necessarily having confidence in the data. What, can you share a little bit more color around that? >> Yeah, it's quite interesting actually. So what we find is that 90%. So 90 people, 10 executives (indistinct) to question the data sometimes often or always. But the part that's maybe disappointing or concerning is the two thirds of executives are believed to ignore the data and make a decision kind of pushing the data aside which is really quite striking when you think about it, why have all this data, if more often than not you're sort of disregarding it to make your final answer. And so you're absolutely correct when we dug into why, what are the reasons behind pushing it aside. Data quality was number one. And I think it is a question of, Oh, is the data inaccurate? Is it out of date, these sort of concerns sort of we, we hear from customers and prospects. But as we dig in deeper in the survey results, excuse me, we, we see some other reasons behind that. One is a lack of collaboration between the data analytics folks and the business folks. And so there's a question of, I don't know exactly where this data came from or to your point kind of how it was produced. What was the methodology? How was it sourced? And maybe because of that disconnect is a lack of trust. So trust really is the ultimate I think, failure to having data culture really take root. >> Right? 
And it's trust in this trust, as you said, not only in the data per se, the source of the data, the quality of the data, the relevance of the data but also the people who are providing you with the data. And obviously you get, you get some data sets. Sometimes you didn't get other data sets. So, that's really I'm a little bit disconcerting. The other thing I thought was kind of interesting is, it seems to be consistent that the, the primary reason that people are using big data projects is around operations and operations efficiency, a little bit about compliance, but, you know, it's interesting we had you on at the MIT CDOIQ, Chief Data Information Officer quality symposium, and you talked about the goodness of people moving from kind of a defensive posture to an offensive posture, you know using data in terms of product development and innovation. And, and what comes across in this survey is that's kind of down the list behind you know, kind of operational efficiency. We're seeing a little bit of governance and regulation but the, the quest for data as a tool for innovation, didn't really shine through in this report. >> Well, you know, it's very interesting. It depends whether you look at the aggregate level or you break things down a little bit more. So one thing we did after we got that zero to 12 scale on the data culture index or DCI, is it actually, we were able to break it down into thirds. And among the sort of bottom third, it has the least well-established data culture by this yardstick. We've found that governance and regulatory compliance, was the number one application of data. But among the top third of respondents, we actually found the opposite where things like providing a great customer experience, doing product innovation, those sort of things actually came to the fore and governance fell behind. So I think there is this curve where, It's table stakes to get the sort of defense side of data figured out. 
And then you can move on to offense in using data to make your organization meet its meet its other goals. >> Right. Right. And then I wanted to get your take on kind of the democratization of data, right? This is a, this is a trend that's been going on, and really, I think you said before you know, your guys' whole mission is to empower curious and rational world to give people the ability to ask the right questions have the right data and get the right answer. So, you know, we've seen democratization in terms of the access to the data, the access to the tools, the ability to do something with the data and the tool, and then the actual authority to execute business decision based on that. The results on that seem a little bit split here because a lot of the problems seem to be focused on leadership, not necessarily taking a data based decision move, but on the good hand a lot of people trying to break down data silos and make data more accessible for a larger group of people. So that more people in the organization are making data based decisions. This seems kind of like this little bit of a bifurcation between the C suite and everybody else trying to get their job done. >> Absolutely. There's always this question of you know, sort of the, that organizational wide initiative and then what's happening on the ground. One thing we saw that was very heartening and aligns with our customers index success, is a real emphasis being placed on having data governance and data context and data literacy factors sort of be embedded at the point of use. To not expecting people, to just like take a course and look things up and kind of end up with their workflow to be able to use data quickly and accurately and, and interpret it in varied ways. So that was really exciting to see as, as, as a initiative. It sort of bridges that gap along with initiatives to have more collaboration and integration between the data people and the business people. 
because really you know, they exist to serve one another. But in terms of the disconnect between the C suite and other parts of the org, there was a really interesting inverse correlation. Well, or maybe it's not interesting how you look at it, but basically, you know, when we talk to C level executives and ask, you know, does the C suite ignore data? Do they question data et cetera, those numbers came in lower than when we talked to, you know, senior director about the C suite right? It's sort of the farther you get, and there's a difference there, you know, from my perspective, I almost wonder whether that distance is actually is more objective viewpoint. And when you're in that role, it's hard to even see your cognitive biases and your tendency to ignore a data when it doesn't suit you. >> Right. Right. So there's, there's some other interesting things here. So one of them is, you know, kind of predictors, right? One of the whole reasons to do studies and collect data so that we can have some predictive ability. And, and it comes out here that the reporting structure is a strong predictor of a company's data tier structure. So, you know, there's the whole rise of the chief data officers and the chief analytics officer and the chief data and analytics officer and lots of conversations about those roles and what exactly are those roles and who do they report to. Your study finds a pretty compelling leading indicator that if that role is reporting to either the CEO or the executive board, which is often a one in the same person, that that's actually a terrific indicator of success in moving to a more data centric culture. >> That's absolutely correct. So we found that that top third of organizations on the data culture index were much more likely to have a chief data executive, a CDO, CAO or CDAO. 
In fact, they're more likely to have folks with the analytics in their title because in some organizations, data is thought to mean sort of raw data, infrastructural defense and analytics is sort of where it gets you know, infused into business processes and value. But certainly that top third is much more likely to have the chief data executive reporting into the executive board or CEO when the highest ranking data executive is under the CIO or some other part of the organization, those orgs tend to score a far lower on the DCI. >> Right. Right. So it's interesting, you know you're a really interesting guy even doing this for a while. You were at Siri before you were at Alation. So you have a really good feel for kind of what data can do and can't do and natural human or natural language processing and, and, and human voice interaction with these devices, a really interesting case study, and they can do a really good job within a small defined data set and instruction set, but they don't do necessarily so well once you kind of get outside how, how they're trained. And you've talked a lot about how metaphor shaped the way that we think and I know you and Dave talked about data oil and data lakes I don't want to necessarily go down that whole path but I do think it's important. And what came out of the study and the way people think about data. You know, there's a lot of conversation. How do you value data? Is data, you know it used to just be an expense that we had to buy servers to store the stuff we weren't sure what we ever did with it. So I wonder if there's any, you know, kind of top level metaphors level, kind of a thought or process or framing in the companies that you study that came out. 
maybe not necessarily in the top line data, but maybe in some of the notes that help define why some people, you know are being successful at making this transition and putting, you know kind of data out front of their decision processing versus data, either behind as a supporting thing or maybe data, I just don't have time with it or I don't trust it, or God knows where you got that, and this is not the data that I wanted. You know, was there any, you know, kind of tangental or anecdotal stuff that came out of this study that's more reflective of, of the softer parts of a data culture versus the harder parts in terms of titles and roles and, and, and job responsibilities. >> Yeah. It's a really interesting place to explore. I do think there's a, I don't want to make this overly simplistic group binary, but at the end of the day you know, like anything else within an organization, you can view data as a liability to say, okay, we have for example, you know, customer's names and phone numbers and passwords, and we just need to prevent an adverse event in which there's a leak or some sort of InfoSec problem that could cause, you know, bad press and fines and other negative consequences. And I think the issue there is if data's a liability, the most you know, the best case is that it's worth zero as opposed to some huge negative on your company's balance sheet. And, and I think, you know, intuitively, if you really want to prevent data misuse and data problems, one fail safe, but I think ultimately in its own way risky way to do that was just not collect any data, right. And not store it. So I think that the transition is to say, look data must be protected and taken care of that's step zero. 
But you know, it's really just the beginning and data is this asset that can be used to inform the huge company level strategic decisions that are made in annual planning at the board level, down to the millions of little decisions every day in the work of people in customer support and in sales and in product management and in, you know, various roles that just across industries. And I think once you have that, that shift, you know the upside is potentially, you know, unbounded. >> Right. And, and it just changes the way, the way you think. And suddenly instead of saying, Oh, data needs to be kind of hidden away, it's more like, Oh, people need to be trained on data use and empowered with data. And it's all about not if it's used or if it's misused but really how it's used and why it's used, what it's being used for to make a real impact. >> Right. Right. And it's funny when I just remember it being back in business school one of the great things that help teach is to think in terms of data, right. And you always have the infamous center consulting interview question, How many manhole covers are in Manhattan. Right. So, you know, to, to, to start to think about that problem from a data centric, point of view really gives you a leg up and, and even, you know where to start and how to attack those types of problems. And I thought it was interesting you know, talking about challenges for people to have a more data centric, point of view. It's interesting. The reports says, basically everybody said there's all kinds of challenges around data quality and compliance, and they had democratization. But the bottom companies, the bottom companies said that the biggest challenge was lack of buy in from company leadership. 
So I guess the good news bad news is that there's a real opportunity to make a significant change and get your company from the bottom third to a middle third or a top third, simply by taking a change in attitude about putting data in a much more central role in your decision making process. 'Cause all the other stuff's kind of operational, execution challenges that we all have, not enough people, blah, blah, blah. But in terms of attitude of leadership and prioritization, that's something that's very easy to change if you so choose. And really seems to be the key to unlock this real journey as opposed to the minutiae of a lot of the little details that that are a challenge for everybody. >> Absolutely. In your changing attitudes might be the easiest thing or the hardest thing depending on (indistinct). But I think you're absolutely right. The first step, which, which which could, maybe it should be easy, is admitting that you have a problem or maybe to put it more positively, realizing you have an opportunity. >> I love that. And then just again, looking at the top tier companies, the other thing that I thought was pretty interesting in this study is, I'm looking at it here, is getting champions in each of the operational segments. So rather than, I mean, a chief data officer is important and you know, somebody kind of at the high level to shepherd it in the executive suite, as we just discussed, but within each of the individual tasks and functions and roles, whether that's operations or customer service or product development or operational efficiency, you need some type of champion, some type of person, you know, banging the gavel, collecting the data, smoothing out the complexities, helping people get their thing together. And again, another way to really elevate your position on the score. >> Absolutely. 
And I think this idea of again, bridging between, you know, if data is centralized you have a chance to try to really get excellent practices within the data org. But even it becomes even more essential to have those ambassadors, people who are in the business and understand all the business context who can sort of make the data relevant, identify the key areas where data can really help, maybe demystify data and pick the right metaphors and the right examples to make it real for the people in their function. >> Right. Right. So Aaron has a lot of great stuff. People can go to the website at alation.com. I'm sure you'll have a link to this, a very prominently displayed, but, and they should and they should check it out and really think about it and think about how it applies to their own situation, their own department, company et cetera. I just wanted to give you the last word before we before we sign off, you know, kind of what was the most you know, kind of positive affirmation or not the most but one or two of the most outcome affirming outcomes of this exercise. And what were one or two of the things that were a little concerning or, you know, kind of surprises on the downside that, that came out of this research? >> Yeah. So I think one thing that was maybe surprising or concerning the biggest one is sort of where we started with that disconnect between, you know, what people would, say as an off the cuff overall assessment and the disconnect between that and what emerges when we go department by department and (indistinct) to be pillars of data culture from such a discovery to data literacy, to data governance. I think that disconnect, you know, should give one pause. I think certainly it should make one think, Hmm. Maybe I shouldn't look from 10,000 feet, but actually be a little more systematic. And considering the framework I use to assess data culture that is the most important thing to my organization. 
I think though, there's this quote that you move what you measure, just having this hopefully simple but not simplistic yardstick to measure data culture and the data culture index should help people be a little bit more realistic in their quantification and they track their progress, you know, quarter over quarter. So I think that's very promising. I think another thing is that, you know sometimes we ask, how long have you had this initiative? How much progress have you made? And it can sometimes seem like pushing a boulder uphill. Obviously the COVID pandemic and the economic impacts of that has been really tragic and really hard. You know, a tiny silver lining in that is the survey results showed that organizations have really observed a shift in how much they're using data because sometimes things are changing but it's like a frog in boiling water. You don't realize it. And so you just assume that the future is going to look like the recent past and you don't look at the data or you ignore the data or you miss parts of the data. And a lot of organizations said, you know COVID was this really troubling wake up call, but they could even after this crisis is over, producing enduring change which people were consulting data more and making decisions in a more data driven way. >> Yeah, certainly an accelerant that, that is for sure whether you wanted it, didn't want it, thought you had it at the time, didn't have time. You know COVID is definitely digital transformation accelerant and data is certainly the thing that powers that. Well again, it's the Alation State of Data Culture Report available, go check it at alation.com. Aaron always great to catch up and again, thank you for, for doing the work and supporting this research. And I think it's really important stuff. And it's going to be interesting to see how it changes over time. 'Cause that's really when these types of reports really start to add value. 
>> Thanks for having me, Jeff, and I really look forward to discussing some of those trends as the research is completed. >> All right. Thanks a lot, Aaron, take care. All right. He's Aaron and I'm Jeff. You're watching theCUBE, Palo Alto. Thanks for watching. We'll see you next time. (upbeat music)

Published Date : Sep 30 2020

SUMMARY :

leaders all around the world. and get the insight directly from them. It's good to be here. This is a, the kind of you know, I, part of my job, and then their competency, if you will And so the idea is to make that possible, And sometimes that you know, But even at the outset is this you know, One of the trends you talked of pushing the data aside and you talked about the And among the sort of bottom third, in terms of the access to the It's sort of the farther you get, and the chief data and analytics officer where it gets you know, and putting, you know but at the end of the day you know, the way, the way you think. a lot of the little details that you have a problem or and you know, somebody and the right examples to make it real before we sign off, you know, And a lot of organizations said, you know and data is certainly the and I really look forward to We'll see you next time.

ENTITIES

Entity | Category | Confidence
Aaron | PERSON | 0.99+
Dave | PERSON | 0.99+
Jeff | PERSON | 0.99+
Jeff Frick | PERSON | 0.99+
Aaron Kalb | PERSON | 0.99+
Palo Alto | LOCATION | 0.99+
one | QUANTITY | 0.99+
10 executives | QUANTITY | 0.99+
12 point | QUANTITY | 0.99+
September 2020 | DATE | 0.99+
Siri | TITLE | 0.99+
90% | QUANTITY | 0.99+
90 people | QUANTITY | 0.99+
Manhattan | LOCATION | 0.99+
two | QUANTITY | 0.99+
CUBE | ORGANIZATION | 0.99+
10,000 feet | QUANTITY | 0.99+
One | QUANTITY | 0.99+
both | QUANTITY | 0.99+
Boston | LOCATION | 0.99+
each | QUANTITY | 0.99+
today | DATE | 0.99+
zero | QUANTITY | 0.99+
first step | QUANTITY | 0.99+
theCUBE | ORGANIZATION | 0.99+
four point | QUANTITY | 0.98+
alation.com | OTHER | 0.98+
Alation State of Data Culture Report | TITLE | 0.98+
one thing | QUANTITY | 0.98+
COVID pandemic | EVENT | 0.97+
millions | QUANTITY | 0.96+
third bucket | QUANTITY | 0.96+
Alation | ORGANIZATION | 0.95+
first one | QUANTITY | 0.94+
two thirds | QUANTITY | 0.94+
last quarter | DATE | 0.92+
300 data leaders | QUANTITY | 0.91+
about half | QUANTITY | 0.91+
three categories | QUANTITY | 0.9+
three buckets | QUANTITY | 0.89+
MIT CDOIQ | ORGANIZATION | 0.89+
third | QUANTITY | 0.89+
InfoSec | ORGANIZATION | 0.88+
step zero | QUANTITY | 0.86+
first person | QUANTITY | 0.85+
three kind | QUANTITY | 0.84+
thirds | QUANTITY | 0.83+
Alation | PERSON | 0.82+
12 scale | QUANTITY | 0.74+
C suite | TITLE | 0.73+
C | TITLE | 0.71+
300 | OTHER | 0.71+
One thing | QUANTITY | 0.7+
bottom | QUANTITY | 0.67+
Alation State of Data Culture Report | TITLE | 0.65+
minutes | DATE | 0.58+
Officer | EVENT | 0.56+
top third | QUANTITY | 0.56+
middle | QUANTITY | 0.51+

Kiran Narsu, Alation & William Murphy, BigID | CUBE Conversation, May 2020


 

>> Announcer: From theCUBE studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE Conversation. >> Hello, and welcome to theCUBE studio. I'm John Furrier, here in Palo Alto for our remote coverage of the tech industry. We are here with our quarantine crew, getting all the stories in the technology industry from all the thought leaders and all the newsmakers. We've got a great story here about data, data compliance, and really about the platforms around how enterprises are using data. I've got two great guests and some news to announce: Kiran Narsu, vice president of business development with Alation, and William Murphy, vice president of technology alliances at BigID. We've got some interesting news, an integration partnership between the two companies. Really kind of compelling, especially now as people have to look at cloud scale and what's happening in our world, certainly in the new realities of COVID-19. Going forward, the role of data, new kinds of applications, and the speed and agility are going to require more and more automation, more reality around making sure things are in place. So guys, thanks for coming on, appreciate it. Kiran, William, thanks for joining me. >> Thank you. >> Thank you. >> So let's take a step back. Alation, you guys have been on theCUBE many times. We've been following you; you've been a leader in the enterprise data catalog, a new approach. It's a real new technology approach and methodology and team approach to building out data catalogs. So talk about the alliance here. What's the news? Why are you guys creating this integration partnership? >> Well, let me start, and thank you for having us today. As you know, Alation launched the data catalog category seven years ago, and even today we're acknowledged as a leader in that space. But we really began with the core belief that ultimately data management will be driven more and more by business demand and less by information suppliers. So another way to think about
that is: how people behave with data will drive how companies manage data. So our philosophy, put very simply, is to start with people first, not data. Our customers really seem to agree with this approach, and we've got close to 200 brands using our tool every single day to drive vibrant data communities and foster a real data culture in their environments. One of the things that was really exciting to us is the interest in data privacy from large corporate customers trying to get their arms around this, and we really strive to improve our ability to use the tool inside these enterprises across more use cases. So the partnership that we're announcing with BigID today: BigID is the leading modern data intelligence platform for privacy, and what we're trying to do is bring a level of integration between our two technologies so that enterprises can better manage and scale their data privacy compliance capability. >> William, talk about BigID and what you guys are doing. You guys also have a data intelligence platform. We've been covering GDPR for a very long time. I once called it, well, I won't say it again because it wasn't really that complimentary, but the reality has set in, and users now understand more than ever that privacy is super important and companies have to deal with this. You guys have a solution. Take a minute to explain BigID and what you guys are doing. >> Yeah, absolutely. So our founders, Dimitri Sirota and Nimrod Vax, founded BigID in 2016, the same year GDPR was authored. The big reason there is that data changed, and how companies and enterprises dealt with data was changing pretty much forever. That profound change meant that the status quo could no longer exist, and so privacy was going to have to become a day-to-day reality for these enterprises. But what BigID realized is that to do anything with privacy, you actually have to understand where your data is, what it is, and whose it
is. And so that's really the genesis of what Dimitri and Nimrod created, which is a privacy-centric data discovery and intelligence platform that allows our enterprise customers (and we have over 70 customers in the enterprise space, many within the Fortune 100) to find, classify, and correlate sensitive data, as they define it, across data sources, whether it's on-prem or in the cloud. This gives our users a kind of unprecedented ability to look into their data and get better visibility, which both allows for collaboration and allows real-time decision-making to take place with better accuracy and confidence that regulations are not being broken and that customers' data is being treated appropriately. >> Great. I'm just reading here from the release, and I want to get your thoughts and unpack some of the concepts in here. The headline is "Alation Strengthens Privacy Capabilities with BigID Partnership, empowering organizations to mitigate risks, delivering privacy-aware data use and improved adherence to data privacy regulations." It's a mouthful, but the bottom line is that there's a lot of complexity around these rules and these platforms. And what's interesting: you mentioned discovery. The enterprise discovery side of the business has always been a complex nightmare. I think what's interesting about this partnership, from my standpoint, is that you guys are bringing an interface into a complex platform and creating an easy abstraction to make it usable. At the end of the day, we're seeing the trends with Amazon: they have Kendra, which they announced and are going to ship soon. Fast speed of insights has to be there, so unifying data interfaces with the back end really seems to be the pattern. Is that the magic going on here? Can you guys explain what's going on with this, and what's the outcome going to be for customers? >> Yeah, I guess I'll kick off, and Will, please chime in. I think really
there are three overarching challenges that I think enterprises are facing as they're grappling with these regulations, as Will talked about. Number one, it's really hard to both identify and classify private data. It's not as easy as it might sound, and we can talk a little bit more about that. It's also very difficult to flag, at the point of analysis, when somebody wants to find information, the relevant policies that might apply to the data they're looking to run an analysis on. And lastly, enterprises are constantly in motion. As enterprises change, buy new businesses, enter new markets, and launch new products, these policies have to keep up with that change. These are real challenges to address, and with BigID and Alation we're trying to really accelerate that compliance with the combination of our tools, reduce the cost and complexity of compliance, and fundamentally keep up through a single interface, so that users can know what to do with data at the point of consumption. I think that's the way to think about it. Will, I don't know if you want to add something to that. >> Absolutely. Kiran and I have been working on this for many months at this point. Most companies don't have a business plan of just saying, "Let's store as much data as possible without getting anything out of it." But in order to get something out of it, the ability to find that data rapidly and then analyze it, so that decision makers make up-to-date decisions, is pretty vital. A lot of these things, when they have to be done manually, take a long time, and there are huge business issues there. So the ability to automate both data discovery and cataloging across Alation and BigID gives those decision makers, whether the data steward, the data analyst, or the chief data officer, an ability to really dive deeper than they have previously, with better speed. >> You know, one of the things that we've
been talking about for a long time with big data is these data lakes, and they're fairly easy to pull together. I mean, you can put a bunch of data into a corpus and act on it. But as you start to get across these silos, there's a need for getting a process down around managing not only the data wrangling but the policies behind it, and platforms are becoming more complex. Can you guys talk about the product-market fit here? Because there's SaaS involved, so there's also customer activity. What's the product-market fit that you guys see with this integration? What are some of the things that you're envisioning emerging out of this value proposition? >> I think I can start. I think you're exactly right. Enterprises have historically made huge investments in data warehouses, data marts, data lakes, and all kinds of other technology infrastructure aimed at making the data easier to get to, but they've effectively just layered onto the problem. Alation's catalog has been incredibly effective at helping organizations find, understand, trust, reuse, and use that data, so that stewards and people who know about the data can inform users who may need to run a particular report or conduct a specific analysis, accelerate that process, and compress the time to insights much more than is possible with today's technologies. And if you overlay that onto the data privacy challenge, it's compounded. Will, it would be great for you to comment on what the data discovery capabilities at BigID do to improve that even further. >> Yeah, absolutely. So as two companies, we're trying to bridge this gap between data governance and privacy. And John, as you mentioned, there's been a proliferation of tools, whether they're data lakes, data analysis tools, et cetera. What BigID is able to do is look across over 70 different types of data platforms, whether they be legacy systems like SharePoint and SQL, whether
they be on-prem or in the cloud, whether it's data at rest or in motion, and we're able to auto-populate our metadata findings into Alation's data catalog. The main purpose there is that those data stewards have access to the most authentic, real-time data possible. >> So in terms of the customer value, they're going to see more built-in, privacy-aware features. Is it speed? You know what I mean: the problem is compounded with the data, getting that catalog and getting insights out of it. But for this partnership, is it speed to outcome? What is the outcome that you guys are envisioning here for the customer? >> I think it's a combination. Speed, as you said: they can much more rapidly get up to speed. An analyst who needs to make a decision about whether they can use a specific data set or not can know, at the point of analysis, if this data is governed by policies that have been informed by BigID, so the Alation catalog user can make a much more rapid decision about how to use it. The second piece is the complexity and cost of compliance. They can really reduce that and start to winnow down their technology footprint, because with the combination of the ongoing discovery that BigID provides and the enterprise data catalog provided by Alation, we give them the framework for being able to keep up with these changes in policies as rules and companies change, so they don't have to keep reinventing the wheel every time. So we think there's a significant speed and time-to-market advantage, as well as an ability to really consolidate the technology footprint. >> Well, I'll add to that, yeah, just one moment. So when Alation helped create this marketplace seven years ago, one of the goals there, and I think where BigID is assisting as well, is the trust and confidence that the users of this software, the data stewards and the analysts, have in the data that they're using. And then the trust and confidence they're building with
their end consumers is much better, knowing that this is both bidirectional and continuously ongoing. >> You know, I've always been impressed with Alation's vision. It's a big vision around the role of the human and data, and it's always been impressive. I think the world is spinning in that direction; you're starting to see that now. William, I want to get your thoughts on BigID, because one of the things that's challenging out there, from what we're hearing, is that people want to protect sensitive data, obviously, with the hacks and everything else. And with personal information, there are all kinds of regulation, and believe me, state by state, nation by nation, it's crazy complex. At the same time, they've got to ensure compliance, with tripwires everywhere. So you have this kind of nested, complex web of stuff and some real security concerns. At the same time, you want to make data available for machine learning and things like that. These are the real kinds of things that the problem is twisted around. So if I'm an enterprise, I'm like, "Oh man, this is a pain in the butt." How are you guys seeing this evolve? Because this solution is one step in that direction. What are some of the pain points? What are some of the examples? Can you share any insights around how people are overcoming that? Because they want to get the data out there, they want to create applications that are going to be modern, robust, and augmented, whether it's augmented AI of some sort or some other sort of application, while at the same time protecting the information and staying in compliance. It's a huge challenge. Your thoughts? >> Absolutely. To your point, regulations and compliance measures, both state by state and internationally, are growing. I think of when we saw GDPR four years ago, and the proliferation of other things since, whether in Latin America, in Asia Pacific, or across the United States, potentially even at the federal level in the future. It's not getting easier. To add complexity to that,
every industry and many companies individually have their own policies and their own ways of describing data and what's sensitive to them. Is it patent numbers? Is it loyalty card numbers? It could be any number of different things where that enterprise says that this type of data is particularly sensitive. The way we're approaching this is to say that if we can be a force multiplier for the individuals within an organization who are in charge of the stewardship of their data, whether on the privacy side, the security side, or the data and analytics side, that's what we want to do. And automation is a huge piece of this. So yes, BigID has a number of patents in the machine learning area around data discovery and classification and cluster analysis, being able to find duplicate data out there. When we put that in conjunction with what Alation is doing, we actually give the users of the data a kind of unprecedented ability to curate, deduplicate, and secure sensitive data, all through a policy-driven, automated platform. That's actually, I think, the magic here. We want to make sure that when humans get involved, it's with, how do I say this, minimum human interaction, and when it's done, it's done for a reason of remediation. So they're the second step, not the first step, here. >> I'll get your thoughts. You know, I always riff on the idea of DevOps. It's a cloud term, and when you apply that to data, you talk about programmability, scale, automation. But the humans are making calls, whether you're a programmer in the DevOps world or a data customer of the catalog in Alation. I'm making decisions with my business. I'm a human. I'm taking action at the point of design or whatever. This is where I think the magic can happen. Your thoughts on how this evolves for that use case? Because what you're doing is augmenting the value for the user by taking advantage of these things. Is that right, or am I around the right area? >> Yeah, I think
so. I think one way to think about Alation and that analogy is this. The biggest struggle is the one enterprise business users have, and we target the consumers of data. We're not a provider to the information suppliers, if you will, but to the people who need to make decisions every single day on the right set of data. We're here to empower them to do that with data that they know has been given the thumbs up by people who know about the data, connecting stewards who know the subject matter at hand with the data that the analyst wants to use at the time of consumption. That powerful connection has been so effective for our customers, enabling them to do analytical work that they just couldn't dream of before. So the key piece here is that, in combination with BigID, we can now layer in a privacy-aware consumption angle. If you have a question about running some customer propensity model and you don't know if you can use this data or that data, the BigID data discovery platform informs the Alation catalog of the usage capabilities of that given data set at the moment the analyst wants to conduct his or her analysis, with the appropriate data set as identified and endorsed by the stewards. That point in time is really critical, because that's where we can fundamentally shrink the decision cycle. >> Yeah, it's interesting. So you have the point of attack on the user, in this case the person in the business who's doing some real work. That's where the action is. It's a whole other meaning of actionable data, right? So this seems to be where the value is. It's agility, really. That's kind of what we're talking about here, isn't it? >> It is very agile. On the differentiation between Alation and BigID, in what we're bringing to the market now, we're also bringing flexibility. And as you mentioned, the point of agility there is that we allow our customers to say what their policies are and what their sensitive data is, define that themselves within our platforms, and then go out, find that data, classify it, catalog it, et cetera. That's giving them the extra flexibility that enterprises today need so that they can make business decisions faster and actually operationalize data. >> Guys, great job. Good news. I think this is kind of an interesting canary in the coal mine around the trends that are going on in how data is evolving. What's next? How are you guys going to go to market? The partnership obviously makes a lot of sense: technical integration, business model integration, good fit. What's next for you guys? >> I think the great thing is that, from the CEO down, our organizations are very much aligned in terms of how we want to integrate our two solutions and how we want to go to market. So Will and I have been really focused on making sure that the various constituents within both of our companies have the level of education and knowledge to bring these results to bear, coupled with the integration of our two technologies. Will, your thoughts? >> Yeah, absolutely. I mean, between our CEOs, who have a good cadence, and Kiran and myself, who probably spend too much time on the phone at this point (we might have to get him a guest bedroom or something), alignment is a huge key here: ensuring that we've enabled our field to evangelize this out to the marketplace, and then, whether it's this or our webinars or however we're getting the news out, it's important that the market knows that these capabilities are out there. Because honestly, the biggest obstacle to adoption is not other solutions or build-it-yourself. It's just lack of knowledge that it could be easier, that it could be done better, that you could know your data better, that you could catalog it better. >> Great. Final question to end the segment: a message to the potential customer out there. What about their environment might make them a great prospect for
this solution? Is it a known problem? Is it a blind spot? When would someone know to call you guys up and leverage this partnership? Is it too much data? Is it just too many applications across geographies? I'm just trying to help the folks watching understand when it's an opportunity to call you guys. >> Well, from an Alation perspective, there can never be too much data. A signal that may indicate an interest or a potential fit for us would be the need to be compliant with one or more data privacy regulations. And as Will said, these are coming up left and right. Individual states, in addition to countries, are rolling out data privacy regulations that require a whole set of capabilities to be in place and a very rigorous framework of compliance. Those requirements, and the need to make decisions every single day, all day long, about what data to use, when, and under what conditions, are a perfect set of conditions for the use of a data catalog like Alation coupled with a data discovery and data privacy solution like BigID. >> Well, absolutely. If you're an organization out there and you have a lot of customers, you have a lot of employees, you have a lot of different data sources in disparate locations, whether they're on-prem or in the cloud, these are solid indications that you should look at purchasing best-of-breed solutions like Alation and BigID, as opposed to trying to build something internally. >> Guys, congratulations. Alation is strengthening its privacy capabilities with the BigID partnership. Congratulations on the news, and we'll be tracking it. Thanks for coming on, I appreciate it. >> Thank you. >> Okay, CUBE coverage here in Palo Alto with remote interviews as we get through this COVID crisis. We have our quarantine crew here in Palo Alto. I'm John Furrier. Thanks for watching. (music) Okay, guys.

Published Date : May 13 2020

**Summary and Sentiment Analysis are not shown because of an improper transcript**

ENTITIES

Entity | Category | Confidence
Kieran | PERSON | 0.99+
Karen | PERSON | 0.99+
Kieran William | PERSON | 0.99+
2016 | DATE | 0.99+
Palo Alto | LOCATION | 0.99+
John Fourier | PERSON | 0.99+
Palo Alto | LOCATION | 0.99+
Amazon | ORGANIZATION | 0.99+
William | PERSON | 0.99+
May 2020 | DATE | 0.99+
two solutions | QUANTITY | 0.99+
second piece | QUANTITY | 0.99+
Kiran Narsu | PERSON | 0.99+
two companies | QUANTITY | 0.99+
William Murphy | PERSON | 0.99+
Nimrod Beck | PERSON | 0.99+
United States | LOCATION | 0.99+
Latin America | LOCATION | 0.99+
two technologies | QUANTITY | 0.99+
Demetri Shirota | PERSON | 0.99+
Asia Pacific | LOCATION | 0.99+
first step | QUANTITY | 0.99+
Alation | PERSON | 0.99+
John | PERSON | 0.99+
William Murphy | PERSON | 0.99+
today | DATE | 0.99+
seven years ago | DATE | 0.99+
over 70 customers | QUANTITY | 0.99+
second step | QUANTITY | 0.99+
Boston | LOCATION | 0.99+
four years ago | DATE | 0.99+
LeBron | PERSON | 0.98+
John Ferrier | PERSON | 0.98+
both | QUANTITY | 0.98+
two great guests | QUANTITY | 0.98+
one | QUANTITY | 0.98+
Kendre | ORGANIZATION | 0.97+
200 brands | QUANTITY | 0.96+
single interface | QUANTITY | 0.95+
SharePoint | TITLE | 0.95+
over 70 different types | QUANTITY | 0.94+
one step | QUANTITY | 0.93+
three overarching challenges | QUANTITY | 0.89+
BigID | ORGANIZATION | 0.85+
Big Idea | ORGANIZATION | 0.84+
Big Ideas | ORGANIZATION | 0.81+
Mart | ORGANIZATION | 0.78+
one moment | QUANTITY | 0.77+
gdpr | TITLE | 0.76+
every single day | QUANTITY | 0.74+
big ID | ORGANIZATION | 0.74+
one way | QUANTITY | 0.73+
months | QUANTITY | 0.73+
bunch of data | QUANTITY | 0.72+
every single day | QUANTITY | 0.7+
much | QUANTITY | 0.69+
a lot of stuff | QUANTITY | 0.68+
a lot of tools | QUANTITY | 0.67+
Creighton | LOCATION | 0.66+
DevOps | TITLE | 0.65+
vice | PERSON | 0.58+
nimrod | PERSON | 0.58+
things | QUANTITY | 0.57+
kovin 19 | ORGANIZATION | 0.55+
dimitri | PERSON | 0.52+
every industry | QUANTITY | 0.52+
big | ORGANIZATION | 0.47+
hundred | QUANTITY | 0.46+
big | TITLE | 0.44+
ID | TITLE | 0.35+

Stephanie McReynolds, Alation | CUBEConversation, November 2019


 

>> Announcer: From our studios in the heart of Silicon Valley, Palo Alto, California, this is a CUBE Conversation. >> Hello, and welcome to theCUBE studios in Palo Alto, California, for another CUBE Conversation, where we go in depth with thought leaders driving innovation across the tech industry. I'm your host, Peter Burris. The whole concept of self-service analytics has been with us for decades in the tech industry. Sometimes it's been successful; most times it hasn't been. But we've been making great progress over the last few years as the technology matures, as the software becomes more potent, and, very importantly, as the users of analytics become that much more familiar with what's possible and that much more wanting of what they could be doing. But this notion of self-service analytics requires some new invention, some new innovation. What are they? How's that going to play out? Well, we're going to have a great conversation today with Stephanie McReynolds. She's Senior Vice President of Marketing at Alation. Stephanie, thanks again for being on theCUBE. >> Thanks for inviting me, it's great to be back. >> So, give us an update on Alation. >> So as you know, Alation was one of the first companies to bring a data catalog to the market. And that market category has now been cemented and defined. Depending on the industry analyst you talk to, there could be 40 or 50 vendors now who are providing data catalogs to the market. So this has become one of the hot technologies to include in a modern analytics stack. In particular, we're seeing a lot of demand as companies move from on-premise deployments into the cloud. Not only are they thinking about how do we migrate our systems, our infrastructure, into the cloud, but, with data cataloging, more importantly, how do we migrate our users to the cloud?
How do we get self-service users to understand where to go to find data, how to understand it, how to trust it, and what reuse we can make of existing assets, so we're not just exploding the amount of processing we're doing in the cloud? So that's been very exciting; it's helped us grow our business. We've now seen four straight years of triple-digit revenue growth, which is amazing for a high-growth company like us. >> Sure. >> We also have over 150 different organizations in production with a data catalog as part of their modern analytics stack. And many of those organizations are moving into the thousands of users. So eBay was probably our first customer to move to over a thousand weekly logins; they're now up to about 4,000 weekly logins through Alation. But now we have customers like Boeing and General Electric and Pfizer, and we just closed a deal with the US Air Force. So we're starting to see all sorts of different industries and all sorts of different users, from the analytics specialist in your organization, like a data scientist or a data engineer, all the way out to maybe a product manager or someone who doesn't really think of themselves as an analytics expert, using Alation either directly or sometimes through one of our partnerships with folks like Tableau or MicroStrategy or Power BI. >> So, if we think about this notion of self-service analytics, Stephanie, and again, Alation has been a leader in defining this overall category, we think in terms of an individual who has some need for data but, most importantly, has questions they think data can answer, and now they're out looking for data. Take us through that process. They need to know where the data is, they need to know what it is, they need to know how to use it, and they need to know what to do if they make a mistake. How are data catalogs like Alation serving that, and what's new?
>> Yeah, so as consumers, this world of data cataloging is very similar to what we saw if you go back to the introduction of the internet. >> Sure. >> How did you find a webpage in the '90s? Pretty difficult. You had to know the exact URL to go to, in most cases, to find a webpage. And then Yahoo was introduced, and Yahoo did a whole bunch of manual curation of those pages so that you could search for a page and find it. >> So Yahoo was like a big catalog. >> It was like a big catalog, an inventory of what was out there. So the original data catalogs, you could argue, were what we would call, from a technical perspective, a metadata repository. No business user wants to use a metadata repository, but it created an inventory of all the data assets that we have in the organization and the description of those data assets: the metadata. So metadata repositories were kind of the original catalogs. The big breakthrough for data catalogs was: how do we become the Google of finding data in the organization? So rather than manually curating everything that's out there and providing an end-user interface with an answer, how could we use machine learning and AI to look at patterns of usage (what people are clicking on, in terms of data assets) and surface those as data recommendations to any end user, whether they're an analytics specialist or just a self-service analytics user? And so that has been the real breakthrough of this new category called data cataloging.
And so most folks are accessing a data catalog through a search interface or maybe they're writing a SQL query and there's SQL recommendations that are being provided by the catalog-- >> Or using a tool that utilizes SQL >> Or using a tool that utilizes SQL, and for most people in a- most employees in a large enterprise when you get those thousands of users, they're using some other tool like Tableau or MicroStrategy or, you know, a variety of different data visualization providers or data science tools to actually access that data. So a big part of our strategy at Alation has been, how do we surface this data recommendation engine in those third party products? And then if you think about it, once you're surfacing that information and providing some value to those end users, the next thing you want to do is make sure that they're using that data accurately. And that's a non-trivial problem to solve, because analytics and data is complicated. >> Right >> And metadata is extremely complicated-- >> And metadata is-- because often it's written in a language that's arcane and done to be precise from a data standpoint, that's not easily consumable or easily accessible by your average human being. >> Right, so a label, for example, on a table in a database might be cust_seg_257, what does that mean? >> It means we can process it really quickly in the system. >> Yeah, but as-- >> But it's useless to a human being-- >> As a marketing manager, right?
I'm like, hey, I want to do some customer segmentation analysis and I want to find out if people who live in California might behave differently if I provide them an offer than people that live in Massachusetts. It's not intuitive to say, oh yeah, that's in customer_seg_. So what data catalogs are doing is they're thinking about that marketing manager, they're thinking about that pure business user and helping make that translation between business terminology, "Hey I want to run some customer segmentation analysis for the West" with the technical, physical model that underlies the data in that database, which is: customer_seg_257 is the table you need to access to get the answer to that question. So as organizations start to adopt more self-service analytics, it's important that we're managing not just the data itself and this translation from technical metadata to business metadata, but there's another layer that's becoming even more important as organizations embrace self-service analytics. And that's how is this data actually being processed? What is the logic that is being used to traverse different data sets that end users now have access to? So if I take gender information in one table and I have information on income on another table, and I have some private information that identifies those two customers as the same in those two tables, in some use cases I can join that data. If I'm doing marketing campaigns, I likely can join that data. >> Sure. >> If I'm running a loan approval process here in the United States, I cannot join that data. >> That's a legal limitation, that's not a technical issue-- >> That's a legal, federal, government issue. Right? And so here's where there's a discussion, in folks that are knowledgeable about data and data management, there's a discussion of how do we govern this data?
But I think by saying how we govern this data, we're kind of covering up what's actually going on, because you don't have to govern that data so much as you have to govern the analysis. How is this joined, how are we combining these two data sets? If I just govern the data for accuracy, I might not know the usage scenario, which is someone wants to combine these two things, which makes it illegal. Separately, it's fine; combined, it's illegal. So now we need to think about, how do we govern the analytics themselves, the logic that is being used. And that gets kind of complicated, right? For a marketing manager to understand the difference between those things, on the surface it doesn't really make sense. It only makes sense when the context of that government regulation is shared and explained, and in the course of your workflow and dragging and dropping in a Tableau report, you might not remember that, right? >> That's right, and the derivative output that you create that other people might then be able to use because it's back in the data catalog, doesn't explicitly note, often, that this data was generated as a combination of a join that might not be in compliance with any number of different rules. >> Right, so about a year and a half ago, we introduced a new feature in our data catalog called Trust Check. >> Yeah, I really like this. This is a really interesting thing. >> And that was meant to be a way where we could alert end users to these issues - hey, you're trying to run the same analytic and that's not allowed. We're going to give you a warning, we're not going to let you run that query, we're going to stop you in your place. So that was a way in the workflow of someone while they're typing a SQL statement or while they're dragging and dropping in Tableau to surface that up. Now, some of the vendors we work with, like Tableau, have doubled down on this concept of how do they integrate with an enterprise data catalog to make this even easier.
So at Tableau Conference last week, they introduced a new metadata API, they introduced a Tableau catalog, and the opportunity for these types of alerts to be pushed into the Tableau catalog as well as directly into reports and worksheets and dashboards that end users are using. >> Let me make sure I got this. So it means that you can put a lot of the compliance rules inside Alation and have a metadata API so that Alation effectively is governing the utilization of data inside the Tableau catalog. >> That's right. So think about the integration with Tableau as this communication mechanism to surface up these policies that are stored centrally in your data catalog. And so this is important, this notion of a central place of reference. We used to talk about data catalogs just as a central place of reference for where all your data assets lie in the organizations, and we have some automated ways to crawl those sources and create a centralized inventory. What we've added in our new release, which is coming out here shortly, is the ability to centralize all your policies in that catalog as well as the pointers to your data in that catalog. So you have a single source of reference for how this data needs to be governed, as well as a single source of reference for how this data is used in the organization. >> So does that mean, ultimately, that someone could try to do something, Trust Check would say, no you can't, but this new capability will say, and here's why or here's what you do? >> Exactly. >> A descriptive step that says let me explain why you can't do it. >> That's right. Let me not just stop your query and tell you no, let me give you the details as to why this query isn't a good query and what you might be able to do to modify that query should you still want to run it. And so all of that context is available for any end user to be able to become more aware of what the system is doing, and why it is recommending it.
And on the flip side, in the world before we had something like Trust Check, the only opportunity for an IT Team to stop those queries was just to stop them without explanation or to try to publish manuals and ask people to run tests, like the DMV, so that they memorized all those rules of governance. >> Yeah, self-service, but if there's a problem you have to call us. >> That's right. That's right. So what we're trying to do is trying to make the work of those governance teams, those IT Teams, much easier by scaling them. Because we all know the volume of data that's being created, the volume of analysis that's being created is far greater than any individual can come up with, so we're trying to scale those precious data expert resources-- >> Digitize them-- >> Yeah, exactly. >> It's a digital transformation of how we acquire data necessary-- >> And then-- >> for data transformation. >> make it super transparent for the end user as to why they're being told yes or no so that we remove this friction that's existed between business and IT when trying to perform analytics. >> But I want to build a little bit on one of the things I thought I heard you say, and that is the idea that this new feature, this new capability will actually prescribe an alternative, logical way for you to get your information that might be in compliance. Have I got that right? >> Yeah, that's right. Because what we also have in the catalog is a workflow that allows individuals called Stewards, analytics Stewards, to be able to make recommendations and certifications. So if there's a policy that says thou shalt not use the data in this way, the Stewards can then say, but here's an alternative mechanism, here's an alternative method, and by the way, not only are we making this as a recommendation but this is certified for success. We know that our best analysts have already tried this out, or we know that this complies with government regulation.
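(Editor's illustration. The idea of governing the analysis rather than the data, with an explanation and a steward-suggested alternative attached to each refusal, can be sketched as below. This is a hedged sketch only: `JoinPolicy`, `check_join`, the purpose labels, and the example rule are all hypothetical and do not reflect Trust Check's actual design or any real legal requirement beyond the gender and income example the speaker gives.)

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinPolicy:
    """A rule forbidding a combination of columns for a given purpose (hypothetical)."""
    columns: frozenset   # columns that may not be combined
    purpose: str         # usage scenario the rule applies to
    reason: str          # explanation shown to the end user
    alternative: str     # steward-recommended way to proceed


def check_join(columns_used, purpose, policies):
    """Return (allowed, messages) for an analysis touching columns_used.

    A policy fires only when the purpose matches AND every restricted column
    is present, so each column on its own stays usable.
    """
    used = set(columns_used)
    messages = [
        f"Blocked: {p.reason} Suggested alternative: {p.alternative}"
        for p in policies
        if p.purpose == purpose and p.columns <= used
    ]
    return (not messages, messages)


if __name__ == "__main__":
    policies = [
        JoinPolicy(
            columns=frozenset({"gender", "income"}),
            purpose="loan_approval",
            reason="gender and income may not be combined for loan decisions.",
            alternative="use the steward-certified income-only data set.",
        )
    ]
    for purpose in ("marketing", "loan_approval"):
        allowed, notes = check_join(["customer_id", "gender", "income"], purpose, policies)
        print(purpose, allowed, notes)
```

Keying the rule on the purpose as well as the columns is what captures "separately, it's fine; combined, it's illegal": the same join passes for a marketing campaign and is refused, with a reason, for loan approval.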
And so this is a more active way, then, for the two parties to collaborate together in a distributed way, that's asynchronous, and so it's easy for everyone no matter what hour of the day they're working or where they're globally located. And it helps progress analytics throughout the organization. >> Oh and more importantly, it increases the likelihood that someone who is told you now have self-service capability doesn't find themselves abandoning it the first time that somebody says no, because we've seen that over and over with a lot of these query tools, right? That somebody says, oh wow, look at this new capability until the screen, you know, metaphorically, goes dark. >> Right, until it becomes too complicated-- >> That's right-- >> and then you're like, oh I guess I wasn't really trained on this. >> And then they walk away. And it doesn't get adopted. >> Right. >> And this is a way, it's a very human-centered way to bring that self-service analyst into the system and be a full participant in how you generate value out of it. >> And help them along. So you know, the ultimate goal that we have as an organization is to help organizations, our customers, become data literate populations. And you can only become data literate if you get comfortable working with the data and it's not a black box to you. So the more transparency that we can create through our policy center, through documenting the data for end users, and making it more easy for them to access, the better.
And so, in the next version of the Alation product, not only have we implemented features for analytic Stewards to use, to certify these different assets, to log their policies, to ensure that they can document those policies fully with examples and use cases, but we're also bringing to market a professional services offering from our own team that says look, given that we've now worked with about 20% of our installed base, and observed how they roll out Stewardship initiatives and how they assign Stewards and how they manage this process, and how they manage incentives, we've done a lot of thinking about what are some of the best practices for having a strong analytics Stewardship practice if you're a self-service analytics oriented organization. And so our professional services team is now available to help organizations roll out this type of initiative, make it successful, and have that be supported with product. So the psychological incentives of how you get one of these programs really healthy is important. >> Look, you guys have always been very focused on ensuring that your customers were able to adopt the value proposition, not just buy the value proposition. >> Right. >> Stephanie McReynolds, Senior Vice President of Marketing at Alation, once again, thanks for being on theCUBE. >> Thanks for having me. >> And thank you for joining us for another CUBE conversation. I'm Peter Burris. See you next time.

Published Date : Dec 10 2019


Aaron Kalb, Alation | MIT CDOIQ 2019


 

>> From Cambridge, Massachusetts, it's theCUBE covering MIT Chief Data Officer and Information Quality Symposium 2019, brought to you by SiliconANGLE Media. (dramatic music) >> Welcome back to Cambridge, Massachusetts, everybody. This is theCUBE, the leader in live tech coverage. We go out to the events, and we extract the signal from the noise. And, we're here at the MIT CDOIQ, the Chief Data Officer conference. I'm Dave Vellante with my cohost Paul Gillin. Day two of our wall to wall coverage. Aaron Kalb is here. He's the cofounder and chief data officer of Alation. Aaron, thanks for making the time to come on. >> Thanks so much Dave and Paul for having me. >> You're welcome. So, words matter, you know, and we've been talking about data, and big data, and the three Vs, and data is the new oil, and all this stuff. You gave a talk this week about, you know, "We're maybe not talking the right language "when it comes to data." What did you mean by all that? >> Absolutely, so I get a little bit frustrated by some of these clichés we hear at conference after conference, and the one I, sort of, took aim at in this talk is, data is the new oil. I think what people want to invoke with that is to say, in the same way that oil powered the industrial age, data's powering the information age. Just saying, data's really cool and trendy and important. That's true, but there are a lot of other associations and contexts that people have with data, I'm sorry, with oil. And, some of them don't really apply as well to data. >> So, is data more valuable than oil? >> Well, I think they're each valuable in different ways, but I think there's a couple issues with the metaphor. One is that oil is scarce and dwindling, and part of its value comes from the fact that it's so rare. Whereas, the experience with data is that it's so plentiful and abundant, we're almost drowning in it.
And so, what I contend is, instead of talking about data as compared to oil, we should talk about data compared to water. And, the idea is, you know, water is very plentiful on the planet, but sometimes, you know, if you have saltwater or contaminated water, you can't drink it. Water is good for different purposes, depending on its form, and so it's all about getting the right data for the right purpose, like water. >> Well, we've certainly, at least in my opinion, fought wars, Paul, over oil. >> And, over water. >> And, certainly, conflicts over water. Do you think we'll be fighting wars over data? Or, are we already? >> No, we might be. One of my favorite talks from the sessions here was a keynote by the CDO for the Department of Defense, who was talking about, you know, the civic duty about transparency but was observing that, actually, more IP addresses from China and Russia are looking at our public datasets than from within the country. So, you know, it's definitely a resource that can be very powerful. >> So, what was the reaction to your premise from the audience? What kind of questions did you get? >> You know, people actually responded very favorably, including some folks from the oil and gas industry, which I was pleased to find. We have a lot of customers in energy, so that was cool. But, it was nice being here at MIT and just really geeking out about language and linguistics and data with a bunch of CDOs and other people who are, kind of, data intellectuals. >> Right, so if data is not the new oil. >> And, water isn't really a good analogy either, because the supply of water is finite. >> That's true. >> So, what is data? >> Yeah. >> Space? >> Yeah, it's a good point. >> Matter? >> Maybe it is like the universe in that it's always expanding, right, somehow. Right, because anything physical which is on the planet probably won't be growing at that exponential speed. >> So, give us the punchline.
>> Well, so I contend that water, while imperfect, is, actually, a really good metaphor that helps for a lot of things. It has properties like the fact that if it's a data quality issue, it flows downstream like pollution in a river. It's the fact that it can come in different forms, useful for different purposes. You might have gray water, right, which is good enough for, you know, irrigation or industrial purposes, but not safe to drink. And so, you rely on metadata to get the data that's in the right form. And, you know, the talk is more fun because you've a lot of visual examples that make this clear. >> Yeah, of course, yeah. >> I actually had one person in the audience say that he used a similar analogy in his own company, so it's fun to trade notes. >> So, chief data officer is a relatively new title for you, is it not? In terms of your role at Alation. >> Yeah, that's right, and the most fun thing about my job is being able to interact with all of the other CDOs and CDAOs at a conference like this. And, it was cool to see. I believe this conference doubled since the last year. Is that right? >> No. >> No, it's up about a hundred, though. >> Right. >> Well. >> And, it's about double from three years ago. >> And, when we first started, in 2013, yeah. >> 130 people, yeah. >> Yeah, it was a very small and intimate event. >> Yeah, here we're outgrowing this building, it seems. >> Yeah, they're kicking us out. >> I think what's interesting is, you know, if we do a little bit of analysis, this is a small data, within our own company, you know, our biggest and most visionary customers typically bought Alation. The buyer champion either was a CDO or they weren't a CDO when they bought the software and have since been promoted to be a CDO. And so, seeing this trend of more and more CDOs cropping up is really exciting for us. And also, just hearing all of the people at the conference saying, two trends we're hearing. 
A move from, sort of, infrastructure and technology to driving business value, and a move from defense and governance to, sort of, playing offense and doing revenue generation with data. Both of those trends are really exciting for us. >> So, don't hate me for asking this question, because what a lot of companies will do is, they'll give somebody a CDO title, and it's, kind of, a little bit of gimmick, right, to go to market. And, they'll drag you into sales, because I'm sure they do, as a cofounder. But, as well, I know CDOs at tech companies that are actually trying to apply new techniques, figure out how data contributes to their business, how they can cut costs, raise revenue. Do you have an internal role, as well? >> Absolutely, yeah. >> Explain that. >> So, Alation, you know, we're about 250 people, so we're not at the same scale as many of the attendees here. But, we want to learn, you know, from the best, and always apply everything that we learn internally as well. So, obviously, analytics, data science is a huge role in our internal operations. >> And so, what kinds of initiatives are you driving internally? Is it, sort of, cost initiatives, efficiency, innovation? >> Yeah, I think it's all of the above, right. Every single division and both in the, sort of, operational efficiency and cost cutting side as well as figuring out the next big bet to make, can be informed by data. And, our goal was to empower a curious and rational world, and our every decision be based not on the highest paid person's opinion, but on the best evidence possible. And so, you know, the goal of my function is largely to enable that both centrally and within each business unit. >> I want to talk to you about data catalogs a bit because it's a topic close to my heart. I've talked to a lot of data catalog companies over the last couple years, and it seems like, for one thing, the market's very crowded right now. It seems to me. Would you agree there are a lot of options out there? 
>> Yeah, you know, it's been interesting because when we started it, we were basically the first company to make this technology and to, kind of, use this term, data catalog, in this way. And, it's been validating to see, you know, a lot of big players and other startups even, kind of, coming to that terminology. But, yeah, it has gotten more crowded, and I think our customers who, or our prospects, used to ask us, you know, "What is it that you do? "Explain this catalog metaphor to me," are now saying, "Yeah, catalogs, heard about that." >> It doesn't need to be defined anymore. >> "Which one should I pick? "Why you?" Yeah. >> What distinguished one product from another, you know? What are the major differentiation points? >> Yeah, I think one thing that's interesting is, you know, my talk was about how the metaphors we use shape the way we think. And, I think there's a sense in which, kind of, the history of each company shapes their philosophy and their approach, so we've always been a data catalog company. That's our one product. Some of the other catalog vendors come from ETL background, so they're a lot more focused on technical metadata and infrastructure. Some of the catalog products grew out of governance, and so it's, sort of, governance first, no sorry, defense first and then offense secondary. So, I think that's one of the things, I think, we encourage our prospects to look at, is, kind of, the soul of the company and how that affects their decisions. The other thing is, of course, technology. And, what we at Alation are really excited about, and it's been validating to hear Gartner and others and a lot of the people here, like the GSK keynote speaker yesterday, talking about the importance of comprehensiveness and on taking a behavioral approach, right. We have our Behavioral IO technology that really says, "Let's not look at all the bits and the bytes, "but how are people using the data to drive results?" As our core differentiator. 
>> Do your customers generally standardize on one data catalog, or might they have multiple catalogs for multiple purposes? >> Yeah, you know, we heard a term more last season, of catalog of catalogs, you know. And, people here can get arbitrarily, you know, meta, meta, meta data, where we like to go there. I think the customers we see most successful tend to have one catalog that serves this function of the single source of reference. Many of our customers will say, you know, that their catalog serves as, sort of, their internal Google for data. Or, the one stop shop where you could find everything. Even though they may have many different sources, typically you don't want to have siloed catalogs. It makes it harder to find what you're looking for. >> Let's play a little word association with some metaphors. Data lake. (laughter) >> Data lake's another one that I sort of hate. If you think about it, people had data warehouses and didn't love them, but at least, when you put something into a warehouse, you can get it out, right. If you throw something into a lake, you know, there's really no hope you're ever going to find it. It's probably not going to be in great shape, and we're not surprised to find that many folks who invested heavily in data lakes are now having to invest in a layer over it, to make it comprehensible and searchable. >> So, yeah, the lake is where we hide the stolen cars. Data swamp. >> Yeah, I mean, I think if your point is it's worse than lake, it works. But, I think we can do better than a lake, right. >> How about data ocean? (laughter) >> You know, out of respect for John Furrier, I'll say it's fantastic. But, to us we think, you know, it isn't really about the size. The more data you have, people think the more data the better. It's actually the more data the worse unless you have a mechanism for finding the little bit of data that is relevant and useful for your task and put it to use. >> And to, want to set up, enter the catalog.
So, technically, how does the catalog solve that problem? >> Totally, so if we think about, maybe let's go to the warehouse, for example. But, it works just as well on a data lake in practice. >> Yeah, cool. >> Through the catalog is. It starts with the inventory, you know, what's on every single shelf. But, if you think about what Amazon has done, they have the inventory warehouse in the back, but what you see as a consumer is a simple search interface, where you type in the word of the product you're looking for. And then, you see ranked suggestions for different items, you know, toasters, lamps, whatever, books I want to buy. Same thing for data. I can type in, you know, if I'm at the DOD, you know, information about aircraft, or information about, you know, drug discovery if I'm at GSK. And, I should be able to therefore see all of the different data sets that I have. And, that's true in almost any catalog, that you can do some search over the curated data sets there. With Alation in particular, what I can see is, who's using it, how are they using it, what are they joining it with, what results do they find in that process. And, that can really accelerate the pace of discovery. >> Go ahead. >> I'm sorry, Dave. To what degree can you automate some of that detail, like who's using it and what it's being used for? I mean, doesn't that rely on people curating the catalog? Or, to what degree can you automate that? >> Yeah, so it's a great question. I think, sometimes, there's a sense with AI or ML that it's like the computer is making the decisions or making things up. Which is, obviously, very scary. Usually, the training data comes from humans. So, our goal is to learn from humans in two ways. There's learning from humans where humans explicitly teach you. Somebody goes and says, "This is gold standard data versus this is, "you know, low quality data." And, they do that manually. But, there's also learning implicitly from people.
So, in the same way on amazon.com, if I buy one item and then buy another, I'm doing that for my own purposes, but Amazon can do collaborative filtering over all of these trends and say, "You might want to buy this item." We can do a similar thing where we parse the query logs, parse the usage logs and BI tools, and can basically watch what people are doing for their own purposes. Not, you know, extra work on top of their job to help us. We can learn from that and make everybody more effective. >> Aaron, is data classification a part of all this? Again, when we started in the industry, data classification was a manual exercise. It's always been a challenge. Certainly, people have applied math to it. You've seen support vector machines and probabilistic latent semantic indexing being used to classify data. Have we solved that problem, as an industry? Can you automate the classification of data on creation or use at this point in time? >> Well, one thing that came up in a few talks about AI and ML here is, regardless of the algorithm you're using, whether it's, you know, IFH or SVM, or something really modern and exciting that keeps learning. >> Stuff that's been around forever or, it's like you say, some new stuff, right. >> Yeah, you know, actually, I think it was said best by Michael Collins at the DOD, that data is more important than the algorithm because even the best algorithm is useless without really good training data. Plus, the algorithm's, kind of, everyone's got them. So, really often, training data is the limiting reactant in getting really good classification. One thing we try to do at Alation is create an upward spiral where maybe some data is curated manually, and then we can use that as a seed to make some suggestions about how to label other data. And then, it's easier to just do a confirm or deny of a guess than to actually manually label everything.
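(Editor's illustration. Learning implicitly from query logs, as described above, can be sketched with a simple co-occurrence count: tables that analysts repeatedly query together become suggestions for each other. This is a rough sketch under assumptions, not Alation's method. `co_usage` and `suggest` are hypothetical names, and the regular expression only catches plain `FROM x` and `JOIN y` references, so it stands in for a real SQL parser.)

```python
import re
from collections import Counter
from itertools import combinations

# Naive pattern: the identifier that follows FROM or JOIN (assumption: no
# subqueries, quoted names, or comma-separated table lists).
TABLE_RE = re.compile(r"\b(?:from|join)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def co_usage(query_log):
    """Count how often each pair of tables appears in the same query."""
    pairs = Counter()
    for sql in query_log:
        tables = sorted({name.lower() for name in TABLE_RE.findall(sql)})
        pairs.update(combinations(tables, 2))
    return pairs


def suggest(table, pairs, k=3):
    """Return up to k tables most often queried together with `table`."""
    scores = Counter()
    for (a, b), count in pairs.items():
        if a == table:
            scores[b] += count
        elif b == table:
            scores[a] += count
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


if __name__ == "__main__":
    log = [
        "SELECT * FROM orders JOIN customers ON orders.cid = customers.id",
        "select * from orders o join customers c on o.cid = c.id "
        "join regions r on c.rid = r.id",
        "SELECT * FROM orders JOIN products ON orders.pid = products.id",
    ]
    print(suggest("orders", co_usage(log)))
```

Nobody in the log did extra work to produce these suggestions; they fall out of queries people ran for their own purposes, which is the distinction the speaker draws between explicit and implicit teaching.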
So, then you get more training, get it faster, and it kind of accelerates that way instead of being a big burden. >> So, that's really the advancement in the last five to what, five, six years. Where you're able to use machine intelligence to, sort of, solve that problem as opposed to brute forcing it with some algorithm. Is that fair? >> Yeah, I think that's right, and I think what gets me very excited is when you can have these interactive loops where the human helps the computer, which helps the human. You get, again, this upward spiral. Instead of saying, "We have to have all of this, "you know, manual step done "before we even do the first step," or trying to have an algorithm brute force it without any human intervention. >> It's kind of like no schema on write, except it actually works. I'm just kidding to all my Hadoop friends. All right, Aaron, hey. Thanks very much for coming on theCUBE, but give your last word on the event. I think, is this your first one or no? >> This is our first time here. >> Yeah, okay. So, what are your thoughts? >> I think we'll be back. It's just so exciting to get people who are thinking really big about data but are also practitioners who are solving real business problems. And, just the exchange of ideas and best practices has been really inspiring for me. >> Yeah, that's great. >> Yeah. >> Well, thank you for the support of the event, and thanks for coming on theCUBE. It was great to see you again. >> Thanks Dave, thanks Paul. >> All right, you're welcome. >> Thank you, sir. >> All right, keep it right there, everybody. We'll be back with our next guest right after this short break. You're watching theCUBE from MIT CDOIQ. Be right back. (upbeat music)

Published Date : Aug 1 2019

ENTITIES

Entity | Category | Confidence
Michael Collins | PERSON | 0.99+
Paul Gillin | PERSON | 0.99+
Paul | PERSON | 0.99+
Amazon | ORGANIZATION | 0.99+
Dave | PERSON | 0.99+
2013 | DATE | 0.99+
Aaron Kalb | PERSON | 0.99+
Dave Vellante | PERSON | 0.99+
Aaron | PERSON | 0.99+
five | QUANTITY | 0.99+
Department of Defense | ORGANIZATION | 0.99+
six years | QUANTITY | 0.99+
John Furrier | PERSON | 0.99+
amazon.com | ORGANIZATION | 0.99+
yesterday | DATE | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
Alation | PERSON | 0.99+
Alation | ORGANIZATION | 0.99+
Gartner | ORGANIZATION | 0.99+
one item | QUANTITY | 0.99+
Cambridge, Massachusetts | LOCATION | 0.99+
first step | QUANTITY | 0.99+
last year | DATE | 0.99+
GSK | ORGANIZATION | 0.99+
both | QUANTITY | 0.99+
DOD | ORGANIZATION | 0.99+
one person | QUANTITY | 0.99+
Google | ORGANIZATION | 0.99+
130 people | QUANTITY | 0.98+
One | QUANTITY | 0.98+
first time | QUANTITY | 0.98+
MIT | ORGANIZATION | 0.98+
one product | QUANTITY | 0.97+
three years ago | DATE | 0.97+
this week | DATE | 0.97+
two | QUANTITY | 0.97+
MIT CDOIQ | ORGANIZATION | 0.96+
MIT Chief Data Officer and | EVENT | 0.96+
one data catalog | QUANTITY | 0.96+
each | QUANTITY | 0.96+
each company | QUANTITY | 0.95+
Both | QUANTITY | 0.95+
one thing | QUANTITY | 0.95+
first one | QUANTITY | 0.94+
one catalog | QUANTITY | 0.93+
two trends | QUANTITY | 0.93+
theCUBE | ORGANIZATION | 0.93+
first | QUANTITY | 0.92+
first company | QUANTITY | 0.92+
last couple years | DATE | 0.92+
CDO | ORGANIZATION | 0.91+
about a hundred | QUANTITY | 0.91+
single shelf | QUANTITY | 0.88+
about 250 people | QUANTITY | 0.88+
single source | QUANTITY | 0.87+
China | LOCATION | 0.87+
2019 | DATE | 0.86+
Day two | QUANTITY | 0.86+
one | QUANTITY | 0.85+
each business unit | QUANTITY | 0.82+
MIT CDOIQ | EVENT | 0.79+
ADP | ORGANIZATION | 0.79+
couple issues | QUANTITY | 0.76+
Information Quality Symposium 2019 | EVENT | 0.76+
One thing | QUANTITY | 0.7+
single division | QUANTITY | 0.69+
one stop | QUANTITY | 0.68+
Russia | LOCATION | 0.64+
three | QUANTITY | 0.61+
double | QUANTITY | 0.59+
favorite | QUANTITY | 0.5+
CDOIQ | EVENT | 0.46+
Chief | PERSON | 0.42+

Aaron Kalb, Alation | CUBE Conversation, April 2019


 

From our studios in the heart of Silicon Valley, Palo Alto, California, this is a CUBE Conversation. >> Hi, welcome to theCUBE studio for another CUBE Conversation, where we go in-depth with thought leaders creating new business outcomes with technology. I'm your host, Peter Burris. One of the biggest challenges that enterprises face as they try to get more value out of their data is how to establish the practices, how to establish the processes and the tooling necessary to discover data, liberate data, and communicate data and its value to the organization. To have that conversation, we've got a great guest, Aaron Kalb, who's a co-founder and Chief Data Officer of Alation. Aaron, welcome to the conversation. >> Peter, thank you for having me. >> So give us a quick update. What's going on with Alation? >> Things at Alation are very, very exciting. So, you know, from the very beginning we had one main goal: to create technology that empowers people to be more curious and make better choices, to help them find relevant data, understand it, trust it, use it, and reuse it within the organization. And we're just very happy to keep getting more and more customers and make a broader and broader impact through them. >> Now, showing how unbelievably attentive I am, I noticed that you are now Chief Data Officer, so your title's changed. What does that entail? What's going on? >> Yeah, it's a pretty recent change, and a moment we're very excited about. I think from the very beginning, you know, we've been preaching the data-driven organization, but we haven't been able to practice what we've preached as much, since we've been comparatively so much smaller than our customers. What's exciting now is that we've collected enough data, and a wide enough network of customers, that there's an opportunity to really be more data-driven internally, and also to have all of these chief data officers and other data people, data nerds like me, in our network, to synthesize the learnings across all of them about how to build a data culture, and kind of take that in and share it back out so we can all go through this journey together.
>> So, I'm going to tell you, I have been a chief data officer skeptic, and I'll explain why. But if I could just summarize what you just said, there's going to be an operational part of your job internally, but also an advocacy part of your job externally, to help catalyze some of your conversations. But let me tell you why I've been something of a skeptic, and you tell me how things are going to change, you know, what vector we're on, so to speak. CDO as a job often was something that people went for because of digital change, or a new media change, and new types of marketing. It's been a job that's been all over the map. It's had different definitions, different roles, different sets of responsibilities. When I think of any chief, I say you give the chief title to someone who's going to generate, you know, superior returns on the assets entrusted to them. So what that means to me is that the chief data officer should be someone who's going to create competitive or superior returns on the data assets that have been entrusted to them. Is that kind of how you see it too? >> That's exactly right. And this is a term and a title that we're borrowing from our customers, who've been very, very successful with it. And the goal is exactly that: first of all, to protect the data and ensure that it's being used appropriately and is well governed. That's the defense. But then going on offense and ensuring that all that data is actually driving business value and business impact. That's the fundamental role of the position. The only thing I would maybe amend in what you said is, as chief, my management style is really just about empowering everybody in the organization, within the division and across the company, to build data-driven impact. >> Well, it's a leadership job. >> Exactly. >> Yeah. So, a chief is, you know, you're supposed to use the resources at your disposal to generate returns out of those resources, and obviously it's a leadership job.
But let's walk through that a little bit, not so much focusing on how Alation's CDO is going to operate, but let's talk about your customers. Because one of the observations I'd make is that Alation now has a large enough footprint and presence in the industry where you now have significant numbers of customers, and I'm sure you're seeing the variety of insights and practices that customers are using to get value out of data. So I've got to believe that partly this is discovering those new practices, those new procedures, turning that into a pedagogy, something that folks can actually use to improve the way they do things, and then helping Alation build or participate in the tool chain necessary to actually establish those disciplines. How far off am I? >> You are spot on. So, as you said, we have over a hundred production customers now, well over that, and they are all different in different ways, depending on their geography and their vertical. But there are many commonalities we see, and our goal is to basically learn from all of them, synthesize those learnings, and then push them back out to our network and also apply them internally. Sometimes applying that means making changes to our software, and sometimes it means just sharing best practices and thought leadership within our network and beyond. So, to give a very particular example: one thing that we'd thought about a little bit, but that we really learned from our customers, was the power that competition and kind of game theory can play in helping people be successful in their data initiatives. >> So, gamification? >> Gamification, exactly, yes. So we saw, for example, some of our customers did what they called data duels, or metadata duels, where different departments would compete to document their data more thoroughly for accurate outcomes, and they would get cakes that had, you know, "metadata" on them. It's got to be the first time we'd seen the word metadata printed on a cake, probably in the history of baking. Meanwhile, a different customer, in a different region and a different vertical, came up with a docu-jam, which is taking the idea of a hackathon, and that's a little bit less competitive and more collaborative: people kind of shoulder-to-shoulder doing data documentation. It's a very similar thing, of using human psychology to better drive forward data projects, that we saw in two different places. And we thought, okay, how can we abstract out a principle from this? And we're looking both at integrating some of these principles directly into our product and also at sharing other ways that different customers could benefit from the basic concept.
>> All right, so where are we? You've got a hundred-plus customers now. You're an acknowledged leader in the catalog world. We generally believe that catalogs are going to be an important feature of virtually every successful data-driven digital business, because it's going to be one of the places where you actually order data and other assets derived from that data, models and whatnot. So where are we, as this new CDO, where are we in the adoption of what you today would regard as the best practices? How is that happening in the industry? We have a skills gap. Are we starting to see that be closed a bit as more companies start to gain the experience they need to be successful in this? >> Yeah, you know, it's funny. There's sort of a learning curve with any new technology or any principle, and we see customers and prospects all along that curve, and we've started kind of mapping out the shape. Just to give a sense of the different extremes: a few years ago, what everybody was talking about, people would say, "I'm a data person, and there are people in my company who just don't get it," who see the data and, instead of being appropriately skeptical and saying, "I'm not sure how this was sourced," they'll just say, "Ah, data schmata. Here's how I used to do it at my old job, and we're just going to do it that way because it's how we've always done it." And, you know, there was that sort of defensiveness or resistance to data. Now we're seeing some customers who have jumped way past that. I was talking to a data scientist at one of our customers who said, basically, they have a recommendation engine in their enterprise, and people who years ago might have been completely ignoring it are now just blindly doing whatever it says, and she was saying-- >> That has its own set of implications. >> It does. And she said, "Look, you know, as a data scientist, I know how the sausage is made for this engine. I wouldn't want to eat that sausage. It worries me that people are just putting it in their mouths," so to speak, in this elaborate metaphor. And so I think, you know, the pendulum can swing back and forth. What we're trying to work on with our customers is, how do you teach individuals to engage in that data culture, to be skeptical in the right ways, not defensive, but to ask, "Where did this come from? How is it computed?" You know, the questions that can actually help you interpret it correctly and put it to use, and not go to the other extreme of, you know, basically deferring to the algorithm entirely and taking out all human judgment.
>> Well, and I think that's the important thing: any system is a combination of machines doing things and human beings doing things. We'll take out those animal-driven systems from many years ago. Machines doing things and people doing things. And when you use machines to do things, the tech industry has been really good at diffusing knowledge very, very quickly, so over time it's difficult to have your machinery be the source of differentiation. So over time, humans will consistently be the source of differentiation in your business: how they render judgment, what they determine the priorities to be, the commitments that they make, and sustaining and keeping those commitments. So the catalog, to me, seems to be an especially important feature of any digital transformation or data-driven process going forward, because it touches people and because people use it. It will also touch other systems and other elements, but people remain essential to catalog design and the notion of catalog experience. Are you seeing that as well? And is that helping you to stay close to these CDOs and, you know, really driving the people-oriented process, or knowledge about people-oriented processes? >> You know, absolutely. Throughout our time at Alation over the last, you know, seven years, we've always seen people as extremely central. And I think one of our key differentiators, philosophically, was that where a lot of data management was sort of thinking about what's good for the computer, "Oh, we can save a couple bits by using some lookup code," even if that doesn't mean it's, you know, comprehensible at all, we said, "Well, who is the human consumer of data? What do they need?" And a lot of our technology has actually been, again, bringing the human back into the fold of what's been too computer- and machine-dominated. And the other thing you mentioned that's really critical is, we're in an age where automation is very exciting. There are a lot of wins there. One thing that I hear from CDO after CDO that I talk to is a three-phased process for bringing data into the organization. Phase one is descriptive analytics: what happened last quarter? Then there's predictive analytics: what's going to happen next quarter? And the final goal is prescriptive analytics, where your computer tells you what to do. Well, yeah, and where the computer can act, you know, before any human's even looked at it or been in the loop. And I think it's an interesting aspiration, especially for certain things that are really, really urgent. But these are all garbage-in, garbage-out processes, and the good news is that if you're looking at a place where the human's in the loop, they can say, "You know what, that doesn't look right in that graph," and maybe it's a problem with the ETL job or with the source data, and they can step in before something bad's happened. So I think as you progress down this evolution, there are great rewards but also greater risks, and our hope is that with a catalog you can make sure that, whatever process you're feeding, instead of garbage in, garbage out, it's the best data, that's up-to-date, that's trustworthy, that's contextualized for the business process.
>> Okay, one last question. You're now in a new role, operational and external. What are the first two things that you want to accomplish in this new role, especially as it pertains to working with your customers? What are you really focused on right now? >> Yeah, so one of our core values at Alation is that we listen as though we could be wrong, because we know that that's part of what being a data company is: you know, how do we learn from numeric and other kinds of signals that come in, to always be growing and improving? And so step one, unambiguously, is to listen as much as I can to the incredibly smart, innovative, thoughtful customers that we have and try to synthesize the best learnings across all of them. I think the next step is to then do that synthesis and say, "Oh, what do we see happening in retail that could pertain to finance, or vice versa?" and figure out kind of what that curve is, and how we can either push everybody up the steep parts of the curve, so we can all be more data-driven and more curious and more rational together, or even have, you know, the software kind of lower that curve. >> All right, so two great points: it's get up faster, or use the tool to flatten the curve. >> Exactly. >> Very wise. Well, Aaron, this has been a great conversation. Once again, I want to thank you for joining us on another CUBE Conversation. My name is Peter Burris. See you next time. >> Thank you, Peter. (music)

Published Date : Apr 26 2019


ENTITIES

Entity | Category | Confidence
April 2019 | DATE | 0.99+
Peter Burris | PERSON | 0.99+
Peter Burris | PERSON | 0.99+
Aaron | PERSON | 0.99+
Peter | PERSON | 0.99+
Aaron Kalb | PERSON | 0.99+
Aaron kal | PERSON | 0.99+
Silicon Valley | LOCATION | 0.99+
last quarter | DATE | 0.98+
next quarter | DATE | 0.98+
two great points | QUANTITY | 0.98+
2 billion | QUANTITY | 0.97+
seven years | QUANTITY | 0.97+
one | QUANTITY | 0.97+
C do | TITLE | 0.97+
both | QUANTITY | 0.95+
first two things | QUANTITY | 0.95+
Palo Alto California | LOCATION | 0.94+
one thing | QUANTITY | 0.93+
step one | QUANTITY | 0.93+
today | DATE | 0.92+
Nisour | ORGANIZATION | 0.92+
three-phased | QUANTITY | 0.91+
two different places | QUANTITY | 0.91+
few years ago | DATE | 0.88+
100-plus hundred-plus customers | QUANTITY | 0.87+
first | QUANTITY | 0.87+
one thing | QUANTITY | 0.87+
over a hundred production customers | QUANTITY | 0.84+
one main goal | QUANTITY | 0.8+
many years ago | DATE | 0.77+
one last question | QUANTITY | 0.77+
Alation | PERSON | 0.74+
a couple bits | QUANTITY | 0.67+
lot of | QUANTITY | 0.63+
lot | QUANTITY | 0.6+
CDO | ORGANIZATION | 0.53+
places | QUANTITY | 0.52+

Aaron Kalb, Alation | CUBEConversation, January 2019


 

>> Hello everyone. Welcome to this CUBE Conversation here in Palo Alto. I'm John Furrier, co-host of theCUBE. I'm here with Aaron Kalb, the co-founder and VP of design at Alation. Great to see you, with some fresh funding news. Aaron, thanks for coming in and spending the time. Good to see you again. >> Good to see you, John. Thanks for having me. >> So, big news. You guys got a very big round of financing to go to the next level as a startup, certainly coming out of that startup phase and into the growth phase. Super exciting news. You guys are doing some very innovative things around data, around community, around people, and really kind of cracking the code on this humanization and democratization of data, but actually helping businesses. I want to talk about it with you. First, give us the update on the financing: the amount, what it means to the company. A lot of cash. >> Yeah. So we're very excited to have raised a fifty million dollar round. Sapphire led the round, and we also had, you know, re-ups from all of our existing investors. And, you know, as a co-founder, I've always had big dreams for growth. And it's just validating to have a community of investors who can see that future too, as well as our great community of over one hundred customers now who want to build this data-democratized future with us. >> We've been following you guys since the founding, obviously, watching you guys make great use of capital. Fifty million's a lot of capital, so obviously a validation check. Good job. But now you go to a whole other level of growth. What's the capital going to be deployed for? What's going on with the company? Where are you guys at? And in terms of innovation, what's the key focus? >> It's a great question. So, you know, obviously we have revenue from our customers, but getting this extra infusion from VC lets us just supercharge our development. It's growth: it's going to more customers, both domestically and abroad, going to a broader user base and more enterprise-wide adoption within those customers, as well as innovation in the core product: new technology, great design, and features that are really going to change the way organizations access and use data to make better decisions. >> What were the key learnings as you guys went into this round of funding, outside the validation of getting through due diligence, all that good stuff? You guys have hit some successful milestones. What were the notable accomplishments that Alation hit to kind of hit this trigger point for the fifty million? >> Yeah, I'm glad you asked about that. I think the key thing that's changed, that's enabled this next phase, is that the data catalog market has really come into its own. In the early days, we were knocking on doors. You know, we didn't even know it was going to be called a data catalog in our first few months. And even though we had the technology, we said, "Hey, we've got this thing and we know it's useful. Please buy it. Please want it." And the question was, you know, "What's a data catalog? Why would I ever even look at that?" And it's just turned a corner. Now, you know, thanks in part to things like Gartner telling companies that, in the next year, by 2020, if you have a data catalog, you're going to see twice the ROI from your existing data investments than if you don't, stories like that are making companies say, "Of course we want a data catalog." It just turned on a dime. Now they're asking, "Which data catalog should we get? Why is yours the best?" And this change, of the market maturing, I think is the biggest change we've seen. >> One thing that we've observed, and I want to get your reaction to this, is that, with cloud computing economics, it's phenomenal to see scale, data, and data science working in the cloud. We see great success there. Now there's multiple clouds; multi-cloud's a big trend. But there's also the validation that it's not just all cloud anymore. 
The on-premises activity still is relevant, although it might have cloud operations. That really kind of changes the role of data. You mentioned the data catalog kind of having mainstream visibility from analysts like Gartner and others, and Wikibon as well. It makes data the center of the innovation. Now you have data challenges around, okay, where's the data deployed? Where am I using the data? Because data scientists want ease of data, they want quality data. They want to make sure their algorithm, whether it's a machine learning component or software, is actually running on good data. So data effectiveness is now part of the operations of most businesses. What's your reaction to that? What are your thoughts? Is that how you see it? Is there something different there? What's going on with the whole data at the center? >> Absolutely. You hit on two key themes for us. One is that idea of the center, and the other is your point about data quality and data trust. So, centrality, we think, is really essential. You know, we're seeing cataloging technology crop up more and more. A lot of people are coming out with catalogs, or catalog kind of add-ons to their products. But what our customers really tell us is they want the data catalog to be the hub, that one-stop shop where they go to access any data, wherever it lives, whether it's in the cloud or on-prem, whether it's in a relational database or a file system. So one of Alation's key differentiators early on was being that central index, much like Google is kind of the front page to the Internet, even though it's linking to pages all over the place. And the other thing, in terms of data quality and data trustworthiness, has been a differentiator too, and this was something that was part of our technology when we launched, though we didn't put the label on it till later. It's this idea of Behavior I/O: looking at previous human behavior to influence future human behavior to be better. 
And that's another place we really took some inspiration from Google, and from Terry Winograd at Stanford before that. You know, if you remember back, before Google, search sucked, frankly, right? The results on top were not the most relevant, were not the most trustworthy. And the reason was those algorithms were based on asking, how often does your keyword appear in that website? So, in other words, you'd get results on top that might just not be very good, or that were even created by spammers who put in a lot of words to get SEO, and, you know, that isn't the best result for you. What Google did was turn that around with PageRank and say, let's use the signals that other people are leaving behind about the pages they find valuable to get the best result on top. And Alation does the exact same thing. Our patented, proprietary behavior technology lets us ask: Who's using this data? How are they using it? Is it reputable? And that enables us to get the right data, and trusted data, in front of decision makers. >> And you call that Behavior I/O? >> Behavior I/O, that's right. >> I mean, I certainly remember Google: algorithmic search was pooh-poohed at first; it had to be a portal. Everyone kind of my age can remember those days, and the results were keyword-stuffed by spammers. But algorithmic search accelerated the quality. So I've got to ask you to unpack the Behavior I/O a little bit, go a little deeper. What does that mean for customers? Because now, obviously, people start thinking, okay, I need to catalog my data, because now I need to have replication, all kinds of technical things that are going on around the integrity of the data. But why Behavior I/O? What's the angle on that? What's the impact to the customer? Why is this important? >> Absolutely. So, it might help to work through an example. You know, we joke about this. 
You might be looking around in your SharePoint drive and find an Excel file called "Q3 numbers, final, underscore final." Okay, that seems like that should be the final numbers. And then you see next to it one that says "underscore final, underscore final, underscore finalist." Okay, well, is that one final? And it turns out what data says about itself is less reliable than what other people say about the data. Same thing with Google: if everyone's linking to a Wikipedia page, that's a more reliable page than one that just has, you know, paid for a higher placement, right? So what that means in an organization is, Alation will tell you, you know, this is the data table that was refreshed yesterday and that the CFO and everybody in this department is using every day. That's a really strong signal that that's trustworthy data, as opposed to something that was only used once, a year ago. >> So relevance is key there. >> Absolutely. It's relevance and trustworthiness. We find both are indicated more strongly by who's using it, and how, than by the data itself. >> Are you seeing with data scientists and people who are wrangling data, or data analysts, that if the data is not high quality, they abandon the usage? Are you getting kind of stats around that? Because we're hearing a lot of, "Hey, you know, I'm not going to really work on the data, I'm not going to do all the heavy lifting on the front end, if the data quality's not there." >> Absolutely. We see a really cool upward spiral. So in Alation, we have a mix of manual, human-curated metadata, you know, data stewards and data curators saying, "This is endorsed data. This is certified data. This is applicable for this context." But we also do this automatic Behavior I/O. We parse the query logs. These logs were, you know, put there for audit and debugging purposes, but we're mining them for behavioral insight, and we show the two side by side. And what we see over time is, on day one, there's no manual curation. But as that curation gets added in, we see a strong correlation between the best, highest-quality data and the most-used data. And we also see an upward spiral where, if on day one people are using data that isn't trustworthy, that's stale or miscalculated, then as soon as an Alation steward slaps a deprecation or a warning on the data asset, because of technology like TrustCheck, which we talked about last time I was here (that's the "O" part of Behavior I/O), we stop the future behavior from being on bad data. And we see an upward spiral where suddenly the bad data is no longer being used and everyone's guided to the good data. >> One thing I'm really impressed with you guys on is you have a great management team, and an overall team, with mixed disciplines. Okay, I think we talked last time about your role, Stanford, and the human side of the world. But you have the search analogy, which is interesting, because you have search folks. You've got hardcore data geeks, all working together. And if you think about discovery and navigation, which is the Google paradigm, I need to find a web page and go to it. You guys are in that same business of helping people discover data and act on it, or take action. Same kind of paradigm. So share some customer impact anecdotes: people who bought Alation, your service and offering, and what happened after, and what it was like before. Can we talk about some of that? Because I think you're onto something pretty big here with this discovery, actionable-data perspective. >> Yeah, well, one of our values at Alation is that we measure our success through customer impact, you know, not through financing or other milestones, though we are excited about them. So I would love to talk about our customers. One example of a business impact is one that our champion at Safeway Albertsons describes, from after Safeway was acquired by Albertsons. They'd been sort of pioneers of digital loyalty and engagement, and there was a move to kind of stop that in its tracks and switch to just mailing people big books of coupons instead of customizing, you know, deals for you based on your buying behavior. And they talked about getting a 30x ROI on the dollars they've spent on Alation by basically proving the value of their program and kind of maximizing their relationship with their customers. But the stories that are even more exciting to me than just business impact in dollars and cents are when we can leave a positive impact on people's lives with data. There's a few examples of that. Munich Reinsurance, the biggest reinsurer and also a primary insurer in Europe, had some coverage in Forbes about the way that they use Alation and other data tools to be able to help people get back on their feet more quickly after earthquakes and other natural disasters. And similarly, there's a piece in The Wall Street Journal about how Pfizer is able to create diagnostics and treatments for rare diseases where it wouldn't have been a good ROI to even invest in those if they didn't get that increased efficiency in analytics from Alation and other data tools. >> So it's not just one little vertical. I mean, data is horizontally scalable. It's not like one industry is going to leverage Alation. >> Absolutely. So, you know, I mentioned just now insurance and healthcare and retail. We're also in tech. We're in basically every vertical you can imagine, and even multiple sectors. You know, I've been focusing on industry, but there's another case that you can read about at the City of San Diego, where they're doing an open data initiative, enabling people to figure out everything from where parking is easiest or hardest to anything else. >> The Behavior I/O, and it's all about context and behavior, the role of data and all this. It's kind of fundamental to businesses. >> That's right. 
It's all about taking everything about how people are using data today and driving people to be even more data driven, more accurate, better able to satisfy their curiosity, and more rational in the future. >> So if I'm a potential customer and I've heard of Alation, you've got the buzz out there, why would I need you? What are some signals that would indicate that I should call Alation? What's the pitch? >> Yeah, it's a great question. You know, I sometimes joke with the team that every five minutes another enterprise reaches that point where they can't do it the old way anymore, and they need Alation. And the reason for that is that data is growing exponentially and people can only grow at most, you know, linearly. So I compare it a bit, again, to the days of Yahoo. When the Internet was small, you could make a table of contents for it. But as there came to be trillions of web pages, you needed an automatic index with PageRank to make sense of it. So I would say, once you find that your analytics team has spread out and they're spending, you know, eighty percent of their time calling up other people to find where the relevant data is, or asking, to your point, is this data high quality, should I even spend my time on it? You know, that's probably not money well spent, with these highly paid people spending their time scrounging. If you want to switch from scrounging to finding, understanding, and trusting your data for quick and accurate analysis, give us a call. >> So basically the pitch is, if you want to be like Yahoo, do it the old way. We know what happened. If you want to be like Google, go algorithmic and have data, go with Alation, and you'll be around for a while. That's my words. >> And that's part of turning that corner. I think in the beginning we were trying to tell people this could be a nice-to-have.
And now customers are coming to us realizing it's a must-have to stay relevant, you know. And if you've made all these investments in data infrastructure and data people, but you can't connect the dots, as you said, between the human side and the tech side, that money's all wasted and you're not going to be able to compete against your competitors and impact your customers the way you want. >> Well, Aaron, congratulations. Certainly, as the co-founder, it's a great success. And I know how hard startups are. You guys worked hard. And again, we've been following you guys. It's been interesting to see that growth and the innovation involved, the creativity, a lot of energy. You guys do a good job. So final question: talk about the secret sauce of Alation. What's the key innovation formula? And now that you've got the funding, what are you going to double down on, and where's the innovation going to come next? So, the innovation formula and where the innovation goes in the future. >> Absolutely. Innovation has been critical for us to get here, and our customers didn't just buy the exciting features with Behavior I/O and TrustCheck that we had, but are also buying into the idea that we're going to continue to be the leaders and to innovate. And we're going to do that. So I think the secret sauce, which we've had in the past and we're going to continue to innovate in this vein, is to be really conscious of what computers are great at, what humans are uniquely good at, and what humans like doing, and trying to have the human and computers work together to really help the human achieve their goals. Right? So, back to the Google example. You know, there's a bunch of systems for collaboratively ranking things, but it takes work to, you know, write a review on Yelp or Amazon. Google had the insight that we could leverage what people are already doing and make a ranking out of that. We're going to continue to do that.
The other kind of innovation you'll see is bringing Alation to a wider and wider audience, with less and less technical skill needed. So I came from Siri at Apple, and the idea is you don't have to learn a programming language to query a database. You could just speak in English. That helps you answer questions like, what's the weather today? Imagine taking that same kind of experience of seamless integration to the more important questions enterprises are asking. >> We'll have to tap your expertise, as we want to have an app called the Cube Siri, where you ask theCUBE, what's the innovation in Silicon Valley, and have it just spit out a video. I'm kidding. Final question, just to double down on that piece, because I think the human interaction is a big part of what you're saying. I've always loved that about what your vision is. But this points to a major problem we're seeing, whether it's, you know, media or the news cycle. These days, people are challenging the efficacy of finding the research, the real deep research, in the media. So what we're seeing with data is that scale is a huge challenge. You mentioned the growth of data. Computers can scale things, but the knowledge and the curation, that dynamic of packaging it, finding it, acting on it, is kind of where you guys are hitting. Talk about that dynamic. Am I getting that right, and is that important? Because, you know, certainly scale is table stakes these days. >> That is super insightful, John, because I think human cognition, human thought, excuse me, is the bottleneck for being data driven, right? We have on the Internet trillions of web pages, you know, more than the Library of Alexandria a hundred times over, and we have in databases millions of columns and trillions of rows. But for that to actually impact the business and impact the world in a positive way, it's got to go through a person who can understand it.
And so, in the same way that Google became the mechanism by which the Internet becomes accessible, we think that Alation for organizations is becoming the way that data can become actionable. And the other thing I would say is, you know, in this age of alternative facts and mistrust of data, we're sort of realizing that just having more information out there doesn't actually make people wiser and better able to reason. It can actually be a lot of noise that muddies the signal and confuses people. So we think Alation, by also using human-computer interaction to help separate the signal from the noise and the quality from the garbage, can help stop the garbage in, garbage out, and make people more rational and more curious and have more trust in what they're hearing and understanding. >> Well, that PageRank kind of metaphor is interesting because the human gestures, whether it's work or engaging on the data, are a signal too, not just algorithmic metadata extraction. >> Absolutely. Anything you do with data in any tool, even outside of Alation, Alation will capture that and use it to guide future behavior for you and your peers to be better and smarter. >> Fifty million dollars. Where's this all going to lead to? What's the next innovation? What do you guys see as the future for Alation? >> Well, you know, I was just thinking before the show, I used to be at Apple kind of in the golden age when Apple was really innovative. And there was the joke where they'd release something new and say, Redmond, start your photocopiers. So in this interview, I'm going to be a little close to the chest about the specifics of what we're releasing. But I will tell you we have a roadmap that we're really excited about, to go to a broader and broader audience and impact our customers more fully. >> Well, feel free to say one more thing. >> Yeah. I think the secret to the future is... >> Aaron, thanks for coming on. >> Really appreciate it. >> Congratulations on the funding.
You guys have got a very innovative formula. Good luck. And we'll be following you guys. Thanks for coming on Cube Conversation. Thanks so much. Aaron Kalb, co-founder and VP of Design at Alation. Interesting formula, great success formula, great innovation. Alation, check them out. I'm John Furrier here in Palo Alto for Cube Conversation. Thanks for watching.

Published Date : Jan 24 2019



Stephanie McReynolds, Alation | CUBE Conversation, December 2018


 

(bright classical music) >> Hi, I'm Peter Burris and welcome to another CUBE Conversation from our studios here in Palo Alto, California. We've got another great conversation today. Specifically, we're going to talk about some of the trends and changes in data catalogs, which are emerging as a crucial technology to advance data-driven business on a global scale. And to do that, we've got Alation here, specifically Stephanie McReynolds, who's the Vice President of Marketing at Alation. Stephanie, welcome back to theCUBE. >> Thank you, it's great to be here again. >> So Stephanie, before we get into this very important topic of the increasingly obvious connection between knowing what your data is, knowing where it is, and business outcomes in a data-driven business world, let's talk about Alation. What's the update? >> Yeah, so we just celebrated, yesterday in fact, the sixth anniversary of incorporation of the company. And upon reflecting on some of the milestones that we've seen over those six years, one of the exciting developments is we went from initially about seven production implementations a couple years after we were founded, to now over a hundred organizations that are using Alation. And in those organizations over the last couple of years, we've seen many organizations move from hundreds of users to now thousands of users. An organization like Scout24 has 70 percent of the company as self-service analytics users, and a significant portion of those users are now using Alation. So we're seeing companies in Europe, like Scout24, which is in Germany. Companies like Pfizer in the United States and Munich Reinsurance in the financial services industry also hit about 2,000 users of Alation, and so it's exciting to look at our origins with eBay as our very first customer, which is now up to about 3,000 users.
And then these more recent companies adopting Alation, all of them are now getting to a point where they really have a large population that's using a data catalog to drive self-service analytics and business outcomes out of that self-service analytics. >> So a hundred first-rate brands as users, and international expansion. Sounds like Alation's really going places. What I want to do, though, is talk a little bit about some of the outcomes that these companies are starting to achieve. Now, we have been on the record here at SiliconANGLE, with theCUBE and Wikibon, for quite some time, trying to draw a relationship between business, digital business, and the role that data plays. Digital business transformation, in many respects, is about how you evolve the role that data plays in your business to become more data-driven. It's hard to do without knowing what your data is, where it is, and having some notion of how it's being used in a verified, trusted way. How are you seeing your companies start to tie the use of catalogs to some of these outcomes? What kind of outcomes are folks trying to achieve, first off? >> Yeah, you're right. Just basic table stakes for turning an organization into one that relies on data-driven decision-making rather than intuitive decision-making requires an inventory. And so that's table stakes for any catalog, and you see a number of vendors out there providing data inventories. But what I think is exciting with the customers that we work with is they are really undertaking transformative change, not just in the tooling and technology their company uses, but also in the organizational structure and data literacy programs, and driving towards real business impact and real business outcomes. An example of an Alation customer who's been talking recently about outcomes is Pfizer. Pfizer was covered in a Wall Street Journal article recently.
They also spoke at Tableau Conference about how they're using a combination of the Alation data catalog with Tableau on the front end, and a data science platform called Dataiku, in an integrated analytics workbench that is helping them with new drug discovery. And so, for populations of ill individuals who may have a rare form of heart disease, they're now able to use machine learning and algorithms that are informed by the data catalog to catch the one percent, two percent of heart disease patients who have a slight deviation from the norm, and can deliver drugs appropriately to that population. Another example of a business outcome would be with an insurance company; very different industry, right? But Munich Reinsurance is a huge global reinsurance company, so you think about hurricanes or the fires we had here in the United States, they actually support first-line insurers by reinsuring them. They're also founding new business units for new types of risks in the market. An example would be a factory that is fully controlled by robots. Think about the risks of having that factory be taken over by hackers in the middle of the night, when there are not a lot of employees on the floor. Munich Reinsurance is leveraging the data catalog as a collaboration platform between actuaries and individuals that are knowledgeable in the business to define what are the data products that could support an entirely new business unit, like for cyber crime, and investing in those business units based on the innovation they're doing using the data catalog as a collaboration platform. So these are two great examples of organizations that, a couple years ago, started with a data catalog, but have driven so many more initiatives than just analyst productivity off of that implementation. >> Oh, those are great outcomes. One of them, talking about robots in the factory, an automated factory: one thing, if they went haywire, it would make for some interesting viral video.
(gently laughs) >> That's right. That's right. >> But coming back, the reason I say that is because in many respects, with these practices, these relationships with the outcomes, the outcomes are the really complex thing. You talked about becoming more familiar with data, using data differently, becoming more data driven. That requires some pretty significant organizational change. And it seems to me, and I'm querying you on this, that by bringing together these users to share their stories about how to achieve these data-driven outcomes, made more productive by catalogs and related technologies, communities must start to be forming. Are you seeing communities form around achieving these outcomes and utilizing these types of technologies to accelerate the business change? >> So what's really interesting at an organization like Munich Reinsurance or at Pfizer is there's an internal community that is using the data catalog as a collaboration platform and as kind of a social networking platform for the data nerds. So if I am a brand new user of self-service analytics, I may be a product manager who doesn't know how to write a SQL query yet, who doesn't know how to go and wrangle my own data. >> Yeah, may never want to. (playfully laughs) >> May never want to. May never want to. Who may not know how to go and validate data for quality or consistency. I can now go to the data catalog to find trusted resources of data assets, be that a dashboard or report that's already been written, or be that raw data that someone else has certified or just has used in the past. So we're seeing this social influence happen within companies that are using data catalogs, where they can see on the data catalog pages who's used and who's validated this data set, so that I now trust the data.
And then, what we've seen happen, just within the last year and a half or so, is these organizations, the sponsors of the data at these organizations, are starting to share best practices naturally with one another, and saying, hey... >> Across organizations. >> Across organizations. And so there has been a demand for Alation to get out into the market and help catalyze the creation of communities across different organizations. We kicked off, within the last two months, a series of meetings that we've called RevAlation. >> R-E-V-A-L... >> That's right. >> A-T-I-O-N. >> R-E-V-A-L-A-T-I-O-N. And the thing behind the name is, if you can start to share best practices in terms of how you create a data-driven culture across organizations, you can begin to really get breakthrough speed, right, in making this transformation to a data-driven organization. And so, I think what's interesting at the RevAlation events is folks are not talking just about how they're using the tool, how they're using technology. They're actually talking about how do we improve the data literacy of our organizations, and what are the programs in place that leverage, maybe, the data catalog to do that. And so they're starting to really think about not just how the technical architecture and the tooling change in their organizations, but how do we close this gap between having access to data and trusting the data, and getting folks who maybe aren't too familiar with the technical aspects of the data supply chain. How do we make them comfortable in moving away from intuitive decisions to data-driven decisions? >> Yeah, so the outcome really is not just the application of the tool, it's the new behaviors in the business that are associated with being data-driven. But to do that, you still have to gain insight and understand what kinds of practices are best used with the tool itself. >> That's right. >> So it's got to be a combination. But, you know, Alation has been, if I can say this.
Alation's been on this path for a while. Not too long ago, you came on theCUBE and you talked about TrustCheck. >> Right. >> Which was an effort to establish conventions and standards for how data could be verified and validated so that it would be easy to use, so that someone could use the data and be certain that it is what it is, without necessarily having to understand the data. Something that could be very good, for example, for folks who are very focused on the outcome, and not focused on the science of the data associated with it. >> That's right. >> So, is this part of it? It's RevAlation, it's TrustCheck. Is this part of the journey you're on to try to get people to see this relationship between data-driven business and knowing more about your data? >> It absolutely is. It's a journey to get organizations to understand what is the power that they have internally, within this data, and close the gap, which is in part organizational, but in part, for individual users, psychological: how do you get to a trusted decision? And so, you'll continue to see us invest in features like TrustCheck that highlight how technology can make recommendations, can help validate and verify what the experts in the organization know, and propagate that more widely. And then you'll also see us share more best practices about how you start to create the right organizational change, and how you start to impact the psychology of fear that we've had in many organizations around data. And I think that's where Alation is uniquely placed, because we have the highest number of data catalog customers of any vendor I'm familiar with in this space. And we also have a unique design approach. When we go into organizations and talk about adopting a data catalog, it's as much about how our products support psychological comfort with data as about how they support the actual workflow of getting that query completed, or getting that data certified.
And so I think we've taken a bit of a unique approach to the market from the beginning, where we're really designing holistically. It's not just, how do you execute a software program that supports workflow? It's how do you start to think about how the data consumer actually adopts those best practices and starts to think differently about how they use data in a more confident way? >> Well, I think the first time that you and I talked on theCUBE was probably 2016, and I was struck by the degree to which Alation as a tool, and the language that you used in describing it, was clearly designed for human beings to use it. >> Right. >> As opposed to for data. And I think that is a unique proposition, because at the end of the day, the goal here is to have people use data to achieve outcomes and not just to do a better job of managing data. >> And that doesn't mean that, I mean, we have a ton of machine learning... >> Sure. >> ...and AI in the products. That doesn't take away from the power of those algorithms to speed up human work and human behavior. But we really believe that the algorithms need to complement human input and that there should be a human in the loop with decision-making. And then the algorithms propagate the knowledge that we have of experts in the organization. And that's where you get the real breakthrough business outcomes, when you can take input from a lot of different human perspectives and optimize an outcome by using technology as a support structure to help that. >> In a way that's familiar and natural and easy for others in your organization. >> That's right. That seems, you know, if you go back to... >> It makes sense. >> When we were all introduced to Google, it was a little bit of an odd thing to go ask Google questions and get results back from the internet. We see data evolving in the same way. Alation is the Google for your data in your organization.
At some point it'll be very natural to say, 'Hey Alation, what happened with revenue last month?' And Alation will come back with an answer. So I think that future is in sight, where it's very easy to use data. You know you're getting trusted responses. You know that they're accurate because there's either a certification program in place that the technology supports, or there's a social network that's bubbling this information up to the top, that is a trusted source. And so, that evolution in data needs to happen for our organizations to broadly see analytics-driven outcomes. Just as in our consumer or personal life, Google had to show us a new way of evolving, you know, to a kind of answering machine on the internet. >> Excellent. Stephanie McReynolds, Vice President of Marketing at Alation, talked to us about building communities to achieve data-driven outcomes, utilizing data catalog technology. Stephanie, thanks very much for being here. >> Thanks for inviting me. >> And once again, I'm Peter Burris, and this has been another CUBE Conversation. Until next time. (bright classical music)

Published Date : Dec 14 2018



Stephanie McReynolds, Alation | theCUBE NYC 2018


 

>> Live from New York, It's theCUBE! Covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Hello and welcome back to theCUBE live in New York City, here for CUBE NYC. In conjunct with Strata Conference, Strata Data, Strata Hadoop This is our ninth year covering the big data ecosystem which has evolved into machine learning, A.I., data science, cloud, a lot of great things happening all things data, impacting all businesses I'm John Furrier, your host with Dave Vellante and Peter Burris, Peter is filling in for Dave Vellante. Next guest, Stephanie McReynolds who is the CMO, VP of Marketing for Alation, thanks for joining us. >> Thanks for having me. >> Good to see you. So you guys have a pretty spectacular exhibit here in New York. I want to get to that right away, top story is Attack of the Bots. And you're showing a great demo. Explain what you guys are doing in the show. >> Yah, well it's robot fighting time in our booth, so we brought a little fun to the show floor my kids are.. >> You mean big data is not fun enough? >> Well big data is pretty fun but occasionally you got to get your geek battle on there so we're having fun with robots but I think the real story in the Alation booth is about the product and how machine learning data catalogs are helping a whole variety of users in the organization everything from improving analyst productivity and even some business user productivity of data to then really supporting data scientists in their work by helping them to distribute their data products through a data catalog. >> You guys are one of the new guard companies that are doing things that make it really easy for people who want to use data, practitioners that the average data citizen has been called, or people who want productivity. Not necessarily the hardcore, setting up clusters, really kind of like the big data user. 
What's that market look like right now? Has it met your expectations? How's business? What's the update? >> Yeah, I think we have a strong perspective that for us to close the final mile and get to real value out of the data, it's a human challenge. There's a trust gap with managers. Today on stage over at Strata it was interesting, because Google had a speaker, and it wasn't their chief data officer, it was their chief decision scientist. And I think that reflects what that final mile is: it's making decisions, and it's the trust gap that managers have with data, because they don't know how the insights are coming to them, what are all the details underneath. In order to be able to trust decisions, you have to understand who processed the data, what decision-making criteria did they use, was this data governed well, are we introducing some bias into our algorithms, and can that be controlled? And so Alation becomes a platform for supporting getting answers to those issues. And then there's plenty of other companies that are optimizing the performance of those queries and the storage of that data, but we're really trying to close that trust gap. >> It's very interesting, because from a management standpoint we're trying to do more evidence-based management. So there's a major trend in boardrooms and executive offices to try to find ways to acculturate the executive team to using data, with evidence-based management, from healthcare, now being applied to a lot of other domains. We've also historically had a situation where the people who focused on or worked with the data were a relatively small coterie of individuals that crave these crazy systems to try to bring those two together. It sounds like what you're doing, and I really like the idea of the data scientists being able to create data products that then can be distributed.
It sounds like you're trying to look at data as an asset to be created, to be distributed so it can be more easily used by more people in your organization, have we got that right? >> Absolutely. So we're now seeing, we're in just over a hundred production implementations of Alation at large enterprises, and we're now seeing those production implementations get into the thousands of users. So this is going beyond those data specialists. Beyond the unicorn data scientists that understand the systems and math and technology. >> And business. >> And business, right. In business. So what we're seeing now is that a data catalog can be a point of collaboration across those different audiences in an enterprise. So whereas three years ago some of our initial customers kept the data catalog implementations small, right, they were giving access to the specialists to this catalog and asking them to certify data assets for others, what we're starting to see is a proliferation of creation of self-service data assets, a certification process that now is enterprise-wide, and thousands of users in these organizations. So eBay has over a thousand weekly logins, Munich Reinsurance was on stage yesterday, their head of data engineering said they have 2,000 users on Alation at this point on their data lake, Fiserv is going to speak on Thursday and they're getting up to those numbers as well. So we see some really solid organizations that are solving medical, pharmaceutical issues, right, the largest reinsurer in the world, leading tech companies, starting to adopt a data catalog as a foundation for how they're going to make those data-driven decisions in the organization. 
>> Talk about how the product works because essentially you're bringing kind of the decision scientists, for lack of a better word, and productivity worker, almost like a business office suite concept, as a SaaS, so you got a SaaS model that says "Hey, you want to play with data, use it, but you have to do some front end work." Take us through how you guys roll out the platform, how are your customers consuming the service, take us through the engagement with customers. >> I think for customers, the most interesting part of this product is that it displays itself as an application that anyone can use, right? So there's a super familiar search interface that, rather than bringing back webpages, allows you to search for data assets in your organization. If you want more information on that data asset you click on those search results and you can see all of the information of how that data has been used in the organization, as well as the technical details and the technical metadata. And I think what's even more powerful is we actually have a recommendation engine that recommends data assets to the user. And that can be plugged into Tableau and Salesforce Einstein Analytics, and a whole variety of other data science tools like Dataiku that you might be using in your organization. So this looks like a very easy to use application that folks are familiar with, that you just need a web browser to access, but on the backend, the hard work that's happening is the automation that we do with the platform. So by going out and crawling these source systems and looking at not just the technical descriptions of data, the metadata that exists, but then being able to understand, by parsing the SQL query logs, how that data is actually being used in the organization. We call it Behavior I/O. 
By looking at the behaviors of how that data's being used, from those logs, we can actually give you a really good sense of how that data should be used in the future or where you might have gaps in governing that data or how you might want to reorient your storage or compute infrastructure to support the type of analytics that are actually being executed by real humans in your organization. And that's eye-opening to a lot of I.T. sources. >> So you're providing insights into the data usage so that the business could get optimized, whether it's the I.T. footprint component or kinds of use cases, is that kind of how it's working? >> So what's interesting is the optimization actually happens in a pretty automated way, because we can make recommendations to those consumers of data of how they want to navigate the system. Kind of like Google makes recommendations as you browse the web, right? >> If you misspell something, "Oh, did you mean this", kind of thing? >> "Did you mean this, might you also be interested in this", right? It's kind of a cross between Google and Amazon. Others like you may have used these other data assets in the past to determine revenue for that particular region, have you thought about using this filter, have you thought about using this join, did you know that you're trying to do analysis that maybe the sales ops guy has already done, and here's the certified report, why don't you just start with that? 
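The query-log parsing described in this answer can be illustrated with a minimal sketch. This is not Alation's implementation; it only assumes a log of (user, SQL text) pairs and uses a deliberately naive regular expression to find table names after FROM and JOIN. The function name `table_usage` and the sample table names are hypothetical.

```python
import re
from collections import Counter

# Naive pattern (an assumption for illustration): a table name is whatever
# identifier follows FROM or JOIN. Real SQL needs a real parser.
TABLE_REF = re.compile(r"\b(?:from|join)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)

def table_usage(query_log):
    """Count distinct users per table across (user, sql) log entries."""
    users_by_table = {}
    for user, sql in query_log:
        for table in TABLE_REF.findall(sql):
            users_by_table.setdefault(table.lower(), set()).add(user)
    return Counter({table: len(users) for table, users in users_by_table.items()})

# Hypothetical log entries; the table names are made up.
sample_log = [
    ("ann", "SELECT * FROM sales.orders"),
    ("bob", "SELECT o.id FROM sales.orders o JOIN crm.accounts a ON o.acct = a.id"),
    ("ann", "SELECT count(*) FROM sales.orders"),
]
usage = table_usage(sample_log)
```

Ranking tables by how many distinct people query them is one simple signal a catalog could use to suggest popular data sets; `usage.most_common()` returns them in that order.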
We're seeing a lot of reuse in organizations, whereas in the past, I think, as an industry, when Tableau and Qlik and all these BI tools that were very self-service oriented started to take off, it was all about democratizing visualization by letting every user do their own thing. Now we're realizing that to get speed and accuracy and efficiency and effectiveness, maybe there's more reuse of the work we've already done in existing data assets, and by recommending those and expanding the data literacy around the interpretation of those, you might actually close this trust gap with the data. >> But there's one really important point that you raised, and I want to come back to it, and that is this notion of bias. So you know, Alation knows something about the data, knows a lot about the metadata, so therefore, I don't want to say understands, but it's capable of categorizing data in that way. And you're also able to look at the usage of that data by parsing some of the SQL statements and then making a determination of whether the data as it's identified is appropriately being used, based on how people are actually applying it, so you can identify potential bias or potential misuse or whatever else it might be. That is an incredibly important thing. As you know, John, we had an event last night and one of the things that popped up is how do you deal with emergence in data science, in A.I., etc. And what methods do you put in place to actually ensure that the governance model can be extended to understand how those things are potentially, in a very soft way, corrupting the use of the data. So could you spend a little bit more time talking about that, because it's something a lot of people are interested in, and quite frankly we don't know about a lot of tools that are doing that kind of work right now. It's an important point. >> I think the traditional viewpoint was if we just can manage the data we will be able to have a governed system. 
So if we control the inputs then we'll have a safe environment, and that was kind of like the classic single source of truth, data warehouse type model. >> Stewards of the data. >> What we're seeing is, with the proliferation of sources of data and how quickly, with IoT and new modern sources, data is getting created, you're not able to manage data at that entry point. And it's not just about systems, it's about individuals that go on the web and find a dataset and then load it into a corporate database, right? Or you merge an Excel file with something that's in a database. And so I think what we see happening, not only when you look at bias but if you look at some of the new regulations like [Inaudible] >> Sure. Ownership, [Inaudible] >> The logic that you're using to process that data, the algorithm itself, can be biased. If you have a biased training data set that you feed into a machine learning algorithm, the algorithm itself is going to be biased. And so the control point in this world, where data is proliferating and we're not sure we can control that entirely, becomes the logic embedded in the algorithm. Even if that's a simple SQL statement that's feeding a report. And so Alation is able to introspect that SQL and highlight that maybe there is bias at work in how this algorithm is composed. So with GDPR the consumer owns their own data; if they want to pull it out from a training data set, you got to rerun that algorithm without that consumer data, and that's your control point then going forward for the organization on different governance issues that pop up. 
>> Talk about the psychology of the user base, because one of the things that shifted in the data world is a few stewards of data managed everything, now you've got a model where literally thousands of people in an organization could be users, productivity users, so you get a social component in here that people know who's doing data work, which, in a way, creates a new persona or class of worker. A non-techy worker. >> Yeah. It's interesting, if you think about moving access to the data and moving the individuals that are creating algorithms out to a broader user group, what's important is you have to make sure that you're educating and training and sharing knowledge with that democratized audience, right? And to be able to do that you kind of want to work with human psychology, right? You want to be able to give people guidance in the course of their work rather than have them memorize a set of rules and try to remember to apply those. If you had a specialist group you can kind of control and force them to memorize and then apply. The more modern approach is to say, "Look, with some of these machine learning techniques that we have, why don't we make a recommendation: what you're going to do is introduce bias into that calculation." >> And we're capturing that information as you use the data. >> Well, we're also making a recommendation to say "Hey, do you know you're doing this? Maybe you don't want to do that." Most people who are using the data are not bad actors. They just can't remember all the rule sets to apply. So what we're trying to do is catch someone behaviorally in the act before they make that mistake and say, hey, just a bit of a reminder, a bit of a coaching moment, did you know what you're doing? Maybe you can think of another approach to this. And we've found that in many organizations that changes the discussion around data governance. It's no longer this top-down constraint to finding insight, which frustrates an audience that is trying to use that data. 
It's more like a coach helping you improve, and then the social aspect of wanting to contribute to the system comes into play and people start communicating, collaborating on the platform and curating information a little bit. >> I remember when Microsoft Excel came out, the spreadsheet, or Lotus 1-2-3, oh my God, people are going to use these amazing things with spreadsheets, and they did. You're taking a similar approach with analytics, much bigger surface area of work to kind of attack from a data perspective, but in a way kind of the same kind of concept, put it in the hands of the users, have the data in their hands so to speak. >> Yeah, enable everyone to make data-driven decisions. But make sure that they're interpreting that data in the right way, right? Give them enough guidance, don't let them just kind of attack the wild west and ferret it out. >> Well, looking back at the Microsoft Excel spreadsheet example, I remember when a finance department would send a formatted spreadsheet with all the rules for how to use it out to 50 different groups around the world, and everyone figured out that you can go in and manipulate the macros and deliver any results they want. And so it's that same notion, you have to know something about that, but this site, in many respects, Stephanie, you're describing a data governance model that really is more truly governance, that if we think about a data asset it's how do we mediate a lot of different claims against that set of data so that it's used appropriately, so it's not corrupted, so that it doesn't affect other people, but very importantly so that the outcomes are easier to agree upon because there's some trust and there's some valid behaviors and there's some verification in the flow of the data utilization. >> And where we give voice to a number of different constituencies. Because business opinions from different departments can run slightly counter to one another. 
There can be friction in how to use particular data assets in the business depending on the lens that you have in that business, and so what we're trying to do is surface those different perspectives, give them voice, allow those constituencies to work that out in a platform that captures that debate, captures that knowledge, makes that debate a knowledge foundation to build upon. So in many ways it's kind of like the scientific method, right? As a scientist I publish a paper. >> Get peer reviewed. >> Get peer reviewed, let other people weigh in. >> And it becomes part of the canon of knowledge. >> And it becomes part of the canon. And in the scientific community over the last several years you see that folks are publishing their data sets out publicly, why can't an enterprise do the same thing internally for different business groups? Take the same approach. Allow others to weigh in. It gets them better insights and it gets them more trust in that foundation. >> You get collective intelligence from the user base to help come in and make the data smarter and sharper. >> Yeah, and have reusable assets that you can then build upon to find the higher level insights. Don't run the same report that a hundred people in the organization have already run. >> So the final question for you. As you guys are emerging, starting to do really well, you have a unique approach, honestly we think it fits in kind of the new guard of analytics, a productivity worker with data, which we think is going to be a huge persona, where are you guys winning, and why are you winning with your customer base? What are some things that are resonating as you go in and engage with prospects and customers and existing customers? What are they attracted to, what do they like, and why are you beating the competition in your sales and opportunities? 
>> I think this concept of a more agile, grassroots approach to data governance is a breath of fresh air for anyone who has spent their career in the data space. We're at a turning point in industry where you're now seeing chief decision scientists, chief data officers, chief analytic officers take a leadership role in organizations. Munich Reinsurance is using their data team to actually invest and hold new arms of their business. That's how they're pushing the envelope on leadership in the insurance space, and we're seeing that across our install base. Alation becomes this knowledge repository for all of those minds in the organization, and encourages a community to be built around data and insightful questions of data. And in that way the whole organization raises to the next level, and I think it's that vision of what can be created internally, how we can move away from just claiming that we're a big data organization and really start to see the impact of how new business models can be created from these data assets, that's exciting to our customer base. >> Well, congratulations. A hot startup. Alation here on theCUBE in New York City for CUBE NYC. Changing the game on analytics, bringing a breath of fresh air to the hands of the users. A new persona developing. Congratulations, great to have you. Stephanie McReynolds. It's theCUBE. Stay with us for more live coverage, day one of two days live in New York City. We'll be right back.

Published Date : Sep 12 2018



Aaron Kalb, Alation | CUBEconversations June 2018


 

(stirring music) >> Hi, I'm Peter Burris, and welcome to another CUBE Conversation from theCUBE Studios in beautiful Palo Alto, California. Got a great conversation today. We're going to be talking about some of the new advances that are associated with big data analytics and improving the rate at which human beings, people who actually work with data, can get more out of their data, be more certain about their data, and improve the social system that actually is dependent upon data. To do that, we've got Aaron Kalb of Alation here with us. Aaron is the co-founder and is VP of design and strategic initiatives. Aaron, welcome back to theCUBE. >> Thanks so much for having me, Peter. >> So, then, let's start this off. The concern that a lot of folks have when they think about analytics, big data, and the promise of some of these new advanced technologies is they see how they could be generating significant business value, but they observe that it often falls short. It falls short for technological reasons, you know, setting up the infrastructure is very, very, difficult. But we've started solving that by moving a lot of these workloads to the cloud. They also are discovering that the toolchains can be very complex, but they're starting to solve that by working with companies with vision, like Alation, about how you can bring these things together more easily. There are some good things happening within the analytics space, but one of the biggest challenges is, even if you set up your pipelines and your analytics systems and applications right, you still encounter resistance inside the business, because human beings don't necessarily have a natural affinity for data. Data is not something that's easy to consume, it's not something easy to recognize. People just haven't been trained in it. We need more that makes it easy to identify data quality, data issues, et cetera. 
Tell us a little bit about what Alation's doing to solve that human side, the adoption side of the challenge. >> That's a great point and a great question, Peter. Fundamentally, what we see is it used to be a problem of quantity. There wasn't enough ability to generate data assets, and to distribute them, and to get to them. Now, there's just an overwhelming amount of places to gather data. The problem becomes finding the relevant data for your need, understanding and putting it into context, and most fundamentally, trusting that it's actually telling you a true story about the world. You know, what we find now is, as there's been more self-service analytics, there's more and more dashboards and queries and content being generated, and often an executive will look at two different answers to the same question that are trending in totally different directions. They'll say, "I can't trust any of this. On paper, I want to be data-driven, but in actuality, I'm just going to go back to my gut, 'cause the data is not always trustworthy, and it's hard to tell what's trustworthy and what's not." >> This is, even after they've found the data and enough people have been working on it to put it in context, to say, "Yes, this data is being used in marketing," or, "This data has been used in operations production," there's another layer of branding or whatnot that we can put on data that says, "This data is appropriate for use in this way." Is that what we're talking about here? >> Absolutely right. To help with finding and understanding data, you can group it and make it browsable by topic. You can enable keyword search over it in that natural language. That's stuff that Alation has done in the past. 
What we're excited to unveil now is this idea of trust check, which is all about saying, wherever you're at in that data value chain of taking raw data and schematizing it and eventually producing pretty dashboards and visualizations, that at every step, we can ensure that only the most trustworthy data sets are being used, because any problem upstream flows downstream. >> So, trust check. >> Trust check. >> Trust check, it's something that comes out of Alation. Is it also being used with other visualization tools or other sources or other applications? >> That's a great question. It's all of the above. Trust check starts with saying, if I'm an analyst who wants to create a dashboard or a visualization, I'm going to have to write some SQL query to do that. What we've done in that context, with Alation Compose, our home-grown SQL tool, is provided a tool, and trust check kind of gets its name from spell check. It used to be there was a dictionary, and you could look it up by hand, and you could look it up online, but that's a lot of work for every single word to check it. And then, you know, Microsoft, I think, was the first innovator, saying, "Oh, let's put a little red squiggle that you can't miss right in your workflow as you're writing, so you don't have to go to it, it comes to you." We do the exact same thing. I'm about to query a table that is deprecated or has a data quality issue. I immediately see bright red on my screen, can't miss it, and I can fix my behavior. That's as I'm creating a data asset. We also, through our partnerships with Salesforce and with Tableau, each of whom have very popular visualization tools, say, if people are consuming a dashboard, not a SQL query, but looking at a Tableau dashboard or a visualization in Salesforce Einstein Analytics, what would it mean to badge right there and then, put a stamp of approval on the most trustworthy sources and a warning or caveat on things that might have an upstream data quality problem? 
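The spell-check analogy above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not how Alation Compose works: it assumes stewards keep a lookup of flagged tables (the `FLAGS` dictionary, its table names, and its notes are all hypothetical) and it finds table references with a naive regular expression rather than a SQL parser.

```python
import re

# Hypothetical steward-maintained flags; names and notes are illustrative only.
FLAGS = {
    "sales.orders_old": ("deprecated", "Replaced by sales.orders"),
    "finance.revenue_raw": ("warning", "Known data quality issue upstream"),
}

# Naive pattern (an assumption): a table name follows FROM or JOIN.
TABLE_REF = re.compile(r"\b(?:from|join)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)

def trust_check(sql):
    """Return (table, level, note) for each flagged table the query references."""
    findings = []
    for table in TABLE_REF.findall(sql):
        flag = FLAGS.get(table.lower())
        if flag is not None:
            level, note = flag
            findings.append((table, level, note))
    return findings

# A query an analyst might be typing; it touches one deprecated table.
draft = "SELECT * FROM sales.orders_old o JOIN crm.accounts a ON o.acct = a.id"
findings = trust_check(draft)
```

An editor could run a check like this on every keystroke and underline each flagged table in the query text, which is the "it comes to you" behavior described in the interview.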
>> So, when you say warning or caveat, you're saying literally that there are exceptions or there are other concerns associated with the data, and reviewing that as part of the analytic process. >> That's exactly right. Much like, again, spell check underlines, or, if you think about if I'm driving in my car with Waze, and it says, "Oh, traffic up ahead, reroute this way." What does it mean to get in the user interface where people live, whether they're a business user in Salesforce or Tableau, or a data analyst in a query tool, right there in their flow having onscreen indications of everything happening below the tip of the iceberg that affects their work and the trustworthiness of the data sets they're using. >> So that's what it is. I'll tell you a quick story about spell check. >> Please. >> Many years ago, I'm old enough that I was one of the first users of some of these tools. When you typed in IBM, Microsoft Word would often change it to DUM, which was kind of interesting, given the things that were going on between them. But it leads you to ask questions. How does this work? I mean, how does spell check work? Well, how does trust check work, because that's going to have an enormous implication. People have to trust how trust check works. Tell us a little bit about how trust check works. >> Absolutely. How do you trust trust check? The little red or yellow or bright, salient indicators we've designed are just to get your attention. Then, as a user, you can click into those indicators and see why this is appearing. The biggest reason that an indicator will appear in a trust check context is that a person, a data curator or data steward, has put a warning or a deprecation on the data set. It's not, you know, oh, IBM doesn't like Microsoft, or vice versa. You know, you can see the sourcing. It isn't just, oh, because Merriam-Webster says so. It emerges from the logic of your own organization. 
But now Alation has this entire catalog backing trust check where it gives a bunch of signals that can help those curators and stewards to decide what indicators to put on what objects. For example, we might observe, this table used to be refreshed frequently. It hasn't in a while. Does that mean it's ripe for getting a bit of a warning on it? Or, people aren't really using this data set. Is there a reason for that? Or, something upstream was just flagged having a data quality issue. That data quality issue might flow downstream like pollution in a creek, and that can be an indication of another reason why you might want to label data as not trustworthy. >> In the Alation context with Salesforce and Tableau partners, and perhaps some others, this trust check ends up being a social moniker for what constitutes good data that is branded as a consequence of both technological as well as social activities around that data captured by Alation. I got that right? >> That's exactly right. We're taking technical signals and social signals, because what happens in our customers today before we launched trust check, what they would do is, if you had the time, you would phone a friend. You'd say, "Hey, you seem to be data-savvy. Does this number look weird to you? Do you know what's going on? Is something wrong with the table that it's sourced from?" The problem is, that person's on vacation, and you're out of luck. This is saying, let's push everything we know across that entire chain, from the rawest data to the most polished asset, and have all that information pushed up to where you live in the moment you're making a decision, should I trust this data, how should I use it? >> In the whole, going back to this whole world of big data and analytics, we're moving more of the workloads to the cloud to get rid of the infrastructure problems. We're utilizing more integrated toolchains to get rid of the complexity associated with a lot of the analytic pipelines. 
How does trust check, when applied, go back to this notion of human beings not being willing to accept somebody else's data? Give us that use case of how someone's going to sit down in a boardroom or at a strategic meeting or whatever else it is, see trust check, and go, "I get it." >> Absolutely, that's a fantastic question. There's two reasons why, even though all organizations, or 80% according to Gartner, claim they're committed to being data-driven, you still have these moments where people say, "Yeah, I see the numbers, but I'm going to ignore them, or discount them, or be very skeptical of them." One issue is just how much of the data that gets to you in the boardroom or the exec team meeting is wrong. We had an incredibly successful data-driven customer who did an internal audit and found that 1/3 of the numbers that appeared in the PowerPoint presentations on which major business decisions were being made, a full 1/3 of them were off by an extraordinary amount, an amount so big that it would, the decision would've cut the other way had the number been accurate. The sheer volume of bad data coming in to undermine trust. The second is, even if only 5% of the data were untrustworthy, if you don't know which is which, the 95% that's trustworthy and the 5% that's not, you still might not be able to use it with confidence. We believe that having trust check be at every stage in this data value chain will solve, actually, both problems by having that spell-check-like experience in the query tool, which is where most analytics projects start. We can reduce the amount of garbage going into the meeting rooms where business choices are being made. 
And by putting that badge saying "This is certified," or, "Take this with a grain of salt," or, "No, this is totally wrong," that putting that badge on the visualizations that business leaders are looking at in Salesforce and Tableau, and over time, in ideally every tool that anybody would use in an enterprise, we can also help distinguish the wheat from the chaff in that context as well. We think we're attacking both parts of this problem, and that will really drive a data-driven culture truly being adoptable in an organization. >> I want to tie a couple things that you said here. You mentioned the word design a couple times. You're the VP of design at Alation. It also sounds like when you're talking about design, you're not just talking about design of the interface or the software. You're talking about design of how people are going to use the software. What is the extent to which design, what's the scope of design as you see it in this context of advanced analytics, and is trust check just a first step that you're taking? Tell us a little bit about that. >> Yeah, that's a great set of questions, Peter. Design for us means really looking at humans, and starting by listening and watching. You know, a lot of people in the cataloging space and the governance space, they list a lot of should statements. "People should adopt this process, because otherwise, mistakes will be made." >> Because Gartner said 80% of you have! >> Right, exactly. We think the shoulds only get you so far. We want to really understand the human psychology. How do people actually behave when they're under pressure to move quickly in a rapidly changing environment, when they're afraid of being caught having made a mistake? There's all these pressures people are under. 
And so, it's not realistic to say, again, you could imagine saying, "Oh, every time before you go out the door, go to MapQuest or some sort of traffic website and look up the route and print it out, so you make sure you plot correctly." No one has time for that, just like no one has time to look up every single word in their essay or their memo or their email and look it up in the dictionary to see if it's right. But when you have an intervention that comes into somebody's flow and is impossible to miss, and is an angel on your shoulder keeping you from making a mistake, or, you know, in-car navigation that tells you in real time, "Here's how you should route." Those sort of things fit into somebody's lifestyle and actually move impact. Our idea is, let's meet people where they are. Acknowledge the challenges that humans face and make technology that really helps them and comes to them instead of scolding them and saying, "Oh, you should change your flow in this uncomfortable way and come to us, and that's the only way you'll achieve the outcome you want." >> Invest the tool into the process and into the activity, as opposed to force people to alter the activity around the limitations or capabilities of the tool. >> Exactly right. And so, while design is optimizing the exact color and size and UI/UX both in our own tools and working with our partners to optimize that, it's starting at an even bigger level of saying, "How do we design the entire workflow so humans can do what they do best and the computer just gives them what they need in real time?" >> And as something as important, and this kind of takes it full circle, something as important and potentially strategic as advanced analytics, having that holistic view is really going to determine success or failure in a lot of businesses. >> That is absolutely right, Peter, and you asked earlier, "Is this just the beginning?" That's absolutely true. 
Our goal is to say, whatever part of the analytics process you are in, that you get these realtime interventions to help you get the information that's relevant to you, understand what it means in the context you're in, and make sure that it's trustworthy and reliable so people can be truly data-driven. >> Well, there's a lot of invention going on, but what we're really seeking here is changes in social behavior that lead to consequential improvements in business. Aaron Kalb, VP of design and strategic initiatives at Alation, thanks very much for talking about this important advance in how we think about analytics. >> Thank you so much for having me, Peter. >> This is, again, Peter Burris. This has been a CUBE Conversation. Until next time. (stirring music)

Published Date : Jul 12 2018

Stephanie McReynolds, Alation | DataWorks Summit 2018


 

>> Live from San Jose, in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2018, brought to you by Hortonworks. >> Welcome back to theCUBE's live coverage of DataWorks here in San Jose, California. I'm your host, Rebecca Knight, along with my co-host, James Kobielus. We're joined by Stephanie McReynolds. She is the Vice President of Marketing at Alation. Thanks so much for, for returning to theCUBE, Stephanie. >> Thank you for having me again. >> So, before the cameras were rolling, we were talking about Kevin Slavin's talk on the main stage this morning, and talking about, well really, a background to sort of this concern about AI and automation coming to take people's jobs, but really, his overarching point was that we really, we shouldn't, we shouldn't let the algorithms take over, and that humans actually are an integral piece of this loop. So, riff on that a little bit. >> Yeah, what I found fascinating about what he presented were actual examples where having a human in the loop of AI decision-making had a more positive impact than just letting the algorithms decide for you, and turning it into kind of a black, a black box. And the issue is not so much that, you know, there's very few cases where the algorithms make the wrong decision. What happens the majority of the time is that the algorithms actually can't be understood by human. So if you have to roll back >> They're opaque, yeah. >> in your decision-making, or uncover it, >> I mean, who can crack what a convolutional neural network does, at a layer by layer, nobody can. >> Right, right. And so, his point was, if we want to avoid not just poor outcomes, but also make sure that the robots don't take over the world, right, which is where every like, media person goes first, right? (Rebecca and James laugh) That you really need a human in the loop of this process. 
And a really interesting example he gave was what happened with the 2015 storm, and he talked about 16 different algorithms that do weather predictions, and only one algorithm predicted, mis-predicted that there would be a huge weather storm on the east coast. So if there had been a human in the loop, we wouldn't have, you know, caused all this crisis, right? The human could've >> And this is the storm >> Easily seen. >> That shut down the subway system, >> That's right. That's right. >> And really canceled New York City for a few days there, yeah. >> That's right. So I find this pretty meaningful, because Alation is in the data cataloging space, and we have a lot of opportunity to take technical metadata and automate the collection of technical and business metadata and do all this stuff behind the scenes. >> And you make the discovery of it, and the analysis of it. >> We do the discovery of this, and leading to actual recommendations to users of data, that you could turn into automated analyses or automated recommendations. >> Algorithmic, algorithmically augmented human judgment is what it's all about, the way I see it. What do you think? >> Yeah, but I think there's a deeper insight that he was sharing, is it's not just human judgment that is required, but for humans to actually be in the loop of the analysis as it moves from stage to stage, that we can try to influence or at least understand what's happening with that algorithm. And I think that's a really interesting point. You know, there's a number of data cataloging vendors, you know, some analysts will say there's anywhere from 10 to 30 different vendors in the data cataloging space, and as vendors, we kind of have this debate. Some vendors have more advanced AI and machine learning capabilities, and other vendors haven't automated at all. 
And I think that the answer, if you really want humans to adopt analytics, and to be comfortable with the decision-making of those algorithms, you need to have a human in the loop, in the middle of that process, of not only making the decision, but actually managing the data that flows through these systems. >> Well, algorithmic transparency and accountability is an increasing requirement. It's a requirement for GDPR compliance, for example. >> That's right. >> That I don't see yet with Wiki, but we don't see a lot of solution providers offering solutions to enable more of an automated roll-up of a narrative of an algorithmic decision path. But that clearly is a capability as it comes along, and it will. That will absolutely depend on a big data catalog managing the data, the metadata, but also helping to manage the tracking of what models were used to drive what decision, >> That's right. >> And what scenario. So that, that plays into what Alation >> So we talk, >> And others in your space do. >> We call that data catalog, almost as if the data's the only thing that we're tracking, but in addition to that, that metadata or the data itself, you also need to track the business semantics, how the business is using or applying that data and that algorithmic logic, so that might be logic that's just being used to transform that data, or it might be logic to actually make and automate decision, like what they're talking about GDPR. >> It's a data artifact catalog. These are all artifacts that, they are derived in many ways, or supplement and complement the data. >> That's right. >> They're all, it's all the logic, like you said. >> And what we talk about is, how do you create transparency into all those artifacts, right? So, a catalog starts with this inventory that creates a foundation for transparency, but if you don't make those artifacts accessible to a business person, who might not understand what is metadata, what is a transformation script. 
If you can't make that, those artifacts accessible to a, what I consider a real, or normal human being, right, (James laughs) I love to geek out, but, (all laugh) at some point, not everyone is going to understand. >> She's the normal human being in this team. >> I'm normal. I'm normal. >> I'm the abnormal human being among the questioners here. >> So, yeah, most people in the business are just getting our arms around how do we trust the output of analytics, how do we understand enough statistics and know what to apply to solve a business problem or not, and then we give them this like, hairball of technical artifacts and say, oh, go at it. You know, here's your transparency. >> Well, I want to ask about that, that human that we're talking about, that needs to be in the loop at every stage. What, that, surely, we can make the data more accessible, and, but it also requires a specialized skill set, and I want to ask you about the talent, because I noticed on your LinkedIn, you said, hey, we're hiring, so let me know. >> That's right, we're always hiring. We're a startup, growing well. >> So I want to know from you, I mean, are you having difficulty with filling roles? I mean, what is at the pipeline here? Are people getting the skills that they need? >> Yeah, I mean, there's a wide, what I think is a misnomer is there's actually a wide variety of skills, and I think we're adding new positions to this pool of skills. So I think what we're starting to see is an expectation that true business people, if you are in a finance organization, or you're in a marketing organization, or you're in a sales organization, you're going to see a higher level of data literacy be expected of that, that business person, and that's, that doesn't mean that they have to go take a Python course and learn how to be a data scientist. It means that they have to understand statistics enough to realize what the output of an algorithm is, and how they should be able to apply that. 
So, we have some great customers, who have formally kicked off internal training programs that are data literacy programs. Munich Re Insurance is a good example. They spoke with James a couple of months ago in Berlin. >> Yeah, this conference in Berlin, yeah. >> That's right, that's right, and their chief data officer has kicked off a formal data literacy training program for their employees, so that they can get business people comfortable enough and trusting the data, and-- >> It's a business culture transformation initiative that's very impressive. >> Yeah. >> How serious they are, and how comprehensive they are. >> But I think we're going to see that become much more common. Pfizer has taken, who's another customer of ours, has taken on a similar initiative, and how do they make all of their employees be able to have access to data, but then also know when to apply it to particular decision-making use cases. And so, we're seeing this need for business people to get a little bit of training, and then for new roles, like information stewards, or data stewards, to come online, folks who can curate the data and the data assets, and help be kind of translators in the organization. >> Stephanie, will there be a need for a algorithm curator, or a model curator, to, you know, like a model whisperer, to explain how these AI, convolutional, recurrent, >> Yeah. >> Whatever, all these neural, how, what they actually do, you know. Would there be a need for that going forward? Another as a normal human being, who can somehow be bilingual in neural net and in standard language? >> I think, I think so. I mean, I think we've put this pressure on data scientists to be that person. >> Oh my gosh, they're so busy doing their job. How can we expect them to explain, and I mean, >> Right. >> And to spend 100% of their time explaining it to the rest of us? >> And this is the challenge with some of the regulations like GDPR. 
We aren't set up yet, as organizations, to accommodate this complexity of understanding, and I think that this part of the market is going to move very quickly, so as vendors, one of the things that we can do is continue to help by building out applications that make it easy for information stewardship. How do you lower the barrier for these specialist roles and make it easy for them to do their job by using AI and machine learning, where appropriate, to help scale the manual work, but keeping a human in the loop to certify that data asset, or to add additional explanation and then taking their work and using AI, machine learning, and automation to propagate that work out throughout the organization, so that everyone then has access to those explanations. So you're no longer requiring the data scientists to hold like, I know other organizations that hold office hours, and the data scientist like sits at a desk, like you did in college, and people can come in and ask them questions about neural nets. That's just not going to scale at today's pace of business. >> Right, right. >> You know, the term that I used just now, the algorithm or model whisperer, you know, the recommend-er function that is built into your environment, in similar data catalog, is a key piece of infrastructure to rank the relevance rank, you know, the outputs of the catalog or responses to queries that human beings might make. You know, the recommendation ranking is critically important to help human beings assess the, you know, what's going on in the system, and give them some advice about how to, what avenues to explore, I think, so. >> Yeah, yeah. And that's part of our definition of data catalog. It's not just this inventory of technical metadata. >> That would be boring, and dry, and useless. >> But that's where, >> For most human beings. >> That's where a lot of vendor solutions start, right? >> Yeah. >> And that's an important foundation. 
>> Yeah, for people who don't live 100% of their work day inside the big data catalog. I hear what you're saying, you know. >> Yeah, so people who want a data catalog, how you make that relevant to the business is you connect those technical assets, that technical metadata with how is the business actually using this in practice, and how can we have proactive recommendation or the recommendation engines, and certifications, and this information steward then communicating through this platform to others in the organization about how do you interpret this data and how do you use it to actually make business decisions. And I think that's how we're going to close the gap between technology adoption and actual data-driven decision-making, which we're not quite seeing yet. We're only seeing about 30, when they survey, only about 36% of companies are actually confident they're making data-driven decisions, even though there have been, you know, millions, if not billions of dollars that have gone into the data analytics market and investments, and it's because as a manager, I don't quite have the data literacy yet, and I don't quite have the transparency across the rest of the organization to close that trust gap on analytics. >> Here's my feeling, in terms of cultural transformations across businesses in general. I think the legal staff of every company is going to need to get real savvy on using those kinds of tools, like your catalog, with recommendation engines, to support e-discovery, or discovery of the algorithmic decision paths that were taken by their company's products, 'cause they're going to be called by judges and juries, under a subpoena and so forth, and so on, to explain all this, and they're human beings who've got law degrees, but who don't know data, and they need the data environment to help them frame up a case for what we did, and you know, so, we being the company that's involved. >> Yeah, and our politicians. 
I mean, anyone who's read Cathy's book, Weapons of Math Destruction, there are some great use cases of where, >> Math, M-A-T-H, yeah. >> Yes, M-A-T-H. But there are some great examples of where algorithms can go wrong, and many of our politicians and our representatives in government aren't quite ready to have that conversation. I think anyone who watched the Zuckerberg hearings you know, in congress saw the gap of knowledge that exists between >> Oh my gosh. >> The legal community, and you know, and the tech community today. So there's a lot of work to be done to get ready for this new future. >> But just getting back to the cultural transformation needed to be, to make data-driven decisions, one of the things you were talking about is getting the managers to trust the data, and we're hearing about what are the best practices to have that happen in the sense, of starting small, be willing to experiment, get out of the lab, try to get to insight right away. What are, what would your best advice be, to gain trust in the data? >> Yeah, I think the biggest gap is this issue of transparency. How do you make sure that everyone understands each step of the process and has access to be able to dig into that. If you have a foundation of transparency, it's a lot easier to trust, rather than, you know, right now, we have kind of like the high priesthood of analytics going on, right? (Rebecca laughs) And some believers will believe, but a lot of folks won't, and, you know, the origin story of Alation is really about taking these concepts of the scientific revolution and scientific process and how can we support, for data analysis, those same steps of scientific evaluation of a finding. That means that you need to publish your data set, you need to allow others to rework that data, and come up with their own findings, and you have to be open and foster conversations around data in your organization. 
One other customer of ours, Meijer, who's a grocery store in the Midwest, and if you're west coast or east coast-based, you might not have heard of them-- >> Oh, Meijer's Thrifty Acres. I'm from Michigan, and I know them, yeah. >> Gigantic. >> Yeah, there you go. Gigantic grocery chain in the Midwest, and, Joe Oppenheimer there actually introduced a program that he calls the social contract for analytics, and before anyone gets their license to use Tableau, or MicroStrategy, or SAS, or any of the tools internally, he asks those individuals to sign a social contract, which basically says that I'll make my work transparent, I will document what I'm doing so that it's shareable, I'll use certain standards on how I format the data, so that if I come up with a, with a really insightful finding, it can be easily put into production throughout the rest of the organization. So this is a really simple example. His inspiration for that social contract was his high school freshman. He was entering high school and had to sign a social contract, that he wouldn't make fun of the teachers, or the students, you know, >> I love it. >> Very simple basics. >> Yeah, right, right, right. >> I wouldn't make fun of the teacher. >> We all need social contract. >> Oh my gosh, you have to make fun of the teacher. >> I think it was a little more formal than that, in the language, but that was the concept. >> That's violating your civil rights as a student. I'm sorry. (Stephanie laughs) >> Stephanie, always so much fun to have you here. Thank you so much for coming on. >> Thank you. It's a pleasure to be here. >> I'm Rebecca Knight, for James Kobielus. We'll have more of theCUBE's live coverage of DataWorks just after this.

Published Date : Jun 20 2018

Satyen Sangani, Alation | Big Data SV 2018


 

>> Announcer: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. (upbeat music) >> Welcome back to theCUBE, I'm Lisa Martin with John Furrier. We are covering our second day of our event Big Data SV. We've had some great conversations, John, yesterday, today as well. Really looking at Big Data, digital transformation, Big Data, plus data science, lots of opportunity. We're excited to welcome back to theCUBE an alumnus, Satyen Sangani, the co-founder and CEO of Alation. Welcome back! >> Thank you, it's wonderful to be here again. >> So you guys finished up your fiscal year at the end of December 2017, we're in the first quarter of 2018. You guys had some really strong results, really strong momentum. >> Yeah. >> Tell us what's going on at Alation, how are you pulling this momentum through 2018? >> Well, I think we have had an enterprise focused business historically, because we solve a very complicated problem for very big enterprises, and so, in the last quarter we added customers like American Express, PepsiCo, Roche. And with huge expansions from our existing customers, some of whom, over the course of a year, I think went 12X from an initial base. And so, we found some just incredible momentum in Q4 and for us that was a phenomenal cap to a great year. >> What about the platform you guys are doing? Can you just take a minute to explain what Alation does again just to refresh where you are on the product side? You mentioned some new accounts, some new use cases. >> Yeah. >> What's the update? Take a minute, talk about the update. >> Absolutely, so, you certainly know, John, but Alation's a data catalog and a data catalog essentially, you can think of it as Yelp or Amazon for data and information inside of the enterprise. 
So if you think about how many different databases there are, how many different reports there are, how many different BI tools there are, how many different APIs there are, how many different algorithms there are, it's pretty dizzying for the average analyst. It's pretty dizzying for the average CIO. It's pretty dizzying for the average chief data officer. And particularly, inside of Fortune 500s where you have hundreds of thousands of databases. You have a situation where people just have too much signal or too much noise, not enough signal. And so what we do is we provide this Yelp for that information. You can come to Alation as a catalog. You can do a search on revenue 2017. You'll get all of the reports, all of the dashboards, all of the tables, all of the people that you might need to be able to find. And that gives you a single place of reference, so you can understand what you've got and what can answer your questions. >> What's interesting is, first of all, I love data. We're data driven, we're geeks on data. But when I start talking to folks that are outside the geek community or nerd community, you say data and they go, "Oh," because they cringe and they say, "Facebook." They see that data issues there. GDPR, data nightmare, where's the store, you got to manage it. And then, people are actually using data, so they're realizing how hard (laughs) it is. >> Yeah. >> How much data do we have? So it's kind of like a trough of disillusionment, if you will. Now they got to get their hands on it. They've got to put it to work. >> Yeah. >> And they know that. So, it's now becoming really hard (laughs) in their mind. This is business people. >> Yeah. >> They have data everywhere. How do you guys talk to that customer? Because, if you don't have quality data, if you don't have data you can trust, if you don't have the right people, it's hard to get it going. >> Yeah. >> How do you guys solve that problem and how do you talk to customers? 
>> So we talk a lot about data literacy. There is a lot of data in this world and that data is just emblematic of all of the stuff that's going on in this world. There's lots of systems, there's lots of complexity and the data, basically, just is about that complexity. Whether it's weblogs, or sensors, or the like. And so, you can either run away from that data, and say, "Look, I'm going to not, I'm going to bury my head in the sand. I'm going to be a business. I'm just going to forget about that data stuff." And that's certainly a way to go. >> John: Yeah. >> It's a way to go away. >> Not a good outlook. >> I was going to say, is that a way of going out of business? >> Or, you can basically train, it's a human resources problem fundamentally. You've got to train your people to understand how to use data, to become data literate. And that's what our software is all about. That's what we're all about as a company. And so, we have a pretty high bar for what we think we do as a business and we're this far into that. Which is, we think we're training people to use data better. How do you learn to think scientifically? How do you go use data to make better decisions? How do you build a data driven culture? Those are the sorts of problems that I'm excited to work on. >> Alright, now take me through how you guys play out in an engagement with the customer. So okay, that's cool, you guys can come in, we're getting data literate, we understand we need to use data. Where are you guys winning? Where are you guys seeing some visibility, both in terms of the traction of the usage of the product, the use cases? Where is it kind of coming together for you guys? >> Yeah, so we literally, we have a mantra. I think any early stage company basically wins because they can focus on doing a couple of things really well. And for us, we basically do three things. We allow people to find data. We allow people to understand the data that they find. 
And we allow them to trust the data that they see. And so if I have a question, the first place I start is, typically, Google. I'll go there and I'll try to find whatever it is that I'm looking for. Maybe I'm looking for a Mediterranean restaurant on 1st Street in San Jose. If I'm going to go do that, I'm going to do that search and I'm going to find the thing that I'm looking for, and then I'm going to figure out, out of the possible options, which one do I want to go to. And then I'll figure out whether or not the one that has seven ratings is the one that I trust more than the one that has two. Well, data is no different. You're going to have to find the data sets. And inside of companies, there could be 20 different reports and there could be 20 different people who have information, and so you're going to trust those people through having context and understanding. >> So, trust, people, collaboration. You mentioned some big brands that you guys added towards the end of calendar 2017. How do you facilitate these conversations with maybe the chief data officer. As we know, in large enterprises, there's still a lot of ownership over data silos. >> Satyen: Yep. >> What is that conversation like, as you say on your website, "The first data catalog designed for collaboration"? How do you help these organizations as large as Coca-Cola understand where all the data are and enable the human resources to extract values, and find it, understand it, and trust it? >> Yeah, so we have a very simple hypothesis, which is, look, people fundamentally have questions. They're fundamentally curious. So, what you need to do as a chief data officer, as a chief information officer, is really figure out how to unlock that curiosity. Start with the most popular data sets. Start with the most popular systems. Start with the business people who have the most curiosity and the most demand for information. And oh, by the way, we can measure that. Which is the magical thing that we do. 
So we can come in and say, "Look, we look at the logs inside of your systems to know which people are using which data sets, which sources are most popular, which areas are hot." Just like a social network might do. And so, just like you can say, "Okay, these are the trending restaurants." We can say, "These are the trending data sets." And that curiosity allows people to know, what data should I document first? What data should I make available first? What data do I improve the data quality over first? What data do I govern first? And so, in a world where you've got tons of signal, tons of systems, it's totally dizzying to figure out where you should start. But what we do is, we go to these chief data officers and say, "Look, we can give you a tool and a catalyst so that you know where to go, what questions to answer, who to serve first." And you can use that to expand to other groups in the company. >> And this is interesting, a lot of people, you mentioned social networks, use data to optimize for something, and in the case of Facebook, they use my data to target ads for me. You're using data to actually say, "This is how people are using the data." So you're using data for data. (laughs) >> That's right. >> So you're saying-- >> Satyen: We're measuring how you can use data. >> And that's interesting because, I hear a lot of stories like, we bought a tool, we never used it. >> Yep. >> Or people didn't like the UI, just kind of falls on the side. You're looking at it and saying, "Let's get it out there and let's see who's using the data." And then, are you doubling down? What happens? Do I get a little star, do I get a reputation point, am I being flagged to HR as a power user? How are you guys treating that gamification in this way? It's interesting, I mean, what happens? Do I become like-- >> Yeah, so it's funny because, when you think about search, how do you figure out that something's good? 
So what Google did is, they came along and they said, "We've got PageRank." What we're going to do is we're going to say, "The pages that are the best pages are the ones that people link to most often." Well, we can do the same thing for data. The data sources that are the most useful ones are the ones that are used most often. Now on top of that, you can say, "We're going to have experts put ratings," which we do. And you can say people can contribute knowledge and reviews of how this data set can be used. And people can contribute queries and reports on top of those data sets. And all of that gives you this really rich graph, this rich social graph, so that now when I look at something it doesn't look like Greek. It looks like, "Oh, well I know Lisa used this data set, and then John used it and so at least it must answer some questions that are really intelligent about the media business or about the software business. And so that can be really useful for me if I have no clue as to what I'm looking at." >> So the problem that you-- >> It's on how you demystify it through the social connections. >> So the problem that you solve, if I hear you correctly, is that you make it easy to get the data. So there's some ease of use piece of it, >> Yep. >> cataloging. And then as you get people using it, this is where you take the data literacy and go into operationalizing data. >> Satyen: That's right. >> So this seems to be the challenge. So, if I'm a customer and I have a problem, the profile of your target customer or who your customers are, people who need to expand and operationalize data, how would you talk about it? >> Yeah, so it's really interesting. We talk about, one of our customers called us, sort of, the social network for nerds inside of an enterprise. And I think for me that's a compliment. (John laughing) But what I took from that, and when I explained the business of Alation, we start with those individuals who are data literate. 
The data scientists, the data engineers, the data stewards, the chief data officer. But those people have the knowledge and the context to then explain data to other people inside of that same institution. So in the same way that Facebook started with Harvard, and then went to the rest of the Ivies, and then went to the rest of the top 20 schools, and then ultimately to mom, and dad, and grandma, and grandpa. We're doing the exact same thing with data. We start with the folks that are data literate, we expand from there to a broader audience of people that don't necessarily have data in their titles, but have curiosity and questions. >> I like that on the curiosity side. You spent some time up at Strata Data. I'm curious, what are some of the things you're hearing from customers, maybe partners? Everyone used to talk about Hadoop, it was this big thing. And then there was a creation of data lakes, and swampiness, and all these things that are sort of becoming more complex in an organization. And with the rise of myriad data sources, the velocity, the volume, how do you help an enterprise understand and be able to catalog data from so many different sources? Is it that same principle that you just talked about in terms of, let's start with the lowest hanging fruit, start making the impact there and then grow it as we can? Or is it that an enterprise needs to be competitive and move really, really quickly? I guess, what's the process? >> How do you start? >> Right. >> What do people do? >> Yes! >> So it's interesting, what we find is multiple ways of starting with multiple different types of customers. And so, we have some customers that say, "Look, we've got a big, we've got Teradata, and we've got some Hadoop, and we've got some stuff on Amazon, and we want to connect it all." And those customers do get started, and they start with hundreds of users, in some cases, they start with thousands of users day one, and they just go Big Bang. 
And interestingly enough, we can get those customers enabled in a matter of weeks or months to go do that. We have other customers that say, "Look, we're going to start with a team of 10 people and we're going to see how it grows from there." And, we can accommodate either model or either approach. From our perspective, you just have to have the resources and the investment corresponding to what you're trying to do. If you're going to say, "Look, we're going to have two dollars of budget, and we're not going to have the human resources, and the stewardship resources behind it." It's going to be hard to do the Big Bang. But if you're going to put the appropriate resources up behind it, you can do a lot of good. >> So, you can really facilitate the whole go big or go home approach, as well as the let's start small think fast approach. >> That's right, and we always, actually ironically, recommend the latter. >> Let's start small, think fast, yeah. >> Because everybody's got a bigger appetite than they do the ability to execute. And what's great about the tool, and what I tell our customers and our employees all day long is, there's only one metric I track. So year over year, for our business, we basically grow in accounts, net of churn, by 55%. Year over year, and that's actually up from the prior year. And so from my perspective-- >> And what does that mean? >> So what that means is, the same customer gave us 55 cents more on the dollar than they did the prior year. Now that's best in class for most software businesses that I've heard. But what matters to me is not so much that growth rate in and of itself. What it means to me is this, that nobody's come along and said, "I've mastered my data. I understand all of the information side of my company. Every person knows everything there is to know." That's never been said. 
So if we're solving a problem where customers are saying, "Look, we get, and we can find, and understand, and trust data, and we can do that better this year than we did last year, and we can do it even more with more people," we're going to be successful. >> What I like about what you're doing is, you're bringing an element of operationalizing data for literacy and for usage. But you're really bringing this notion of a humanizing element to it. Where you see it in security, you see it in emerging ecosystems. Where there's a community of data people who know how hard it is and was, and it seems to be getting easier. But the tsunami of new data coming in, IOT data, whatever, and new regulations like GDPR. These are all more surface area problems. But there's a community coming together. How have you guys seen your product create community? Have you seen any data on that, 'cause it sounds like, as people get networked together, the natural outcome of that is possibly usage you attract. But is there a community vibe that you're seeing? Is there an internal collaboration where they sit, they're having meet ups, they're having lunches. There's a social aspect and a human aspect. >> No, it's humanal, no, it's amazing. So in really subtle but really, really powerful ways. So one thing that we do for every single data source or every single report that we document, we just put who are the top users of this particular thing. So really subtly, day one, you're like, "I want to go find a report. I don't even know where to go inside of this really mysterious system." Post-Alation, you're able to say, "Well, I don't know where to go, but at least I can go call up John or Lisa," and say, "Hey, what is it that we know about this particular thing?" And I didn't have to know them. I just had to know that they had this report and they had this intelligence. So by just discovering people and who they are, you pick up on what people can know. 
>> So people are the new Google results, so you mentioned Google PageRank, which is web pages and relevance. You're taking a much more people approach to relevance. >> Satyen: That's right. >> To the data itself. >> That's right, and that builds community in very, very clear ways, because people have curiosity. Other people are the mechanism by which they satisfy that curiosity. And so that community builds automatically. >> They pay it forward, they know who to ask help for. >> That's right. >> Interesting. >> That's right. >> Last question, Satyen. The tag line, first data catalog designed for collaboration, is there a customer that comes to mind to you as really one that articulates that point exactly? Where Alation has come in and really kicked open the door, in terms of facilitating collaboration. >> Oh, absolutely. I was literally, this morning, talking to one of our customers, Munich Reinsurance, largest reinsurance customer or company in the world. Their chief data officer said, "Look, three years ago, we started with 10 people working on data. Today, we've got hundreds. Our aspiration is to get to thousands." We have three things that we do. One is, we actually discover insights. It's actually the smallest part of what we do. The second thing that we do is, we enable people to use data. And the third thing that we do is, drive a data driven culture. And for us, it's all about scaling knowledge, to centers in China, to centers in North America, to centers in Australia. And they've been doing that at scale. And they go to each of their people and they say, "Are you a data black belt, are you a data novice?" It's kind of like skiing. Are you a blue diamond or a black diamond? >> Always ski in pairs (laughs) >> That's right. >> And they do ski in pairs. 
And what they end up ultimately doing is saying, "Look, we're going to train all of our workforce to become better, so that in three, 10 years, we're recognized as one of the most innovative insurance companies in the world." Three years ago, that was not the case. >> Process improvement at a whole other level. My final question for you is, for the folks watching or the folks that are going to watch this video, that could be a potential customer of yours, what are they feeling? If I'm the customer, what smoke signals am I seeing that say, I need to call Alation? What are some of the things that you've found that would tell a potential customer that they should be talkin' to you guys? >> Look, I think that they've got to throw out the old playbook. And this was a point that was made by some folks at a conference that I was at earlier this week. But they basically were saying, "Look, the DLNA's PlayBook was all about providing the right answer." Forget about that. Just allow people to ask the right questions. And if you let people's curiosity guide them, people are industrious, and ambitious, and innovative enough to go figure out what they need to go do. But if you see this as a world of control, where I'm going to just figure out what people should know and tell them what they're going to go know, that's going to be a pretty poor career to go choose because data's all about, sort of, freedom and innovation and understanding. And we're trying to push that along. >> Satyen, thanks so much for stopping by >> Thank you. >> and sharing how you guys are helping organizations, enterprises unlock data curiosity. We appreciate your time. >> I appreciate the time too. >> Thank you. >> And thanks John! >> And thank you. >> Thanks for co-hosting with me. For John Furrier, I'm Lisa Martin, you're watching theCUBE live from our second day of coverage of our event Big Data SV. Stick around, we'll be right back with our next guest after a short break. (upbeat music)
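
Editor's sketch: the conversation above describes ranking data sets by how often they appear in system logs ("trending data sets") and listing the top users of each one. The following is a minimal illustration of that idea only. The log format, the sample records, and the function names are all assumptions made up for this example; nothing here reflects how Alation actually implements it.

```python
from collections import Counter

# Hypothetical query-log records as (user, data_set) pairs.
# A real system would parse these out of database query logs.
QUERY_LOG = [
    ("lisa", "sales.orders"),
    ("john", "sales.orders"),
    ("lisa", "sales.orders"),
    ("john", "web.clickstream"),
    ("satyen", "sales.orders"),
    ("lisa", "hr.headcount"),
]


def rank_by_count(counts, top_n):
    """Sort (name, count) pairs by descending count, then by name for ties."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


def trending_data_sets(log, top_n=3):
    """Rank data sets by how many log entries reference them."""
    counts = Counter(data_set for _user, data_set in log)
    return rank_by_count(counts, top_n)


def top_users(log, data_set, top_n=2):
    """Rank the users who queried one data set most often."""
    counts = Counter(user for user, ds in log if ds == data_set)
    return rank_by_count(counts, top_n)


if __name__ == "__main__":
    print(trending_data_sets(QUERY_LOG))
    print(top_users(QUERY_LOG, "sales.orders"))
```

Raw usage counts are the simplest stand-in for popularity; the expert ratings and reviews mentioned in the interview would be additional signals layered on top of a ranking like this one.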

Published Date : Mar 9 2018

SUMMARY :

brought to you by SiliconANGLE Media Satyen Sangani, the co-founder and CEO of Alation. So you guys finish up your fiscal year how are you pulling this momentum through 2018. in the last quarter we added customers like What about the platform you guys are doing? Take a minute, talk about the update. And that gives you a single place of reference, you got to manage it. So it's kind of like a tropic disillusionment, if you will. And they know that How do you guys talk to that customer? And so, you can either run away from that data, Those are the sorts of problems that I'm excited to work on. Where is it kind of coming together for you guys? and I'm going to find the thing that I'm looking for, that you guys added towards the end of calendar 2017. And oh, by the way, we can measure that. a lot of people you mentioned social networks, I hear a lot of stories like, we bought a tool, And then, are you doubling down? And all of that gives you this really rich graph, It's on how you demystify it So the problem that you solve, And then as you get people using it, and operationalize data, how would you talk about it? and the context to then explain data the volume, how do you help an enterprise understand have the resources and the investment corresponding to So, you can really facilitate the whole recommend the latter. than they do the ability to execute. What it means to me is this, that nobody's come along the natural outcome of that is possibly usage you attract. And I didn't have to know them. So people of the new Google results, And so that community builds automatically. is there a customer that comes to mind to And the third thing that we do is, And what they end up ultimately doing is saying, that they should be talkin' to you guys? And if you let people's curiosity guide them, and sharing how you guys are helping organizations, Thanks for co-hosting with me.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
PepsiCo | ORGANIZATION | 0.99+
Lisa Martin | PERSON | 0.99+
Satyen Sangani | PERSON | 0.99+
John | PERSON | 0.99+
American Express | ORGANIZATION | 0.99+
Alation | ORGANIZATION | 0.99+
Roche | ORGANIZATION | 0.99+
Satyen | PERSON | 0.99+
thousands | QUANTITY | 0.99+
Lisa | PERSON | 0.99+
55 cents | QUANTITY | 0.99+
Australia | LOCATION | 0.99+
Amazon | ORGANIZATION | 0.99+
Coca-Cola | ORGANIZATION | 0.99+
2018 | DATE | 0.99+
10 people | QUANTITY | 0.99+
three | QUANTITY | 0.99+
John Furrier | PERSON | 0.99+
hundreds | QUANTITY | 0.99+
Yelp | ORGANIZATION | 0.99+
San Jose | LOCATION | 0.99+
China | LOCATION | 0.99+
Harvard | ORGANIZATION | 0.99+
Facebook | ORGANIZATION | 0.99+
two | QUANTITY | 0.99+
Today | DATE | 0.99+
2017 | DATE | 0.99+
55% | QUANTITY | 0.99+
second day | QUANTITY | 0.99+
North America | LOCATION | 0.99+
Google | ORGANIZATION | 0.99+
today | DATE | 0.99+
two dollars | QUANTITY | 0.99+
20 different people | QUANTITY | 0.99+
yesterday | DATE | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
last year | DATE | 0.99+
three years ago | DATE | 0.99+
first | QUANTITY | 0.99+
second thing | QUANTITY | 0.99+
One | QUANTITY | 0.99+
one | QUANTITY | 0.99+
first quarter of 2018 | DATE | 0.99+
20 different reports | QUANTITY | 0.99+
three things | QUANTITY | 0.98+
theCUBE | ORGANIZATION | 0.98+
last quarter | DATE | 0.98+
DLNA | ORGANIZATION | 0.98+
third thing | QUANTITY | 0.98+
Three years ago | DATE | 0.98+
each | QUANTITY | 0.98+
single | QUANTITY | 0.98+
both | QUANTITY | 0.98+
1st Street | LOCATION | 0.98+
Big Bang | EVENT | 0.98+
this year | DATE | 0.98+
Strata Data | ORGANIZATION | 0.97+
12 X | QUANTITY | 0.97+
GDPR | TITLE | 0.97+
seven ratings | QUANTITY | 0.96+
Alation | PERSON | 0.95+
this morning | DATE | 0.95+
Big Data SV 2018 | EVENT | 0.94+
first data | QUANTITY | 0.94+
Teradata | ORGANIZATION | 0.93+
10 years | QUANTITY | 0.93+

Aaron Kalb, Alation | AWS re:Invent


 

>> Announcer: Live from Las Vegas, it's theCUBE. Covering AWS Reinvent 2017, presented by AWS, Intel, and our ecosystem of partners. >> Welcome back to theCUBE's continuing coverage of AWS Reinvent 2017. This is day two for us. Incredible day one. We had great buzz on day two. Great announcements coming out from AWS today. I'm Lisa Martin with my cohost Keith Townsend, and we're excited to be joined by CUBE alumni, Aaron Kalb, the head of product and a founder of Alation. Welcome back to the show. >> Thanks so much for having me. I'm excited to be here. >> So speaking of excitement, you can hear the buzz behind us. Interesting about Alation, the first data catalog designed for human collaboration. What gap did Alation see in the market five years ago when you started? >> That's a great question, Lisa. So, yeah, we're the first data catalog, period, and we're excited to see a lot of other people kind of using that label, I believe it validates this as a space, and I think that everybody needs, and I think our approach, as you said, was really to approach it from the human side, to say the data might be generated by machines or stored on machines, but it's not meant to ultimately be consumed by machines. Even if there's algorithms that's pulling it in, it's to ultimately serve human interests. So the goal was to design from the human back and really think, what does this data mean? Can I trust it? Is it gonna drive the processes correctly? >> So Aaron, I have seen that term quite a bit, and data catalog, for me, means one specific thing. Can you kind of wrap that up for us? >> What is a data catalog? >> That's a really great question, Keith, and I think what's interesting is we took a lot of inspiration in the early days actually from Amazon.com, right? So Amazon is an amazing modern product catalog. You can go in, type in English and see a variety of products that match that keyword. 
And for each one you can see who's bought it before, how many stars did they give it? Is it good? So it helps you find, understand, and trust, and get the right product for your need. We want to do that same thing for data. How do you find a trustworthy data asset, understand what it is, and put it to use? So that's exactly the goal. >> So, a simple problem is I've worked with a ton of researchers in the Big Pharma industry, data across the world basically. And a lot of data sets, repetitive. A team in Germany is working with one set of data, team in New Jersey working with another one, how does your solution help those researchers find the data that they're looking for? >> Exactly right. So the problem is many different data sets, many different things claiming to be true. Some of them are just plain wrong. Sometimes the answer might be one thing in Germany but something else elsewhere, and they're both valid. And so you've hit the nail on the head. The way people use data contains a lot of hints about the way you should use data. So just like Amazon, again, because we're here. And it'll say, oh, customers who bought what you're about to buy also bought this, and that can help you discover something useful. We try to expose what we call behavior IO. Let the past behavior of the most knowledgeable people in the organization drive the future behavior. That's a big part of what we do. >> So one of the things I was reading about you guys on your website and some editorials is, a lot of data lakes fail. Why is that? How is Alation different? >> That's a great question. So I think what's interesting about a data lake is it's kind of like having a huge basement, right? And it can make you adopt a hoarder mentality, you say, oh it's so cheap to store everything, we'll just store it, and then when we need it we'll figure it out then. Well, the truth is, it's not always how it goes. 
Often you store so many things, it's cheap to store it, but when that actual human who has an actual analytical question they want to answer or an actual business process they want to improve, goes looking for the data, all they see are all these unlabeled boxes. Right? So I think the key is to think about how do you make information searchable, discoverable, understandable, trustworthy? And what's great is a lot of people are migrating from their on-premise data lakes to the Clouds, and obviously (mumbles) a big leader in where that's going. It gives you an opportunity to ask, just like when you move houses to say, let me look at what I've got, and can I adopt an approach? You know, what do I actually need? You might keep it all, but what's gonna be in the top shelf? What's gonna be in the basement? And how do you make everything accessible? >> So Aaron, can you talk a little bit about today's announcements? A lot of machine learning, analytics announcements from AWS. However, I don't know what I already have. So how can I make use of that data? Can you help talk about how Alation helps to leverage some of these new tools from AWS? >> Absolutely. So, we've had a bunch of customers on AWS Stack already, and increasingly so. Fundamentally our customers are people who do analysis. A lot of them are using S3, Redshift, the like. And people are hosting on the Cloud increasingly. And it's exactly the problem you described. It's I know I have it somewhere, but I can't get my head around what I already have. What region is it in? >> Aaron: Exactly. >> Is it in a region, is it in my data center, where is it? >> Exactly. so whether that data is in Redshift, in S3, or somewhere else. Maybe it's, you know, in a Postgres or SQL Server or Oracle Server. (mumbles) hosted one. 
Whatever it is, we crawl and index everything you have, just the way Google crawls and indexes everything out on the web, and we make it searchable, and we put information about who's used it and how good it is front and center, just the way you can say, oh this is a five-star clock on Amazon, I'm gonna go click buy it now. >> So one challenge with data lakes is security around that data. So data catalog, I get meta data around the data that I have, but some of that data is sensitive. How do you guys handle security around the data catalog itself? >> Absolutely. So we respect all the security and privacy settings that exist that are on the data itself, and we just sort of surface those in the catalog. Some of our customers say, look, we want to let people know what exists so they can ask for permission. Others say, even having awareness of this data is too much for us. And you mentioned, Pharma, that'll vary by industry. >> Where do you guys get involved in the customer conversation? You said many customers of yours are already using AWS for different things, but where does Alation come into the conversation? Are you brought in by AWS? Are you brought in by customers? Where are they on this journey towards leveraging the Cloud for the things that they need, agility, the speed, and the cost reduction? >> Absolutely. So our promise is we help you find, understand, and trust your data wherever it lives and whoever you are, democratizing it. So customers choose the right infrastructure for their needs, given cost, given performance. Obviously Amazon is increasingly a part of that. But that's a choice they make, and we resolve to handle that wherever it is. And as of customers, our customers are so smart, we learn so much from them. We're meeting a bunch of CIOs, both the prospects and also talking some current customers like Expedia today here at AWS lunch with our investor Costanoa and another at dinner tonight. 
And folks like Chegg and Invoice2go who've been longstanding AWS customers using S3, using Redshift, and actually in Chegg's case, they have a lot of homegrown tooling that they developed on the backend, but they said Alation is the best place to surface that and have it be the central portal for business users and analysts who might not be able to otherwise access things that are just available via (mumbles) >> So how are you, Alation, and AWS helping a customer like Chegg extract ROI quickly? >> Yeah, it's a great question, so, AWS is really great for cost containment. You have all this data and all this processing, but you have peaks and you have troughs, and how do you make sure you're not overpaying (mumbles) so it's great for helping with storage and computation. And Alation helps with the human side, how do you get that upside by saying you have this data, that could affect the way you stock your shelves, the way you price your products or who you hire, what markets you go into. And that requires that last step. If you have the data but it isn't in the right hands at the right time or it's interpreted incorrectly, it has no value. So the two of them together (mumbles) end-to-end solution. >> So Aaron, with GDPR coming up quick, the enforcement of that coming up May 2018, customers have to be concerned about having data they shouldn't have. Does Alation help identify some of that data? >> Absolutely. So data catalog is fundamentally an inventory of everything you have, plus information about how it has been and could be consumed. We very much focus on the upside, potential of using that to drive better business choices and better analysis. But we have customers actually saying, oh, we can use that same information about what we have, who's using it, what's in it, to instead make sure that it's used compliantly with a regulation like GDPR to make sure that you aren't holding onto health records longer than you should or PII. 
And it's absolutely a very big use case for many of our customers. >> So data is touched by a lot of people in an organization. AWS has done a great job of really developing a lot of synergy with the developer community for a long time now. But we're also seeing some trends suggesting they're going up the stack. They want to get more enterprises, enterprises are at the precipice, as Andy Jassy said, of this mass migration to the Cloud. You mentioned, all of your work with AWS and the CIO events that you're having here. Where are you guys in a conversation with customers? Are you more now having to get to that C-suite as now their businesses are absolutely predicated upon the best use of data to identify ways to monetize new revenue streams. How influential is that C-level in this conversation. >> It's a great question. So I think what is interesting is, all companies, we sort of commoditized a basic business school, consultant, best practice knowledge. Everyone is kind of already doing that. To get to the next level our customers are recently telling us it is only by finding key insights in data that they're gonna beat out the competition and stay relevant. I mean, look what Amazon and Netflix have done to the industries that, they weren't as data driven, and have that kind of agility around data. So everybody wants to do the same thing. So CIOs, CDOs, chief data officers, we're seeing them crop up more and more and being more and more empowered in the organization. Because it's seen as central to hitting revenue targets and making an impact, which is what customers want to do. And I mentioned CISOs as well with the question that you asked, Keith, about security. >> The CISOs, the chief information security officers. >> Aaron: Yeah, absolutely. 
Yeah, absolutely, so I think usually often a CISO will report into a CIO, often you see it as adjacent to them, there's somebody who needs to have the confidence, as they do, in Alation's process of mirroring what's in the data source, not introducing security holes. Potentially even taking a step forward and saying, as I implement GDPR and other policies, how do I use a comprehensive automated inventory like Alation's to make sure that process isn't just started but actually finished and avoid the fines and the adverse events. We absolutely see across the C-suite a lot of interest. >> So let's go one step below the CIO, and I think the CIO understands this. This data is the new oil. Very, very straightforward. But now you're getting into the enterprise architect, the VP of infrastructure, and they have to implement these technologies. What have been some of the rewards and challenges with those conversations? >> That's a great question. Right, so here at AWS Reinvent we have a very technical audience, very infrastructure minded. Those are folks that we love to engage with, but our primary audience is the business. >> Keith: Right. >> Right. And so I think what's interesting is, the problem we solve for the more infrastructure-minded executives is how do I deal with these business users? How do I turn this relationship that feels adversarial, where they're putting strain on my system, they're upset about cost overruns, we don't speak the same language with the same values. Alation can be a great bridge. Because we do all of this automated extraction and tying to the sources where they are, and kind of meet the industry people where they live, but then can communicate the value in a clean interface that demonstrates real business ROI to the business. So we can kind of be an ambassador between those sides of the customer. >> I love that, being an ambassador. Aaron, your passion for Alation, what you do, your engagement with customers is palpable. 
So we thank you for joining us on theCUBE, and wish you guys the best of luck with what you're doing here at AWS Reinvent. >> Lisa, thank you so much for having me. >> Lisa: Awesome. >> Keith: Great job, Aaron. >> Thank you for watching. We are live at AWS Reinvent 2017 with 42,000 other people. I'm Lisa Martin, for my cohost Keith Townsend and Aaron Kalb, stick around. We'll be right back.

Published Date : Nov 29 2017

Aaron Kalb, Alation | BigData NYC 2017


 

>> Announcer: Live from midtown Manhattan, it's the Cube. Covering Big Data New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. >> Welcome back everyone, we are here live in New York City, in Manhattan for BigData NYC, our event we've been doing for five years in conjunction with Strata Data which is formerly Strata Hadoop, which was formerly Strata Conference, formerly Hadoop World. We've been covering the big data space going on ten years now. This is the Cube. I'm here with Aaron Kalb, who's Head of Product and co-founder at Alation. Welcome to the cube. >> Aaron Kalb: Thank you so much for having me. >> Great to have you on, so co-founder head of product, love these conversations because you're also co-founder, so it's your company, you got a lot of equity interest in that, but also head of product you get to have the 20 mile stare, on what the future looks, while inventing it today, bringing it to market. So you guys have an interesting take on the collaboration of data. Talk about what that means, what's the motivation behind that positioning, what's the core thesis around Alation? >> Totally, so the thing we've observed is a lot of people working in the data space, are concerned about the data itself. How can we make it cheaper to store, faster to process. And we're really concerned with the human side of it. Data's only valuable if it's used by people, how do we help people find the data, understand the data, trust in the data, and that involves a mix of algorithmic approaches and also human collaboration, both human to human and human to computer to get that all organized. >> John Furrier: It's interesting you have a symbolic systems background from Stanford, worked at Apple, involved in Siri, all this kind of futuristic stuff. You can't go a day without hearing about Alexa is going to have voice-activated, you've got Siri. AI is taking a really big part of this. 
Obviously all of the hype right now, but what it means is the software is going to play a key role as an interface. And this symbolic systems almost brings on this neural network kind of vibe, where objects, data, plays a critical role. >> Oh, absolutely, yeah, and in the early days when we were co-founding the company, we talked about what is Siri for the enterprise? Right, I was you know very excited to work on Siri, and it's really a kind of fun gimmick, and it's really useful when you're in the car, your hands are covered in cookie dough, but if you could answer questions like what was revenue last quarter in the UK and get the right answer fast, and have that dialogue, oh do you mean fiscal quarter or calendar quarter. Do you mean UK including Ireland, or whatever it is. That would really enable better decisions and a better outcome. >> I was worried that Siri might do something here. Hey Siri, oh there it is, okay be careful, I don't want it to answer and take over my job. >> (laughs) >> Automation will take away the job, maybe Siri will be doing interviews. Okay let's take a step back. You guys are doing well as a start up, you've got some great funding, great investors. How are you guys doing on the product? Give us a quick highlight on where you guys are, obviously this is BigData NYC a lot going on, it's Manhattan, you've got financial services, big industry here. You've got the Strata Data event which is the classic Hadoop industry that's morphed into data. Which really is overlapping with cloud, IoTs application developments all kind of coming together. How do you guys fit into that world? >> Yeah, absolutely, so the idea of the data lake is kind of interesting. Psychologically it's sort of a hoarder mentality, oh everything I've ever had I want to keep in the attic, because I might need it one day. 
Great opportunity to evolve these new streams of data, with IoT and what not, but just 'cause you can get to it physically doesn't mean it's easy to find the thing you want, the needle in all that big haystack and to distinguish from among all the different assets that are available, which is the one that is actually trustworthy for your need. So we find that all these trends make the need for a catalog to kind of organize that information and get what you want all the more valuable. >> This has come up a lot, I want to get into the integration piece and how you're dealing with your partnerships, but the data lake integration has been huge, and having the catalog has come up with, has been the buzz. Foundationally if you will saying catalog is important. Why is it important to do the catalog work up front, with a lot of the data strategies? >> It's a great question, so, we see data cataloging as step zero. Before you can prep the data in a tool like Trifacta, Paxata, or Kylo. Before you can visualize it in a tool like Tableau, or MicroStrategy. Before you can do some sort of cool prediction of what's going to happen in the future, with a data science engine, before any of that. These are all garbage-in, garbage-out processes. The step zero is find the relevant data. Understand it so you can get it in the right format. Trust that it's good and then you can do whatever comes next. >> And governance has become a key thing here, we've heard of the regulations, GDPR outside of the United States, but also that's going to have an arms length reach over into the United States impact. So these little decisions, and there's going to be an Equifax someday out there. Another one's probably going to come around the corner. How does the policy injection change the catalog equation? A lot of people are building machine learning algorithms on top of catalogs, and they're worried they might have to rewrite everything. 
How do you balance the trade off between good catalog design and flexibility on the algorithm side? >> Totally, yes, it's a complicated thing with governance and consumption, right. There's people who are concerned with keeping the data safe, and there are people concerned with turning that data into real value, and these can seem to be at odds. What we find is actually a catalog is a foundation for both, and they are not as opposed as they seem. What Alation fundamentally does is we make a map of where the data is, who's using what data, when, how. And that can actually be helpful if your goal is to say let's follow in the footsteps of the best analyst and make more insights generated or if you want to say, hey this data is being used a lot, let's make sure it's being used correctly. >> And by the right people. >> And by the right people, exactly. >> Equifax they were fishing that pond dry months, months before it actually happened. With good tools like this they might have seen this right? Am I getting it right? >> That's exactly right, how can you observe what's going on to make sure it's compliant and that the answers are correct and that it's happening quickly and driving results. >> So in a way you're taking the collective intelligence of the user behavior and using that into understanding what to do with the data modeling? >> That's exactly right. We want to make each person in your organization as knowledgeable as all of their peers combined. >> So the benefit then for the customer would be if you see something that's developing you can double down on it. And if the users are using a lot of data, then you can provision more technology, more software. >> Absolutely, absolutely. It's sort of like when I was going to Stanford, there was a place where the grass was all dead, because people were riding their bikes diagonally across it. And then somebody smart was like, we're going to put a real gravel path there. 
So the infrastructure should follow the usage, instead of being something you try to enforce on people. >> It's a classic design meme that goes around. Good design is here, the more effective design is the path. >> Exactly. >> So let's get into the integration. So one of the hot topics here this year obviously besides cloud and AI, with cloud really being more the driver, the tailwind for the growth, AI being more the futuristic head room, is integration. You guys have some partnerships that you announced with integration, what are some of the key ones, and why are they important? >> Absolutely, so, there have been attempts in the past to centralize all the data in one place have one warehouse or one lake have one BI tool. And those generally fail, for different reasons, different teams pick different stacks that work for them. What we think is important is the single source of reference. One hub with spokes out to all those different points. If you think about it, it's like Google, it's one index of the whole web even though the web is distributed all over the place. To make that happen it's very important that we have partnerships to get data in from various sources. So we have partnerships with database vendors, with Cloudera and Hortonworks, with different BI tools. What's new are a few things. One is with Cloudera Navigator, they have great technical metadata around security and lineage over HDFS, and that's a way to bolster our catalog to go even deeper into what's happening in the files before things get surfaced and higher for places where we have a deeper offering today. >> So it's almost a connector to them in a way, you kind of share data. >> That's exactly right, we've a lot of different connectors, this is one new one that we have. Another, go ahead. >> I was going to go ahead continue. 
>> I was just going to say another place that is exciting is data prep tools, so Trifacta and Paxata are both places where you can find and understand in Alation and then begin to manipulate in those tools. We announced with Paxata yesterday, the ability to click to profile, so if you want to actually see what's in some raw compressed Avro file, you can see that in one click. >> It's interesting, Paxata has really been almost lapping, Trifacta because they were the leader in my mind, but now you've got like a NASCAR race going on between the two firms, because data wrangling is a huge issue. Data prep is where everyone is stuck right now, they just want to do the data science, it's interesting. >> They are both amazing companies and I'm happy to partner with both. And actually Trifacta and Alation have a lot of joint customers we're psyched to work with as well. I think what's interesting is that data prep, and this is beginning to happen with analyst definitions of that field. It isn't just preparing the data to be used, getting it cleaned and shaped, it's also preparing the humans to use the data giving them the confidence, the tools, the knowledge to know how to manipulate it. >> And it's great progress. So the question I wanted to ask is now the other big trend here is, I mean it's kind of a subtext in this show, it's not really front and center but we've been seeing it kind of emerge as a concept, we see in the cloud world, on premise vs cloud. On premise a lot of people bring in the dev ops model in, and saying I may move to the cloud for bursting and some native applications, but at the end of the day there is a lot of work going on on premise. A lot of companies are kind of cleaning house, retooling, replatforming, whatever you want to do resetting. They are kind of getting their house in order to do on prem cloud ops, meaning a business model of cloud operations on site. 
A lot of people doing that, that will impact the story, it's going to impact some of the server modeling, that's a hot trend. How do you guys deal with the on premise cloud dynamic? >> Totally, so we just want to do what's right for the customer, so we deploy both on prem and in the cloud and then from wherever the Alation server is it will point to usually a mix of sources, some that are in the cloud like Redshift or S3 often with Amazon today, and also sources that are on prem. I do think I'm seeing a trend more and more toward the cloud and we have people that are migrating from HDFS to S3 is one thing we hear a lot about it. Strata with sort of Hadoop interest. But I think what's happening is people are realizing as each Equifax in turn happens, that this old wild west model of oh you surround your bank with people on horseback and it's physically in one place. With data it isn't like that, most people are saying I'd rather have the A+ teams at Salesforce or Amazon or Google be responsible for my security, than the people I can get over in the midwest. >> And the Paxata guys have loved the term Data Democracy, because that is really democratization, making the data free but also having the governance thing. So tell me about the Data Lake governance, because I've never loved the term Data Lake, I think it's more of a data ocean, but now you see data lake, data lake, data lake. Are they just silos of data lakes happening now? Are people trying to connect them? That's key, so that's been a key trend here. How do you handle the governance across multiple data lakes? >> That's right so the key is to have that single source of reference, so that regardless of which lake or warehouse, or little siloed SQL Server somewhere, that you can search in a single portal and find that thing no matter where it is. >> John: Can you guys do that? 
>> We can do that, yeah, I think the metaphor for people who haven't seen it really is Google, if you think about it, you don't even know what physical server a webpage is hosted from. >> Data lakes should just be invisible >> Exactly. >> So you're interfacing with multiple data lakes, that's a value proposition for you. >> That's right so it could be on prem or in the cloud, multi-cloud. >> Can you share an example of a customer that uses that and kind of how it's laid out? >> Absolutely, so one great example of an interesting data environment is eBay. They have the biggest Teradata warehouse in the world. They also have I believe two huge data lakes, they have Hive on top of that, and Presto is used to sort of virtualize it across a mixture of Teradata, and Hive and then direct Presto query. It gets very complicated, and they have, they are a very data driven organization, so they have people who are product owners who are in jobs where data isn't in their job title and they know how to look at Excel and look at numbers and make choices, but they aren't real data people. Alation provides that accessibility so that they can understand it. >> We used to call the Hadoop world the car show for the data world, where for a long time it was about the engine what was doing what, and then it became, what's the car, and now how's it drive. Seeing that same evolution now where all that stuff has to get done under the hood. >> Aaron: Exactly. >> But there are still people who care about that, right. They are the mechanics, they are the plumbers, whatever you want to call them, but then the data science are the guys really driving things and now end users potentially, and even applications bots or what nots. It seems to evolve, that's where we're kind of seeing the show change a little bit, and that's kind of where you see some of the AI things. 
I want to get your thoughts on how you or your guys are using AI, how you see AI, if it's AI at all if it's just machine learning as a baby step into AI, we all know what AI could be, but it's really just machine learning now. How do you guys use quote AI and how has it evolved? >> It's a really insightful question and a great metaphor that I love. If you think about it, it used to be how do you build the car, and now I can drive the car even though I couldn't build it or even fix it, and soon I don't even have to drive the car, the car will just drive me, all I have to know is where I want to go. That's sort of the progression that we see as well. There's a lot of talk about deep learning, all these different approaches, and it's super interesting and exciting. But I think even more interesting than the algorithms are the applications. And so for us it's like today how do we get that turn by turn directions where we say turn left at the light if you want to get there. And eventually you know maybe the computer can do it for you. The thing that is also interesting is to make these algorithms work no matter how good your algorithm is it's all based on the quality of your training data. >> John: Which is a historical data. Historical data in essence the more historical data you have you need that to train the data. >> Exactly right, and we call this behavior IO how do we look at all the prior human behavior to drive better behavior in the future. And I think the key for us is we don't want to have a bunch of unpaid >> John: You can actually get that URL behavioral IO. >> We should do it before it's too late (Both laugh) >> We're live right now, go register that Patrick. >> Yeah so the goal is we don't want to have a bunch of unpaid interns trying to manually attack things, that's error prone and that's slow. 
I look at things like Luis von Ahn over at CMU, he does a thing where as you're writing in a CAPTCHA to get an email account you're also helping Google recognize a hard to read address or a piece of text from books. >> John: If you shoot the arrow forward, you just take this kind of forward, you almost think augmented reality is a pretext to what we might see for what you're talking about and ultimately VR are you seeing some of the use cases for virtual reality be very enterprise oriented or even end consumer. I mean Tom Brady the best quarterback of all time, he uses virtual reality to play the offense virtually before every game, he's a power user, in pharma you see them using virtual reality to do data mining without being in the lab, so lab tests. So you're seeing augmentation coming in to this turn by turn direction analogy. >> It's exactly, I think it's the other half of it. So we use AI, we use techniques to get great data from people and then we do extra work watching their behavior to learn what's right. And to figure out if there are recommendations, but then you serve those recommendations, either it's Google glasses it appears right there in your field of view. We just have to figure out how do we make sure, that in a moment of you're making a dashboard, or you're making a choice that you have that information right on hand. >> So since you're a technical geek, and a lot of folks would love to talk about this, so I'll ask you a tough question cause this is something everyone is trying to chase for the holy grail. How do you get the right piece of data at the right place at the right time, given that you have all these legacy silos, latencies and network issues as well, so you've got a data warehouse, you've got stuff in cold storage, and I've got an app and I'm doing something, there could be any points of data in the world that could be in milliseconds potentially on my phone or in my device my internet of thing wearable. How do you make that happen? 
Because that's the struggle, at the same time keep all the compliance and all the overhead involved, is it more compute, is it an architectural challenge how do you view that because this is the big challenge of our time. >> Yeah again I actually think it's the human challenge more than the technology challenge. It is true that there is data all over the place kind of gathering dust, but again if you think about Google, billions of web pages, I only care about the one I'm about to use. So for us it's really about being in that moment of writing a query, building a chart, how do we say in that moment, hey you're using an out of date definition of profit. Or hey the database you chose to use, the one thing you chose out of the millions that actually is broken and stale. And we have interventions to do that with our partners and through our own first party apps that actually change how decisions get made at companies. >> So to make that happen, if I imagine it, you'd have to need access to the data, and then write software that is contextually aware to then run, compute, in context to the user interaction. >> It's exactly right, back to the turn by turn directions concept you have to know both where you're trying to go and where you are. And so for us that can be the from where I'm writing a SQL statement after join we can suggest the table most commonly joined with that, but also overlay onto that the fact that the most commonly joined table was deprecated by a data steward data curator. So that's the moment that we can change the behavior from bad to good. 
>> So a chief data officer out there, we've got to wrap up, but I wanted to ask one final question, There's a chief data officer out there they might be empowered or they might be just a CFO assistant that's managing compliance, either way, someone's going to be empowered in an organization to drive data science and data value forward because there is so much proof that data science works. 
From military to play you're seeing examples where being data driven actually has benefits. So everyone is trying to get there. How do you explain the vision of Alation to that prospect? Because they have so much to select from, there's so much noise, there's like, we call it the tool shed out there, there's like a zillion tools out there there's like a zillion platforms, some tools are trying to turn into something else, a hammer is trying to be a lawnmower. So they've got to be careful on who they select, so what's the vision of Alation to that chief data officer, or that person in charge of analytics to scale operational analytics. >> Absolutely so we say to the CDO we have a shared vision for this place where your company is making decisions based on data, instead of based on gut, or expensive consultants months too late. And the way we get there, the reason Alation adds value is, we're sort of the last tool you have to buy, because with this lake mentality, you've got your tool shed with all the tools, you've got your library with all the books, but they're just in a pile on the floor, if you had a tool that had everything organized, so you just said hey robot, I need a hammer and this size nail and this text book on this set of information and it could just come to you, and it would be correct and it would be quick, then you could actually get value out of all the expense you've already put in this infrastructure, that's especially true on the lake. >> And also tools describe the way the work's done so in that model tools can be in the tool shed no one needs to know it's in there. >> Aaron: Exactly. >> You guys can help scale that. Well congratulations and just how far along are you guys in terms of number of employees, how many customers do you have? 
If you can share that, I don't know if that's confidential or what not >> Absolutely, so we're small but growing very fast planning to double in the next year, and in terms of customers, we've got 85 customers including some really big names. I mentioned eBay, Pfizer, Safeway Albertsons, Tesco, Meijer. >> And what are they saying to you guys, why are they buying, why are they happy? >> They share that same vision of a more data driven enterprise, where humans are empowered to find out, understand, and trust data to make more informed choices for the business, and that's why they come and come back. >> And that's the product roadmap, ethos, for you guys that's the guiding principle? >> Yeah the ultimate goal is to empower humans with information. >> Alright Aaron thanks for coming on the Cube. Aaron Kalb, co-founder head of product for Alation here in New York City for BigData NYC and also Strata Data I'm John Furrier thanks for watching. We'll be right back with more after this short break.

Published Date : Sep 28 2017

Satyen Sangani, Alation | SAP Sapphire Now 2017


 

>> Narrator: It's theCUBE covering Sapphire Now 2017 brought to you by SAP Cloud Platform and HANA Enterprise Cloud. >> Welcome back everyone to our special Sapphire Now 2017 coverage in our Palo Alto Studios. We have folks on the ground in Orlando. It's the third day of Sapphire Now and we're bringing our friends and experts inside our new 4500 square foot studio where we're starting to get our action going and covering events anywhere they are from here. If we can't get there we'll do it from here in Palo Alto. Our next guest is Satyen Sangani, CEO of Alation. A hot start-up funded by Costanoa Ventures, Catalyst, Data Collective, and I think Andreessen Horowitz is also an investor? >> Satyen: That's right. >> Satyen, welcome to the cube conversation here. >> Thank you for having me. >> So we are doing this special coverage, and I wanted to bring you in and discuss Sapphire Now as it relates to the context of the biggest wave hitting the industry, the waves, one's cloud. We've known that for a while. People surfing that one, then the data wave is coming fast, and I think this is a completely different animal in the sense of it's going to look different, but be just as big. Your business is in the data business. You help companies figure this out. Give us the update on, first take a minute talk about Alation, for the folks who aren't following you, what do you guys do, and then let's talk about data. >> Yeah. So for those of you that don't know about what Alation is, it's basically a data catalog. You know, if you think about all of the databases that exist in the enterprise, stuff on-prem, stuff in the cloud, all the BI tools like Tableau and MicroStrategy, and Business Objects. 
When you've got a lot of data that sits inside the enterprise today and a wide variety of legacy and modern tools, and what Alation does is, it creates a catalog, crawling all of those systems like Google crawls the web and effectively looks at all the logs inside of those systems, to understand how the data is interrelated and we create this data social graph, and it kind of looks >> John: It's a metadata catalog? >> We call you know, we don't use the word metadata because metadata is the word that people use when you know that's that's Johnny back in the corner office, Right? And people don't want to talk about metadata if you're a business person you think about metadata you're like, I don't, not my thing. >> So you guys are democratizing what data means to an organization? That's right. >> We just like to talk about context. We basically say, look in the same way that information, or in the same way when you're eating your food, you need, you know organic labeling to understand whether or not that's good or bad, we have on some level a provenance problem, a trust problem inside of data in the enterprise, and you need a layer of you know trust, and understanding in context. >> So you guys are a SaaS, or you guys are a SaaS solution, or are you a software subscription? >> We are both. Most of this is actually on-prem because most of the people that have the problem that Alation solves are very big complicated institutions, or institutions with a lot of data, or a lot of people trying to analyze it, but we do also have a SaaS offering, and actually that's how we intersect with SAP Altiscale, and so we have a cloud base that's offering that we work with. >> Tell me about your relationship with SAP because you kind of backdoored in through an acquisition, quickly note that we'll get into the conversation. 
>> Yeah, that's right. So Altiscale, two big intersections: big data, and they do big data in the cloud. SAP acquired them last year, and what we do is provide a front-end capability for people to access that data in the cloud, so that as analysts want to analyze that data, as data governance folks want to manage that data, we provide them with a single catalog to do that. >> So talk about the dynamics in the industry, because with SAP clearly the big news there is Leonardo. They're trying to create this framework. We just announced an alpha, in jest, of our Nostradamus product, because everyone's got these names of dead creative geniuses. (Satyen laughs) They have Leonardo and-- >> That's right. >> Salesforce's got Einstein, and IBM's got Watson, and Informatica has got Claire, so we thought maybe we'd just get our own version. But anyway, everyone's got some sort of bot or AI program. >> Yep. >> I mean, I get that, but the reality is, the trend is, they're trying to create a tool chest, re-platforming around tooling-- >> Satyen: Yeah. >> to make things easier. >> Satyen: Yeah. >> You have a lot of work in this area, through Alation, trying to make things easier. >> Satyen: Yeah. >> And also they've got the cloud: on-premise, HANA Enterprise Cloud, SAP Cloud Platform, meaning developers. So the convergence between developers, cloud, and data is happening. What's your take on that strategy? Do you think SAP's got a good move by going multi-cloud, or should they be taking a different approach?
>> Well, I think they have to. I mean, I think of the economics in cloud, and the unmanageability, you know, really the human economics, and being able to have more and more being managed by third-party providers like AWS, and how they scale, and the capability to manage at scale. You just really can't compete if you're SAP, and you can't compete if your customers are buying and assembling the toolkits on-premise. So they've got to go there, and I think every IT provider has to. >> John: Got to go to the cloud, you mean? >> They've got to go to the cloud. I think there's no question about it. You know, I think that's, at this point, a foregone conclusion in the world of enterprise IT. >> John: Yeah, it's pretty obvious. I mean, hybrid cloud is happening, and that's really a gateway to multi-cloud. The submission is when I build Norton, a guest in latency multi-cloud issues there, but the reality is not every workload's gone there yet. A lot of analytics is going on in the cloud. >> Satyen: Yeah. >> DevTest, okay, check the box on DevTest. >> Satyen: That's right. >> Analytics is a whole ballgame right now in terms of state of the art. Your thoughts on the trends in how companies are using the cloud for analytics, and the challenges and opportunities? >> Yeah, I think the analytics story in the cloud is a little bit earlier. I think with transaction processing and the new applications, the new architectures, and new integrations, certainly if you're going to build a new project, you're going to do that in the cloud. But I think with the analytics stack, first of all, there's data gravity, right? You know, there's a lot of gravity to that data in moving it all into the cloud. And so if your transaction processing and your behavioral apps are in the cloud, then it makes sense to keep the data in AWS, or in the cloud.
Conversely, you know, if it's not, then you're not going to take a whole bunch of data that sits on-prem and move it whole hog all the way to the cloud just because, right? That's super expensive. >> Yeah. >> You've got legacy. >> A lot of risk too, and a lot of governance and a lot of compliance stuff as well. >> That's exactly right. I mean, if you're trying to comply with Basel II or GDPR, and you know you want to manage all that privacy information, how are you going to do that if you're going to move your data at the same time? >> John: Yeah. >> And so it's a tough-- >> John: Great point. >> It's a tough move. I think from our perspective, and I think this is really important, you know, we sort of say, look, in a world where data is going to be on-prem, in the cloud, you know, in BI tools, in databases and NoSQL databases, on Hadoop, you're going to have data everywhere. And in that world, where data is going to be in multiple locations and multiple technologies, you've got to figure out a way to manage it. >> Yeah. I mean, data sprawls all over the place. It's a big problem. Oh, and by the way, that's a good thing: storage is getting cheaper and cheaper, data lakes are popping up, but you have data lakes and you have data everywhere. >> Satyen: That's right. >> How are you looking at that problem as a start-up, and how are customers dealing with that? Is this a real issue, or is this still too early to talk about data sprawl?
>> It's a real issue. I mean, we liken it to the advent of the Internet in the time of traditional media, right? So you had traditional media, and there were single, sort of authoritative sources. We all watched maybe CNN, maybe CBS, we had the nightly news, we had Newsweek, and we got our information. Then the Internet comes along, and anybody can blog about anything, right? And so the cost of creating information is now this much lower. Anybody can create any reality, anybody can store data anywhere, right? And so now you've got a world where, with Tableau, with Hadoop, with Redshift, you can build any stack you want to at any cost, and so now what do you do? Because everybody's creating their own thing, every dev is doing their own thing, everybody's got new databases, new applications. You know, software is eating the world, right? >> And data is eating software. >> And data is eating software, and so now you've got this problem where you're like, look, I've got all this stuff, and I don't know what's fake news, what's real, what's alternative fact, what doesn't make any sense. And so you've got a signal and noise problem, and I think in that world you've got to figure out how to get to truth, right? >> John: Yeah. And what's the answer to that in your mind? Not that you have the answer; if you did, we'd be solving it better. >> Yeah. >> But I mean, directionally, where's the vector going in your mind? I tried to talk to Paul Martino about this at Bullpen Capital. He's a total analytics geek. He doesn't think big data can solve that yet, but they've started to see some science around trying to solve these problems with data. What's your vision on this?
>> Satyen: Yeah, you know, so I think that every developer is going to start building applications based on data. I think that every business person is going to have an analytical role in their job, because if they're not dealing with the world on the certainty, and they're not using all the evidence at their disposal, they're not making the best decisions. And obviously there are going to be more and more analysts, and so, you know, at some level everybody is an analyst. >> I wrote a post in 2008, when my old blog was hosted on WordPress, before I started SiliconANGLE: data is the new developer kit. >> That's right. >> And I saw that early, and it was still not as clear then. Now it's obvious, at least to us, because we're in the middle of this industry. But it's now part of the software fabric. It's like a library. As a developer, you'd call a library of code to come in and be part of your program. >> Yeah. >> A building blocks approach, Lego blocks. But now data as Lego blocks completely changes the game on things, if you think of it that way. Where are we on that notion of really using data as a development component? I mean, it seems to be early. I haven't seen any proof points that say, well, that company's actually using the data programmatically with software. >> Satyen: Yeah. Well, I mean, look, I think there are features in almost every software application, whether it's, you know, 27% of the people clicked on this button into this particular thing. I mean, that's a data-based application, right? And so I think there is this notion that we talk a lot about, which is data literacy, right? And so that's kind of a weird thing. So what does that exactly mean?
Well, data is just information, like a news article is information, and you've got to decide whether it's good or it's bad, and whether you can come to a conclusion or whether you can't. Just as if you're using an API from a third-party developer, you need documentation, you need context about that data, and people have to be intelligent about how they use it. >> And literacy also makes it addressable. >> That's right. >> If you have knowledge about data, at some point it's named and addressed at some point in a network. >> Satyen: Yeah. >> Especially data in motion. I mean, data lakes I get, data at rest. When we start getting into data in motion, real-time data, every piece of data counts, right? >> That's exactly right. And so now you've got to teach people how to use this stuff. You've got to give them the right data, you've got to make that discoverable, you've got to make that information usable, you've got to get people to know who the experts are about the data so they can ask questions. You know, these are tougher problems, especially as you get more and more systems. >> All right, you're a growing start-up. You guys are lean and mean, doing well. You have to go compete in this war. There are a lot of big whales in there. I mean, you've got Oracle, SAP, IBM. They're all trying to transform. Everybody is transforming, all the incumbent winners, potential buyers of your company, or potentially you displacing them as a young CEO and, you know, eating their lunch. You have to go compete in a big game. How are you guys looking at that compass? I see your focus, so I know a little bit about your plan, but take us through the mindset of a start-up CEO that has to go into this world. You guys have to be good. I mean, this is a big wave. >> Yeah.
Nobody buys from a start-up, and a start-up could be even a company of less than 100 to 200 people. I mean, nobody's buying from a company unless there's a 10x return to value relative to the next best option. And so in that world, how do you build 10x value? Well, one, you've got to have great technology, and that's the start point. But the other thing is you've got to have deep focus on your customers, right? And so I think from our perspective, we build focus by just saying, look, nobody understands data in your company, and by and large you've got to make money by understanding this data. As you do the digital transformation stuff, a big part of that is differentiating and making better products and optimizing based upon understanding your data, because that helps you and your business make better decisions. >> John: Yeah. >> And so what we're going to do is help you understand that data better and faster than any other company can. >> You've really got to pick your shots. What I hear you saying is, as a start-up, you've got to hit the beachhead segment you want to own. >> Satyen: That's right. >> And own it. >> Satyen: That's exactly. >> No other decision, just get it, and then maybe get to a bigger scope later, and sequence around, and grow it that way. >> Satyen: You can't solve 10 problems-- >> You can't be groping for a beachhead. If you don't know what you want, you're never going to get it. >> That's right. You can't solve 10 problems unless you solve one, right? And so, you know, I think we're at a phase where we've proven that we can scalably solve one. We've got customers like, you know, Pfizer and Intuit and Citrix and Tesco and Tesla and eBay and Munich Reinsurance, and so these are all, you know, amazing brands that are traditionally difficult to sell into. But, you know, I think from our perspective it's really about focus and just helping customers that are making that digital, analytical transformation.
Do it faster, and do it by enabling their people. >> There's a lot going on this week for events. We had Informatica World this week, we've got VeeamON. We had Google I/O. We had Sapphire. There's a variety of other events going on, but I want to ask you kind of a more entrepreneurial industry question, which is, if we're going through the so-called digital transformation, that means a new modern era, and an old one being transformed. Yet I go to every event, and everyone's number one at something. I was just at Informatica, and they're number one in six quadrants. Michael Dell: we're number one in every category. Mark Hurd at the press meeting said they're number one in all categories. The Ross Perot quote, that you could be number one depending on how you slice the market, seems to be in play. My point is I kind of get a little bit, you know, weirded out by that, but that is okay. You know, I guess theCUBE's number one in overall live videos produced at an enterprise event, so we're number one at something. But my point is-- >> Satyen: You really are. >> My point is, in a new transformation, what is the new scoreboard going to look like? Because a lot of the things that you're talking about are horizontally integrated, there are new use cases developing, a new environment is coming online. So if someone wanted to actually try to keep score of who number one is and who's winning, besides customer wins, because that's clearly the one that you can point to and say, hey, they're winning customers, customer growth is good, outside of customer growth, what do you think will be the key requirements to get some sort of metric on who's really doing well versus the others? I mean, we're not yet there with-- >> Yeah, it's a tough problem. I mean, you know, it used to be that nobody gets fired for choosing IBM. >> John: Yeah.
>> Right, and I think that brand credibility worked in a world where you could be conservative, right? In this world, I think that looking for those measures is going to be really tough, and I think on some level that quest for what is number one, or who is the best, is actually sort of a fool's errand. And if that's what you're looking for, if you're looking for, you know, what's the best answer for me based upon social signal, you know, it's kind of like, I'm going to go do what the popular kids do in high school. I mean, that could lead to, you know, a path, but it doesn't lead to the one that's going to actually get you satisfaction. And so on some level I think that customers like you are the best signal, you know, always. >> John: Yeah, I mean, it's hard. It's a rhetorical question. We ask it because, you know, we're trying to see not mystical with the path of fact called the fashion, what's fashionable. >> Satyen: Yeah. >> That's different. I mean, talk about a real clear metric: in the old days, market share was one. Actually, IDC used to track who had market share, and they would say, based upon the number of product shipments, this is the market share winner, right? Yeah, that's pretty clean. I mean, that's fairly clean. So what would it be now? Number of instances? I mean, it's so hard to figure out. Anyway, I digress. >> No, I think that's right. I mean, I think it's really tough. I think it's customer stories that sort of map to your case. >> Yeah. It all comes back down to customer wins, how many customers you have-- >> Yeah, and how much value they are getting out of your stuff. >> Yeah. That 10x value, and I think that's the multiplier minimum, if not more, with the cloud and the scale that's happening. You agree? >> Satyen: Yeah. >> It's going to get better. Okay, thanks for coming on theCUBE. We have Satyen Sangani, CEO and co-founder of Alation, a great start-up.
Follow them on Twitter. These guys have got some really good focus on learning about your data, because once you understand the data hygiene, you start to think about ethics and all the cool stuff happening with data. Thanks so much for coming on theCUBE. More coverage of Sapphire after this short break. (techno music)

Published Date : May 19 2017



Stephanie McReynolds, Alation & Lee Paries, Think Big Analytics - #BigDataSV - #theCUBE


 

>> Voiceover: San Jose, California, it's theCUBE, covering Big Data Silicon Valley 2017. (techno music) >> Hey, welcome back everyone. Live in Silicon Valley for Big Data SV. This is theCUBE coverage in conjunction with Strata + Hadoop. I'm John Furrier with George Gilbert at Wikibon. Two great guests. We have Stephanie McReynolds, Vice President of startup Alation, and Lee Paries, who is the VP of Think Big Analytics. Thanks for coming back. You've both been on theCUBE before, and Think Big has been on many times. Good to see you. What's new, what are you guys up to? >> Yeah, excited to be here and to be here with Lee. Lee and I have a personal relationship that goes back quite a ways in the industry. And what we're talking about today is the integration between Kylo, which was recently announced as an open source project from Think Big, and Alation's capability to sit on top of Kylo, and together to increase the velocity of data lake initiatives, kind of going from zero to 60 in a pretty short amount of time, to get both technical value from Kylo and business value from Alation. >> So talk about Alation's traction, because you guys have been an interesting startup, a lot of great press. George is a big fan. He's going to jump in with some questions, but some good product fit with the market. What's the update? What's the status on the traction in terms of the company and customers and whatnot? >> Yeah, we've been growing pretty rapidly for a startup. We've doubled our production customer count from the last time we talked. Some great brand names. Munich Reinsurance this morning was talking about their implementation. So they have 600 users of Alation in their organization. We've entered Europe, not only with Munich Reinsurance, but Tesco is a large account of ours in Europe now.
And here in the States we've seen broad adoption across a wide range of industries, everyone from Pfizer in the healthcare space to eBay, who's been our longest-standing customer. They have about 1,000 weekly users on Alation. So not only a great increase in number of logos, but also organic growth internally at many of these companies across data scientists, data analysts, business analysts, a wide range of users of the product as well. >> It's been interesting. What I like about your approach, and we've talked with Think Big about it before: every guest that's come in so far in the same area is talking about metadata layers. And so this is interesting. There's a metadata, data addressability, if you will, for lack of a better description, but yet it has to be human-usable, integrated into human processes, whether it's virtualization or any kind of real-time app or anything. So you're seeing this convergence where I need to get the data into an app, whether it's IoT data or something else, really, really fast. So really the discovery piece is now the interesting layer. How competitive is it, and what are the different solutions that you guys see in this market? >> Yeah, I think it's interesting, because metadata has kind of had a revival, right? Everyone is talking about the importance of metadata and open integration with metadata. I think really our angle at Alation is that having open transfer of technical metadata is very important for the foundation of analytics, but what really brings that technical metadata to life is also understanding what is the business context of what's happening technically in the system. What's the business context of data? What's the behavioral context of how that data has been used that might inform me as an analyst? >> And what's your unique approach to that? Because that's like the Holy Grail. It's like translating geek metadata, indexing stuff, into usable business outcomes.
It's been a cliche for years, you know. >> The approach is really based on machine learning and AI technology to make recommendations to business users about what might be interesting to them. So we're at a state in the market where there is so much data that is available and that you can access, either in Hadoop as a data lake or in a data warehouse in a database like Teradata, that today what you need as state of the art is for the system to start to recommend to you what might be interesting data for you to use as a data scientist or an analyst. And not just what's the data you could use, but how accurate is that data, how trustworthy is it? I think there's a whole other theme of governance that's rising that's tied to that metadata discussion, which is that it's not enough to just shove bits and bytes between different systems anymore. You really need to understand how this data has been manipulated and used, and how that influences my security considerations, my privacy considerations, the value I'm going to be able to get out of that data set. >> What's your take on this, 'cause you guys have a relationship. How is Think Big doing? Then talk about the partnership you guys have with Alation. >> Sure. So, I mean, when you look at what we've done specifically with an open source project, it's the first one that Teradata has fully sponsored and released, under Apache 2.0, called Kylo. It's really about the enablement of the full data lake platform and the full framework, everywhere from ingest, to securing it, to governing it, and part of that process is collecting the basic technical and business metadata, so later you can hand it over to the users so they can sample, they can profile the data, they can find, they can search in a Google-like manner, and then you can enable the organization with that data.
So when you look at it from a standpoint of partnering together, it's really about collecting that data specifically within Hadoop to enable it, yet with the ability then to hand it off to more of an enterprise-wide solution like Alation through API connections, and then they enrich it in the way that they go about it, with the social collaboration and the business, to extend it from there. >> So that's the accelerant then. So you're accelerating the open source project through this new integration with Alation. So you're still going to rock and roll with the open source. >> Very much going to rock and roll with the open source. So it's really been based on five years of Think Big's work in the marketplace, over about 150 data lakes, the IP we've built around that to do things repeatedly and consistently, and then, in the last two years, dedicated development based on Apache Spark and NiFi to stand that up. >> Great work, by the way. Open source continues to be more relevant. But I've got to get your perspective on a meme that's been floating around day one here, and maybe it's because of the election, but someone said, "We've got to drain the data swamp and make data great again." Not a play on Trump, but the data lake is going through a transition, saying, "Okay, we've got data lakes," but now this year there's been a focus on making that much more active and cleaner and making sure it doesn't become a swamp, if you will. So there's been a focus on taking data lake content and getting it into real time, and IoT has, I think, been a forcing function. But do you guys have a perspective on where data lakes are going? Certainly it's been a trending conversation here at the show. >> Yeah, I think IoT has been part of draining that data swamp, but I think also now you have a mass of business analysts that are starting to get access to that data in the lake.
These Hadoop implementations are maturing to the stage where you have-- >> John: To value coming out of it. >> Yeah, and people are trying to wring value out of that lake, and sometimes finding that it is harder than they expected because the data hasn't been pre-prepared for them. This old world where IT would pre-prepare the data, and then I got a single metric or I got a couple of metrics to choose from, is now turned on its head. People are taking a more exploratory, discovery-oriented approach to navigating through their data and finding that the nuances of data really matter when trying to evolve an insight. So the literacy in these organizations and their awareness of some of the challenges of a lake are coming to the forefront, and I think that's a healthy conversation for us all to have. If you're going to have a data-driven organization, you have to really understand the nuances of your data to know where to apply it appropriately to decision making. >> So (mumbles) actually, going back quite a few years, when he started at Microsoft, said Internet software has changed the paradigm so much in that we have this new set of actions, where it was discover, learn, try, buy, recommend, and it sounds like, as a consumer of data in a data lake, we've added or prepended this discovery step. Whereas in a well-curated data warehouse it was learn: you had your X dimensions that were curated and refined, and you don't have that as much with the data lake. I guess I'm wondering, as we were talking to the last team with AtScale about moving OLAP to be something you consume on a data lake the way you consume on a data warehouse, is it almost like Alation and a smart catalog are as much a requirement as a visualization tool is by itself on a data warehouse?
>> I think what we're seeing is that this notion of data needing to be curated, and including many brains and many different perspectives in that curation process, is something that's defining the future of analytics and how people use technical metadata. And what does it mean for the DevOps organization to get involved in draining that swamp? That means not only looking at the elements of the data that are coming in from a technical perspective, but then collaborating with the business to curate the value on top of that data. >> So in other words, it's not just to help the user, the business analyst, navigate, but it's also to help the operational folks do a better job of curating once they find out who's using the data and how. >> That's right. They kind of need to know how this data is going to be used in the organization. The volumes are so high that they couldn't possibly curate every bit and byte that is stored in the data lake. So looking at how different individuals in the organization and different groups are trying to access that data gives an early signal as to where we should be spending more time or less time in processing this data and helping the organization really get to their end goals of usage. >> Lee, I want to ask you a question. In your blog post, which was just pointed out to me earlier, you guys quote a Gartner stat, which is pretty doom and gloom, which said, "70% of Hadoop deployments in 2017 will either fail to deliver their estimated cost savings or their predicted revenue." And then it says, "That's a dim view, but not shared by the Kylo community." How are you guys going to make the Kylo data lake software work well? What are your thoughts on that? Because I think, again, the number one question that I highlighted earlier is, okay, I don't want a swamp. So that's the fear, whether they get one or not, so they worry about data cleansing and all these things.
So what's Kylo doing that's going to accelerate, or lower that number of fails in the data lake world? >> Yeah, sure. So again, a lot of it's through experience of going out there and seeing what's done. A lot of people have been doing a lot of different things within data lakes, but when you go in there, there are certain things they're not doing, and then when you're doing them, it's about doing them consistently and continually improving upon that. And that's what Kylo is. It's really a framework that we keep adding to, and as the community grows and other projects come in that can enhance it, we bring the value. But a lot of times when we go in, it's basically that end users can't get to the data, either because they're not allowed to, because maybe it's not secured and reliable to turn over to them and let them drive with it, or they don't know the data is there, which goes back to collecting the basic metadata and data (mumbles) to know it's there to leverage it. So a lot of times it's going back and looking at and leveraging what we have to build that solid foundation, so IT and operations can feel like they can hand that over in a template format so business users can get to the data and start acting off of that. >> You just lost your mic there, but Stephanie, I've got to ask you a question. So just a point of clarification: are you guys supporting Kylo? Is that the relationship, or how does that work? >> So we're integrated with Kylo. So Kylo will ingest data into the lake, manage that data lake from a security perspective, giving folks permissions, and enable some wrangling on that data. And what Alation is receiving then from Kylo is the technical metadata that's being created along that entire path. >> So you're certified with Kylo? How does that all work from the customer standpoint? >> It's very much an integration partnership, where we'd be working together.
>> So from a customer standpoint it's clean and you then provide the benefits on the other side? >> Correct. >> Yeah, absolutely. We've been working with data lake implementations for some time, since our founding really, and I think this is an extension of our philosophy that the data lakes are going to play an important role that are going to complement databases and analytics tools, business intelligence tools, and the analytics environment, and the open source is part of the future of how folks are building these environments. So we're excited to support the Kylo initiative. We've had a longstanding relationship with Teradata as a partner, so it's a great way to work together. >> Thanks for coming on theCUBE. Really appreciate it, and thank... What do you think of the show you guys so far? What's the current vibe of the show? >> Oh, it's been good so far. I mean, it's one day into it, but very good vibe so far. Different topics and different things-- >> AI machine learning. You couldn't be more happier with that machine learning-- >> Great to see machine learning taking a forefront, people really digging into the details around what it means when you apply it. >> Stephanie, thanks for coming on theCUBE, really appreciate it. More CUBE coverage after the show break. Live from Silicon Valley, I'm John Furrier with George Gilbert. We'll be right back after this short break. (techno music)

Published Date : Mar 15 2017



Analyst Predictions 2023: The Future of Data Management


 

(upbeat music) >> Hello, this is Dave Vellante with theCUBE, and one of the most gratifying aspects of my role as a host of "theCUBE TV" is I get to cover a wide range of topics. And quite often, we're able to bring to our program a level of expertise that allows us to more deeply explore and unpack some of the topics that we cover throughout the year. And one of our favorite topics, of course, is data. Now, in 2021, after being in isolation for the better part of two years, a group of industry analysts met up at AWS re:Invent and started a collaboration to look at the trends in data and predict what some likely outcomes will be for the coming year. And it resulted in a very popular session that we had last year focused on the future of data management. And I'm very excited and pleased to tell you that the 2023 edition of that predictions episode is back, and with me are five outstanding market analysts, Sanjeev Mohan of SanjMo, Tony Baer of dbInsight, Carl Olofson from IDC, Dave Menninger from Ventana Research, and Doug Henschen, VP and Principal Analyst at Constellation Research. Now, what is it that we're calling you, guys? A data pack like the rat pack? No, no, no, no, that's not it. It's the data crowd, the data crowd, and the crowd includes some of the best minds in the data analyst community. They'll discuss how data management is evolving and what listeners should prepare for in 2023. Guys, welcome back. Great to see you. >> Good to be here. >> Thank you. >> Thanks, Dave. (Tony and Dave faintly speak) >> All right, before we get into 2023 predictions, we thought it'd be good to do a look back at how we did in 2022 and give a transparent assessment of those predictions. So, let's get right into it. We're going to bring these up here, the predictions from 2022, they're color-coded red, yellow, and green to signify the degree of accuracy. And I'm pleased to report there's no red. Well, maybe some of you will want to debate that grading system. 
But as always, we want to be open, so you can decide for yourselves. So, we're going to ask each analyst to review their 2022 prediction and explain their rating and what evidence they have that led them to their conclusion. So, Sanjeev, please kick it off. Your prediction was data governance becomes key. I know that's going to knock you guys over, but elaborate, because you had more detail when you double click on that. >> Yeah, absolutely. Thank you so much, Dave, for having us on the show today. And we self-graded ourselves. I could have very easily made my prediction from last year green, but I mentioned why I left it as yellow. I totally fully believe that data governance was in a renaissance in 2022. And why do I say that? You have to look no further than AWS launching its own data catalog called DataZone. Before that, mid-year, we saw Unity Catalog from Databricks went GA. So, overall, I saw there was tremendous movement. When you see these big players launching a new data catalog, you know that they want to be in this space. And this space is highly critical to everything that I feel we will talk about in today's call. Also, if you look at established players, I spoke at Collibra's conference, data.world, work closely with Alation, Informatica, a bunch of other companies, they all added tremendous new capabilities. So, it did become key. The reason I left it as yellow is because I had made a prediction that Collibra would go IPO, and it did not. And I don't think anyone is going IPO right now. The market is really, really down, the funding in VC IPO market. But other than that, data governance had a banner year in 2022. >> Yeah. Well, thank you for that. And of course, you saw data clean rooms being announced at AWS re:Invent, so more evidence. And I like how the fact that you included in your predictions some things that were binary, so you dinged yourself there. So, good job. Okay, Tony Baer, you're up next. Data mesh hits reality check. 
As you see here, you've given yourself a bright green thumbs up. (Tony laughing) Okay. Let's hear why you feel that was the case. What do you mean by reality check? >> Okay. Thanks, Dave, for having us back again. This is something I just wrote and just tried to get away from, and this is a topic that just won't go away. I did speak with a number of folks, early adopters and non-adopters during the year. And I did find that basically it pretty much validated what I was expecting, which was that there was a lot more, this has now become a front burner issue. And if I had any doubt in my mind, the evidence I would point to is what was originally intended to be a throwaway post on LinkedIn, which I just quickly scribbled down the night before leaving for re:Invent. I was packing at the time, and for some reason, I was doing a Google search on data mesh. And I happened to have tripped across this ridiculous article, I will not say where, because it doesn't deserve any publicity, about the eight (Dave laughing) best data mesh software companies of 2022. (Tony laughing) One of my predictions was that you'd see data mesh washing. And I just quickly just hopped on that maybe three sentences and wrote it in about a couple minutes saying this is hogwash, essentially. (laughs) And that just reun... And then, I left for re:Invent. And the next night, when I got into my Vegas hotel room, I clicked on my computer. I saw 15,000 hits on that post, which was the most hits of any single post I put all year. And the responses were wildly pro and con. So, it pretty much validates my expectation in that data mesh really did hit a lot more scrutiny over this past year. >> Yeah, thank you for that. I remember that article. I remember rolling my eyes when I saw it, and then I recently, (Tony laughing) I talked to Walmart and they actually invoked Martin Fowler and they said that they're working through their data mesh. 
So, it takes a really lot of thought, and it really, as we've talked about, is really as much an organizational construct. You're not buying data mesh >> Bingo. >> to your point. Okay. Thank you, Tony. Carl Olofson, here we go. You've graded yourself a yellow in the prediction that graph databases take off. Please elaborate. >> Yeah, sure. So, I realized in looking at the prediction that it seemed to imply that graph databases could be a major factor in the data world in 2022, which obviously didn't become the case. It was an error on my part in that I should have said it in the right context. It's really a three to five-year time period that graph databases will really become significant, because they still need accepted methodologies that can be applied in a business context as well as proper tools in order for people to be able to use them seriously. But I stand by the idea that it is taking off, because for one thing, Neo4j, which is the leading independent graph database provider, had a very good year. And also, we're seeing interesting developments in terms of things like AWS with Neptune and with Oracle providing graph support in Oracle database this past year. Those things are, as I said, growing gradually. There are other companies like TigerGraph and so forth, that deserve watching as well. But as far as becoming mainstream, it's going to be a few years before we get all the elements together to make that happen. Like any new technology, you have to create an environment in which ordinary people without a whole ton of technical training can actually apply the technology to solve business problems. >> Yeah, thank you for that. These specialized databases, graph databases, time series databases, you see them embedded into mainstream data platforms, but there's a place for these specialized databases, I would suspect we're going to see new types of databases emerge with all this cloud sprawl that we have and maybe to the edge. 
>> Well, part of it is that it's not as specialized as you might think it. You can apply graphs to great many workloads and use cases. It's just that people have yet to fully explore and discover what those are. >> Yeah. >> And so, it's going to be a process. (laughs) >> All right, Dave Menninger, streaming data permeates the landscape. You gave yourself a yellow. Why? >> Well, I couldn't think of an appropriate combination of yellow and green. Maybe I should have used chartreuse, (Dave laughing) but I was probably a little hard on myself making it yellow. This is another type of specialized data processing, like Carl was talking about with graph databases: stream processing. And nearly every data platform offers streaming capabilities now. Often, it's based on Kafka. If you look at Confluent, their revenues have grown at more than 50%, continue to grow at more than 50% a year. They're expected to do more than half a billion dollars in revenue this year. But the thing that hasn't happened yet, and to be honest, they didn't necessarily expect it to happen in one year, is that streaming hasn't become the default way in which we deal with data. It's still a sidecar to data at rest. And I do expect that we'll continue to see streaming become more and more mainstream. I do expect perhaps in the five-year timeframe that we will first deal with data as streaming and then at rest, but the worlds are starting to merge. And we even see some vendors bringing products to market, such as K2View, Hazelcast, and RisingWave Labs. So, in addition to all those core data platform vendors adding these capabilities, there are new vendors approaching this market as well. >> I like the tough grading system, and it's not trivial. And when you talk to practitioners doing this stuff, there's still some complications in the data pipeline. And so, but I think, you're right, it probably was a yellow plus. Doug Henschen, data lakehouses will emerge as dominant. 
When you talk to people about lakehouses, practitioners, they all use that term. They certainly use the term data lake, but now, they're using lakehouse more and more. What's your thoughts on here? Why the green? What's your evidence there? >> Well, I think, I was accurate. I spoke about it specifically as something that vendors would be pursuing. And we saw yet more lakehouse advocacy in 2022. Google introduced its BigLake service alongside BigQuery. Salesforce introduced Genie, which is really a lakehouse architecture. And it was a safe prediction to say vendors are going to be pursuing this in that AWS, Cloudera, Databricks, Microsoft, Oracle, SAP, Salesforce now, IBM, all advocate this idea of a single platform for all of your data. Now, the trend was also supported in 2023, in that we saw a big embrace of Apache Iceberg in 2022. That's a structured table format. It's used with these lakehouse platforms. It's open, so it ensures portability and it also ensures performance. And that's a structured table that helps with the warehouse side performance. But among those announcements, Snowflake, Google, Cloudera, SAP, Salesforce, IBM, all embraced Iceberg. But keep in mind, again, I'm talking about this as something that vendors are pursuing as their approach. So, they're advocating end users. It's very cutting edge. I'd say the top, leading edge, 5% of companies have really embraced the lakehouse. I think, we're now seeing the fast followers, the next 20 to 25% of firms embracing this idea and embracing a lakehouse architecture. I recall Christian Kleinerman at the big Snowflake event last summer, making the announcement about Iceberg, and he asked for a show of hands for any of you in the audience at the keynote, have you heard of Iceberg? And just a smattering of hands went up. So, the vendors are ahead of the curve. They're pushing this trend, and we're now seeing a little bit more mainstream uptake. >> Good. Doug, I was there. 
It was you, me, and I think, two other hands were up. That was just humorous. (Doug laughing) All right, well, so I liked the fact that we had some yellow and some green. When you think about these things, there's the prediction itself. Did it come true or not? There are the sub predictions that you guys make, and of course, the degree of difficulty. So, thank you for that open assessment. All right, let's get into the 2023 predictions. Let's bring up the predictions. Sanjeev, you're going first. You've got a prediction around unified metadata. What's the prediction, please? >> So, my prediction is that metadata space is currently a mess. It needs to get unified. There are too many use cases of metadata, which are being addressed by disparate systems. For example, data quality has become really big in the last couple of years, data observability, the whole catalog space is actually, people don't like to use the word data catalog anymore, because data catalog sounds like it's a catalog, a museum, if you may, of metadata that you go and admire. So, what I'm saying is that in 2023, we will see that metadata will become the driving force behind things like data ops, things like orchestration of tasks using metadata, not rules. Not saying that if this fails, then do this, if this succeeds, go do that. But it's like getting to the metadata level, and then making a decision as to what to orchestrate, what to automate, how to do data quality check, data observability. So, this space is starting to gel, and I see there'll be more maturation in the metadata space. Even security privacy, some of these topics, which are handled separately. And I'm just talking about data security and data privacy. I'm not talking about infrastructure security. These also need to merge into a unified metadata management piece with some knowledge graph, semantic layer on top, so you can do analytics on it. So, it's no longer something that sits on the side, it's limited in its scope. 
It is actually the very engine, the very glue that is going to connect data producers and consumers. >> Great. Thank you for that. Doug. Doug Henschen, any thoughts on what Sanjeev just said? Do you agree? Do you disagree? >> Well, I agree with many aspects of what he says. I think, there's a huge opportunity for consolidation and streamlining of these as aspects of governance. Last year, Sanjeev, you said something like, we'll see more people using catalogs than BI. And I have to disagree. I don't think this is a category that's headed for mainstream adoption. It's a behind the scenes activity for the wonky few, or better yet, companies want machine learning and automation to take care of these messy details. We've seen these waves of management technologies, some of the latest data observability, customer data platform, but they failed to sweep away all the earlier investments in data quality and master data management. So, yes, I hope the latest tech offers, glimmers that there's going to be a better, cleaner way of addressing these things. But to my mind, the business leaders, including the CIO, only want to spend as much time and effort and money and resources on these sorts of things to avoid getting breached, ending up in headlines, getting fired or going to jail. So, vendors bring on the ML and AI smarts and the automation of these sorts of activities. >> So, if I may say something, the reason why we have this dichotomy between data catalog and the BI vendors is because data catalogs are very soon, not going to be standalone products, in my opinion. They're going to get embedded. So, when you use a BI tool, you'll actually use the catalog to find out what is it that you want to do, whether you are looking for data or you're looking for an existing dashboard. So, the catalog becomes embedded into the BI tool. >> Hey, Dave Menninger, sometimes you have some data in your back pocket. Do you have any stats (chuckles) on this topic? 
>> No, I'm glad you asked, because I'm going to... Now, data catalogs are something that's interesting. Sanjeev made a statement that data catalogs are falling out of favor. I don't care what you call them. They're valuable to organizations. Our research shows that organizations that have adequate data catalog technologies are three times more likely to express satisfaction with their analytics for just the reasons that Sanjeev was talking about. You can find what you want, you know you're getting the right information, you know whether or not it's trusted. So, those are good things. So, we expect to see the capabilities, whether it's embedded or separate. We expect to see those capabilities continue to permeate the market. >> And a lot of those catalogs are driven now by machine learning and things. So, they're learning from those patterns of usage by people when people use the data. (airy laughs) >> All right. Okay. Thank you, guys. All right. Let's move on to the next one. Tony Baer, let's bring up the predictions. You got something in here about the modern data stack. We need to rethink it. Is the modern data stack getting long in the tooth? Is it not so modern anymore? >> I think, in a way, it's got almost too modern. It's gotten too, I don't know if it's being long in the tooth, but it is getting long. The modern data stack, it's traditionally been defined as basically you have the data platform, which would be the operational database and the data warehouse. And in between, you have all the tools that are necessary to essentially get that data from the operational realm or the streaming realm for that matter into basically the data warehouse, or as we might be seeing more and more, the data lakehouse. And I think, what's important here is that, or I think, we have seen a lot of progress, and this would be in the cloud, is with the SaaS services. 
And especially you see that in the modern data stack, which is like all these players, not just the MongoDBs or the Oracles or the Amazons have their database platforms. You see the Informaticas, and all the other players there, and Fivetrans have their own SaaS services. And within those SaaS services, you get a certain degree of simplicity, which is it takes all the housekeeping off the shoulders of the customers. That's a good thing. The problem is that what we're getting to unfortunately is what I would call lots of islands of simplicity, which means that it leaves it (Dave laughing) to the customer to have to integrate or put all that stuff together. It's a complex tool chain. And so, what we really need to think about here, we have too many pieces. And going back to the discussion of catalogs, it's like we have so many catalogs out there, which one do we use? 'Cause chances are most organizations do not rely on a single catalog at this point. What I'm calling on all the data providers or all the SaaS service providers, is to literally get it together and essentially make this modern data stack less of a stack, make it more of a blending of an end-to-end solution. And that can come in a number of different ways. Part of it is that data platform providers have been adding services that are adjacent. And there's some very good examples of this. We've seen progress over the past year or so. For instance, MongoDB integrating search. It's a very common, I guess, sort of tool that basically, that the applications that are developed on MongoDB use, so MongoDB then built it into the database rather than requiring an extra Elasticsearch or OpenSearch stack. Amazon just... AWS just did the zero-ETL, which is a first step towards simplifying the process from going from Aurora to Redshift. You've seen same thing with Google BigQuery integrating basically streaming pipelines. And you're seeing also a lot of movement in database machine learning. 
So, there's some good moves in this direction. I expect to see more of this this year. Part of it's from basically the SaaS platform is adding some functionality. But I also see more importantly, because you're never going to get... This is like asking your data team and your developers, herding cats to standardizing on the same tool. In most organizations, that is not going to happen. So, take a look at the most popular combinations of tools and start to come up with some pre-built integrations and pre-built orchestrations, and offer some promotional pricing, maybe not quite two-for-one, but in other words, get two products for the price of two services or for the price of one and a half. I see a lot of potential for this. And it's to me, if the class was to simplify things, this is the next logical step and I expect to see more of this here. >> Yeah, and you see in Oracle MySQL HeatWave, yet another example of eliminating that ETL. Carl Olofson, today, if you think about the data stack and the application stack, they're largely separate. Do you have any thoughts on how that's going to play out? Does that play into this prediction? What do you think? >> Well, I think, that the... I really like Tony's phrase, islands of simplification. It really says (Tony chuckles) what's going on here, which is that all these different vendors you ask about, about how these stacks work. All these different vendors have their own stack vision. And you can... One application group is going to use one, and another application group is going to use another. And some people will say, let's go to, like you go to an Informatica conference and they say, we should be the center of your universe, but you can't connect everything in your universe to Informatica, so you need to use other things. So, the challenge is how do we make those things work together? As Tony has said, and I totally agree, we're never going to get to the point where people standardize on one organizing system. 
So, the alternative is to have metadata that can be shared amongst those systems and protocols that allow those systems to coordinate their operations. This is standard stuff. It's not easy. But the motive for the vendors is that they can become more active critical players in the enterprise. And of course, the motive for the customer is that things will run better and more completely. So, I've been looking at this in terms of two kinds of metadata. One is the meaning metadata, which says what data can be put together. The other is the operational metadata, which says basically where did it come from? Who created it? What's its current state? What's the security level? Et cetera, et cetera, et cetera. The good news is the operational stuff can actually be done automatically, whereas the meaning stuff requires some human intervention. And as we've already heard from, was it Doug, I think, people are disinclined to put a lot of definition into meaning metadata. So, that may be the harder one, but coordination is key. This problem has been with us forever, but with the addition of new data sources, with streaming data with data in different formats, the whole thing has, it's been like what a customer of mine used to say, "I understand your product can make my system run faster, but right now I just feel I'm putting my problems on roller skates. (chuckles) I don't need that to accelerate what's already not working." >> Excellent. Okay, Carl, let's stay with you. I remember in the early days of the big data movement, Hadoop movement, NoSQL was the big thing. And I remember Amr Awadallah said to us in theCUBE that SQL is the killer app for big data. So, your prediction here, if we bring that up is SQL is back. Please elaborate. >> Yeah. So, of course, some people would say, well, it never left. 
Actually, that's probably closer to true, but in the perception of the marketplace, there's been all this noise about alternative ways of storing, retrieving data, whether it's in key value stores or document databases and so forth. We're getting a lot of messaging that for a while had persuaded people that, oh, we're not going to do analytics in SQL anymore. We're going to use Spark for everything, except that only a handful of people know how to use Spark. Oh, well, that's a problem. Well, how about, and for ordinary conventional business analytics, Spark is like an over-engineered solution to the problem. SQL works just great. What's happened in the past couple years, and what's going to continue to happen is that SQL is insinuating itself into everything we're seeing. We're seeing all the major data lake providers offering SQL support, whether it's Databricks or... And of course, Snowflake is loving this, because that is what they do, and their success is certainly points to the success of SQL, even MongoDB. And we were all, I think, at the MongoDB conference where on one day, we hear SQL is dead. They're not teaching SQL in schools anymore, and this kind of thing. And then, a couple days later at the same conference, they announced we're adding a new analytic capability-based on SQL. But didn't you just say SQL is dead? So, the reality is that SQL is better understood than most other methods of certainly of retrieving and finding data in a data collection, no matter whether it happens to be relational or non-relational. And even in systems that are very non-relational, such as graph and document databases, their query languages are being built or extended to resemble SQL, because SQL is something people understand. >> Now, you remember when we were in high school and you had had to take the... Your debating in the class and you were forced to take one side and defend it. 
So, I was at a Vertica conference one time up on stage with Curt Monash, and I had to take the NoSQL, the world is changing paradigm shift. And so just to be controversial, I said to him, Curt Monash, I said, who really needs ACID compliance anyway? Tony Baer. And so, (chuckles) of course, his head exploded, but what are your thoughts (guests laughing) on all this? >> Well, my first thought is congratulations, Dave, for surviving being up on stage with Curt Monash. >> Amen. (group laughing) >> I definitely would concur with Carl. We actually are definitely seeing a SQL renaissance and if there's any proof of the pudding here, I see lakehouse is being icing on the cake. As Doug had predicted last year, now, (clears throat) for the record, I think, Doug was about a year ahead of time in his predictions that this year is really the year that I see (clears throat) the lakehouse ecosystems really firming up. You saw the first shots last year. But anyway, on this, data lakes will not go away. I've actually, I'm on the home stretch of doing a market, a landscape on the lakehouse. And lakehouse will not replace data lakes in terms of that. There is the need for those, data scientists who do know Python, who know Spark, to go in there and basically do their thing without all the restrictions or the constraints of a pre-built, pre-designed table structure. I get that. Same thing for developing models. But on the other hand, there is huge need. Basically, (clears throat) maybe MongoDB was saying that we're not teaching SQL anymore. Well, maybe we have an oversupply of SQL developers. Well, I'm being facetious there, but there is a huge skills base in SQL. Analytics have been built on SQL. They came with lakehouse and why this really helps to fuel a SQL revival is that the core need in the data lake, what brought on the lakehouse was not so much SQL, it was a need for ACID. And what was the best way to do it? It was through a relational table structure. 
So, the whole idea of ACID in the lakehouse was not to turn it into a transaction database, but to make the data trusted, secure, and more granularly governed, where you could govern down to column and row level, which you really could not do in a data lake or a file system. So, while lakehouse can be queried in a manner, you can go in there with Python or whatever, it's built on a relational table structure. And so, for that end, for those types of data lakes, it becomes the end state. You cannot bypass that table structure as I learned the hard way during my research. So, the bottom line I'd say here is that lakehouse is proof that we're starting to see the revenge of the SQL nerds. (Dave chuckles) >> Excellent. Okay, let's bring back up the predictions. Dave Menninger, this one's really thought-provoking and interesting. We're hearing things like data as code, new data applications, machines actually generating plans with no human involvement. And your prediction is the definition of data is expanding. What do you mean by that? >> So, I think, for too long, we've thought about data as the, I would say facts that we collect the readings off of devices and things like that, but data on its own is really insufficient. Organizations need to manipulate that data and examine derivatives of the data to really understand what's happening in their organization, why has it happened, and to project what might happen in the future. And my comment is that these data derivatives need to be supported and managed just like the data needs to be managed. We can't treat this as entirely separate. Think about all the governance discussions we've had. Think about the metadata discussions we've had. If you separate these things, now you've got more moving parts. We're talking about simplicity and simplifying the stack. So, if these things are treated separately, it creates much more complexity. 
I also think it creates a little bit of a myopic view on the part of the IT organizations that are acquiring these technologies. They need to think more broadly. So, for instance, metrics. Metric stores are becoming a much more common part of the tooling that's part of a data platform. Similarly, feature stores are gaining traction. So, those are designed to promote the reuse and consistency across the AI and ML initiatives. The elements that are used in developing an AI or ML model. And let me go back to metrics and just clarify what I mean by that. So, any type of formula involving the data points. I'm distinguishing metrics from features that are used in AI and ML models. And the data platforms themselves are increasingly managing the models as an element of data. So, just like figuring out how to calculate a metric. Well, if you're going to have the features associated with an AI and ML model, you probably need to be managing the model that's associated with those features. The other element where I see expansion is around external data. Organizations for decades have been focused on the data that they generate within their own organization. We see more and more of these platforms acquiring and publishing data to external third-party sources, whether they're within some sort of a partner ecosystem or whether it's a commercial distribution of that information. And our research shows that when organizations use external data, they derive even more benefits from the various analyses that they're conducting. And the last great frontier in my opinion on this expanding world of data is the world of driver-based planning. Very few of the major data platform providers provide these capabilities today. These are the types of things you would do in a spreadsheet. And we all know the issues associated with spreadsheets. They're hard to govern, they're error-prone. 
And so, if we can take that type of analysis, collecting the occupancy of a rental property, the projected rise in rental rates, the fluctuations perhaps in occupancy, the interest rates associated with financing that property, we can project forward. And that's a very common thing to do. What the income might look like from that property income, the expenses, we can plan and purchase things appropriately. So, I think, we need this broader purview and I'm beginning to see some of those things happen. And the evidence today I would say, is more focused around the metric stores and the feature stores starting to see vendors offer those capabilities. And we're starting to see the MLOps elements of managing the AI and ML models find their way closer to the data platforms as well. >> Very interesting. When I hear metrics, I think of KPIs, I think of data apps, orchestrate people and places and things to optimize around a set of KPIs. It sounds like a metadata challenge more... Somebody once predicted they'll have more metadata than data. Carl, what are your thoughts on this prediction? >> Yeah, I think that what Dave is describing as data derivatives is in a way, another word for what I was calling operational metadata, which is not about the data itself, but how it's used, where it came from, what the rules are governing it, and that kind of thing. If you have a rich enough set of those things, then not only can you do a model of how well your vacation property rental may do in terms of income, but also how well your application that's measuring that is doing for you. In other words, how many times have I used it, how much data have I used and what is the relationship between the data that I've used and the benefits that I've derived from using it? Well, we don't have ways of doing that. What's interesting to me is that folks in the content world are way ahead of us here, because they have always tracked their content using these kinds of attributes. 
Where did it come from? When was it created, when was it modified? Who modified it? And so on and so forth. We need to do more of that with the structured data that we have, so that we can track how it's used. And also, it tells us how well we're doing with it. Is it really benefiting us? Are we being efficient? Are there improvements in processes that we need to consider? Because maybe data gets created and then it isn't used or it gets used, but it gets altered in some way that actually misleads people. (laughs) So, we need the mechanisms to be able to do that. So, I would say that that's... And I'd say that it's true that we need that stuff. I think, that starting to expand is probably the right way to put it. It's going to be expanding for some time. I think, we're still a distance from having all that stuff really working together. >> Maybe we should say it's gestating. (Dave and Carl laughing) >> Sorry, if I may- >> Sanjeev, yeah, I was going to say this... Sanjeev, please comment. This sounds to me like it supports Zhamak Dehghani's principles, but please. >> Absolutely. So, whether we call it data mesh or not, I'm not getting into that conversation, (Dave chuckles) but data (audio breaking) (Tony laughing) everything that I'm hearing what Dave is saying, Carl, this is the year when data products will start to take off. I'm not saying they'll become mainstream. They may take a couple of years to become so, but this is data products, all this thing about vacation rentals and how is it doing, that data is coming from different sources. I'm packaging it into our data product. And to Carl's point, there's a whole operational metadata associated with it. The idea is for organizations to see things like developer productivity, how many releases am I doing of this? What data products are most popular? 
I'm actually right now in the process of formulating this concept that just like we had data catalogs, we are very soon going to be requiring a data products catalog. So, I can discover these data products. I'm not just creating data products left, right, and center. I need to know, do they already exist? What is the usage? If no one is using a data product, maybe I want to retire and save cost. But this is a data product. Now, there's an associated thing that is also getting debated quite a bit called data contracts. And a data contract to me is literally just a formalization of all these aspects of a product. How do you use it? What is the SLA on it, what is the quality that I am prescribing? So, data product, in my opinion, shifts the conversation to the consumers or to the business people. Up to this point when, Dave, you're talking about data and all of data discovery curation is very data producer-centric. So, I think, we'll see a shift more into the consumer space. >> Yeah. Dave, can I just jump in there just very quickly there, which is that what Sanjeev has been saying there, this is really central to what Zhamak has been talking about. It's basically about making, one, data products are about the lifecycle management of data. Metadata is just elemental to that. And essentially, one of the things that she calls for is making data products discoverable. That's exactly what Sanjeev was talking about. >> By the way, did everyone just notice how Sanjeev just snuck in another prediction there? So, we've got- >> Yeah. (group laughing) >> But you- >> Can we also say that he snuck in, I think, the term that we'll remember today, which is metadata museums. >> Yeah, but- >> Yeah. >> And also comment to, Tony, to your last year's prediction, you're really talking about it's not something that you're going to buy from a vendor. >> No. >> It's very specific >> Mm-hmm. >> to an organization, their own data product. So, touché on that one. Okay, last prediction. 
Let's bring them up. Doug Henschen, BI analytics is headed to embedding. What does that mean? >> Well, we all know that conventional BI dashboarding reporting is really commoditized from a vendor perspective. It never enjoyed truly mainstream adoption. Always that 25% of employees are really using these things. I'm seeing rising interest in embedding concise analytics at the point of decision or better still, using analytics as triggers for automation and workflows, and not even necessitating human interaction with visualizations, for example, if we have confidence in the analytics. So, leading companies are pushing for next generation applications, part of this low-code, no-code movement we've seen. And they want to build that decision support right into the app. So, the analytic is right there. Leading enterprise apps vendors, Salesforce, SAP, Microsoft, Oracle, they're all building smart apps with the analytics predictions, even recommendations built into these applications. And I think, the progressive BI analytics vendors are supporting this idea of driving insight to action, not necessarily necessitating humans interacting with it if there's confidence. So, we want prediction, we want embedding, we want automation. This low-code, no-code development movement is very important to bringing the analytics to where people are doing their work. We got to move beyond the, what I call swivel chair integration, between where people do their work and going off to separate reports and dashboards, and having to interpret and analyze before you can go back and take action. >> And Dave Menninger, today, if you want analytics or you want to absorb what's happening in the business, you typically got to go ask an expert, and then wait. So, what are your thoughts on Doug's prediction? >> I'm in total agreement with Doug. I'm going to say that collectively... So, how did we get here? I'm going to say collectively as an industry, we made a mistake. 
We made BI and analytics separate from the operational systems. Now, okay, it wasn't really a mistake. We were limited by the technology available at the time. Decades ago, we had to separate these two systems, so that the analytics didn't impact the operations. You don't want the operations preventing you from being able to do a transaction. But we've gone beyond that now. We can bring these two systems and worlds together and organizations recognize that need to change. As Doug said, the majority of the workforce and the majority of organizations doesn't have access to analytics. That's wrong. (chuckles) We've got to change that. And one of the ways that's going to change is with embedded analytics. 2/3 of organizations recognize that embedded analytics are important and it even ranks higher in importance than AI and ML in those organizations. So, it's interesting. This is a really important topic to the organizations that are consuming these technologies. The good news is it works. Organizations that have embraced embedded analytics are more comfortable with self-service than those that have not, as opposed to turning somebody loose, in the wild with the data. They're given a guided path to the data. And the research shows that 65% of organizations that have adopted embedded analytics are comfortable with self-service compared with just 40% of organizations that are turning people loose in an ad hoc way with the data. So, totally behind Doug's predictions. >> Can I just break in with something here, a comment on what Dave said about what Doug said, which (laughs) is that I totally agree with what you said about embedded analytics. And at IDC, we made a prediction in our future intelligence, future of intelligence service three years ago that this was going to happen. And the thing that we're waiting for is for developers to build... You have to write the applications to work that way. It just doesn't happen automagically. 
Developers have to write applications that reference analytic data and apply it while they're running. And that could involve simple things like complex queries against the live data, which is through something that I've been calling analytic transaction processing. Or it could be through something more sophisticated that involves AI operations as Doug has been suggesting, where the result is enacted pretty much automatically unless the scores are too low and you need to have a human being look at it. So, I think that that is definitely something we've been watching for. I'm not sure how soon it will come, because it seems to take a long time for people to change their thinking. But I think, as Dave was saying, once they do and they apply these principles in their application development, the rewards are great. >> Yeah, this is very much, I would say, very consistent with what we were talking about, I was talking about before, about basically rethinking the modern data stack and going into more of an end-to-end solution. I think, that what we're talking about clearly here is operational analytics. There'll still be a need for your data scientists to go offline just in their data lakes to do all that very exploratory and that deep modeling. But clearly, it just makes sense to bring operational analytics into where people work into their workspace and further flatten that modern data stack. >> But with all this metadata and all this intelligence, we're talking about injecting AI into applications, it does seem like we're entering a new era of not only data, but new era of apps. Today, most applications are about filling forms out or codifying processes and require a human input. 
And it seems like there's enough data now and enough intelligence in the system that the system can actually pull data from, whether it's the transaction system, e-commerce, the supply chain, ERP, and actually do something with that data without human involvement, present it to humans. Do you guys see this as a new frontier? >> I think, that's certainly- >> Very much so, but it's going to take a while, as Carl said. You have to design it, you have to get the prediction into the system, you have to get the analytics at the point of decision has to be relevant to that decision point. >> And I also recall basically a lot of the ERP vendors back like 10 years ago, were promising that. And the fact that we're still looking at the promises shows just how difficult, how much of a challenge it is to get to what Doug's saying. >> One element that could be applied in this case is (indistinct) architecture. If applications are developed that are event-driven rather than following the script or sequence that some programmer or designer had preconceived, then you'll have much more flexible applications. You can inject decisions at various points using this technology much more easily. It's a completely different way of writing applications. And it actually involves a lot more data, which is why we should all like it. (laughs) But in the end (Tony laughing) it's more stable, it's easier to manage, easier to maintain, and it's actually more efficient, which is the result of an MIT study from about 10 years ago, and still, we are not seeing this come to fruition in most business applications. >> And do you think it's going to require a new type of data platform database? Today, data's all far-flung. We see that's all over the clouds and at the edge. Today, you cache- >> We need a super cloud. >> You cache that data, you're throwing into memory. I mentioned, MySQL HeatWave. 
There are other examples where it's a brute force approach, but maybe we need new ways of laying data out on disk and new database architectures, and just when we thought we had it all figured out. >> Well, without referring to disk, which to my mind, is almost like talking about cave painting. I think, that (Dave laughing) all the things that have been mentioned by all of us today are elements of what I'm talking about. In other words, the whole improvement of the data mesh, the improvement of metadata across the board and improvement of the ability to track data and judge its freshness the way we judge the freshness of a melon or something like that, to determine whether we can still use it. Is it still good? That kind of thing. Bringing together data from multiple sources dynamically and real-time requires all the things we've been talking about. All the predictions that we've talked about today add up to elements that can make this happen. >> Well, guys, it's always tremendous to get these wonderful minds together and get your insights, and I love how it shapes the outcome here of the predictions, and let's see how we did. We're going to leave it there. I want to thank Sanjeev, Tony, Carl, David, and Doug. Really appreciate the collaboration and thought that you guys put into these sessions. Really, thank you. >> Thank you. >> Thanks, Dave. >> Thank you for having us. >> Thanks. >> Thank you. >> All right, this is Dave Vellante for theCUBE, signing off for now. Follow these guys on social media. Look for coverage on siliconangle.com, theCUBE.net. Thank you for watching. (upbeat music)
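
Dave Menninger's rental-property example of driver-based planning (occupancy, the projected rise in rental rates, the interest rates on financing, expenses) can be sketched in a few lines of Python. This is only a minimal illustration of the idea, not anything a panelist or vendor described: the `Drivers` class, the `project_net_income` function, the interest-only financing assumption, and every number below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Drivers:
    """Hypothetical planning drivers for one rental property (illustrative only)."""
    monthly_rent: float     # starting rent per month
    occupancy: float        # fraction of the year the unit is occupied, 0..1
    rent_growth: float      # projected annual rise in rental rates
    loan_balance: float     # outstanding financing on the property
    interest_rate: float    # annual interest rate on that financing
    annual_expenses: float  # taxes, upkeep, insurance


def project_net_income(d: Drivers, years: int) -> list[float]:
    """Project net income per year from the drivers.

    Assumes interest-only financing and flat expenses, purely to keep
    the sketch short.
    """
    results = []
    rent = d.monthly_rent
    for _ in range(years):
        gross = rent * 12 * d.occupancy              # rent actually collected
        interest = d.loan_balance * d.interest_rate  # cost of financing
        results.append(round(gross - interest - d.annual_expenses, 2))
        rent *= 1 + d.rent_growth                    # apply projected rent rise
    return results


if __name__ == "__main__":
    base = Drivers(2000.0, 0.9, 0.03, 200_000.0, 0.05, 4_000.0)
    print(project_net_income(base, 3))
```

Changing a single driver, such as occupancy or the interest rate, and re-running the projection is the kind of what-if analysis that is usually done in a spreadsheet; keeping the drivers in one governed structure is what would make the plan easier to audit.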

Published Date : Jan 11 2023

ENTITIES

Entity | Category | Confidence
Dave | PERSON | 0.99+
Doug Henschen | PERSON | 0.99+
Dave Menninger | PERSON | 0.99+
Doug | PERSON | 0.99+
Carl | PERSON | 0.99+
Carl Olofson | PERSON | 0.99+
Dave Menninger | PERSON | 0.99+
Tony Baer | PERSON | 0.99+
Tony | PERSON | 0.99+
Dave Valente | PERSON | 0.99+
Collibra | ORGANIZATION | 0.99+
Curt Monash | PERSON | 0.99+
Sanjeev Mohan | PERSON | 0.99+
Christian Kleinerman | PERSON | 0.99+
Dave Valente | PERSON | 0.99+
Walmart | ORGANIZATION | 0.99+
Microsoft | ORGANIZATION | 0.99+
AWS | ORGANIZATION | 0.99+
Sanjeev | PERSON | 0.99+
Constellation Research | ORGANIZATION | 0.99+
IBM | ORGANIZATION | 0.99+
Ventana Research | ORGANIZATION | 0.99+
2022 | DATE | 0.99+
Hazelcast | ORGANIZATION | 0.99+
Oracle | ORGANIZATION | 0.99+
Tony Bear | PERSON | 0.99+
25% | QUANTITY | 0.99+
2021 | DATE | 0.99+
last year | DATE | 0.99+
65% | QUANTITY | 0.99+
Google | ORGANIZATION | 0.99+
today | DATE | 0.99+
five-year | QUANTITY | 0.99+
TigerGraph | ORGANIZATION | 0.99+
Databricks | ORGANIZATION | 0.99+
two services | QUANTITY | 0.99+
Amazon | ORGANIZATION | 0.99+
David | PERSON | 0.99+
RisingWave Labs | ORGANIZATION | 0.99+