Jim Cushman, CPO, Collibra


 

>> From around the globe, it's theCUBE, covering Data Citizens '21. Brought to you by Collibra. >> We're back talking all things data at Data Citizens '21. My name is Dave Vellante and you're watching theCUBE's continuous coverage, virtual coverage, #DataCitizens21. I'm here with Jim Cushman, who is Collibra's Chief Product Officer and who shared the company's product vision at the event. Jim, welcome, good to see you. >> Thanks Dave, glad to be here. >> Now one of the themes of your session was all around self-service and access to data. This is a big, big point of discussion amongst organizations that we talk to. I wonder if you could speak a little more toward what that means for Collibra and your customers, and maybe some of the challenges of getting there. >> So Dave, our ultimate goal at Collibra has always been to enable self-service access for all customers. Now, one of the challenges is that they're limited in how they can access information, these knowledge workers. So our goal is to totally liberate them. And so, why is this important? Well, in and of itself, self-service liberates tens of millions of data-literate knowledge workers. This will drive more rapid, insightful decision-making, it'll drive productivity and competitiveness. And to make this level of adoption possible, the user experience has to be as intuitive as, say, retail shopping, like I mentioned in my previous bit, like you're buying shoes online. But this is a little bit of foreshadowing, and there's an even more profound future than just enabling self-service. We believe that a new class of shopper is coming online, and she may not be as data-literate as our knowledge worker of today. Think of her as an algorithm developer; she builds machine learning or AI. The engagement model for this user will be to kind of build automation, personalized experiences for people to engage with data. But in order to build that automation, she too needs data. Because she's not data literate, she needs the equivalent of a personal shopper. Someone that can guide her through the experience without actually having her know all the answers to the questions that would be asked. So this level of self-service goes one step further and becomes an automated service. One to really help find the best unbiased and labeled training data to help train an algorithm in the future. >> That's, okay please continue. >> No please, and so all of this self and automated service needs to be complemented with kind of a peace of mind that you're letting the right people gain access to it. So when you automate it, it's like, well, geez, are the right people getting access to this? So it has to be governed and secured. This can't become like the Wild Wild West, or like what we call a data flea market where, you know, data's everywhere. So, you know, history does quickly forget the companies that do not adjust to remain relevant. And I think we're in the midst of an exponential differentiation, and Collibra Data Intelligence Cloud is really kind of established to be the key catalyst for companies that will be on the winning side. >> Well, that's big because I mean, I'm a big believer in putting data in the hands of those folks in the line of business. And of course the big question that always comes up is, well, what about governance? What about security? So to the extent that you can federate that, that's huge. Because data is distributed by its very nature, it's going to stay that way. It's complex.
You have to make the technology work in that complex environment, which brings me to this idea of low code or no code. It's gaining a lot of momentum in the industry. Everybody's talking about it, but there are a lot of questions. You know, what can you actually expect from no code and low code, and who are the right, you know, potential users of that? Is there a difference between low and no? And so from your standpoint, why is this getting so much attention and why now, Jim? >> Well, if you go back even 25 years ago, we were talking about fourth- and fifth-generation languages that people were building. And it really didn't reach the total value that folks were looking for because it always fell short. And you'd say, listen, if you didn't do all the work it took to get to a certain point, how are you possibly going to finish it? And that's where the 4GLs and 5GLs fell short on capability. With our stuff, if you really want great self-service, how are you going to be self-service if it still requires somebody to write code? Well, I guess you could do it if the only self-service people are people who write code, but that's a pretty narrow group. So if you truly want the ability to have something show up at your front door, without you having to call somebody or make any effort to get it, then it needs to generate itself. The beauty of doing a catalog and governance, understanding all the data that is available for choice, is giving someone a selection based on objective criteria: this is the best choice because of its quality for what you want, or it's labeled, or it's unbiased, and it has that level of deterministic value to it, versus guessing, or subjectivity, or what my neighbor used, or what I used on my last job. Now that we've given people the power with confidence to say, this is the one that I want, the next step is, okay, can you deliver it to them without them having to write any code? So imagine being able to generate those instructions from everything that we have in our metadata repository to say, this is exactly the data I need you to go get, and perform what we call a distributed query against those data sets and bring it back to them. No code written. And here's the real beauty, Dave: pipeline development, data pipeline development, is a relatively expensive thing today, and that's why people spend a lot of money maintaining these pipelines. But imagine if there was zero cost to building your pipeline, would you spend any money to maintain it? Probably not. So if we can build it for no cost, then why maintain it? Just build it every time you need it. And then again, it's done on a self-service basis.
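To make that pipeline-generation idea a bit more concrete, here is a minimal sketch of how catalog metadata might be turned into a federated query with no hand-written pipeline code. The asset fields, the quality scoring, and the query shape are illustrative assumptions, not Collibra's actual data model or API.

```python
# Hypothetical sketch: turn catalog metadata into a federated query so the
# requester never hand-writes pipeline code. Names and fields are invented.
from dataclasses import dataclass

@dataclass
class CatalogAsset:
    source: str            # e.g. "snowflake_prod" or "s3_lake"
    table: str
    columns: list          # columns the catalog says this asset exposes
    quality_score: float   # 0-1, from automated data-quality checks
    certified: bool        # governance flag: approved for self-service use

def pick_best_asset(assets, required):
    """Select objectively: certified, has the needed columns, highest quality."""
    candidates = [a for a in assets
                  if a.certified and set(required) <= set(a.columns)]
    return max(candidates, key=lambda a: a.quality_score) if candidates else None

def generate_query(asset, required):
    """Emit the query text a federation engine would push down to the source."""
    return f"SELECT {', '.join(required)} FROM {asset.source}.{asset.table}"

catalog = [
    CatalogAsset("snowflake_prod", "sales.orders",
                 ["order_id", "amount", "region"], 0.97, True),
    CatalogAsset("s3_lake", "raw.orders_dump",
                 ["order_id", "amount"], 0.71, False),
]

best = pick_best_asset(catalog, ["order_id", "amount"])
if best:
    print(generate_query(best, ["order_id", "amount"]))
    # SELECT order_id, amount FROM snowflake_prod.sales.orders
```

Because the query is regenerated from current metadata each time it is needed, there is nothing to maintain when a source changes, which is the zero-cost-pipeline argument Cushman makes above.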
>> I really liked the way you're thinking about this, 'cause you're right. A lot of times when you hear self-service, it's about making the hardcore developers, you know, able to do self-service. But the reality is, and you talk about that data pipeline, it's complex. A business person is sitting there waiting for data, or wants to put in new data, and it turns out that the smallest unit is actually that entire team. And so you sit back and wait. And so to the extent that you can actually enable self-serve for the business by simplification, that's been the holy grail for a while, hasn't it? >> I agree. >> Let's dig a little bit into where you're placing your bets. I mean, you're head of products, you've got to make bets, you know, certainly many, many months if not years in advance. What are your big focus areas of investment right now? >> Yeah, certainly. So one of the things we've done very successfully since our origin over a decade ago was building business user-friendly software, and it was predominantly kind of a plumbing or infrastructure area. So, business users love working with our software. They can find what they're looking for and they don't need to have some cryptic key of how to work with it. They can think about things in their terms and use our business glossary, and they can navigate through what we call our data intelligence graph and find just what they're looking for. And we don't require a business to change everything just to make it happen. We give them kind of a universal translator to talk to the data. But with all that wonderful usability, the common compromise that you make is, well, it's only good up to a certain amount of information, kind of like Excel. You know, you can do almost anything with Excel, right? But when you get into large volumes, it becomes problematic, and now you need to, you know, go with a hardcore database and an application on top. So what the industry is pulling us towards is far greater amounts of data, not just millions or even tens of millions, but into the hundreds of millions and billions of things that we need to manage. So we have a huge focus on scale and performance on a global basis, and that's a mouthful, right? Not only are you dealing with large amounts and performance, but you have to do it in a global fashion and make it possible for somebody who might be operating in Southeast Asia to have the same experience with the environment as they would in Los Angeles. And the data needs to therefore go to the user, as opposed to having the user come to the data, as much as possible. So it really does put a lot of emphasis on some of what you call the non-functional requirements, also known as the "ilities," and so our ability to bring the data and handle those large enterprise-grade capabilities at scale and performance, globally, is what's really driving a good number of our investments today. >> I want to talk about data quality. This is a hard topic, but it's one that's so important. And I think it's been really challenging and somewhat misunderstood. When you think about the chief data officer role itself, it kind of emerged from these highly regulated industries. And it came out of data quality, kind of a back office role that's gone front and center and now is, you know, pretty strategic. Having said that, the, you know, prevailing philosophy is, okay, we've got to have this centralized data quality approach and it's going to be imposed throughout. And it really is a hard problem, and I think about, you know, these hyper-specialized roles, like, you know, the quality engineer and so forth. And again, the prevailing wisdom is, if I could centralize that, it can be lower cost and I can service these lines of business, when in reality the real value is, you know, speed. And so how are you thinking about data quality? You hear so much about it. Why is it such a big deal, and why is it so hard, and a priority in the marketplace? Your thoughts. >> Thanks for that. So we of course acquired a data quality company earlier this year, OwlDQ, and the big question is, okay, so why, why them and why now, not before? Well, at least a decade ago you started hearing people talk about big data.
It was probably around 2009 that it was becoming the big talk, and what we don't really talk about when we talk about this ever-expanding data is the byproduct: the velocity of data is increasing dramatically. So the speed at which new data is being presented, the way in which data is changing, is dramatic. And why is that important to data quality? Because data quality historically, for the last 30 years or so, has been a rules-based business where you analyze the data at a certain point in time and you write a rule for it. Now there's already room for error there because humans are involved in writing those rules, but now with the increased velocity, the likelihood that a rule is going to atrophy and become no longer valid or useful to you increases exponentially. So we were looking for a technology that was doing it in a new way, similar to the way that we do auto-classification when we're cataloging attributes: how do we look at millions of pieces of information around metadata and decide what it is, to put it into context? The ability to automatically generate these rules and then continuously adapt as data changes to adjust these rules is really a game changer for the industry itself. So we chose OwlDQ for that very reason. Not only did they have this really kind of modern architecture to automatically generate rules, but then to continuously monitor the data and adjust those rules, cutting out a huge amount of cost and clearing out rules that aren't helping you. And frankly, you know how this works: no one really complains about it until there's the squeaky wheel, you know, you get a fine or an exposure, and that's what is causing a lot of issues with data quality. And then why now? Well, I think, and this is my speculation, but there's so much movement of data to the cloud right now. And so anyone who's made big investments in data quality historically for their on-premise data warehouses, Netezzas, Teradatas, Oracles, et cetera, or even their data lakes, is now moving to the cloud. And they're saying, hmm, what investments are we going to carry forward that we had on premise? And which ones are we going to start anew? And data quality seems to be ripe for something new, and so these new investments in data in the cloud are now saying, let's look at a new, next-generation method of doing data quality. And that's where we're really fitting in nicely. And of course, finally, you can't really do data governance and cataloging without data quality, and data quality without data governance and cataloging is kind of a hollow long-term story. So the three working together is a very powerful story. >> I've got to ask you some Columbo questions about this, 'cause you know, you're right. It's rules-based, and so my, you know, immediate reaction is, okay, what are the rules around COVID or hybrid work, right? If there are static rules, there's so much unknown, and so what you're saying is you've got a dynamic process to do that. And one of my gripes about the whole big data thing, and you know, you referenced that 2009, 2010 timeframe, I loved it, because there were a lot of profound things about Hadoop and a lot of failings. And one of the challenges is really that there's no context in the big data system. You know, the folks in the data pipeline, they don't have the business context. So my question is, it sounds like you've got this awesome magic to automate, but who adjudicates the dynamic rules? Do humans play a role? What role do they play there?
>> Absolutely. There's the notion of sampling. So you can only trust a machine to a certain point before you want to have some type of a steward, or assisted or supervised learning, that goes on. So, you know, maybe one out of 10, one out of 20 rules that are generated, you might want to have somebody look at. There are ways to do the equivalent of supervised learning without actually paying the cost of the supervisor. Let's suppose that you've written a thousand rules for your system that are five years old. And we come in with our ability, and we analyze the same data and we generate rules ourselves. We compare the two, and there's absolutely going to be some exact matching, some overlap, that validates one another. And that gives you confidence that the machine learning did exactly what you did, and what's the likelihood that you guessed wrong and the machine learning guessed wrong in exactly the same way? That seems a pretty, pretty small concern. So now you're really saying, well, why are they different? And now you start to study the samples. And what we learned is that we're able to generate between 60 and 70% of these rules, and anytime we were different, we were right almost every single time. Only one out of a hundred times was it proven that the handwritten rule was the better outcome. And of course, it's machine learning, so it learned, and it caught up the next time. So that's the true power of this innovation: it learns from the data as well as the stewards, and it gives you confidence that you're not missing things, and you start to trust it. But you should never completely walk away. You should constantly do your periodic sampling. >> And the secret sauce is math. I mean, I remember back in the mid-2000s, it was like the 2006 timeframe, you mentioned, you know, auto-classification. That was a big problem with the federal rules of civil procedure, trying to figure out, okay, you know, you had humans classifying, and humans don't scale, until you had, you know, all kinds of support vector machines and probabilistic latent semantic indexing. But you didn't have the compute power or the data corpus to really do it well. So it sounds like a combination of, you know, cheaper compute, a lot more data and machine intelligence have really changed the game there. Is that a fair assumption? >> That's absolutely fair. I think the other aspect to keep in mind is that it's an innovative technology that actually brings all that compute as close to the data as possible. One of the greatest expenses of doing data quality was, of course, the profiling concept: building up the statistics of what the data represents. And in most traditional senses, that data is completely pulled out of the database itself into a separate area, and now you start talking about terabytes or petabytes of data; it takes a long time to extract that much information from a database and then to process through it all. Imagine bringing that profiling closer to the database, what's happening natively in the same space as the data; that cuts out like 90% of the unnecessary processing time. It also gives you the ability to do it incrementally. So you're not doing a full analysis each time. You have kind of an expensive play when you're first looking at a full database, and then maybe over the course of a day, an hour, 15 minutes, you've only seen a small segment of change. So now it feels more like a transactional analysis process.
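As a concrete illustration of the rule-validation loop Cushman describes above, here is a minimal sketch: machine-generated rules are compared against an existing handwritten rule set, the overlap builds confidence, and a sample of the disagreements is routed to a steward. The rule format and the one-in-ten sampling rate are illustrative assumptions, not the OwlDQ or Collibra implementation.

```python
import random

# Hypothetical rule format: (column, check). Real products express rules
# very differently; this only illustrates the comparison-and-sampling workflow.
handwritten = {("email", "not_null"), ("age", "between 0 and 120"),
               ("country", "in ISO-3166 list")}
generated   = {("email", "not_null"), ("age", "between 18 and 99"),
               ("signup_ts", "not in the future")}

agreed = handwritten & generated          # overlap validates both rule sets
disagreements = handwritten ^ generated   # stale human rules or new machine coverage

print(f"{len(agreed)} rules agree, {len(disagreements)} differ")

# Supervised sampling: review roughly 1 in 10 disagreements instead of all of
# them, so stewards stay in the loop without paying the full supervision cost.
for rule in sorted(disagreements):
    if random.random() < 0.10:
        print("route to steward for review:", rule)
```

The steward's decisions then become labels the model can learn from, which is why, in the account above, the generated rules tend to catch up wherever they initially disagreed.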
>> Yeah, and that's, you know, again, we talked about the old days of big data, you know, the Hadoop days, and what was profound was that it was all about bringing five megabytes of code to a petabyte of data. But that didn't happen; we shoved it all into a central data lake. I'm really excited for Collibra. It sounds like you guys are really on the cutting edge and doing some really interesting things. I'll give you the last word, Jim, please bring us home. >> Yeah, thanks Dave. So one of the really exciting things about our solution is that it's trying to be a combination of best-of-breed capabilities, but also integrated. So to actually create the full and complete story that customers are looking for, you don't want to have them worry about a complex integration, or trying to manage multiple vendors and the timing of their releases, et cetera. If you can find one solution where you don't have to say, well, that's good enough, but where every single component is in fact the best of breed that you can find, and it's integrated, and they'll manage it as a service, you truly unlock the power of the data-literate individuals in your organization. And again, that goes back to our overall goal. How do we empower the hundreds of millions of people around the world who are just looking to make insightful decisions? Today they feel completely locked out; it's as if they're looking for information before the internet, and they're kind of limited to whatever their local library has. If we can truly become somewhat like the internet of data, where we make it possible for anyone to access it, but we still govern it and secure it for privacy laws, I think we do have a chance to change the world for the better. >> Great. Thank you so much, Jim. Great conversation, really appreciate your time and your insights. >> Yeah, thank you, Dave. Appreciate it. >> All right, and thank you for watching theCUBE's continuous coverage of Data Citizens '21. My name is Dave Vellante. Keep it right there for more great content. (upbeat music)

Published Date : Jun 17 2021


Breaking Analysis: Five Questions About Snowflake’s Pending IPO


 

>> From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is Breaking Analysis with Dave Vellante. >> In June of this year, Snowflake filed a confidential document suggesting that it would do an IPO. Now of course, everybody found out about it, and it had a $20 billion valuation attached to it. So, many in the community, the investment community and so forth, are excited about this IPO. It could be the hottest one of the year, and we're getting a number of questions from investors and practitioners and the entire Wikibon, ETR and CUBE community. So, welcome everybody. This is Dave Vellante. This is "CUBE Insights" powered by ETR. In this breaking analysis, we're going to unpack five critical questions around Snowflake's IPO, or pending IPO. And with me to discuss that is Erik Bradley. He's the Chief Engagement Strategist at ETR and he's also the Managing Director of VENN. Erik, thanks for coming on and great to see you as always. >> Great to see you too. Always enjoy being on the show. Thank you. >> Now for those of you who don't know Erik, VENN is a roundtable that he hosts where he brings in CIOs, IT practitioners, CSOs, data experts, and they have an open and frank conversation, but it's private to ETR clients. They know who the individual is, what their role is, what their title is, et cetera, and it's kind of an ask-me-anything. And I participated in one of them this past week. Outstanding. And we're going to share with you some of that. But let's bring up the agenda slide if we can here. And these are really some of the questions that we're getting from investors and others in the community. There's really five areas that we want to address. The first is, what's happening in this enterprise data warehouse marketplace? The second is kind of a 1A to that: what about the legacy EDW players like Oracle and Teradata and Netezza? The third question we get a lot is, can Snowflake compete with the big cloud players, Amazon, Google, Microsoft? I mean, they're right there in the heart, in the thick of things there. And then, what about that multi-cloud strategy? Is that viable? How much of a differentiator is that? And then we get a lot of questions on the TAM, meaning the total available market. How big is that market? Does it justify the valuation for Snowflake? Now, Erik, you've been doing this now. You've run a couple of VENNs, you've been following this, you've done some other work with Eagle Alpha. What's your, just your initial sort of takeaway from all this work that you've been doing? >> Yeah, sure. So my first take on Snowflake was about two and a half years ago. I actually hosted them for one of my VENN interviews, and my initial thought was impressed. So impressed. They were talking at the time about their ability to kind of make a multi-cloud strategy easy to use. At the time, although I was impressed, I did not expect the growth, the hyper growth, that we have seen now. But, looking at the company in its current iteration, I understand where the hype is coming from. I mean, it's a 12 and a half billion dollar private valuation in the last round, the least confidential IPO (laughs) anyone's ever seen (Dave laughs), with a 15 to $20 billion valuation coming out, which is more than Teradata, Mongo and Cloudera combined. It's a great question. So obviously the success to this point is warranted, but we need to see what they're going to be able to do next.
So I think the agenda you laid out is a great one, and I'm looking forward to getting into some of those details. >> So let's start with what's happening in the marketplace, and let's pull up a slide that I very much love to use. It's the classic X-Y. On the vertical axis here we show net score. And remember folks, net score is an indicator of spending momentum. ETR every quarter does, like clockwork, a survey where they're asking people, essentially, are you spending more or less? They subtract the less from the more and come up with a net score. It's more complicated than that, but like NPS, it's a very simple and reliable methodology. That's the vertical axis. And the horizontal axis is what's called market share. Market share is the pervasiveness within the data set. So it's calculated by the number of mentions of the vendor divided by the number of mentions within that sector. And what we're showing here is the EDW sector. And we've pulled out a few companies that I want to talk about. So the big three, obviously Microsoft, AWS and Google. And you can see Microsoft has a huge presence, far to the right. AWS, very, very strong, a lot of Redshift in there, and they're pretty high on the vertical axis. And then Google, not as much share, but very solid, close to a 60% net score. And then you can see, above all of them from a vertical standpoint, is Snowflake with a 77.5% net score. You can see them in the upper right there in the green. One of the highest, Erik, in the entire data set. So, let's start with some sort of initial comments on the big guys and Snowflake. Your thoughts? >> Sure. Just first of all, to comment on the data, what we're showing there is just the data warehousing sector, but Snowflake's actual net score is that high amongst the entire universe that we follow. Their data strength is unprecedented, and this is forward-looking spending intention, so it bodes very well for them. Now, what you did say very accurately is there's a difference between their spending intentions and a net revenue level compared to AWS and Microsoft. No one's saying that this is an apples-to-apples comparison when it comes to actual revenue. So we have to be very cognizant of that. There is domination (laughs), quite frankly, from AWS and from Azure. And Snowflake is a necessary component for them, not only to help facilitate multi-cloud, but look at what's happening right now in the US Congress, right? We have these tech leaders being grilled on their actual dominance, and one of the main concerns is the amount of data that they're collecting. So I think the environment is right to have another player like this. I think Snowflake really has a lot of longevity, and our data is supporting that. And the commentary that we hear from our end users, the people that take the survey, is supporting that as well.
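To make the two survey metrics concrete, here is the basic arithmetic as Vellante describes it. The respondent counts below are invented for illustration, and ETR's actual methodology has more nuance than this simple sketch.

```python
# Illustrative numbers only: suppose a vendor is cited by 200 respondents
# in the EDW sector, out of 500 total vendor citations in that sector.
spending_more = 130
spending_flat = 50
spending_less = 20
vendor_mentions = spending_more + spending_flat + spending_less   # 200

# Net score: percent reporting more spend minus percent reporting less.
net_score = (spending_more - spending_less) / vendor_mentions * 100   # 55.0

# "Market share" in ETR's sense: pervasiveness, i.e. the vendor's share
# of mentions within the sector, not revenue share.
sector_mentions = 500
market_share = vendor_mentions / sector_mentions * 100                # 40.0

print(f"net score = {net_score:.1f}%, market share = {market_share:.1f}%")
```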
>> Okay, and then let's stay on this X-Y slide for a moment. I want to just pull out a couple of other comments here, because one of the questions we're asking is, whither the legacy EDW players? So we've got in here IBM, Oracle, you can see Teradata, and then Hortonworks and MapR. We're going to talk a little bit about Hortonworks 'cause it's now Cloudera. We're going to talk a little bit about Hadoop and some of the data lakes. So you can see there, they don't have nearly the net score momentum. Oracle obviously has a huge install base and is investing, quite frankly, in R&D and in Exadata, and it has its own cloud. So, it's got a lock on its customers, and if it keeps investing and adding value, it's not going away. IBM with Netezza, there have really been some questions around their commitment to that base. And I know that a lot of the folks in the VENNs that we've talked to, Erik, have said, "Well, we're replacing Netezza." Frank Slootman has been very vocal about going after Teradata. And then we're going to talk a little bit about the Hadoop space. But, can you summarize for us your thoughts, in your research and the commentary from your community, what's going on with the legacy guys? Are these guys cooked? Can they hang on? What's your take? >> Sure. We focus on this quite a bit, actually. So, I'm going to talk about it from the data perspective first, and then we'll go into some of the commentary from the panels. You even joined one yesterday, so you know it was touched upon. But, first on the data side, what we're noticing and capturing is a widening bifurcation between the cloud-native and the legacy on-prem players. It is undeniable. There is nothing that you can really refute. The data is concrete and it is getting worse. That gap is getting wider and wider and wider. Now, the one thing I will say is, nobody's going to rip out their legacy applications tomorrow. It takes years and years. So when you look at Teradata, right, their market cap's only 2 billion, 2.3 billion. How much revenue growth do they need to stay where they are? Not much, right? No one's expecting them to grow 20%, which is what you're seeing on the left side of that screen. So when you look at the legacy versus the cloud native, there is a very clear direction of what's happening. The one thing I would note from the data perspective is, if you switch from net score or adoptions and you go to flat spending, you suddenly see Oracle and Teradata move over to the left a little bit, because again, what I'm trying to say is, I don't think they're going to catch up, but I also don't think they're going away tomorrow. These have large install bases, they have relationships. Now, to kind of get into what you were saying about each particular one: IBM, they shut down Netezza. They shut it down and then they brought it back to life. How does that make you feel if you're the head of data architecture, or you're DevOps, and you're trying to build an application for a large company? I'm not going back to that. There's absolutely no way. Teradata, on the other hand, is known to be incredibly stable. They are known to just not fail. If you need to kind of re-architect or you do a migration, they work. Teradata also has a lot of compliance built in. So if you're in financials, if you have a regulated business or industry, there are still some data sets that you're not going to move up to the cloud. Whether it's PII compliance or financial reasons, some of that stuff is still going to live on-prem. So Teradata still has a very good niche. And from what we're hearing from our panels, and this is a direct quote, if you don't mind me looking off screen for one second, but this is a great one, one of them basically said, "Teradata is the only one from the legacy camp who is putting up a fight and not giving up." Basically, from a CIO perspective, the rest of them aren't an option anymore. But Teradata is still fighting, and that's great to hear. They have their own data-as-a-service offering, and listen, they're a small market cap compared to these other companies we're talking about. But, to summarize, the data is very clear.
There is a widening bifurcation between the two camps. I do not think legacy will catch up. I think all net new workloads are moving to data as a service, moving to cloud native, moving to hosted, but there are still going to be some existing legacy on-prem applications that will be supported with these older databases. And of those, Oracle and Teradata are still viable options. >> I totally agree with you, and my colleague David Floyd is actually quite high on Teradata Vantage, because he really does believe that a key component, we're going to talk about the TAM in a minute, but a key component of the TAM, he believes, must include the on-premises workloads. And Frank Slootman has been very clear: "We're not doing on-prem, we're not doing this halfway house." And so that's an opportunity for companies like Teradata. Certainly Oracle, I would put in that camp, is putting up a fight. Vertica is another one. They're very small, but another one that's sort of battling it out from the old MPP world. But that's great. Let's go into some of the specifics. Let's bring up here some of the specific commentary that we've curated from the roundtables. I'm going to go through these and then ask you to comment. The first one is just, I mean, people are obviously very excited about Snowflake. It's easy to use, the whole thing, zero to Snowflake in 90 minutes, but Snowflake is synonymous with cloud-native data warehousing. There are no equals. We heard that a lot from your VENN panelists. >> We certainly did. There was even more euphoria around Snowflake than I expected when we started hosting this series of data warehousing panels. And that particular gentleman who said that happens to be the global head of data architecture for a fortune 100 financials company. And you mentioned earlier that we did a report alongside Eagle Alpha. And we noticed that among fortune 100 companies that are also using the big three public cloud companies, Snowflake is growing market share faster than anyone else. They are positioned in a way where even if you're aligned with Azure, even if you're aligned with AWS, if you're a large company, they are gaining share right now. So that particular gentleman's comment was very interesting. He also made a comment that said, "Snowflake is the one who championed the idea that data warehousing is not dead yet," to use that old Monty Python line, "I'm not dead yet." Back in the day when Hadoop came along and the data lakes turned into data swamps, everyone said, "We don't need warehousing anymore." Well, that turned out to be a head fake, right? Hadoop was an interesting technology, but it's a complex technology, and it ended up not really working the way people wanted it to. I think Snowflake came in at that point, at an opportune time, and said, "No, data warehousing isn't dead. We just have to separate the compute from the storage layer, and look at what I can do. That increases flexibility, security. It gives you that ability to run across multi-cloud." So honestly, the commentary has been nothing but positive. We can get into some of the commentary about people thinking that there's competition catching up to what they do, but there is no doubt that right now Snowflake is the name when it comes to data as a service. >> The other thing we heard a lot was ETL is going to get completely disrupted; you sort of embedded ETL. You heard one panelist say, "Well, it's interesting to see that guys like Informatica are talking about how fast they can run inside of Snowflake."
But Snowflake is making that easy. That data prep is sort of part of the package. And so that does not bode well for ETL vendors. >> It does not, right? So ETL is a legacy of on-prem databases, and even when Hadoop came along, it still needed that extra layer to kind of work with the data. But this is really, really disrupting them. Now, to Snowflake's credit, they partner well. All the ETL players are partnered with Snowflake; they're trying to play nice with them. But the writing's on the wall: as more and more of these applications and workloads move to the cloud, you don't need the ETL layer. Now, obviously that's going to affect Talend and Informatica the most. We had a recent comment, this was a CIO, who basically said, "The most telling thing about the ETL players right now is every time you speak to them, all they talk about is how they work in a Snowflake architecture. That's the only metric they talk about right now." And he said, "That's very telling," that they basically treat being part of Snowflake as their existential identity. If they're not, they don't exist anymore. So it was interesting to have sort of a philosophical comment brought up in one of my roundtables. But that's how important playing nice and finding a niche within this new data as a service is for ETL. To be quite honest, they might be going the same way of, okay, let's figure out our niche on the on-prem workloads that are still there. I think over time we might see them maybe as an M&A possibility, whether it's Snowflake or one of these new up-and-comers kind of bringing them in and sort of taking some of the technology that's useful and layering it in. But as a large market cap, solo existing niche, I just don't know how long ETL is for this world. >> Now, yeah. I mean, you're right, and it isn't just the marketing. >> No. >> There really are some challenges there. Now, there were some contrarians in the panel, and they signaled some potential icebergs ahead. And I guarantee you're going to see this in Snowflake's red herring when we actually get it. Like, we're going to see all the risks. One of the comments, I'll mention the two and then we can talk about it. "Their engineering advantage will fade over time." Essentially saying that people are going to copycat, and we've seen that. And the other point is, "Hey, we might see some similar things to what happened to Hadoop," the public cloud players giving away these offerings at zero cost. Essentially the marginal cost of adding another service is near zero, so the cloud players will use their heft to compete. Your thoughts? >> Yeah, first of all, one of the reasons I love doing panels, right? We had three gentlemen on this panel that had nothing but wonderful things to say, but you always get one. And this particular person is a CTO of a well-known online public travel agency; we'll put it that way. And he said, "I'm going to be the contrarian here. I have seven different technologies from private companies that do the same thing that I'm evaluating." So that's the pressure from behind, right? The technology, they're going to catch up. Right now Snowflake has the best engineering, which, interestingly enough, they took a lot of that engineering from IBM and Teradata if you actually go back and look at it, which was brought up in our panel as well. He said, "However, the engineering will catch up. They always do."
Now from the other side, they're getting squeezed, because the big cloud players just say, "Hey, we can do this too. I can bundle it with all the other services I'm giving you and I can squeeze you on price, pretty much give it away at cost." So I do think that there is a very valid concern. When you come out with a $20 billion IPO valuation, you need to warrant that. And when you see competitive pressures from both sides, from private emerging technologies and from the more dominant public cloud players, you're going to get squeezed there a little bit. And if pricing gets squeezed, it's going to be very, very important for Snowflake to continue to innovate. That comment you brought up about possibly being the next Cloudera was certainly the best sound bite that I got, and I'm going to use it as clickbait in future articles, because I think everyone who starts looking to buy Snowflake stock and sees that is going to need to take a look. But I would take that with a grain of salt. I don't think that's happening anytime soon, but what that particular CTO was referring to was, if you don't innovate, the technology itself will become commoditized. And he believes that this technology will become commoditized. So therefore Snowflake has to continue to innovate. They have to find other layers to bring in, whether that's through their massive war chest of cash they're about to have and M&A, whether that's them buying an analytics company, whether that's them buying an ETL layer. Finding a way to provide more value as they move forward is going to be very important for them to justify this valuation going forward. >> And I want to comment on that. The Cloudera, Hortonworks, MapRs, Hadoop, et cetera. I mean, there are dramatic differences, obviously. I mean, that whole space was so hard, very difficult to stand up. You needed science project guys in lab coats to do it. It was very services intensive. As well, companies like Cloudera had to fund all these open source projects and it really squeezed their R&D. I think Snowflake is much more focused, and you mentioned some of the background of their engineers, of course Oracle guys as well. However, you will see Amazon's going to trot out a ton of customers using their RA3 managed storage and their flash, I think it's the DC2 piece. They have a ton of action in the marketplace because it's just so easy. It's interesting, one of the comments, you asked this yesterday, was with regard to separating compute from storage, which of course Snowflake basically invented; it was one of their claims to fame. The comment was that what AWS has done to separate compute from storage for Redshift is largely a bolt-on, which I thought was an interesting comment. I've had some other comments. My friend George Gilbert said, "Hey, despite claims to the contrary, AWS still hasn't separated storage from compute. What they have is really primitive." We've got to dig into that some more, but you're seeing some data points that suggest there's copycatting going on. May not be as functional, but at the same time, Erik, like I was saying, good enough is maybe good enough in this space.
But I want to get back to the comment you mentioned too about that particular gentleman who made that comment about RedShift, their separation is really more of a bolt on than a true offering. It's interesting because I know who these people are behind the scenes and he has a very strong relationship with AWS. So it was interesting to me that in the panel yesterday he said he switched from Redshift to Snowflake because of that and some other functionality issues. So there is no doubt from the end users that are buying this. And he's again a fortune 100 financial organization. Not the same one we mentioned. That's a different one. But again, a fortune 100 well known financials organization. He switched from AWS to Snowflake. So there is no doubt that right now they have the technological lead. And when you look at our ETR data platform, we have that adoption reasoning slide that you show. When you look at the number one reason that people are adopting Snowflake is their feature set of technological lead. They have that lead now. They have to maintain it. Now, another thing to bring up on this to think about is when you have large data sets like this, and as we're moving forward, you need to have machine learning capabilities layered into it, right? So they need to make sure that they're playing nicely with that. And now you could go open source with the Apache suite, but Google is doing so well with BigQuery and so well with their machine learning aspects. And although they don't speak enterprise well, they don't sell to the enterprise well, that's changing. I think they're somebody to really keep an eye on because their machine learning capabilities that are layered into the BigQuery are impressive. Now, of course, Microsoft Azure has Databricks. They're layering that in, but this is an area where I think you're going to see maybe what's next. You have to have machine learning capabilities out of the box if you're going to do data as a service. Right now Snowflake doesn't really have that. Some of the other ones do. So I had one of my guest panelist basically say to me, because of that, they ended up going with Google BigQuery because he was able to run a machine learning algorithm within hours of getting set up. Within hours. And he said that that kind of capability out of the box is what people are going to have to use going forward. So that's another thing we should dive into a little bit more. >> Let's get into that right now. Let's bring up the next slide which shows net score. Remember this is spending momentum across the major cloud players and plus Snowflake. So you've got Snowflake on the left, Google, AWS and Microsoft. And it's showing three survey timeframes last October, April 20, which is right in the middle of the pandemic. And then the most recent survey which has just taken place this month in July. And you can see Snowflake very, very high scores. Actually improving from the last October survey. Google, lower net scores, but still very strong. Want to come back to that and pick up on your comments. AWS dipping a little bit. I think what's happening here, we saw this yesterday with AWS's results. 30% growth. Awesome. Slight miss on the revenue side for AWS, but look, I mean massive. And they're so exposed to so many industries. So some of their industries have been pretty hard hit. Microsoft pretty interesting. A little softness there. 
But one of the things I wanted to pick up on, Erik, when you're talking about Google and BigQuery and its ML out of the box, was what we heard from a lot of the VENN participants. There's no question about it that Google, technically, I would say is one of Snowflake's biggest competitors, because it's cloud native. Remember, >> Yep. >> AWS did a license deal one time, a license deal with ParAccel, and had to sort of refactor the thing to be cloud native. And of course we know what's happening with Microsoft. They basically were on-prem, and then they put stuff in the cloud, and then all the updates happen in the cloud, and then they push to on-prem. But they have what Frank Slootman calls that halfway house. But BigQuery, no question, technically is very, very solid. But again, you see Snowflake right now, anyway, outpacing these guys in terms of momentum. >> Snowflake is outpacing everyone (laughs) across our entire survey universe. It really is impressive to see. And one of the things that they have going for them is they can connect all three. It's that multi-cloud ability, right? That portability that they bring to you is such an important piece for today's modern CIOs and data architects. They don't want vendor lock-in. They are afraid of vendor lock-in. And this ability to make their data portable, and to do that with the ease and the flexibility that they offer, is a huge advantage right now. However, I think you're a hundred percent right. Google has been so focused on the engineering side and never really focused on the enterprise sales side. That is why they're playing catch up. I think they can catch up. They're bringing in some really important enterprise salespeople with experience. They're starting to learn how to talk to the enterprise, how to sell, how to support. And nobody can really doubt their engineering. How many open source projects have they given us, right? They invented Kubernetes and the entire container space. No one's really going to compete with them on that side if they learn how to sell it and support it. Yeah, right now they're behind. They're a distant third. Don't get me wrong. From a pure hosting ability, AWS is number one. Microsoft Azure, sometimes it looks like it's number one, but you have to recognize that a lot of that is simply because of their hosted 365. It's a SaaS app. It's not a true cloud type of infrastructure as a service. But Google, a distant third, but their technology is really, really great, and their ability to catch up is there. And like you said, in the panels we were hearing a lot about their machine learning capability right out of the box. And that's where this is going. What's the point of having this huge data set if you're not going to be supporting it on new application architectures? And all of those applications require machine learning. >> Awesome. And I totally agree with what you're saying about Google. They just haven't figured out how to sell to the enterprise yet. And a hundred percent, AWS has the best cloud. I mean, hands down. But a very, very competitive market, as we heard yesterday in front of Congress. Now we're on the point about, can Snowflake compete with the big cloud players? I want to show one more data point. So let's bring up, this is the same chart as we showed before, but it's new adoptions. And this is really telling. >> Yeah. >> You can see Snowflake with 34% in the yellow, new adoptions, down, yes, from previous surveys, but still significantly higher than the other players.
Interesting to see Google showing momentum on new adoptions, AWS down on new adoptions. And again, exposed to a lot of industries that have been hard hit. And Microsoft actually quite low on new adoption. So this is very impressive for Snowflake. And I want to talk about the multi-cloud strategy now Erik. This came up a lot. The VENN participants who are sort of fans of Snowflake said three things: It was really the flexibility, the security which is really interesting to me. And a lot of that had to do with the flexibility. The ability to easily set up roles and not have to waste a lot of time wrangling. And then the third was multi-cloud. And that was really something that came through heavily in the VENN. Didn't it? >> It really did. And again, I think it just comes down to, I don't think you can ever overstate how afraid these guys are of vendor lock-in. They can't have it. They don't want it. And it's best practice to make sure your sensitive information is being kind of spread out a little bit. We all know that people don't trust Bezos. So if you're in certain industries, you're not going to use AWS at all, right? So yeah, this ability to have your data portability through multi-cloud is the number one reason I think people start looking at Snowflake. And to go to your point about the adoptions, it's very telling and it bodes well for them going forward. Most of the things that we're seeing right now are net new workloads. So let's go again back to the legacy side that we were talking about, the Teradatas, IBMs, Oracles. They still have the monolithic applications and the data that needs to support that, right? Like an old ERP type of thing. But anyone who's now building a new application, bringing something new to market, it's all net new workloads. There is no net new workload that is going to go to SAP or IBM. It's not going to happen. The net new workloads are going to the cloud. And that's why when you switch from net score to adoption, you see Snowflake really stand out because this is about new adoption for net new workloads. And that's really where they're driving everything. So I would just say that as this continues, as data as a service continues, I think Snowflake's only going to gain more and more share for all the reasons you stated. Now get back to your comment about security. I was shocked by that. I really was. I did not expect these guys to say, "Oh, no. Snowflake enterprise security not a concern." So two panels ago, a gentleman from a fortune 100 financials said, "Listen, it's very difficult to get us to sign off on something for security. Snowflake is past it, it is enterprise ready, and we are going full steam ahead." Once they got that go ahead, there was no turning back. We gave it to our DevOps guys, we gave it to everyone and said, "Run with it." So, when a company that's big, I believe their fortune rank is 28. (laughs) So when a company that big says, "Yeah, you've got the green light. That we were okay with the internal compliance aspect, we're okay with the security aspect, this gives us multi-cloud portability, this gives us flexibility, ease of use." Honestly there's a really long runway ahead for Snowflake. >> Yeah, so the big question I have around the multi-cloud piece and I totally and I've been on record saying, "Look, if you're going looking for an agnostic multi-cloud, you're probably not going to go with the cloud vendor." 
(laughs) But I've also said that I think multi-cloud to date, anyway, has largely been a symptom as opposed to a strategy, but that's changing. But to your point about lock-in, and also I think people are maybe looking at doing things across clouds, I think that certainly it expands Snowflake's TAM, and we're going to talk about that, because they support multiple clouds and they're going to be the best at that. That's a mandate for them. The question I have is, how much complex joining are you going to be doing across clouds? And is that something that is just going to be too latency intensive? Is that really Snowflake's expertise? You're really trying to build that data layer. You're probably going to maybe use some kind of Postgres database for that. >> Right. >> I don't know. I need to dig into that, but that would be an opportunity from a TAM standpoint. I just don't know how real that is. >> Yeah, unfortunately I'm going to just be honest with this one. I don't think I have great expertise there, and I wouldn't want to lead anyone in a wrong direction. But from what I've heard from some of my VENN interview subjects, this is happening. So the data portability needs to be agnostic to the cloud. I do think that when you're asking, are there going to be real complex kinds of workloads and applications, yes, the answer is yes. And I think a lot of that has to do with some of the container architecture as well, right? If I can just pull data from one spot, spin it up for as long as I need, and then just get rid of that container, that ethereal layer of compute, it doesn't matter where the cloud lies. It really doesn't. I do think that multi-cloud is the way of the future. I know that the container workloads right now in the enterprise are still very small. I've heard people say, like, "Yeah, I'm kicking the tires. We've got 5%." That's going to grow. And if Snowflake can make themselves an integral part of that, then yes. I think that's one of those things where, I remember the guy said, "Snowflake has to continue to innovate. They have to find a way to grow this TAM." This is an area where they can do so. I think you're right about that, but as far as my expertise, on this one I'm going to be honest with you and say I don't want to answer incorrectly. So you and I need to dig in a little bit on this one. >> Yeah, as it relates to question four, what's the viability of Snowflake's multi-cloud strategy? I'll say unquestionably, supporting multiple clouds: very viable. Whether or not portability across clouds, multi-cloud joins, et cetera: TBD. So we'll keep digging into that. The last thing I want to focus on here is the last question, does Snowflake's TAM justify its $20 billion valuation? And you think about the data pipeline. You go from data acquisition to data prep. I mean, that really is where Snowflake shines. And then of course there's analysis. You've got to bring in BI or AI and ML tools. That's not Snowflake's strength. And then you're obviously preparing that, serving that up to the business, visualization. So there are potential adjacencies that they could get into, that they may or may not decide to. But so we put together this next chart, which is kind of the TAM expansion opportunity. And I just want to briefly go through it. We published this stuff, so you can go and look at all the fine print, but it kind of starts with the data lake disruption. You called it the data swamp before. The Hadoop, no schema on write, right?
Basically the ROI of Hadoop became reduction of investment as my friend Abby Meadow would say. But so they're kind of disrupting that data lake which really was a failure. And then really going after that enterprise data warehouse which is kind of I have it here as a 10 billion. It's actually bigger than that. It's probably more like a $20 billion market. I'll update this slide. And then really what Snowflake is trying to do is be data as a service. A data layer across data stores, across clouds, really make it easy to ingest and prepare data and then serve the business with insights. And then ultimately this huge TAM around automated decision making, real-time analytics, automated business processes. I mean, that is potentially an enormous market. We got a couple of hundred billion. I mean, just huge. Your thoughts on their TAM? >> I agree. I'm not worried about their TAM and one of the reasons why as I mentioned before, they are coming out with a whole lot of cash. (laughs) This is going to be a red hot IPO. They are going to have a lot of money to spend. And look at their management team. Who is leading the way? A very successful, wise, intelligent, acquisitive type of CEO. I think there is going to be M&A activity, and I believe that M&A activity is going to be 100% for the mindset of growing their TAM. The entire world is moving to data as a service. So let's take as a backdrop. I'm going to go back to the panel we did yesterday. The first question we asked was, there was an understanding or a theory that when the virus pandemic hit, people wouldn't be taking on any sort of net new architecture. They're like, "Okay, I have Teradata, I have IBM. Let's just make sure the lights are on. Let's stick with it." Every single person I've asked, they're just now eight different experts, said to us, "Oh, no. Oh, no, no." There is the virus pandemic, the shift from work from home. Everything we're seeing right now has only accelerated and advanced our data as a service strategy in the cloud. We are building for scale, adopting cloud for data initiatives. So, across the board they have a great backdrop. So that's going to only continue, right? This is very new. We're in the early innings of this. So for their TAM, that's great because that's the core of what they do. Now on top of it you mentioned the type of things about, yeah, right now they don't have great machine learning. That could easily be acquired and built in. Right now they don't have an analytics layer. I for one would love to see these guys talk to Alteryx. Alteryx is red hot. We're seeing great data and great feedback on them. If they could do that business intelligence, that analytics layer on top of it, the entire suite as a service, I mean, come on. (laughs) Their TAM is expanding in my opinion. >> Yeah, your point about their leadership is right on. And I interviewed Frank Slootman right in the heart of the pandemic >> So impressed. >> and he said, "I'm investing in engineering almost sight unseen. More circumspect around sales." But I will caution people. That a lot of people I think see what Slootman did with ServiceNow. And he came into ServiceNow. I have to tell you. It was they didn't have their unit economics right, they didn't have their sales model and marketing model. He cleaned that up. Took it from 120 million to 1.2 billion and really did an amazing job. People are looking for a repeat here. This is a totally different situation. 
ServiceNow drove a truck through BMC's install base with IT help desk and then created this brilliant TAM expansion, that land-and-expand model. This is much different here. And Slootman also told me that he's a situational CEO. He doesn't have a playbook. And so that's what is most impressive and interesting about this. He's now up against the biggest competitors in the world: AWS, Google and Microsoft and dozens of other smaller startups that have raised a lot of money. Look at a company like Yellowbrick. They've raised I don't know $180 million. They've got a great team. Google, IBM, et cetera. So it's going to be really, really fun to watch. I'm super excited, Erik, but I'll tell you the data right now suggests they've got a great tailwind and if they can continue to execute, this is going to be really fun to watch. >> Yeah, certainly. I mean, when you come out and you are as impressive as Snowflake is, you get a target on your back. There's no doubt about it, right? So we said that they basically created the data as a service. That's going to invite competition. There's no doubt about it. And Yellowbrick is one that came up in the panel yesterday; one of our CIOs was doing a proof of concept with them. We had about seven others mentioned as well that are startups that are in this space. However, none of them, despite their great valuations and their great funding, are going to have the kind of money and the market lead that Slootman is going to have, which Snowflake has as this comes out. And what we're seeing in Congress right now with some antitrust scrutiny around the large data that's being collected by AWS, Azure and Google, I'm not going to bet against this guy either. Right now I think he's got a lot of opportunity, there's a lot of additional layers, and because he can basically develop this as a suite, as a service, I think there's a lot of great opportunity ahead for this company. >> Yeah, and I guarantee that he understands well that customer acquisition cost and the lifetime value of the customer, the retention rates. Those are all things that he and Mike Scarpelli, his CFO, learned at ServiceNow. Not learned, perfected. (Erik laughs) Well Erik, really great conversation, awesome data. It's always a pleasure having you on. Thank you so much, my friend. I really appreciate it. >> I appreciate talking to you too. We'll do it again soon. And stay safe everyone out there. >> All right, and thank you for watching everybody, this episode of "CUBE Insights" powered by ETR. This is Dave Vellante, and we'll see you next time. (soft music)

Published Date : Jul 31 2020

Breaking Analysis: Emerging Tech sees Notable Decline post Covid-19


 

>> Announcer: From theCUBE studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE conversation. >> As you may recall, coming into the second part of 2019 we reported, based on ETR survey data, that there was a narrowing of spending on emerging tech and an unplugging of a lot of legacy systems. This was really because people were going from experimentation into operationalizing their digital initiatives. When COVID hit, conventional wisdom suggested that there would be a flight to safety. Now, interestingly, we reported with Erik Bradley, based on one of the Venns, that a lot of CIOs were still experimenting with emerging vendors. But this was very anecdotal. Today, we have more data, fresh data, from the ETR Emerging Technology Study on private companies, which really does suggest that there's a notable decline in experimentation, and that's affecting emerging technology vendors. Hi, everybody, this is Dave Vellante, and welcome to this week's Wikibon Cube Insights, powered by ETR. Once again, Sagar Kadakia is joining us. Sagar is the Director of Research at ETR. Sagar, good to see you. Thanks for coming on. >> Good to see you again. Thanks for having me, Dave. >> So, it's really important to point out, this Emerging Tech Study that you guys do, it's different from your quarterly Technology Spending Intention Survey. Take us through the methodology. Guys, maybe you could bring up the first chart. And, Sagar, walk us through how you guys approach this. >> No problem. So, a lot of the viewers are used to seeing a lot of the results from the Technology Spending Intention Survey, or the TSIS, as we call it. That study, as the title says, really tracks spending intentions on more pervasive vendors, right, Microsoft, AWS, as an example. What we're going to look at today is our Emerging Technology Study, which we conduct biannually, in May and November. This study is a little bit different. We ask CIOs about evaluations, awareness, planned evaluations, so think of this as pre-spend, right. So that's a major differentiator from the TSIS. That, and this study really focuses on private emerging providers. We're really only focused on those really emerging private companies, say, like your Series B to Series G or H, whatever it may be, so, two big differences within those studies. And then today what we're really going to look at is the results from the Emerging Technology Study. Just a couple of quick things here. We had 811 CIOs participate, which represents about 380 billion in annual IT spend, so the results from this study matter. We had almost 75 Fortune 100s take it. So, again, we're really measuring how private emerging providers are doing in the largest organizations. And so today we're going to be reviewing notable sectors, but largely this survey tracks roughly 356 private technologies and frameworks. >> All right, guys, bring up the pie chart, the next slide. Now, Sagar, this is sort of a snapshot here, and it basically says that 44% of CIOs agree that COVID has decreased the organization's evaluation and utilization of emerging tech, despite what I mentioned, Erik Bradley's Venn, in which one CIO in particular said, "Hey, I always pick somebody in the lower left of the magic quadrant." But, again, this is a static view. I know we have some other data, but take us through this, and how this compares to other surveys that you've done. >> No problem. So let's start with the high-level takeaways.
And I'll actually kind of get into the point that Erik was debating, 'cause that point is true. It's just really how you kind of slice and dice the data to get to that. So, what you're looking at here, and what the overall takeaway from the Emerging Technology Study was, is, you know, you are going to see notable declines in POCs, or proof-of-concepts, and evaluations because of COVID-19. Even though we had been communicating for quite some time, you know, the last few months, that there's increasing pressure for companies to further digitize with COVID-19, there are IT budget constraints. There is a huge pivot in IT resources towards supporting remote employees, a decrease in risk tolerance, and so that's why what you're seeing here is a rather notable number of CIOs, 44%, that said that they are decreasing their organization's evaluation and utilization of private emerging providers. So that is notable. >> Now, as you pointed out, you guys run this survey a couple of times a year. So now let's look at the time series. Guys, if you bring up the next chart. We can see how the sentiment has changed since last year. And, of course, we're isolating here on some of the larger companies. So, take us through what this data means. >> No problem. So, how do we quantify what we just saw in the prior slide? We saw 44% of CIOs indicating that they are going to be decreasing their evaluations. But what exactly does that mean? We can pretty much determine that by looking at a lot of the data that we captured through our Emerging Technology Study. There's a lot going on in this slide, but I'll walk you through it. What you're looking at here is Fortune 1000 organizations, so we've really isolated the data to those organizations that matter. So, let's start with the teal, kind of green line first, because I think it's a little bit easier to understand. What you're looking at is Fortune 1000 evaluations, both planned and current, okay? And you're looking at a time series, one year ago and six months ago. So, two of the answer options that we provide CIOs in this survey, right, think about the survey as a grid, where you have seven answer options going horizontally, and then 300-plus vendors and technologies going vertically. For any given vendor, they can essentially indicate one of these options, two of them being I am currently evaluating them or I plan to evaluate them in six months. So what you're looking at here is effectively the aggregate number, or the average number, of Fortune 1000 evaluations. So if you look at May 2019, all the way on the left of that chart, that 24% roughly means that a quarter of the selections made by the Fortune 1000 in the survey were plan to evaluate or currently evaluating. If you fast-forward six months, to the middle of the chart, November '19, it's roughly the same, one in four technologies that the Fortune 1000 selected, they indicated that I plan to or am currently evaluating them. But now look at that big drop-off going into May 2020, the 17%, right? So now one out of every six technologies, or one out of every six selections that they made, was an evaluation. So a very notable drop. And then if you look at the blue line, this is another answer option that we provided CIOs: I'm aware of the technology but I have no plans to evaluate. So this answer option essentially tracks awareness levels. If you look at the last six months, look at that big uptick from 44% to over 50%, right?
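To make the metric Sagar walks through concrete, here is a rough sketch of how the evaluation share could be computed from a survey grid like the one he describes, with respondents on one axis, technologies on the other, and one answer option per cell. The answer labels and sample responses below are hypothetical illustrations, not ETR's actual schema or data.

# Rough sketch of the metric described above: the share of survey selections
# that are evaluations ("currently evaluating" or "plan to evaluate") versus
# awareness with no planned evaluation. Answer labels and sample responses
# are hypothetical; they are not ETR's survey schema or data.
from collections import Counter

EVALUATION = {"currently_evaluating", "plan_to_evaluate"}
AWARE_NO_PLANS = {"aware_no_plans"}

# Each tuple: (respondent, technology, answer_option)
responses = [
    ("cio_1", "vendor_a", "currently_evaluating"),
    ("cio_1", "vendor_b", "aware_no_plans"),
    ("cio_2", "vendor_a", "plan_to_evaluate"),
    ("cio_2", "vendor_c", "aware_no_plans"),
    ("cio_3", "vendor_b", "aware_no_plans"),
    ("cio_3", "vendor_c", "in_production"),
]

counts = Counter(answer for _, _, answer in responses)
total = sum(counts.values())

evaluation_share = sum(counts[a] for a in EVALUATION) / total
aware_no_plans_share = sum(counts[a] for a in AWARE_NO_PLANS) / total

# Per the discussion above, the first figure fell from roughly 24% to 17%
# between May 2019 and May 2020, while the second rose from 44% to over 50%.
print(f"Evaluation share: {evaluation_share:.0%}")
print(f"Aware, no plans share: {aware_no_plans_share:.0%}")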
So now, essentially one out of every two technologies, or private technologies, that a CIO is aware of, they have no plans to evaluate. So this is going to have an impact on the general landscape, when we think about those private emerging providers. But there is one caveat, and, Dave, this is what you mentioned earlier, this is what Erik was talking about. The providers that are doing well are the ones that are work-from-home aligned. And so, just like a few years ago, we were really analyzing results based on are you cloud-native or are you cloud-aligned, because those technologies are going to do the best, what we're seeing in the emerging space is now the same thing. Those emerging providers that enable organizations to maintain productivity for their employees, essentially allowing their employees to work remotely, those emerging providers are still doing well. And that is probably the second biggest takeaway from this study. >> So now what we're seeing here is this flight to perceived safety, which, to your point, Sagar, doesn't necessarily mean good news for all enterprise tech vendors, but certainly for those that are positioned for the work-from-home pivot. So now let's take a look at a couple of sectors. We'll start with information security. We've reported for years about how the perimeter's been broken down, and that more spend was going to shift from inside the moat to a distributed network, and that's clearly what's happened as a result of COVID. Guys, if you bring up the next chart. Sagar, you take us through this. >> No problem. And as you can imagine, I think that the big theme here is zero trust. So, a couple of things here. And let me just explain this chart a little bit, because we're going to be going through a couple of these. What you're seeing on the X-axis here is effectively what we're classifying as near-term growth opportunity from all customers. The way we measure that effectively is we look at all the evaluations, current evaluations, planned evaluations, and we look at people who have evaluated and plan to utilize these vendors. The more indications you get on that, the more to the top right you're going to be. The more indications you get around I'm aware of but I don't plan to evaluate, or I'm replacing this early-stage vendor, the further down and to the left you're going to be. So, on the X-axis you have near-term growth opportunity from all customers, and on the Y-axis you have near-term growth opportunity from, really, the biggest shops in the world, your Global 2000, your Forbes Private 225, like Cargill, as an example, and then, of course, your federal agencies. So you really want to be positioned up and to the right here. So, the big takeaway here is zero trust. So, just a couple of things on this slide when we think about zero trust. As organizations accelerate their cloud and SaaS spend because of COVID-19, and, you know, what we were talking about earlier, Dave, remote work becomes the new normal, that perimeter security approach is losing appeal, because the perimeter's less defined, right? Apps and data are increasingly being stored in the cloud. That, and employees are working remotely from everywhere, and they're accessing all of these items. And so what we're seeing now is a big move into zero trust. So, if we look at that chart again, what you're going to see in that upper-right quadrant are a lot of identity and access management players. And look at the bifurcation in general.
This is what we were talking about earlier in terms of the landscape not doing well. Most security vendors are in that red area, you know, in the middle to the bottom. But if you look at the top right, what are you seeing here? UnifyID, Auth0, WSO2, right, all identity and access management players. These are critical in your zero trust approach, and this is one of the few areas where we are seeing upticks. You also see here BitSight, Lucideus. So that's going to be security assessment. You're seeing Vectra and Netskope and Darktrace, and a few others here. And cloud security and IDPS, Intrusion Detection and Prevention Systems. So, very few sectors are seeing an uptick, very few security sectors actually look pretty good, based on opportunities that are coming. But, essentially, all of them are in that work-from-home aligned security stack, so to speak. >> Right, and of course, as we know, as we've been reporting, buyers have options, from both established companies and these emerging companies that are public, Okta, CrowdStrike, Zscaler. We've seen the work-from-home pivot benefit those guys, but even Palo Alto Networks, even Cisco. I asked (other speaker drowns out speech) last week, I said, "Hey, what about this pivot to work from home? What about this zero trust?" And he said, "Look, the reality is, yes, a big part of our portfolio is exposed to that traditional infrastructure, but we have options for zero trust as well." So, from a buyer's standpoint, that perceived flight to safety, you have a lot of established vendors, and that clearly is showing up in your data. Now, the other sector that we want to talk about is database. We've been reporting a lot on database, data warehouse. So, why don't you take us through the next graphic here, if you would. >> Sagar: No problem. So, our theme here is that Snowflake is really separating itself from the pack, and, again, you can see that here. Private database and data warehousing vendors really continue to impact a lot of their public peers, and Snowflake is leading the way. We expect Snowflake to gain momentum in the next few years. And, look, there are some rumors that they're IPOing soon. And so when we think about that set-up, we like it, because as organizations transition away from hybrid cloud architectures to 100% or near-100% public cloud, Snowflake is really going to benefit. So they look good. DataStax looks pretty good, right, that's resiliency, redundancy across data centers. So we kind of like them as well. Redis Labs, MariaDB, they look pretty good here on the opportunity side, but we are seeing a little bit of churn, so I think probably Snowflake and DataStax are probably our two favorites here. And again, when you think about Snowflake, we continue to think more pervasive vendors, like Teradata and Cloudera, and some of the other larger database firms, they're going to continue seeing wallet and market share losses due to some of these emerging providers. >> Yeah. If you could just keep that slide up for a second, I would point out, in many ways Snowflake is kind of a safer bet, you know, we talk about flight to safety, because they're well-funded, they're established. You can go from zero to Snowflake very quickly, that's sort of their mantra, if you will. But I want to point out and recognize that it is somewhat oranges and tangerines here, Snowflake being an analytical database. You take MariaDB, for instance, I look at that, anyway, as relational and operational. And then you mentioned DataStax.
I would say Couchbase, Redis Labs, Aerospike. Cockroach is really a... key-value store. You've got some non-relational databases in there. But we're looking at the entire sector of databases, which has become a really interesting market. But again, some of those established players are going to do very well, and I would put Snowflake on that cusp. As you pointed out, Bloomberg broke the story, I think last week, that they were contemplating an IPO, which we've known for a while. >> Yeah. And just one last thing on that. We do like some of the more pervasive players, right. Obviously, AWS, all their products, Redshift and DynamoDB. Microsoft looks really good. It's just really some of the other legacy ones, like the Teradatas, the Oracles, the Hadoops, right, that are going to be impacted. And so the cloud providers look really good. >> So, the last decade has really brought forth this whole notion of DevOps, infrastructure as code, the whole API economy. And that's the piece we want to jump into now. And there are some real stand-outs here, you know, despite the early data that we showed you, where CIOs are less prone to look at emerging vendors. There are some, for instance, if you bring up the next chart, guys, like Hashi, that really are standing out, aren't they? >> That's right, Dave. So, again, what you're seeing here is you're seeing that bifurcation that we were talking about earlier. There are a lot of infrastructure software vendors that are not positioned well, but if you look at the ones at the top right that are positioned well... We have two kinds of things on here, starting with infrastructure automation. We think a winner here is emerging with Terraform. Look all the way up to the right, how well-positioned they are, how many opportunities they're getting. And for the second straight survey now, Terraform is leading among their peers, Chef, Puppet, SaltStack. And they're leading their peers in so many different categories, notably on allocating more spend, which is obviously very important. For Chef, Puppet and SaltStack, which you can see a little bit below, probably a little bit higher than the middle, we are seeing some elevated churn levels. And so, really, Terraform looks like they're kind of separating themselves. And we've got this great quote from a CIO just a few months ago, on why Terraform is likely pulling away, and I'll read it out here quickly. "The Terraform tool creates an entire infrastructure in a box. Unlike vendors that use procedural languages, like Ansible and Chef, it will show you the infrastructure in the way you want it to be. You don't have to worry about the things that happen underneath." I know some companies where you can put your entire Amazon infrastructure through Terraform. If Amazon disappears, if your availability drops, load balancers, RDS, everything, you just run Terraform and everything will be created in 10 to 15 minutes. So that shows you the power of Terraform and why we think it's ranked better than some of the other vendors. >> Yeah, I think that really does sum it up. And, actually, guys, if you don't mind bringing that chart back up again. So, to point out, Mitchell Hashimoto, Hashi, really, I believe I'm correct, talking to Stu about this a little bit, he sort of led the Terraform project, which is an open-source project, and, to your point, very easy to deploy.
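The CIO quote above is describing Terraform's declarative workflow: the configuration files are the source of truth, and re-running Terraform rebuilds whatever they declare. As a hedged illustration of that workflow, the sketch below simply drives the standard Terraform CLI from Python; it assumes the Terraform binary is installed and that a directory of existing .tf files is available, and the directory path is a made-up placeholder rather than anything referenced in the survey.

# Minimal sketch of the workflow described in the quote above: point Terraform
# at a directory of declarative configuration and let it build (or rebuild)
# everything that configuration declares. Assumes the Terraform CLI is on the
# PATH and that ./infra holds existing .tf files; the path is a placeholder.
import subprocess

INFRA_DIR = "./infra"  # hypothetical directory of Terraform configuration

def run(*args: str) -> None:
    """Run a Terraform subcommand inside the configuration directory."""
    subprocess.run(["terraform", *args], cwd=INFRA_DIR, check=True)

if __name__ == "__main__":
    run("init")                 # download the providers and modules the config declares
    run("plan", "-out=tfplan")  # preview what would be created or changed
    run("apply", "tfplan")      # create the infrastructure exactly as declared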
Chef, Puppet, Salt, they were largely disrupted by cloud, because they're designed to automate deployment largely on-prem and DevOps, and now Terraform sort of packages everything up into a platform. So, Hashi actually makes money, and you'll see it on this slide, in things like Vault, which is kind of their security play. You see GitLab on here. That's really application tooling to deploy code. You see Docker containers, you know, Docker, really all about open source, and they've had great adoption. Docker's challenge has always been monetization. You see Turbonomic on here, which is application resource management. You can't go too deep on these things, but it's pretty deep within this sector. But we are comparing different types of companies, but just to give you a sense as to where the momentum is. All right, let's wrap here. So maybe some final thoughts, Sagar, on the Emerging Technology Study, and then what we can expect in the coming month here, on the update in the Technology Spending Intention Study, please. >> Yeah, no problem. One last thing on the zero trust side that has been a big issue that we didn't get to cover is VPN spend. Our data is showing that, yes, even though VPN spend did increase the last few months because of remote work, we actually think that people are going to move away from that as they move on to zero trust. So just one last point on that. Just in terms of overall thoughts, you know, again, as we covered it, you can see how bifurcated all these spaces are. Really, if we were to go sector by sector by sector, right, storage and blockchain and ML/AI and all that stuff, you would see there's a few or maybe one or two vendors doing well, and the majority of vendors are not seeing as many opportunities. And so, again, are you work-from-home aligned? Are you the best vendor of all the other emerging providers? And if you fit those two criteria then you will continue seeing POCs and evaluations. And if you don't fit that criteria, unfortunately, you're going to see fewer opportunities. So I think that's really the big takeaway on that. And then, just in terms of next steps, we're already transitioning now to our next Technology Spending Intention Survey. That launched last week. And so, again, we're going to start getting a feel for how CIOs are spending in 2H-20, right, so, for the back half of the year. And our question changes a little bit. We ask them, "How do you plan on spending in the back half of the year versus how you actually spent in the first half of the year, or 1H-20?" So, we're kind of tightening the screw, so to speak, and really getting an idea of what spend is going to look like in the back half, and we're also going to get some updates as it relates to budget impacts from COVID-19, as well as how vendor relationships have changed, as well as business impacts, like layoffs and furloughs, and all that stuff. So we have a tremendous amount of data that's going to be coming in the next few weeks, and it should really prepare us for what to see over the summer and into the fall. >> Yeah, very excited, Sagar, to see that. I just wanted to double down on what you said about changes in networking. We've reported with you guys on MPLS networks shifting to SD-WAN. But even VPN and SD-WAN are being called into question as the internet becomes the new private network. And so lots of changes there. And again, very excited to see updated data post-COVID, as we exit this isolation economy.
Really want to point out to folks that this is not a snapshot survey, right? This is an ongoing exercise that ETR runs, and we're grateful for our partnership with you guys. Check out ETR.plus, that's the ETR website. I publish weekly on Wikibon.com and SiliconANGLE.com. Sagar, thanks so much for coming on. Once again, great to have you. >> Thank you so much for having me, Dave. I really appreciate it, as always. >> And thank you for watching this episode of theCUBE Insights, powered by ETR. This is Dave Vellante. We'll see you next time. (gentle music)

Published Date : Jun 22 2020
