Itamar Ankorion, Qlik & Peter MacDonald, Snowflake | AWS re:Invent 2022
(upbeat music) >> Hello, welcome back to theCUBE's AWS re:Invent 2022 coverage. I'm John Furrier, host of theCUBE. Got a great lineup here: Itamar Ankorion, SVP of Technology Alliances at Qlik, and Peter MacDonald, Vice President of Cloud Partnerships and Business Development at Snowflake. We're going to talk about bringing SAP data to life with a joint Snowflake, Qlik and AWS solution. Gentlemen, thanks for coming on theCUBE. Really appreciate it. >> Thank you. >> Thank you, great meeting you John. >> Just to get started, introduce yourselves to the audience, then we're going to jump into what you guys are doing together, a unique relationship here, a really compelling solution in cloud. Big story about applications and scale this year. Let's introduce yourselves. Peter, we'll start with you. >> Great. I'm Peter MacDonald. I am Vice President of Cloud Partners and Business Development here at Snowflake. On the cloud partner side, that means I manage the AWS relationship along with Microsoft and Google Cloud: what we do together in terms of complementary products, GTM, co-selling, things like that. Importantly, that includes working with other third parties like Qlik on joint solutions. On business development, it's negotiating custom commercial partnerships with large companies like Salesforce and Dell, and smaller companies, some from our venture portfolio. >> Thanks Peter, and hi John. It's great to be back here. So I'm Itamar Ankorion, and I'm the senior vice president responsible for technology alliances here at Qlik. With that, I own strategic alliances, including our key partners in the cloud, including Snowflake and AWS. I've been in the data and analytics enterprise software market for 20-plus years, and my main focus is product management, marketing, alliances, and business development. I joined Qlik about three and a half years ago through the acquisition of Attunity, which is now the foundation for Qlik data integration. So again, we focus in my team on creating joint solution alignment with our key partners to provide more value to our customers. >> Great to have both you guys, senior executives in the industry, on theCUBE here, talking about data. Obviously bringing SAP data to life is the theme of this segment, but this re:Invent, it's all about the data, the big data end-to-end story, a lot about data being intrinsic, as the CEO says on stage, in organizations in all aspects. Take a minute to explain what you guys are doing from a company standpoint, Snowflake and Qlik, and the solutions, and why here at AWS. Peter, we'll start with you at Snowflake: what you guys do as a company, your mission, your focus. >> That's great, John. Yeah, so here at Snowflake, we focus on the data platform. Until recently, data platforms required expensive on-prem hardware appliances. And despite all that expense, customers had capacity constraints, expensive maintenance, and limited functionality that all impeded these organizations from reaching their goals. Snowflake is a cloud native SaaS platform, and we've become so successful because we've addressed these pain points and have other new special features. For example: securely sharing data across both the organization and the value chain without copying the data, support for new data types such as JSON and semi-structured data, and also advances in data governance. Snowflake integrates with complementary AWS services and other partner products.
So we can enable holistic solutions that include, for example, both Qlik and AWS SageMaker and Comprehend, and bring those to joint customers. Our customers want to convert data into insights, along with advanced analytics platforms and AI. That is how they make holistic, data-driven solutions that will give them competitive advantage. With Snowflake, our approach is to focus on customer solutions that leverage data from existing systems such as SAP, wherever they are, in the cloud or on-premise. And to do this, we leverage partners like Qlik to help customers transform their businesses. We provide customers with a premier data analytics platform as a result. Itamar, why don't you talk about Qlik a little bit, and then we can dive into the specific SAP solution here and some trends. >> Sounds great, Peter. So Qlik provides modern data integration and analytics software used by over 38,000 customers worldwide. Our focus is to help our customers turn data into value and help them close the gap between data all the way through insight and action. We offer Qlik Data Integration and Qlik Data Analytics. Qlik Data Integration helps to automate the data pipelines to deliver data to where they want to use it, in real-time, and make the data ready for analytics. And then Qlik Data Analytics is a robust platform for analytics and business intelligence that has been a leader in the Gartner Magic Quadrant for over 11 years now in the market. And both of these come together into what we call Qlik Cloud, which is our SaaS-based platform, providing a more seamless way to consume all these services and accelerate time to value with customer solutions. In terms of partnerships, both Snowflake and AWS are very strategic to us here at Qlik, so we have very comprehensive investment to ensure a strong joint value proposition that we can bring to our mutual customers: everything from aligning our roadmaps through optimizing and validating integrations, collaborating on best practices, and packaging joint solutions like the one we'll talk about today. And with that investment, we are an elite-level, top-level partner with Snowflake. We validate that our technology is Snowflake-ready across the entire product set, and we have hundreds of joint customers together. And with AWS, we've also partnered for a long time. We've been at re:Invent since the inaugural one, so that kind of gives you an idea of how long we've been working with AWS. We provide very comprehensive integration with AWS data analytics services, and we have several competencies ranging from data analytics to migration and modernization. So that's our focus, and again, we're excited about working with Snowflake and AWS to bring solutions together to market. >> Well, I'm looking forward to unpacking the solutions specifically, and congratulations on the continued success of both your companies. We've been following them obviously for a very long time, seeing the platforms evolve beyond just SaaS, and a lot more going on in cloud these days, kind of next generation emerging. You know, we're seeing a lot of macro trends that are going to be powering some of the things we're going to get into real quickly. But before we get into the solution, what are some of those power dynamics in the industry that you're seeing, trends specifically that are impacting your customers, that are taking us down this road of getting more out of the data, and specifically the SAP data, but in general trends and dynamics.
What are you hearing from your customers? Why do they care? Why are they going down this road? Peter, we'll start with you. >> Yeah, I'll go ahead and start. Thanks. Yeah, I'd say we continue to see customers being very eager to transform their businesses, and they know they need to leverage technology and data to do so. They're also increasingly depending upon the cloud to bring that agility, that elasticity, the new functionality necessary to react in real-time to ever-evolving customer needs. You look at what's happened over the last three years, and boy, the macro environment, the customers, it's all changing so fast. With our partnerships with AWS and Qlik, we've been able to bring to market innovative solutions like the one we're announcing today that spans all three companies. It provides a holistic, integrated solution for our customers. >> Itamar, let's get into it. You've been with theCUBE, you've seen the journey, you have your own journey, many, many years, you've seen the waves. What's going on now? I mean, what's the big wave? What's the dynamic powering this trend? >> Yeah, in a nutshell I'll call it: it's all about time. You know, it's time to value, and it's about real-time data. I'll talk about that a bit. So, I mean, you hear a lot about data being the new oil, and we definitely see more and more customers seeing data as their critical enabler for innovation and digital transformation. They look for ways to monetize data. They look at data as the way in which they can innovate and bring different value to their customers. So we see customers wanting to use more data to get more value from data. We definitely see them wanting to do it faster, right, than before. And we definitely see them looking for agility and automation as ways to accelerate time to value, and also reduce overall costs. I did mention real-time data, so we definitely see more and more customers wanting to be able to act and make decisions based on fresh data. So yesterday's data is just not good enough. >> John: Yeah. >> It's got to be down to the hour, down to the minutes, and sometimes even lower than that. And then I think we're also seeing customers look to their core business systems where they have a lot of value, like SAP, like mainframe, and thinking, okay, our core data is there, how can we get more value from this data? So those are key things we see all the time with customers. >> Yeah, we did a big editorial segment this year on what we called data as code. Data as code is kind of a riff on infrastructure as code, and you start to see data proliferating into all aspects: fresh data, it's not just where you store it, it's how you share it, it's how you turn it into an application, intrinsically involved in all aspects. This is the big theme this year, and that's driving all the conversations here at re:Invent. And I'm guaranteeing you, it's going to happen for another five and 10 years. It's not stopping. So I've got to get into the solution. You guys mentioned SAP, and you've announced the solution by Qlik, Snowflake and AWS for your customers using SAP. Can you share more about this solution? What's unique about it? Why is it important, and why now? Peter, Itamar, we'll start with you first. >> Let me jump in; I'll jump in because I'm excited. We're very excited about this solution, and it's also a proven solution, by the way; we've seen proven customer success with it.
So to your point, it's ready to scale, and it's starting; I think we're going to see a lot of companies doing this over the next few years. But before we jump to the solution, let me take a few minutes just to clarify the need, why we're seeing customers jump to do this. Customers that use SAP use it to manage the core of their business. So think order processing, management, finance, inventory, supply chain, and so much more. So if you're running SAP in your company, that data creates a great opportunity for you to drive innovation and modernization. What we see customers wanting to do is more with their data, and more means they want to take SAP with non-SAP data and use it together to drive new insights. They want to use real-time data to drive real-time analytics, which they couldn't do to date. They want to bring together descriptive with predictive analytics, adding machine learning and AI to drive more value from the data. And naturally, they want to do it faster: find ways to iterate faster on their solutions, and have freedom with the data and agility. And I think this is really where cloud data platforms like Snowflake and AWS bring that value, to be able to drive that. Now, to do that you need to unlock the SAP data, which is a lot of where Qlik comes in, because the typical challenge these customers run into is the complexity inherent in SAP data: tens of thousands of tables, proprietary formats, complex data models, licensing restrictions, and more. Then you have performance issues; they usually run into how to handle the throughput and the volumes while maintaining low latency and impact. And where do we find the knowledge to really understand how to get all this done? So these are the things we looked at when we came together to create a solution and make it unique. So when you think about its uniqueness, we put together a lot, and I'll go through the three or four key things that come together to make this unique. First is SAP data delivery: how do you get the data from ECC, from HANA, from S/4HANA, how do you deliver the data and the metadata, and how does that integrate well into Snowflake? And what we've done is focus a lot on optimizing that process and the continuous ingestion, so the real-time ingestion of the data, in a way that works really well with the Snowflake data cloud. The second thing is SAP data transformation: once the data arrives in Snowflake, how do we turn it into being analytics-ready? So that's where data transformation and data warehouse automation come in, and these are all elements of this solution. So creating derivative datasets, creating data marts, and all of that is done by creating an optimized integration that pushes down SQL-based transformations so they can be processed inside Snowflake, leveraging its powerful engine. And then the third element is bringing together data visualization and analytics that can take all the data that is now organized inside Snowflake, bring other data in, bring machine learning from SageMaker, and then create a seamless integration to bring analytic applications to life. So these are all things we put together in the solution. And maybe the last point is that we actually took the next step with this and created something we refer to as solution accelerators, which we're really, really keen about.
Think about these as prepackaged templates for common business analytic needs like order to cash, finance, inventory. And we can dig into that a little more later, but this brings the next level of value to the customers, all built into this joint solution. >> Yeah, I want to get to the accelerators, but real quick, Peter, your reaction to the solution: what's unique about it? And obviously Snowflake, we've been seeing the progression, data applications, more developers developing on top of Snowflake; data as code kind of implies a developer ecosystem. This is kind of interesting. I mean, you've got partnering with Qlik and AWS; it's kind of a developer-like-thinking real solution. What's unique about this SAP solution that's different than what customers can get anywhere else, or not? >> Yeah, well listen, I think first of all, you have to start with the idea of the solution. These are three companies coming together to build a holistic solution that is all about creating a great opportunity to turn SAP data into value, as Itamar was talking about. That's really what we're talking about here, and there's a lot of technology underneath it. I'll talk more about the Snowflake technology, what's involved here, and then cover some of the AWS pieces as well. But you know, we're focusing on getting that value out and accelerating time to value for our joint customers. As Itamar was saying, there's a lot of complexity with the SAP data and a lot of value there. How can we manage that in a prepackaged way, bringing together best-of-breed solutions with proven capabilities, and bringing this to market quickly for our joint customers? You know, Snowflake and AWS have been strong partners for a number of years now, and that's not only in how Snowflake runs on top of AWS, but also in how we integrate with their complementary analytics and other products. And so we want to be able to leverage those in addition to what Qlik is bringing in terms of the data transformations, bringing data out of SAP, and the visualization as well. All very critical. And then we want to bring in the predictive analytics AWS brings and what SageMaker brings; we'll talk about that a little bit later on. Some of the technologies that we're leveraging are some of our latest cutting-edge technologies that really make things easier for both our partners and our customers. For example, Qlik leverages Snowflake's recently released Snowpark for Python functionality to push down those data transformations from Qlik into Snowflake that Itamar's mentioning. And we also leverage Snowpark for integrations with Amazon SageMaker. There's a lot of great new technology that just makes this easy and compelling for customers.
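To make that pushdown pattern concrete, here is a minimal sketch of what a Snowpark for Python transformation looks like. This is an illustration, not the actual Qlik-generated code: the connection parameters and the landing database and schema names are placeholder assumptions; only VBAK, ERDAT, VKORG, and NETWR are standard SAP sales-document names.

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, sum as sum_

# Placeholder connection details -- replace with real account information.
session = Session.builder.configs({
    "account": "<account_identifier>",
    "user": "<user>",
    "password": "<password>",
    "warehouse": "<warehouse>",
    "database": "SAP_LANDING",  # hypothetical database where SAP data lands
    "schema": "RAW",
}).create()

# Lazy DataFrame over a landed copy of VBAK, the SAP sales document header table.
orders = session.table("VBAK")

# Filter and aggregate. Nothing executes client-side: Snowpark compiles these
# operations into SQL and pushes them down into Snowflake's engine.
order_summary = (
    orders
    .filter(col("ERDAT") >= "2022-01-01")        # ERDAT: document creation date
    .group_by("VKORG")                           # VKORG: sales organization
    .agg(sum_(col("NETWR")).alias("NET_VALUE"))  # NETWR: net order value
)

# Materialize the derivative, analytics-ready dataset as a table.
order_summary.write.mode("overwrite").save_as_table("ORDER_SUMMARY")
```

In the joint solution, Qlik's tooling generates and manages transformations like this rather than a developer hand-writing them; the sketch only shows the pushdown mechanism Snowpark exposes.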
>> I think that's the big word, easy button, here for what may look like a complex kind of integration; kind of turnkey, a really compelling example of the modern era we're living in, as we always say in theCUBE. You mentioned accelerators, SAP accelerators. Can you give an example of how that works with the technology from the third-party providers to deliver this business value, Itamar? 'Cause that was an interesting comment. What's the example? Give an example of this acceleration. >> Yes, certainly. I think this is something that really makes this truly, truly unique in the industry, and again, a great opportunity for customers. So we talked earlier about how there's a lot of things that need to be done with SAP data to turn it into value. And these accelerators, as the name suggests, are designed to do just that: to jumpstart the process and reduce the time and the risk involved in such projects. So again, these are prepackaged templates. We basically took a lot of knowledge, a lot of configurations, and best practices about how to get things done, and we put them together. So think about all the steps it includes, things like data extraction: already knowing which tables, all the relevant tables that you need to get data from in the context of the solution you're looking for, say, like order to cash; we'll get back to that one. How do you continuously deliver that data into Snowflake in an efficient manner, handling things like data type mappings, metadata naming conventions, and transformations? The data models you build, all the way to data mart definitions and all the transformations that the data needs to go through, moving through steps until it's fully analytics-ready. And then on top of that, even adding a library of comprehensive analytic dashboards and integrations with machine learning and AI, and putting all of that together in a way that's pre-integrated and tested to work with Snowflake and AWS. So this is where, again, you get this entire recipe that's ready. So take, for example, order to cash, which I mentioned. For those who are not familiar, order to cash is a critical business process for every organization, especially if you're in retail, manufacturing, enterprise; it's a big one. This is where, you know, it starts with booking a sales order, followed by fulfilling the order, billing the customer, and then managing the accounts receivable when the customer actually pays, right? So in this whole process, you've got sales order fulfillment and the billing that impact customer satisfaction, and you've got receivables payments that impact working capital and cash liquidity. So as a result, this order-to-cash process is the lifeblood for many businesses, and it's critical to optimize and understand. So the solution accelerator we created specifically for order to cash takes care of understanding all these aspects and the data that needs to come with it: everything we outlined before to make the data available in Snowflake in a way that's really useful for downstream analytics, along with dashboards that are already common for that use case. So again, this enables customers to gain real-time visibility into their sales orders, fulfillment, and accounts receivable performance. That's what the accelerators are all about. And very similarly, we have another one, for example, for finance analytics, right? So this one optimizes financial data reporting and helps customers get insights into P&L and financial risk and stability. Or inventory analytics, which helps improve planning, inventory management, utilization, and increased efficiencies in the supply chain. So again, these accelerators really help customers get a jumpstart and move faster with their solutions.
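As a rough sketch of the kind of dataset an order-to-cash accelerator might assemble, the snippet below joins landed sales orders to open receivables and ages each open item. VBAK and BSID are standard SAP ECC tables, but the join logic and target names here are simplifications for illustration, not the accelerator's actual data model; it also assumes the `session` from the earlier sketch.

```python
from snowflake.snowpark.functions import col, datediff, current_date

# VBAK: sales order headers; BSID: open customer (receivable) line items.
orders      = session.table("VBAK").select("VBELN", "KUNNR", "ERDAT", "NETWR")
receivables = session.table("BSID").select("KUNNR", "BELNR", "BUDAT", "WRBTR")

order_to_cash = (
    orders.join(receivables, orders["KUNNR"] == receivables["KUNNR"])
    # Age each open receivable from its posting date (BUDAT) to today.
    .with_column("DAYS_OPEN", datediff("day", col("BUDAT"), current_date()))
    .select("VBELN", orders["KUNNR"].alias("CUSTOMER"), "NETWR", "WRBTR", "DAYS_OPEN")
)

# Analytics-ready mart feeding the order-to-cash dashboards.
order_to_cash.write.mode("overwrite").save_as_table("ORDER_TO_CASH_MART")
```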
>> Peter, this is the easy button we just talked about: getting things going, you know, getting the ball rolling, getting some acceleration. A big part of this is the three companies coming together doing this. >> Yeah, and to build on what Itamar just said, the SAP data obviously has tremendous value. Those sales orders, distribution data, financial data: bringing that into Snowflake makes it easily accessible, but it also enables it to be combined with other data, too, which is one of the things that Snowflake does so well. So you can get a full view of the end-to-end process and the business overall. For example, I'll just take one example that may not come to mind right away: looking at the impact of weather conditions on supply chain logistics is relevant and material and of interest to our customers. How do you bring those different data sets together in an easy way: bringing the data out of SAP, bringing maybe other data out of other systems through Qlik or through Snowflake, bringing data in directly from our data marketplace, and bringing that all together to make it work? You know, fundamentally, the organizational silos and data fragmentation that exist otherwise make it really difficult to drive modern analytics projects. And that in turn limits the value that our customers are getting from SAP data and these other data sets. We want to enable that and unleash it.
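A hedged sketch of what that weather example could look like in practice: every table and column name below is a hypothetical placeholder (real Snowflake Marketplace weather datasets ship under their own database, schema, and column names), but the pattern, joining a landed deliveries table directly against a mounted marketplace share with no copying of the shared data, is the point.

```python
from snowflake.snowpark.functions import col, avg

# Hypothetical landed deliveries table and a weather dataset mounted from
# the Snowflake Marketplace as a read-only shared database.
deliveries = session.table("ANALYTICS.PUBLIC.DELIVERIES")
weather    = session.table("WEATHER_SHARE.PUBLIC.DAILY_HISTORY")

# Average delivery delay by daily precipitation band along the route.
weather_impact = (
    deliveries.join(
        weather,
        (deliveries["POSTAL_CODE"] == weather["POSTAL_CODE"])
        & (deliveries["SHIP_DATE"] == weather["OBSERVATION_DATE"]),
    )
    .group_by(weather["PRECIPITATION_BAND"])
    .agg(avg(col("DELAY_DAYS")).alias("AVG_DELAY_DAYS"))
)
weather_impact.show()
```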
>> Yeah, time for value. This is great stuff. Itamar, final question: how are customers using this? I'm sure you have customer examples already using the solution. Can you share what these examples look like, the use cases and the value? >> Oh yeah, absolutely. Thank you. Happy to. We have customers across different sectors: manufacturing, retail, energy, oil and gas, CPG. So again, customers in those sectors typically have SAP, and we have customers in all of them. A great example is Siemens Energy. Siemens Energy is a global provider of gas and power services, with, you know, over 28 billion, 30 billion in revenue and 90,000 employees, and they operate globally in over 90 countries. They've used SAP HANA as a core system, running on premises in multiple locations around the world. And what they were looking for is a way to bring all this data together so they can innovate with it. And the thing is, as Peter mentioned earlier, not just the SAP data, but also data from other systems, to bring it together for more value. That includes finance data, logistics data, customer CRM data. So they bring data from over 20 different SAP systems with Qlik data integration, feeding that into Snowflake in under 20 minutes, 24/7, 365 days a year. They get data from over 20,000 tables, with hundreds of millions of records going in daily. So it is a great example of the type of scale, agility and speed that they can get to drive this kind of innovation. So that's a great example with Siemens. You know, another one that comes to mind is a global manufacturer. Very similar scenario, but they're using it for real-time executive reporting. So it's more like visibility into the production data, as well as for financial analytics. So think about everything from audit to taxes to financial intelligence, because all the data's coming from SAP. >> It's a great time to be in the data business again. It keeps getting better and better. There's more data coming. It's not stopping, you know; it's growing so fast, it keeps coming. Every year, it's the same story, Peter. It's like, it doesn't stop coming. As we wrap up here, let's just get customers some information on how to get started. I mean, obviously you're starting to see the accelerators; it's a great program there. What a great partnership between the two companies and AWS. How can customers get started to learn about the solution and take advantage of it, getting more out of their SAP data, Peter? >> Yeah, I think the first place to go is to talk to Snowflake, talk to AWS, talk to our account executives that are assigned to your account. Reach out to them, and they will be able to educate you on the solution. We have it packaged up very nicely, and it can be deployed very, very quickly. >> Well gentlemen, thank you so much for coming on. Appreciate the conversation. Great overview of the partnership between Snowflake and Qlik and AWS on a joint solution: getting more out of the SAP data. It's really a key solution, bringing SAP data to life. Thanks for coming on theCUBE. Appreciate it. >> Thank you. >> Thank you, John. >> Okay, this is theCUBE coverage here at re:Invent 2022. I'm John Furrier, your host of theCUBE. Thanks for watching. (upbeat music)
Breaking Analysis: Snowflake caught in the storm clouds
>> From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR, this is Breaking Analysis with Dave Vellante.
Is your spending flat? Is your spending down 6% or worse? Or are you leaving the platform, decommissioning it?" You subtract the percent of customers that are spending less or churning from those that are spending more or adopting, and you get a net score, expressed as a percentage of customers responding (we sketch that arithmetic in code just below). In this chart we show Snowflake's Ns out of the total survey, which ranges between 1,200 and 1,400 respondents each quarter. And in the very last column... oh sorry, very last row, we show the number of Snowflake respondents that are coming into the survey from the Fortune 500 and the Global 2000. Those are two very important Snowflake constituencies. Now what this data tells us is that Snowflake exited 2021 with very strong momentum and a net score of 82%, which is off the charts, and it was actually accelerating from the previous survey. Now by April that sentiment had flipped and Snowflake came down to earth with a 68% net score. Still highly elevated relative to its peers, but meaningfully down. Why was that? Because we saw a drop in new adds and an increase in flat spend. Then into the July and most recent October surveys, you saw a significant drop in the percentage of customers that were spending more. Now, notably, the percentage of customers who are contemplating adding the platform is actually staying pretty strong, but it is off a bit this past survey. And combined with a slight uptick in planned churn, net score is now down to 60%. That uptick in churn, from 0% to 1% and then 3%, is still small, but that net score at 60% is still 20 percentage points higher than our highly elevated benchmark of 40%, as you'll recall from earlier breaking analysis episodes. That 40% level is what we consider a milestone; anything above it is actually quite strong. But again, Snowflake is down. And coming back to churn, while 3% churn is very low, in previous quarters we've seen Snowflake at 0% or 1% decommissions. Now the last thing to note in this chart is the meaningful uptick in survey respondents citing that they're using the Snowflake platform. That's up to 212 in the survey. So look, it's hard to imagine that Snowflake doesn't feel the softening in the market like everyone else. Snowflake is guiding for around 60% growth in product revenue against a tough compare from a year ago, with a 2% operating margin. So like every company, the reaction of the street is going to come down to how accurate or conservative the guide is from their CFO. Now, earlier this year, Snowflake acquired a company called Streamlit for around $800 million. Streamlit is an open source Python library that makes it easier to build data apps with machine learning, obviously a huge trend. And like Snowflake generally, its focus is on simplifying the complex, in this case making data science easier to integrate into data apps that business people can use. So we were excited this summer, in the July ETR survey, to see that ETR added some nice data on Streamlit, which we're showing here in comparison to Snowflake's core business on the left hand side. That's the data warehousing piece; the Streamlit piece is on the right hand side. And we show again net score over time from the previous survey for Snowflake's core database and data warehouse offering on the left, as compared to Streamlit on the right. Snowflake's core product had 194 responses in the October 2022 survey; Streamlit had an N of 73, which is up from 52 in the July survey.
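As an aside, the net score arithmetic described above is simple enough to sketch. This is a minimal illustration with made-up tallies; the category names paraphrase ETR's survey options and are not an official API.

```python
# Hypothetical survey tallies; the five categories paraphrase ETR's options:
# adopting new, spending up 6%+, flat, down 6%+, decommissioning.
responses = {
    "adoption": 30,    # adding the platform new
    "increase": 120,   # spending 6% or more
    "flat": 50,        # spending roughly flat
    "decrease": 10,    # spending down 6% or worse
    "replacing": 2,    # leaving the platform
}

def net_score(r: dict) -> float:
    """Percent adopting or spending more, minus percent spending less or churning."""
    n = sum(r.values())
    return 100.0 * ((r["adoption"] + r["increase"]) - (r["decrease"] + r["replacing"])) / n

print(f"N = {sum(responses.values())}, net score = {net_score(responses):.0f}%")
```

With these made-up numbers the score lands in the mid-60s, the same ballpark as the figures discussed above.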
Coming back to those Streamlit numbers, that's a significant uptick of people responding that they're adopting Streamlit. That was pretty impressive to us. And it's hard to see, but the net score stayed pretty constant for Streamlit at 51%. It was 52%, I think, in the previous quarter, well over that magic 40% mark. But when you blend it with Snowflake, it does sort of bring things down a little bit. Now there are two key points here. One is that the acquisition seems to have gained exposure right out of the gate, as evidenced by the large number of responses. And two, the spending momentum. Again, while it's lower than Snowflake overall, and when you blend it with Snowflake it does pull it down, it's very healthy and steady. Now let's do a little peer comparison with some of our favorite names in this space. This chart shows net score, or spending velocity, on the Y-axis, and overlap or presence, pervasiveness if you will, in the data set on the X-axis. That red dotted line again is that 40% highly elevated net score that we like to talk about. And the table inserted informs us as to how the companies are plotted, where the dots set up, the net scores, the Ns. And we're comparing a number of database players, although just a caution, Oracle includes all of Oracle, including its apps. But we just put it in there for reference because it is the leader in database. Right off the bat, Snowflake jumps out with a net score of 64%. The 60% from the earlier chart, again, included Streamlit. So you can see its core database, data warehouse business actually is higher than the total company average that we showed you before, 'cause the Streamlit is blended in. So when you separate it out, Streamlit is right on top of Databricks. Isn't that ironic? Only Snowflake and Databricks in this selection of names are above the 40% level. You see Mongo and Couchbase, you know, they're solid, and Teradata cloud actually showing pretty well compared to some of the earlier survey results. Now let's isolate on the database data platform sector and see how that shapes up. And for this analysis, same XY dimensions, we've added the big giants, AWS and Microsoft and Google. And notice that those three plus Snowflake are just at or above the 40% line. Snowflake continues to lead by a significant margin in spending momentum and it keeps creeping to the right. That's that N that we talked about earlier. Now here's an interesting tidbit. Snowflake is often asked, and I've asked them myself many times, "How are you faring relative to AWS, Microsoft and Google, these big whales with Redshift and Synapse and BigQuery?" And Snowflake has been telling folks that 80% of its business comes from AWS. And when Microsoft heard that, they said, "Whoa, wait a minute, Snowflake, let's partner up." 'Cause Microsoft is smart, and they understand that the market is enormous. And if they could do better with Snowflake, one, they may steal some business from AWS. And two, even if Snowflake is winning against some of the Microsoft database products, if it wins on Azure, Microsoft is going to sell more compute and more storage, more AI tools, more other stuff to these customers. Now AWS is really aggressive from a partnering standpoint with Snowflake. They're openly negotiating, not openly, but they're negotiating better prices. They're realizing that when it comes to data, the cheaper you make the offering, the more people are going to consume. Scale economies and operating leverage are really powerful things that kick in at volume.
Now Microsoft, they're coming along, they obviously get it, but Google is seemingly resistant to that type of go to market partnership. Rather than lean into Snowflake as a great partner, Google's field force is kind of fighting it. Google itself at Cloud Next heavily messaged what they call the open data cloud, which is a direct rip off of Snowflake. So what can we say about Google? They continue to be kind of behind the curve when it comes to go to market. Now just a brief aside on the competitive posture. I've seen Slootman, Frank Slootman, CEO of Snowflake, in action with his prior companies and how he depositioned the competition. At Data Domain, he eviscerated a company called Avamar with their, what he called their expensive and slow post process architecture. I think he actually called it garbage, if I recall, at one conference I heard him speak at. And he did something similar to BMC when he was at ServiceNow, positioning them as the equivalent of the department of motor vehicles. And so it's interesting to hear how Snowflake openly talks about the data platforms of AWS, Microsoft, Google, and Databricks. I'll give you the sort of short bumper sticker. Redshift is just an on-prem database that AWS morphed to the cloud, which by the way is kind of true. They actually did a brilliant job of it, but it's basically a fact. Microsoft Synapse, a collection of legacy databases, which also kind of morphed to run in the cloud. And even BigQuery, which is considered cloud native by many if not most, is being positioned by Snowflake as originally an on-prem database to support Google's ad business, maybe. And Databricks is for those people smart enough to get into Berkeley that love complexity. And now, Snowflake doesn't, they don't mention Berkeley as far as I know. That's my addition. But you get the point. And the interesting thing about Databricks and Snowflake is, a while ago on theCUBE I said that there was a new workload type emerging around data, where you have the AWS cloud, Snowflake obviously for the cloud database, and Databricks for the data science and ML. You bring those things together and there's this new workload emerging that's going to be very powerful in the future. And it's interesting to see now the aspirations of all three of these platforms are colliding. That's quite a dynamic, especially when you see both Snowflake and Databricks putting in venture money and getting their hooks into the loyalties of the same companies, like dbt Labs and Collibra. Anyway, Snowflake's posture is: we are the pioneer in cloud native data warehousing, data sharing and now data apps. And our platform is designed for business people that want simplicity. The other guys, yes, they're formidable, but we, Snowflake, have an architectural lead and of course we run in multiple clouds. So it's pretty strong positioning, or depositioning, you have to admit. Now I'm not sure I agree with the BigQuery knock completely. I think that's a bit of a stretch, but Snowflake, as we see in the ETR survey data, is winning. So in thinking about the longer term future, let's talk about what's different with Snowflake, where it's headed and what the opportunities are for the company. Snowflake put itself on the map by focusing on simplifying data analytics. What's interesting about that is the company's founders are, as you probably know, from Oracle.
And rather than focusing on transactional data, which is Oracle's sweet spot, the stuff they worked on when they were at Oracle, the founders said, "We're going to go somewhere else. We're going to attack the data warehousing problem and the data analytics problem." And they completely re-imagined the database and how it could be applied to solve those challenges, and reimagined what was possible if you had virtually unlimited compute and storage capacity. And of course Snowflake became famous for separating the compute from storage, being able to completely shut down compute so you didn't have to pay for it when you're not using it, the ability to have multiple clusters hit the same data without making endless copies, and a consumption-based cloud pricing model. And then of course everyone on the planet realized, "Wow, that's a pretty good idea." Every venture capitalist in Silicon Valley has been funding companies to copy that move. And that today has pretty much become mainstream and table stakes. But I would argue that Snowflake not only had the lead, but when you look at how others are approaching this problem, it's not necessarily as clean and as elegant. Some of the early startups I think get it, and maybe had an advantage of starting later, which can be a disadvantage too. But AWS is a good example of what I'm saying here. Its version of separating compute from storage was an afterthought and it's good, it's... given what they had, it was actually quite clever and customers like it, but it's more of a, "Okay, we're going to tier the storage to lower cost, we're going to sort of dial down the compute, not completely, we're not going to shut it off, we're going to minimize the compute required." It's really not true separation like, for instance, Snowflake has. But having said that, we're talking about competitors with lots of resources and credible offerings. And so I don't want to make this necessarily all about the product, but all things being equal, architecture matters, okay? So that's the cloud S-curve, the first one we're showing. Snowflake's still on that S-curve, and in and of itself it's got legs, but it's not what's going to power the company to 10 billion. The next S-curve we denote is the multi-cloud in the middle. And now while 80% of Snowflake's revenue is AWS, Microsoft is ramping up and Google, well, we'll see. But the interesting part of that curve is data sharing, and this idea of data clean rooms. I mean it really should be called the data sharing curve, but I have my reasons for calling it multi-cloud. And this is all about network effects and data gravity, and you're seeing this play out today, especially in industries like financial services and healthcare and government, highly regulated verticals where folks are super paranoid about compliance. They're not going to share data if they're going to get sued for it, if they're going to be on the front page of the Wall Street Journal for some kind of privacy breach. And what Snowflake has done is said, "Put all the data in our cloud." Now, of course, that triggers a lot of people because it's a walled garden, okay? It is. That's the trade off. It's not the Wild West, it's not Windows, it's Mac, it's more controlled. But the idea is that as different parts of the organization, or even partners, begin to share data that they need, it's got to be governed, it's got to be secure, it's got to be compliant, it's got to be trusted.
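To make the sharing mechanics concrete: in Snowflake, a provider grants a consumer access to a single governed copy of the data rather than shipping copies around. Here's a minimal provider-side sketch using the snowflake-connector-python package; the account, database, and consumer names are hypothetical placeholders, and credentials are elided.

```python
# Provider-side sharing sketch. No data is copied; the consumer account
# queries the provider's single governed copy. All names are hypothetical.
import snowflake.connector

conn = snowflake.connector.connect(
    account="provider_org-provider_acct",  # placeholder account identifier
    user="SHARE_ADMIN",
    password="***",                        # credentials elided
)
cur = conn.cursor()

cur.execute("CREATE SHARE IF NOT EXISTS claims_share")
cur.execute("GRANT USAGE ON DATABASE claims_db TO SHARE claims_share")
cur.execute("GRANT USAGE ON SCHEMA claims_db.public TO SHARE claims_share")
cur.execute("GRANT SELECT ON TABLE claims_db.public.claims TO SHARE claims_share")

# Make the share visible to a specific consumer account; if the relationship
# persists over time it becomes one of the "stable edges" discussed next.
cur.execute("ALTER SHARE claims_share ADD ACCOUNTS = partner_org.partner_acct")
```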
So, to measure this sharing, Snowflake introduced the idea of, they call these things stable edges. I think that's the term that they use. And they track a metric around stable edges. And so a stable edge, or think of it as a persistent edge, is an ongoing relationship between two parties that lasts for some period of time, more than a month. It's not just a one-shot deal, a one-and-done type of, "Oh, we shared it for a day, done. I sent you an FTP, it's done." No, it's got to have trajectory over time. Four weeks or six weeks or some period of time that's meaningful (a toy version of this metric is sketched in code below). And that metric is growing. Now, a different metric that they track: I think around 20% of Snowflake customers are actively sharing data today, and then they track the number of those edge relationships that exist. So that's something that's unique. Because again, most data sharing is all about making copies of data. That's great for storage companies, it's bad for auditors, and it's bad for compliance officers. And that trend is just starting out. That middle S-curve is going to kind of hit the base of the steep part of the S-curve, and it's going to have legs through this decade, we think. And then finally, the third wave that we show here is what we call super cloud. That's why I called it multi-cloud before, so it could invoke super cloud. The idea is that you've built a PaaS layer that is purpose built for a specific objective, and in this case it's building data apps that are cloud native, shareable and governed. And it's a long-term trend that's going to take some time to develop. I mean, application development platforms can take five to 10 years to mature and gain significant adoption, but this one's unique. This is a critical play for Snowflake. If it's going to compete with the big cloud players, it has to have an app development framework like Snowpark. It has to accommodate new data types like transactional data. That's why it announced this thing called Unistore last June at Snowflake Summit. And the pattern that's forming here is Snowflake is building layer upon layer with its architecture at the core. It's not, currently anyway, going out and saying, "All right, we're going to buy a company that's got another billion dollars in revenue and that's how we're going to get to 10 billion." So it's not buying its way into new markets through revenue. It's actually buying smaller companies that can complement Snowflake, that it can turn into revenue for growth and that fit into the data cloud. Now as to the 10 billion by fiscal year 28, is that achievable? That's the question. Yeah, I think so. With the momentum, resources, go-to-market, product and management prowess that Snowflake has? Yes, it's definitely achievable. And one could argue that $10 billion is too conservative. Indeed, Snowflake CFO Mike Scarpelli will fully admit his forecast is built on existing offerings. He's not including revenue, as I understand it, from all the new stuff that's in the pipeline, because he doesn't know what it's going to look like. He doesn't know what the adoption is going to look like. He doesn't have data on that adoption, not just yet anyway. And now of course things can change quite dramatically. It's possible that its forecasts for existing businesses don't materialize, or competition picks them off, or a company like Databricks actually is able, in the longer term, to replicate the functionality of Snowflake with open source technologies, which would be a very competitive source of innovation.
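For what it's worth, the stable edge metric described above reduces to a simple filter over sharing relationships. A toy sketch, with made-up data and the four-week threshold taken from the discussion:

```python
# Toy illustration of the "stable edge" idea: a sharing relationship counts
# only if it has persisted past some threshold. The data here is made up.
from datetime import date, timedelta

# (provider, consumer, first_shared, last_active)
edges = [
    ("bank_a", "fintech_b", date(2022, 6, 1), date(2022, 10, 30)),
    ("hospital_x", "insurer_y", date(2022, 10, 5), date(2022, 10, 20)),
    ("retailer_r", "supplier_s", date(2022, 3, 15), date(2022, 10, 28)),
]

def stable_edges(edges, min_age=timedelta(weeks=4)):
    """Keep edges where the two parties kept sharing for at least min_age."""
    return [e for e in edges if e[3] - e[2] >= min_age]

print(f"{len(stable_edges(edges))} of {len(edges)} edges are stable")  # 2 of 3
```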
Risks aside, in our view there's plenty of room for growth. The market is enormous, and the real key is, can and will Snowflake deliver on the promise of simplifying data? Of course, we've heard this before, from data warehouses, data marts and data lakes and master data management and ETL and data movers and data copiers and Hadoop, and a raft of technologies that have not lived up to expectations. And we've also, by the way, seen some tremendous successes in the software business with the likes of ServiceNow and Salesforce. So will Snowflake be the next great software name and hit that 10 billion magic mark? I think so. Let's reconnect in 2028 and see. Okay, we'll leave it there today. I want to thank Chip Simonton for his input to today's episode. Thanks to Alex Myerson, who's on production and manages the podcast. Ken Schiffman as well. Kristin Martin and Cheryl Knight help get the word out on social media and in our newsletters. And Rob Hof is our Editor in Chief over at SiliconANGLE. He does some great editing for us. Check it out for all the news. Remember, all these episodes are available as podcasts. Wherever you listen, just search Breaking Analysis podcast. I publish each week on wikibon.com and siliconangle.com. Or you can email me to get in touch at david.vellante@siliconangle.com. DM me @dvellante or comment on our LinkedIn posts. And please do check out etr.ai, they've got the best survey data in the enterprise tech business. This is Dave Vellante for theCUBE Insights, powered by ETR. Thanks for watching, thanks for listening, and we'll see you next time on breaking analysis. (upbeat music)
Super Data Cloud | Supercloud22
(electronic music) >> Welcome back to our studios in Palo Alto, California. My name is Dave Vellante. I'm here with John Furrier, who is taking a quick break. You know, one of the early examples that we used of so-called super cloud was Snowflake. We called it a super data cloud. We had, really, a lot of fun with that. And we've started to evolve our thinking. Years ago, we said that data was going to form in the cloud around industries and ecosystems. And Benoit Dageville is a many-time guest of theCUBE. He's the co-founder and president of products at Snowflake. Benoit, thanks for spending some time with us at Supercloud 22, good to see you. >> Thank you, thank you, Dave. >> So, you know, like I said, we've had some fun with this meme. But it really is, we heard on the previous panel, everybody's using Snowflake as an example. Somebody who builds on top of hyperscale infrastructure. You're not building your own data centers. And, so, are you building a super data cloud? >> We don't call it exactly that way. We don't like the super word, it's a bit dismissive. >> That's our term. >> About our friends, the cloud provider friends. But we call it a data cloud. And the vision, really, for the data cloud is, indeed, it's a cloud which overlays the hyperscaler clouds. But there is a big difference, right? There are several ways to do this super cloud, as you name it. The way we picked is to create one single system, and that's very important, right? There are several ways, right. You can instantiate your solution in every region of the cloud and, you know, potentially that region could be AWS, that region could be GCP. So, you are, indeed, a multi-cloud solution. But Snowflake, we did it differently. We are really creating cloud regions which are superimposed on top of the cloud provider regions, the infrastructure regions. So, we are building our regions. But where it's very different is that each region of Snowflake is not one instantiation of our service. Our service is global, by nature. We can move data from one region to the other. When you land in Snowflake, you land into one region. But you can grow from there and you can, you know, exist in multiple clouds at the same time. And that's very important, right? It's not different instantiations of a system, it's one single instantiation which covers many cloud regions and many cloud providers. >> So, we used Snowflake as an example. And we're trying to understand what the salient aspects are of your data cloud, what we call super cloud. In fact, you've used the word instantiate. Kit Colbert, just earlier today, laid out, he said, there's sort of three levels. You can run it on one cloud and communicate with the other cloud, you can instantiate on the clouds, or you can have the same service running 24/7 across clouds, that's the hardest example. >> Yeah. >> The most mature. You just described, essentially, doing that. How do you enable that? What are the technical enablers? >> Yeah, so, as I said, first we start by building, you know, Snowflake regions. We have today 30 regions that span the world, so it's a worldwide system, with many regions. But all these regions are connected together. They are meshed together with our technology, we name it Snow Grid, and that's hard to do because, you know, an Azure region can talk to an AWS region, or GCP regions, and as a user of our cloud, you don't really see these regional differences, that regions are potentially in different clouds.
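Benoit elaborates on this in the next exchange: rather than joining data remotely across clouds, you replicate it and query a local replica. As a hedged sketch of that workflow, the statements below follow Snowflake's documented database replication syntax, driven here through snowflake-connector-python; every name (organization, accounts, database) is a made-up placeholder.

```python
# Cross-region, cross-cloud replication sketch. All names are placeholders.
import snowflake.connector

# On the source account (say, an AWS region): allow replication to a sibling
# account in another cloud within the same organization.
src = snowflake.connector.connect(account="myorg-aws_east", user="ADMIN", password="***")
src.cursor().execute(
    "ALTER DATABASE sales ENABLE REPLICATION TO ACCOUNTS myorg.azure_west"
)

# On the target account (say, an Azure region): materialize a local replica
# and refresh it; the sync travels over Snowflake's inter-region mesh.
tgt = snowflake.connector.connect(account="myorg-azure_west", user="ADMIN", password="***")
cur = tgt.cursor()
cur.execute("CREATE DATABASE sales_replica AS REPLICA OF myorg.aws_east.sales")
cur.execute("ALTER DATABASE sales_replica REFRESH")
```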
When you use Snowflake, your presence as an organization can be in several regions, several clouds if you want, both geographic regions and cloud providers. >> So, I can share data irrespective of the cloud, and I'm in the Snowflake data cloud, is that correct? I can do that today? >> Exactly, and that's very critical, right? What we wanted is to remove data silos. When you instantiate a system in one single region, and that system is locked in that region, you cannot communicate with other parts of the world; you are locking data in one region. Right, and we didn't want to do that. We wanted data to be distributed the way the customer wants it to be distributed across the world, and potentially sharing data at world scale. >> Does that mean if I'm in one region and I want to run a query, if I'm in AWS in one region, and I want to run a query on data that happens to be in an Azure cloud, I can actually execute that? >> So, yes and no. The way we do it, it is very expensive to do that. Because, generally, if you want to join data which are in different regions and different clouds, it's going to be very expensive because you need to move data every time you join it. So, the way we do it is that you replicate the subset of data that you want to access from one region into the other region. So, you can create this data mesh, but data is replicated to make it very cheap and very performant too. >> And the Snow Grid, does that have the metadata intelligence to actually...? >> Yes, yes. >> Can you describe that a little? >> Yeah, Snow Grid is both a way to exchange metadata. So, each region of Snowflake knows about all the other regions of Snowflake. Every time we create a new region, the metadata is distributed over our data cloud. Not only does each region know all the regions, it knows every organization that exists in our cloud, where this organization is, where data can be replicated by this organization. And then, of course, it's also used as a way to exchange data, right? So, you can exchange data at any scale of data size. And I was just receiving an email from one of our customers who moved more than four petabytes of data, cross region, cross cloud providers, in, you know, a few days. And it's a lot of data, so it takes some time to move. But they were able to do that online, completely online, and switch over to the other region, which is very important also. >> So, one of the hardest parts about super cloud that I'm still struggling through is the security model. Because you've got the cloud as your sort of first line of defense. And now we've got multiple clouds, with multiple first lines of defense. I've got a shared responsibility model across those clouds, I've got different tools in each of those clouds. Do you take care of that? Where do you pick up from the cloud providers? Do you abstract that security layer? Do you bring in partners? It's very complicated. >> No, this is a great question. Security has always been the most important aspect of Snowflake since day one, right? This is the question that every customer of ours has: you know, how can you guarantee the security of my data? And, so, we secure data really tightly in region. We have several layers of security. It starts by encrypting every data at rest. And that's very important. A lot of customers are not doing that, right? You hear of these attacks, for example, on cloud, where someone left their buckets exposed. And then, you know, you can access the data because it's not encrypted.
So, we are encrypting everything at rest. We are encrypting everything in transit. So, a region is very secure. Now, you know, from one region, you never access data from another region in Snowflake. That's why, also, we replicate data. Now the replication of that data across regions, or the metadata for that matter, has to be at least as secure, so Snow Grid ensures that everything is encrypted, everything is... we have multiple encryption keys, and they're stored in hardware security modules. So, we built Snow Grid such that it's secure and it allows very secure movement of data. >> Okay, so, I know we keep getting into the technology here a lot today, but because super cloud is the future, we actually have to have an architectural foundation on which to build. So, you mentioned a bucket, like an S3 bucket. Okay, that's storage, but you're also, for instance, taking advantage of new semiconductor technology, like Graviton, as an example, that drives efficiency. You guys talk about how you pass that on to your customers, even if it means less revenue for you, so, awesome, we love that, you'll make it up in volume. And, so. >> Exactly. >> How do you deal with the lowest common denominator problem? I was talking to somebody the other day and this individual brought up what I thought was a really good point. What if one cloud, let's say AWS, has the best silicon, and can run the fastest and the least expensive and the lowest power, but another cloud provider hasn't caught up yet? How do you deal with that delta? Do you just take the best of and try to abstract that? >> No, it's a great question. I mean, of course, our software is abstracting all the cloud providers' infrastructure, so that when you run in one region, let's say AWS, or Azure, it doesn't make any difference as far as the applications are concerned. And this abstraction, of course, is a lot of work. I mean, really, a lot of work. Because it needs to be secure, it needs to be performant, on every cloud, and it has to expose APIs which are uniform. And, you know, cloud providers, even though they have potentially the same concepts, let's say block storage, the APIs are completely different. The way these systems are secured is completely different. The errors that you can get, and the retry mechanisms, are very different from one cloud to the other. The performance is also different. We discovered that when we started to port our software. And we had to completely rethink how to leverage block storage in that cloud versus that cloud, just because of performance too. And, so, we had, for example, to stripe data. So, all this work is work that you don't need to do as an application, because our vision, really, is that applications which are running in our data cloud can be abstracted from these differences. And we provide all the services, all the workloads that these applications need. Whether it's transactional access to data, analytical access to data, managing logs, managing metrics, all of this is abstracted too, so that they are not tied to one particular service of one cloud. And distributing these applications across many regions, many clouds, is very seamless. >> So, Snowflake has built, your team has built, a true abstraction layer across those clouds that's available today? It's actually shipping? >> Yes, and we are still developing it. You know, transactional, Unistore as we call it, was announced at last summit. So, they are still, you know, work in progress. >> You're not done yet. >> But that's the vision, right?
And that's important, because we talk about the infrastructure, right? You mentioned a lot about storage and compute. But it's not only that, right? When you think about applications, they need to use a transactional database. They need to use an analytical system. They need to use machine learning. So, you need to provide, also, all these services, which are consistent across all the cloud providers. >> So, let's talk developers. Because, you know, you think Snowpark, you guys announced a big application development push at the Snowflake summit recently. And we have said that a criterion of super cloud is a super PaaS layer, people wince when I say that, but okay, we're just going to go with it. But the point is, it's a purpose built application development layer, specific to your particular agenda, that supports your vision. >> Yes. >> Have you essentially built a purpose built PaaS layer? Or do you just take off-the-shelf, standard PaaS, and cobble it together? >> No, it's a custom build. Because, as you said, what exists in one cloud might not exist in another cloud provider, right? So, we had to build in all these components that a modern application needs. And that goes to machine learning, as I said, transactional and analytical systems, the entire thing, so that it can run in isolation, physically. >> And the objective is the developer experience will be identical across those clouds? >> Yes, the developer doesn't need to worry about cloud providers. And, actually, our system will have, we didn't talk about it, but a marketplace that we have, which actually allows us to deliver. >> We're getting there. >> Yeah, okay. (both laughing) I won't divert. >> No, no, let's go there, because the other aspect of super cloud that we've talked about is the ecosystem. You have to enable an ecosystem to add incremental value; it's the power of many versus the capabilities of one. So, talk about the challenges of doing that. Not just the business challenges but, again, I'm interested in the technical and architectural challenges. >> Yeah, yeah, so, it's really about, I mean, the way we enable our ecosystem and our partners to create value on top of our data cloud is via the marketplace, where you can put shared data on the marketplace, provide listings on this marketplace, which are data sets. But it goes way beyond data; it's all the way to applications. So, you can think of it as the iPhone, a little bit more, all right. Your iPhone is great, not so much because the hardware is great, or because of iOS, but because of all the applications that you have. And all these applications are not necessarily developed by Apple, basically. So, it's the same model with our marketplace. We foresee an environment where providers and partners are going to build these applications. We call them native applications. And we are going to help them distribute these applications across clouds, everywhere in the world, potentially. And they don't need to worry about that. They don't need to worry about how these applications are going to be instantiated. We are going to help them monetize these applications. So, that unlocks, you know, really, all the partner ecosystem that you have seen, you know, with something like the iPhone, right? It has created so many new companies that have developed these applications. >> Your detractors have criticized you for being a walled garden. I've actually used that term.
I used terms like de facto standard, which are maybe less sensitive to you, but, nonetheless, we've seen de facto standards actually deliver value. I've talked to Frank Slootman about this, and he said, "Dave, we deliver value, that's what we're all about." At the same time, he even said to me, and I want your thoughts on this, "Look, we have to embrace open source where it makes sense." You guys announced Apache Iceberg support. So, what are your thoughts on that? Is that to enable a developer ecosystem? Why did you do Iceberg? >> Yeah, Iceberg is very important. So, just to give some context, Iceberg is an open table format. >> Right. >> Which was first developed by Netflix. And Netflix open sourced it in the Apache community. So, we embraced that open source standard because it's widely used by many companies. And, also, many companies have really invested a lot of effort in building big data Hadoop solutions, or data lake solutions, and they want to use Snowflake. And they couldn't really use Snowflake, because all their data was in open formats. So, we are embracing Iceberg to help these companies move to the cloud. But why have we been reluctant about direct access to data? Direct access to data is a little bit of a problem for us. And the reason is, when you have direct access to data, now you have direct access to storage. Now you have to understand, for example, the specificities of one cloud versus the other. So, as soon as you start to have direct access to data, you lose your cloud-agnostic layer; you don't access data with APIs. When you have direct access to data, it's very hard to secure your data, because you need to grant direct access to tools which are not protected. And you see a lot of hacking of data because of that. So, direct access to data is not serving our customers well, and that's why we have been reluctant to do that. Because it is not cloud agnostic. You have to code for that, you need a lot of intelligence, versus API access. So we want open APIs. That's, I guess, the way we embrace openness: by open APIs, versus accessing data directly. >> iPhone. >> Yeah, yeah, iPhone, APIs, you know. We define a set of APIs because, you know, the implementation of the APIs can change, can improve. You can improve compression of data, for example. If you open direct access to data, now you cannot evolve. >> My point is, you made a promise of a governed, secure, data sharing ecosystem. It works the same way, so that's the path that you've chosen. Benoit Dageville, thank you so much for coming on theCUBE and participating in Supercloud 22, really appreciate that. >> Thank you, Dave. It was a great pleasure. >> All right, keep it right there, we'll be right back with our next segment, right after this short break. (electronic music)
Ed Walsh, ChaosSearch | AWS re:Inforce 2022
(upbeat music) >> Welcome back to Boston, everybody. This is the birthplace of theCUBE. In 2010, May of 2010, at EMC World, right in this very venue, John Furrier called it the chowder and lobster post. I'm Dave Vellante. We're here at RE:INFORCE 2022 with Ed Walsh, CEO of ChaosSearch. Doing a drive-by, Ed. Thanks so much for stopping in. You're going to help me wrap up in our final editorial segment. >> Looking forward to it. >> I really appreciate it. >> Thank you for including me. >> How about that? 2010. >> That's amazing. It was really in this-- >> Really in this building. Yeah, we had to sort of bury our way in, tunnel our way into the Blogger Lounge. We did four days. >> Weekends, yeah. >> It was epic. It was really epic. But I'm glad they're back in Boston. AWS was going to do June in Houston. >> Okay. >> Which would've been awful. >> Yeah, yeah. No, this is perfect. >> Yeah. Thank God they came back. You know, Boston in summer is great. I know it's been hot, and of course you and I are from this area. >> Yeah. >> So how you been? What's going on? I mean, it's a little crazy out there. The stock market's going crazy. >> Sure. >> Having the techlash, what are you seeing? >> So it's an interesting time. So I ran a company in 2008, so we've been through this before. By the way, the world's not ending, we'll get through this. But it is an interesting conversation as an investor, but also even with the customers. There's some hesitation, but you have to basically have the right value prop, otherwise things aren't going to get sold. It can't be a nice-to-have; it has to be a need-to-have. But I think we all get through it. And then there is some, on the VC side, it's now buckle down, let's figure out what to do, which is always a challenge for startups. >> Pre-2000, maybe you weren't a CEO, but you were definitely an executive. And so now it's different, and a lot of younger people haven't seen this. You've got interest rates now rising. Okay, we've seen that before, but it looks like you've got inflation, you got interest rates rising. >> Yep. >> The consumer spending patterns are changing. You had $6, $7 gas at one point. So you have these weird crosscurrents, >> Yup. >> And people are thinking, "Okay, post-September now, maybe because of the recession, the Fed won't have to keep raising interest rates and tightening." But I don't know what to root for. It's like half full, half empty. (Ed laughing) >> But we haven't been in an environment with high inflation. At least not in my career. >> Right. Right. >> I mean, I got in in '92, that was long gone, right? >> Yeah. >> So it is an interesting regime change that we're going to have to deal with, but there's a lot of analogies between 2008 and now that you still have to work through too, right? So, anyway, I don't think the world's ending. I do think you have to run a tight shop. So I think the grow-at-all-costs era is gone. I do think discipline's back in, which, for most of us, discipline never left, right? So, to me that's the name of the game. >> What do you tell, just generally, I mean you've been the CEO of a lot of private companies. And of course one of the things that you do to retain people and attract people is you give 'em stock, and it's great and everybody's excited. >> Yeah. >> I'm sure they're excited 'cause you guys are a rocket ship.
But so what's the message now that, okay, the market's down, valuations are down, the trees don't grow to the moon, we all know that. But what are you telling your people? What's their reaction? How do you keep 'em motivated? >> So like anything, you want to over-communicate during these times. So I actually over-communicate. You get all these, you know, the Sequoia decks, 2008 and the recent one... >> (chuckles) Rest in peace good times, that one, right? >> I literally share it. Why? It's like, hey, this is what's going on in the real world. It's going to affect us. It has almost nothing to do with us specifically, but it will affect us. Now we can't not pay attention to it. It does change how you're going to raise money, so you got to make sure you have the right runway to be there. So it does change what you do, but I think you over-communicate. So that's what I've been doing, and I'm kind of a student of the game, so I try to share it. Some appreciate it; to others I'm just saying, this is normal, we'll get through this, and this is what happened in 2008. And trust me, once the market hits bottom, give it another month afterwards. Then everyone says, oh, the bottom's in, and we're back to business. Valuations don't go immediately back up, but right now no one knows where the bottom is, and that's where you get the kind of world's-ending type of talk. >> Well, it's interesting because you talked about, I said rest in peace good times >> Yeah >> that was the Sequoia deck, and the message was tighten up. Okay, and I'm not saying you shouldn't tighten up now, but the difference is, there was this period of two years of easy money, and even before that, it was pretty easy money. >> Yeah. >> And so companies are well capitalized, they have runway, so it's like, okay, I was talking to Frank Slootman about this, now of course they're a public company, and it's like, we're not taking the foot off the gas. We're inherently profitable, >> Yeah. >> we're growing like crazy, we're going for it. You know? So that's a little bit of a different dynamic. There's a lot of good runway out there, isn't there? >> But also, you look at the different companies that were either born in or were able to power through those environments, they're actually better off. You come out stronger, in a more dominant position. So Frank, listen, if you see what Frank's done, it's been unbelievable to watch his career, right? In fact, he was at Data Domain, I was at Avamar, so... but look at what he's done since, he's crushed it. Right? >> Yeah. >> So for him to say, hey, I'm going to literally hit the gas and keep going, I think that's the right thing for Snowflake and the right thing for a lot of people. But for people in different roles, I literally say that you have to take it seriously. What you can't be is, well, Frank's in a different situation. What is it...? How many billion does he have in the bank? So it's...
And now we're starting to see ecosystems really build, you know, you've used the term before, standing on the shoulders of giants, you've used that a lot. >> Yep. >> And so we're seeing that. Jerry Chen wrote a seminal piece on Castles in The Cloud, so we coined this term SuperCloud to connote this abstraction layer that hides the underlying complexities and primitives of the individual clouds and then adds value on top of it and can adjudicate and manage, irrespective of physical location, Supercloud. >> Yeah. >> Okay. What do you think about that concept?. How does it maybe relate to some of the things that you're seeing in the industry? >> So, standing on shoulders of giants, right? So I always like to do hard tech either at big company, small companies. So we're probably your definition of a Supercloud. We had a big vision, how to literally solve the core challenge of analytics at scale. How are you going to do that? You're not going to build on your own. So literally we're leveraging the primitives, everything you can get out of the Amazon cloud, everything get out of Google cloud. In fact, we're even looking at what it can get out of this Snowflake cloud, and how do we abstract that out, add value to it? That's where all our patents are. But it becomes a simplified approach. The customers don't care. Well, they care where their data is. But they don't care how you got there, they just want to know the end result. So you simplify, but you gain the advantages. One thing's interesting is, in this particular company, ChaosSearch, people try to always say, at some point the sales cycle they say, no way, hold on, no way that can be fast no way, or whatever the different issue. And initially we used to try to explain our technology, and I would say 60% was explaining the public, cloud capabilities and then how we, harvest those I guess, make them better add value on top and what you're able to get is something you couldn't get from the public clouds themselves and then how we did that across public clouds and then extracted it. So if you think about that like, it's the Shoulders of giants. But what we now do, literally to avoid that conversation because it became a lengthy conversation. So, how do you have a platform for analytics that you can't possibly overwhelm for ingest. All your messy data, no pipelines. Well, you leverage things like S3 and EC2, and you do the different security things. You can go to environments say, you can't possibly overrun me, I could not say that. If I didn't literally build on the shoulders giants of all these public clouds. But the value. So if you're going to do hard tech as a startup, you're going to build, you're going to be the principles of Supercloud. Maybe they're not the same size of Supercloud just looking at Snowflake, but basically, you're going to leverage all that, you abstract it out and that's where you're able to have a lot of values at that. >> So let me ask you, so I don't know if there's a strict definition of Supercloud, We sort of put it out to the community and said, help us define it. So you got to span multiple clouds. It's not just running in each cloud. There's a metadata layer that kind of understands where you're pulling data from. Like you said you can pull data from Snowflake, it sounds like we're not running on Snowflake, correct? >> No, complimentary to them in their different customers. >> Yeah. Okay. >> They want to build on top of a data platform, data apps. >> Right. And of course they're going cross cloud. >> Right. 
>> Is there a PaaS layer in there? We've said there's probably a Super PaaS layer. You're probably not doing that, but you're allowing people to bring their own, bring-your-own-PaaS sort of thing, maybe. >> So we're a little bit different, but basically we publish open APIs. We don't have a user interface. We say, keep the user interface. Again, we're solving the challenge of analytics at scale; we're not trying to retrain your analysts, or your DevOps or your SRE or your SecOps teams. They use the tools they already use: Elasticsearch APIs, SQL APIs (a sketch of what querying those looks like follows below). So really they program, they build applications on top of us. Equifax is a good example; case study coming out later this week, after 18 months in production. But basically they're building on us; we provide the abstraction layer. The quote, and I'm going to butcher it, from Jeff Tincher, who owns all of the SREs worldwide, was to the effect of, "Hey, I'm able to rethink what I do for my data pipelines." But then he also talked about how he really doesn't have to worry about the data he puts in it. We deal with that. And he just has to query on the other side. That simplicity. We couldn't have done that without all that. So anyway, what I like about the definition is, if you were going to do something hard in the world, why would you try to rebuild what Amazon, Google and Azure or Snowflake did? You're going to add things on top. We can still do intellectual property. We're still doing patents. So five granted patents, all in this. But literally the abstraction layer is the simplification. The end users do not want to know that complexity, even though they ask the questions. >> And I think too, the other attribute is, it's ecosystem enablement. Whereas I think, >> Absolutely >> in general, in the Multicloud 1.0 era, the ecosystem wasn't thinking about, okay, how do I build on top and abstract that. So maybe it is Multicloud 2.0; we chose to use Supercloud. So I'm wondering, we're at the security conference, RE:INFORCE, is there a security Supercloud? Maybe Snyk has the developer Supercloud, or maybe Okta has the identity Supercloud. I think CrowdStrike maybe not, 'cause CrowdStrike competes with Microsoft. And because Microsoft, what's interesting, Merritt Baer was just saying, look, we don't show up in the spending data for security because we're not charging for most of our security. We're not trying to make a big business. So that's kind of interesting, but is there a potential for the security Supercloud? >> So, I think so. But also, I'll give you one thing: I talked, just today, in at least three different conversations where everyone wants to log data. It's a little bit specific to us, but basically they want to do the security data lake. The idea of, and Snowflake talks about this too, the idea of putting all the data in one repository, and then how do you abstract it out and get value from it? Maybe not perfect, but it becomes simple to do, yet hard to get value out of. So the different players are going to do that. That's what we do. We're able to, once you land it in your S3, or, it doesn't matter, your cloud of choice, simple storage, we allow you to get after that data, but we take the primitives and hide them from you. And all you do is query the data, and we're spinning up stateless compute to go after it. So then if I look around the floor, there's going to be a bunch of these players. I don't think... why would someone on this floor try to recreate what Amazon or Google or Azure had? They're going to build on top of it.
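Ed's point about open Elasticsearch- and SQL-compatible APIs is concrete enough to sketch. As a hedged illustration only (the endpoint, index name, and credentials below are hypothetical placeholders, not documented ChaosSearch values), a client written against the standard Elasticsearch _search API might look like this:

```python
# Hypothetical query against an Elasticsearch-compatible analytics endpoint.
# The URL, index/view name, and credentials are placeholders for illustration.
import json
import requests

resp = requests.post(
    "https://example.chaossearch.io/elastic/cloudtrail-view/_search",  # placeholder URL
    auth=("api_key_id", "api_key_secret"),                             # placeholder creds
    headers={"Content-Type": "application/json"},
    data=json.dumps({
        "query": {"match": {"eventName": "ConsoleLogin"}},  # standard ES query DSL
        "size": 10,
    }),
)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_source"])
```

The design point, as Ed describes it, is that any tool already speaking the Elasticsearch or SQL dialect keeps working; the indexing of the data sitting in S3 underneath stays hidden.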
And now the key thing is, do you leave it in open standards? And now, we're open APIs. People are building on top of my open APIs, or do you try to put 'em in a walled garden where they're in, now, your Supercloud? Our belief is, part of it is, it needs to be open access, and you let them go after it. >> Well, and build your applications on top of it openly. >> That comes back to Snowflake. That's what Snowflake's doing. And they're basically saying, hey, come into our proprietary environment. And the benefit is, and I think both can win. There's a big market. >> I agree. But I think the benefit of Snowflake's approach is, okay, we're going to have federated governance, we're going to have data sharing, you're going to have access to all the ecosystem players. >> Yep. >> And everything's going to be controlled, and you know what you're getting. The flip side of that is Databricks at the other end >> Yeah. >> of that spectrum, which is, no, no, you got to be open. >> Yeah. >> So what's going to happen, well, what's happening clearly, is Snowflake's saying, okay, we've got Snowpark, we're going to allow Python, we're going to have Apache Iceberg. We're going to have open source tooling that you can access. By the way, it's not going to be as good as our walled garden. And the flip side of that is you get Databricks coming at it from a data science and data engineering perspective. And there's a lot of gaps in between, aren't there? >> And I think they both win. Like, for instance, we didn't do a Snowpark integration, but we work with people building data apps on top of Snowflake or Databricks. And what we do is, we can add value to that with what we've done, again, using all the Supercloud stuff we've done. But we deal with the unstructured data, the four V's coming at you. You can't pipeline that to save your life. So we actually could be additive as they're trying to do, like, a security data cloud inside of Snowflake, or do the same thing in Databricks. That's where we can play. Now, we play with them at the application level, where apps get some data from them and some data from us. But I believe there's a partnership there that will do it inside their environment. To us, they're just another large-scale environment that my customers want to get after data in, and they want me to abstract it out and give value. >> So it's another repository to you. >> Yeah. >> Okay. So I think Snowflake recently added support for unstructured data. You chose not to do Snowpark because why? >> Well, so, the way they're doing the unstructured data is not bad. It's JSON data. Basically, this is the dilemma. Everyone wants their application developers to be flexible, move fast, securely, but it's about productivity. So you give 'em flexibility. The problem with that is, analytics on the other end wants structure to be performant. And this is where Snowflake, they have to somehow get that raw data, and it's changing every day, because you just let the developers do what they want now, in some semi-structured form, to do what you need to run your business fast and securely. So it completely breaks the pipelines. So they have large customers trying to do big integrations for this messy data, and it doesn't quite work, 'cause you literally just can't make the pipelines work. So that's where we're complementary, we do it. So for now, that particular integration wasn't... we'd need a little bit deeper integration to do that. So we're integrating, actually, at the data app layer. But you could see us doing it, and, listen, I think Snowflake's a good actor.
>> Yeah. And I think they're trying to figure out >> Yeah. >> how to grow their ecosystem. Because they know they can't do it all. In fact... >> And we solve the key thing: they just can't do certain things, and we do that well. Yeah, they have SQL, but that's where it ends. >> Yeah. >> We do the messy data, and that's how we play with them. >> And when you talk to one of their founders, anyway, Benoit, he comes on theCUBE and he's like, we start with simple. >> Yeah. >> It reminds me of the guy from Pure Storage, Coz, he's always like, no, if it starts to get too complicated... So that's why they said, all right, we're not going to start out trying to figure out how to do complex joins and workload management. And they turned that into a feature. So like you say, I think both can win. It's a big market. >> I think it's a good model. And I love to see Frank, you know, move. >> Yeah. >> I forgot, so you were Avamar... >> Back in the day. >> You guys used to hate each other, right? >> No, no, no. >> No. I mean, it's all good. >> But the thing is, look what he's done. Like, I wouldn't bet against Frank. I think it's a good message. You can see clients trying to do it. Same thing with Databricks, same thing with BigQuery. We see a lot of the same dynamic with BigQuery. It's good for a lot of things, but it's not everything you need to do. And there's ways for the ecosystem to play together. >> Well, what's interesting about BigQuery is, it is truly cloud native, as is Snowflake. You know, whereas Amazon Redshift was sort of ParAccel cobbled together. It's great engineering, but BigQuery gets a lot of high marks. But again, there's limitations to everything. That's why companies like yours can exist. >> And that's why, so back to the Supercloud, it allows me as a company to participate in that, because I'm leveraging all the underlying pieces. We couldn't be doing what we're doing now without leveraging the Supercloud concepts, right, so... >> Ed, I really appreciate you coming by, helping me wrap up today at RE:INFORCE. Always a pleasure seeing you, my friend. >> Thank you. >> All right. Okay, this is a wrap on day one. We'll be back tomorrow. I'll be solo. John Furrier had to fly out, but we'll be following what he's doing. This is RE:INFORCE 2022. You're watching theCUBE. I'll see you tomorrow.
Breaking Analysis: Snowflake Summit 2022...All About Apps & Monetization
>> From theCUBE studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is "Breaking Analysis" with Dave Vellante. >> Snowflake Summit 2022 underscored that the ecosystem excitement which was once forming around Hadoop is being reborn, escalated, and coalescing around Snowflake's data cloud. What was once seen as a simpler cloud data warehouse and good marketing with the data cloud is evolving rapidly with new workloads, a vertical industry focus, data applications, monetization, and more. The question is, will the promise of data be fulfilled this time around, or is it same wine, new bottle? Hello, and welcome to this week's Wikibon CUBE Insights powered by ETR. In this "Breaking Analysis," we'll talk about the event, the announcements that Snowflake made that are of greatest interest, the major themes of the show, what was hype and what was real, the competition, and some concerns that remain in many parts of the ecosystem and pockets of customers. First, let's look at the overall event. It was held at Caesars Forum. Not my favorite venue, but I'll tell you it was packed. Fire marshal full, as we sometimes say. Nearly 10,000 people attended the event. Here's Snowflake's CMO Denise Persson on theCUBE describing how this event has evolved. >> Yeah, two, three years ago, we were about 1,800 people at a Hilton in San Francisco. We had about 40 partners attending. This week we're close to 10,000 attendees here, almost 10,000 people online as well, and over 200 partners here on the show floor. >> Now, those numbers from 2019 remind me of the early days of Hadoop World, which was put on by Cloudera, but then Cloudera handed off the event to O'Reilly, as this article that we've inserted, if you bring back that slide, would say. The headline almost got it right. Hadoop World was a failure, but it didn't have to be. Snowflake has filled the void created by O'Reilly when it first killed Hadoop World, then killed the name, and then killed Strata. Now, ironically, the momentum and excitement from Hadoop's early days probably could have stayed with Cloudera, but the beginning of the end was when they gave the conference over to O'Reilly. We can't imagine Frank Slootman handing the keys to the kingdom to a third party. Serious business was done at this event. I'm talking substantive deals. Salespeople from a host sponsor and the ecosystems that support these events, they love physical. They really don't like virtual, because physical, belly to belly, means relationship building, pipeline, and deals. And that was blatantly obvious at this show. And in fairness, that's true of all theCUBE events we've done this year, but this one was more vibrant because of its attendance and the action in the ecosystem. Ecosystem is a hallmark of a cloud company, and that's what Snowflake is. We asked Frank Slootman on theCUBE, was this ecosystem evolution by design or did Snowflake just kind of stumble into it? Here's what he said. >> Well, when you are a data cloud, you have data, and people want to do things with that data. They don't want to just run data operations, populate dashboards, run reports. Pretty soon they want to build applications, and after they build applications, they want to build businesses on them. So it goes on and on and on. So it drives your development to enable more and more functionality on that data cloud. Didn't start out that way, you know, we were very, very much focused on data operations.
Then it becomes application development, and then it becomes, hey, we're developing whole businesses on this platform. So similar to what happened to Facebook in many ways. >> So it sounds like it was maybe a little bit of both. The Facebook analogy is interesting, because Facebook is a walled garden, as is Snowflake, but when you come into that garden, you have assurances that things are going to work in a very specific way, because a set of standards and protocols is being enforced by a steward, i.e. Snowflake. This means things run better inside of Snowflake than if you try to do all the integration yourself. Now, maybe over time an open source version of that will come out, but if you wait for that, you're going to be left behind. That said, Snowflake has made moves to make its platform more accommodating to open source tooling in many of its announcements this week. Now, I'm not going to do a deep dive on the announcements. Matt Sulkins from Monte Carlo wrote a decent summary of the keynotes, and a number of analysts like Sanjeev Mohan, Tony Baer and others are posting some deeper analysis on these innovations, and so we'll point to those. I'll say a few things though. Unistore extends the type of data that can live in the Snowflake data cloud. It's enabled by a new feature called hybrid tables, a new table type in Snowflake. One of the big knocks against Snowflake was it couldn't handle transaction data. Several database companies are creating this notion of a hybrid where both analytic and transactional workloads can live in the same data store. Oracle's doing this, for example, with MySQL HeatWave, and there are many others. We saw Mongo earlier this month add an analytics capability to its transaction system. Mongo also added SQL, which was kind of interesting. Here's what Constellation Research analyst Doug Henschen said about Snowflake's moves into transaction data. Play the clip. >> Well, with Unistore they're reaching out and trying to bring transactional data in. Hey, don't limit this to analytical information. And there's other ways to do that, like CDC and streaming, but they're very closely tying that again to the marketplace, with the idea of bring your data over here and you can monetize it. Don't just leave it in that transactional database. So another reach to a broader play across a big community that they're building. >> And you're also seeing Snowflake expand its workload types in its unique way, and through Snowpark and its Streamlit acquisition, enabling Python so that native apps can be built in the data cloud and benefit from all that structure and the features that Snowflake has built in. Hence that Facebook analogy, or maybe the App Store, the Apple App Store, as I proposed as well. Python support also widens the aperture for machine intelligence workloads. We asked Snowflake senior VP of product, Christian Kleinerman, which announcements he thought were the most impactful. And despite the who's-your-favorite-child nature of the question, he did answer. Here's what he said. >> I think the native applications is the one that looks like, eh, I don't know about it on the surface, but it has the biggest potential to change everything. That creates an entire ecosystem of solutions, within a company or across companies, that I don't know that we know what's possible.
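Backing up to the Unistore announcement: hybrid tables are the enabling feature. Here's a hedged sketch of what that looks like, using the syntax shown in the Summit 2022 preview; the table, columns, and connection details are invented, and the feature was in preview at the time, so treat the specifics as subject to change.

```python
import snowflake.connector  # pip install snowflake-connector-python

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="...",
    warehouse="analytics_wh", database="apps", schema="public",
)
cur = conn.cursor()

# HYBRID TABLE is the Unistore table type Snowflake previewed:
# row-oriented storage with a required primary key for fast point
# reads and writes, while remaining queryable analytically.
cur.execute("""
    CREATE OR REPLACE HYBRID TABLE orders (
        order_id   INT PRIMARY KEY,
        customer   STRING,
        amount     NUMBER(10,2),
        created_at TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP()
    )
""")

# A transactional-style single-row write...
cur.execute(
    "INSERT INTO orders (order_id, customer, amount) VALUES (1, 'acme', 99.50)"
)

# ...and the same table serving an analytical aggregate, no ETL in between.
cur.execute("SELECT customer, SUM(amount) FROM orders GROUP BY customer")
print(cur.fetchall())
```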
Snowflake also announced support for Apache Iceberg, which is a new open table format standard that's emerging. So you're seeing Snowflake respond to these concerns about its lack of openness, and they're building optionality into their cloud. They also showed some cost optimization tools, both from Snowflake itself and from the ecosystem, notably Capital One, which launched a software business on top of Snowflake focused on optimizing cost and eventually rolling out data management capabilities, and all kinds of features that Snowflake announced at the show around governance, cross cloud, what we call supercloud, a new security workload, and they reemphasized their ability to read non-native on-prem data into Snowflake through partnerships with Dell and Pure, and a lot more. Let's hear from some of the analysts that came on theCUBE this week at Snowflake Summit to see what they said about the announcements and their takeaways from the event. This is Dave Menninger, Sanjeev Mohan, and Tony Baer, roll the clip. >> Our research shows that the majority of organizations, the majority of people, do not have access to analytics. And so a couple of the things they've announced, I think, address those or help to address those issues very directly. So Snowpark and support for Python and other languages is a way for organizations to embed analytics into different business processes. And so I think that'll be really beneficial to try and get analytics into more people's hands. And I also think that the native applications as part of the marketplace is another way to get applications into people's hands, rather than just analytical tools. Because most people in the organization are not analysts. They're doing some line-of-business function. They're HR managers, they're marketing people, they're sales people, they're finance people, right? They're not sitting there mucking around in the data, they're doing a job, and they need analytics in that job. >> Primarily, I think it is to counteract this whole notion that once you move data into Snowflake, it's a proprietary format. So I think that's how it started, but it's usually beneficial to the customers, to the users, because now, if you have a large amount of data in Parquet files, you can leave it on S3, but then, using the Apache Iceberg table format in Snowflake, you get all the benefits of Snowflake's optimizer. So for example, you get the micro-partitioning, you get the metadata. And in a single query, you can join, you can do a select from a Snowflake table union a select from an Iceberg table, and you can do stored procedures, user-defined functions. So I think what they've done is extremely interesting. Iceberg by itself still does not have multi-table transactional capabilities. So if I'm running a workload, I might be touching 10 different tables. If I use Apache Iceberg in a raw format, they don't have it, but Snowflake does. So the way I see it is Snowflake is adding more and more capabilities right into the database. So for example, they've gone ahead and added security and privacy. You can now create policies and do even cell-level masking, dynamic masking. But most organizations have more than Snowflake. So what we are starting to see all around here is that there's a whole series of data catalog companies, a bunch of companies that are doing dynamic data masking, security and governance, data observability, which is not a space Snowflake has gone into. So there's a whole ecosystem of companies that is mushrooming. You know, they're using the native capabilities of Snowflake, but they are at a level higher.
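Sanjeev's single-query point is worth seeing on the page. A minimal sketch, with hypothetical table names, of one statement spanning a native Snowflake table and an Iceberg table whose data stays in external storage:

```python
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="...",
    warehouse="analytics_wh", database="analytics", schema="public",
)

# One statement spanning a native Snowflake table and an Iceberg table;
# both table names are hypothetical, and the Iceberg table is assumed to
# have been created over Parquet files in external object storage.
sql = """
    SELECT order_id, amount, 'native'  AS source FROM orders_native
    UNION ALL
    SELECT order_id, amount, 'iceberg' AS source FROM orders_iceberg
"""
for row in conn.cursor().execute(sql):
    print(row)
```

The engine's optimizer treats both sides the same way, which is the benefit Sanjeev describes: open format on one side, one query surface over the top.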
So if you have a data lake and a cloud data warehouse, and you have other, like, relational databases, you can run these cross-platform capabilities in that layer. So that way, you know, Snowflake's done a great job of enabling that ecosystem. >> I think it's like the last mile, essentially. In other words, it's like, okay, you have folks that are very comfortable with Tableau, but you do have developers who don't want to have to shell out to a separate tool. And so this is where Snowflake is essentially working to address that constituency. To Sanjeev's point, and I think part of it, this kind of plays into it, is what makes this different from the Hadoop era is the fact that all these capabilities, you know, a lot of vendors are taking it very seriously to make this native. Now, obviously Snowflake acquired Streamlit, so we can expect that the Streamlit capabilities are going to be native. >> I want to share a little bit about the higher-level thinking at Snowflake. Here's a chart from Frank Slootman's keynote. It's his version of the modern data stack, if you will. Now, Snowflake of course was built on the public cloud. If there were no AWS, there would be no Snowflake. Now, they're all about bringing data, and live data, and expanding the types of data, including structured, we just heard about that, unstructured, geospatial, and the list is going to continue on and on. Eventually I think it's going to bleed into the edge, if we can figure out what to do with that edge data. Executing on new workloads is a big deal. They started with data sharing, and they recently added security, and they've essentially created a PaaS layer. We call it a SuperPaaS layer, if you will, to attract application developers. Snowflake has a developer-focused event coming up in November, and they've extended the marketplace with 1,300 native app listings. And at the top, that's the holy grail, monetization. We always talk about building data products, and we saw a lot of that at this event, very, very impressive and unique. Now here's the thing. There's a lot of talk in the press, on Wall Street and in the broader community about consumption-based pricing and concerns over Snowflake's visibility and its forecast, and how analytics may be discretionary. But if you're a company building apps in Snowflake and monetizing like Capital One intends to do, and you're now selling in the marketplace, that is not discretionary, unless of course your costs are greater than your revenue for that service, in which case it's going to fail anyway. But the point is, we're entering a new era where data apps and data products are beginning to be built, and Snowflake is attempting to make the data cloud the de facto place as to where you're going to build them. In our view, they're well ahead in that journey. Okay, let's talk about some of the bigger themes that we heard at the event. Bringing apps to the data instead of moving the data to the apps, this was a constant refrain, and one that certainly makes sense from a physics point of view. But having a single source of data that is discoverable, sharable and governed, with increasingly robust ecosystem options, it doesn't have to be moved. Sometimes it may have to be moved if you're going across regions, but that's unique and a differentiator for Snowflake in our view. I mean, I have yet to see a data ecosystem that is as rich and growing as fast as the Snowflake ecosystem.
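Given the Streamlit thread above, here's a self-contained toy of the kind of data app being discussed. In a real native app the DataFrame would come from a warehouse query; it's inline here so the sketch runs standalone.

```python
# pip install streamlit pandas; run with: streamlit run app.py
import pandas as pd
import streamlit as st

st.title("Revenue by region")  # a toy "data product" UI

# In a real data app this DataFrame would come from a warehouse query;
# the inline data keeps the sketch runnable on its own.
df = pd.DataFrame(
    {"region": ["east", "west", "north"], "revenue": [120, 95, 143]}
)

region = st.selectbox("Region", df["region"])
st.metric("Revenue", int(df.loc[df.region == region, "revenue"].iloc[0]))
st.bar_chart(df.set_index("region"))
```

A few dozen lines of Python stand in for what used to require a separate BI deployment, which is why the acquisition matters for the native-apps story.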
Monetization, we talked about that. Industry clouds, financial services, healthcare, retail, and media, all front and center at the event. My understanding is that Frank Slootman was a major force behind this shift, this development and go-to-market focus on verticals. It's really an attempt, and he talked about this in his keynote, to align with the customer mission, ultimately align with their objectives, which, not surprisingly, are increasingly monetizing with data as a differentiating ingredient. We heard a ton about data mesh. There were numerous presentations about the topic. And I'll say this: if you map the seven pillars Snowflake talks about, Benoit Dageville talked about this in his keynote, into Zhamak Dehghani's data mesh framework and the four principles, they align better than most of the data mesh washing that I've seen. The seven pillars: all data, all workloads, global architecture, self-managed, programmable, marketplace and governance. Those are the seven pillars that he talked about in his keynote. All data, well, maybe with hybrid tables that becomes more of a reality. Global architecture means the data is globally distributed; it's not necessarily physically in one place. Self-managed is key. Self-service infrastructure is one of Zhamak's four principles. And then inherent governance. Zhamak talks about computational, what I'll call automated, governance, built in. And with all the talk about monetization, that aligns with the second principle, which is data as product. So while it's not a pure hit, and to its credit, by the way, Snowflake doesn't use data mesh in its messaging anymore. But by the way, its customers do. Several customers talked about it. Geico, JPMC, and a number of other customers and partners are using the term, and using it pretty closely to the concepts put forth by Zhamak Dehghani. But back to the point, they essentially, Snowflake that is, are building a proprietary system that substantially addresses some, if not many, of the goals of data mesh. Okay, back to the list. Supercloud, that's our term. We saw lots of examples of clouds on top of clouds that are architected to span multiple clouds, not just run on individual clouds as separate services. And this includes Snowflake's data cloud itself, but also a number of ecosystem partners that are headed in a very similar direction. Snowflake still talks about data sharing, but now it uses the term collaboration in its high-level messaging, which is, I think, smart. Data sharing is kind of a geeky term. And also this is an attempt by Snowflake to differentiate from everyone else that's saying, hey, we do data sharing too. And finally, Snowflake doesn't say data marketplace anymore. It's now marketplace, accounting for its application market. Okay, let's take a quick look at the competitive landscape via this ETR X-Y graph. The vertical axis measures net score, or spending momentum, and the x-axis is penetration, pervasiveness in the data center. That's what ETR calls overlap. Snowflake continues to lead on the vertical axis. They guided conservatively last quarter, remember, and that lofty height is well down from its earlier levels, so I wouldn't be surprised if it ticks down again a bit in the July survey, which will be in the field shortly. Databricks is a key competitor, obviously with strong spending momentum, as you can see. We didn't draw it here, but we usually draw that 40% line, or red line, at 40%; anything above that is considered elevated.
So you can see Databricks is quite elevated. But it doesn't have the market presence of Snowflake. It didn't get to IPO during the bubble, and it doesn't have nearly as deep and capable a go-to-market machinery. Now, they're getting better and they're getting some attention in the market, nonetheless. But with Databricks still a private company, naturally more people are aware of Snowflake. Some analysts, Tony Baer in particular, believe Mongo and Snowflake are on a bit of a collision course long term. I actually can see his point. You know, I mean, they're both platforms, they're both about data. It's a long ways off, but you can see them sort of on a similar path. They talk about kind of similar aspirations and visions, even though they're in quite different markets today, but they're definitely participating in a similar TAM. The cloud players are probably the biggest, or definitely the biggest, partners and probably the biggest competitors to Snowflake. And then there's always Oracle. It doesn't have the spending velocity of the others, but it's got strong market presence. It owns a cloud, and it knows a thing about data, and it definitely is a go-to-market machine. Okay, we're going to end on some of the things that we heard in the ecosystem. 'Cause look, we've heard before how particular technologies, enterprise data warehouses, data hubs, MDM, data lakes, Hadoop, et cetera, were going to solve all of our data problems, and of course they didn't. And in fact, sometimes they create more problems that allow vendors to push more incremental technology to solve the problems that they created, like tools and platforms to clean up the no-schema-on-write nature of data lakes, or data swamps. But here are some of the things that I heard firsthand from some customers and partners. First thing is, they said to me that they're having a hard time keeping up sometimes with the pace of Snowflake. It reminds me of AWS in the 2014, 2015 timeframe. You remember that fire hose of announcements, which causes increased complexity for customers and partners. I talked to several customers that said, well, yeah, this is all well and good, but I still need skilled people to understand all these tools that I'm integrating in the ecosystem, the catalogs, the machine learning observability. A number of customers said, I just can't use one governance tool, I need multiple governance tools and a lot of other technologies as well, and they're concerned that that's going to drive up their cost and their complexity. I heard other concerns from the ecosystem, that it used to be sort of clear as to where they could add value, you know, when Snowflake was just a better data warehouse. But to point number one, they're either concerned that they'll be left behind or they're concerned that they'll be subsumed. Look, I mean, just like we tell AWS customers and partners, you've got to move fast, you've got to keep innovating. If you don't, you're going to be left behind: if you're a customer, you're going to be left behind by your competitor, and if you're a partner, somebody else is going to get there, or AWS is going to solve the problem for you. Okay, and there were a number of skeptical practitioners, really thoughtful and experienced data pros, that suggested that they've seen this movie before. Hence, the same wine, new bottle. Well, this time around, I certainly hope not, given all the energy and investment that is going into this ecosystem. And the fact is Snowflake is unquestionably making it easier to put data to work.
They built on AWS so you didn't have to worry about provisioning compute, storage and networking, and scaling. Snowflake is optimizing its platform to take advantage of things like Graviton, so you don't have to, and they're doing some of their own optimization tools. The ecosystem is building optimization tools, so that's all good. And our firm belief is, the less expensive it is, the more data will get brought into the data cloud. And they're building a data platform on which their ecosystem can build and run data applications, aka data products, without having to worry about all the hard work that needs to get done to make data discoverable, shareable, and governed. And unlike the last 10 years, you don't have to be a zookeeper and integrate all the animals in the Hadoop zoo. Okay, that's it for today, thanks for watching. Thanks to my colleague, Stephanie Chan, who helps research "Breaking Analysis" topics. Sometimes Alex Myerson is on production and manages the podcasts. Kristin Martin and Cheryl Knight help get the word out on social and in our newsletters, and Rob Hof is our editor-in-chief over at SiliconANGLE, and Hailey does some wonderful editing, thanks to all. Remember, all these episodes are available as podcasts wherever you listen. All you've got to do is search Breaking Analysis Podcasts. I publish each week on wikibon.com and siliconangle.com, and you can email me at David.Vellante@siliconangle.com or DM me @DVellante. If you've got something interesting, I'll respond. If you don't, I'm sorry, I won't. Or comment on my LinkedIn posts. Please check out etr.ai for the best survey data in the enterprise tech business. This is Dave Vellante for theCUBE Insights powered by ETR. Thanks for watching, and we'll see you next time. (upbeat music)
Data Power Panel V3
(upbeat music) >> The stampede to cloud and massive VC investments has led to the emergence of a new generation of object-store-based data lakes. And with them two important trends, actually three important trends. First, a new category that combines data lakes and data warehouses, aka the lakehouse, has emerged as a leading contender to be the data platform of the future. And this newcomer touts the ability to address data engineering, data science, and data warehouse workloads on a single shared data platform. The other major trend we've seen is that query engines and broader data fabric virtualization platforms have embraced next-gen data lakes as platforms for SQL-centric business intelligence workloads, reducing, or some even claim eliminating, the need for separate data warehouses. Pretty bold. However, cloud data warehouses have added complementary technologies to bridge the gaps with lakehouses. And the third is, many, if not most, customers that are embracing the so-called data fabric or data mesh architectures are looking at data lakes as a fundamental component of their strategies, and they're trying to evolve them to be more capable, hence the interest in lakehouse, but at the same time, they don't want to, or can't, abandon their data warehouse estate. As such, we see a battle royale brewing between cloud data warehouses and cloud lakehouses. Is it possible to do it all with one cloud-centric analytical data platform? Well, we're going to find out. My name is Dave Vellante, and welcome to the data platforms power panel on theCUBE, our next episode in a series where we gather some of the industry's top analysts to talk about one of our favorite topics, data. In today's session, we'll discuss trends, emerging options, and the trade-offs of various approaches, and we'll name names. Joining us today are Sanjeev Mohan, who's the principal at SanjMo, Tony Baer, principal at dbInsight, and Doug Henschen, who is the vice president and principal analyst at Constellation Research. Guys, welcome back to theCUBE. Great to see you again. >> Thank you. >> Thank you. >> Thank you. >> So it's early June, and we're gearing up for two major conferences. There are several database conferences, but two in particular that we're very interested in, Snowflake Summit and Databricks Data and AI Summit. Doug, let's start off with you, and then Tony and Sanjeev, if you could kindly weigh in. Where did this all start, Doug, the notion of lakehouse? And let's talk about what exactly we mean by lakehouse. Go ahead. >> Yeah, well, you nailed it in your intro. One platform to address BI, data science, data engineering; fewer platforms, less cost, less complexity, very compelling. You can credit Databricks for coining the term lakehouse back in 2020, but it's really a much older idea. You can go back to Cloudera introducing their Impala database in 2012. That was a database on top of Hadoop. And indeed, by the middle of that last decade, there were several SQL-on-Hadoop products and open standards like Apache Drill. And at the same time, the database vendors were trying to respond to this interest in machine learning and data science. So they were adding SQL extensions; the likes of Hudi and Vertica were adding SQL extensions to support the data science. But then later in that decade, with the shift to cloud and object storage, you saw the vendors shift to this whole cloud and object storage idea. So you have, in the database camp, Snowflake introducing Snowpark to try to address the data science needs.
They introduced that in 2020, and last year they announced support for Python. You also had Oracle and SAP jump on this lakehouse idea last year, supporting both the lake and warehouse, single vendor, not necessarily quite single platform. Google very recently also jumped on the bandwagon. And then you also mentioned the SQL engine camp, the Dremios, the Ahanas, the Starbursts, really doing two things: a fabric for distributed access to many data sources, but also very firmly planting that idea that you can just have the lake, and we'll help you do the BI workloads on that. And then of course, the data lake camp, with Databricks and Cloudera providing warehouse-style deployments on top of their lake platforms. >> Okay, thanks, Doug. I'd be remiss, those of you who know me, that I typically write my own intros. This time my colleagues fed me a lot of that material. So thank you. You guys make it easy. But Tony, give us your thoughts on this intro. >> Right. Well, I very much agree with both of you, which may not make for the most exciting television, in terms of that it has been an evolution, just like Doug said. I mean, for instance, just to give an example, when Teradata bought Aster Data, it was initially seen as a hardware platform play. In the end, it was basically all those Aster functions that made a lot of sort of big data analytics accessible to SQL. (clears throat) And so what I really see, just in a simpler, more functional definition, the data lakehouse is really an attempt by the data lake folks to make the data lake friendlier territory to the SQL folks, and also to get into friendlier territory with all the data stewards, who are basically concerned about the sprawl and the lack of control and governance in the data lake. So it's really kind of a continuation of an ongoing trend. That being said, there's no action without counteraction. And of course, at the other end of the spectrum, we also see a lot of the data warehouses starting to add things like in-database machine learning. So they're certainly not surrendering without a fight. Again, as Doug was mentioning, this has been part of a continual blending of platforms that we've seen over the years, that we first saw in the Hadoop years with SQL on Hadoop, and data warehouses starting to reach out to cloud storage, or should I say HDFS, and then with the cloud, going cloud native and therefore trying to break the silos down even further. >> Now, thank you. And Sanjeev, data lakes, when we first heard about them, there was such a compelling name, and then we realized all the problems associated with them. So pick it up from there. What would you add to Doug and Tony? >> I would say these are excellent points that Doug and Tony have brought to light. The concept of lakehouse was going on, to your point, Dave, a long time ago, long before the term was invented. For example, at Uber, they were trying to do a mix of Hadoop and Vertica, because what they really needed were transactional capabilities that Hadoop did not have. So they weren't calling it a lakehouse, they were using multiple technologies, but now they're able to collapse it into a single data store that we call lakehouse. Data lakes are excellent at batch processing large volumes of data, but they don't have the real-time capabilities, such as change data capture, doing inserts and updates. So this is why lakehouse has become so important: it gives us these transactional capabilities.
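Sanjeev's point, that the lakehouse adds transactional semantics a classic data lake lacks, is easy to show with Delta Lake. A minimal sketch, assuming Spark with the delta-spark package installed; the table and view names are illustrative.

```python
from pyspark.sql import SparkSession  # pip install pyspark delta-spark

spark = (
    SparkSession.builder.appName("lakehouse-upsert")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

spark.sql("CREATE TABLE IF NOT EXISTS customers (id INT, email STRING) USING delta")
spark.sql("CREATE OR REPLACE TEMP VIEW updates AS SELECT 1 AS id, 'new@x.com' AS email")

# The operation plain Parquet files can't do: an ACID upsert in place.
spark.sql("""
    MERGE INTO customers t
    USING updates s ON t.id = s.id
    WHEN MATCHED THEN UPDATE SET t.email = s.email
    WHEN NOT MATCHED THEN INSERT (id, email) VALUES (s.id, s.email)
""")
```

The MERGE is what change-data-capture feeds typically drive; without a table format like Delta, Hudi, or Iceberg, each update would mean rewriting files by hand.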
>> Great. So I'm interested. The name is great, lakehouse. The concept is powerful, but I get concerned that there's a lot of marketing hype behind it. So I want to examine that a bit deeper. How mature is the concept of lakehouse? Are there practical examples that really exist in the real world that are driving business results for practitioners? Tony, maybe you could kick that off. >> Well, put it this way. I think what's interesting is that both data lakes and data warehouses have each had to extend themselves. To believe the Databricks hype, this was just a natural extension of the data lake. In point of fact, Databricks had to go outside its core technology of Spark to make the lakehouse possible. And it's a very similar type of thing on the part of the data warehouse folks, in terms of that they've had to go beyond SQL. In the case of Databricks, there have been a number of incremental improvements to Delta Lake, to basically make the table format more performant, for instance. But the other thing, I think the most dramatic change in all that, is in their SQL engine, and they had to essentially pretty much abandon Spark SQL, because in and of itself, Spark SQL is essentially a stopgap solution. And if they wanted to really address that crowd, they had to totally reinvent SQL, or at least their SQL engine. And so Databricks SQL is not Spark SQL. It is not Spark, it's basically SQL that's adapted to run in a Spark environment, but the underlying engine is C++, it's not Scala or anything like that. So Databricks had to take a major detour outside of its core platform to do this. So to answer your question, this is not mature, because these are all basically kind of, even though the idea of blending platforms has been going on for well over a decade, I would say that the current iteration is still fairly immature. And in the cloud, I could see a further evolution of this, because if you think through cloud native architecture, where you're essentially abstracting compute from data, there is no reason why, if, let's say, you are dealing with the same data targets, say cloud object storage, you might not apportion the tasks to different compute engines. And so therefore you could have, for instance, let's say you're Google, you could have BigQuery perform the types of SQL analytics that would be associated with the data warehouse, and you could have BigQuery ML do some in-database machine learning, but at the same time, for another part of the query, which might involve, let's say, some deep learning, just for example, you might go out to, let's say, the serverless Spark service or Dataproc. And there's no reason why Google could not blend all those into a coherent offering that's basically all triggered through microservices. And I just gave Google as an example; you could generalize that to all the other cloud and third-party vendors. So I think we're still very early in the game in terms of maturity of data lakehouses.
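Tony's Google example, SQL analytics and in-database ML on the same engine, looks roughly like this with BigQuery ML. The dataset, table, and column names are invented; the sketch assumes the google-cloud-bigquery client and default GCP credentials.

```python
from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()  # assumes default GCP credentials are configured

# In-database ML: the model trains where the data lives, via SQL.
# 'mydataset.churn_features' and its columns are hypothetical.
client.query("""
    CREATE OR REPLACE MODEL mydataset.churn_model
    OPTIONS (model_type = 'logistic_reg', input_label_cols = ['churned']) AS
    SELECT tenure_months, monthly_spend, support_tickets, churned
    FROM mydataset.churn_features
""").result()

# Score new rows with the same engine -- no data movement to a separate
# ML platform, which is the blending Tony is describing.
rows = client.query("""
    SELECT customer_id, predicted_churned
    FROM ML.PREDICT(MODEL mydataset.churn_model,
                    (SELECT * FROM mydataset.new_customers))
""").result()
for r in rows:
    print(r.customer_id, r.predicted_churned)
```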
>> Thanks, Tony. So Sanjeev, is this all hype? What are your thoughts? >> It's not hype, but I completely agree, it's not mature yet. Lakehouses still have a lot of work to do, so what I'm now starting to see is that the world is dividing into two camps. On one hand, there are people who don't want to deal with the operational aspects of vast amounts of data. They are the ones who are going for BigQuery, Redshift, Snowflake, Synapse, and so on, because they want the platform to handle all the data modeling, access control, performance enhancements. But these are trade-offs. If you go with these platforms, then you are giving up on vendor neutrality. On the other side are those who have engineering skills. They want independence. In other words, they don't want vendor lock-in. They want to transform their data into any number of use cases, especially data science and machine learning use cases. What they want is agility via open file formats, using any compute engine. So why do I say lakehouses are not mature? Well, cloud data warehouses provide you an excellent user experience. That is the main reason why Snowflake took off. If you have thousands of tables, it takes minutes to get them loaded into your warehouse and start experimenting. Table formats resonate far more with the community than file formats. But once the cost of the cloud data warehouse goes up, then organizations start exploring lakehouses. But the problem is, lakehouses still need to do a lot of work on metadata. Apache Hive was a fantastic first attempt at it. Even today Apache Hive is still very strong, but it's all technical metadata, and it has so many different restrictions. That's why we see Databricks is investing in something called Unity Catalog. Hopefully we'll hear more about Unity Catalog at the end of the month. But there's a second problem I just want to mention, and that is lack of standards. All these open source vendors, they're running what I call ego projects. You see on LinkedIn, they're constantly battling with each other, but the end user doesn't care. The end user wants a problem to be solved. They want to use Trino, Dremio, Spark from EMR, Databricks, Ahana, Dask, Flink, Athena. But the problem is that we don't have common standards. >> Right. Thanks. So Doug, I worry sometimes. I mean, I look at the space, we've debated for years best of breed versus the full suite. You see AWS with whatever, 12-plus different data stores and different APIs and primitives. You've got Oracle putting everything into its database; it's actually done some interesting things with MySQL HeatWave, so maybe there are proof points there. But Snowflake, really good at data warehouse, simplifying data warehouse. Databricks, really good at making lakehouses actually more functional. Can one platform do it all? >> Well, in a word, no, you can't be best of breed at all things. I think that's the upshot of the cogent analysis from Sanjeev there. The vendors coming out of the database tradition, they excel at the SQL. They're extending it into data science, but when it comes to unstructured data and data science, ML and AI, it's often a compromise. The data lake crowd, the Databricks and such, they've struggled to completely displace the data warehouse when it really gets to the tough SLAs; they acknowledge that there's still a role for the warehouse. Maybe you can size down the warehouse and offload some of the BI workloads, and maybe use some of these SQL engines, good for ad hoc, to minimize data movement. But really, when you get to the deep service-level requirements, the high concurrency, the high query workloads, you end up creating something that's warehouse-like. >> Where do you guys think this market is headed? What's going to take hold? Which projects are going to fade away? You've got some things in Apache projects, like Hudi and Iceberg. Where do they fit, Sanjeev? Do you have any thoughts on that? >> So thank you, Dave. I feel that table formats are starting to mature. There is a lot of work that's being done. We will not have a single product or single platform; we'll have a mixture. So I see a lot of Apache Iceberg in the news. Apache Iceberg is really innovating; their focus is on the table format. But then Delta and Apache Hudi are doing a lot of deep engineering work. For example, how do you handle high concurrency when there are multiple writes going on? Do you version your Parquet files, or how do you do your upserts, basically? So different focus. At the end of the day, the end user will decide what is the right platform, but we are going to have multiple formats living with us for a long time.
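The upserts Sanjeev just described are exactly what Hudi's write path handles. A hedged sketch using Hudi's standard Spark write options; the path and schema are illustrative, and it assumes the Hudi Spark bundle is on the classpath.

```python
from pyspark.sql import SparkSession  # assumes the Hudi Spark bundle is available

spark = SparkSession.builder.appName("hudi-upsert").getOrCreate()

df = spark.createDataFrame(
    [(1, "alice", "2022-06-01 10:00:00")], ["id", "name", "ts"]
)

# Standard Hudi write options: the record key identifies the row to
# upsert, and the precombine field breaks ties between versions.
hudi_options = {
    "hoodie.table.name": "users",
    "hoodie.datasource.write.recordkey.field": "id",
    "hoodie.datasource.write.precombine.field": "ts",
    "hoodie.datasource.write.operation": "upsert",
}

# Path is illustrative; Hudi versions the underlying Parquet files and
# maintains a timeline so readers always see a consistent snapshot.
df.write.format("hudi").options(**hudi_options).mode("append").save(
    "s3://my-bucket/users"  # hypothetical location
)
```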
>> Doug, is Iceberg, in your view, something that's going to address some of those gaps in standards that Sanjeev was talking about earlier? >> Yeah. Delta Lake, Hudi, Iceberg, they all address this need for consistency and scalability. Delta Lake is technically open, open for access, but I don't hear about Delta Lake anywhere but Databricks. I'm hearing a lot of buzz about Apache Iceberg. End users want an open performance standard. And most recently Google embraced Iceberg for its recent BigLake, their stab at supporting both lakes and warehouses on one conjoined platform. >> And Tony, of course, you remember the early days of the sort of big data movement. You had MapR, the most closed. You had Hortonworks, the most open. You had Cloudera in between. There was always this kind of contest as to who's the most open. Does that matter? Are we going to see a repeat of that here? >> I think it's spheres of influence, I think, and Doug very much was kind of referring to this. I would call it kind of like the MongoDB syndrome, which is that you have, and I'm talking about MongoDB before they changed their license, an open source project, but very much associated with MongoDB, which basically pretty much controlled most of the contributions and made the decisions. And I think Databricks has the same ironclad hold on Delta Lake, but still, the market pretty much associates Delta Lake with Databricks as its open source project. I mean, Iceberg is probably further advanced than Hudi in terms of mindshare. And so what I see that breaking down to is essentially the Databricks open source versus the everything-else, community open source. So I see a very similar type of breakdown repeating itself here. >> So by the way, Mongo has a conference next week, another data platform, kind of not really relevant to this discussion, totally. But in a sense it is, because there's been a lot of discussion on earnings calls these last couple of weeks about consumption and who's exposed. Obviously people are concerned about Snowflake's consumption model. Mongo is maybe less exposed because Atlas is prominent in the portfolio, blah, blah, blah. But I wanted to bring up the little bit of controversy that we saw come out of the Snowflake earnings call, where the Evercore analyst asked Frank Slootman about discretionary spend. And Frank basically said, look, we're not discretionary. We are deeply operationalized. Whereas he kind of poo-pooed the lakehouse or the data lake, et cetera, saying, oh yeah, data scientists will pull files out and play with them. That's really not our business. Do any of you have comments on that? Help us swing through that controversy. Who wants to take that one? >> Let's put it this way.
The SQL folks are from Venus, and the data scientists are from Mars. So it really comes down to that type of perception. The fact is that, traditionally with analytics, it was very SQL-oriented, and the quants were kind of off in their corner, where they were using SAS or where they were using Teradata. It's really a great leveler today, which is that, I mean, basically Python has become arguably one of the most popular programming languages, depending on what month you're looking at the TIOBE index. And of course, obviously SQL is, as I tell the MongoDB folks, SQL is not going away. You have a large skills base out there. And so basically I see this breaking down to, essentially, each group is going to have its own natural preferences for its home turf. And the fact that, let's say, the Python and Scala folks are using Databricks does not make them any less operational or mission-critical than the SQL folks. >> Anybody else want to chime in on that one? >> Yeah, I totally agree with that. Python support in Snowflake is very nascent, with all of Snowpark, all of the things outside of SQL. They're very much relying on partners to make things possible and make data science possible. And it's very early days. I think the bottom line, what we're going to see, is each of these camps is going to keep working on doing better at the thing that they don't do today, or they're new to, but they're not going to nail it. They're not going to be best of breed on both sides. So the SQL-centric companies and shops are going to do more data science on their database-centric platform. The data-science-driven companies might be doing more BI on their lakes with those vendors, and the companies that have highly distributed data, they're going to add fabrics, and maybe offload more of their BI onto those engines, like Dremio and Starburst. >> So I've asked you this before, but I'll ask you, Sanjeev, 'cause Snowflake and Databricks are such great examples. 'Cause you have the data engineering crowd trying to go into data warehousing, and you have the data warehousing guys trying to go into the lake territory. Snowflake has $5 billion on the balance sheet, and I've asked you before, I'll ask you again, doesn't there have to be a semantic layer between these two worlds? Does Snowflake go out and do M&A and maybe buy an AtScale or a Datameer? Or is that just sort of a bandaid? What are your thoughts on that, Sanjeev? >> I think the semantic layer is the metadata. The business metadata is extremely important. At the end of the day, the business folks would rather go to the business metadata than have to figure out, for example, let's say I want to update somebody's email address, and we have a lot of overhead with data residency laws and all that. I want my platform to give me the business metadata, so I can write my business logic without having to worry about which database, which location. So having that semantic layer is extremely important. In fact, now we are taking it to the next level. Now we are saying that it's not just a semantic layer, it's all my KPIs, all my calculations. So how can I make those calculations independent of the compute engine, independent of the BI tool, and make them fungible? So more disaggregation of the stack, but it gives us more best-of-breed products that the customers have to worry about.
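Sanjeev's idea of KPIs defined once and made fungible across engines can be sketched in a few lines. This is purely illustrative, not any vendor's API: a metric definition that compiles to plain SQL any engine could run.

```python
# An illustrative metrics-as-code idea, not any vendor's API: define a
# KPI once, render it to SQL for whichever engine executes it.
KPIS = {
    "net_revenue": {
        "expr": "SUM(amount) - SUM(refunds)",
        "table": "finance.orders",   # hypothetical table
        "grain": "region",
    }
}

def render_sql(kpi_name: str) -> str:
    """Compile a KPI definition into plain SQL any engine can execute."""
    k = KPIS[kpi_name]
    return (
        f"SELECT {k['grain']}, {k['expr']} AS {kpi_name} "
        f"FROM {k['table']} GROUP BY {k['grain']}"
    )

# The same definition could feed Snowflake, Trino, or a BI tool's engine,
# which is what "independent of the compute engine" means in practice.
print(render_sql("net_revenue"))
```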
>> So I want to ask you about the stack, the modern data stack, if you will. We always talk about injecting machine intelligence, AI, into applications, making them more data-driven. But when you look at the application development stack, it's separate; the database tends to be separate from the data and analytics stack. Do those two worlds have to come together in the modern data world? And what does that look like organizationally? >> So organizationally, and even technically, I think it is starting to happen. Microservices architecture was a first attempt to bring the application and the data worlds together, but they are fundamentally different things. For example, if an application crashes, that's horrible, but Kubernetes will self-heal and it'll bring the application back up. But if a database crashes and corrupts your data, we have a huge problem. So that's why they have traditionally been two different stacks. They are starting to come together, especially with DataOps, for instance, versioning of the way we write business logic. It used to be that business logic was highly embedded into our database of choice, but now we are disaggregating that using GitHub, CI/CD, the whole DevOps toolchain. So data is catching up to the way applications are.
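Sanjeev's DataOps point, business logic pulled out of the database and versioned like application code, in miniature: a transformation rule that lives in Git and gets exercised by a unit test in CI on every commit. The names are illustrative.

```python
# transform.py -- business logic as versioned code rather than logic
# buried in a stored procedure; a unit test runs in CI on every commit.

def normalize_email(raw: str) -> str:
    """The single place the email-cleaning rule lives, under version control."""
    return raw.strip().lower()

def test_normalize_email():
    # Executed by pytest in the CI pipeline before any deployment.
    assert normalize_email("  Jane.Doe@Example.COM ") == "jane.doe@example.com"

if __name__ == "__main__":
    test_normalize_email()
    print("ok")
```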
>> We also have translytical databases; that's a little bit of what the story is with MongoDB next week, adding more analytical capabilities. But I think companies that talk about that are always careful to couch it as operational analytics, not the warehouse-level workloads. So we're making progress, but I think there's always going to be, or there will long be, a separate analytical data platform. >> Until data mesh takes over. (all laughing) Not opening a can of worms. >> Well, but wait, I know it's out of scope here, but wouldn't data mesh say, hey, do take your best of breed, to Doug's earlier point. You can't be best of breed at everything. Wouldn't data mesh advocate, data lakes, do your data lake thing, data warehouse, do your data warehouse thing, and then you're just a node on the mesh? (Tony laughs) Now you need separate data stores and you need separate teams. >> To my point. >> I think, I mean, put it this way. (laughs) Data mesh itself is a logical view of the world. The data mesh is not necessarily on the lake or on the warehouse. I think for me, the fear there is more in terms of the silos of governance that could happen, and the siloed views of the world, how we redefine it. And that's why I want to go back to something Sanjeev said, which is that it's going to raise the importance of the semantic layer. Now, that opens a couple of Pandora's boxes for Snowflake, which is, one, does Snowflake dare go into that space, or do they risk basically alienating their partner ecosystem, which is a key part of their whole appeal, which is best of breed? They're kind of in the same situation Informatica was in the early 2000s, when Informatica briefly flirted with analytic applications, realized that was not a good idea, and needed to redouble down on its core, which was data integration. The other thing, though, that raises the importance of, and this is where the best of breed comes in, is the data fabric. My contention is, whether you employ data mesh practices or not, if you do employ data mesh, you need data fabric. If you deploy data fabric, you don't necessarily need to practice data mesh. But data fabric at its core, and admittedly it's a category that's still very poorly defined and evolving, but at its core, we're talking about a common metadata backplane, something that we used to talk about with master data management. This would be something that would be more, what I would say, mutable, more evolving, basically using, let's say, machine learning, so that we don't have to predefine rules or predefine what the world looks like. So I think in the long run, what this really means is that whichever way we implement, on whichever physical platform we implement, we need to all be speaking the same metadata language. And I think at the end of the day, regardless of whether it's a lake, warehouse, or a lakehouse, we need common metadata. >> Doug, can I come back to something you pointed out? Those talking about bringing analytic and transaction databases together, you had talked about operationalizing those and the caution there. Educate me on MySQL HeatWave. I was surprised when Oracle put so much effort into that, and you may or may not be familiar with it, but a lot of folks have talked about it. Now, it's got nowhere in the market, no market share, but we've seen these benchmarks from Oracle. How real is that, bringing together those two worlds and eliminating ETL? >> Yeah, I have to defer on that one. That's my colleague, Holger Mueller. He wrote the report on that. He's way deep on it, and I'm not going to try to mimic him. >> I wonder how real that is, or if it's just Oracle marketing. Anybody have any thoughts on that? >> I'm pretty familiar with HeatWave. It's essentially Oracle doing what, I mean, there's kind of a parallel with what Google's doing with AlloyDB. It's an operational database that will have some embedded analytics. And it's also something which I expect to start seeing with MongoDB. And I think basically, Doug and Sanjeev were kind of referring to this before, about the operational analytics that are embedded within an operational database. The idea here is that the last thing you want to do with an operational database is slow it down. So you're not going to be doing very complex deep learning or anything like that, but you might be doing things like classification, you might be doing some predictions. In other words, we've just concluded a transaction with this customer, but was it less than what we were expecting? What does that mean in terms of, is this customer likely to churn? I think we're going to be seeing a lot of that. And I think that's what a lot of MySQL HeatWave is all about. Whether Oracle has any presence in the market, now, it's still a pretty new announcement. But the other thing that kind of goes against Oracle, (laughs) that they have to battle against, is that even though they own MySQL and run the open source project, in terms of the actual commercial implementations it's associated with everybody else. And the popular perception has been that MySQL has been basically kind of like a sidelight for Oracle. And so it's on Oracle's shoulders to prove that they're damn serious about it.
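Tony's churn example maps to HeatWave's ML routines, which run training and scoring inside the database via stored procedures such as sys.ML_TRAIN. A heavily hedged sketch: the schema, table, and columns are invented, and the exact call signatures should be checked against the HeatWave ML documentation for your version.

```python
import mysql.connector  # pip install mysql-connector-python

conn = mysql.connector.connect(
    host="heatwave.example.com", user="app", password="...", database="shop"
)
cur = conn.cursor()

# Train in the database itself; signature per the HeatWave ML docs, but
# verify against your version -- treat it as an assumption here.
cur.execute("""
    CALL sys.ML_TRAIN('shop.orders_history', 'churned',
                      JSON_OBJECT('task', 'classification'), @model)
""")
cur.execute("CALL sys.ML_MODEL_LOAD(@model, NULL)")

# Score a just-completed transaction inline -- no ETL to a warehouse,
# which is the "operational analytics" pattern being described.
cur.execute("""
    SELECT sys.ML_PREDICT_ROW(
        JSON_OBJECT('tenure_months', 14, 'monthly_spend', 42.0), @model, NULL)
""")
print(cur.fetchone())
```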
I've always felt like, hey, the consumption model is the right model. I can dial it down when I need to; of course, the street freaks out. What are your thoughts on pricing, on the consumption model? What's the right model for companies, for customers? >> The consumption model is here to stay. What I would like to see, and what I think is the ideal situation, one that actually plays into the lakehouse concept, is that I have my data in some open format, maybe Parquet or CSV or JSON or Avro, and I can bring whatever engine is the best engine for my workloads, bring it on, pay for consumption, and then shut it down. And by the way, that could be Cloudera. We don't talk about Cloudera very much, but it could be that one business unit wants to use Athena, another business unit wants to use something else, Trino, let's say, or Dremio. So every business unit is working on the same data set, see, that's critical, but that data set is maybe in their VPC, and they bring any compute engine, you pay for the use, shut it down. Then you're getting value and you're only paying for consumption. It shouldn't be, I left a cluster running by mistake, so there have to be guardrails. The reason FinOps is so big is because it's very easy for me to run a Cartesian join in the cloud and get a $10,000 bill. >> This looks like it's been sort of a victim of its own success in some ways. They made it so easy to spin up single-node instances, multi-node instances. And back in the day, when compute was scarce and costly, those database engines optimized every last bit so they could get as much workload as possible out of every instance. Today, it's really easy to spin up a new node, a new multi-node cluster. So that freedom has meant many more nodes that aren't necessarily getting that utilization. So Snowflake has been doing a lot to add reporting, monitoring and dashboards around the utilization of all the nodes and multi-node instances that have spun up. And meanwhile, we're seeing some of the traditional on-prem databases that are moving into the cloud trying to offer that freedom. And I think they're going to have that same discovery, that the cost surprises are going to follow as they make it easy to spin up new instances. >> Yeah, a lot of money went into this market over the last decade, separating compute from storage, moving to the cloud. I'm glad you mentioned Cloudera, Sanjeev, 'cause they got it all started, the kind of big data movement. We don't talk about them that much. Sometimes I wonder if it's because when they merged Hortonworks and Cloudera, they dead-ended both platforms, but then they did invest in a more modern platform. But what's the future of Cloudera? What are you seeing out there? >> Cloudera has a good product. I have to say, the problem in our space is that there are way too many companies, there's way too much noise. We're expecting the end users to parse it out, or we're expecting analyst firms to boil it down. So I think marketing becomes a big problem. As far as technology is concerned, I think Cloudera did turn themselves around, and Tony, I know you talk to them quite frequently, they've had quite a comprehensive offering for a long time, actually. They created Kudu, so they've got operational; they have Hadoop; they have an operational data warehouse; they've migrated to the cloud. They are in a hybrid multi-cloud environment. A lot of cloud data warehouses are not hybrid; they're only in the cloud. >> Right.
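To make the guardrail idea concrete, here is a minimal sketch using Snowflake's resource monitors and statement timeouts, driven from a Snowpark Scala session. The warehouse, monitor, and connection details are hypothetical placeholders, not anything the panelists named.

```scala
import com.snowflake.snowpark._

// Minimal sketch of spend guardrails; all names and credentials are illustrative.
val session = Session.builder.configs(Map(
  "URL" -> "https://<account>.snowflakecomputing.com",
  "USER" -> "<admin_user>", "PASSWORD" -> sys.env("SNOWFLAKE_PWD"),
  "ROLE" -> "ACCOUNTADMIN", "WAREHOUSE" -> "ADMIN_WH"
)).create

// Cap monthly spend: notify at 90% of the credit quota, suspend the warehouse at 100%.
session.sql(
  """CREATE OR REPLACE RESOURCE MONITOR bi_monthly_cap
    |  WITH CREDIT_QUOTA = 100 FREQUENCY = MONTHLY START_TIMESTAMP = IMMEDIATELY
    |  TRIGGERS ON 90 PERCENT DO NOTIFY
    |           ON 100 PERCENT DO SUSPEND""".stripMargin).collect()
session.sql("ALTER WAREHOUSE bi_wh SET RESOURCE_MONITOR = bi_monthly_cap").collect()

// Kill any single runaway statement (an accidental Cartesian join, say)
// after ten minutes instead of letting it run up the bill.
session.sql("ALTER WAREHOUSE bi_wh SET STATEMENT_TIMEOUT_IN_SECONDS = 600").collect()
```

Guardrails like these are exactly the FinOps backstop Sanjeev describes: consumption pricing stays, but a forgotten cluster or a runaway query cannot silently turn into a five-figure bill.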
I think where Cloudera has been most successful has been in the transition to the cloud, and the fact that they're giving their customers more on-ramps to it, more hybrid on-ramps. So I give them a lot of credit there. They also have been trying to position themselves as the most price friendly, in terms of, we will put more guardrails and governors on it. I mean, part of that could be spin, but on the other hand, they don't have the same vested interest in compute cycles as, say, AWS would have with EMR. That being said, I think Cloudera's most powerful appeal, and I don't want to cast them as a legacy system, but the fact is they do have a huge landed legacy on-prem and still significant potential to land and expand that to the cloud. That being said, even though Cloudera is multifunction, it certainly has its strengths and weaknesses. And the fact is that, yes, Cloudera has an operational database, or an operational data store, kind of the outgrowth of HBase, but Cloudera is still primarily known for the deep analytics. Nobody's going to buy Cloudera, or Cloudera Data Platform, strictly for the operational database. They may use it as an add-on, just in the same way that a lot of customers have used, let's say, Teradata to do some machine learning, or Snowflake to parse through JSON. Again, it's not an indictment or anything like that, but the fact is they obviously have their strengths and their weaknesses. I think their greatest opportunity is with their existing base, because that base has a lot invested and vested. And the fact is they do have a hybrid path that a lot of the others lack. >> And of course, being on the quarterly shot clock was not a good place to be under the microscope for Cloudera, and now they at least can refactor the business accordingly. I'm glad you mentioned hybrid too. We saw Snowflake last month do a deal with Dell whereby Snowflake can access non-native data in Dell on-prem object stores. They announced a similar thing with Pure Storage. What do you guys make of that? Is that just... How significant will that be? Will customers actually do that? I think they're using either materialized views or external tables. >> There are data sovereignty and residency requirements. There are desires to have these platforms in your own data center. And finally they capitulated. I mean, Frank Slootman is famous for saying to be very focused, and earlier, not many months ago, they called going on-prem a distraction. But clearly there's enough demand, and certainly for government contracts and any company that has data residency requirements, it's a real need. So they finally addressed it. >> Yeah, I'll bet dollars to donuts there was an EBC session and some big customer said, if you don't do this, we ain't doing business with you. And that was like, okay, we'll do it. >> So Dave, I have to say, earlier on you had brought up this point, how Frank Slootman was poo-pooing data science workloads. On your show, about a year or so ago, he said, we are never going on-prem. He burnt that bridge. (Tony laughs) That was on your show. >> I remember the statement exactly, because it was interesting. He said, we're never going to do the halfway house. And I think what he meant is, we're not going to bring the Snowflake architecture to run on-prem, because it defeats the elasticity of the cloud. So this was kind of a capitulation in a way.
But I think it still preserves his original intent, sort of. I don't know. >> The point here is that every vendor will poo-poo whatever they don't have until they do have it. >> Yes. >> And then it'll be like, oh, we're all in, we've always been doing this. We have always supported this, and now we are doing it better than others. >> Look, it was the same type of shock wave we felt when AWS, at the last moment at one of their re:Invents, said, oh, by the way, we're going to introduce Outposts. The analyst group is typically pre-briefed about a week or two ahead under NDA, and that was not part of it. They just casually dropped that in the analyst session. It's like, you could have heard the sound of lots of analysts changing their diapers at that point. >> (laughs) I remember that. And props to Andy Jassy, who once, many times actually, told us: never say never when it comes to AWS. So guys, I know we've got to run, we've got some hard stops. Maybe you could each give us your final thoughts. Doug, start us off, and then-- >> Sure. Well, we've got the Snowflake Summit coming up. I'll be looking for customers that are really doing data science, that are really employing Python through Snowflake, through Snowpark. And then a couple weeks later, we've got Databricks with their Data and AI Summit in San Francisco. I'll be looking for customers that are really doing considerable BI workloads. Last year I did a market overview of this analytical data platform space, 14 vendors, eight of them claiming to support lakehouse, both sides of the camp. The top customer Databricks could cite was unnamed; it had 32 concurrent users doing 15,000 queries per hour. That's good, but it's not up to the most demanding BI SQL workloads, and they acknowledged that and said they need to keep working on that. Snowflake, asked for their biggest data science customer, cited Kabura: 400 terabytes, 8,500 users, 400,000 data engineering jobs per day. I took the data engineering jobs to be probably SQL-centric, ETL-style transformation work. So I want to see the real use of Python, how much Snowpark has grown as a way to support data science. >> Great. Tony. >> Actually, of all things, and certainly I'll also be looking for similar things to what Doug is saying, but kind of out of left field, I'm interested to see what MongoDB is going to start to say about operational analytics, 'cause I mean, they're into this conquer-the-world strategy: we can be all things to all people. Okay, if that's the case, what's going to be the case with putting in some inline analytics? What are you going to be doing with your query engine? So that's actually kind of an interesting thing we're looking for next week. >> Great. Sanjeev. >> So I'll be at MongoDB World, Snowflake and Databricks, and I'm very interested in seeing, since Tony brought up MongoDB, that even the databases are shifting tremendously. They are addressing both sides of the HTAP use case, online transactional and analytical. I'm also seeing that these databases started, let's say in the case of MySQL HeatWave, as relational, or in MongoDB's case as document, but now they've added graph, they've added time series, they've added geospatial, and they just keep adding more and more data structures, really making these databases multifunctional. So very interesting. >> It gets back to our discussion of best of breed versus all in one.
And likely Mongo's path, or part of their strategy of course, is through developers. They're very developer focused. So we'll be looking for that. And guys, I'll be there as well. I'm hoping we'll have some extra time on theCUBE, so please stop by and we can chat a little bit. Guys, as always, fantastic. Thank you so much, Doug, Tony, Sanjeev, and let's do this again. >> It's been a pleasure. >> All right, and thank you for watching. This is Dave Vellante for theCUBE and our excellent analysts. We'll see you next time. (upbeat music)
Christian Keynote with Disclaimer
(upbeat music) >> Hi everyone, thank you for joining us at the Data Cloud Summit. The last couple of months have been an exciting time at Snowflake. And yet, what's even more compelling to all of us at Snowflake is what's ahead. Today I have the opportunity to share new product developments that will extend the reach and impact of our Data Cloud and improve the experience of Snowflake users. Our product strategy is focused on four major areas. First, Data Cloud content. In the Data Cloud, silos are eliminated, and our vision is to bring the world's data within reach of every organization. You'll hear about new data sets and data services available in our data marketplace and see how previous barriers to sourcing and unifying data are eliminated. Second, extensible data pipelines. As you gain frictionless access to a broader set of data through the Data Cloud, Snowflake's platform brings additional capabilities and extensibility to your data pipelines, simplifying data ingestion and transformation. Third, data governance. The Data Cloud eliminates silos and breaks down barriers, and in a world where data collaboration is the norm, the importance of data governance is amplified and elevated. We'll share new advancements to support how the world's most demanding organizations mobilize their data while maintaining high standards of compliance and governance. Finally, our fourth area focuses on platform performance and capabilities. We remain laser focused on continuing to lead with the most performant and capable data platform, and we have some exciting news to share about the core engine of Snowflake. As always, we love showing you Snowflake in action, and we have prepared some demos for you. Also, we'll keep coming back to the fact that one of the characteristics of Snowflake we're proudest of is that we offer a single platform from which you can operate all of your data workloads, across clouds and across regions. Which workloads, you may ask? Specifically: data warehousing, data lake, data science, data engineering, data applications, and data sharing. Snowflake makes it possible to mobilize all your data in service of your business without the cost, complexity and overhead of managing multiple systems, tools and vendors. Let's dive in. As you heard from Frank, the Data Cloud offers a unique capability to connect organizations and create collaboration and innovation across industries, fueled by data. The Snowflake data marketplace is the gateway to the Data Cloud, providing visibility for organizations to browse and discover data that can help them make better decisions. For data providers on the marketplace, there is a new opportunity to reach new customers, create new revenue streams, and radically decrease the effort and time to data delivery. Our marketplace dramatically reduces the friction of sharing and collaborating with data, opening up new possibilities to all participants in the Data Cloud. We introduced the Snowflake data marketplace in 2019, and it is now home to over 100 data providers, with half of them having joined the marketplace in the last four months. Since our most recent product announcements in June, we have continued broadening the availability of the data marketplace across regions and across clouds. Our data marketplace provides the opportunity for data providers to reach consumers across cloud and regional boundaries.
A critical aspect of the Data Cloud is that we envision organizations collaborating not just in terms of data, but also data-powered applications and services. Think of instances where a provider doesn't want to open access to the entirety of a data set, but wants to provide access to business logic that has access to and leverages such a data set. That is what we call data services. And we want Snowflake to be the platform of choice for developing, discovering and consuming such rich building blocks. To see how the data marketplace comes to life, and in particular one of these data services, let's jump into a demo. For all of our demos today, we're going to put ourselves in the shoes of a fictional global insurance company we've called Insureco. Insurance is a data-intensive and highly regulated industry. Having the right access control and insight from data is core to every insurance company's success. I'm going to turn it over to Prasanna to show how the Snowflake data marketplace can solve a data discoverability and access problem. >> Let's look at how Insureco can leverage data and data services from the Snowflake data marketplace and use them in conjunction with its own data in the Data Cloud to do three things: better detect fraudulent claims, arm its agents with the right information, and benchmark business health against the competition. Let's start with detecting fraudulent claims. I'm an analyst in the claims department. I have auto claims data in my account. I can see there are 2,000 auto claims, many of these submitted by auto body shops. I need to determine if they are valid and legitimate. In particular, could some of these be insurance fraud? By going to the Snowflake data marketplace, where numerous data providers and data service providers can list their offerings, I find the Quantifind data service. It uses a combination of external data sources and predictive risk typology models to inform the risk level of an organization. Quantifind's external sources include sanctions and blacklists, negative news, social media, and real-time search engine results. That's a wealth of data, and models built on that data, which we don't have internally. So I'd like to use Quantifind to determine a fraud risk score for each auto body shop that has submitted a claim. First, the Snowflake data marketplace made it really easy for me to discover a data service like this. Without the data marketplace, finding such a service would be a lengthy, ad hoc process of doing web searches and asking around. Second, once I find Quantifind, I can use Quantifind's service against my own data in three simple steps using data sharing. I create a table with the names and addresses of auto body shops that have submitted claims. I then share the table with Quantifind to start the risk assessment. Quantifind does the risk scoring and shares the data back with me. Quantifind uses external functions, which we introduced in June, to get results from their risk prediction models. Without Snowflake data sharing, we would have had to contact Quantifind to understand what format they wanted the data in, then extract this data into a file, FTP the file to Quantifind, wait for the results, then ingest the results back into our systems for them to be usable. Or I would have had to write code to call Quantifind's API. All of that would have taken days. In contrast, with data sharing, I can set this up in minutes.
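As a rough sketch of what those three steps can look like from the provider side, here is the data-sharing handshake expressed as SQL issued through a Snowpark Scala session. The table, share, and account names are hypothetical stand-ins, not the actual objects from the demo.

```scala
import com.snowflake.snowpark._

// Hypothetical connection details; every object name below is a stand-in.
val session = Session.builder.configs(Map(
  "URL" -> "https://<account>.snowflakecomputing.com",
  "USER" -> "<user>", "PASSWORD" -> sys.env("SNOWFLAKE_PWD"),
  "ROLE" -> "SYSADMIN", "WAREHOUSE" -> "CLAIMS_WH"
)).create

// Step 1: stage the auto body shops that submitted claims.
session.sql(
  """CREATE OR REPLACE TABLE claims_db.public.body_shops AS
    |SELECT shop_name, shop_address
    |FROM   claims_db.public.auto_claims
    |WHERE  submitter_type = 'BODY_SHOP'""".stripMargin).collect()

// Step 2: expose that table to the partner's Snowflake account via a share.
session.sql("CREATE SHARE IF NOT EXISTS body_shops_share").collect()
session.sql("GRANT USAGE ON DATABASE claims_db TO SHARE body_shops_share").collect()
session.sql("GRANT USAGE ON SCHEMA claims_db.public TO SHARE body_shops_share").collect()
session.sql("GRANT SELECT ON TABLE claims_db.public.body_shops TO SHARE body_shops_share").collect()
session.sql("ALTER SHARE body_shops_share ADD ACCOUNTS = partner_org.quantifind_acct").collect()

// Step 3: when scores come back on a share from the provider, mount it as a
// read-only database and query it live, with no copies and no ETL.
session.sql("CREATE DATABASE IF NOT EXISTS qf_scores FROM SHARE partner_org.quantifind_acct.risk_scores").collect()
session.sql(
  """SELECT shop_name, fraud_risk_score
    |FROM   qf_scores.public.scores
    |WHERE  fraud_risk_score > 0.8""".stripMargin).show()
```

The key design point is that nothing moves: both sides query the same governed data, which is why the setup takes minutes rather than days of file extracts and FTP exchanges.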
What's more, now that I have set this up, as new claims are added in the future, they will automatically leverage Quantifind's data service. I view the scores returned by Quantifind and see that two entities in my claims data have a high score for insurance fraud risk. I open up the link returned by Quantifind to read more, and find that this organization has been involved in an insurance crime ring. Looks like that is a claim that we won't be approving. Using the Quantifind data service through the Snowflake data marketplace gives me access to a risk scoring capability that we don't have in house, without having to call custom APIs. For a provider like Quantifind, this drives new leads and monetization opportunities. Now that I have identified potentially fraudulent claims, let's move on to the second part. I would like to share this fraud risk information with the agents who sold the corresponding policies. To do this, I need two things. First, I need to find the agents who sold these policies. Then I need to share with these agents the fraud risk information that we got from Quantifind. But I want to share it such that each agent only sees the fraud risk information corresponding to claims for policies that they wrote. To find the agents who sold these policies, I need to look up our Salesforce data. I can find this easily within Insureco's internal data exchange. I see there's a listing with Salesforce data. Our Sales Ops team has published this listing, so I know it's our officially blessed data set, and I can immediately access it from my Snowflake account without copying any data or having to set up ETL. I can now join the Salesforce data with my claims to identify the agents for the policies that were flagged as having fraudulent claims. I also have the Snowflake account information for each agent. Next, I create a secure view that joins on an entitlements table, such that each agent can only see the rows corresponding to policies that they have sold. I then share this directly with the agents. This share contains the secure view that I created, with the names of the auto body shops and the fraud risk identified by Quantifind. Finally, let's move on to the third and last part. Now that I have detected potentially fraudulent claims, I'm going to move on to building a dashboard that our executives have been asking for. They want to see how Insureco compares against other auto insurance companies on key metrics, like total claims paid out for the auto insurance line of business nationwide. I go to the Snowflake data marketplace and find SNL U.S. Insurance Statutory Data from S&P. This data is included with Insureco's existing subscription with S&P, so when I request access to it, S&P can immediately share this data with me through Snowflake data sharing. I create a virtual database from the share, and I'm ready to query this data, no ETL needed. And since this is a virtual database pointing to the original data in S&P's Snowflake account, I have access to the latest data as it arrives in S&P's account. I see that the SNL U.S. Insurance Statutory Data from S&P has data on assets, premiums earned and claims paid out by each U.S. insurance company in 2019. This data is broken out by line of business and geography and, in many cases, goes beyond the data that would be available from public financial filings. This is exactly the data I need. I identify a subset of comparable insurance companies whose net total assets are within 20% of Insureco's and whose lines of business are similar to ours.
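The per-agent filtering in the second part of this demo rests on a secure view joined to an entitlements table, so each consuming account only ever sees its own rows. A rough sketch of that pattern, with hypothetical object names and a Snowpark session configured as in the earlier sketch, might look like this:

```scala
// Assumes a Snowpark `session` configured as in the previous sketch.
// Each consuming agent account sees only the rows mapped to it in the
// entitlements table, evaluated at query time via CURRENT_ACCOUNT().
session.sql(
  """CREATE OR REPLACE SECURE VIEW claims_db.public.agent_fraud_alerts AS
    |SELECT f.policy_id, f.shop_name, f.fraud_risk_score
    |FROM   claims_db.public.flagged_claims     f
    |JOIN   claims_db.public.agent_entitlements e
    |  ON   e.agent_id = f.agent_id
    |WHERE  e.snowflake_account = CURRENT_ACCOUNT()""".stripMargin).collect()

// Publish the secure view through a share, just like the table earlier;
// the entitlement filter travels with it.
session.sql("CREATE SHARE IF NOT EXISTS fraud_alerts_share").collect()
session.sql("GRANT USAGE ON DATABASE claims_db TO SHARE fraud_alerts_share").collect()
session.sql("GRANT USAGE ON SCHEMA claims_db.public TO SHARE fraud_alerts_share").collect()
session.sql("GRANT SELECT ON VIEW claims_db.public.agent_fraud_alerts TO SHARE fraud_alerts_share").collect()
```

Because the view is secure, consumers cannot see its definition or bypass the entitlement predicate; one share serves every agent, and the join decides who sees what.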
I can now create a Snowsight dashboard that compares Insureco against similar insurance companies on key metrics, like net earned premiums and net claims paid out in 2019 for auto insurance. I can see that while we are below the median on net earned premiums, we are doing better than our competition on total claims paid out in 2019, which could be a reflection of our improved claims handling and fraud detection. That's a good insight that I can share with our executives. In summary, the Data Cloud enabled me to do three key things. First, seamlessly find data and data services that I need to do my job, be it an external data service like Quantifind, an external data set from S&P, or internal data from Insureco's data exchange. Second, get immediate, live access to this data. And third, control and manage collaboration around this data. With Snowflake, I can mobilize data and data services across my business ecosystem in just minutes. >> Thank you, Prasanna. Now I want to turn our focus to extensible data pipelines. We believe there are two different and important ways of making Snowflake's platform highly extensible. First, by enabling teams to leverage services or business logic that live outside of Snowflake, interacting with data within Snowflake. We do this through a feature called external functions, a mechanism to conveniently bring data to where the computation is. We announced this feature for calling regional endpoints via AWS API Gateway in June, and it's currently available in public preview. We are also now in public preview supporting Azure API Management, and will soon support Google API Gateway and AWS private endpoints. The second extensibility mechanism does the converse: it brings the computation to Snowflake, to run closer to the data. We will do this by enabling the creation of functions and procedures in SQL, Java, Scala or Python, ultimately providing choice based on the programming language preference for you or your organization. You will see Java, Scala and Python available through private and public previews in the future. The possibilities enabled by these extensibility features are broad and powerful. However, our commitment to being a great platform for data engineers, data scientists and developers goes far beyond programming language. Today, I am delighted to announce Snowpark, a family of libraries that will bring a new experience to programming data in Snowflake. Snowpark enables you to write code directly against Snowflake in a way that is deeply integrated into the languages I mentioned earlier, using familiar concepts like DataFrames. But the most important aspect of Snowpark is that it has been designed and optimized to leverage the Snowflake engine, with its main characteristics and benefits: performance, reliability, and scalability with near-zero maintenance. Think of the power of declarative SQL statements available through a well-known API in Scala, Java or Python, all of it against data governed in your core data platform. We believe Snowpark will be transformative for data programmability. I'd like to introduce Sri to showcase how our fictitious insurance company Insureco will be able to take advantage of the Snowpark API for data science workloads.
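Before the demo, here is a rough sketch of the external function mechanism described above: a proxy integration plus a function whose body is a remote endpoint. The integration name, role ARN, and URLs are hypothetical placeholders, and the session is a Snowpark session as in the earlier sketches.

```scala
// Assumes a Snowpark `session` as before. All ARNs and URLs are placeholders.
session.sql(
  """CREATE OR REPLACE API INTEGRATION call_center_api_int
    |  API_PROVIDER = aws_api_gateway
    |  API_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake-ext-fn'
    |  API_ALLOWED_PREFIXES = ('https://abc123.execute-api.us-west-2.amazonaws.com/prod')
    |  ENABLED = TRUE""".stripMargin).collect()

session.sql(
  """CREATE OR REPLACE EXTERNAL FUNCTION transcribe_call(recording_url VARCHAR)
    |  RETURNS VARIANT
    |  API_INTEGRATION = call_center_api_int
    |  AS 'https://abc123.execute-api.us-west-2.amazonaws.com/prod/transcribe'""".stripMargin).collect()

// The remote service is now callable like any other SQL function:
// batches of rows go out to the computation, and results come back as rows.
session.sql("SELECT call_id, transcribe_call(recording_url) FROM call_recordings LIMIT 10").show()
```

This is the "bring data to the computation" direction; Snowpark, introduced next, is the converse, bringing the computation to the data.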
Now, for all the valid claims InsureCo wants to ensure they're providing excellent customer service. To do that, they put in place a system to transcribe all of their customer calls, so they can look for patterns. A simple thing they'd like to do is detect the sentiment of each call so they can tell which calls were good and which were problematic. They can then better train their claim agents for challenging calls. Let's take a quick look at the work they've done so far. InsureCo's data science team use Snowflakes external functions to quickly and easily train a machine learning model in H2O AI. Snowflake has direct integrations with H2O and many other data science providers giving Insureco the flexibility to use a wide variety of data science libraries frameworks or tools to train their model. Now that the team has a custom trained sentiment model tailored to their specific claims data, let's see how a data engineer at Insureco can use Snowpark to build a data pipeline that scores customer call logs using the model hosted right inside of Snowflake. As you can see, we have the transcribed call logs stored in the customer call logs table inside Snowflake. Now, as a data engineer trained in Scala, and used to working with systems like Spark and Pandas, I want to use familiar programming concepts to build my pipeline. Snowpark solves for this by letting me use popular programming languages like Java or Scala. It also provides familiar concepts in APIs, such as the DataFrame abstraction, optimized to leverage and run natively on the Snowflake engine. So here I am in my ID, where I've written a simple scalar program using the Snowpark libraries. The first step in using the Snowpark API is establishing a session with Snowflake. I use the session builder object and specify the required details to connect. Now, I can create a DataFrame for the data in the transcripts column of the customer call logs table. As you can see, the Snowpark API provides native language constructs for data manipulation. Here, I use the Select method provided by the API to specify the column names to return rather than writing select transcripts as a string. By using the native language constructs provided by the API, I benefit from features like IntelliSense and type checking. Here you can see some of the other common methods that the DataFrame class offers like filters like join and others. Next, I define a get sentiment user defined function that will return a sentiment score for an input string by using our pre trained H2O model. From the UDF, we call the score method that initializes and runs the sentiment model. I've built this helper into a Java file, which along with the model object and license are added as dependencies that Snowpark will send to Snowflake for execution. As a developer, this is all programming that I'm familiar with. We can now call our get sentiment function on the transcripts column of the DataFrame and right back the results of the score transcripts to a new target table. Let's run this code and switch over to Snowflake to see the score data and also all the work that Snowpark has done for us on the back end. If I do a select star from scored logs, we can see the sentiment score of each call right alongside the transcript. With Snowpark all the logic in my program is pushed down into Snowflake. I can see in the query history that Snowpark has created a temporary Java function to host the pre trained H20 model, and that the model is running right in my Snowflake warehouse. 
Snowpark has allowed us to do something completely new in Snowflake. Let's recap what we saw. With Snowpark, Insureco was able to use their preferred programming language, Scala, and use the familiar DataFrame constructs to score data using a machine learning model. With support for Java UDFs, they were able to run a trained model natively within Snowflake. And finally, we saw how Snowpark executed computationally intensive data science workloads right within Snowflake. This simplifies Insureco's data pipeline architecture, as it reduces the number of additional systems they have to manage. We hope that extensibility with Scala, Java and Snowpark will enable our users to work with Snowflake in their preferred way while keeping the architecture simple. We are very excited to see how you use Snowpark to extend your data pipelines. Thank you for watching, and with that, back to you, Christian. >> Thank you, Sri. You saw how Sri could utilize Snowpark to efficiently perform advanced sentiment analysis. But of course, if this use case were important to your business, you would want to fully automate this pipeline and analysis. Imagine being able to do all of the following in Snowflake. Your pipeline could start far upstream of what you saw in the demo, by storing your actual customer care call recordings in Snowflake. You may notice that this is new for Snowflake; we'll come back to the idea of storing unstructured data in Snowflake at the end of my talk today. Once you have the data in Snowflake, you can use our streams and tasks capabilities to call an external function to transcribe these files. To simplify this flow even further, we plan to introduce a serverless execution model for tasks, where Snowflake can automatically size and manage resources for you. After this step, you can use the same serverless task to execute sentiment scoring of your transcripts as shown in the demo, with incremental processing as each transcript is created. Finally, you can surface the sentiment score either via Snowsight or through any tool you use to share insights throughout your organization. In this example, you see data being transformed from a raw asset into a higher level of information that can drive business action, all fully automated, all in Snowflake. Turning back to Insureco, you know how important data governance is for any major enterprise, but particularly for one in this industry. Insurance companies manage highly sensitive data about their customers, and have some of the strictest requirements for storing and tracking such data, as well as managing and governing it. At Snowflake, we think about governance as the ability to know your data, manage your data and collaborate with confidence. As you saw in our first demo, the Data Cloud enables seamless collaboration, control and access to data via the Snowflake data marketplace. And companies may set up their own data exchanges to create similar collaboration and control across their ecosystems. In future releases, we expect to deliver enhancements that create more visibility into who has access to what data and provide usage information on that data. Today, we are announcing a new capability to help Snowflake users better know and organize their data: our new tagging framework. Tagging in Snowflake will allow user-defined metadata to be attached to a variety of objects. We built a broad and robust framework with powerful implications.
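To make the incremental streams-and-tasks flow described a moment ago concrete, a rough sketch follows. The stream, task, and function names are hypothetical; get_sentiment stands in for a permanent version of the demo's UDF, and a warehouse is named explicitly because the serverless option for tasks is described as still to come.

```scala
// Assumes a Snowpark `session` as in the earlier sketches.
// Capture new rows in the call log as they land.
session.sql("CREATE OR REPLACE STREAM new_transcripts ON TABLE customer_call_logs").collect()

// Score only the new rows every five minutes, and only when any exist.
// WAREHOUSE is required today; the planned serverless model would drop it.
session.sql(
  """CREATE OR REPLACE TASK score_new_calls
    |  WAREHOUSE = DS_WH
    |  SCHEDULE  = '5 MINUTE'
    |WHEN SYSTEM$STREAM_HAS_DATA('NEW_TRANSCRIPTS')
    |AS INSERT INTO scored_logs
    |   SELECT transcripts, get_sentiment(transcripts)
    |   FROM new_transcripts""".stripMargin).collect()

// Tasks are created suspended; resume to start the pipeline.
session.sql("ALTER TASK score_new_calls RESUME").collect()
```

With that aside on pipelines done, the keynote turns to what the new tagging framework makes possible.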
Think of the ability to annotate warehouses with cost center information for tracking, or think of annotating tables and columns with sensitivity classifications. Our tagging capability will enable the creation of company-specific business annotations for objects in Snowflake's platform. Another key aspect of data governance in Snowflake is our policy-based framework, where you specify what you want to be true about your data, and Snowflake enforces those policies. We announced one such policy earlier this year, our dynamic data masking capability, which is now available in public preview. Today, we are announcing a great complementary policy to achieve row-level security. To see how row-level security can enhance Insureco's ability to govern and secure data, I'll hand it over to Artin for a demo. >> Hello, I'm Artin Avanes, Director of Product Management for Snowflake. As Christian has already mentioned, the rise of the Data Cloud greatly accelerates the ability to access and share diverse data, leading to greater data collaboration across teams and organizations. Controlling data access with ease while ensuring compliance at the same time is top of mind for users. Today, I'm thrilled to announce our new row access policies that will allow users to define various rules for accessing data in the Data Cloud. Let's check back in with Insureco to see some of these in action and highlight how they work with other existing policies one can define in Snowflake. Because Insureco is a multinational company, it has to take extra measures to ensure data across geographic boundaries is protected to meet a wide range of compliance requirements. The Insureco team has been asked to segment what data sales team members have access to based on where they are regionally. In order to make this possible, they will use Snowflake's row access policies to implement row-level security. We are going to apply policies for three of Insureco's sales team members with different roles. Alice, an executive, must be able to view sales data from both North America and Europe. Alex, a North America sales manager, will be limited to accessing sales data from North America only. And Jordan, a Europe sales manager, will be limited to accessing sales data from Europe only. As a first step, the security administrator needs to create a lookup table that will be used to determine which data is accessible based on each role. As you can see, the lookup table has the role and its associated region, both of which will be used to apply the policies that we will now create. Row access policies are implemented using standard SQL syntax to make it easy for administrators to create policies like the one our administrator is looking to implement. And similar to masking policies, row access policies leverage our flexible and expressive policy language. In this demo, our admin user creates a row access policy that uses the role and region of a user to determine what row-level data they have access to when queries are executed. When user queries are executed against a table protected by such a row access policy, Snowflake's query engine will dynamically generate and apply the corresponding predicate to filter out rows the user is not supposed to see. With the policy now created, let's log in as our sales users and see if it worked. Recall that as a sales executive, Alice should have the ability to see all rows from North America and Europe. Sure enough, when she runs her query, she can see all rows, so we know the policy is working for her.
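Reconstructed from the narration, the policy itself might look like the following sketch: a boolean expression over the role-to-region lookup table, attached to the sales table's region column, with a masking policy of the kind mentioned above sitting on the same table. All object and role names are stand-ins, and the session is a Snowpark session as before.

```scala
// The lookup table maps each role to the regions it may see.
session.sql(
  """CREATE OR REPLACE ROW ACCESS POLICY sales_region_policy
    |AS (region VARCHAR) RETURNS BOOLEAN ->
    |  EXISTS (
    |    SELECT 1 FROM security.region_lookup r
    |    WHERE r.role_name = CURRENT_ROLE()
    |      AND r.region    = region
    |  )""".stripMargin).collect()

// Attach the policy; from now on Snowflake injects the predicate into
// every query against SALES, filtering rows per the caller's role.
session.sql("ALTER TABLE sales ADD ROW ACCESS POLICY sales_region_policy ON (region)").collect()

// A dynamic data masking policy can govern columns on the same table;
// both policies are enforced together at query time.
session.sql(
  """CREATE OR REPLACE MASKING POLICY mask_pii
    |AS (val STRING) RETURNS STRING ->
    |  CASE WHEN CURRENT_ROLE() IN ('SALES_EXEC') THEN val ELSE '*** MASKED ***' END""".stripMargin).collect()
session.sql("ALTER TABLE sales MODIFY COLUMN customer_name SET MASKING POLICY mask_pii").collect()
```

Nothing in the users' queries changes; the row filter and the column mask are applied by the engine, which is what the demo that follows shows for Alice, Alex, and Jordan.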
You may also have noticed that some columns are showing masked data. That's because our administrator is also using our previously announced data masking capabilities to protect these data attributes for everyone in sales. When we look at our other users, we should notice that the same columns are also masked for them. As you see, you can easily combine masking and row access policies on the same data sets. Now let's look at Alex, our North American sales manager. Alex runs the same query as Alice; the row access policy leverages the lookup table to dynamically generate the corresponding predicates for this query. The result is that only the data for North America is visible. Notice too that the same columns are still masked. Finally, let's try Jordan, our European sales manager. Jordan runs the query, and the result is only the data for Europe, with the same columns also masked. In June we introduced masking policies, and today you saw row access policies in action. And similar to our masking policies, row access policies in Snowflake will be a first-class capability, integrated seamlessly across all of Snowflake: everywhere you expect it to work, it does. Whether you're accessing data stored in external tables or semi-structured JSON data, building data pipelines via streams, or planning to leverage Snowflake's data sharing functionality, you will be able to implement complex row access policies for all these diverse use cases and workloads within Snowflake. And with Snowflake's unique replication feature, you can instantly apply these new policies consistently to all of your Snowflake accounts, ensuring governance across regions and even across different clouds. In the future, we plan to demonstrate how to combine our new tagging capabilities with Snowflake's policies, allowing advanced auditing and enforcement of those policies with ease. And with that, let's pass it back over to Christian. >> Thank you, Artin. We look forward to making these new tagging and row-level security capabilities available in private preview in the coming months. One last note on the broad area of data governance. A big aspect of the Data Cloud is the mobilization of data to be used across organizations. At the same time, privacy is an important consideration in ensuring the protection of sensitive, personal or potentially identifying information. We're working on a set of product capabilities to simplify compliance with privacy-related regulatory requirements and simplify the process of collaborating with data while preserving privacy. Earlier this year, Snowflake acquired a company called CryptoNumerics to accelerate our efforts on this front, including the identification and anonymization of sensitive data. We look forward to sharing more details in the future. We've just shown you three demos of new and exciting ways to use Snowflake. However, I also want to remind you that our commitment to the core platform has never been greater. As you move workloads onto Snowflake, we know you expect exceptional price performance and continued delivery of new capabilities that benefit every workload. On price performance, we continue to drive performance improvements throughout the platform. Let me give you an example comparing an identical set of customer-submitted queries that ran in both August of 2019 and August of 2020. If I look at the set of queries that took more than one second to compile, 72% of those improved by at least 50%. When we make these improvements, execution time goes down.
And by implication, the required compute time is also reduced. Based on our pricing model of charging for what you use, performance improvements not only deliver faster insights, but also translate into cost savings for you. In addition, we have two major new announcements on performance to share today. First, we announced our search optimization service during our June event. This service, currently in public preview, can be enabled on a table-by-table basis and is able to dramatically accelerate lookup queries on any column, particularly those not used as clustering columns. We initially support equality comparisons only, and today we're announcing expanded support for searches in values, such as pattern matching within strings. This will unlock a number of additional use cases, such as analytics on logs data for performance or security purposes. This expanded support is currently being validated by a few customers in private preview, and will be broadly available in the future. Second, I'd like to introduce a new service that will be in private preview in a future release: the query acceleration service. This new feature will automatically identify and scale out parts of a query that could benefit from additional resources and parallelization. This means that you will be able to realize dramatic improvements in performance. This is especially impactful for data science and other scan-intensive workloads. Using this feature is pretty simple. You define a maximum amount of additional resources that can be recruited by a warehouse for acceleration, and the service decides when it would be beneficial to use them. Given enough resources, a query over a massive data set can see orders-of-magnitude performance improvement compared to the same query without acceleration enabled. In our own usage of Snowflake, we saw a common query go 15 times faster without changing the warehouse size. All of these performance enhancements are extremely exciting, and you will see continued improvements in the future. We love to innovate and continuously raise the bar on what's possible. More important, we love seeing our customers adopt and benefit from our new capabilities. In June, we announced a number of previews, and we continue to roll those features out and see tremendous adoption, even before reaching general availability. Two of those announcements were the introduction of our geospatial support and policies for dynamic data masking. Both of these features are currently in use by hundreds of customers. The number of tables using our new geography data type recently crossed the hundred thousand mark, and the number of columns with masking policies also recently crossed the same hundred thousand mark. This momentum and level of adoption since our announcements in June is phenomenal. I have one last announcement to highlight today. In 2014, Snowflake transformed the world of data management and analytics by providing a single platform with first-class support for both structured and semi-structured data. Today, we are announcing that Snowflake will be adding support for unstructured data on that same platform. Think of the ability to use Snowflake to store, access and share files. As an example, would you like to leverage the power of SQL to reason through a set of image files? We have a few customers as early adopters, and we'll provide additional details in the future. With this, you will be able to leverage Snowflake to mobilize all your data in the Data Cloud.
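As a sketch of how the two performance features above surface to users: search optimization is enabled per table, while the query acceleration settings shown here are an assumption about how the preview service might be configured, not confirmed syntax from the keynote. The table and warehouse names are placeholders, and the session is a Snowpark session as before.

```scala
// Enable the search optimization service on a large log table; point
// lookups on non-clustering columns can then use its access paths.
session.sql("ALTER TABLE security_logs ADD SEARCH OPTIMIZATION").collect()
session.sql("SELECT * FROM security_logs WHERE session_id = 'ab12-ff9'").show()

// Hypothetical configuration for the query acceleration service: allow a
// warehouse to recruit up to 8x its size for eligible query fragments.
// (The service is described as private preview; the exact knobs may differ.)
session.sql(
  """ALTER WAREHOUSE ds_wh SET
    |  ENABLE_QUERY_ACCELERATION = TRUE
    |  QUERY_ACCELERATION_MAX_SCALE_FACTOR = 8""".stripMargin).collect()
```

In both cases the appeal is the same as elsewhere in the platform: a declarative switch, with the service deciding when extra resources or index-like structures actually pay off.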
Our customers rely on Snowflake as the data platform for every part of their business. However, the vision and potential of Snowflake are actually much bigger than the four walls of any organization. Snowflake has created the Data Cloud, a data-connected network, with a vision where any Snowflake customer can leverage and mobilize the world's data. Whether it's data sets or data services from traditional data providers or SaaS vendors, our marketplace creates opportunities for you and raises the bar in terms of what is possible. As examples, you can unify data across your supply chain to accelerate your time and quality to market. You can build entirely new revenue streams, or collaborate with a consortium on data for good. The possibilities are endless. Every company has the opportunity to gain richer insights, build greater products and deliver better services by reaching beyond the data that it owns. Our vision is to enable every company to leverage the world's data through seamless and governed access. Snowflake is your window into this data network, into this broader opportunity. Welcome to the Data Cloud. (upbeat music)