
Search Results for dodgeville:

Breaking Analysis: Further Defining Supercloud with Tech Leaders from VMware, Snowflake, Databricks & Others


 

From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR, this is Breaking Analysis with Dave Vellante.

At our inaugural Supercloud 22 event we further refined the concept of a supercloud, iterating on the definition, the salient attributes, and some examples of what is and what is not a supercloud. Welcome to this week's Wikibon CUBE Insights, powered by ETR. Snowflake has always been what we feel is one of the strongest examples of a supercloud, and in this Breaking Analysis, from our studios in Palo Alto, we unpack our interview with Benoit Dageville, co-founder and president of products at Snowflake, and we test our supercloud definition on the company's Data Cloud platform. We're really looking forward to your feedback.

First, let's examine how we define supercloud. Very importantly, one of the goals of Supercloud 22 was to get the community's input on the definition and to iterate on previous work. A supercloud is an emerging computing architecture that comprises a set of services abstracted from the underlying primitives of hyperscale clouds: services such as compute, storage, networking, security, and other native tooling like machine learning and developer tools. The goal is to create a global system that spans more than one cloud. A supercloud, as shown on this slide, has five essential properties, X number of deployment models, and Y number of service models. We're looking for community input on X and Y, and on the first point as well, so please weigh in and contribute.

We've identified five essential elements of a supercloud. First, a supercloud has to run its services on more than one cloud, leveraging the cloud-native tools offered by each of the cloud providers. The builder of the supercloud platform is responsible for optimizing the underlying primitives of each cloud for the platform's specific needs, be it cost, performance, latency, governance, data sharing, or security. But those primitives must be abstracted such that a common experience is delivered across the clouds for both users and developers. A supercloud has a metadata intelligence layer that can maximize efficiency for the specific purpose the supercloud is intended for, and it does so in a federated model. And it includes what we call a superPaaS, a prerequisite: a purpose-built component that enables ecosystem partners to customize and monetize incremental services while at the same time ensuring that the common experience exists across clouds. The sketch below illustrates what that abstraction of cloud primitives can look like in practice.
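To make the abstraction idea concrete, here is a minimal sketch, our own illustration rather than anything Snowflake disclosed, of a common interface over each cloud's native object-storage primitive. The class names are hypothetical, and the real SDK calls (boto3, azure-storage-blob, and so on) are stubbed out with an in-memory dictionary so the example runs standalone.

```python
from abc import ABC, abstractmethod


class ObjectStore(ABC):
    """One interface over each cloud's native blob-storage primitive."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class S3Store(ObjectStore):
    # A real implementation would wrap boto3's put_object/get_object calls.
    def __init__(self):
        self._blobs: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


class AzureBlobStore(ObjectStore):
    # A real implementation would wrap azure.storage.blob upload/download calls.
    def __init__(self):
        self._blobs: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def replicate(stores: list[ObjectStore], key: str, data: bytes) -> None:
    """Callers get one experience no matter which cloud backs the storage."""
    for store in stores:
        store.put(key, data)


replicate([S3Store(), AzureBlobStore()], "events/2022-08-14.json", b"{}")
```

The point of the pattern is that everything above this interface, the user and developer experience, stays identical, while each subclass absorbs the per-cloud differences.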
Now, in terms of deployment models, we'd really like more feedback on this piece, but here's where we are so far, based on the feedback we got at Supercloud 22. We see three deployment models. In the first, a control plane may run on one cloud but support data-plane interactions with more than one other cloud. The second model instantiates the supercloud services on each individual cloud and within regions, and can support interactions across more than one cloud, with a unified interface connecting those instances to create a common experience. The third model superimposes its services as a layer, or, in the case of Snowflake, a mesh, on top of the cloud providers' regions, with a single global instantiation of those services spanning multiple cloud providers. This is our understanding, from the conversation with Benoit Dageville, of how Snowflake approaches its solution. For now we're going to park the service models; we need more time to flesh that out, and we'll propose something shortly for you to comment on.

We peppered Dageville at Supercloud 22 to test how the Snowflake Data Cloud aligns with our concepts and our definition. Let me also say that Snowflake doesn't use the term supercloud. They want to respect, not diminish, the importance of their hyperscale partners, and so do we. We don't think the hyperscalers, today anyway, are building what we call superclouds, but the people who are building superclouds are building on top of hyperscale clouds. That is a prerequisite.

So here are the questions we tested with Snowflake. First question: how does Snowflake architect its Data Cloud, and what is its deployment model? Listen to Dageville talk about how Snowflake has architected a single system. Play the clip.

"There are several ways to do a supercloud, as you name them. The way we picked is to create one single system, and that's very important. You can instantiate your solution in every region of a cloud, and one region could be AWS, another could be GCP, so you are indeed a multi-cloud solution. But Snowflake did it differently. We are creating cloud regions which are superimposed on top of the cloud providers' infrastructure regions. We are building our own regions, but where it's very different is that each region of Snowflake is not one instantiation of our service. Our service is global by nature. We can move data from one region to the other. When you land in Snowflake, you land in one region, but you can grow from there, and you can exist in multiple clouds at the same time. It's not many different instantiations of a system; it is one single instantiation which covers many cloud regions and many cloud providers."

So Snowflake chose the most advanced of our three deployment models, the third model Dageville talked about, presumably so it could maintain maximum control and ensure that common experience, like the iPhone model. The small sketch below models the three options.
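Here's a toy enumeration of the three deployment models as we've framed them, our own classification with hypothetical region names, useful mostly as a vocabulary check.

```python
from dataclasses import dataclass
from enum import Enum, auto


class DeploymentModel(Enum):
    SINGLE_CONTROL_PLANE = auto()  # control plane on one cloud, data planes on others
    FEDERATED_INSTANCES = auto()   # one instance per cloud/region, unified interface on top
    GLOBAL_MESH = auto()           # one global instantiation layered across clouds


@dataclass
class SupercloudDeployment:
    model: DeploymentModel
    regions: list[str]


# Snowflake, as described in the clip, maps to the third model.
snowflake_style = SupercloudDeployment(
    model=DeploymentModel.GLOBAL_MESH,
    regions=["aws-us-west-2", "azure-eastus2", "gcp-us-central1"],  # hypothetical names
)
print(snowflake_style.model.name)
```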
Next we probed the technical enablers of the Data Cloud. Listen to Dageville talk about Snowgrid. He uses the term mesh, which can get confusing with Zhamak Dehghani's data mesh concept, but listen to his explanation.

"As I said, first we start by building Snowflake regions. We have regions today that span the world, so it's a worldwide system with many regions, but all these regions are connected together. They are meshed together with our technology, which we name Snowgrid, and that makes it so an Azure region can talk to an AWS region or to GCP regions. As a user of our cloud, you don't really see these regional differences, that regions are in potentially different clouds. When you use Snowflake, your presence as an organization can be in several regions and several clouds if you want, both geographically and by cloud provider."

So I can share data irrespective of which cloud I'm in across the Snowflake Data Cloud; is that correct? I can do that today?

"Exactly, and that's very critical. What we wanted was to remove data silos. When you instantiate a system in one single region and that system is locked in that region, you cannot communicate with other parts of the world; you are locking the data in one region. We didn't want to do that. We wanted data to be distributed the way the customer wants it distributed across the world, and potentially shared at world scale."

Now, maybe there are other ways to skin this cat; perhaps if a platform does instantiate in multiple places, there are ways to share data. But this is how Snowflake chose to approach the problem.

Next question: how do you deal with latency in this big global system? This is really important to us because, while Snowflake has some really smart engineers, we don't think they've solved the speed-of-light problem; nobody has, as we often joke. Listen to Dageville's comments on this topic.

"So, yes and no. It's very expensive to do that, because generally if you want to join data which sits in different regions and different clouds, it's going to be very expensive: you need to move the data every time you join it. So the way we do it is that you replicate the subset of data that you want to access from other regions. You can create this data mesh, but data is replicated to make it very cheap and very performant."

And does Snowgrid have the metadata intelligence? Can you describe that a little bit?

"Yes. Snowgrid is a way to exchange metadata. Each region of Snowflake knows about all the other regions of Snowflake. Every time we create a new region, the metadata is distributed over our Data Cloud; not only does each region know all the regions, it knows every organization that exists in our clouds, where that organization is, and where its data can be replicated. And of course it's also used as a way to exchange data, at petabyte scale. I was just receiving an email from one of our customers who moved more than four petabytes of data cross-region, across cloud providers, in a few days. It's a lot of data, so it takes some time to move, but they were able to do it completely online and switch over to the other region; failover is very important also."

"Yes and no" typically means no. It sounds like Snowflake is selectively pulling small amounts of data and replicating it where necessary. But you also heard him talk about the metadata layer, which is one of the essential aspects of supercloud. The sketch after this paragraph shows the kind of trade-off he's describing.
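A toy planner illustrating the economics Dageville describes: cross-region joins move data on every query, so the platform prefers a local replica. The prices and threshold logic here are invented purely for illustration; real egress rates vary by provider and route.

```python
# Hypothetical per-GB egress pricing between clouds.
EGRESS_COST_PER_GB = {
    ("aws", "azure"): 0.09,
    ("aws", "gcp"): 0.09,
    ("azure", "gcp"): 0.08,
}


def plan_join(left_cloud: str, right_cloud: str, right_gb: float,
              replicas: set[tuple[str, str]]) -> str:
    """Decide how to execute a join whose right side lives in another cloud."""
    if left_cloud == right_cloud or (right_cloud, left_cloud) in replicas:
        return "local join"  # data already here: cheap and fast
    pair = tuple(sorted((left_cloud, right_cloud)))
    per_query = right_gb * EGRESS_COST_PER_GB[pair]
    # Replicating once amortizes the transfer across every future query,
    # instead of paying the egress bill on each join.
    return f"replicate subset once (~${per_query:.2f} transfer), then join locally"


print(plan_join("aws", "azure", right_gb=500, replicas=set()))
```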
Okay, next we dug into security. It's one of the most important issues, and we think one of the hardest parts of deploying a supercloud. We've talked about how the cloud has become the first line of defense for the CISO, but now, with multi-cloud, you have multiple first lines of defense, and that means multiple shared-responsibility models, multiple tool sets from different cloud providers, and an expanded threat surface. Listen to Dageville's explanation. Play the clip.

"This is a great question. Security has always been the most important aspect of Snowflake since day one. This is the question every customer of ours has: how can you guarantee the security of my data? We secure data really tightly in-region, with several layers of security. It starts by encrypting every piece of data at rest, and that's very important; a lot of customers are not doing that. You hear about these attacks on cloud where someone left their buckets open and you can access the data because it's not encrypted. We encrypt everything at rest and everything in transit, so a region is very secure. And in Snowflake you never access data in one region from another region; that's also why we replicate data. The replication of that data across regions, and of the metadata for that matter, is highly secure. Snowgrid ensures that everything is encrypted, we have multiple encryption keys, and they're stored in hardware security modules. We built Snowgrid such that it's secure and allows very secure movement of data."

When we heard this explanation, we immediately went to the lowest-common-denominator question: how AWS, for instance, deals with data in motion or data at rest might be different from how another cloud provider deals with it, and there are differences in the maturity of various cloud capabilities. Say AWS has a faster Nitro or Graviton: does Snowflake have to slow everything down, like a caravan crossing the desert at the speed of the slowest truck? Let's listen.

"It's a great question. Our software abstracts all the cloud providers' infrastructure, so that when you run in one region, say AWS or Azure, it doesn't make any difference as far as the applications are concerned. And this abstraction is a lot of work, really a lot of work, because it needs to be secure and performant on every cloud, and it has to expose APIs which are uniform. Cloud providers, even though they have potentially the same concepts, say blob storage, have completely different APIs. The way these systems are secured is completely different; the errors you can get, and the retry mechanisms, are very different from one cloud to the other. Performance is also different too. We discovered that when we started to port our software: we had to completely rethink how to leverage blob storage in one cloud versus another just because of performance, and we had, for example, to stripe data. All this is work you don't need to do as an application, because our vision really is that applications running in our Data Cloud can be abstracted from all these differences. And we provide all the services those applications need, whether it's transactional access to data, analytical access to data, managing logs, managing metrics; all of this is abstracted too, such that they are not tied to one particular service of one cloud, and distributing an application across many regions and many clouds is very seamless."

The sketch below illustrates one of the differences he calls out, per-cloud transient-error handling.
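A minimal sketch of normalizing per-cloud retry behavior. The error codes below are representative examples we chose for illustration, not a complete or authoritative list for any provider.

```python
import random
import time

# Illustrative transient-error codes; each provider names throttling differently.
TRANSIENT = {
    "aws": {"SlowDown", "RequestTimeout"},
    "azure": {"ServerBusy", "OperationTimedOut"},
    "gcp": {"rateLimitExceeded", "backendError"},
}


class CloudError(Exception):
    def __init__(self, provider: str, code: str):
        super().__init__(f"{provider}:{code}")
        self.provider, self.code = provider, code


def with_retries(op, provider: str, max_attempts: int = 5):
    """Retry only errors the provider treats as transient, with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return op()
        except CloudError as err:
            last_try = attempt == max_attempts - 1
            if err.code not in TRANSIENT[provider] or last_try:
                raise  # permanent failure, or out of attempts
            time.sleep(min(2 ** attempt, 30) * random.uniform(0.5, 1.5))
```

The layer above with_retries never needs to know which cloud it is running on, which is the uniformity Dageville is describing.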
So from that answer we know that Snowflake takes care of everything, though we really don't understand the performance implications in specific cases. But we feel pretty certain that the promises Snowflake makes around governance and security within its data-sharing construct will be kept.

Now, another criterion we've proposed for supercloud is a superPaaS layer, to create a common developer experience and to enable ecosystem partners to monetize. Please play the clip. Let's listen.

"We built it, a custom build, because, as you said, what exists in one cloud might not exist in another cloud provider. So we had to build all these components that modern applications need, and that goes to machine learning, transactional and analytical systems, the entire thing, such that they can run in isolation, basically."

And the objective is that the developer experience will be identical across those clouds?

"Yes. Developers don't need to worry about the cloud provider. And actually our system, we haven't talked about it, but the marketplace that we have allows applications to actually be delivered..."

We're getting there. Okay, we're not going to go deep into ecosystem today; we've talked about Snowflake's strengths in that regard. But Snowflake pretty much ticked all the boxes on our supercloud attributes and definition. We asked Dageville to confirm that this is all shipping and available today, and he also gave us a glimpse of the future. Play the clip.

"We are still developing it. The transactional piece, Unistore as we call it, was announced at last Summit, so they are still working on it. But that's the vision, and that's important, because we've talked a lot about the infrastructure, storage and compute, but it's not only that. When you think about applications, they need to use a transactional database, they need to use an analytical system, they need to use machine learning, so you need to provide all these services, consistent across all the cloud providers."

So you can hear Dageville talking about expanding beyond taking advantage of the core infrastructure, storage, networking, and so on, and bringing intelligence to the data through machine learning and AI. Of course there's more to come, and there had better be at this company's valuation, despite the recent sharp pullback in a tightening Fed environment. The toy registry below captures the superPaaS idea in miniature.
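A toy illustration of the superPaaS idea: one logical service name resolves to a per-cloud backend, so the developer's code is identical everywhere. All names here are hypothetical; this is our rendering of the concept, not Snowflake's design.

```python
# Hypothetical registry: a logical capability, one backend implementation per cloud.
SERVICES = {
    "transactional": {"aws": "txn-aws", "azure": "txn-az", "gcp": "txn-gcp"},
    "analytical": {"aws": "olap-aws", "azure": "olap-az", "gcp": "olap-gcp"},
    "ml": {"aws": "ml-aws", "azure": "ml-az", "gcp": "ml-gcp"},
}


def bind(capability: str, cloud: str) -> str:
    """Developers ask for a capability; the platform picks the cloud backend."""
    return SERVICES[capability][cloud]


# The application code below never changes, whichever cloud it lands on.
for cloud in ("aws", "azure", "gcp"):
    print(f"query engine on {cloud}: {bind('analytical', cloud)}")
```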
Okay, I know it's cliché, but everyone's comparing Snowflake and Databricks. Databricks has been pretty vocal about its open-source posture compared to Snowflake's, and it just so happens we had Ali Ghodsi on at Supercloud 22 as well. He wasn't in studio; he had to join remotely because he was presenting at an investor conference that week. I didn't get to do that interview, John Furrier did, but I listened to it and captured this clip about how Databricks sees supercloud and the importance of open source. Take a listen to Ghodsi.

"Let me start by saying we're big fans of open source. We think open source is a force in software that's going to continue for decades, hundreds of years, and it's going to slowly replace all proprietary code in its way. We saw that it could do that with the most advanced technology: Windows, a proprietary, very complicated operating system, got replaced with Linux. So open source can pretty much do anything, and what we're seeing with the data lakehouse is that the open-source community is slowly building a replacement for the proprietary data warehouse, data lake, machine learning, and real-time stack, in open source, and we're excited to be part of it. For us, Delta Lake is a very important project that really helps you standardize how you lay out your data in the cloud, and with it comes a really important protocol called Delta Sharing that enables you, in an open way, actually for the first time ever, to share large data sets between organizations. Because it uses an open protocol, you don't need to be a Databricks customer; you don't even need to like Databricks. You just use this open-source project, and you can securely share data sets between organizations across clouds. And it does so really efficiently: just one copy of the data, so you don't have to copy it if you're within the same cloud."

The implication of Ghodsi's comments is that Databricks, with Delta Sharing, is, as John implied, playing a long game. A quick sketch of what consuming a Delta Share looks like follows.
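For flavor, here is a rough sketch of consuming a Delta Share with the open-source delta-sharing Python connector. The profile path and the share, schema, and table names below are hypothetical; in practice the data provider issues the small profile file containing the endpoint and an access token.

```python
# pip install delta-sharing
import delta_sharing

# The provider hands you a "profile" file with the endpoint and a bearer token.
profile = "partner.share"  # hypothetical path

# Discover what has been shared with you.
client = delta_sharing.SharingClient(profile)
print(client.list_all_tables())

# Load one shared table straight into pandas: no FTP, no CSV wrangling, and
# within the same cloud it reads the provider's single copy of the data.
url = profile + "#sales_share.retail.transactions"  # hypothetical share/schema/table
df = delta_sharing.load_as_pandas(url)
print(df.head())
```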
Now, I don't know enough about the Databricks architecture to comment in detail, and I've got to do more research there, so I reached out to my two analyst friends, Tony Baer and Sanjeev Mohan, to see what they thought, because they cover these companies pretty closely. Here's what Tony Baer said:

"I've viewed the divergent lakehouse strategies of Databricks and Snowflake in the context of their roots. Prior to Delta Lake, Databricks' prime focus was the compute, not the storage layer; more specifically, they were a compute engine, not a database. Snowflake approached from the opposite end of the pool, as they originally fit the mold of the classic database company rather than a specific compute engine per se. The lakehouse pushes both companies outside their original comfort zones: Databricks to storage, Snowflake to the compute engine. So it makes perfect sense for Databricks to embrace the open-source narrative at the storage layer, and for Snowflake to continue its walled-garden approach. But in the long run their strategies are already overlapping. Databricks is not a 100% open-source company: its practitioner experience has always been proprietary, and now so is its SQL query engine. Likewise, Snowflake has had to open up with support for Iceberg as an open data lake format. The question really becomes how serious Snowflake will be in making Iceberg a first-class citizen in its environment, that is, not necessarily officially branding a lakehouse but effectively building one, and likewise whether Databricks can deliver the service levels associated with walled gardens through a more brute-force approach that relies heavily on the query engine. At the end of the day, those are the key requirements that will matter to Databricks and Snowflake customers."

That was some deep thought by Tony; thank you for that. Sanjeev Mohan added the following:

"Open source is a slippery slope. People buy mobile phones based on open-source Android, but it's not fully open. Similarly, Databricks' Delta Lake was not originally fully open source, and even today its Photon execution engine is not. We are always going to live in a hybrid world. Snowflake and Databricks will support whatever model works best for them and their customers. The big question is: do customers care as deeply about which vendor has a higher degree of openness as we technology people do? I believe customers' evaluation criteria are far more nuanced than just deciphering each vendor's open-source claims."

Okay, so I had to ask Dageville about the so-called walled-garden approach and Snowflake's strategy with Apache Iceberg. Here's what he said:

"Iceberg is very important. Just to give some context, Iceberg is an open table format which was first developed by Netflix, and Netflix open-sourced it into the Apache community. We embrace that open-source standard because it's widely used by many companies, and also because many companies have invested a lot of effort in building big data Hadoop solutions or data lake solutions; they wanted to use Snowflake and they couldn't, because all their data was in open formats. So we are embracing Iceberg to help these companies move to the cloud. But why have we been reluctant about direct access to data? Direct access to data is a bit of a problem for us, because when you have direct access to data, you have direct access to storage, and then you have to understand, for example, the specificities of one cloud versus another. As soon as you have direct access to data, you lose your cloud-agnostic layer; you're not accessing data through an API. It's also very hard to secure data that way, because you need to grant direct access to tools which are not protected, and you see a lot of hacking of data because of that. So direct access to data was not serving our customers well, and that's why we've been reluctant to do it: it's not cloud-agnostic, and it requires a lot of intelligence on the client side, whereas API access does not. So we want open APIs; that's, I guess, the way we embrace openness: open APIs versus direct access to data."

Here's my take: Snowflake is hedging its bets, because enough people care about open source that they have to offer some open data format options, and it's good optics. And you heard Dageville talk about the risks of directly accessing the data and the complexities it brings. Is that maybe a little FUD against Databricks? Maybe. But the same can be said for Ali's comments, maybe FUDing the proprietary-ness of Snowflake. As both analysts pointed out, open is a spectrum. Hey, I remember when Unix used to equal "open systems."

Okay, let's end with some ETR spending data, and why not compare the Snowflake and Databricks spending profiles? This is an XY graph with Net Score, or spending momentum, on the y-axis, and pervasiveness, or overlap in the data set, on the x-axis. In the January survey data, Snowflake was holding above an 80% Net Score, off the charts, and Databricks was also very strong, in the upper 60s. Now fast-forward to the next chart, showing the July ETR survey data, and you can see Snowflake has come back down to earth. Remember, anything above a 40% Net Score is highly elevated; the sketch after this paragraph shows how that metric is computed.
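For readers new to the data set, here is our understanding of how ETR's Net Score is computed: the survey asks customers whether they are adopting a platform, increasing spend, holding flat, decreasing, or replacing it. The response percentages below are made up for illustration, not actual survey results.

```python
def net_score(adopting: float, increasing: float, flat: float,
              decreasing: float, replacing: float) -> float:
    """Net Score: share of customers adding/expanding minus share cutting/replacing."""
    total = adopting + increasing + flat + decreasing + replacing
    return 100.0 * ((adopting + increasing) - (decreasing + replacing)) / total


# Illustrative responses from 100 hypothetical customers. Anything above 40
# is considered a highly elevated spending signal.
print(net_score(adopting=25, increasing=45, flat=20, decreasing=7, replacing=3))  # 60.0
```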
Both companies are doing well, but Snowflake is well off its highs, and Databricks has come down somewhat as well. Databricks is inching to the right; Snowflake rocketed to the right post-IPO, and as we know, Databricks wasn't able to get to IPO during the COVID bubble. Ali Ghodsi is at the Morgan Stanley CEO conference this week. They've got plenty of cash to withstand a long-term recession, I'm told, and they've started to message that they're at a billion dollars in annualized revenue; I'm not sure exactly what that means. I've seen some numbers on their gross margins, and some numbers on their net revenue retention; again, I'll reserve judgment until we see an S-1. But it's clear both of these companies have momentum, and they're out competing in the market, which, as always, will be the ultimate arbiter. Different philosophies, perhaps. Is it like Democrats and Republicans? It could be, but they're both going after solving the data problem. Both companies are trying to help customers get more value out of their data, and both companies are highly valued, so they have to perform for their investors. To paraphrase Ralph Nader, the similarities may be greater than the differences.

Okay, that's it for today. Thanks to the team from Palo Alto for this awesome Supercloud studio build. Alex Myerson and Ken Schiffman are on production in the Palo Alto studios today; Kristin Martin and Cheryl Knight get the word out to our community; and Rob Hof is our editor-in-chief over at SiliconANGLE. Thanks to all. Please check out etr.ai for all the survey data. Remember, these episodes are all available as podcasts wherever you listen; just search "Breaking Analysis podcast." I publish each week on wikibon.com and siliconangle.com, and you can email me at david.vellante@siliconangle.com, DM me @dvellante, or comment on my LinkedIn posts. And as I say, ETR has some of the best survey data in the business; we track it every quarter and are really excited to be partners with them. This is Dave Vellante for theCUBE Insights, powered by ETR. Thanks for watching, and we'll see you next time on Breaking Analysis.

Published Date : Aug 14 2022


Benoit & Christian Live


 

>>Okay, we're now going into the technical deep dive; we're gonna geek out here a little bit. Benoit Dageville is here. He's co-founder of Snowflake and president of products. And also joining us is Christian Kleinerman, who's the senior vice president of products. Gentlemen, welcome. Good to see you.

>>Yeah, great to see you.

>>Thanks for having us.

>>You're very welcome. So, Benoit, we've heard a lot this morning about the Data Cloud, and it's becoming, in my view anyway, the linchpin of your strategy. I'm interested in what technical decisions you made early on that led you to this point and even enabled the Data Cloud.

>>Yes. So I would say that the Data Cloud was built in three phases. The initial phase, as you call it, was really about one region, the Data Cloud in that region. What was important was to make that region infinitely scalable, and that's our architecture, which we call the multi-cluster shared data architecture, so that you can plug in as many workloads in that region as you want, without any limits. The limit is really the underlying cloud provider's resources, and the cloud provider region has really no limits. So that region architecture was the building block of the Snowflake Data Cloud.

But it didn't stop there. The second aspect was really data sharing: how to communicate within the region, how to share data between tenants of that region, between different customers. That was also enabled by the architecture, because we decoupled compute and storage, so compute clusters can access any storage within the region. That's the base of the Data Cloud.

And then phase three, which is critical, is the global expansion: how we made our cloud-agnostic layer so that we could port the Snowflake system to different clouds. Now we are running on top of three cloud providers. We started with AWS in US West, we moved to Azure, and then to Google GCP. We started with one cloud region, as I said, in AWS US West, and then we created many different regions; we have 22 regions today, all over the world and across the different cloud providers. And what's more important is that these regions are not isolated. Snowflake is one single system for the world, where we created this global data mesh which connects every region, such that not only is the Snowflake system as a whole aware of all these regions, but customers can replicate data across regions and share data across the planet if need be. So this is one single, I call it, worldwide web of data. That's the vision of the Data Cloud, and it really started with that building block, the cloud region.
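To picture the "global data mesh" Dageville describes, where every region is aware of every other region and of every organization, here's a toy model of the metadata propagation. All names, and the propagation scheme itself, are our own simplification.

```python
class Region:
    def __init__(self, name: str, cloud: str):
        self.name, self.cloud = name, cloud
        self.known_regions: set[str] = set()
        self.known_orgs: dict[str, str] = {}  # org -> home region


class GlobalMesh:
    """One single system: metadata about regions and orgs exists everywhere."""

    def __init__(self):
        self.regions: list[Region] = []

    def add_region(self, region: Region) -> None:
        self.regions.append(region)
        names = {r.name for r in self.regions}
        for r in self.regions:  # every region learns the full topology
            r.known_regions = set(names)

    def register_org(self, org: str, home: str) -> None:
        for r in self.regions:  # every region knows every organization
            r.known_orgs[org] = home


mesh = GlobalMesh()
for name, cloud in [("aws-us-west", "aws"), ("azure-west-europe", "azure"),
                    ("gcp-us-central", "gcp")]:  # hypothetical region names
    mesh.add_region(Region(name, cloud))
mesh.register_org("acme", home="aws-us-west")
print(mesh.regions[0].known_regions, mesh.regions[2].known_orgs)
```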
>>Thank you for that, Benoit. Christian, you and I have talked about this. That notion of stripping away the complexity is kind of what the Data Cloud does. But if you think about data architectures, historically they really had no domain knowledge; they've been focused on the technology to ingest, analyze, and prepare, and then push data out to the business. You're really flipping that model, allowing the domain leaders to be first-class citizens, if you will, because they're the ones creating data value, and they're worrying less about infrastructure. But I wonder, do you feel like customers are ready for that change?

>>I love the observation, Dave. So much energy in enterprises and organizations today goes into just dealing with infrastructure, pipes, plumbing, and things like that. Something that was insightful from Benoit and our founders from day one was: this is a managed service. We want our customers to focus on the data, getting the insights, getting the decisions in time, not managing pipes, plumbing, patches, and upgrades. The other piece, and it's an interesting reality, is this belief that the cloud simplifies all of it and all of a sudden there's no problem. Actually, understanding each of the public cloud providers is a large undertaking. Each of them has a hundred-plus services, sending upgrades and updates on a constant basis, and that just distracts from the time it takes to say: here's my data, here's my data model, here's how I make better decisions. So at the heart of everything we do, we want to abstract the infrastructure. We want to abstract the nuance of each of the cloud providers and, as you said, have companies focus on the domain expertise, the knowledge of their industry. Are all companies ready for it? I think it's a mixed bag. We talk to customers on a regular basis, every week, every day. Some of them are full-on; they've sort of burned the bridges: "I'm going to the cloud, I'm going to embrace a new model." From some others you see complete shock-and-awe expressions: "What do you mean I don't have all these knobs to turn?" But I think the future is very clear on how we get companies to be more competitive through data.

>>Well, Benoit, it's interesting that Christian mentioned "managed service." That term used to mean hosting: guys running around in lab coats, plugging things in. Of course you're looking at this differently, with high degrees of automation. One of those areas is workload management, and I wonder how you think about workload management and how it changes with the Data Cloud.

>>Yeah, this is a great question. Workload management used to be a nightmare, a nightmare for the DBAs, who had to spend a lot of their time just managing workloads. And why is that? Because all these workloads run on a single system, a single cluster, and they compete for resources. I always explain managing workloads that way as playing Tetris: you had to know when to run each workload and make sure two big workloads didn't overlap. Maybe the ETL is pushed to run at night, in a nightly window, which is not efficient; you add delays to your ETL because of that. But you have no choice: you have a fixed amount of resources, and you have to get the best out of those fixed resources.
And for sure you don't want your ETL to impact your dashboarding workloads or your reports, or your data science. This became a true nightmare, because everyone wants to be data-driven, meaning the entire company wants to run new workloads on the system, and these systems were completely overwhelmed. So workload management was a nightmare before Snowflake, and Snowflake made it really easy.

The reason is that in Snowflake we leverage the cloud to dedicate compute resources to each workload. In Snowflake terminology, the unit is called a warehouse, a virtual warehouse, and each workload can run in its own virtual warehouse, with its own dedicated compute resources and I/O bandwidth. You can really control how much resource each workload gets by sizing these warehouses, adjusting the compute resources they can use. When a workload starts to execute, the warehouse's compute resources are turned on automatically by Snowflake, resuming the warehouse, and when it's over, they're automatically turned off, so you don't pay for resources you don't use. And you can dynamically resize a warehouse: it can be done automatically by the system if the concurrency of the workload increases, or manually by the administrator.

The beauty of that model is not only the very fine-grained control over the resources each workload gets, with workloads neither competing with nor impacting one another, but also that you can add as many workloads as you want. That's really critical because, as I said, everyone in the organization wants to use data to make decisions, so you have more and more workloads running. That Tetris game would have been impossible on a centralized, single-compute-cluster system. The flip side is that, as an administrator of the system, you have to justify that a workload is worth running for your organization. It's so easy: literally in seconds you can stand up a new warehouse and start to run your queries on that new compute cluster. And of course you have to justify the cost, because there is a cost: Snowflake charges by the second of compute. It's so easy now to add new workloads and do new things with Snowflake that you have to look at the trade-off between the value and, of course, managing the cost.
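Here's a toy model of the isolation and pay-per-second behavior Dageville describes; this is our own sketch, not Snowflake code. (In Snowflake SQL, as we understand the syntax, this maps to creating per-workload warehouses with options like AUTO_SUSPEND and AUTO_RESUME.)

```python
from dataclasses import dataclass


@dataclass
class VirtualWarehouse:
    """Dedicated, auto-suspending compute for exactly one workload."""
    name: str
    size: str = "XSMALL"
    running: bool = False
    seconds_billed: int = 0

    def run(self, query_seconds: int) -> None:
        if not self.running:
            self.running = True  # auto-resume on the first query
        self.seconds_billed += query_seconds  # billed by the second, only while used

    def suspend(self) -> None:
        self.running = False  # a suspended warehouse costs nothing


etl = VirtualWarehouse("ETL_WH", size="LARGE")
bi = VirtualWarehouse("DASHBOARD_WH")  # isolated: the ETL can never slow it down

etl.run(3600)
etl.suspend()
bi.run(120)
print(etl.seconds_billed, bi.seconds_billed)  # 3600 120
```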
>>So, Christian, Benoit used the term "nightmare," and I'm thinking about the previous days of workload management. I talk to a lot of customers trying to reduce the elapsed time of going from data to insights, and their nightmare is this complicated data lifecycle. I'm wondering how you guys think about that: the notion of compressing the elapsed time to data value, from raw data to insights.

>>Yeah, so we obsess, we think a lot, about this time to insight: from the moment an event happens to the point it shows up in a dashboard or a report, or some decision or action happens based on it. There are three parts to how we reduce that lifecycle.

The first one, which ties to our previous conversation, is: where is there muscle memory, processes or ways of doing things that don't actually make much sense? My favorite example: ask any organization, "Do you run pipelines, ingestion, and transformation at two and three in the morning?" The answer is, "Oh yeah, we do that." And if you ask why, the answer is typically, "Well, that's when the resources are available." Back to Benoit's Tetris, right? That's when it was possible. But would you really want to run it at two and three in the morning if you could do it sooner, more in real time, when the event happened? So the first part is removing the constraints of the infrastructure: run transformations and ingestion when the business needs it, at the lowest time to insight, the lowest latency, now that the technology lets you do it. That's the easy one out the door.

The second one is: instead of just fully optimizing a process, where can you remove steps of the process entirely? This is where all of our data sharing and the Snowflake Data Marketplace come into play. Say you need data from a SaaS application vendor, or maybe from a commercial data provider. Imagine the dream: you wouldn't have to run constant iterations and FTPs and crack open CSV files and things like that. What if the data is just always available in your environment, always up to date? In our minds that's a lot more revolutionary: not optimizing the process of ingesting and copying data, but not copying in the first place.

And then number three is what we do day in and day out: making sure our platform delivers the best performance. Make it faster. The combination of those three things has led many of our customers, and you'll see it through many of the customer testimonials today, to get insights, decisions, and actions way faster: in part by removing steps, in part by doing away with old habits, and in part because we deliver exceptional performance.

>>Thank you, Christian. Now, Benoit, as you know, we're big proponents of this idea of domain-driven design in data architecture: for example, customers building entire applications, what I call data products or data services, on their data platform. I wonder if you could talk about the types of applications and services you're seeing built on top of Snowflake.

>>Yeah, and I have to say this is a critical aspect of Snowflake: to create this platform and really help applications be built on top of it. The more applications we have, the better the platform will be. It's like the analogy with your iPhone: if your iPhone had no applications, it would be useless, an empty platform. So we are really encouraging applications to be built on top of Snowflake, and indeed many of our customers are building applications on Snowflake; we estimate that about 30% are already running applications on top of our platform. The reason is of course that it's so easy to get compute resources, and there are no limits on scalability and availability. All these characteristics are critical for an application, and we've delivered them from day one. Now we have increased the scope of the platform by adding Java computation and Snowpark, which was announced today; that's also an enabler.
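For readers who haven't seen it, here's a rough sketch of the Snowpark DataFrame pattern. Snowpark as announced at this event targeted Java and Scala; a Python flavor following the same pattern arrived later. The connection parameters and table names below are hypothetical, and the API details reflect our reading of the client library rather than anything shown on stage.

```python
# pip install snowflake-snowpark-python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col

params = {  # hypothetical credentials
    "account": "myorg-myaccount",
    "user": "me",
    "password": "...",
    "warehouse": "APP_WH",
    "database": "APP_DB",
    "schema": "PUBLIC",
}
session = Session.builder.configs(params).create()

# DataFrame operations compile to SQL and run inside Snowflake's engine,
# so the application ships no data-processing infrastructure of its own.
enriched = (
    session.table("EVENTS")
    .filter(col("SCORE") > 0.9)
    .select("USER_ID", "SCORE")
)
enriched.write.save_as_table("HIGH_SCORE_EVENTS", mode="overwrite")
```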
In terms of the types of applications, it's really all over the map, and what I like, actually, is to be surprised. I don't know what will be built on top of Snowflake. But with data sharing we are also opening the door to a new type of application, delivered via the data marketplace, where one can get an application directly inside the platform; the platform is distributing the application. Today there was a presentation in Christian's keynote about a machine-learning application which provides, to any user of Snowflake, machine learning to build and apply models on your data and enrich it. I think data enrichment with machine learning will be a big use case for these applications, as will getting data into the platform; a lot of applications will do that. So machine learning, data engineering, enrichment: these are applications that will run on the platform.

>>Great. Hey, we've just got a minute or so left. Earlier today we ran a video: you guys announced the startup competition, which is awesome. Benoit, you're a judge in this competition. What can you tell us about it?

>>You know, for me, we are still a startup; I haven't yet realized that we're not a startup anymore. So I really feel for new startups, and that's very important for Snowflake: we were a startup yesterday, and we want to help new startups. Hence the idea of this program. The other aspect of the program is to help startups build on top of Snowflake and enrich this rich ecosystem that the Data Cloud is, and to add to and boost the excitement around the platform. So it's a win-win: it's a win for new startups, and it's a win, of course, for us, because it will make the platform even better.

>>Yeah, and startups are where innovation happens. So registration is open; I've heard several startups have already signed up. Go to snowflake.com and find the startup challenge page, and you can learn more. It's an exciting program and initiative, so thank you for doing that on behalf of startups out there. And thanks, Benoit and Christian; I really appreciate you guys coming on. Great conversation.

>>Thanks, Dave.

>>You're welcome. Now, when we talk to go-to-market pros, they always tell us that one of the key tenets is to stay close to the customer. Well, we want to find out how data helps us to do that. Our next segment brings in two chief revenue officers to give us their perspective on how data is helping their customers transform their businesses digitally. Let's watch.

Published Date : Nov 20 2020


Democratizing AI and Advanced Analytics with Dataiku x Snowflake


 

>>My name is Dave Volonte, and with me are two world class technologists, visionaries and entrepreneurs. And Wa Dodgeville is the he co founded Snowflake, and he's now the president of the product division. And Florian Duetto is the co founder and CEO of Data Aiko. Gentlemen, welcome to the Cube to first timers. Love it. >>Great to be here >>now, Florian you and Ben Wa You have a number of customers in common. And I have said many times on the Cube that you know, the first era of cloud was really about infrastructure, making it more agile, taking out costs. And the next generation of innovation is really coming from the application of machine intelligence to data with the cloud is really the scale platform. So is that premise your relevant to you? Do you buy that? And and why do you think snowflake and data ICU make a good match for customers? >>I think that because it's our values that are aligned when it's all about actually today allowing complexity for customers. So you close the gap or the democratizing access to data access to technology. It's not only about data data is important, but it's also about the impact of data. Who can you make the best out of data as fast as possible as easily as possible within an organization. And another value is about just the openness of the platform building the future together? Uh, I think a platform that is not just about the platform but also full ecosystem of partners around it, bringing the level off accessibility and flexibility you need for the 10 years away. >>Yeah, so that's key. But it's not just data. It's turning data into insights. Have been why you came out of the world of very powerful but highly complex databases. And we know we all know that you and the snowflake team you get very high marks for really radically simplifying customers lives. But can you talk specifically about the types of challenges that your customers air using snowflake to solve? >>Yeah, so So the really the challenge, you know, be four. Snowflake. I would say waas really? To put all the data, you know, in one place and run all the computers, all the workloads that you wanted to run, You know, against that data and off course, you know, existing legacy platforms. We're not able to support. You know that level of concurrency, Many workload. You know, we we talk about machine learning that a science that are engendering, you know, that our house big data were closed or running in one place didn't make sense at all. And therefore, you know what customers did is to create silos, silos of data everywhere, you know, with different system having a subset of the data. And of course, now you cannot analyze this data in one place. So, snowflake, we really solve that problem by creating a single, you know, architectural where you can put all the data in the cloud. So it's a really cloud native we really thought about You know how to solve that problem, how to create, you know, leverage, Cloud and the lessee cc off cloud to really put all the die in one place, but at the same time not run all workload at the same place. So each workload that runs in Snowflake that is dedicated, You know, computer resource is to run, and that makes it very Ajai, right? You know, Floyd and talk about, you know, data scientists having to run analysis, so they need you know a lot of compute resources, but only for, you know, a few hours on. Do you know, with snowflake they can run these new work lord at this workload to the system, get the compute resources that they need to run this workload. 
And when it's over, they can shut down. You know that their system, it will be automatically shut down. Therefore, they would not pay for the resources that they don't use. So it's a very Ajai system where you can do this, analyzes when you need, and you have all the power to run all this workload at the same time. >>Well, it's profound what you guys built to me. I mean, of course, everybody's trying to copy it now. It was like, remember that bringing the notion of bringing compute to the data and the Hadoop days, and I think that that Asai say everybody is sort of following your suit now are trying to Florian I gotta say the first data scientist I ever interviewed on the Cube was amazing. Hilary Mason, right after she started a bit Lee. And, you know, she made data science that sounds so compelling. But data science is hard. So same same question for you. What do you see is the biggest challenges for customers that they're facing with data science. >>The biggest challenge, from my perspective, is that owns you solve the issue of the data. Seidel with snowflake, you don't want to bring another Seidel, which would be a side off skills. Essentially, there is to the talent gap between the talented label of the market, or are it is to actually find recruits trained data scientist on what needs to be done. And so you need actually to simplify the access to technologies such as every organization can make it, whatever the talent, by bridging that gap and to get there, there is a need of actually breaking up the silos. And in a collaborative approach where technologists and business work together and actually put some their hands into those data projects together, >>it makes sense for flooring. Let's stay with you for a minute. If I can your observation spaces, you know it's pretty, pretty global, and and so you have a unique perspective on how companies around the world might be using data and data science. Are you seeing any trends may be differences between regions or maybe within different industries. What are you seeing? >>Yes. Yeah, definitely. I do see trends that are not geographic that much, but much more in terms of maturity of certain industries and certain sectors, which are that certain industries invested a lot in terms of data, data access, ability to start data in the last few years and no age, a level of maturity where they can invest more and get to the next steps. And it's really rely on the ability of certain medial certain organization actually to have built this long term strategy a few years ago and no start raping up the benefits. >>You know, a decade ago, Florian Hal Varian, we, you know, famously said that the sexy job in the next 10 years will be statisticians. And then everybody sort of change that to data scientists and then everybody. All the statisticians became data scientists, and they got a raise. But data science requires more than just statistics acumen. What what skills >>do >>you see as critical for the next generation of data science? >>Yeah, it's a good question because I think the first generation of the patient is became the licenses because they could done some pipe and quickly on be flexible. And I think that the skills or the next generation of data sentences will definitely be different. It will be first about being able to speak the language of the business, meaning, oh, you translate data inside predictive modeling all of this into actionable insight or business impact. And it would be about you collaborate with the rest of the business. 
It's not just a farce. You can build something off fast. You can do a notebook in python or your credit models off themselves. It's about, oh, you actually build this bridge with the business. And obviously those things are important. But we also has become the center of the fact that technology will evolve in the future. There will be new tools and technologies, and they will still need to keep this level of flexibility and get to understand quickly, quickly. What are the next tools they need to use the new languages or whatever to get there. >>As you look back on 2020 what are you thinking? What are you telling people as we head into next year? >>Yeah, I I think it's Zaveri interesting, right? We did this crisis, as has told us that the world really can change from one day to the next. And this has, you know, dramatic, you know, and perform the, you know, aspect. For example, companies all the sudden, you know, So their revenue line, you know, dropping. And they had to do less meat data. Some of the companies was the reverse, right? All the sudden, you know, they were online, like in stock out, for example, and their business, you know, completely, you know, change, you know, from one day to the other. So this GT off, You know, I, you know, adjusting the resource is that you have tow the task a need that can change, you know, using solution like snowflakes, you know, really has that. And we saw, you know, both in in our customers some customers from one day to the to do the next where, you know, growing like big time because they benefited, you know, from from from from co vid and their business benefited, but also, as you know, had to drop. And what is nice with with with cloud, it allows to, you know, I just compute resources toe, you know, to your business needs, you know, and really adjusted, you know, in our, uh, the the other aspect is is understanding what is happening, right? You need to analyze the we saw all these all our customers basically wanted to understand. What is that going to be the impact on my business? How can I adapt? How can I adjust? And and for that, they needed to analyze data. And, of course, a lot of data which are not necessarily data about, you know, their business, but also data from the outside. You know, for example, coffee data, You know, where is the States? You know, what is the impact? You know, geographic impact from covitz, You know, all the time and access to this data is critical. So this is, you know, the promise off the data crowd, right? You know, having one single place where you can put all the data off the world. So our customers, all the Children you know, started to consume the cov data from our that our marketplace and and we had the literally thousands of customers looking at this data analyzing this data, uh, to make good decisions So this agility and and and this, you know, adapt adapting, you know, from from one hour to the next is really critical. And that goes, you know, with data with crowding adjusting, resource is on and that's, you know, doesn't exist on premise. So So So indeed, I think the lesson learned is is we are living in a world which machines changing all the time and we have for understanding We have to adjust and and And that's why cloud, you know, somewhere it's great. >>Excellent. Thank you. You know the kid we like to talk about disruption, of course. Who doesn't on And also, I mean, you look at a I and and the impact that is beginning to have and kind of pre co vid. 
>> Excellent, thank you. You know, on theCUBE we like to talk about disruption, and of course, who doesn't? Look at AI and the impact it is beginning to have. Pre-COVID, you looked at the industries that were getting disrupted: we talked about digital transformation, and on one end of the spectrum you had industries like publishing or taxis that were highly disrupted, and you could say, okay, that's bits versus atoms, the old Negroponte thing. But on the flip side, look at financial services, which hadn't been dramatically disrupted, or healthcare, which is ripe for disruption, or defense. A number of industries really hadn't leaned into digital transformation: if it ain't broke, don't fix it, not on my watch. There was this complacency, and then, of course, COVID broke everything. So, Florian, I wonder if you could comment: what industry or industries do you think are going to be most impacted by data science and what I call machine intelligence, or AI, in the coming years and decades?

>> Honestly, I think it's all of them, or at least most of them. For some industries the impact is very visible because we're talking about brand new products, drones, autonomous cars, whatever, that are very visible to us. For others we are talking about profound changes in the way you operate as an organization. Even if the financial industry doesn't seem so impacted when you look at it from the consumer side or the outside, internally it probably is, just because of the way you use data, the flexibility you need, and the kind of cost gains you can get by leveraging the latest technologies, which are just enormous. So it will transform that industry as well. And overall, I think 2020 is a year where, from the perspective of AI and analytics, we understood this idea of maturity and resilience. Maturity, meaning that when you have a crisis you actually need data and AI more than before; you need to call the data people into the room to make better decisions, and to look forward, not backward. I think that's a very important learning from 2020 that will say a lot about 2021. And resilience: data analytics today is a function consumed by every industry, and it is so important that it simply has to work. The infrastructure has to be super resilient, so probably not on-prem, or at least not fully on-prem, at some point. And you need the kind of resilience where you can plan for literally anything, where no hypothesis about behavior can be taken for granted. That is new, and it signals that we are getting to the next step for analytics.
>> I wonder, Benoit, if you have anything to add to that. I often wonder when machines are going to be able to make better diagnoses than doctors; some people say they already can. Will the traditional banks lose control of payment systems? What's going to happen to the big retail stores? Maybe bring us home with some of your final thoughts.

>> Yeah, I would say I don't see that as a negative, right? The human being will always be involved very closely, but machines and data can really see correlations in the data that would be impossible for a human being alone to discover. So I think it's going to be a complement, not a replacement. Everything that has made us faster has never meant that we have less work to do; it means we can do more, and we have so much to do that I would not be worried about the effect of becoming more efficient and better at our work. And indeed, I fundamentally think that processing images, doing AI on those images, discovering patterns and potentially flagging disease earlier than was possible before, is going to have a huge impact in healthcare. And as Florian was saying, every industry is going to be impacted by that technology. So yes, I'm very optimistic.

>> Great, guys. I wish we had more time, but I have to leave it there. Thanks so much for coming on theCUBE. It was really a pleasure having you.

Published Date : Nov 20 2020

SUMMARY :

Benoit Dageville, co-founder of Snowflake, and Florian Douetteau of Dataiku join Dave Vellante to discuss the state of data science. Douetteau argues that solving the data silo problem with Snowflake must not create a new silo of skills, and that the next generation of data scientists will need to speak the language of the business, not just write Python. Dageville reflects on the lessons of 2020: businesses changed overnight, cloud elasticity let customers scale compute up or down with demand, and thousands of customers turned to shared COVID data in the marketplace to understand the impact on their business. Both see AI as a complement to human work that will ultimately touch every industry.
