Ajay Vohora and Duncan Turnbull | Io-Tahoe ActiveDQ Intelligent Automation for Data Quality
>>From around the globe, it's theCUBE, presenting ActiveDQ, intelligent automation for data quality, brought to you by Io-Tahoe. >>Now we're going to look at the role automation plays in mobilizing your data on Snowflake. Let's welcome Duncan Turnbull, who's partner sales engineer at Snowflake, and Ajay Vohora, who is back, CEO of Io-Tahoe, to share his insight. Gentlemen, welcome. >>Thank you, David. Good to be back. >>Yeah, it's great to have you back, Ajay, and it's really good to see Io-Tahoe expanding the ecosystem, so important. Now, of course, bringing in Snowflake, it looks like you're really starting to build momentum. I mean, there's progress that we've seen every month, month by month, over the past 12, 14 months. Your seed investors, they've got to be happy. >>They are more than happy, and they can see that we've run into a nice phase of expansion here, with new customers signing up, and now we're ready to go out and raise that next round of funding. Think of Snowflake five years ago; we're definitely on track with that. There's a lot of interest from investors, and right now we're trying to focus in on those investors that can partner with us and understand AI, data and automation. >>So Ajay, you've managed a number of early-stage VC funds, I think four of them, and you've taken several software companies through many funding rounds and growth, all the way to exit. So you know how it works: you have to get product-market fit, you've got to make sure you get your KPIs right, and you've got to hire the right salespeople. But what's different this time around? >>Well, you know, the fundamentals that you mentioned, those never change. What I can say is different, what's shifted this time around, is three things. One is that there used to be this kind of choice of, do we go open source or do we go proprietary?
Now that has turned into a nice hybrid model, where we've really keyed into, you know, Red Hat doing something similar with CentOS. The idea here is that there is a core capability of technology that underpins a platform, but it's the ability to then build an ecosystem around that via a community. That community may include customers, technology partners, other tech vendors, and it enables platform adoption so that all of those folks in the community can build and contribute, while still maintaining the core architecture and platform integrity. >>That's one thing that's changed: we're seeing a lot of that type of software company emerge into that model, which is different from five years ago. Then there's leveraging the cloud, Snowflake's cloud being one of them here, in order to make use of what customers and enterprise software are moving towards. Every CIO is now in some configuration of a hybrid IT estate, whether that's cloud, multi-cloud or on-prem; that's just the reality. The other piece is in dealing with the CIO's legacy. Over the past 15, 20 years they've purchased many different platforms and technologies, and some of those are still established. So how do you enable that CIO to make a purchase while still preserving, and in some cases building on and extending, the legacy technology? They've invested their people's time, their training and financial investment into solving a problem, a customer pain point, with technology, and that never goes out of fashion. >>That never changes. You have to focus like a laser on that. And of course, speaking of companies who are focused on solving problems: Duncan Turnbull from Snowflake.
You guys have really done a great job, really brilliantly addressing pain points, particularly around data warehousing, simplifying that, and you're providing this new capability around data sharing, which is really quite amazing. Duncan, Ajay talks about data quality and customer pain points in enterprise IT. Why has data quality been such a problem historically? >>One of the biggest challenges that's really caused this in the past is that, to address everyone's need for using data, organizations have evolved all these different places to store it: all these different silos or data marts, this whole proliferation of places where data lives. All of those end up with slightly different schedules for bringing data in and out, slightly different rules for transforming and formatting that data and getting it ready, and slightly different quality checks for making use of it. This then becomes a big problem, in that these different teams end up with slightly different, or even radically different, answers to the same kinds of questions, which makes it very hard for teams to work together on the data problems that exist inside the business, depending on which of these silos they end up looking at. What you can do, if you have a single, scalable system for putting all of your data into, is sidestep a lot of this complexity and address the data quality issues in a single way. >>Now, of course, we're seeing this huge trend in the market towards robotic process automation; RPA adoption is accelerating. You see it in UiPath's IPO, a 35-plus billion dollar valuation, Snowflake-like numbers, nice comps there for sure. Ajay, you've coined the phrase "data RPA". What is that, in simple terms?
>>Yeah, it was born out of seeing how, in our ecosystem and community, developers, customers and general business users were wanting to adopt and deploy Io-Tahoe's technology. And to be clear, we're not talking about software robots; we're not trying to automate that piece. But wherever there is a data process that was tied into some form of manual overhead, with handovers and so on, that process is something we were able to automate with Io-Tahoe's technology, deploying AI and machine learning specifically to those data processes, almost as a precursor to wider automation. That's really where we're seeing the momentum pick up, especially in the last six months. And we've kept it really simple with Snowflake. We've stepped back and said, well, the resource that Snowflake can leverage here is the metadata. So how could we turn Snowflake into that repository, into the data catalog? And by the way, if you're a CIO looking to purchase a data catalog tool: stop, there's no need to. Working with Snowflake, we've enabled that intelligence to be gathered automatically and put to use within Snowflake, reducing that manual effort and putting that data to work. We've packaged this with AI and machine learning specific to those data tasks, and that's what's resonated with our customers. >>You know, what's interesting here, just a quick aside: as you know, I've been watching Snowflake now for a while, and of course the competitors come out and maybe criticize why they don't have this feature or that feature. And Snowflake seems to always have an answer, and the answer oftentimes is that the ecosystem is going to bring it, because we have a platform that's so easy to work with.
So I'm interested, Duncan, in what kind of collaborations you are enabling with high-quality data, and of course, your data sharing capability. >>Yeah, so the ability to work on datasets isn't just limited to inside the business itself, or even between different business units, which we were discussing earlier with the silos. When looking at this idea of collaboration, we want to be able to exploit data to the greatest degree possible, but we need to maintain the security, the safety, the privacy and the governance of that data. It could be quite valuable; it could be quite personal, depending on the application involved. One of the novel applications of data sharing that we see between organizations is the idea of data clean rooms. These data clean rooms are safe, collaborative spaces which allow multiple companies, or even divisions inside a company that have particular privacy requirements, to bring two or more data sets together for analysis, but without having to actually share the whole unprotected data set with each other. When you do this inside of Snowflake, you can collaborate using standard toolsets: you can use all of our SQL ecosystem, all of the data science ecosystem that works with Snowflake, and all of the BI ecosystem that works with Snowflake, but in a way that keeps the confidentiality that needs to be preserved inside the data intact. And you can only really do these kinds of collaborations, especially across organizations, but even inside large enterprises, when you have good, reliable data to work with; otherwise your analysis just isn't going to work properly. A good example of this is one of our large gaming customers, who's an advertiser.
They were able to build targeted ads to acquire customers and measure the campaign impact on revenue, while keeping their data safe and secure as they worked with advertising partners. The business impact was a lift of 20 to 25% in campaign effectiveness through better targeting, and the pull-through of a reduction in customer acquisition costs, because they just didn't have to spend as much on the forms of media that weren't working for them. >>So, Ajay, I wonder, with the way public policy is shaping out, GDPR started it, then in the States, you know, the California Consumer Privacy Act, and people are sort of taking the best of those, and there's a lot of differentiation, what are you seeing in terms of governments really driving this move to privacy? >>In government and public sector we're seeing a huge wake-up in activity across the whole piece. Part of it has been data privacy; the other part of it is being more joined up and more digital, rather than paper- or form-based. We've all got stories of waiting in line holding a form, taking that form to the front of the line and handing it over a desk. Now government and public sector is really looking to transform their services into being online and self-service. That whole shift is driving the need to emulate a lot of what the commercial sector is doing, to automate their processes and to unlock the data from silos to feed into those processes. And another thing I can say about this: the need for data quality, as Duncan mentions, underpins all of these processes, in government, pharmaceuticals, utilities, banking, insurance. The ability for a chief marketing officer to drive a loyalty campaign, the ability for a CFO to reconcile accounts at the end of the month.
To do a quick, accurate financial close. Also the ability of customer operations to make sure the customer has the right details about themselves in the right application, so they can self-serve. All of that is underpinned by data, and is effective or not based on the quality of that data. So while we're mobilizing data to the Snowflake cloud, the ability to then drive analytics, prediction and business processes off that cloud succeeds or fails on the quality of that data. >>You know, I would say it really is table stakes: if you don't trust the data, you're not going to use the data. The problem is it always takes so long to get to data quality; there are all these endless debates about it. So we've been doing a fair amount of work and thinking around this idea of decentralized data. Data by its very nature is decentralized, but the fault domain of traditional big data is that everything is monolithic: the organizations are monolithic, the technology's monolithic, the roles are hyper-specialized. And so you're hearing a lot more these days about this notion of a data fabric, or what some call a data mesh. We've been leaning into that: the ability to connect various data capabilities, whether it's a data warehouse or a data hub or a data lake, so that those assets are discoverable, they're shareable through APIs, and they're governed on a federated basis, and you're now bringing in machine intelligence to improve data quality. I wonder, Duncan, if you could talk a little bit about Snowflake's approach to this topic. >>Sure. So I'd say that making use of all of your data is the key driver behind these ideas of the data mesh and the data fabric. The idea is that you want to bring together not just your strategic data, but also your legacy data and everything that you have inside the enterprise.
I think I'd also like to expand on what a lot of people view as "all of the data". A lot of people miss that there's this whole other world of data they could be having access to: things like data from their business partners, their customers, their suppliers, and even data that's more in the public domain, whether that's demographic data or geographic data or all these other kinds of data sources. And what I'd say, to some extent, is that the data cloud really facilitates the ability to share and gain access to this, both between organizations and inside organizations. You don't have to make lots of copies of the data and worry about the storage, and there's this federated idea of governance and all these things that are otherwise quite complex to manage. The Snowflake approach really enables you to share data with your ecosystem, or all the world, without any latency, with full control over what's shared, and without having to introduce new complexities or complex interactions with APIs or software integration. The simple approach that we provide allows a relentless focus on creating the right data product to meet the challenges facing your business today. >>So, Ajay, the key here, in my mind anyway, my big takeaway, is simplicity. If you can take the complexity out of the equation, you're going to get more adoption. It really is that simple. >>Yeah, absolutely. I think that whole journey, maybe five, six years ago, the adoption of data lakes, was a stepping stone.
However, the Achilles heel there was the complexity of consuming data from a data lake, where there were many, many sets of data to curate and consume. Whereas actually, the simplicity of being able to go straight to the data you need to do your role, whether you're in tax, compliance or customer services, is key. And listen, for Snowflake and Io-Tahoe, one thing we know for sure is that our customers are super smart and very capable. They're data-savvy and want to use whichever tool, and embrace whichever cloud platform, is going to reduce the barriers to solving what's complex about that data, simplifying that, and using good old-fashioned SQL to access data and to build products from it, to exploit that data. So simplicity is key to allowing people to make use of that data, and CIOs recognize that. >>So, Duncan, the cloud obviously brought in this notion of DevOps, new methodologies and things like agile, and that's brought in the notion of DataOps, which is a very hot topic right now: basically DevOps applied to data. How does Snowflake think about this? How do you facilitate that methodology? >>Yeah, I agree with you absolutely. DataOps takes these ideas of agile development and agile delivery, and of the kind of DevOps world that we've seen just rise and rise, and applies them to the data pipeline, which is somewhere it traditionally hasn't happened. And it's the same kinds of messages as we see in the development world: it's about delivering faster development, having better repeatability, and really getting towards that dream of the data-driven enterprise, where you can answer people's data questions and they can make better business decisions.
And we have some really great architectural advantages that allow us to do things like clone data sets without having to copy them, and time travel, so we can see what data looked like at some point in the past. This lets you set up your own little data playpen as a clone, without really having to copy all of that data, so it's quick and easy. And again, with our separation of storage and compute, you can provision your own virtual warehouse for dev usage, so you're not interfering with anything to do with people's production usage of this data. These ideas and that scalability just make it easy to make changes, test them, and see what the effect of those changes is. And we've actually seen this: you were talking a lot about partner ecosystems earlier, and the partner ecosystem has taken these ideas that are inside Snowflake and extended them, integrating them with DevOps and DataOps tooling, so things like version control and Git, and infrastructure automation with things like Terraform. They've built that out into more of a DataOps product that you can use yourself. So we can see there's a huge impact from these ideas coming into the data world; we think we're really well placed to take advantage of them, and the partner ecosystem is doing a great job with that. It really allows us to change the operating model for data, so that we don't have as much emphasis on hierarchy and change windows and all these things that are maybe a little old-fashioned, and we're taking the shift from batch data integration into streaming, continuous data pipelines in the cloud. That gets you away from a once-a-week, or once-a-month if you're really unlucky, change window, towards pushing changes in a much more rapid fashion as the needs of the business change.
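As a rough sketch, the capabilities Duncan describes map onto a few lines of Snowflake SQL; the object names below are hypothetical examples, not anything from the conversation:

```sql
-- Zero-copy clone: a dev "data playpen" created instantly, with no data copied
CREATE DATABASE analytics_dev CLONE analytics_prod;

-- Time travel: query a table as it looked one hour ago
SELECT COUNT(*)
FROM analytics_prod.public.orders AT (OFFSET => -3600);

-- Separate compute for dev work, so production workloads are untouched
CREATE WAREHOUSE dev_wh
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND   = 60      -- suspend after 60 idle seconds to save credits
  AUTO_RESUME    = TRUE;
```

A DataOps pipeline can keep statements like these in version control and apply them through Terraform or CI tooling, which is the integration pattern the partner ecosystem has built out.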
>>I mean, those hierarchical organizational structures, when we apply those to data, are what actually create the silos. So if you're going to be a silo-buster, and Ajay, I look at you guys as silo-busters, you've got to put data in the hands of the domain experts, the business people. They know what data they want, and if they have to go through and beg and borrow for new data sets, et cetera, that's a problem. That's where automation becomes so key. And frankly, the technology should be an implementation detail, not the dictating factor. I wonder if you could comment on this. >>Yeah, absolutely. I think making the technologies more accessible to the general business users, or those specialist business teams, is the key to unlocking this. It's interesting to see, as people move from organization to organization, where they've had those experiences of operating in a hierarchical sense, they want to break free from that, or they've been exposed to automation and continuous workflows. Change is continuous in IT, it's continuous in business, the market's continuously changing. So having that flow of work across the organization, using key components such as GitHub to drive the process and Terraform to build code and automation into it, and with Io-Tahoe leveraging all the metadata from across those fragmented sources, it's good to see how those things are coming together. And watching people move from organization to organization and say, hey, okay, I've got a new start, I've got my first hundred days to impress my new manager, what kind of an impact can I bring? Quite often we're seeing that as: let me take away the good learnings of how to do it, or how not to do it, from my previous role, and this is an opportunity for me to bring in automation. I'll give you an example, David. We recently started working with a client in financial services.
They're an asset manager, managing financial assets. They've grown over the course of the last 10 years through M&A, and each of those acquisitions has brought with it tactical data, siloed sets of data: multiple CRM systems, multiple databases, multiple bespoke in-house applications. And when the new CIO came in and had a look, it was: yes, I want to mobilize my data; yes, I need to modernize my data estate, because my CEO is now looking at the crypto assets that are on the horizon, and the new funds that are emerging around digital assets and crypto assets. But in order to get there, where data absolutely underpins everything and is the core asset, cleaning up that legacy situation and mobilizing the relevant data into the Snowflake cloud platform is where we're giving time back. That transition to mobilize the data is now taking a few weeks, starting with a clean slate to build a new business on, as a digital crypto asset manager, alongside the legacy, traditional financial assets: bonds, stocks, fixed income, you name it. That's where we're starting to see a lot of innovation. >>Yeah, tons of innovation. I love the crypto examples, and NFTs are exploding, and let's face it, traditional banks are getting disrupted. I also love this notion of data RPA, especially because I've done a lot of work in the RPA space. What I would observe is that the early days of RPA were what I call paving the cow path: taking existing processes and applying scripts, letting software robots do their thing. And that was good, because it reduced mundane tasks, but where it's really evolved is a much broader automation agenda; people are discovering new ways to completely transform their processes. And I see a similar analogy for data, the data operating model.
So I wonder, when you think about that, how does a customer really get started bringing this to their ecosystem, their data life cycles? >>Sure, yeah. Step one is always the same: figuring out, for the CIO or the chief data officer, what data do I have? That's increasingly something they want to automate, so we can help them there and do that automated data discovery, whether that is documents in a file share, a backup archive, a relational data store or a mainframe, really quickly hydrating that and bringing that intelligence to the forefront of: what do I have? Then it's the next step of: okay, now I want to continually monitor and curate that intelligence with the platform that I've chosen, let's say Snowflake, such that I can then build applications on top of that platform to serve my internal and external customer needs, with automation around classifying data and reconciliation across different fragmented data silos, building those insights into Snowflake. And as you say, a little later on where we're talking about data quality, ActiveDQ allows us to reconcile data from different sources, as well as look at the integrity of that data, so they can go on to remediation. I want to harness and leverage techniques around traditional RPA, but to get to that stage, I need to fix the data. So remediating and publishing the data in Snowflake, allowing analysis to be performed in Snowflake: those are the key steps that we see. Shrinking that timeline into weeks, giving the organization that time back, means they're spending more time on their customers and solving their customers' problems, which is where we want them to be.
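In practice, once the data lands in Snowflake, checks like the reconciliation and integrity tests Ajay describes can be expressed in plain SQL. A rough sketch, with hypothetical table and column names:

```sql
-- Profile a staging table: completeness and uniqueness checks
SELECT COUNT(*)                               AS total_rows,
       COUNT(*) - COUNT(customer_email)       AS missing_emails,
       COUNT(*) - COUNT(DISTINCT customer_id) AS duplicate_ids
FROM crm_staging.public.customers;

-- Reconcile a migrated table against its legacy source
SELECT 'legacy'    AS source, COUNT(*) AS row_count, SUM(balance) AS total_balance
FROM legacy_db.public.accounts
UNION ALL
SELECT 'snowflake' AS source, COUNT(*) AS row_count, SUM(balance) AS total_balance
FROM finance.public.accounts;
```

The point of a tool like ActiveDQ is to generate and schedule this kind of check automatically from metadata, rather than hand-writing it for every table.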
>>This is the brilliance of Snowflake, actually. You know, Duncan, I've talked with your co-founders about this, and it's really that focus on simplicity. So you picked a good company to join, in my opinion. I wonder if you could talk about some of the industry sectors that are going to gain the most from data RPA. I mean, with traditional RPA, if I can use that term, a lot of it was back office, a lot of it financial. What are the practical applications where data RPA is going to impact businesses, and the outcomes we can expect? >>Yes, so our drive is really to make the business user's experience of RPA simpler, using no-code to do that, where they've also chosen Snowflake to build out their cloud platform. They've then got the combination of relatively simple scripting techniques, such as SQL, with a no-code approach. And the answer to your question is: whichever sector is looking to mobilize their data. It seems like a cop-out, but to give you some specific examples, David: in banking, where customers are looking to modernize their banking systems and enable better customer experience through applications and digital apps, that's where we're seeing a lot of traction with this approach of applying RPA to data. Healthcare, where there's a huge amount of work to do to standardize data sets across providers, payers and patients, and it's an ongoing process there. In retail, helping to build that immersive customer experience, recommending next-best actions, providing an experience that is going to drive loyalty and retention; that's dependent on understanding the customer's needs and intent, being able to provide them with the content or the offer at that point in time, and all of that is data-dependent. Utilities is another one, with great overlap with Snowflake, helping utilities, telecoms, energy and water providers to build services on that data.
And this is where the ecosystem just continues to expand. If we're helping our customers turn their data into services for their ecosystem, that's exciting. No more so than insurance, which we always used to think of as very dull and mundane; actually, that's where we're seeing huge amounts of innovation, creating new flexible products that are priced to the day and to the situation, with risk models being adaptive when the data changes on events or circumstances. So across all those sectors, they're all mobilizing their data, they're all moving in some way, shape or form to a multi-cloud setup with their IT. And I think that with Snowflake and Io-Tahoe being able to accelerate that, and make that journey simple instead of complex, is why we've found such a good partner here. >>All right, thanks for that. And thank you both, we've got to leave it there. Really appreciate you coming on, Duncan, and Ajay, best of luck with the fundraising. >>We'll keep you posted. Thanks, David. >>All right, great. Okay, now let's take a look at a short video that's going to help you understand how to reduce the steps around your DataOps. Let's watch.
Glenn Grossman and Yusef Khan | Io-Tahoe ActiveDQ Intelligent Automation
>>From around the globe, it's theCube, presenting active DQ, intelligent automation for data quality, brought to you by Io-Tahoe. >>Welcome to the sixth episode of the Io-Tahoe data automation series on theCube. We're gonna start off with a segment on how to accelerate the adoption of Snowflake with Glenn Grossman, who is the enterprise account executive from Snowflake, and Yusef Khan, the head of data services from Io-Tahoe. Gentlemen, welcome. >>Good afternoon. Good morning. Good evening, Dave. >>Good to see you, Dave. Good to see you. >>Okay, Glenn, uh, let's start with you. I mean, theCube hosted the Snowflake Data Cloud Summit in November and we heard from customers, and, going from, love the tagline, zero to Snowflake, you know, in 90 minutes, very quickly. And of course you want to make it simple and attractive for enterprises to move data and analytics into the Snowflake platform, but help us understand: once the data is there, how is Snowflake helping to achieve savings compared to the data lake? >>Absolutely, Dave. It's a great question. You know, it starts off first with the notion of, uh, kind of, we coined it in the industry, our t-shirt size pricing. You know, you don't necessarily always need the performance of a high-end sports car when you're just trying to go get some groceries and drive down the street at 20 mph. The t-shirt pricing really aligns to, depending on what your operational workload is, supporting the business and the value that you need from that business. Not every day do you need data every second of the moment; it might be once a day, once a week. Through that t-shirt size price we can align the performance according to the environmental needs of the business: what those drivers are, the key performance indicators to drive that insight to make better decisions. It allows us to control that cost. So to my point, not always do you need the performance of a Ferrari?
Maybe you need the performance and gas mileage of the Honda Civic, if it will just get in and deliver the value to the business, but knowing that you have that entire performance landscape at a moment's notice. And that's really what allows us to control cost and get away from "how much is it going to cost me?" in a data-lake type of environment. >>Got it. Thank you for that. Yusef, where does Io-Tahoe fit into this equation? I mean, what's unique about the approach that you're taking towards this notion of mobilizing data on Snowflake? >>Well, Dave, in the first instance we profile the data itself at the data level, so not just at the level of metadata, and we do that wherever that data lives. So it could be structured data, it could be semi-structured data, it could be unstructured data, and that data could be on premise, it could be in the cloud, or it could be on some kind of SaaS platform. And so we profile this data at the source systems that are feeding Snowflake, within Snowflake itself, and within the end applications and the reports that the Snowflake environment is serving. So what we've done here is take our machine learning discovery technology and make Snowflake itself the repository for knowledge and insights on data. And this is pretty unique. Uh, automation in the form of RPA is being applied to the data both before, after, and within Snowflake. And so the ultimate outcome is that business users can have a much greater degree of confidence that the data they're using can be trusted. Um, the other thing we do, uh, which is unique, is employ data RPA to proactively detect and recommend fixes to data quality, so that removes the manual time and effort and cost it takes to fix those data quality issues, uh, if they're left unchecked and untouched. >>So that's key, two things there: trust, because nobody's gonna use the data if it's not trusted, but also context. If you think about it, we've contextualized our operational systems but not our analytic systems.
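Io-Tahoe's profiling engine itself is proprietary, but the idea Yusef describes, profiling at the data level rather than just the metadata level, can be sketched in a few lines of plain Python. Everything here, the column names, the sample values, and the rough type inference, is illustrative only, not Io-Tahoe's actual implementation:

```python
def profile_column(values):
    """Profile one column: null rate, distinct count, and a rough inferred type."""
    total = len(values)
    nulls = sum(1 for v in values if v is None or v == "")
    non_null = [v for v in values if v is not None and v != ""]

    def looks_numeric(v):
        try:
            float(v)
            return True
        except (TypeError, ValueError):
            return False

    inferred = "numeric" if non_null and all(looks_numeric(v) for v in non_null) else "text"
    return {
        "null_rate": nulls / total if total else 0.0,
        "distinct": len(set(non_null)),
        "inferred_type": inferred,
    }

# Profile a small (hypothetical) table, column by column.
table = {
    "customer_id": ["101", "102", "103", "104"],
    "email": ["a@x.com", None, "c@x.com", ""],
}
profiles = {col: profile_column(vals) for col, vals in table.items()}
```

A real engine would add pattern detection, value distributions, and cross-column relationship discovery, but even this minimal pass surfaces facts, such as a 50% null rate on `email`, that metadata alone can never reveal.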
So there's a big step forward, Glenn. I wonder if you can tell us how customers are managing data quality when they migrate to Snowflake, because there's a lot of baggage in traditional data warehouses and data lakes and data hubs. Maybe you can talk about why this is a challenge for customers, and, like, for instance, can you proactively address some of those challenges that customers face? >>We certainly can. You know, with data quality, legacy data sources are always inherent with DQ issues. Whether it's been master data management and data stewardship programs over really almost two decades now, you do have systemic data issues. You have siloed data, you have information in operational data stores and data marts. It became a hodgepodge. When organizations are starting their journey to migrate to the cloud, one of the things that we're first doing is that inspection of data, um, you know, first and foremost even looking to retire legacy data sources that aren't even used across the enterprise, but because they were part of the systemic, long-running, on-premise technology, they stayed there. When we start to look at data pipelines as we onboard a customer, you know, we want to do that QA, we want to do quality assurance, so that we can, as our ultimate goal, eliminate the garbage-in, garbage-out scenarios that we've been plagued with really over the last 40, 50 years of data in general. So we have to take an inspection where traditionally it was ETL; now, in the world of Snowflake, it's really ELT: we're extracting, we're loading, we're inspecting, then we're transforming out to the business, so that these routines can be done once, and again give great business value back to making decisions around the data, instead of spending all this time always re-architecting the data pipeline to serve the business. >>Got it. Thank you, Glenn. Yusef, of course, Snowflake's renowned for, customers tell me all the time, it's so easy.
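The ELT-style inspection step Glenn describes, load first, then inspect before transforming, can be sketched as a rule-based check that quarantines failing rows instead of letting garbage flow downstream. The rules and field names below are hypothetical; a real pipeline would derive them from the discovery and profiling stage rather than hard-code them:

```python
import re

# Hypothetical quality rules, one predicate per column.
RULES = {
    "customer_id": lambda v: v is not None and str(v).isdigit(),
    "email": lambda v: v is not None and re.match(r"[^@\s]+@[^@\s]+\.[^@\s]+$", str(v)),
}

def inspect(rows):
    """Split loaded rows into clean rows and quarantined (row, failed_columns) pairs."""
    clean, quarantined = [], []
    for row in rows:
        failures = [col for col, ok in RULES.items() if not ok(row.get(col))]
        if failures:
            quarantined.append((row, failures))
        else:
            clean.append(row)
    return clean, quarantined

rows = [
    {"customer_id": "101", "email": "a@x.com"},
    {"customer_id": None, "email": "bad-email"},
]
clean, quarantined = inspect(rows)
```

Running the inspection once, at load time, is what lets the downstream transforms "be done once" as Glenn puts it, rather than each consumer re-validating the same data.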
It's so easy to spin up a data warehouse, it helps with my security, again it simplifies everything. But, so, you know, getting started is one thing, but then adoption is also key. So I'm interested in the role that Io-Tahoe plays in accelerating adoption for new customers. >>Absolutely, David. I mean, as Glenn said, you know, every migration to Snowflake is going to have a business case. Um, uh, and that is going to be, uh, partly about reducing spend on legacy IT: servers, storage, licenses, support, all those good things, um, that CIOs want to be able to turn off entirely, ultimately. And what Io-Tahoe does is help discover all the legacy, undocumented silos that have been built up, as Glenn says, on the data estate across a period of time, build intelligence around those silos, and help reduce those legacy costs sooner by accelerating that whole process. Because obviously, the quicker that IT, um, and CDOs can turn off legacy data sources, the more funding and resources are going to be available to them to manage the new, uh, Snowflake-based data estate on the cloud. And so turning off the old and building the new go hand in hand to make sure those numbers stack up, the program is delivered, uh, and the benefits are delivered. And so what we're doing here with Io-Tahoe is improving the customer's ROI by accelerating their ability to adopt Snowflake. >>Great. And I mean, we're talking a lot about data quality here, but in a lot of ways that's table stakes; like I said, if you don't trust the data, nobody's going to use it. And, Glenn, I mean, I look at Snowflake and I see, obviously, the ease of use, the simplicity, you guys are nailing that. The data sharing capabilities I think are really exciting, because, you know, everybody talks about sharing data, but then we talk about data as an asset, and everyone wants to hold onto it. And so sharing is something that I see as a paradigm shift, and you guys are enabling that.
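The decommissioning economics Yusef outlines can be illustrated with a toy sketch: given usage statistics for legacy sources (the kind of facts a discovery tool would gather), flag the idle ones and rank them by the annual cost recovered when they are switched off. The thresholds, names, and figures are invented for illustration, not taken from any real engagement:

```python
from datetime import date

# Hypothetical usage stats per legacy data source.
sources = [
    {"name": "legacy_crm_extract", "last_queried": date(2019, 6, 1),
     "queries_90d": 0, "annual_cost": 40000},
    {"name": "orders_ods", "last_queried": date(2021, 3, 30),
     "queries_90d": 120, "annual_cost": 25000},
]

def retirement_candidates(sources, today, max_idle_days=180):
    """Flag long-idle sources and rank them by the savings from switching them off."""
    idle = [
        s for s in sources
        if (today - s["last_queried"]).days > max_idle_days or s["queries_90d"] == 0
    ]
    return sorted(idle, key=lambda s: s["annual_cost"], reverse=True)

candidates = retirement_candidates(sources, today=date(2021, 4, 29))
```

The point of the ranking is exactly the ROI argument Yusef makes: retiring the most expensive idle sources first frees budget soonest for the new cloud data estate.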
So, what are some of the things beyond data quality that are notable, that customers are excited about, that maybe you're excited about? >>David, I think you just called it out. It's this massive data sharing play, part of the data cloud platform. Uh, you know, just as of last year we had a little over, uh, about 100 vendors in our data marketplace. That number today is well over 450. It is all about democratizing and sharing data in a world that is no longer held back by FTPs and CSVs, and then the organization having to take that data and ingest it into their systems. You're a Snowflake customer and want to subscribe to an S&P data source, as an example? Go subscribe to it; it's in your account. There was no data engineering, there was no physical lift of data, and that becomes the most important thing when we talk about getting broader insights and data quality. Well, the data has already been inspected by your vendor; it's just available in your account. It's obviously a very simplistic thing to describe; behind the scenes is what our founders have created to make it very, very easy for us to democratize, not only internally with private sharing of data, but with this notion of marketplace sharing across your customers. Um, marketplace is certainly top of mind for all of my customers, and probably something else you might have heard out of a recent cloud summit is the introduction of Snowpark, and being able to do, where all this data is going towards, AI and ML, you know, along with our partners at Io-Tahoe and RPA automation. What do we do with all this data? How do we put the algorithms to work on it? We'll be able to run, in the future, R and Python scripts and Java libraries directly inside Snowflake, which allows you to accelerate even faster, compared to what people found traditionally when we started off eight years ago just as a data warehousing platform. >>Yeah, I think we're on the cusp of just a new way of thinking about data.
I mean, obviously, simplicity is a starting point, but data, by its very nature, is decentralized. You talk about democratizing data; I like this idea of the global mesh. I mean, it's a very powerful concept, and again, it's early days, but, you know, a key part of this is automation and trust. Yusef, you've worked with Snowflake and you're bringing active DQ to the market; what are customers telling you so far? >>Well, David, the feedback so far has been great, which is brilliant. So, I mean, firstly, there's a point about speed and acceleration, um, so that's the speed to insight, really. So where you have inherent data quality issues, uh, whether that's with data that was on premise and being brought into Snowflake, or on Snowflake itself, we're able to show the customer results and help them understand their data quality better within day one, which is a fantastic acceleration. Um, related to that, there's the cost and effort to get that insight: it's a massive productivity gain versus where you're seeing customers who've been struggling, sometimes, to remediate legacy data and legacy decisions that they've made over the past couple of decades, so that cost and effort is much lower than it would otherwise have been. Um, thirdly, there's confidence and trust: you can see CDOs and CIOs have got demonstrable results, in that they've been able to improve data quality across a whole bunch of use cases, for business users in marketing and customer services, for commercial teams, for financial teams. So there's that very quick kind of growth in confidence and credibility as the projects get moving. And then finally, I mean, really all the use cases for Snowflake depend on data quality, whether it's data science, uh, and the kind of Snowpark applications that Glenn has talked about; all those use cases work better when we're able to accelerate the ROI for our joint customers by very quickly pushing out these data quality, um, insights.
Um, and I think one of the things that Snowflake have recognized is that in order for CIOs to really adopt enterprise-wide, um, as well as the great technology that Snowflake offers, it's about cleaning up that legacy data estate, freeing up the budget for CIOs to spend on the new, modern data estate that lets them mobilize their data with Snowflake. >>So you're seeing this sort of progression: we're simplifying the analytics from a tech perspective; you bring in federated governance, which brings more trust; then you bring in the automation of the data quality piece, which is fundamental; and now you can really start to, as you guys are saying, democratize and scale, uh, and share data. Very powerful. Guys, thanks so much for coming on the program. Really appreciate your time. >>Thank you. >>I appreciate it as well. Yeah.
SUMMARY :
It's theCube, with the head of data services from Io-Tahoe. Good afternoon. Good to see you. Dave. Good to see you. I mean, theCube hosted the Snowflake Data Cloud Summit and the value that you need from that business? Thank you for that, Yusef. so not just at the level of metadata, and we do that wherever that data lives. so that's key, two things there: trust; nobody's gonna use the data. always re-architecting the data pipeline to serve the business. Again it simplifies everything but so you know, getting started is one thing but then I mean, as Glenn said, you know, every migration to Snowflake is going I see obviously the ease of use, the simplicity you guys are nailing that, the data sharing that you might have heard out of a recent cloud summit is the introduction of Snowpark and I mean it's a very powerful concept and again it's early days but you know, Um, so that's the speed to insight And now you can really start to, as you guys are saying, democratize and scale, uh, and I appreciate it as well.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
David | PERSON | 0.99+ |
Glenn Grossman | PERSON | 0.99+ |
Ben | PERSON | 0.99+ |
Io Tahoe | ORGANIZATION | 0.99+ |
Yusef Khan | PERSON | 0.99+ |
Dave | PERSON | 0.99+ |
20 mph | QUANTITY | 0.99+ |
Glenn | PERSON | 0.99+ |
CIA | ORGANIZATION | 0.99+ |
IOS | TITLE | 0.99+ |
Glenda | PERSON | 0.99+ |
90 minutes | QUANTITY | 0.99+ |
100 vendors | QUANTITY | 0.99+ |
Ferrari | ORGANIZATION | 0.99+ |
last year | DATE | 0.99+ |
One | QUANTITY | 0.99+ |
first | QUANTITY | 0.99+ |
first instance | QUANTITY | 0.99+ |
November | DATE | 0.99+ |
sixth episode | QUANTITY | 0.99+ |
once a day | QUANTITY | 0.99+ |
once a week | QUANTITY | 0.98+ |
Senate | ORGANIZATION | 0.98+ |
today | DATE | 0.98+ |
both | QUANTITY | 0.98+ |
eight years ago | DATE | 0.97+ |
yusef khan | PERSON | 0.97+ |
over | QUANTITY | 0.96+ |
one | QUANTITY | 0.95+ |
R. P. A. Automation | ORGANIZATION | 0.95+ |
python | TITLE | 0.95+ |
Tahoe | ORGANIZATION | 0.94+ |
I. O. Tahoe | TITLE | 0.93+ |
Honda | ORGANIZATION | 0.93+ |
Io-Tahoe | ORGANIZATION | 0.93+ |
one thing | QUANTITY | 0.91+ |
Io Tahoe | PERSON | 0.87+ |
firstly | QUANTITY | 0.87+ |
Civic | COMMERCIAL_ITEM | 0.87+ |
Snowflake | TITLE | 0.86+ |
Tahoe | PERSON | 0.85+ |
Ayatollah | PERSON | 0.84+ |
Snowflake | EVENT | 0.83+ |
past couple of decades | DATE | 0.82+ |
about 100 people | QUANTITY | 0.81+ |
two decades | QUANTITY | 0.8+ |
over 450 | QUANTITY | 0.79+ |
40, 50 years | QUANTITY | 0.76+ |
Day one | QUANTITY | 0.75+ |
glenn | PERSON | 0.74+ |
java | TITLE | 0.72+ |
snowflake | EVENT | 0.7+ |
Iota Ho | ORGANIZATION | 0.68+ |
P. | ORGANIZATION | 0.62+ |
ActiveDQ Intelligent Automation | ORGANIZATION | 0.61+ |
snowflake data cloud summit | EVENT | 0.6+ |
Iota | LOCATION | 0.58+ |
FTp | TITLE | 0.56+ |
Snowflake | ORGANIZATION | 0.54+ |
zero | QUANTITY | 0.53+ |
R | TITLE | 0.52+ |
O. | EVENT | 0.41+ |
C. | EVENT | 0.34+ |
Io-Tahoe Episode 6: ActiveDQ™ Intelligent Automation for Data Quality Management promo 1
>>The data lake concept was intriguing when first introduced in 2010, but people quickly realized that shoving data into a data lake made data lakes stagnant repositories, essentially storage bins that were less expensive than traditional data warehouses. This is Dave Vellante. Join me for Io-Tahoe's latest installment of the data automation series: active DQ, intelligent automation for data quality management. We'll talk to experts from Snowflake about their data assessment utility from within the Snowflake platform and how it scales to the demands of business while also controlling costs. Io-Tahoe CEO Ajay Vohora will explain how Io-Tahoe and Snowflake together are bringing active DQ to market, and what customers are saying about it. Save the date, Thursday, April 29th, for Io-Tahoe's data automation series: active DQ, intelligent automation for data quality. The show streams promptly at 11:00 AM Eastern on theCube, the leader >>in high tech coverage.
SUMMARY :
the snowflake platform and how it scales to the demands of business.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Vellante | PERSON | 0.99+ |
2010 | DATE | 0.99+ |
Thursday, April 29th | DATE | 0.99+ |
IO | ORGANIZATION | 0.98+ |
11:00 AM Eastern | DATE | 0.97+ |
first | QUANTITY | 0.96+ |
IO Tahoe | ORGANIZATION | 0.96+ |
AIG Hora | ORGANIZATION | 0.91+ |
Io-Tahoe | TITLE | 0.89+ |
IO Tahoes | ORGANIZATION | 0.89+ |
ActiveDQ™ | TITLE | 0.86+ |
Episode 6 | QUANTITY | 0.85+ |
Tahoe | ORGANIZATION | 0.83+ |
Tahoe | PERSON | 0.56+ |
CEO | PERSON | 0.54+ |