Ajay Vohora and Duncan Turnbull | Io-Tahoe Data Quality: Active DQ


 

>> Announcer: From around the globe, it's theCUBE, presenting Active DQ: intelligent automation for data quality, brought to you by Io Tahoe. (indistinct) >> Got it? All right, if everybody is ready, we're opening on Dave in five, four, three. >> Now we're going to look at the role automation plays in mobilizing your data on Snowflake. Let's welcome Duncan Turnbull, who's a partner sales engineer at Snowflake. And Ajay Vohora is back, CEO of Io Tahoe; he's going to share his insight. Gentlemen, welcome. >> Thank you, David, good to be back. >> Yes, it's great to have you back, Ajay, and it's really good to see Io Tahoe expanding the ecosystem, so important now, of course, bringing Snowflake in. It looks like you're really starting to build momentum. I mean, there's progress that we've seen month by month over the past 12, 14 months. Your seed investors, they've got to be happy. >> They are, they're happy, and they can see that we're running into a nice phase of expansion here, new customers signing up, and now we're ready to go out and raise that next round of funding. Maybe think of us like Snowflake five years ago. So we're definitely on track with that. A lot of interest from investors, and right now we're trying to focus in on those investors that can partner with us and understand AI, data, and automation. >> Well, so personally, I mean, you've managed a number of early stage VC funds, I think four of them. You've taken several comm software companies through many funding rounds and growth and all the way to exit. So you know how it works. You have to get product market fit, you've got to make sure you get your KPIs right, and you've got to hire the right salespeople. But what's different this time around? >> Well, you know, the fundamentals that you mentioned, those never change. What I can see that's different, that's shifted this time around, is three things. One is that there used to be this kind of choice of do we go open source or do we go proprietary?
Now that has turned into a nice hybrid model, where we've really keyed into Red Hat doing something similar with CentOS. And the idea here is that there is a core capability of technology that underpins a platform, but it's the ability to then build an ecosystem around that, made up of a community. And that community may include customers, technology partners, other tech vendors, enabling the platform adoption so that all of those folks in that community can build and contribute whilst still maintaining the core architecture and platform integrity at the core of it. And that's one thing that's changed. We're seeing a lot of that type of software company emerge into that model, which is different from five years ago. And then leveraging the Cloud, every Cloud, Snowflake Cloud being one of them here, in order to make use of what end customers in enterprise software are moving towards. Every CIO is now in some configuration of a hybrid IT estate, whether that is Cloud, multi-Cloud, or on-prem. That's just the reality. The other piece is in dealing with the CIO's legacy. So over the past 15, 20 years they've purchased many different platforms and technologies, and some of those are still established and still (indistinct). How do you enable that CIO to make a purchase whilst still preserving, and in some cases building on and extending, the legacy technology that they've invested their people's time and training and financial investment into? And, of course, solving a problem, a customer pain point, with technology: that never goes out of fashion. >> That never changes. You have to focus like a laser on that. And of course, speaking of companies who are focused on solving problems, Duncan Turnbull from Snowflake. You guys have really done a great job of brilliantly addressing pain points, particularly around data warehousing. You've simplified that, and you're providing this new capability around data sharing. Really quite amazing.
Duncan, Ajay talks about data quality and customer pain points in enterprise IT. Why has data quality been such a problem historically? >> So one of the biggest challenges that's really affected that in the past is that, to address everyone's needs for using data, organizations have evolved all these different places to store it, all these different silos or data marts, this kind of proliferation of places where data lives. All of those end up with slightly different schedules for bringing data in and out, slightly different rules for transforming that data and formatting it and getting it ready, and slightly different quality checks for making use of it. And this then becomes a big problem, in that these different teams are going to have slightly different or even radically different answers to the same kinds of questions, depending on which of these silos they end up looking at, which makes it very hard for teams to work together on the different data problems that exist inside the business. And if you have a single kind of scalable system to put all of your data into, you can sidestep a lot of this complexity and address the data quality issues in a single way. >> Now, of course, we're seeing this huge trend in the market towards robotic process automation, RPA. That adoption is accelerating. You saw UiPath's IPO: a 35-plus-billion-dollar valuation, Snowflake-like numbers, nice comps there for sure. Ajay, you've coined the phrase data RPA. What is that in simple terms? >> Yeah, I mean, it was born out of seeing how our ecosystem, (indistinct) community, developers and customers, general business users, were wanting to adopt and deploy Io Tahoe's technology. And we could see that. I mean, it's not marketing automation here, we're not trying to automate that piece, but wherever there is a process that was tied into some form of a manual overhead with handovers.
And so on, that process is something that we were able to automate with Io Tahoe's technology and the employment of AI and machine learning technologies specifically for those data processes, almost as a precursor to getting into marketing automation or financial information automation. That's really where we're seeing the momentum pick up, especially in the last six months. And we've kept it really simple with Snowflake. We've kind of stepped back and said, well, the resource that Snowflake can leverage here is the metadata. So how could we turn Snowflake into that repository, into being the data catalog? And by the way, if you're a CIO looking to purchase a data catalog tool, stop, there's no need to. Working with Snowflake, we've enabled that intelligence to be gathered automatically and to be put to use within Snowflake, so reducing that manual effort and putting that data to work. And that's where we've packaged this with our AI and machine learning specific to those data tasks. And it made sense; that's what's resonated with our customers. >> You know, what's interesting here, just a quick aside: as you know, I've been watching Snowflake now for a while, and of course the competitors come out and maybe criticize, "Why don't they have this feature? They don't have that feature." And Snowflake seems to have an answer. And the answer oftentimes is, well, ecosystem. The ecosystem is going to bring that, because we have a platform that's so easy to work with. So I'm interested, Duncan, in what kind of collaborations you are enabling with high quality data and, of course, your data sharing capability. >> Yeah, so I think the ability to work on data sets isn't just limited to inside the business itself, or even between different business units, as you were kind of discussing with those silos before, when looking at this idea of collaboration.
We have these challenges where we want to be able to exploit data to the greatest degree possible, but we need to maintain the security, the safety, the privacy, and the governance of that data. It could be quite valuable. It could be quite personal, depending on the application involved. One of the novel applications of data sharing that we see between organizations is this idea of data clean rooms. These data clean rooms are safe, collaborative spaces which allow multiple companies, or even divisions inside a company that have particular privacy requirements, to bring two or more data sets together for analysis, but without having to actually share the whole unprotected data set with each other. And when you do this inside of Snowflake, you can collaborate using standard tool sets. You can use all of our SQL ecosystem. You can use all of the data science ecosystem that works with Snowflake. You can use all of the BI ecosystem that works with Snowflake. But you can do that in a way that keeps the confidentiality that needs to be preserved inside the data intact. And you can only really do these kinds of collaborations, especially across organizations but even inside large enterprises, when you have good, reliable data to work with; otherwise your analysis just isn't going to work properly. A good example of this is one of our large gaming customers, who's an advertiser. They were able to build targeted ads to acquire customers and measure the campaign impact in revenue, but they were able to keep their data safe and secure while doing that, while working with advertising partners. The business impact of that was that they were able to get a lift of 20 to 25% in campaign effectiveness through better targeting, and that actually pulled through into a reduction in customer acquisition costs, because they just didn't have to spend as much on the forms of media that weren't working for them.
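To make the clean room idea described above more concrete, here is a minimal Python sketch. It is not Snowflake's implementation and does not use any Snowflake API; it only illustrates the general pattern of matching two parties' records on a salted hash and releasing aggregate counts rather than row-level data. The function names, the shared salt, and the minimum group size are all hypothetical choices made for illustration.

```python
import hashlib
from collections import Counter

# Hypothetical threshold: segments with fewer matches than this are suppressed
# so that small groups cannot be used to single out individuals.
MIN_GROUP_SIZE = 2


def hash_id(email: str, salt: str) -> str:
    """Normalize an identifier and hash it with a salt both parties agreed on."""
    normalized = email.strip().lower()
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()


def overlap_by_segment(advertiser_rows, partner_emails, salt,
                       min_group_size=MIN_GROUP_SIZE):
    """Count matched customers per segment without exposing row-level data.

    advertiser_rows: iterable of (email, segment) pairs held by one party.
    partner_emails:  iterable of emails held by the other party.
    Returns a dict {segment: matched_count}, omitting small groups.
    """
    partner_keys = {hash_id(email, salt) for email in partner_emails}
    counts = Counter(
        segment
        for email, segment in advertiser_rows
        if hash_id(email, salt) in partner_keys
    )
    return {seg: n for seg, n in counts.items() if n >= min_group_size}


# Made-up sample data for illustration only.
advertiser = [
    ("a@example.com", "new_player"),
    ("b@example.com", "new_player"),
    ("c@example.com", "lapsed"),
    ("d@example.com", "lapsed"),
]
partner = ["A@example.com ", "b@example.com", "c@example.com", "z@example.com"]

result = overlap_by_segment(advertiser, partner, salt="shared-secret")
print(result)
```

In this toy run, only the "new_player" segment is reported, because the single "lapsed" match falls below the suppression threshold. A real clean room would add access controls and auditing that this sketch leaves out.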
>> So, Ajay, I wonder, I mean, with the way public policy is shaping out, you know, obviously GDPR started it; in the States, the California Consumer Privacy Act, and people are sort of taking the best of those. And there's a lot of differentiation, but what are you seeing just in terms of governments really driving this move to privacy? >> Government, public sector, we're seeing a huge wake-up in activity across (indistinct). Part of it has been data privacy. The other part of it is being more joined up and more digital rather than paper or form based. We've all got stories of waiting in line, holding a form, taking that form to the front of the line and handing it over a desk. Now government and public sector are really looking to transform their services into being online (indistinct) self service. And that whole shift is then driving the need to emulate a lot of what the commercial sector is doing to automate their processes and to unlock the data from silos to put through into those processes. And another thing that I can say about this is that the need for data quality, as Duncan mentions, underpins all of these processes: government, pharmaceuticals, utilities, banking, insurance. The ability for a chief marketing officer to drive a loyalty campaign, the ability for a CFO to reconcile accounts at the end of the month to do a quick, accurate financial close. Also the ability of customer operations to make sure that the customer has the right details about themselves in the right application that they can sell. So all of that is underpinned by data and is effective or not based on the quality of that data. So whilst we're mobilizing data to the Snowflake Cloud, the ability to then drive analytics, prediction, and business processes off that Cloud succeeds or fails on the quality of that data. >> I mean, it really is table stakes. If you don't trust the data, you're not going to use the data. The problem is it always takes so long to get to the data quality.
There's all these endless debates about it. So we've been doing a fair amount of work and thinking around this idea of decentralized data. Data by its very nature is decentralized, but the fault domains of traditional big data is that everything is just monolithic. The organization's monolithic, the technology's monolithic, the roles are very, you know, hyper specialized. And so you're hearing a lot more these days about this notion of a data fabric, or what Zhamak Dehghani calls a data mesh, and we've kind of been leaning into that and the ability to connect various data capabilities, whether it's a data warehouse or a data hub or a data lake, so that those assets are discoverable, they're shareable through APIs, and they're governed on a federated basis. And you're now bringing in machine intelligence to improve data quality. You know, I wonder, Duncan, if you could talk a little bit about Snowflake's approach to this topic. >> Sure, so I'd say that making use of all of your data is the key kind of driver behind these ideas of data meshes or data fabrics. And the idea is that you want to bring together not just your kind of strategic data but also your legacy data and everything that you have inside the enterprise. I think I'd also like to kind of expand upon what a lot of people view as all of the data. I think that a lot of people kind of miss that there's this whole other world of data they could be having access to, which is things like data from their business partners, their customers, their suppliers, and even stuff that's more in the public domain, whether that's, you know, demographic data or geographic or all these kinds of other types of data sources. And what I'd say to some extent is that the data Cloud really facilitates the ability to share and gain access to this, both between organizations and inside organizations.
And you don't have to make lots of copies of the data and kind of worry about the storage and this federated idea of governance and all these things that are quite complex to manage. The Snowflake approach really enables you to share data with your ecosystem or the world without any latency, with full control over what's shared, without having to introduce new complexities or having complex interactions with APIs or software integration. The simple approach that we provide allows a relentless focus on creating the right data product to meet the challenges facing your business today. >> So Ajay, the key here, as Duncan's talking about it, in my mind the key takeaway is simplicity. If you can take the complexity out of the equation, you're going to get more adoption. It really is that simple. >> Yeah, absolutely. I think that whole journey, maybe five, six years ago, the adoption of data lakes was a stepping stone. However, the Achilles heel there was the complexity that it shifted towards consuming that data from a data lake, where there were many, many sets of data to be able to curate and to consume. Whereas actually, the simplicity of being able to go to the data that you need to do your role, whether you're in tax compliance or in customer services, is key. And listen, for Snowflake and Io Tahoe, one thing we know for sure is that our customers are super smart and they're very capable. They're data savvy, and they'll want to use whichever tool and embrace whichever Cloud platform is going to reduce the barriers to solving what's complex about that data, simplifying that and using good old fashioned SQL to access data and to build products from it to exploit that data. So simplicity is key to allowing people to make use of that data, and CIOs recognize that.
>> So Duncan, the Cloud obviously brought in this notion of DevOps and new methodologies and things like agile. That's brought in the notion of DataOps, which is a very hot topic right now, basically DevOps applied to data. How does Snowflake think about this? How do you facilitate that methodology? >> So I agree with you absolutely that DataOps takes these ideas of agile development or agile delivery and of the kind of DevOps world that we've seen just rise and rise, and it applies them to the data pipeline, which is somewhere where it kind of traditionally hasn't happened. And it's the same kinds of messages as we see in the development world: it's about delivering faster development, having better repeatability, and really getting towards that dream of the data-driven enterprise, where you can answer people's data questions so they can make better business decisions. And we have some really great architectural advantages that allow us to do things like cloning of data sets without having to copy them, and things like time travel, so we can see what the data looked like at some point in the past. And this lets you kind of set up your own little data playpen as a clone without really having to copy all of that data, so it's quick and easy. And you can also, again with our separation of storage and compute, provision your own virtual warehouse for dev usage, so you're not interfering with anything to do with people's production usage of this data. So these ideas, the scalability, it just makes it easy to make changes, test them, and see what the effect of those changes is. And we've actually seen this; you were talking a lot about partner ecosystems earlier. The partner ecosystem has taken these ideas that are inside Snowflake and they've extended them. They've integrated them with DevOps and DataOps tooling, so things like version control and Git, and infrastructure automation and things like Terraform.
And they've kind of built that out into more of a DataOps product that you can make use of. So we can see there's a huge impact of these ideas coming into the data world. We think we're really well-placed to take advantage of them. The partner ecosystem is doing a great job with doing that. And it really allows us to kind of change that operating model for data, so that we don't have as much emphasis on hierarchy and change windows and all these kinds of things that are maybe viewed as a little old-fashioned. And we've kind of taken the shift from this batch data integration into streaming, continuous data pipelines in the Cloud. And this gets you away from a once a week, or once a month if you're really unlucky, change window to pushing changes in a much more rapid fashion as the needs of the business change. >> I mean, those hierarchical organizational structures, when we apply those to data, that actually creates the silos. So if you're going to be a silo buster, which, Ajay, I look at you guys as silo busters, you've got to put data in the hands of the domain experts, the business people. They know what data they want. If they have to go through and beg and borrow for new data sets, et cetera... And so that's where automation becomes so key. And frankly, the technology should be an implementation detail, not the dictating factor. I wonder if you could comment on this. >> Yeah, absolutely. I think making the technologies more accessible to the general business users or those specialist business teams, that's the key to unlocking. So what is interesting to see is, as people move from organization to organization where they've had those experiences operating in a hierarchical sense, they want to break free from that. And they've been exposed to automation. Continuous workflows; change is continuous in IT. It's continuous in business. The market's continuously changing.
So having that flow of work across the organization, using key components such as GitHub and similar tools to drive process, Terraform to build code into the process, and automation, and, with Io Tahoe, leveraging all the metadata from across those fragmented sources, it's good to see how those things are coming together. And watching people move from organization to organization and say, "Hey, okay, I've got a new start. I've got my first hundred days to impress my new manager. What kind of an impact can I bring to this?" And quite often we're seeing that as, let me take away the good learnings of how to do it or how not to do it from my previous role. And this is an opportunity for me to bring in automation. And I'll give you an example, David. We recently started working with a client in financial services, who's an asset manager, managing financial assets. They've grown over the course of the last 10 years through M&A, and each of those acquisitions has brought with it its technical debt and its own set of data, so multiple CRM systems now, multiple databases, multiple bespoke in-house created applications. And when the new CIO came in and had a look at those, he thought, well, yes, I want to mobilize my data. Yes, I need to modernize my data estate, because my CEO is now looking at these crypto assets that are on the horizon and the new funds that are emerging around digital assets and crypto assets. But in order to get to that, where absolutely data underpins that and is the core asset, cleaning up that legacy situation and mobilizing the relevant data into the Snowflake Cloud platform is where we're giving time back. You know, that is now taking a few weeks, whereas that transition to mobilize that data, to start with that new clean slate to build upon a new business as a digital crypto asset manager as well as the legacy, traditional financial assets, bonds, stocks, and fixed income assets, you name it, is where we're starting to see a lot of innovation.
>> Tons of innovation. I love the crypto examples; NFTs are exploding and, let's face it, traditional banks are getting disrupted. And so I also love this notion of data RPA, especially because, Ajay, I've done a lot of work in the RPA space. And what I would observe is that the early days of RPA, I call it paving the cow path: taking existing processes and applying scripts, letting software robots do their thing. And that was good because it reduced mundane tasks, but really where it's evolved is a much broader automation agenda. People are discovering new ways to completely transform their processes. And I see a similar analogy for the data operating model. So I wonder what you think about that, and how a customer really gets started bringing this to their ecosystem, their data life cycles. >> Sure. Yeah. Step one is always the same. It's figuring out, for the CIO, the chief data officer, what data do I have? And that's increasingly something that they want to automate, so we can help them there and do that automated data discovery, whether that is documents in the file share, a backup archive, a relational data store, or a mainframe, really quickly hydrating that and bringing that intelligence to the forefront of what do I have. And then it's the next step of, well, okay, now I want to continually monitor and curate that intelligence with the platform that I've chosen, let's say Snowflake, such that I can then build applications on top of that platform to serve my internal and external customer needs. And the automation around classifying data and reconciliation across different fragmented data silos, building those insights into Snowflake. As you say, a little later on we're talking about data quality, active DQ, allowing us to reconcile data from different sources as well as look at the integrity of that data, so then go on to remediation. I want to harness and leverage techniques around traditional RPA, but to get to that stage, I need to fix the data.
So remediating, publishing the data in Snowflake, allowing analysis to be performed in Snowflake: those are the key steps that we see. And just shrinking that timeline into weeks, giving the organization that time back, means they're spending more time on their customer and solving their customer's problem, which is where we want them to be. >> Well, I think this is the brilliance of Snowflake actually. You know, Duncan, I've talked to Benoit Dageville about this, and your other co-founders, and it's really that focus on simplicity. So I mean, you picked a good company to join, in my opinion. So I wonder, Ajay, if you could talk about some of the industry sectors that are going to gain the most from data RPA. I mean, traditional RPA, if I can use that term, a lot of it was back office, a lot of financial. What are the practical applications where data RPA is going to impact businesses, and the outcomes that we can expect? >> Yes, so our drive is really to make that general business user's experience of RPA simpler, and using no code to do that, where they've also chosen Snowflake to build their Cloud platform. They've got the combination then of using relatively simple scripting techniques such as SQL with our no-code approach. And the answer to your question is whichever sector is looking to mobilize their data. It seems like a cop-out, but to give you some specific examples, David: in banking, where our customers are looking to modernize their banking systems and enable better customer experience through applications and digital apps, that's where we're seeing a lot of traction in this approach to apply RPA to data. And health care, where there's a huge amount of work to do to standardize data sets across providers, payers, and patients, and it's an ongoing process there. For retail, helping to build that immersive customer experience. So recommending next best actions.
Providing an experience that is going to drive loyalty and retention, that's dependent on understanding what that customer's needs and intent are, and being able to provide them with the content or the offer at that point in time. All data dependent. Utilities, there's another one, great overlap there with Snowflake, where we're helping utilities, telecoms, energy, and water providers to build services on that data. And this is where the ecosystem just continues to expand. If we're helping our customers turn their data into services for their ecosystem, that's exciting. Again, none more exciting than insurance. I always used to think back to when insurance used to be very dull and mundane; actually, that's where we're seeing huge amounts of innovation to create new flexible products that are priced to the day, to the situation, and risk models being adaptive when the data changes on events or circumstances. So across all those sectors, they're all mobilizing their data, they're all moving in some way, shape or form to a multi-Cloud setup with their IT. And I think with Snowflake and with Io Tahoe being able to accelerate that and make that journey simple and less complex is why we've found such a good partner here. >> All right. Thanks for that. And thank you guys both. We've got to leave it there. Really appreciate you coming on, Duncan, and Ajay, best of luck with the fundraising. >> We'll keep you posted. Thanks, David. >> All right. Great. >> Okay. Now let's take a look at a short video that's going to help you understand how to reduce the steps around your DataOps. Let's watch. (upbeat music)
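The reconciliation step Ajay outlines in the interview, comparing data from different fragmented sources before remediation, can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not Io Tahoe's product or algorithm: the `reconcile` function, the record layout, and the sample CRM and billing data are all hypothetical.

```python
def reconcile(source_a, source_b, key, fields):
    """Compare two lists of dict records that share a key column.

    Returns a report with keys missing from either side and a list of
    field-level mismatches as (key, field, value_in_a, value_in_b) tuples.
    """
    a = {record[key]: record for record in source_a}
    b = {record[key]: record for record in source_b}

    report = {
        "missing_in_b": sorted(a.keys() - b.keys()),
        "missing_in_a": sorted(b.keys() - a.keys()),
        "mismatches": [],
    }
    # Only records present on both sides can be compared field by field.
    for k in sorted(a.keys() & b.keys()):
        for field in fields:
            if a[k].get(field) != b[k].get(field):
                report["mismatches"].append(
                    (k, field, a[k].get(field), b[k].get(field))
                )
    return report


# Made-up records standing in for two silos, e.g. a CRM and a billing system.
crm = [
    {"id": 1, "email": "ann@example.com", "country": "UK"},
    {"id": 2, "email": "bob@example.com", "country": "US"},
    {"id": 3, "email": "cy@example.com", "country": "FR"},
]
billing = [
    {"id": 1, "email": "ann@example.com", "country": "GB"},
    {"id": 2, "email": "bob@example.com", "country": "US"},
    {"id": 4, "email": "di@example.com", "country": "DE"},
]

report = reconcile(crm, billing, key="id", fields=["email", "country"])
print(report)
```

A report like this is the kind of input a remediation step would work from: it flags the records that exist in only one system and the fields where the two systems disagree.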

Published Date : Apr 20 2021

ENTITIES

Entity | Category | Confidence
David | PERSON | 0.99+
Ajay Vohora | PERSON | 0.99+
Duncan Turnbull | PERSON | 0.99+
five | QUANTITY | 0.99+
Duncan | PERSON | 0.99+
two | QUANTITY | 0.99+
Dave | PERSON | 0.99+
IO | ORGANIZATION | 0.99+
Jimit Devani | PERSON | 0.99+
Ajay | PERSON | 0.99+
Io Tahoe | ORGANIZATION | 0.99+
20 | QUANTITY | 0.99+
Io-Tahoe | ORGANIZATION | 0.99+
One | QUANTITY | 0.99+
California consumer privacy Act | TITLE | 0.99+
Tahoe | PERSON | 0.99+
Benoit Dageville | PERSON | 0.99+
Snowflake | TITLE | 0.99+
five years ago | DATE | 0.99+
SQL | TITLE | 0.99+
first hundred days | QUANTITY | 0.98+
four | QUANTITY | 0.98+
GDPR | TITLE | 0.98+
each | QUANTITY | 0.98+
three | QUANTITY | 0.98+
both | QUANTITY | 0.98+
25% | QUANTITY | 0.97+
three things | QUANTITY | 0.97+
one | QUANTITY | 0.97+
M&A | ORGANIZATION | 0.97+
once a week | QUANTITY | 0.97+
one thing | QUANTITY | 0.96+
Snowflake | ORGANIZATION | 0.95+
once a month | QUANTITY | 0.95+
DevOps | TITLE | 0.95+
snowflake | TITLE | 0.94+
single | QUANTITY | 0.93+
last six months | DATE | 0.92+
States | TITLE | 0.92+
six years ago | DATE | 0.91+
single way | QUANTITY | 0.91+
Snowflake Cloud | TITLE | 0.9+
DataOps | TITLE | 0.9+
today | DATE | 0.86+
12 | QUANTITY | 0.85+
35 plus billion dollars | QUANTITY | 0.84+
five | DATE | 0.84+
Step one | QUANTITY | 0.83+
Tons | QUANTITY | 0.82+
RedHat | ORGANIZATION | 0.81+
Centos | ORGANIZATION | 0.8+
One thing | QUANTITY | 0.79+
14 months | QUANTITY | 0.79+