Paula D'Amico, Webster Bank | Io-Tahoe | Enterprise Data Automation


 

>> Narrator: From around the globe, it's theCube, with digital coverage of Enterprise Data Automation, an event series brought to you by Io-Tahoe.

>> Everybody, we're back. This is Dave Vellante, and we're covering the whole notion of automating data in the enterprise. And I'm really excited to have Paula D'Amico here. She's Senior Vice President of Enterprise Data Architecture at Webster Bank. Paula, good to see you. Thanks for coming on.

>> Hi, nice to see you, too.

>> Let's start with Webster Bank. You guys are kind of a regional, I think New York, New England, I believe headquartered out of Connecticut. But tell us a little bit about the bank.

>> Webster Bank is regional: Boston, Connecticut, and New York, very focused on Westchester and Fairfield County. They're a really highly rated regional bank for this area. They hold quite a few awards for being supportive of the community, and they're really moving forward technology-wise. They really want to be a data-driven bank, and they want to move into a more robust group.

>> Well, we've got a lot to talk about. Data driven is an interesting topic, and your role is really Senior Vice President of Enterprise Data Architecture. So you've got a big responsibility as it relates to transitioning to this digital, data-driven bank. Tell us a little bit about your role in your organization.

>> Currently, today, we have a small group that is just working toward moving into a more futuristic, more data-driven data warehouse. That's our first item. And the other item is to drive new revenue by anticipating what customers do when they go to the bank, or when they log in to their account, to be able to give them the best offer. The only way to do that is to have timely, accurate, complete data on the customer, and to know what's really of great value to offer them, whether that's a new product or a way to help them continue to grow their savings or grow their investments.

>> Okay, and I really want to get into that. But before we do, and I know you're partway through your journey, you've got a lot to do, I want to ask you about Covid. How are you guys handling that? You had the government coming down with small business loans and PPP, and huge volumes of business, and data was at the heart of that. How did you manage through that?

>> We were extremely successful, because we have a big, dedicated team that understands where their data is, and we were able to switch much faster than a larger bank to be able to offer the PPP loans to our customers at lightning speed. Part of that was that we adapted Salesforce very fast; we've had Salesforce in house for over 15 years. Pretty much that was the driving vehicle to get our PPP loans in, and then developing logic quickly. But it was a 24/7 development role to get the data moving and to help our customers fill out the forms. A lot of that was manual, but it was a large community effort.

>> Think about that, too. The volume was probably much higher than the volume of loans to small businesses that you're used to granting. And then also, the initial guidelines were very opaque: you really didn't know what the rules were, but you were expected to enforce them. And then finally, you got more clarity. So you had to essentially code that logic into the system in real time, right?

>> I wasn't directly involved, but part of my data movement team was, and we had to change the logic overnight. It was released on a Friday night; we pushed our first set of loans through, and then the logic coming from the government changed, and we had to redevelop our data movement pieces again, design them and send them back through. So it was definitely kind of scary, but we were completely successful. We hit a very high peak. I don't know the exact number, but it was in the thousands of loans, from little loans to very large loans, and not one customer who applied through the right process and filled out the right paperwork failed to get what they needed.

>> Well, that is an amazing story, and really great support for the region: New York, Connecticut, the Boston area. So that's fantastic. I want to get into the rest of your story now. Let's start with some of the business drivers in banking. Obviously online: a lot of people have joked that many of the older people who shunned online banking and would love to go into the branch and see their friendly teller had no choice during this pandemic but to go online. So that's obviously a big trend. You mentioned the data-driven data warehouse; I want to understand that. But at the top level, what are some of the key business drivers that are catalyzing your desire for change?

>> The ability to give the customer what they need at the time when they need it. What I mean by that is that we have customer interactions in multiple ways, and I want the customer to be able to walk into a bank, or go online, and see the same format, have the same look and feel, and also to be offered the next best offer for them, whether they're looking for a new mortgage, or looking to refinance, or whatever it is. We have that data, and they feel comfortable with us using it. And that's an untethered-banker attitude: whatever my banker is holding and whatever the person is holding on their phone is the same, and it's comfortable, so they don't feel that they've walked into the bank and have to fill out different paperwork compared to just doing it on their phone.

>> So you actually want the experience to be better, and it is in many cases. Now, you weren't able to do this with your existing, I guess mainframe-based, enterprise data warehouse. Is that right? Maybe talk about that a little bit.

>> We were definitely able to do it with what we have today, the technology we're using, but one of the issues is that it's not timely. You need a timely process to be able to get the customers to understand what's happening. You need a timely process so we can enhance our risk management, and we can apply it for fraud issues and things like that.

>> Yeah, so you're trying to get more real time. The traditional EDW is sort of a science project: there are a few experts that know how to get at it, the demand is tremendous, and oftentimes by the time you get the answer, it's outdated. So you're trying to address that problem. So part of it is really the end-to-end cycle time that you're compressing. And then there are, if I understand it, residual benefits that are pretty substantial from a revenue opportunity: other offers that you can make to the right customer, that you maybe know through your data. Is that right?

>> Exactly. It's driving new customers to new opportunities, it's enhancing risk management, it's optimizing the banking process, and then obviously creating new business. And the only way we're going to be able to do that is if we have the ability to look at the data right when the customer walks in the door, or right when they open up their app. By creating more near-real-time data, the data warehouse team is giving the lines of business the ability to work on the next best offer for that customer.

>> Paula, we're inundated with data sources these days. Are there data sources that you maybe had access to before, but perhaps the backlog of ingesting, cleaning, cataloging, and analyzing was so great that you couldn't tap them? Do you see the potential to increase the data sources, and hence the quality of the data, or is that premature?

>> Oh no, exactly right. So right now we ingest a lot of flat files from our mainframe type of front-end system that we've had for quite a few years. But now that we're moving to the cloud, moving off-prem into, like, an S3 bucket, we can process that data and get it faster by using real-time tools to move it into a place where Snowflake can utilize it, or we can give it out to our market. Right now we still work in batch mode, so we're doing 24 hours.

>> Okay. So when I think about the data pipeline and the people involved, maybe you could talk a little bit about the organization. You've got, I don't know if you have, data scientists or statisticians, I'm sure you do, data architects, data engineers, quality engineers, developers, etc. And oftentimes practitioners like yourself will stress about, hey, the data's in silos, the data quality is not where we want it to be, we have to manually categorize the data. These are all common data pipeline problems, if you will. Sometimes we use the term DataOps, which is sort of a play on DevOps applied to the data pipeline. Could you describe your situation in that context?

>> Yeah, so we have a very large data ops team, and everyone who is working on the data part of Webster's Bank has been there 13, 14 years. So they get the data, they understand it, they understand the lines of business. We have data quality issues, just like everybody else does, but we have places where that gets cleansed. And there was very much siloed data. The data scientists are out in the lines of business right now, which is great, because I think that's where data science belongs. What we're working towards now is giving them more self-service, giving them the ability to access the data in a more robust way, and it's a single source of truth, so they're not pulling the data down into their own, like, Tableau dashboards and then pushing the data back out. They're going to more of, I don't want to say a central repository, but more of a robust repository that's controlled across multiple avenues, where multiple lines of business can access that data. Does that help?

>> Got it, yes. And I think one of the key things that I'm taking away from your last comment is the cultural aspect of this. By having the data scientists in the lines of business, the lines of business will feel ownership of that data, as opposed to pointing fingers, criticizing the data quality. They really own that problem, as opposed to saying, well, it's Paula's problem.

>> Well, I have data engineers, data architects, database administrators, traditional data reporting people. And some customers that I have, business customers in the lines of business, want to just subscribe to a report; they don't want to go out and do any data science work. And we still have to provide that. So we still want to provide them some kind of regimen where they wake up in the morning, they open up their email, and there's the report that they subscribe to, which is great, and it works out really well. And one of the reasons why we purchased Io-Tahoe was that I would have the ability to give the lines of business the ability to do search within the data. It reads the data flows and data redundancy and things like that, and helps me clean up the data, and also give it to the data analysts. Say they just ask me, they want this certain report. It used to take, okay, four weeks, we're going to go look at the data, and then we'll come back and tell you what we can do. But now with Io-Tahoe, they're able to look at the data and then, in one or two days, go back and say, yes, we have the data, this is where it is, this is where we found it, and these are the data flows that we've found also. Which is what I call the birth of a column: it's where the column was created, where it went to live as a teenager (laughs), and then where it went to die, where we archive it. It's this cycle of life for a column, and Io-Tahoe helps us do that. And data lineage is done all the time, and it just takes a very long time, and that's why we're using something that has AI and machine learning in it. It's accurate, and it does it the same way over and over again. If an analyst leaves, you're able to utilize something like Io-Tahoe to do that work for you. Does that help?

>> Yeah, got it. So a couple of things there. In researching Io-Tahoe, it seems like one of the strengths of their platform is the ability to visualize data, the data structure, and actually dig into it, but also see it, and that speeds things up and gives everybody additional confidence. And then the other piece is essentially infusing AI or machine intelligence into the data pipeline, which is really how you're attacking automation. And you're saying it's repeatable, and then that helps the data quality, and you have this virtuous cycle. Maybe you could affirm that and add some color, perhaps.

>> Exactly. So let's say that I have seven core lines of business that are asking me questions, and one of the questions they'll ask me is, we want to know if this customer is okay to contact. And there are different avenues, so you can go online and say do not contact me, or you can go to the bank and say, I don't want email, but I'll take texts, and I want no phone calls. All that information. So seven different lines of business ask me that question in different ways: one says "okay to contact," the other one says "customer 123," all of these. And each project, before I got there, used to be siloed. So one customer would be 100 hours for them to do that analytical work, and then another analyst would do another 100 hours on the other project. Well, now I can do that all at once, and I can do those types of searches and say, yes, we already have that documentation, here it is, and this is where you can find where the customer has said, no, I don't want to get access from you by email, or I've subscribed to get emails from you.

>> Got it. Okay. And then I want to come back to the cloud a little bit. You mentioned S3 buckets, so you're moving to the Amazon cloud, at least; I'm sure you're going to get a hybrid situation there. You mentioned Snowflake. What was the decision to move to the cloud? Obviously, Snowflake is cloud only; there's not an on-prem version there. So what precipitated that?

>> Alright, so I've been in the data and IT information field for the last 35 years. I started in the US Air Force and have moved on from there. And my experience with off-prem was with Snowflake, working with GE Capital, and that's where I met up with the team from Io-Tahoe as well. So it's proven. There are a couple of things: one is Informatica, which is worldwide known to move data. They have two products, the on-prem and the off-prem. I've used both; they're both great, it's very stable, and I'm comfortable with it. Other people are very comfortable with it. So we picked that as our batch data movement. We're moving toward probably HVR; it's not a total decision yet, but we're moving to HVR for real-time data, which is change data capture, and it moves it into the cloud. And so you're envisioning this: right now, you're in S3, and you have all the data that you could possibly want, and that's JSON, all of it, everything sitting in S3, to be able to move it through into Snowflake. And Snowflake has proven stability. You only need to learn and train your team on one thing, and AWS is completely stable at this point, too. So all these avenues, if you think about it, go through from, this is your data lake, which I would consider your S3, even though it's not a traditional data lake that you can touch, like Hadoop. And then into Snowflake, and from Snowflake into sandboxes, so your lines of business and your data scientists can just dive right in. That makes a big win. And then using Io-Tahoe with the data automation, and also their search engine, I have the ability to give the data scientists and data analysts a way to get accurate, complete information from the structure without needing to talk to IT, and we'll be right there.

>> Yes, so talking about Snowflake and getting up to speed quickly, I know from talking to customers you can get from zero to Snowflake very fast, and then it sounds like Io-Tahoe is sort of the automation cloud for your data pipeline within the cloud. Is that the right way to think about it?

>> I think so. Right now I have Io-Tahoe attached to my on-prem, and I want to attach it to my off-prem eventually. So I'm using Io-Tahoe's data automation right now to bring in the data and to start analyzing the data flows, to make sure that I'm not missing anything and that I'm not bringing over redundant data. The data warehouse that I'm working off of is on-prem; it's an Oracle database, and it's 15 years old. So it has extra data in it. It has things that we don't need anymore, and Io-Tahoe's helping me shake out the extra data that does not need to be moved into my S3. So it's saving me money when I'm moving from on-prem to off-prem.

>> And so that was a challenge prior, because you couldn't get the lines of business to agree on what to delete, or what was the issue there?

>> Oh, it was more than that. Each line of business had their own structure within the warehouse, and then they were copying data between each other and duplicating the data and using that. So there might be possibly three tables that have the same data in them, but used for different lines of business. Using Io-Tahoe, I've identified over seven terabytes in the last two months of data that is just repetitive: the same exact data just sitting in a different schema. And that's not easy to find if you only understand one schema, the one that's reporting for that line of business.

>> Yeah, more bad news for the storage companies out there. (both laugh)

>> It's cheap. That's what we were telling people.

>> It's true, but you still would rather not waste it; you'd like to apply it to drive more revenue. And so, I guess, let's close on where you see this thing going. Again, I know you're partway through the journey. Maybe you could describe where you see the phases going, and really what you want to get out of this thing down the road, midterm, longer term. What's your vision for your data-driven organization?

>> I want the bankers to be able to walk around with an iPad in their hands and be able to access data for that customer really fast, and be able to give them the best deal that they can get. I want Webster to be right there on top with being able to add new customers, and to be able to serve our existing customers, who've had bank accounts since they were 12 years old and are now multi-whatever. I want them to have the best experience with our bankers.

>> That's awesome. That's really what I want as a banking customer. I want my bank to know who I am, anticipate my needs, and create a great experience for me, and then let me go on with my life. So that's a great story. Love your experience, your background, and your knowledge. I can't thank you enough for coming on theCube.

>> No, thank you very much. And you guys have a great day.

>> Alright, take care. And thank you for watching, everybody. Keep it right there. We'll take a short break and be right back. (gentle music)

Published Date : Jun 25 2020



Paula D'Amico, Webster Bank | Io Tahoe | Enterprise Data Automation


 

>> Narrator: From around the Globe, it's theCube with digital coverage of Enterprise Data Automation, and event series brought to you by Io-Tahoe. >> Everybody, we're back. And this is Dave Vellante, and we're covering the whole notion of Automated Data in the Enterprise. And I'm really excited to have Paula D'Amico here. Senior Vice President of Enterprise Data Architecture at Webster Bank. Paula, good to see you. Thanks for coming on. >> Hi, nice to see you, too. >> Let's start with Webster bank. You guys are kind of a regional I think New York, New England, believe it's headquartered out of Connecticut. But tell us a little bit about the bank. >> Webster bank is regional Boston, Connecticut, and New York. Very focused on in Westchester and Fairfield County. They are a really highly rated regional bank for this area. They hold quite a few awards for the area for being supportive for the community, and are really moving forward technology wise, they really want to be a data driven bank, and they want to move into a more robust group. >> We got a lot to talk about. So data driven is an interesting topic and your role as Data Architecture, is really Senior Vice President Data Architecture. So you got a big responsibility as it relates to kind of transitioning to this digital data driven bank but tell us a little bit about your role in your Organization. >> Currently, today, we have a small group that is just working toward moving into a more futuristic, more data driven data warehousing. That's our first item. And then the other item is to drive new revenue by anticipating what customers do, when they go to the bank or when they log in to their account, to be able to give them the best offer. And the only way to do that is you have timely, accurate, complete data on the customer and what's really a great value on offer something to offer that, or a new product, or to help them continue to grow their savings, or do and grow their investments. >> Okay, and I really want to get into that. But before we do, and I know you're, sort of partway through your journey, you got a lot to do. But I want to ask you about Covid, how you guys handling that? You had the government coming down and small business loans and PPP, and huge volume of business and sort of data was at the heart of that. How did you manage through that? >> We were extremely successful, because we have a big, dedicated team that understands where their data is and was able to switch much faster than a larger bank, to be able to offer the PPP Long's out to our customers within lightning speed. And part of that was is we adapted to Salesforce very for we've had Salesforce in house for over 15 years. Pretty much that was the driving vehicle to get our PPP loans in, and then developing logic quickly, but it was a 24 seven development role and get the data moving on helping our customers fill out the forms. And a lot of that was manual, but it was a large community effort. >> Think about that too. The volume was probably much higher than the volume of loans to small businesses that you're used to granting and then also the initial guidelines were very opaque. You really didn't know what the rules were, but you were expected to enforce them. And then finally, you got more clarity. So you had to essentially code that logic into the system in real time. >> I wasn't directly involved, but part of my data movement team was, and we had to change the logic overnight. 
So it was on a Friday night it was released, we pushed our first set of loans through, and then the logic changed from coming from the government, it changed and we had to redevelop our data movement pieces again, and we design them and send them back through. So it was definitely kind of scary, but we were completely successful. We hit a very high peak. Again, I don't know the exact number but it was in the thousands of loans, from little loans to very large loans and not one customer who applied did not get what they needed for, that was the right process and filled out the right amount. >> Well, that is an amazing story and really great support for the region, your Connecticut, the Boston area. So that's fantastic. I want to get into the rest of your story now. Let's start with some of the business drivers in banking. I mean, obviously online. A lot of people have sort of joked that many of the older people, who kind of shunned online banking would love to go into the branch and see their friendly teller had no choice, during this pandemic, to go to online. So that's obviously a big trend you mentioned, the data driven data warehouse, I want to understand that, but what at the top level, what are some of the key business drivers that are catalyzing your desire for change? >> The ability to give a customer, what they need at the time when they need it. And what I mean by that is that we have customer interactions in multiple ways. And I want to be able for the customer to walk into a bank or online and see the same format, and being able to have the same feel the same love, and also to be able to offer them the next best offer for them. But they're if they want looking for a new mortgage or looking to refinance, or whatever it is that they have that data, we have the data and that they feel comfortable using it. And that's an untethered banker. Attitude is, whatever my banker is holding and whatever the person is holding in their phone, that is the same and it's comfortable. So they don't feel that they've walked into the bank and they have to do fill out different paperwork compared to filling out paperwork on just doing it on their phone. >> You actually do want the experience to be better. And it is in many cases. Now you weren't able to do this with your existing I guess mainframe based Enterprise Data Warehouses. Is that right? Maybe talk about that a little bit? >> Yeah, we were definitely able to do it with what we have today the technology we're using. But one of the issues is that it's not timely. And you need a timely process to be able to get the customers to understand what's happening. You need a timely process so we can enhance our risk management. We can apply for fraud issues and things like that. >> Yeah, so you're trying to get more real time. The traditional EDW. It's sort of a science project. There's a few experts that know how to get it. You can so line up, the demand is tremendous. And then oftentimes by the time you get the answer, it's outdated. So you're trying to address that problem. So part of it is really the cycle time the end to end cycle time that you're progressing. And then there's, if I understand it residual benefits that are pretty substantial from a revenue opportunity, other offers that you can make to the right customer, that you maybe know, through your data, is that right? >> Exactly. It's drive new customers to new opportunities. It's enhanced the risk, and it's to optimize the banking process, and then obviously, to create new business. 
And the only way we're going to be able to do that is if we have the ability to look at the data right when the customer walks in the door or right when they open up their app. And by doing creating more to New York times near real time data, or the data warehouse team that's giving the lines of business the ability to work on the next best offer for that customer as well. >> But Paula, we're inundated with data sources these days. Are there other data sources that maybe had access to before, but perhaps the backlog of ingesting and cleaning in cataloging and analyzing maybe the backlog was so great that you couldn't perhaps tap some of those data sources. Do you see the potential to increase the data sources and hence the quality of the data or is that sort of premature? >> Oh, no. Exactly. Right. So right now, we ingest a lot of flat files and from our mainframe type of front end system, that we've had for quite a few years. But now that we're moving to the cloud and off-prem and on-prem, moving off-prem, into like an S3 Bucket, where that data we can process that data and get that data faster by using real time tools to move that data into a place where, like snowflake could utilize that data, or we can give it out to our market. Right now we're about we do work in batch mode still. So we're doing 24 hours. >> Okay. So when I think about the data pipeline, and the people involved, maybe you could talk a little bit about the organization. You've got, I don't know, if you have data scientists or statisticians, I'm sure you do. You got data architects, data engineers, quality engineers, developers, etc. And oftentimes, practitioners like yourself, will stress about, hey, the data is in silos. The data quality is not where we want it to be. We have to manually categorize the data. These are all sort of common data pipeline problems, if you will. Sometimes we use the term data Ops, which is sort of a play on DevOps applied to the data pipeline. Can you just sort of describe your situation in that context? >> Yeah, so we have a very large data ops team. And everyone that who is working on the data part of Webster's Bank, has been there 13 to 14 years. So they get the data, they understand it, they understand the lines of business. So it's right now. We could the we have data quality issues, just like everybody else does. But we have places in them where that gets cleansed. And we're moving toward and there was very much siloed data. The data scientists are out in the lines of business right now, which is great, because I think that's where data science belongs, we should give them and that's what we're working towards now is giving them more self service, giving them the ability to access the data in a more robust way. And it's a single source of truth. So they're not pulling the data down into their own, like Tableau dashboards, and then pushing the data back out. So they're going to more not, I don't want to say, a central repository, but a more of a robust repository, that's controlled across multiple avenues, where multiple lines of business can access that data. Is that help? >> Got it, Yes. And I think that one of the key things that I'm taking away from your last comment, is the cultural aspects of this by having the data scientists in the line of business, the lines of business will feel ownership of that data as opposed to pointing fingers criticizing the data quality. They really own that that problem, as opposed to saying, well, it's Paula's problem. 
>> Well, I have my problem is I have data engineers, data architects, database administrators, traditional data reporting people. And because some customers that I have that are business customers lines of business, they want to just subscribe to a report, they don't want to go out and do any data science work. And we still have to provide that. So we still want to provide them some kind of regiment that they wake up in the morning, and they open up their email, and there's the report that they subscribe to, which is great, and it works out really well. And one of the things is why we purchased Io-Tahoe was, I would have the ability to give the lines of business, the ability to do search within the data. And we'll read the data flows and data redundancy and things like that, and help me clean up the data. And also, to give it to the data analysts who say, all right, they just asked me they want this certain report. And it used to take okay, four weeks we're going to go and we're going to look at the data and then we'll come back and tell you what we can do. But now with Io-Tahoe, they're able to look at the data, and then in one or two days, they'll be able to go back and say, Yes, we have the data, this is where it is. This is where we found it. This is the data flows that we found also, which is what I call it, is the break of a column. It's where the column was created, and where it went to live as a teenager. (laughs) And then it went to die, where we archive it. And, yeah, it's this cycle of life for a column. And Io-Tahoe helps us do that. And we do data lineage is done all the time. And it's just takes a very long time and that's why we're using something that has AI in it and machine running. It's accurate, it does it the same way over and over again. If an analyst leaves, you're able to utilize something like Io-Tahoe to be able to do that work for you. Is that help? >> Yeah, so got it. So a couple things there, in researching Io-Tahoe, it seems like one of the strengths of their platform is the ability to visualize data, the data structure and actually dig into it, but also see it. And that speeds things up and gives everybody additional confidence. And then the other piece is essentially infusing AI or machine intelligence into the data pipeline, is really how you're attacking automation. And you're saying it repeatable, and then that helps the data quality and you have this virtual cycle. Maybe you could sort of affirm that and add some color, perhaps. >> Exactly. So you're able to let's say that I have seven cars, lines of business that are asking me questions, and one of the questions they'll ask me is, we want to know, if this customer is okay to contact, and there's different avenues so you can go online, do not contact me, you can go to the bank and you can say, I don't want email, but I'll take texts. And I want no phone calls. All that information. So, seven different lines of business asked me that question in different ways. One said, "No okay to contact" the other one says, "Customer 123." All these. In each project before I got there used to be siloed. So one customer would be 100 hours for them to do that analytical work, and then another analyst would do another 100 hours on the other project. Well, now I can do that all at once. And I can do those types of searches and say, Yes, we already have that documentation. 
Here it is, and this is where you can find where the customer has said, "No, I don't want to get access from you by email or I've subscribed to get emails from you." >> Got it. Okay. Yeah Okay. And then I want to go back to the cloud a little bit. So you mentioned S3 Buckets. So you're moving to the Amazon cloud, at least, I'm sure you're going to get a hybrid situation there. You mentioned snowflake. What was sort of the decision to move to the cloud? Obviously, snowflake is cloud only. There's not an on-prem, version there. So what precipitated that? >> Alright, so from I've been in the data IT information field for the last 35 years. I started in the US Air Force, and have moved on from since then. And my experience with Bob Graham, was with snowflake with working with GE Capital. And that's where I met up with the team from Io-Tahoe as well. And so it's a proven so there's a couple of things one is Informatica, is worldwide known to move data. They have two products, they have the on-prem and the off-prem. I've used the on-prem and off-prem, they're both great. And it's very stable, and I'm comfortable with it. Other people are very comfortable with it. So we picked that as our batch data movement. We're moving toward probably HVR. It's not a total decision yet. But we're moving to HVR for real time data, which is changed capture data, moves it into the cloud. And then, so you're envisioning this right now. In which is you're in the S3, and you have all the data that you could possibly want. And that's JSON, all that everything is sitting in the S3 to be able to move it through into snowflake. And snowflake has proven to have a stability. You only need to learn and train your team with one thing. AWS as is completely stable at this point too. So all these avenues if you think about it, is going through from, this is your data lake, which is I would consider your S3. And even though it's not a traditional data lake like, you can touch it like a Progressive or Hadoop. And then into snowflake and then from snowflake into sandbox and so your lines of business and your data scientists just dive right in. That makes a big win. And then using Io-Tahoe with the data automation, and also their search engine. I have the ability to give the data scientists and data analysts the way of they don't need to talk to IT to get accurate information or completely accurate information from the structure. And we'll be right back. >> Yeah, so talking about snowflake and getting up to speed quickly. I know from talking to customers you can get from zero to snowflake very fast and then it sounds like the Io-Tahoe is sort of the automation cloud for your data pipeline within the cloud. Is that the right way to think about it? >> I think so. Right now I have Io-Tahoe attached to my on-prem. And I want to attach it to my off-prem eventually. So I'm using Io-Tahoe data automation right now, to bring in the data, and to start analyzing the data flows to make sure that I'm not missing anything, and that I'm not bringing over redundant data. The data warehouse that I'm working of, it's an on-prem. It's an Oracle Database, and it's 15 years old. So it has extra data in it. It has things that we don't need anymore, and Io-Tahoe's helping me shake out that extra data that does not need to be moved into my S3. So it's saving me money, when I'm moving from off-prem to on-prem. >> And so that was a challenge prior, because you couldn't get the lines of business to agree what to delete, or what was the issue there? 
>> Oh, it was more than that. Each line of business had their own structure within the warehouse. And then they were copying data between each other, and duplicating the data and using that. So there could be possibly three tables that have the same data in it, but it's used for different lines of business. We have identified using Io-Tahoe identified over seven terabytes in the last two months on data that has just been repetitive. It's the same exact data just sitting in a different schema. And that's not easy to find, if you only understand one schema, that's reporting for that line of business. >> More bad news for the storage companies out there. (both laughs) So far. >> It's cheap. That's what we were telling people. >> And it's true, but you still would rather not waste it, you'd like to apply it to drive more revenue. And so, I guess, let's close on where you see this thing going. Again, I know you're sort of partway through the journey, maybe you could sort of describe, where you see the phase is going and really what you want to get out of this thing, down the road, mid-term, longer term, what's your vision or your data driven organization. >> I want for the bankers to be able to walk around with an iPad in their hand, and be able to access data for that customer, really fast and be able to give them the best deal that they can get. I want Webster to be right there on top with being able to add new customers, and to be able to serve our existing customers who had bank accounts since they were 12 years old there and now our multi whatever. I want them to be able to have the best experience with our bankers. >> That's awesome. That's really what I want as a banking customer. I want my bank to know who I am, anticipate my needs, and create a great experience for me. And then let me go on with my life. And so that follow. Great story. Love your experience, your background and your knowledge. I can't thank you enough for coming on theCube. >> Now, thank you very much. And you guys have a great day. >> All right, take care. And thank you for watching everybody. Keep right there. We'll take a short break and be right back. (gentle music)

Published Date : Jun 23 2020

SUMMARY :

to you by Io-Tahoe. And I'm really excited to of a regional I think and they want to move it relates to kind of transitioning And the only way to do But I want to ask you about Covid, and get the data moving And then finally, you got more clarity. and filled out the right amount. and really great support for the region, and being able to have the experience to be better. to be able to get the customers that know how to get it. and it's to optimize the banking process, and analyzing maybe the backlog was and get that data faster and the people involved, And everyone that who is working is the cultural aspects of this the ability to do search within the data. and you have this virtual cycle. and one of the questions And then I want to go back in the S3 to be able to move it Is that the right way to think about it? and to start analyzing the data flows and duplicating the data and using that. More bad news for the That's what we were telling people. and really what you want and to be able to serve And so that follow. And you guys have a great day. And thank you for watching everybody.

SENTIMENT ANALYSIS :

ENTITIES

EntityCategoryConfidence
Dave VellantePERSON

0.99+

Paula D'AmicoPERSON

0.99+

PaulaPERSON

0.99+

ConnecticutLOCATION

0.99+

WestchesterLOCATION

0.99+

InformaticaORGANIZATION

0.99+

24 hoursQUANTITY

0.99+

oneQUANTITY

0.99+

13QUANTITY

0.99+

thousandsQUANTITY

0.99+

100 hoursQUANTITY

0.99+

Bob GrahamPERSON

0.99+

iPadCOMMERCIAL_ITEM

0.99+

Webster BankORGANIZATION

0.99+

GE CapitalORGANIZATION

0.99+

first itemQUANTITY

0.99+

AWSORGANIZATION

0.99+

two productsQUANTITY

0.99+

sevenQUANTITY

0.99+

New YorkLOCATION

0.99+

BostonLOCATION

0.99+

three tablesQUANTITY

0.99+

Each lineQUANTITY

0.99+

first setQUANTITY

0.99+

two daysQUANTITY

0.99+

DevOpsTITLE

0.99+

Webster bankORGANIZATION

0.99+

14 yearsQUANTITY

0.99+

over 15 yearsQUANTITY

0.99+

seven carsQUANTITY

0.98+

each projectQUANTITY

0.98+

Friday nightDATE

0.98+

Enterprise Data AutomationORGANIZATION

0.98+

New EnglandLOCATION

0.98+

Io-TahoeORGANIZATION

0.98+

todayDATE

0.98+

Webster's BankORGANIZATION

0.98+

one schemaQUANTITY

0.97+

Fairfield CountyLOCATION

0.97+

OneQUANTITY

0.97+

one customerQUANTITY

0.97+

over seven terabytesQUANTITY

0.97+

SalesforceORGANIZATION

0.96+

bothQUANTITY

0.95+

single sourceQUANTITY

0.93+

one thingQUANTITY

0.93+

US Air ForceORGANIZATION

0.93+

WebsterORGANIZATION

0.92+

S3COMMERCIAL_ITEM

0.92+

Enterprise Data ArchitectureORGANIZATION

0.91+

Io TahoePERSON

0.91+

OracleORGANIZATION

0.9+

15 years oldQUANTITY

0.9+

Io-TahoePERSON

0.89+

12 years oldQUANTITY

0.88+

TableauTITLE

0.87+

four weeksQUANTITY

0.86+

S3 BucketsCOMMERCIAL_ITEM

0.84+

CovidPERSON

0.81+

Data ArchitectureORGANIZATION

0.79+

JSONTITLE

0.79+

Senior Vice PresidentPERSON

0.78+

24 seven development roleQUANTITY

0.77+

last 35 yearsDATE

0.77+

both laughsQUANTITY

0.75+

Io-TahoeTITLE

0.73+

eachQUANTITY

0.72+

loansQUANTITY

0.71+

zeroQUANTITY

0.71+

Paula D'Amico, Webster Bank


 

>> Narrator: From around the Globe, it's theCube with digital coverage of Enterprise Data Automation, and event series brought to you by Io-Tahoe. >> Everybody, we're back. And this is Dave Vellante, and we're covering the whole notion of Automated Data in the Enterprise. And I'm really excited to have Paula D'Amico here. Senior Vice President of Enterprise Data Architecture at Webster Bank. Paula, good to see you. Thanks for coming on. >> Hi, nice to see you, too. >> Let's start with Webster bank. You guys are kind of a regional I think New York, New England, believe it's headquartered out of Connecticut. But tell us a little bit about the bank. >> Webster bank is regional Boston, Connecticut, and New York. Very focused on in Westchester and Fairfield County. They are a really highly rated regional bank for this area. They hold quite a few awards for the area for being supportive for the community, and are really moving forward technology wise, they really want to be a data driven bank, and they want to move into a more robust group. >> We got a lot to talk about. So data driven is an interesting topic and your role as Data Architecture, is really Senior Vice President Data Architecture. So you got a big responsibility as it relates to kind of transitioning to this digital data driven bank but tell us a little bit about your role in your Organization. >> Currently, today, we have a small group that is just working toward moving into a more futuristic, more data driven data warehousing. That's our first item. And then the other item is to drive new revenue by anticipating what customers do, when they go to the bank or when they log in to their account, to be able to give them the best offer. And the only way to do that is you have timely, accurate, complete data on the customer and what's really a great value on offer something to offer that, or a new product, or to help them continue to grow their savings, or do and grow their investments. >> Okay, and I really want to get into that. But before we do, and I know you're, sort of partway through your journey, you got a lot to do. But I want to ask you about Covid, how you guys handling that? You had the government coming down and small business loans and PPP, and huge volume of business and sort of data was at the heart of that. How did you manage through that? >> We were extremely successful, because we have a big, dedicated team that understands where their data is and was able to switch much faster than a larger bank, to be able to offer the PPP Long's out to our customers within lightning speed. And part of that was is we adapted to Salesforce very for we've had Salesforce in house for over 15 years. Pretty much that was the driving vehicle to get our PPP loans in, and then developing logic quickly, but it was a 24 seven development role and get the data moving on helping our customers fill out the forms. And a lot of that was manual, but it was a large community effort. >> Think about that too. The volume was probably much higher than the volume of loans to small businesses that you're used to granting and then also the initial guidelines were very opaque. You really didn't know what the rules were, but you were expected to enforce them. And then finally, you got more clarity. So you had to essentially code that logic into the system in real time. >> I wasn't directly involved, but part of my data movement team was, and we had to change the logic overnight. 
So it was on a Friday night it was released, we pushed our first set of loans through, and then the logic changed from coming from the government, it changed and we had to redevelop our data movement pieces again, and we design them and send them back through. So it was definitely kind of scary, but we were completely successful. We hit a very high peak. Again, I don't know the exact number but it was in the thousands of loans, from little loans to very large loans and not one customer who applied did not get what they needed for, that was the right process and filled out the right amount. >> Well, that is an amazing story and really great support for the region, your Connecticut, the Boston area. So that's fantastic. I want to get into the rest of your story now. Let's start with some of the business drivers in banking. I mean, obviously online. A lot of people have sort of joked that many of the older people, who kind of shunned online banking would love to go into the branch and see their friendly teller had no choice, during this pandemic, to go to online. So that's obviously a big trend you mentioned, the data driven data warehouse, I want to understand that, but what at the top level, what are some of the key business drivers that are catalyzing your desire for change? >> The ability to give a customer, what they need at the time when they need it. And what I mean by that is that we have customer interactions in multiple ways. And I want to be able for the customer to walk into a bank or online and see the same format, and being able to have the same feel the same love, and also to be able to offer them the next best offer for them. But they're if they want looking for a new mortgage or looking to refinance, or whatever it is that they have that data, we have the data and that they feel comfortable using it. And that's an untethered banker. Attitude is, whatever my banker is holding and whatever the person is holding in their phone, that is the same and it's comfortable. So they don't feel that they've walked into the bank and they have to do fill out different paperwork compared to filling out paperwork on just doing it on their phone. >> You actually do want the experience to be better. And it is in many cases. Now you weren't able to do this with your existing I guess mainframe based Enterprise Data Warehouses. Is that right? Maybe talk about that a little bit? >> Yeah, we were definitely able to do it with what we have today the technology we're using. But one of the issues is that it's not timely. And you need a timely process to be able to get the customers to understand what's happening. You need a timely process so we can enhance our risk management. We can apply for fraud issues and things like that. >> Yeah, so you're trying to get more real time. The traditional EDW. It's sort of a science project. There's a few experts that know how to get it. You can so line up, the demand is tremendous. And then oftentimes by the time you get the answer, it's outdated. So you're trying to address that problem. So part of it is really the cycle time the end to end cycle time that you're progressing. And then there's, if I understand it residual benefits that are pretty substantial from a revenue opportunity, other offers that you can make to the right customer, that you maybe know, through your data, is that right? >> Exactly. It's drive new customers to new opportunities. It's enhanced the risk, and it's to optimize the banking process, and then obviously, to create new business. 
And the only way we're going to be able to do that is if we have the ability to look at the data right when the customer walks in the door or right when they open up their app. And by doing creating more to New York times near real time data, or the data warehouse team that's giving the lines of business the ability to work on the next best offer for that customer as well. >> But Paula, we're inundated with data sources these days. Are there other data sources that maybe had access to before, but perhaps the backlog of ingesting and cleaning in cataloging and analyzing maybe the backlog was so great that you couldn't perhaps tap some of those data sources. Do you see the potential to increase the data sources and hence the quality of the data or is that sort of premature? >> Oh, no. Exactly. Right. So right now, we ingest a lot of flat files and from our mainframe type of front end system, that we've had for quite a few years. But now that we're moving to the cloud and off-prem and on-prem, moving off-prem, into like an S3 Bucket, where that data we can process that data and get that data faster by using real time tools to move that data into a place where, like snowflake could utilize that data, or we can give it out to our market. Right now we're about we do work in batch mode still. So we're doing 24 hours. >> Okay. So when I think about the data pipeline, and the people involved, maybe you could talk a little bit about the organization. You've got, I don't know, if you have data scientists or statisticians, I'm sure you do. You got data architects, data engineers, quality engineers, developers, etc. And oftentimes, practitioners like yourself, will stress about, hey, the data is in silos. The data quality is not where we want it to be. We have to manually categorize the data. These are all sort of common data pipeline problems, if you will. Sometimes we use the term data Ops, which is sort of a play on DevOps applied to the data pipeline. Can you just sort of describe your situation in that context? >> Yeah, so we have a very large data ops team. And everyone that who is working on the data part of Webster's Bank, has been there 13 to 14 years. So they get the data, they understand it, they understand the lines of business. So it's right now. We could the we have data quality issues, just like everybody else does. But we have places in them where that gets cleansed. And we're moving toward and there was very much siloed data. The data scientists are out in the lines of business right now, which is great, because I think that's where data science belongs, we should give them and that's what we're working towards now is giving them more self service, giving them the ability to access the data in a more robust way. And it's a single source of truth. So they're not pulling the data down into their own, like Tableau dashboards, and then pushing the data back out. So they're going to more not, I don't want to say, a central repository, but a more of a robust repository, that's controlled across multiple avenues, where multiple lines of business can access that data. Is that help? >> Got it, Yes. And I think that one of the key things that I'm taking away from your last comment, is the cultural aspects of this by having the data scientists in the line of business, the lines of business will feel ownership of that data as opposed to pointing fingers criticizing the data quality. They really own that that problem, as opposed to saying, well, it's Paula's problem. 
>> Well, my problem is I have data engineers, data architects, database administrators, and traditional data reporting people. And some of the customers I have — business customers, lines of business — just want to subscribe to a report; they don't want to do any data science work. And we still have to provide that. So we still want to provide them some kind of regimen where they wake up in the morning, open up their email, and there's the report that they subscribed to — which is great, and it works out really well. And one of the reasons we purchased Io-Tahoe was that I would have the ability to give the lines of business the ability to do search within the data. We review the data flows and data redundancy and things like that, and it helps me clean up the data, and also helps the data analysts. Say they're asked for a certain report. It used to take four weeks: okay, we're going to go look at the data, and then we'll come back and tell you what we can do. But now, with Io-Tahoe, they're able to look at the data, and in one or two days they'll be able to go back and say: yes, we have the data, this is where it is, this is where we found it, and these are the data flows that we found. Which is what I call the birth of a column: where the column was created, where it went to live as a teenager (laughs), and then where it went to die, where we archive it. It's this cycle of life for a column, and Io-Tahoe helps us track that. Data lineage is done all the time, and it just takes a very long time — that's why we're using something that has AI and machine learning in it. It's accurate, and it does it the same way over and over again. If an analyst leaves, you're able to utilize something like Io-Tahoe to do that work for you. Does that help? >> Yeah, got it. So, a couple of things there. In researching Io-Tahoe, it seems like one of the strengths of their platform is the ability to visualize data — the data structure — and actually dig into it, but also see it. That speeds things up and gives everybody additional confidence. And then the other piece is essentially infusing AI, or machine intelligence, into the data pipeline — that's really how you're attacking automation. And you're saying it's repeatable, and then that helps the data quality, and you have this virtuous cycle. Maybe you could affirm that and add some color, perhaps. >> Exactly. So let's say that I have seven lines of business that are asking me questions, and one of the questions they'll ask is: we want to know if this customer is okay to contact. And there are different avenues — you can go online and say do not contact me, or you can go to the bank and say, I don't want email, but I'll take texts, and I want no phone calls. All that information. So seven different lines of business ask me that question in different ways. One says "okay to contact," another says "Customer 123" — all these variations. Before I got there, each project used to be siloed, so one request would be 100 hours of analytical work, and then another analyst would spend another 100 hours on the other project. Well, now I can do that all at once, and I can do those types of searches and say: yes, we already have that documentation.
Here it is, and this is where you can find where the customer has said, "No, I don't want to be contacted by email," or "I've subscribed to get emails from you." >> Got it. Okay. And then I want to go back to the cloud a little bit. So you mentioned S3 buckets — you're moving to the Amazon cloud, and I'm sure you're going to have a hybrid situation there — and you mentioned Snowflake. What was the decision to move to the cloud? Obviously, Snowflake is cloud only; there's not an on-prem version. So what precipitated that? >> Alright, so I've been in the data and IT field for the last 35 years. I started in the US Air Force and have moved on since then. My experience with Snowflake came from working with Bob Graham at GE Capital — and that's where I met up with the team from Io-Tahoe as well. So it's proven. There are a couple of things. One is Informatica, which is known worldwide for moving data. They have two products, on-prem and off-prem; I've used both, they're both great, and very stable. I'm comfortable with it, and other people are very comfortable with it, so we picked that for our batch data movement. We're moving toward HVR — it's not a final decision yet — but we're moving to HVR for real-time data, which is change data capture, moving it into the cloud. So envision this: you're in S3, and you have all the data you could possibly want — JSON, everything — sitting in S3, ready to move through into Snowflake. And Snowflake has proven stability; you only need to train your team on one thing. AWS is completely stable at this point too. So if you think about all these avenues, it goes: this is your data lake, which I would consider your S3 — even though it's not a traditional data lake like Hadoop that you can touch — then into Snowflake, and from Snowflake into sandboxes, so your lines of business and your data scientists can just dive right in. That's a big win. And then, using Io-Tahoe with its data automation and also its search engine, I have the ability to give the data scientists and data analysts a way to get accurate, complete information about the structure without needing to talk to IT. >> Yeah. So talking about Snowflake and getting up to speed quickly: I know from talking to customers you can get from zero to Snowflake very fast, and then it sounds like Io-Tahoe is sort of the automation layer for your data pipeline within the cloud. Is that the right way to think about it? >> I think so. Right now I have Io-Tahoe attached to my on-prem, and I want to attach it to my off-prem eventually. So I'm using Io-Tahoe data automation right now to bring in the data and to start analyzing the data flows, to make sure that I'm not missing anything and that I'm not bringing over redundant data. The data warehouse I'm working off of is on-prem. It's an Oracle database, and it's 15 years old, so it has extra data in it — things we don't need anymore — and Io-Tahoe is helping me shake out the extra data that does not need to be moved into my S3. So it's saving me money as I move from on-prem to off-prem. >> And so that was a challenge prior, because you couldn't get the lines of business to agree on what to delete — or what was the issue there?
>> Oh, it was more than that. Each line of business had their own structure within the warehouse, and then they were copying data between each other, duplicating the data, and using that. So there could be possibly three tables that have the same data in them, but used for different lines of business. Using Io-Tahoe, we have identified over seven terabytes in the last two months of data that is just repetitive — the same exact data, just sitting in a different schema. And that's not easy to find if you only understand the one schema that's reporting for your line of business. >> More bad news for the storage companies out there. (both laugh) So far. >> It's cheap. That's what we were telling people. >> And it's true, but you still would rather not waste it; you'd like to apply it to drive more revenue. And so, I guess, let's close on where you see this thing going. Again, I know you're partway through the journey. Maybe you could describe where you see the phases going and what you want to get out of this thing down the road — mid-term, longer term. What's your vision for your data-driven organization? >> I want the bankers to be able to walk around with an iPad in their hand and be able to access data for that customer really fast, and be able to give them the best deal that they can get. I want Webster to be right there on top, able to add new customers and to serve our existing customers — who may have had bank accounts since they were 12 years old and are now multi-product customers — and I want them to have the best experience with our bankers. >> That's awesome. That's really what I want as a banking customer: I want my bank to know who I am, anticipate my needs, and create a great experience for me — and then let me go on with my life. Great story. Love your experience, your background, and your knowledge. I can't thank you enough for coming on theCUBE. >> Now, thank you very much. And you guys have a great day. >> All right, take care. And thank you for watching, everybody. Keep right there. We'll take a short break and be right back. (gentle music)
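The redundancy Paula describes — the same table copied into three schemas, seven terabytes of repetitive data — can be flagged by fingerprinting table contents. A hedged sketch, assuming Snowflake's order-independent HASH_AGG aggregate and hypothetical table names; a tool like Io-Tahoe does this at far greater scale and with fuzzier matching:

```python
# Hedged sketch: flag tables whose full contents hash to the same fingerprint,
# using Snowflake's order-independent HASH_AGG aggregate. Table names are
# hypothetical stand-ins for the copies kept by different lines of business.
from collections import defaultdict

CANDIDATE_TABLES = [
    "RETAIL.CONTACT_PREFS",
    "MORTGAGE.CONTACT_PREFS",
    "WEALTH.CONTACT_PREFS",
]

def find_duplicate_tables(cursor):
    """Group tables by a hash of all their rows; any group of 2+ is redundant."""
    fingerprints = defaultdict(list)
    for table in CANDIDATE_TABLES:
        cursor.execute(f"SELECT HASH_AGG(*) FROM {table}")
        (fingerprint,) = cursor.fetchone()
        fingerprints[fingerprint].append(table)
    return [tables for tables in fingerprints.values() if len(tables) > 1]
```

Exact-hash matching only catches byte-identical copies; near-duplicates (renamed columns, trailing whitespace) are why the automated, ML-assisted approach discussed in the interview matters.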

Published Date : Jun 4 2020


Tiji Mathew, Patrick Zimet and Senthil Karuppaiah | Io-Tahoe Data Quality Active DQ


 

(upbeat music), (logo pop up) >> Narrator: From around the globe, it's theCUBE, presenting Active DQ, intelligent automation for data quality, brought to you by IO-Tahoe. >> Are you ready to see Active DQ on Snowflake in action? Let's get into the show-and-tell and do the demo. With me are Tiji Mathew, Data Solutions Engineer at IO-Tahoe; also joining us is Patrick Zimet, Data Solutions Engineer at IO-Tahoe, and Senthilnathan Karuppaiah, who's the Head of Production Engineering at IO-Tahoe. Patrick, over to you — let's see it. >> Hey Dave, thank you so much. Yeah, we've seen a huge increase in the number of organizations interested in a Snowflake implementation, looking for an innovative, precise, and timely method to ingest their data into Snowflake. And where we are seeing a lot of success is a ground-up method utilizing both IO-Tahoe and Snowflake. To start, you define your as-is model by leveraging IO-Tahoe to profile your various data sources and push the metadata to Snowflake. Meaning, we create a data catalog within Snowflake as a centralized location to document items such as source system owners, allowing you to have those key conversations and understand the data's lineage, potential blockers, and what data is readily available for ingestion. Once the data catalog is built, you have a much more dynamic strategy surrounding your Snowflake ingestion. And what's great is that while you're working through those key conversations, IO-Tahoe will maintain that metadata push, and — paired with Snowflake's ability to version the data — you can easily incorporate potential schema changes along the way, making sure that the information you're working on stays as current as the systems you're hoping to integrate with Snowflake. >> Nice. Patrick, I wonder if you could address how the IO-Tahoe platform scales, and maybe in what way it provides a competitive advantage for customers. >> Great question. Where IO-Tahoe shines is through its Active DQ, or the ability to monitor your data's quality in real time, marking which rows need remediation according to the customized business rules that you can set, ensuring that the data quality standards meet the requirements of your organization. What's great is, through our use of RPA, we can scale with an organization: as you ingest more data sources, we can allocate more robotic workers, meaning the results will continue to be delivered in the same timely fashion you've grown used to. What's more, IO-Tahoe is doing the heavy lifting on monitoring data quality. That frees up your data experts to focus on the more strategic tasks, such as remediation, data augmentation, and analytics development. >> Okay. Maybe, Tiji, you could address this: how does all this automation change the operating model we were talking to Aj and Dunkin about before? If it involves fewer people and more automation, what else can I do in parallel? >> I'm sure the participants today will also be asking the same question. Let me start with the strategic tasks. As Patrick mentioned, IO-Tahoe does the heavy lifting, freeing up data experts to act upon the data events generated by IO-Tahoe. Companies that have teams focused on manually building their inventory of the data landscape see longer turnaround times in producing actionable insights from their own data assets, thus diminishing the value realized by traditional methods.
However, our operating model involves profiling and remediating at the same time, creating a cataloged data estate that can be used by business or IT accordingly. With increased automation and fewer people, our machine learning algorithms augment the data pipeline to tag and capture the data elements into a comprehensive data catalog. As IO-Tahoe automatically catalogs the data estate in a centralized view, the data experts can instead focus on remediating the data events generated from validating against business rules. We envision that data events, coupled with this drillable and searchable view, will be a comprehensive way to assess the impact of bad-quality data. Let's briefly look at the image on screen. For example, the view indicates that bad-quality zip code data impacts the contact data, which in turn impacts other related entities and systems. Now contrast that with a manually maintained spreadsheet that drowns out the main focus of your analysis. >> Tiji, how do you tag and capture bad-quality data — you've mentioned these dependencies — and stop it from flowing downstream into the processes within the applications or reports? >> As IO-Tahoe builds the data catalog across source systems, we tag the elements that meet the business rule criteria, while segregating the failed data examples associated with the elements that fall below a certain threshold. The elements that meet the business rule criteria are tagged to be searchable, thus providing an easy way to identify data elements that may flow through the system. The segregated data examples, on the other hand, are used by data experts to triage for the root cause. Based on the root cause, potential outcomes could be: one, changes in the source system to prevent that data from entering the system in the first place; two, adding data pipeline logic to sanitize bad data from being consumed by downstream applications and reports; or simply accepting the risk of storing bad data and addressing it when it meets a certain threshold. However, Dave, as for your question about preventing bad-quality data from flowing into the system: IO-Tahoe will not prevent it, because the controls on data flowing between systems are managed outside of IO-Tahoe. IO-Tahoe will, though, alert and notify the data experts of events that indicate bad data has entered the monitored assets. Also, we have redesigned our product to be modular and extensible, which allows data events generated by IO-Tahoe to be consumed by any system that wants to protect its targets from bad data. Thus IO-Tahoe empowers the data experts to control the bad data flowing into their systems. >> Thank you for that. So, one of the things that we've noticed — we've written about it — is that you've got these hyper-specialized roles within the centralized data organization. How do the data folks get involved here, if at all, and how frequently do they get involved? Maybe, Senthilnathan, you could take that. >> Thank you, Dave, for having me here. Well, based on whether the data element in question is in the data cataloging or the monitoring phase, different data folks get involved. When it is in the data cataloging stage, the data governance team, along with enterprise architecture or IT, is involved in setting up the data catalog. That includes identifying the critical data elements, business term identification, definition, documentation, data quality rules and data event setup, data domain and business line mapping, lineage tracking, source of truth,
and so on and so forth. It's typically a one-time setup: review, certify, then govern and monitor. When it is in the monitoring phase, during any data incident or data issue, IO-Tahoe broadcasts data signals to the relevant data folks to act on and remedy it as quickly as possible, and alerts the consumption team — it could be data science, analytics, or business ops — about a potential issue, so that they are aware and take the necessary preventative measures. Let me show you an example of a critical data element, going from the data quality dashboard view, to the lineage view, to the data 360-degree view, for a zip code conformity check. So in this case, the zip code did not meet the pass threshold during the technical data quality check and was identified as a non-compliant item, and a notification was sent to the IT folks. Clicking on the zip code will take us to the lineage view, to visualize the dependent systems — who are the producers and who are the consumers. And further drilling down will take us to the detailed view, where a lot of other information is presented to facilitate a root cause analysis and take it through to final closure. >> Thank you for that. So, Tiji, Patrick was talking about the as-is to to-be. I'm interested in how it's done now versus before. Do you need a data governance operating model, for example? >> Typically, a company that decides to make an inventory of its data assets would start out by manually building a spreadsheet, managed by the data experts of the company. What started as a draft gets baked into the operating model of the company. This leads to loss of collaboration, as each department makes a copy of the catalog for its specific needs. This decentralized approach leads to loss of uniformity, with each department having different definitions — which, ironically, needs a governance model for the data catalog itself. And as the spreadsheet grows in complexity, the skill level needed to maintain it also increases, leading to fewer and fewer people knowing how to maintain it. Above all, the content that took so much time and effort to build is not searchable outside of that spreadsheet document.
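The zip-code conformity check walked through above reduces to a rule-plus-threshold pattern. As a hedged sketch only — the regex, threshold, and alert wording below are illustrative assumptions, not Io-Tahoe's actual rule engine — it might look like this in Python:

```python
# Hedged sketch of a zip-code conformity rule: validate each record, compare
# the pass rate to a threshold, and emit a signal for the data folks.
# The regex, threshold, and message are illustrative assumptions.
import re

US_ZIP = re.compile(r"^\d{5}(-\d{4})?$")   # 12345 or 12345-6789
PASS_THRESHOLD = 0.98                      # hypothetical business rule

def check_zip_conformity(records):
    """records: iterable of dicts. Returns (passed, failed_rows)."""
    records = list(records)
    failed = [r for r in records
              if not US_ZIP.match(str(r.get("zip_code", "")))]
    pass_rate = 1 - len(failed) / max(len(records), 1)
    if pass_rate < PASS_THRESHOLD:
        # A real deployment would notify the governance team and link to the
        # lineage view of the producing and consuming systems.
        print(f"DATA SIGNAL: zip_code conformity {pass_rate:.1%} is below "
              f"{PASS_THRESHOLD:.0%}; {len(failed)} rows need remediation")
    return pass_rate >= PASS_THRESHOLD, failed
```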
Then the data assumption monitoring and reporting last but not the least the time saver is persisting the non-compliant records for every data quality run within the Snowflake cloud, along with remediation script. So that during any exceptions the respect to team members is not only alerted. But also supplied with necessary scripts and tools to perform remediation right from the IO-Tahoe's Active DQ. >> Very nice. Okay guys, thanks for the demo. Great stuff. Now, if you want to learn more about the IO-Tahoe platform and how you can accelerate your adoption of Snowflake book some time with a data RPA expert all you got to do is click on the demo icon on the right of your screen and set a meeting. We appreciate you attending this latest episode of the IO-Tahoe data automation series. Look, if you missed any of the content that's all available on demand. This is Dave Vellante theCUBE. Thanks for watching. (upbeat music)

Published Date : Apr 29 2021


Yusef Khan, Io Tahoe | Enterprise Data Automation


 

>> From around the globe, it's theCUBE, with digital coverage of Enterprise Data Automation, an event series brought to you by Io-Tahoe. >> Okay, everybody, we're back. We're talking about enterprise data automation. The hashtag is #DataAutomated, and we're going to really dig into data migrations. Data migrations are risky, they're time consuming, and they're expensive. Yusef Khan is here. He's the head of partnerships and alliances at Io-Tahoe, coming again from London. Hey, good to see you, Yusef. Thanks very much. >> Thank you. >> So your role is interesting. We're talking about data migrations, and you're the head of partnerships. What is your role specifically, and how is it relevant to what we're going to talk about today? >> I work with various businesses, such as cloud companies, systems integrators, and companies that sell operating systems and middleware — all of whom are often quite well embedded within a company's IT infrastructure and have existing relationships. Because what we do fundamentally makes migrating to the cloud easier and data migration easier, a lot of businesses are interested in partnering with us, and we're interested in partnering with them. >> So let's set up the problem a little bit, and then I want to get into some of the data. I said that migrations are risky, time consuming, and expensive. They're oftentimes a blocker for organizations to really get value out of data. Why is that? >> I think all migrations have to start with knowing the facts about your data. You can try and do this manually, but when you have an organization that may have been going for decades or longer, it will probably have a pretty large legacy data estate: everything from on-premise mainframes to stuff which is already in the cloud, and probably hundreds, if not thousands, of applications and potentially hundreds of different data stores. Now, their understanding of what they have is often quite limited, because you can try and draw manual maps, but they're outdated very quickly — every time the data changes, the manual map is out of date — and people obviously leave organizations over time, so the tribal knowledge that gets built up is limited as well. So you can try to map all of that manually — you might need a DBA or a business analyst to go in and explore the data for you — but doing that manually is very, very time consuming; it can take teams of people months and months. Or you can use automation, just like Webster Bank did with Io-Tahoe, and manage to do it with a relatively small team, in a timeframe of days. >> Yeah, we talked to Paula from Webster Bank — awesome discussion. So I want to dig into this migration. Let's pull up a graphic that shows what a typical migration project looks like. What you see here is very detailed — I know it's a bit of an eye test — but let me call your attention to some of the key aspects of this, and then, Yusef, I want you to chime in. At the top here, you see that area graph: that's operational risk for a typical migration project, and you can see the timeline and the milestones. That blue bar is the time to test, so you can see the second step, data analysis, taking 24 weeks — so, you know, very time consuming. We won't dig into the fine print in the middle, but there's some real good detail there. Go down to the bottom.
That's labor intensity at the bottom, and you can see high is that sort of brown, and you can see that data analysis, data staging, data prep, the trial, the implementation, post-implementation fixes, and the transition to BAU — business as usual — are all very labor intensive. So what are your takeaways from this typical migration project? What do we need to know, Yusef? >> I think the key thing is, when you don't understand your data up front, it's very difficult to scope and set up a project, because you go to business stakeholders and decision makers and you say: okay, we want to migrate these data stores, we want to put them in the cloud, most often — but actually, you probably don't know how much data is there. You don't necessarily know how many applications it relates to. You don't know the relationships between the data, and you don't know the flow of the data — the direction in which the data is going between different data stores and tables. So you start from a position of pretty high risk, and to alleviate that risk you could be stacking the project team with lots and lots of people to do the next phase, which is analysis. So you set up a project which has a pretty high cost: bigger projects, more people, heavier governance, obviously. And then you're in the phase of trying to do lots and lots of manual analysis, which, as we all know, is the business of trying to relate data that's in different data stores — relating individual tables and columns — and is very, very time consuming and expensive. If you're hiring in resource from consultants or systems integrators externally, you might need to buy or use third-party tools, as I said earlier, and the people who understood some of those systems may have left a while ago. So you're in a high-risk, quite high-cost situation from the off, and the same issues persist through the project. What we do with Io-Tahoe is automate a lot of this process from the very beginning, because we can do the initial data discovery run, for example, automatically. You very quickly have automated validation, and a data map of the data flows is generated automatically — much less time and effort, and much less cost, ultimately. >> Okay, so I want to bring back that first chart, and I want to call your attention again to the area graph — the blue bars — and then, down below, that labor intensity. And now let's bring up the same chart, but with automation injected. So you now see the project accelerated by Io-Tahoe. Okay, great. And we're going to talk about this, but look what happens to the operational risk: a dramatic reduction in that graph. And then look at those blue bars: data analysis went from 24 weeks down to four weeks. And then look at the labor intensity — data analysis, data staging, data prep, trial, post-implementation fixes, and transition to BAU all went from high labor intensity to low labor intensity. Explain how that magic happened. >> Take the example of a data catalog. Every large enterprise wants to have some kind of repository where they put all their understanding about their data — an enterprise data catalog, if you like. Imagine trying to do that manually. You need to go into every individual data store.
You need a DBA or a business analyst for each data store. They need to extract the data tables individually, and they need to cross-reference that with other data stores and schemas and tables. You'd probably produce the mother of all Excel spreadsheets — it would be a very, very difficult exercise to do. In fact, one of our reflections as we automate lots of these things is that automation accelerates the ability to automate more. In some cases it also makes things possible for enterprise customers with legacy systems — take banks, for example. They quite often end up staying on mainframe systems that they've had in place for decades, not migrating away from them, because they're not able to actually do the work of understanding the data, deduplicating the data, deleting data that isn't relevant, and then confidently going forward to migrate. So they stay where they are, with all the attendant problems of systems that are out of support. Going back to the data catalog example: whatever you discover in data discovery has to persist in a tool like a data catalog. And so we automate data catalogs — it can be others', but we have our own. The only alternative to this kind of automation is to build out a very large project team of business analysts, DBAs, project managers, and process analysts, together with data people, to make sure the process of gathering data is correct, to put it in the repository, to validate it, etcetera, etcetera. We've gone into organizations and we've seen them ramp up teams of 20 to 30 people, at costs of £2 million to £4 million a year, on a timeframe of years, just to try and get a data catalog done. That's something we can typically do in a timeframe of months, if not weeks, and the difference is using automation. If you do what I've just described in the manual way, you make migrations to the cloud prohibitively expensive: whatever saving you might make from shutting down your legacy data stores will get eaten up by the cost of doing it — unless you go with the more automated approach. >> Okay, so the automated approach reduces risk because you're going to stay on the project plan, ideally. It's all these out-of-scope surprises that come up with the manual processes that kill you in the rework. And then there's that data catalog: people are afraid that their family-jewels data is not going to make it through to the other side. So that's something you're addressing, and then you're also not boiling the ocean — you're really taking the pieces that are critical, and the stuff you don't need, you don't have to pay for. >> That's a very good point. One of the other things that we do — and we have specific features to do it — is automatically analyze data for duplication at a row or record level, and redundancy at a column level. So, as you say, before you go into a migration process, you can understand: actually, this stuff is replicated, we don't need it. Quite often, if you put data in the cloud, you're paying for storage as well as compute time, so any duplicated data in there is pure cost that you should take out before you migrate. Again, if you're trying to do that process of understanding what's duplicated manually, across tens or hundreds of data stores, it takes months, if not years. Use machine learning to do it in an automatic way, and it's much, much quicker.
Then there are the costs and benefits of Io-Tahoe. Every organization we work with has a lot of existing sunk cost in their IT — whether ERP systems like Oracle, or data lakes — which they've spent good time and money investing in. What we do, by enabling them to transition everything to the strategic future repositories, is accelerate the value of that investment and the time to value of that investment. So we're trying to help people get value out of their existing investments in their data estate, and close down the things that they don't need, to enable them to go to a brighter future.
Harrison thoughts on where the sensitive data is because it's automated because it's running algorithms stater on. That's what they were really to expect. >>Um, >>and and you know this because you're dealing with the ecosystem. We're entering a new era of data and many organizations to your point, they just don't have the resources to do what Google and Amazon and Facebook and Microsoft did over the past decade To become data dominant trillion dollar market cap companies. Incumbents need to rely on technology companies to bring that automation that machine intelligence to them so they can apply it. They don't want to be AI inventors. They want to apply it to their businesses. So and that's what really was so difficult in the early days of so called big data. You have this just too much complexity out there, and now companies like Iot Tahoe or bringing your tooling and platforms that are allowing companies to really become data driven your your final thoughts. Please use it. >>That's a great point, Dave. In a way, it brings us back to where it began. In terms of partnerships and alliances. I completely agree with a really exciting point where we can take applications like Iot. Uh, we can go into enterprises and help them really leverage the value of these type of machine learning algorithms. And and I I we work with all the major cloud providers AWS, Microsoft Azure or Google Cloud Platform, IBM and Red Hat on others, and we we really I think for us. The key thing is that we want to be the best in the world of enterprise data automation. We don't aspire to be a cloud provider or even a workflow provider. But what we want to do is really help customers with their data without automated data functionality in partnership with some of those other businesses so we can leverage the great work they've done in the cloud. The great work they've done on work flows on virtual assistants in other areas. And we help customers leverage those investments as well. But our heart, we really targeted it just being the best, uh, enterprised data automation business in the world. >>Massive opportunities not only for technology companies, but for those organizations that can apply technology for business. Advantage yourself, count. Thanks so much for coming on the Cube. Appreciate. All right. And thank you for watching everybody. We'll be right back right after this short break. >>Yeah, yeah, yeah, yeah.

Published Date : Jun 23 2020


Enterprise Data Automation | Crowdchat


 

>>From around the globe, it's theCUBE, with digital coverage of enterprise data automation, an event series brought to you by Io-Tahoe. Welcome, everybody, to Enterprise Data Automation, a co-created digital program on theCUBE with support from Io-Tahoe. My name is Dave Volante, and today we're using the hashtag #DataAutomated. You know, organizations really struggle to get more value out of their data; time to data-driven insights that drive cost savings or new revenue opportunities simply takes too long. So today we're going to talk about how organizations can streamline their data operations through automation, machine intelligence, and really simplifying data migrations to the cloud. We'll be talking to technologists, visionaries, hands-on practitioners and experts who are not just talking about streamlining their data pipelines, they're actually doing it. So keep it right there. We'll be back shortly with Ajay Vohora, the CEO of Io-Tahoe, to kick off the program. You're watching theCUBE, the leader in digital global coverage. We're right back after this short break. Innovation, impact, influence. Welcome to theCUBE. Disruptors, developers and practitioners learn from the voices of leaders who share their personal insights from the hottest digital events around the globe. Enjoy the best this community has to offer on theCUBE, your global leader in high tech digital coverage. From around the globe, it's theCUBE, with digital coverage of enterprise data automation, an event series brought to you by Io-Tahoe. Okay, we're back. Welcome back to Data Automated. Ajay Vohora is CEO of Io-Tahoe. Ajay, good to see you. How are things in London? >>Thanks, doing well. The customers that I speak to day in, day out, that we partner with, they're busy adapting their businesses to serve their customers. It's very much a game of ensuring that we can serve our customers to help their customers. The adaptation that's happening here is trying to be more agile, to be more flexible. There's a lot of pressure on data, a lot of demand on data, to deliver more value to the business, to serve that customer. >>As I said, we've been talking about DataOps a lot, the idea being DevOps applied to the data pipeline. But talk about enterprise data automation. What is it to you, and how is it different from DataOps? >>DevOps, you know, has been great for breaking down those silos between different roles and functions and bringing people together to collaborate. And we definitely see those tools, those methodologies, those processes, that kind of thinking, lending itself to data, which is exciting. What we look to do is build on top of that with data automation. It's the nuts and bolts of the algorithms, the models behind machine learning, the functions; that's where we invest our R&D, bringing that in to build on top of the methods and ways of thinking that break down those silos, and injecting that automation into the business processes that are going to drive a business to serve its customers. It's a layer beyond DevOps and DataOps; the way I think about it is as the automation behind a new dimension. We've come a long way in the last few years. We started out by automating some of those simple data-related tasks that are easy to codify but have a high impact on an organization, across the data estate, in a cost-effective way: tasks that classify data. A lot of our original patents, and the value we've built up, are very much around that. >>Love to get into the tech a little bit in terms of how it works, and I think we have a graphic here that gets into that a little bit. So, guys, if you could bring that up. >>Sure. Right there in the middle, at the heart of what we do, is the intellectual property that we've built up over time. It takes heterogeneous data sources, your Oracle relational database, SQL Server, your mainframe, the data lake, and increasingly APIs and devices that produce data, and it creates the ability to automatically discover that data and classify that data. After it's classified, it can form relationships across those different source systems, silos, and lines of business, and once we've automated that, we can start to do some cool things, putting context and meaning around that data. It's moving organizations toward being data driven, and increasingly, where we have really smart people in our customer organizations who want to do those advanced knowledge tasks, data scientists and, yeah, quants in some of the banks that we work with, the onus is then on putting everything we've done there with automation, classifying the data, understanding its relationships and its quality, the policies that you can apply to it, and putting it in context. Once a professional using data is able to put that data in context and search across the entire enterprise estate, then they can start to do some exciting things and piece together the tapestry, that fabric, across the different systems. It could be a CRM or an ERP system such as SAP, and some of the newer cloud databases that we work with; Snowflake is a great example. If I look back maybe five years ago, we had a prevalence of data lake technologies at the cutting edge, and those are converging into some of the cloud platforms that we work with, Google and AWS. And I think, very much as you said, those manual attempts to grasp such a complex challenge at scale quickly run out of steam, because once you've got your fingers on the details of what's in your data estate, it's changed. You've onboarded a new customer, you've signed up a new partner, a customer has adopted a new product that you've just launched, and that slew of data keeps coming. So it's keeping pace with that; the only answer really is some form of automation. >>You're working with AWS, you're working with Google, you've got Red Hat, IBM as partners. What is attracting those folks to your ecosystem, and give us your thoughts on the importance of ecosystem. >>That's fundamental. When I came in as CEO, one of the trends that I wanted us to be part of was being open, having an open architecture. That was close to my heart as a former CIO, because as a CIO you've got a budget and a vision, and you've already made investments into your organization, and some of those are pretty long-term bets. They could be going out five to ten years, sometimes, with a CRM system, training up your people, getting everybody working together around a common business platform.
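To make that discover-and-classify step concrete, here is a minimal sketch of rule-based column classification in Python. The class names, patterns, sample size and threshold are illustrative assumptions for this sketch, not Io-Tahoe's actual rules, which combine learned models with far richer signals.

```python
import re

# Illustrative patterns for a few sensitive-data classes; a real platform
# combines rules like these with models trained on values, names and context.
PATTERNS = {
    "EMAIL": re.compile(r"^[\w.+-]+@[\w-]+\.[\w.]+$"),
    "US_SSN": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "PHONE": re.compile(r"^\+?[\d\s().-]{7,15}$"),
}

def classify_column(values, threshold=0.8, sample_size=1000):
    """Tag a column with a class if most sampled values match its pattern."""
    sample = [str(v).strip() for v in values if v is not None][:sample_size]
    if not sample:
        return None
    for label, pattern in PATTERNS.items():
        hits = sum(1 for v in sample if pattern.match(v))
        if hits / len(sample) >= threshold:
            return label
    return None

# Classify every column of a table pulled from any source system.
table = {
    "contact": ["ann@example.com", "bob@example.com", "cy@example.com"],
    "advisor": ["Ann Lee", "Bob Moore", "Cy Hart"],
}
print({col: classify_column(vals) for col, vals in table.items()})
# {'contact': 'EMAIL', 'advisor': None}
```

Once columns carry tags like these, relationship inference and policy application can key off the tags rather than raw schemas.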
What I wanted to ensure is that we could openly integrate, using the APIs that were available, to leverage the investment and the cost that has already gone into managing an organization's IT and serving its business users. So part of the reason we've been able to be successful with partners like Google and AWS, and increasingly a number of technology players, Red Hat, MongoDB is another one where we're doing a lot of good work, and Snowflake, is that those investments have been made by the organizations that are our customers, and we want to make sure we're adding to that, and that they're leveraging the value they've already committed to. >>Yeah, and maybe you could give us some examples of the ROI and the business impact. >>Yeah, the ROI, David, is built upon the three things I mentioned. It's a combination of leveraging the existing investment in the existing estate, whether that's on Microsoft Azure or AWS or Google or IBM, and putting that to work, because the customers we work with have made those choices. On top of that, it's ensuring that we have automation working right down to the level of the data, at the column level or the file level; we don't just deal with metadata, we get very specific, down to the most granular level. So as we run our processes and the automation, classification, tagging, applying policies from across the different compliance and regulatory needs an organization has to the data, everything that then happens downstream from that is ready to serve a business outcome. You can run those processes within hours of getting started, build that picture, visualize it, and bring it to life. The ROI that comes right off the bat is finding data that should have been deleted, data that existed in copies, and being able to give the architects, whether we're working on GCP or a migration to any other cloud such as AWS, or a multi-cloud landscape, that view right off the bat. >>Ajay, thanks so much for coming on theCUBE and sharing your insights and your experiences. It's great to have you. >>Thank you, David. Look forward to speaking again. >>Now we want to bring in the customer perspective. We have a great conversation with Paul Damico, senior vice president of data architecture at Webster Bank. So keep it right there. >>Io-Tahoe: data automated. Improve efficiency, drive down costs and make your enterprise data work for you. We're on a mission to enable our customers to automate the management of data to realize maximum strategic and operational benefits. We envisage a world where data users consume accurate, up-to-date, unified data distilled from many silos to deliver transformational outcomes. Activate your data and avoid manual processing. Accelerate data projects by enabling non-IT resources and data experts to consolidate, categorize and master data. Automate your data operations. Power digital transformations by automating a significant portion of data management through human-guided machine learning. Get value from the start: increase the velocity of business outcomes with complete, accurate data curated automatically for data visualization tools and analytic insights. Improve the security and quality of your data. Data automation improves security by reducing the number of individuals who have access to sensitive data, and it can improve quality; many companies report double-digit error reduction in data entry and other repetitive tasks.
Trust the way data works for you. Data automation by Io-Tahoe learns as it works and can augment business user behavior. It learns from exception handling and scales up or down as needed to prevent system or application overloads or crashes. It also allows innate knowledge to be socialized rather than individualized: no longer will your company struggle when the employee who knows how a report is done retires or takes another job; the work continues on without the need for detailed information transfer. Continue supporting the digital shift. Perhaps most importantly, data automation allows companies to begin making moves toward a broader, more aspirational transformation, on a small scale that is easy to implement and manage and delivers quick wins. Digital is the buzzword of the day, but many companies recognize that it is a complex strategy that requires time and investment. Once you get started with data automation, the digital transformation is initiated, and leaders and employees alike become more eager to invest time and effort in a broader digital transformation agenda. >>Everybody, we're back, and this is Dave Volante, and we're covering the whole notion of automating data in the enterprise. I'm really excited to have Paul Damico here. She's a senior vice president of enterprise data architecture at Webster Bank. Good to see you, thanks for coming on. >>Nice to see you too, yes. >>So let's start with Webster Bank. You guys are kind of a regional, I think New York, New England, headquartered out of Connecticut, but tell us a little bit about the bank. >>Yeah, Webster Bank is regional: Boston, and again, New York, very focused on Westchester and Fairfield County. They're a really highly rated regional bank for this area. They hold quite a few awards for being supportive of the community, and they're really moving forward technology-wise. Currently we have a small group that is just working toward moving into a more futuristic, more data-driven data warehouse. That's our first item. And then the other item is to drive new revenue by anticipating what customers do when they go to the bank, or when they log in, to be able to give them the best offer. The only way to do that is to have timely, accurate, complete data on the customer and what's really of great value to offer them. >>At the top level, what are some of the key business drivers catalyzing your desire for change? >>The ability to give the customer what they need at the time when they need it. What I mean by that is that we have customer interactions in multiple ways, right? And I want the customer to be able to walk into a bank, or go online, and see the same format, to have the same feel, the same look, and also to be offered the next best offer for them. >>Part of it is really the cycle time, the end-to-end cycle time that you're compressing, and then there's, if I understand it, residual benefits that are pretty substantial from a revenue opportunity. >>Exactly. It's to drive new customers to new opportunities, it's to enhance risk management, and it's to optimize the banking process and then, obviously, to create new business. And the only way we're going to be able to do that is if we have the ability to look at the data right when the customer walks in the door or right when they open up their app.
>>Do you see the potential to increase the data sources, and hence the quality of the data, or is that sort of premature? >>Oh no, exactly right. Right now we ingest a lot of flat files from the mainframe-type legacy systems that we've had for quite a few years. But now that we're moving to the cloud and off-prem, moving data off-prem into, like, an S3 bucket, we can process that data and get at it faster, using real-time tools to move it into a place like Snowflake where we can utilize it or give it out to our marketplace. The data scientists are out in the lines of business right now, which is great, because I think that's where data science belongs, and what we're working toward now is giving them more self-service, giving them the ability to access the data in a more robust way, from a single source of truth, so they're not pulling the data down into their own Tableau dashboards and then pushing the data back out. I have eight engineers, data architects and database administrators, plus traditional data warehousing people. And some customers that I have, business customers in the lines of business, just want to subscribe to a report; they don't want to go out and do any data science work, and we still have to provide that. So we still want to provide them some kind of regular regimen where they wake up in the morning, open up their email, and there's the report they subscribed to, which is great, and it works out really well. And one of the reasons we purchased Io-Tahoe was to have the ability to give the lines of business the ability to search within the data, and to review the data flows and data redundancy and things like that, to help me clean up the data, and also to give it to the data analysts. It used to be that they'd be asked for a certain report and say, okay, we'll take four weeks, we're going to go look at the data, and then we'll come back and tell you what we can do. But now, with Io-Tahoe, they're able to look at the data and, in one or two days, go back and say, yes, we have the data, this is where it is, and these are the data flows that we've found. And also, what I call it is the birth of a column: where the column was created, where it went live as a teenager, and then where it went to die, the archive.
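A hedged sketch of the kind of pipeline Paul describes: land a mainframe flat-file extract in S3, then load it into Snowflake. The bucket, stage, table names and credentials are invented for illustration; a production load would run under an orchestrator with error handling and secured credentials.

```python
import boto3
import snowflake.connector  # pip install snowflake-connector-python

# Stage the nightly flat-file extract in S3 (names are illustrative).
s3 = boto3.client("s3")
s3.upload_file("extracts/customers.csv", "bank-landing-zone",
               "mainframe/customers.csv")

# Load it into Snowflake via an external stage assumed to point at the bucket.
conn = snowflake.connector.connect(
    account="example_account", user="loader", password="...",
    warehouse="LOAD_WH", database="CORE", schema="STAGING",
)
try:
    conn.cursor().execute("""
        COPY INTO staging_customers
        FROM @mainframe_stage/customers.csv
        FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
    """)
finally:
    conn.close()
```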
>>In researching Io-Tahoe, it seems like one of the strengths of their platform is the ability to visualize data, the data structure, and actually dig into it, but also see it, and that speeds things up and gives everybody additional confidence. And then the other piece is essentially infusing AI or machine intelligence into the data pipeline; that's really how you're attacking automation, right? >>Exactly. So let's say that I have seven core lines of business that are asking me questions, and one of the questions they'll ask me is, we want to know if this customer is okay to contact, right? And there are different avenues: you can go online and say do not contact me, or you can go to the bank and say, I don't want email, but I'll take texts and I want phone calls, all that information. Seven different lines of business asked me that question in different ways, and each project, before I got there, used to be siloed. So it would be 100 hours for one analyst to do that analytical work on one project, and then another analyst would do another 100 hours on the other project. Well, now I can do that all at once, and I can do those types of searches and say, yes, we already have that documentation, here it is, and this is where you can find where the customer has said, you know, I don't want to get emails from you, or I've subscribed to get emails from you. I'm using Io-Tahoe's automation right now to bring in the data and to start analyzing the data flows, to make sure that I'm not missing anything and that I'm not bringing over redundant data. The data warehouse that I'm working off is on-prem, an Oracle database, and it's 15 years old, so it has extra data in it; it has things that we don't need anymore, and Io-Tahoe is helping me shake out that extra data that does not need to be moved into my S3. So it's saving me money when I'm moving from on-prem to off-prem. >>What's your vision for your data-driven organization? >>I want the bankers to be able to walk around with an iPad in their hands and be able to access data for that customer really fast, and be able to give them the best deal that they can get. I want Webster to be right there on top, able to add new customers and to serve our existing customers, who may have had bank accounts there since they were 12 years old and now are, you know, multi-whatever. I want them to have the best experience with our bankers. >>That's really what I want as a banking customer: I want my bank to know who I am, anticipate my needs, and create a great experience for me, and then let me go on with my life. So that's a great story. Love your experience, your background and your knowledge. Can't thank you enough for coming on theCUBE. >>No, thank you very much, and you guys have a great day. >>Next, we'll talk with Lester Waters, who's the CTO of Io-Tahoe. Lester takes us through the key considerations of moving to the cloud.
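Paul's okay-to-contact example boils down to normalizing the same fact from several line-of-business systems into one answer per customer. Below is a toy sketch with pandas; the table shapes and the everyone-must-agree consent rule are assumptions made for illustration.

```python
import pandas as pd

# Toy contact-preference tables from two lines of business; in reality these
# live in different systems and are found via automated discovery.
retail = pd.DataFrame({"cust_id": [1, 2], "email_ok": [True, False]})
lending = pd.DataFrame({"cust_id": [2, 3], "email_opt_in": ["N", "Y"]})

# Normalize each source to the same shape before combining.
retail_n = retail.rename(columns={"email_ok": "email_consent"})
lending_n = lending.assign(email_consent=lending["email_opt_in"].eq("Y"))[
    ["cust_id", "email_consent"]
]

# One answer per customer: consent only if every line of business agrees.
merged = pd.concat([retail_n, lending_n])
single_view = merged.groupby("cust_id")["email_consent"].all().reset_index()
print(single_view)
```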
>>The entire platform. Automated data discovery: data discovery is the first step to knowing your data; auto-discover data across any application on any infrastructure, and identify all unknown data relationships across the entire siloed data landscape. Smart data catalog: know how everything is connected, understand everything in context, regain ownership and trust in your data, and maintain a single source of truth across cloud platforms, SaaS applications, reference data and legacy systems; empower business users to quickly discover and understand the data that matters to them, with a smart data catalog continuously updated, ensuring business teams always have access to the most trusted data available. Automated data mapping and linking: automate the identification of unknown relationships within and across data silos throughout the organization. Build your business glossary automatically, using in-house common business terms, vocabulary and definitions. Discover relationships as connections or dependencies between data entities such as customer, account, address and invoice, where these data entities have many discovery properties at a granular level. Data signals dashboards: get up-to-date feeds on the health of your data for faster, improved data management; see trends, view history, compare versions, and get accurate and timely visual insights from across the organization. Automated data flows: automatically capture every data flow to locate all the dependencies across systems, visualize how they work together collectively, and know who within your organization has access to data. Understand the source and destination for all your business data, with comprehensive data lineage constructed automatically during the data discovery phase and results continuously loaded into the smart data catalog. Active, automated data quality assessments ensure data is fit for consumption and meets the needs of enterprise data users, keeping information about the current data quality state readily available for faster, improved decision making. Data policy governance: automate data governance end to end over the entire data lifecycle, with automation, instant transparency and control; automate data policy assessments, with glossaries, metadata and policies for sensitive data discovery that automatically tag, link and annotate with metadata to provide enterprise-wide search for all lines of business. Self-service knowledge graph: digitize and search your enterprise knowledge, turning multiple siloed data sources into machine-understandable knowledge, and from a single data canvas search and explore data content across systems, including ERP, CRM, billing systems and social media, to fuel data pipelines. >>Yeah, we're focusing on enterprise data automation, and we're going to talk about the journey to the cloud. Remember, the hashtag is #DataAutomated, and we're here with Lester Waters, who's the CTO of Io-Tahoe. Give us a little background, CTO; you've got deep, deep expertise in a lot of different areas, but what do we need to know? >>Well, David, I started my career basically at Microsoft, where I started the information security cryptography group, the very first one that the company had, and that led to a career in information security. And of course, as you go along with information security, data is the key element to be protected. So I always had my hands in data, and that naturally progressed into a role at Io-Tahoe as their CTO. >>What's the prescription for that automation journey and simplifying that migration to the cloud? >>Well, I think the first thing is understanding what you've got: discovering and cataloging your data and your applications. If I don't know what I have, I can't move it, I can't improve it, I can't build upon it, and I have to understand the dependencies. So building that data catalog is the very first step: what have I got? >>Okay, so we've done the audit, we know what we've got; what's next? Where do we go next? >>So the next thing is remediating that data. Where do I have duplicate data? Oftentimes in an organization, data will get duplicated: somebody will take a snapshot of the data and then end up building a new application, which suddenly becomes dependent on that data. So it's not uncommon for an organization to have 20 master instances of a customer, and you can see where that will go; trying to keep all that stuff in sync becomes a nightmare all by itself. So you want to understand where all your redundant data is, and when you go to the cloud, maybe you have an opportunity there to consolidate that data.
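A minimal sketch of the duplicate-and-redundancy profiling Lester describes, using pandas. Real tooling does this across many stores with fuzzy matching; the single-table version of the idea, under those simplifying assumptions, looks like this.

```python
import pandas as pd

def profile_redundancy(df: pd.DataFrame):
    """Flag duplicated records and columns that add no information."""
    dup_rows = int(df.duplicated().sum())  # exact row-level duplicates
    constant_cols = [c for c in df.columns if df[c].nunique(dropna=False) <= 1]
    # Column pairs whose values match row for row: likely redundant copies.
    copies = [(a, b)
              for i, a in enumerate(df.columns)
              for b in df.columns[i + 1:]
              if df[a].equals(df[b])]
    return dup_rows, constant_cols, copies

df = pd.DataFrame({
    "cust_id":   [1, 2, 2, 3],
    "cust_code": [1, 2, 2, 3],   # snapshot copy of cust_id
    "region":    ["NE", "NE", "NE", "NE"],
})
print(profile_redundancy(df))
# (1, ['region'], [('cust_id', 'cust_code')])
```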
>>Then what? You figure out what to get rid of, or actually get rid of it. What's next? >>Yes, that would be the next step: figure out what you need and what you don't need. Oftentimes I've found that there are obsolete columns of data in your databases that you just don't need, or maybe they've been superseded by others; you've got tables that have been superseded by other tables in your database. So you've got to understand what's being used and what's not, and from that you can decide: I'm going to leave this stuff behind, I'm going to archive this stuff because I might need it for data retention, or I'm just going to delete it because it isn't needed at all. >>We're plowing through your steps here. What's next on the journey? >>The next one is, in a nutshell: preserve your data format. Don't boil the ocean here, to use a cliche. You want to do a certain degree of lift and shift, because you've got application dependencies on that data and on the data format, the tables in which it sits, the columns and the way they're named. So to some degree you are going to be doing a lift and shift, but it's an intelligent lift and shift. >>The data lives in silos, so how do you deal with that problem? Is that part of the journey? >>That's a great point, because the data silos happen because this business unit is chartered with this task and another business unit has that task, and that's how you get those instantiations of the same data occurring in multiple places. So as part of your cloud migration, you really want a plan for where there's an opportunity to consolidate your data, because that means it will be less to manage, there will be less data to secure, and it will have a smaller footprint, which means reduced costs. >>Maybe you could address data quality; where does that fit in on the journey? >>That's a very important point. First of all, you don't want to bring your legacy issues with you. As the point I made earlier, if you've got data quality issues, this is a good time to find, identify and remediate them. But that can be a laborious task; you could accomplish it manually, but it would take a lot of work. So the opportunity to use tools and automate that process really will help you find those outliers. >>What's next? I think we're through, I think I've counted six. What's the lucky seven? >>Lucky seven: involve your business users. Really, when you think about it, your data is in silos, and part of this migration to the cloud is an opportunity to break down those silos. The silos that naturally occur are cultural barriers that sometimes exist between business units, and you've got to break those down. For example, I always advise that there's an opportunity here to consolidate your sensitive data, your PII, personally identifiable information; if three different business units have the same data, there's an opportunity to consolidate that into one source of truth. >>Well, great advice, Lester, thanks so much. I mean, it's clear that the capex investments in data centers are generally not a good investment for most companies. Really appreciate it, Lester Waters, CTO of Io-Tahoe. Let's watch this short video and we'll come right back.
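Lester's point about automating quality checks can be sketched as a handful of generated rules. The thresholds below (five percent nulls, three-sigma outliers) are arbitrary assumptions for the sketch; a platform would learn them or let users tune them.

```python
import pandas as pd

def quality_report(df: pd.DataFrame) -> dict:
    """Run simple, auto-generated quality rules over every column."""
    report = {}
    for col in df.columns:
        issues = []
        null_ratio = df[col].isna().mean()
        if null_ratio > 0.05:
            issues.append(f"{null_ratio:.0%} nulls")
        if pd.api.types.is_numeric_dtype(df[col]):
            s = df[col].dropna()
            if len(s) > 1 and s.std() > 0:
                n_out = int(((s - s.mean()).abs() > 3 * s.std()).sum())
                if n_out:
                    issues.append(f"{n_out} values beyond three sigma")
        report[col] = issues or ["ok"]
    return report
```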
>>Use cases. Data migration: accelerate digitization of the business by providing automated data migration workflows that save time in achieving project milestones; eradicate operational risk and minimize labor-intensive manual processes that demand costly overhead. Data quality: navigate the data swamp and re-establish trust in the data to enable data science and data analytics. Data governance: ensure that business and technology understand critical data elements and have control over the enterprise data landscape. Data analytics enablement: data discovery to enable data scientists and data analytics teams to identify the right data set through self-service, for business demands or analytical reporting that ranges from advanced to complex. Regulatory compliance: government-mandated data privacy requirements such as GDPR, CCPA, ePR and HIPAA. Data lake management: identify lake contents, clean up, and manage ongoing activity. Data mapping and knowledge graph: create business knowledge graph models on enterprise data, with automated mapping to a specific ontology, enabling semantic search across all sources in the data estate. DataOps: scale, as a foundation to automate data management processes. >>Are you interested in test-driving the Io-Tahoe platform? Kickstart the benefits of data automation for your business through the Io-Tahoe Labs program: a flexible, scalable sandbox environment on the cloud of your choice, with setup, service and support provided by Io-Tahoe. Click on the link and connect with a data engineer to learn more and see Io-Tahoe in action. >>Everybody, we're back. We're talking about enterprise data automation. The hashtag is #DataAutomated, and we're going to really dig into data migrations. Data migrations are risky, they're time consuming and they're expensive. Yusef Khan is here, he's the head of partnerships and alliances at Io-Tahoe, coming again from London. Hey, good to see you, Yusef, thanks very much. >>Thank you. >>So let's set up the problem a little bit, and then I want to get into some of the data. You said that migrations are risky, time consuming, expensive; they're oftentimes a blocker for organizations to really get value out of data. Why is that? >>I think, I mean, all migrations have to start with knowing the facts about your data, and you can try and do this manually, but when you have an organization that may have been going for decades or longer, it will probably have a pretty large legacy data estate. So it'll have everything from on-premise mainframes, it may have stuff which is already in the cloud, but it will probably have hundreds, if not thousands, of applications and potentially hundreds of different data stores. >>So I want to dig into this migration, and let's pull up a graphic that will talk about what a typical migration project looks like. So what you see here, it's very detailed, I know it's a bit of an eye test, but let me call your attention to some of the key aspects of this, and then, Yusef, I want you to chime in. So at the top here, you see that area graph; that's operational risk for a typical migration project, and you can see the timeline and the milestones. That blue bar is the time to test, so you can see the second step, data analysis, is 24 weeks, so very time consuming. And then, not to dig into the stuff in the middle of the fine print, but there's some real good detail there. But go down to the bottom.
That's labor intensity at the bottom, and you can see high is that sort of brown, and you can see that a number of phases, data analysis, data staging, data prep, the trial, the implementation, post-implementation fixtures, and the transition to BAU, which I think is business as usual, are all very labor intensive. >>The key thing is, when you don't understand your data upfront, it's very difficult to scope and set up a project, because you go to business stakeholders and decision makers and you say, okay, we want to migrate these data stores, we want to put them in the cloud most often, but actually you probably don't know how much data is there, you don't necessarily know how many applications it relates to, you don't know the relationships between the data, you don't know the flow of the data, so the direction in which the data is going between different data stores and tables. So you start from a position of pretty high risk, and to alleviate that risk you stack your project team with lots and lots of people to do the next phase, which is analysis, and so you've set up a project which has got a pretty high cost. The bigger the project, the more people, the heavier the governance, obviously, and then comes the phase of trying to do lots and lots of manual analysis. Manual processes, as we all know, and the work of trying to relate data that's in different data stores, relating individual tables and columns, are very time consuming and expensive. If you're hiring in resource from consultants or systems integrators externally, you might need to buy or use third-party tools. As I said earlier, the people who understand some of those systems may have left a while ago, so you're in a high-risk, high-cost situation from the off, and the same thing develops through the project. What we're able to do with Io-Tahoe is automate a lot of this process from the very beginning, because we can do the initial data discovery run, for example, automatically. So you very quickly have an automated view of the data: a data map and the data flows that have been generated automatically, with much less time and effort and much less cost.
I mean, in fact, one of our reflections as we automate lots of data lots of these things is, um it accelerates the ability to water may, But in some cases, it also makes it possible for enterprise customers with legacy systems take banks, for example. There quite often end up staying on mainframe systems that they've had in place for decades. I'm not migrating away from them because they're not able to actually do the work of understanding the data, duplicating the data, deleting data isn't relevant and then confidently going forward to migrate. So they stay where they are with all the attendant problems assistance systems that are out of support. You know, you know, the biggest frustration for lots of them and the thing that they spend far too much time doing is trying to work out what the right data is on cleaning data, which really you don't want a highly paid thanks to scientists doing with their time. But if you sort out your data in the first place, get rid of duplication that sounds migrate to cloud store where things are really accessible. It's easy to build connections and to use native machine learning tools. You well, on the way up to the maturity card, you can start to use some of the more advanced applications >>massive opportunities not only for technology companies, but for those organizations that can apply technology for business. Advantage yourself, count. Thanks so much for coming on the Cube. Much appreciated. Yeah, yeah, yeah, yeah

Published Date : Jun 23 2020



Yusef Khan


 

>> Commentator: From around the globe, it's theCUBE with digital coverage of Enterprise Data Automation. An event series brought to you by Io-Tahoe. >> Hi everybody, we're back, we're talking about Enterprise Data Automation. The hashtag is data automated, and we're going to really dig into data migrations. Data migrations are risky, they're time consuming and they're expensive. Yusef Khan is here, he's the head of partnerships and alliances at Io-Tahoe, coming again from London. Hey, good to see you, Yusef, thanks very much. >> Thanks Dave, great to be here. >> So your role is interesting. We're talking about data migrations, you're the head of partnerships, what is your role specifically and how is it relevant to what we're going to talk about today? >> Well, I work with various businesses, such as cloud companies, systems integrators, companies that sell operating systems, middleware, all of whom are often quite well embedded within a company's IT infrastructure and have existing relationships. Because what we do fundamentally makes migration to the cloud easier and data migration easier, there are lots of businesses that are interested in partnering with us, and that we're interested in partnering with. >> So let's set up the problem a little bit and then I want to get into some of the data. You know, you said that migrations are risky, time consuming, expensive, they're oftentimes a blocker for organizations to really get value out of data. Why is that? >> Ah, I think, I mean, all migrations have to start with knowing the facts about your data, and you can try and do this manually, but when you have an organization that may have been going for decades or longer, they will probably have a pretty large legacy data estate. So they'll have everything from on-premise mainframes, they may have stuff which is partly in the cloud, but they probably have hundreds, if not thousands, of applications and potentially hundreds of different data stores. Now their understanding of what they have is often quite limited, because you can try and draw manual maps but they're out of date very quickly; every time data changes, the manual maps go out of date, and people obviously leave organizations all the time, so that kind of tribal knowledge that gets built up is limited as well. So you can try and map all that manually; you might need a DBA, a database administrator, or a business analyst, and they might go in and explore the data for you. But doing that manually is very, very time consuming. This can take teams of people months and months, or you can use automation, just like Webster Bank did with Io-Tahoe, and they managed to do this with a relatively small team in a timeframe of days. >> Yeah, we talked to Paul from Webster Bank, awesome discussion. So I want to dig in to this migration, then let's pull up a graphic and we'll talk about what a typical migration project looks like. So what you see here, it's very detailed, I know, it's a bit of an eye test, but let me call your attention to some of the key aspects of this, and then Yusef, I want you to chime in.
So at the top here, you see that area graph, that's operational risk for a typical migration project, and you can see the timeline and the milestones. That blue bar is the time to test, so you can see the second step, data analysis, is taking 24 weeks, so you know, very time consuming. And then, not to dig into the stuff in the middle of the fine print, there's some real good detail there, but go down to the bottom: that's labor intensity at the bottom, and you can see high is that sort of brown, and you can see a number of them, data analysis, data staging, data prep, the trial, the implementation, post-implementation fixtures, the transition to BAU, which I think is Business As Usual. Those are all very labor intensive. So what are your takeaways from this typical migration project? What do we need to know, Yusef? >> I mean, I think the key thing is, when you don't understand your data upfront, it's very difficult to scope and to set up a project, because you go to business stakeholders and decision makers and you say, "okay, we want to migrate these data stores, we want to put them into the cloud most often", but actually, you probably don't know how much data is there, you don't necessarily know how many applications it relates to, you don't know the relationships between the data, you don't know the flow of the data, so the direction in which the data is going between different data stores and tables. So you start from a position where you have pretty high risk, and to alleviate that risk, you probably stack your project team with lots and lots of people to do the next phase, which is analysis, and so you've set up a project which has got a pretty high cost. The bigger the project, the more people, the heavier the governance, obviously, and then comes the phase where they're trying to do lots and lots of manual analysis. Manual analysis, as we all know, the idea of trying to relate data that's in different data stores, relating individual tables and columns, is very, very time consuming, and expensive if you're hiring in resource from consultants or systems integrators externally; you might need to buy or to use third party tools. As I said earlier, the people who understand some of those systems may have left a while ago, and so you are in a high-risk, high-cost situation from the off, and the same thing sort of develops through the project. What you find with Io-Tahoe is that we're able to automate a lot of this process from the very beginning, because we can do the initial data discovery run, for example, automatically, so you very quickly have an automated view of the data: a data map and the data flow have been generated automatically, with much less time and effort and much less cost and money. >> Okay, so I'm going to bring back that first chart and I want to call your attention again to that area graph, the blue bars and then, down below that, labor intensity, and now let's bring up the same chart, but with a sort of an automation injection in here, and now you see the sequence as accelerated by Io-Tahoe. Okay, great, we're going to talk about this, but look what happens to the operational risk: a dramatic reduction in that graph. And then look at the bars, those blue bars, you know, data analysis went from 24 weeks down to four weeks, and then look at the labor intensity. All these were high: data analysis, data staging, data prep, trial, post-implementation fixtures and transition to BAU.
All those went from high labor intensity, so we've now attacked that and gone to low labor intensity. Explain how that magic happened. >> Ah, let's take the example of a data catalog. So every large enterprise wants to have some kind of repository where they put all their understanding about their data, an enterprise data catalog, if you like. Imagine trying to do that manually: you need to go into every individual data store, you need a DBA and a business analyst for each data store, they need to do an extract of the data, they need to pull tables individually, they need to cross reference that with other data stores and schemas and tables, and you'd probably end up with the mother of all Excel spreadsheets. It would be a very, very difficult exercise to do. I mean, in fact, one of our reflections as we automate lots of these things is, it accelerates the ability to automate, but in some cases it also makes it possible for enterprise customers with legacy systems. Take banks, for example: they quite often end up staying on mainframe systems that they've had in place for decades, and not migrating away from them, because they're not able to actually do the work of understanding the data, deduplicating the data, deleting data that isn't relevant and then confidently going forward to migrate. So they stay where they are, with all the attendant problems of systems that are out of support. Go back to the data catalog example. Whatever you discover in data discovery has to persist in a tool like a data catalog, and so we automate data catalogs, including our own; we can also feed others, but we have our own. The only alternative to this kind of automation is to build out a very large project team of business analysts, of DBAs, project managers, process analysts, to gather all the data, to check that the process of gathering the data is correct, to put it in the repository, to validate it, etcetera, etcetera. We've gone into organizations and we've seen them ramp up teams of 20-30 people, at a cost of 2, 3, 4 million pounds a year, and a timeframe of 15 to 20 years, just to try and get a data catalog done, and that's something that we can typically do in a timeframe of months, if not weeks, and the difference is using automation. And if you do what I've just described in this manual situation, you make migrations to the cloud prohibitively expensive: whatever saving you might make from shutting down your legacy data stores will get eaten up by the cost of doing it, unless you go with a more automated approach.
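As a rough illustration of the manual-versus-automated contrast Yusef draws, a few lines of schema introspection can seed the catalog entries that a spreadsheet exercise would take months to assemble. This is a sketch, not a product implementation: the connection string is invented, and a real catalog adds classifications, lineage and quality on top.

```python
from sqlalchemy import create_engine, inspect

# Introspect a live database instead of asking DBAs to fill spreadsheets.
engine = create_engine("postgresql://catalog_reader:secret@warehouse/core")
insp = inspect(engine)

catalog = {}
for schema in insp.get_schema_names():
    for table in insp.get_table_names(schema=schema):
        catalog[f"{schema}.{table}"] = [
            {"name": c["name"], "type": str(c["type"]), "nullable": c["nullable"]}
            for c in insp.get_columns(table, schema=schema)
        ]

# 'catalog' now holds one seed entry per table; a platform would enrich each
# entry with classifications, relationships and quality scores.
```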
Quite often, if you put data in the cloud, you're paying obviously for storage space or for compute time; the more of the data in there that's duplicated, the more pure cost you should have taken out before you migrate. Again, if you're trying to do that process of understanding what's duplicated manually across tens or hundreds of data stores, it will take you months if not years; if you use machine learning to do it in an automatic way, it's much, much quicker. The one thing I'd say about the net cost and benefit of Io-Tahoe is that every organization we work with has a lot of existing sunk cost in their IT, so they'll have ERP systems like Oracle, or data lakes, which they've spent good time and money investing in. What we do, by enabling them to transition everything to their strategic future repositories, is accelerate the value of that investment and the time to value of that investment. So we are trying to help people get value out of their existing investments and data estate, close down the things that they don't need, and enable them to go to a kind of brighter future. >> Well, I think as well, you know, once you're able to, and this is a journey, we know that, but once you're able to go live and you're infusing sort of a data mindset, a data-oriented culture, I know it's somewhat buzzwordy, but when you see it in organizations, you know it's real, and what happens is you dramatically reduce that end-to-end cycle time of going from data to actual insights. Data is plentiful, but insights aren't, and that is what's going to drive competitive advantage over the next decade and beyond. >> Yeah, definitely, and you can only really do that if you get your data estate cleaned up in the first place. I've worked with and managed teams of data scientists, big data engineers, business analysts, people who are pushing out dashboards and are trying to build machine learning applications. You know, the biggest frustration for lots of them, and the thing that they spend far too much time doing, is trying to work out what the right data is and cleaning data, which really you don't want a highly paid data scientist doing with their time. But if you sort out your data estate in the first place, get rid of duplication, perhaps migrate to a cloud store where things are more readily accessible and it's easy to build connections and to use native machine learning tools, you're well on the way up the maturity curve and you can start to use some of those more advanced applications. >> Yusef, what are some of the prerequisites, maybe the top two or three, that I need to understand as a customer to really be successful here? I mean, is it skill sets? Is it mindset, leadership buy-in? What do I absolutely need to have to make this successful? >> Well, I think leadership is obviously key; being able to set the vision for people is obviously key. One of the great things about Io-Tahoe, though, is you can use your existing staff to do this work. If you use our automation platform, there's no need to hire expensive people: Io-Tahoe is a no-code solution, it works out of the box, you just connect to source and then your existing staff can use it.
It's very intuitive, with an easy-to-use user interface, so there's no need to invest vast amounts with large consultancies, who may well charge the earth, and you're actually at a bit of an advantage if you've got existing staff who are close to the data, who are subject matter experts, because they can very easily learn how to use the tool, and then they can go in and write their own data quality rules, and they can really make a contribution from day one. When we go into organizations and connect, one of the great things about the whole experience with Io-Tahoe is that we can get tangible results back within the day. Usually within an hour or two, we're able to say, okay, we've started to map the relationships here, here's a data map of the data that we've analyzed, and here are some thoughts on what your sensitive data is, because it's automated, because it's running algorithms across data, and that's what people really should expect. >> And you know this because you're dealing with the ecosystem. We're entering a new era of data, and many organizations, to your point, just don't have the resources to do what Google and Amazon and Facebook and Microsoft did over the past decade to become, you know, data-dominant, trillion-dollar-market-cap companies. Incumbents need to rely on technology companies to bring that automation, that machine intelligence, to them so they can apply it. They don't want to be AI inventors, they want to apply it to their businesses. So that's what really was so difficult in the early days of so-called Big Data: there was just too much complexity out there, and now companies like Io-Tahoe are bringing, you know, tooling and platforms that are allowing companies to really become data driven. Your final thoughts, please, Yusef. >> Well, that's a great point, Dave. In a way it brings us back to where it began, in terms of partnerships and alliances. I completely agree. It's a really exciting point where we can take applications like Io-Tahoe and we can go into enterprises and help them really leverage the value of these types of machine learning algorithms and AI. We work with all the major cloud providers, AWS, Microsoft Azure, Google Cloud Platform, IBM, Red Hat, and others, and I think, for us, the key thing is that we want to be the best in the world at Enterprise Data Automation. We don't aspire to be a cloud provider or even a workflow provider, but what we want to do is really help customers with their data, with our automated data functionality, in partnership with some of those other businesses, so we can leverage the great work they've done in the cloud, the great work they've done on workflows, on virtual assistants and in other areas, and we help customers leverage those investments as well. But at our heart, we're really targeted at just being the best enterprise data automation business in the world. >> Massive opportunities, not only for technology companies but for those organizations that can apply technology for business advantage. Yusef Khan, thanks so much for coming on theCUBE. >> Thanks, Dave, much appreciated. >> All right, and thank you for watching everybody. We'll be right back right after this short break. (upbeat music)

Published Date : Jun 4 2020


Ajay Vohora, Io-Tahoe | Enterprise Data Automation


 

>> Narrator: From around the globe, it's theCUBE! With digital coverage of enterprise data automation, an event series brought to you by Io-Tahoe.

>> Okay, we're back, welcome back to Data Automated. Ajay Vohora is CEO of Io-Tahoe. Ajay, good to see you, how are things in London?

>> Things are doing well, things are doing well, we're making progress. Good to see you, hope you're doing well, and it's a pleasure being back here on theCUBE.

>> Yeah, it's always great to talk to you. We're talking enterprise data automation, and as you know, within our community we've been pounding the whole DataOps conversation. This is a little different, though, and we're going to dig into that a bit, but let's start with this, Ajay: how are you seeing the response to COVID, and I'm especially interested in the role that data has played in this pandemic.

>> Yeah, absolutely, I think everyone's adapting, both socially and in business. The customers that I speak to day in, day out, that we partner with, are busy adapting their businesses to serve their customers. It's very much a game of ensuring that we can serve our customers to help their customers, and the adaptation that's happening here is trying to be more agile and more flexible. There's a lot of pressure on data, a lot of demand on data, to deliver more value to the business and serve that customer.

>> Yeah, data, machine intelligence, and cloud are really three huge factors that have helped organizations in this pandemic, and the machine intelligence, or AI, piece is what automation is all about. How do you see automation helping organizations evolve, maybe faster than they thought they might have to?

>> For sure. The necessity of these times means, as they say, there's a lot of demand for doing something with data, and a lot of businesses talk about being data-driven. It's interesting, I sort of look behind that when we work with our customers, and it's all about the customer. My peers, CEOs, investors, shareholders, the common theme here is the customer, and that customer experience starts and ends with data. Being able to move from a point of reacting to what the customer expects, to that step forward where you can be proactive in serving the customer's expectations, that's definitely come alive now in the current times.

>> Yeah, so as I said, we've been talking about DataOps a lot, the idea being DevOps applied to the data pipeline. But talk about enterprise data automation: what is it to you, and how is it different from DataOps?

>> Yeah, great question, thank you. I think we've all got more and more awareness around DevOps as it's applied to processes and methodologies that have matured over the past five years: managing change, managing application life cycles, managing software development. DevOps has been great at breaking down the silos between different roles and functions and bringing people together to collaborate. We definitely see those tools, methodologies, and processes, that kind of thinking, lending itself to data with DataOps, and we're excited about that: shifting the focus from IT versus business users to who the data producers are and who the data consumers are, which in a lot of cases can sit in many different lines of business.
So with DataOps, what we look to do is build on top of those methods, tools, and processes with data automation. It's the nuts and bolts of the algorithms, the models behind the machine learning, the functions; that's where we invest our R&D, bringing that in to build on top of the ways of thinking that break down those silos, and injecting that automation into the business processes that are going to drive a business to serve its customer. It's a layer beyond DevOps and DataOps, taking it to the point where, the way I like to think about it, it's the automation behind the automation. I'll give you an example of a bank where we've done a lot of work to accelerate their digital transformation. What we're finding is that as we automate the jobs related to data, to managing that data and serving that data, that feeds into them as a business automating their processes for their customer. So it's definitely having a compound effect.

>> Yeah, I think DataOps for a lot of people is somewhat new. The whole DevOps-to-DataOps thing is good, it's a nice framework, a good methodology, and there's obviously a level of automation in there and collaboration across different roles, but it sounds like you're talking about supercharging it, if you will: the automation behind the automation. You know, organizations talk about being data-driven, you hear that thrown around a lot. A lot of times people will sit back and say, "We don't make decisions without data." Okay, but really, being data-driven has a lot of aspects. There's the cultural side, but there's also putting data at the core of your organization and understanding how it affects monetization, and as you well know, silos have been built up, whether through M&A, data sprawl, or outside data sources. So I'm interested in your thoughts on what data-driven means, and specifically how Io-Tahoe plays there.

>> Yeah, sure, I'd be happy to talk that through, David. We've come a long way in the last three or four years. We started out by automating some of those tasks that are simple to codify but have a high impact on an organization, across a data lake, across a data warehouse: the data-related tasks that help classify data. A lot of our original patents and IP portfolio were built up around that: automating the classification of data across different sources, and then being able to serve that up for some purpose. So originally, some of the simpler challenges we helped our customers solve were around data privacy. I've got a huge data lake here, I'm a telecoms business, so I've got millions of subscribers, and quite often the chief data officer's challenge is: how do I cover the operational risk here, where I've got so much data? I need to simplify my approach to automating and classifying that data. The reason is, you can't do that manually, you can't throw people at it; the scale of it is prohibitive. Quite often, if you were to do it manually, by the time you've got a good picture of it, it's already out of date. So, starting with those simple challenges that we've been able to address, we've then gone on and built on that to see, what else do we serve?
What else do we serve for the chief data officer, the chief marketing officer, and the CFO? In these times, those decision-makers have a lot of choices in the platforms and tooling they adopt, and they're very much looking for that Swiss army knife. Being able to do one thing really well is great, but more and more, as that cost-pressure challenge comes in, it's about how we offer more across the organization and bring in those lines-of-business activities that depend on data, not just IT.

>> So in theCUBE we sometimes like to talk about, okay, what is it, then how does it work, and what's the business impact? We've kind of covered what it is, so I'd love to get into the tech a little bit in terms of how it works, and I think we have a graphic here that gets into that a little bit. So guys, if you could bring that up. I wonder, Ajay, if you could tell us, what is the secret sauce behind Io-Tahoe, and take us through this slide.

>> Ajay: Sure. Right there in the middle, at the heart of what we do, is the intellectual property that we've built up over time, which takes heterogeneous data sources, your Oracle relational database, your mainframe, your data lake, and increasingly APIs and devices that produce data, and creates the ability to automatically discover that data and classify it. After it's classified, we then have the ability to form relationships across those different source systems, silos, and lines of business, and once we've automated that, we can start to do some cool things, such as putting context and meaning around that data. So it's moving on now from being data-driven, and increasingly we have really smart, bright people in our customer organizations who want to do some of those advanced knowledge tasks: data scientists, and quants in some of the banks that we work with. The onus is on serving them, putting everything we've done there with automation, classification, relationships, understanding data quality, and the policies you can apply to that data, into context. Once you've got the ability to empower a professional who's using data to put that data in context and search across the entire enterprise estate, they can start to do some exciting things and piece together the tapestry, the fabric, across their different systems. That could be CRM, ERP systems such as SAP, and some of the newer cloud databases that we work with; Snowflake is a great one.
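As an aside, one common technique behind that kind of automated relationship discovery is to infer likely join keys by measuring how much the values in two columns overlap. Io-Tahoe's actual methods aren't public, so the column names, data, and threshold in this short sketch are hypothetical, offered only to make the idea tangible.

def jaccard(a, b):
    """Jaccard similarity of two sets of sampled column values."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical columns sampled from two separate source systems.
crm_customer_ids = ["C001", "C002", "C003", "C004"]
billing_account_refs = ["C002", "C003", "C004", "C005"]

score = jaccard(crm_customer_ids, billing_account_refs)
if score >= 0.5:  # illustrative threshold for flagging a candidate join key
    print(f"Likely relationship (overlap {score:.0%}): "
          "crm.customer_id <-> billing.account_ref")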
>> Yeah, so you're describing one of the reasons why there are so many stovepipes in organizations, because data is kind of locked into these silos and applications. And I also want to point out that previously, to do discovery, to do that classification you talked about, to form those relationships and glean context from data, a lot of that, if not most of that, and in some cases all of that, would've been manual. And of course it goes out of date so quickly that nobody wants to do it, because it's so hard. So this, again, is where automation comes into the idea of really becoming data-driven.

>> Sure. If I look back maybe five years, we had a prevalence of data lake technologies at the cutting edge, and those have started to converge and move to some of the cloud platforms that we work with, such as Google and AWS. And very much as you've said, those manual attempts to grasp what is such a complex challenge at scale quickly run out of steam, because by the time you've got your fingers on the details of what's in your data estate, it's changed. You've onboarded a new customer, you've signed up a new partner, a customer has adopted a new product that you've just launched, and that slew of data keeps coming. To keep pace with that, the only real answer is some form of automation. And what we've found is that if we can tie automation to what I said before, the expertise, the subject matter experience that sometimes goes back many years within an organization's people, that augmentation between machine learning, AI, and the knowledge that sits inside the organization really tends to unlock a lot of value in data.

>> Yeah, and as you know well, Ajay, as a smaller company you can't be all things to all people, so the ecosystem is critical. You're working with AWS, you're working with Google, you've got Red Hat and IBM as partners. What is attracting those folks to your ecosystem? And give us your thoughts on the importance of ecosystem.

>> Yeah, that's fundamental. When I came into Io-Tahoe as CEO, one of the trends I wanted us to be part of was being open, having an open architecture that allowed for one thing that was close to my heart: as a CEO or a CIO, you've got a budget and a vision, and you've already made investments into your organization. Some of those are pretty long-term bets that could be going out five, ten years sometimes, with a CRM system, training up your people, getting everybody working together around a common business platform. What I wanted to ensure is that we could openly plug in, using the APIs that were available, to a lot of that sunk investment, the cost that has already gone into managing an organization's IT, for business users to perform. That's part of the reason we've been able to be successful with some of our partners like Google and AWS, and increasingly a number of technology players such as Red Hat, MongoDB, which is another one we're doing a lot of good work with, and Snowflake. Those investments have been made by the organizations that are our customers, and we want to make sure we're adding to that and leveraging the value they've already committed to.

>> Okay, so we've talked about what it is and how it works; now I want to get into the business impact. What I would be looking for from this would be: can you help me lower my operational risk? I've got tasks that I do, many sequential, some in parallel; can you reduce my time to task? Can you help me reduce the labor intensity, and ultimately my labor cost, so I can put those resources elsewhere? And ultimately I want to reduce the end-to-end cycle time, because that is going to drive telephone-number ROI. So am I missing anything? Can you do those things? Maybe you can give us some examples of the ROI and the business impact.

>> Yeah, the ROI, David, is built upon the three things that I've mentioned. It's a combination of leveraging the existing investment in the existing estate, whether that's on Microsoft Azure, or AWS, or Google, or IBM, and putting that to work, because the customers that we work with have made those choices. On top of that, it's ensuring that we have got automation that is working right down to the level of the data, at the column level or the file level.
So we don't just deal with metadata; it's very specific, down at the most granular level. As we run our processes and the automation, classification, tagging, applying the policies an organization has, across its different compliance and regulatory needs, to the data, everything that then happens downstream from that is ready to serve a business outcome. It could be a customer who wants that experience on a mobile device, a tablet, or face to face within a store. Being able to provision the right data, and enable our customers to do that for their customers, with the right data that they can trust, at the right time, just in that real-time moment where a decision or an action is expected, that's driving the ROI to be in some cases 20x plus, and that's really satisfying to see. That kind of impact is taking years down to months, and in many cases months of work down to days, and in some cases hours; that's the time to value. I'm impressed with how quickly, out of the box, with very little training, a customer can pick up our tool and use features such as search, data discovery, the knowledge graph, and identifying duplicate and redundant data. Straight off the bat, within hours.

>> Well, it's why investors are interested in this space. They're looking for a big total available market, they're looking for a significant return; you've got to have 10x, and 20x is better. So that's exciting, and obviously strong management and a strong team. I want to ask you about people and culture. So you've got people, process, technology. We've seen with this pandemic that processes are really unpredictable, and the technology has to be able to adapt to any process, not the reverse; you can't force your process into some static software, so that's very, very important. But at the end of the day, you've got to get people on board. So I wonder if you could talk about this notion of culture, and a data-driven culture.

>> Yeah, that's so important. Current times are forcing the necessity of the moment to adapt, but as we start to work our way through these changes, and adapt and work with our customers to adapt to these changing economic times, what we're seeing here is the ability to have the technology complement, in a really smart way, what those business users and IT knowledge workers are looking to achieve together. So, I'll give you an example. The data operations teams in the companies that we're partnering with quite often have a lot of inbound inquiries on a day-to-day level: "I really need this set of data because I think it can help my data scientists run a particular model," or "What would happen if we combined these two different silos of data and got some enrichment going?" Now, those requests can sometimes take weeks to realize. What we've been able to do with the power of (audio glitches) technology is to get those answers addressed by the business users themselves, and now, with our customers, they're coming to the data and IT folks saying, "Hey, I've now built something in a development environment, why don't we see how that can scale up with these sets of data? I don't need terabytes of it, I know exactly the columns and the fields in the data that I'm going to use," and that cuts out a lot of wastage, and time, and cost, to innovate.
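To illustrate the column-level tagging and policy application Ajay describes, here is a minimal sketch in which classification tags drive how values are treated before they're provisioned downstream. The tags, policies, and masking rules are illustrative assumptions, not Io-Tahoe's actual implementation.

# Policies keyed by classification tag: how a value must be transformed
# before being served to a downstream consumer.
POLICIES = {
    "pii.email": lambda v: v.split("@")[0][:1] + "***@" + v.split("@")[1],
    "pii.ssn":   lambda v: "***-**-" + v[-4:],
    "public":    lambda v: v,  # no transformation required
}

# Column tags as produced by an upstream, automated classification step.
COLUMN_TAGS = {
    "customers.email": "pii.email",
    "customers.ssn": "pii.ssn",
    "customers.city": "public",
}

def provision(column, value):
    """Apply the policy attached to a column's tag before serving the value."""
    tag = COLUMN_TAGS.get(column, "public")
    return POLICIES[tag](value)

print(provision("customers.email", "ann.smith@example.com"))  # a***@example.com
print(provision("customers.ssn", "123-45-6789"))              # ***-**-6789
print(provision("customers.city", "Hartford"))                # Hartford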
>> Well, that's huge. The whole notion of self-service, and the lines of business actually feeling like they have ownership of the data, as opposed to IT or some technology group owning the data, because then you've got data quality issues, or if it doesn't line up with their agenda you're going to get a lot of finger-pointing. So that is a really important piece of it. I'll give you the last word, Ajay, your final thoughts if you would.

>> Yeah, we're excited to be on this path, and I think we've got some great customer examples here where we're having a real impact at a really fast pace, whether it's helping them migrate to the cloud or helping them clean up their legacy data lake, and quite often now the conversation is around data quality. More and more of the applications that we enable to work more proficiently, whether that's RPA, robotic process automation, or a lot of the APIs that are now available in the cloud platforms, are dependent on data quality. Being able to automate for business users, so they take accountability for looking at the trend of their data quality over time and getting those signals, is really driving trust, and that trust in data is in turn helping the IT teams, and the data operations teams they partner with, do more, and do it more quickly. So it comes back to culture: being able to apply the technology in such a way that it's visual and intuitive, and, just as DevOps has done with IT, DataOps putting the intelligence in at the data level to drive that collaboration. We're excited.

>> You know, you remind me of something. I lied, I don't want to go yet, if that's okay. I know we're tight on time, but you mentioned migration to the cloud, and I'm thinking about the conversation with Paula from Webster Bank. Migrations are a nasty word for organizations, and we saw this with Webster. How are you able to help minimize the migration pain, and why is that something that you guys are good at?

>> Yeah, there are many large, successful companies that we've worked with; Webster's a great example. I'd like to give you an analogy. If you're running a business as a CEO, you've got a lot of bright people in your teams, and it's a bit like a living brain. But imagine if those different parts of your brain were not connected; that would certainly diminish how you're able to perform. What we're seeing, particularly with migration, is that banks, retailers, and manufacturers have grown over the last 10 years through acquisition and through different initiatives to drive customer value, and that sprawl in their data estate hasn't been fully dealt with. It's sometimes been a good thing to leave whatever you've acquired or created in situ, side by side with that legacy mainframe and your Oracle ERP. What we're able to do very quickly with that migration challenge is shine a light on all the different parts of that data estate, at the column level, or at the file level if it's a data lake, and show an enterprise architect, a CDO, how everything's connected, where there may not be any documentation. The bright people that created some of those systems have long since moved on, or retired, or been promoted into other roles. Within days, being able to automatically generate, and keep refreshed, the state of that data across that landscape, and put it into context, then allows you to look at a migration with the confidence that you're dealing with the facts, rather than what we've often seen in the past: teams of consultants, business analysts, and data analysts spending months getting an approximation, a good idea of what the current state could be, and trying their very best to map that to the future target state. Now, with Io-Tahoe, you're able to run those processes within hours of getting started, build that picture, visualize it, and bring it to life. The ROI starts off the bat with finding data that should've been deleted and data that there are copies of, and enabling the architect, whether they're working on GCP, migrating to any of the clouds such as AWS, or, quite often now, in a multicloud landscape.
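That "find the copies before you migrate" step can be pictured with a small sketch: fingerprint each dataset's contents and group identical fingerprints, so only one copy needs to move. Hashing normalized contents is just one simple approach; the dataset names and rows below are hypothetical, and Io-Tahoe's actual techniques are not public.

import hashlib
from collections import defaultdict

def fingerprint(rows):
    """Hash a dataset's rows, sorted so row order doesn't affect the result."""
    digest = hashlib.sha256()
    for row in sorted(",".join(map(str, r)) for r in rows):
        digest.update(row.encode())
    return digest.hexdigest()

# Hypothetical extracts from three data stores in a sprawling estate.
datasets = {
    "warehouse.accounts": [(1, "Ann"), (2, "Bob")],
    "lake.accounts_copy": [(2, "Bob"), (1, "Ann")],  # same rows, reordered
    "crm.contacts": [(3, "Cho")],
}

groups = defaultdict(list)
for name, rows in datasets.items():
    groups[fingerprint(rows)].append(name)

for names in groups.values():
    if len(names) > 1:
        print("Likely duplicates, migrate once:", ", ".join(names))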
>> Yeah, that visibility is key to reducing operational risk, giving people the confidence that they can move forward, and being able to do that, and update it on an ongoing basis, means you can scale. Ajay Vohora, thanks so much for coming to theCUBE and sharing your insights and your experiences, great to have you.

>> Thank you, David, I look forward to talking again.

>> All right, and keep it right there, everybody. We're here with Data Automated on theCUBE. This is Dave Vellante, and we'll be right back right after this short break. (calm music)

Published Date : Jun 1 2020
