Seth Rao, FirstEigen | AWS re:Invent 2021
(upbeat music) >> Hey, welcome back to Las Vegas. theCUBE is live at AWS re:Invent 2021. I'm Lisa Martin. We are running one of the largest hybrid tech events, one of the most important events of the year, with AWS and its massive ecosystem of partners. We have two live sets, two remote sets, and over a hundred guests on the program talking about the next generation of cloud innovation. I'm pleased to welcome a first timer to theCUBE. Seth Rao, the CEO of FirstEigen, joins me. Seth, nice to have you on the program. >> Thank you, nice to be here. >> Talk to me about FirstEigen. Also explain to me the name. >> So FirstEigen is a startup company based out of Chicago. The name Eigen is a German word. It's a mathematical term. It comes from eigenvectors and eigenvalues, which are used in what's called principal component analysis, which is used to detect anomalies, which is related to what we do. So we look for errors in data, and hence our name FirstEigen. >> Got it. That's excellent. So talk to me. One of the things that has been a resounding theme of this year's re:Invent is that, especially in today's age, every company needs to be a data company. >> Yeah. >> It's one thing to say it; it's a whole other thing to be able to put that into practice with reliable data, with trustworthy data. Talk to me about some of the challenges that you help customers solve, 'cause part of the theme is not just being a data company, but that if you're not a data company you're probably not going to be around much longer. >> Yeah, absolutely. So what we have seen across the board, across all verticals, with the customers we work with, is that data governance teams and data management teams are constantly firefighting to find errors in data and fix them. So what we have done is we have created the software DataBuck that autonomously looks at every data set, and it will discover errors that are hidden to the human eye. They're hard to find, hard to detect.
Our machine learning algorithms figure out those errors before those errors impact the business. The usual way things are sorted out, things are done, is very laborious, time-consuming and expensive. We have taken a process that takes man-months or even man-years and compressed it to a few hours. >> So dramatic time savings there. >> Absolutely. >> So six years ago when you guys were founded, you realized this gap in the market, thought, it's taking way too long. We don't have this amount of time. Gosh, can you imagine if you guys weren't around the last 22 months when certainly time was of the essence? >> Absolutely. Yeah. Six years ago when we founded the company, my co-founder, who's also the CTO, had extensive experience in validating data and data quality. And my own background and my own experience is in AI and ML. And what we saw was that people were spending an enormous amount of time and yet errors were getting through to the business side. And at that point it comes back and people are still firefighting. So it was a waste of time, waste of money, waste of effort. >> Right. But also there's the potential for brand damage, brand reputation. Whatever products and services you're producing, if your employees don't have the right data, if there are errors there and what's going out to the consumers is wrong, then you've got a big problem. >> Absolutely. Interesting you should mention that, because over the summer there was a Danish bank, a very big name Danish bank, that had to send apology letters to its customers because they overcharged them on the mortgage, because the data in the backend had some errors in it and they didn't realize it. It was inadvertent. But somebody ultimately caught it and did the right thing. Absolutely correct. If the data is incorrect and then you're doing analytics or you're doing reporting or you're sending people a bill that they need to pay, it better be very accurate. Otherwise it's serious brand damage.
It has real implications and it has a whole bunch of other issues as well. >> It does, and those things can snowball very quickly. >> Yeah. >> So one of the things that we've seen in recent months and years is this explosion of data. And then when the pandemic struck we had this scattering of people and data sources, so much data. The edge is persistent. We've got this work-from-anywhere environment. What are some of the risks for organizations? They come to you saying, help us ensure that our data is trustworthy. I mean, trust is key, but how do you help organizations that are somewhat in flux figure out how to solve that problem? >> Yeah. So you're absolutely correct. There is an explosion of data, number one. And along with that, there is also an explosion of analytical tools to mine that data. So as a consequence, there is big growth, exponential growth, of microservices and of how people are consuming that data. Now in the old world, when there were few consumers of data, it was a lot easier to validate the data. You had a few people who were the gatekeepers or the data stewards. But with an explosion of data consumers within a company, you have to take a completely different approach. You cannot now have people manually looking and creating rules to validate data. So there has to be a change in the process. You start validating the data as soon as it comes into your system. You start validating whether the data is reliable at point zero. >> Okay. >> And then it goes downstream. And at every stage the data hops, there is a chance that data can get corrupted. And these are called systems risks. Because there are multiple systems and data comes from multiple systems onto the cloud, errors creep in. So you validate the data from the beginning all the way to the end, and the kinds of checks you do also increase in complexity as the data is going downstream. You don't want to boil the ocean upfront. You want to do the essential checks.
Is my water drinkable at this point, right? I'm not trying to cook as soon as it comes out of the tap. Is it drinkable? >> Right. >> Good enough quality. If not, then we go back to the source and say, guys, send me better quality data. So sequence the right process and check every step along the way. >> How much of a cultural shift is FirstEigen helping to facilitate within organizations that now don't... There isn't time to, like we talked about, if an error gets in, there are so many downstream effects that can happen. But how do you help organizations shift their mindset? 'Cause that's a hard thing to change. >> Fantastic point. In fact, what we see is that the mindset change is the biggest wall for companies to have good data. People have been living in the old world where there is a team, a group much downstream, that is responsible for accurate data. But the volume of data, the complexity of data, has gone up so much that that team cannot handle it anymore. It's just beyond their scope. It's not fair for us to expect them to save the world. So the mindset shift has to come from an organization's leadership that says, guys, the data engineers who are upfront, who are getting the data into the organization, who are taking care of the data assets, have to start thinking of trustable data. Because if they start doing it, everything downstream becomes easy. Otherwise it's much, much more complex for these guys. And that's what we do. Our tool provides an autonomous solution to monitor the data. It comes out with a data trust score with zero human input. Our software will be able to validate the data and give an objective trust score. Right now it's a popularity contest. People vote: yeah, I think I like this. I like this and I like that. That's okay. Maybe it's acceptable. But the reason they do it is because there is no way to objectively say the data is trustable. If there is a small error somewhere, it's a needle in the haystack.
It's hard to find, but we can. With machine learning algorithms our software can detect the errors, the minutest errors, and give an objective score from zero to a hundred, trust or no trust. So along with a mindset, now they have the tool to implement that mindset and we can make it happen. >> Talk to me about some of the things that you've seen from a data governance perspective, as we've seen the explosion, the edge, people working from anywhere, this hybrid environment that we're going to be in for quite some time. >> Yeah. >> From a data governance perspective and Dave Vellante did his residency. We're seeing so many more things pop up, you know, different regulations. How do you help facilitate data governance for organizations as the data volume is just going to continue to proliferate? >> Absolutely correct. So data governance. We are a key component of data governance, and data quality, data trustworthiness, reliability is a key component of it. And one of the central pillars of data governance is the data catalog. Just like a catalog in the library, it's cataloging every data asset. But right now the catalogs, which are the mainstay, are not as good as they can be. A key piece of information that is missing is: I know where my data is; what I don't know is how good is my data? How usable is it? If I'm using it for accounts receivable or accounts payable, for example, the data better be very, very accurate. So what our software will do is help data governance by linking with any data governance tool and giving an important component, which is an objective data quality, reliability, trustability score for every data asset. So imagine I open the catalog. I see where my book is in the library. I also know if there are pages missing in the book. Is the book readable? So it's not good enough to know that I have a book somewhere; it's how good is it? >> Right. >> So DataBuck will make that happen.
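The idea of attaching an objective zero-to-a-hundred trust score to each catalog entry can be sketched in a few lines. This is only a toy illustration under stated assumptions: DataBuck is described in the interview as using machine learning, whereas the sketch below uses three simple hand-written checks (completeness, key uniqueness, non-negative amounts) averaged with equal weight. The names `CatalogEntry`, `trust_score`, the field names and the sample rows are all hypothetical and are not taken from FirstEigen's product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CatalogEntry:
    """Hypothetical catalog record: where the data lives plus how good it is."""
    name: str
    location: str
    trust_score: Optional[float] = None  # 0 (no trust) to 100 (full trust)


def trust_score(rows, key, amount_field):
    """Average three simple quality fractions and scale the result to 0-100.

    Assumption: equal weighting of the three checks; a real tool would
    choose and weight its checks very differently.
    """
    total = len(rows)
    if total == 0:
        return 0.0
    # Completeness: rows in which no field is missing.
    complete = sum(1 for r in rows if all(v is not None for v in r.values()))
    # Uniqueness: distinct key values relative to the number of rows.
    unique_keys = len({r[key] for r in rows})
    # Validity: the amount is a number and is not negative.
    valid = sum(
        1 for r in rows
        if isinstance(r[amount_field], (int, float)) and r[amount_field] >= 0
    )
    fractions = [complete / total, unique_keys / total, valid / total]
    return round(100 * sum(fractions) / len(fractions), 1)


# Made-up sample of accounts-receivable rows with three planted problems:
# a missing amount, a duplicated invoice id, and a negative amount.
rows = [
    {"invoice_id": "A1", "amount": 120.0},
    {"invoice_id": "A2", "amount": None},
    {"invoice_id": "A2", "amount": 75.5},
    {"invoice_id": "A4", "amount": -10.0},
]

entry = CatalogEntry(name="accounts_receivable", location="finance/ar_table")
entry.trust_score = trust_score(rows, key="invoice_id", amount_field="amount")
print(entry)
```

In the library analogy from the interview, `location` says where the book is and `trust_score` says whether pages are missing.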
>> So when customers come to you, how do you help them start? 'Cause obviously the data, the volume, it's intimidating. >> Yeah. >> Where do they start? >> Great. This is interestingly enough a challenge that every customer has. >> Right. >> Everybody is ambitious enough to say, no, I want to make the change. But going back to the previous point, if you want to do such a big change, it's an organizational change management problem. So what we recommend to customers is to start with a small problem. Get some early victories. And this software is very easy. Just bring it in, automate a small part. You have your sales data or transactional data or operational data. Take a small portion of it, automate it. Get reliable data, get good analytics, get the results and start expanding to other places. Trying to do everything at one time, it's just too much inertia, organizations don't move. You don't get anywhere. Data initiatives will fail. >> Right. So you're helping customers identify where are those quick wins? >> Yes. >> And where are the landmines that we need to be able to find out where they are so we can navigate around them? >> Yeah. We have enough experience, over 20 years of working with different customers. And if something can go wrong, we know where it'll go wrong, and we can help steer them away from the landmines and take them to areas where they'll get quick wins. 'Cause we want the customer to win. We want them to go back and say, look, because of this, we were able to do better analytics. We are able to do better reporting and so on and so forth. We can help them navigate this area. >> Do you have a favorite example, a customer example that you think really articulates that value there, of helping customers? We can't boil the ocean, like you said. It doesn't make any sense. But a customer that you helped with small quick wins that really just opened up the opportunity to unlock the value of trustable data. >> Absolutely.
So we're working with a Fortune 50 company in the US, and it's a manufacturing company. Their CFO was a little concerned whether the data that she's reporting to Wall Street is acceptable. Does it have any errors? And ultimately she's signing off on it. So she had a large team on the technology side that was supporting her and they were doing their best. But in spite of that, she's a very sharp woman. She was able to look and find errors and say, "Something does not look right here, guys. Go back and check." Then it goes back to the IT team and they go, "Oh yeah, actually, there was an error." Some errors had slipped through. So they brought us in and we were able to automate the process. What they could do was a few checks within that audit window. We were able to do an enormous number of checks more. More detailed, more accurate. And we were able to reduce the number of errors that were slipping through by over 98%. >> Big number. >> So, absolutely. Really fast. Really good. Now that this has gone through, they feel a lot more comfortable. Then the question is, okay, in addition to financial reporting, can I use it to iron out my supply chain data? 'Cause they have thousands of vendors. They have hundreds of distributors. They have products all over the globe. Now they want to validate all the data, because even if your data is off by 1 or 2%, if you're a hundred-plus-billion-dollar company, it has an enormous impact on your balance sheet and your income statement. >> Absolutely. Yeah. >> So we are slowly expanding as soon as they allow us. They like us. Now they're taking it to other areas beyond finance. >> Well, it sounds like you have not only great technology, Seth, but a great plan for helping customers with those quick wins and then learning and expanding within and really developing that trusted relationship between FirstEigen and your customers. Thank you so much for joining me on the program today.
Introducing the company, what you guys are doing, really cool stuff. Appreciate your time. >> Thank you very much. >> All right. >> Pleasure to be here. >> For Seth Rao, I'm Lisa Martin. You're watching theCUBE, the global leader in live tech coverage. (upbeat music)
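Seth's explanation of the company name links eigenvectors, principal component analysis and anomaly detection. A rough sketch of that link follows; it is an assumption-laden toy example, not FirstEigen's or DataBuck's actual method. It finds the first principal component (the top eigenvector of the covariance matrix) by power iteration, then measures how far each record lies from that axis; a record far from the axis does not follow the dominant pattern in the data. The function names and the sample data are made up for illustration.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (mean, unit vector) for the top eigenvector of the covariance
    matrix, found by power iteration."""
    n = len(rows)
    dim = len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - mean[j] for j in range(dim)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]
    v = [1.0] * dim
    for _ in range(iterations):
        # Multiply by the covariance matrix and renormalise; the vector
        # converges towards the eigenvector with the largest eigenvalue.
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def distances_from_axis(rows, mean, axis):
    """Distance of each row from the principal axis (reconstruction error)."""
    out = []
    for r in rows:
        c = [r[j] - mean[j] for j in range(len(r))]
        proj = sum(c[j] * axis[j] for j in range(len(c)))
        resid = [c[j] - proj * axis[j] for j in range(len(c))]
        out.append(math.sqrt(sum(x * x for x in resid)))
    return out


# Made-up data: ten records follow the pattern y = 2x, the last one does not.
rows = [(float(x), 2.0 * x) for x in range(1, 11)] + [(5.0, 25.0)]

mean, axis = first_principal_component(rows)
scores = distances_from_axis(rows, mean, axis)
suspect_index = scores.index(max(scores))
print("most anomalous record:", rows[suspect_index])
```

A production system would typically work in many more dimensions and use a robust threshold rather than simply taking the largest distance.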
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Vellante | PERSON | 0.99+ |
Lisa Martin | PERSON | 0.99+ |
Seth Rao | PERSON | 0.99+ |
Chicago | LOCATION | 0.99+ |
Las Vegas | LOCATION | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Seth | PERSON | 0.99+ |
US | LOCATION | 0.99+ |
FirstEigen | ORGANIZATION | 0.99+ |
two remote sets | QUANTITY | 0.99+ |
Two live sets | QUANTITY | 0.99+ |
zero | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
two live sets | QUANTITY | 0.99+ |
thousands | QUANTITY | 0.99+ |
six years ago | DATE | 0.99+ |
fortune 50 | ORGANIZATION | 0.98+ |
over 98% | QUANTITY | 0.98+ |
over 20 years | QUANTITY | 0.98+ |
today | DATE | 0.98+ |
Six years ago | DATE | 0.98+ |
2% | QUANTITY | 0.98+ |
One | QUANTITY | 0.97+ |
one time | QUANTITY | 0.97+ |
German | OTHER | 0.97+ |
pandemic | EVENT | 0.96+ |
Over a hundred guests | QUANTITY | 0.95+ |
zero human | QUANTITY | 0.94+ |
DataBuck | TITLE | 0.94+ |
hundred plus billion dollar | QUANTITY | 0.93+ |
Invent | EVENT | 0.9+ |
DataBuck | ORGANIZATION | 0.89+ |
one thing | QUANTITY | 0.87+ |
Wall Street | LOCATION | 0.87+ |
last 22 months | DATE | 0.85+ |
re:Invent 2021 | EVENT | 0.83+ |
this year | DATE | 0.8+ |
a hundred | QUANTITY | 0.77+ |
first timer | QUANTITY | 0.75+ |
hundreds of distributors | QUANTITY | 0.73+ |
point zero | QUANTITY | 0.67+ |
2021 | DATE | 0.63+ |
theCUBE | TITLE | 0.55+ |
Danish | LOCATION | 0.55+ |
CEO | PERSON | 0.52+ |
theCUBE | ORGANIZATION | 0.46+ |
Danish | OTHER | 0.45+ |