Sam Kassoumeh, SecurityScorecard | CUBE Conversation
(upbeat music) >> Hey everyone, welcome to this CUBE Conversation. I'm John Furrier, your host of theCUBE here in Palo Alto, California. We've got Sam Kassoumeh, co-founder and chief operating officer at SecurityScorecard, coming in remotely. Thanks for coming on, Sam. >> Thank you, John. Thanks for having me. >> Love the security conversations. I love what you guys are doing. I think this idea of managed services, SaaS. Developers love it, operations teams love getting into tools easily, and there's value in what you guys have got with SecurityScorecard. So let's get into what we were talking about before we came on. You guys have a unique solution around ratings, but it's also not your grandfather's pen-test-wannabe security app. Take us through what you guys are doing at SecurityScorecard. >> Yeah. So just like you said, it's not a point-in-time assessment. It's similar to a traditional credit rating, but also a little bit different. You can really think about it in three steps. In step one, what we're doing is threat intelligence data collection. We invest really heavily into the R&D function; we never stop investing in R&D. We collect all of our own data across the entire IPv4 space, all of the different layers. Some of the data we collect is pretty straightforward. We might crawl a website, like the example I was giving: we might crawl a website and see that the website says copyright 2005, but we know it's 2022. Now, while that signal isn't enough to go hack and break into the company, it's definitely a signal that someone might not be keeping things up to date. And if a hacker saw that, it might encourage them to dig deeper. That goes up to more complex signals, where we're running one of the largest DNS sinkhole infrastructures in the world. We're monitoring command-and-control malware and its behaviors. We're essentially collecting signals and vulnerabilities from the entire IPv4 space, the entire network layer, the entire web app layer, leaked credentials. Everything that we think about when we talk about the security onion, we collect data at each one of those layers of the onion. That's step one. And we can produce all sorts of interesting insights, information, and reports just out of that threat intel. Now, step two is really interesting. What we do is we go identify the attack surface area, or what we call the digital footprint, of any company in the world. So as a customer, you can simply type in the name of a company and we identify all of the domains, subdomains, subsidiaries, and organizations that are identified on the internet as belonging to that organization. So every digital asset of every company, we go out and identify, and we update that every 24 hours. And step three is the rating. The rating is probabilistic and it's deterministic. The rating is a benchmark. We're looking at companies compared to their peers of similar size within the same industry, and we're looking at how they're performing. And it's probabilistic in the sense that companies that have an F are about seven to eight times more likely to experience a breach. We're an A through F scale, universally understood. Ds and Fs are more likely to experience a breach; with A's we see fewer breaches. Now, like I was mentioning before, it doesn't mean that an F is always going to get hacked or an A can never get hacked. If a nation state targets an A, they're going to eventually get in with enough persistence and budget. If the pizza shop on the corner has an F, they may never get hacked because no one cares. But it's a natural correlation: more doors open to the house equals a higher likelihood that someone unauthorized is going to walk in. So it's really those three steps: the collection, we map it to the surface area of the company, and then we produce a rating. Today we're rating about 12 million companies every single day.
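As an aside for readers: here is a minimal sketch of how a peer-benchmarked A-through-F grade could be derived from collected signals. The signal names, severity weights, and percentile cutoffs are invented for illustration; this is not SecurityScorecard's actual methodology.

```python
# Hypothetical severity weights for a few illustrative signal types.
SEVERITY = {"stale_copyright": 1.0, "exposed_admin_portal": 8.0,
            "leaked_credentials": 9.0, "malware_beacon": 10.0}

def raw_score(findings):
    """Sum severity-weighted findings; higher means worse hygiene."""
    return sum(SEVERITY[f] for f in findings)

def letter_grade(company_findings, peer_findings_list):
    """Benchmark a company against industry peers of similar size, A-F."""
    score = raw_score(company_findings)
    peers = [raw_score(p) for p in peer_findings_list]
    # Fraction of peers this company scores better (lower) than.
    better_than = sum(1 for p in peers if score < p) / len(peers)
    for cutoff, grade in [(0.8, "A"), (0.6, "B"), (0.4, "C"), (0.2, "D")]:
        if better_than >= cutoff:
            return grade
    return "F"

peers = [["stale_copyright"], ["leaked_credentials", "malware_beacon"],
         ["exposed_admin_portal"], [], ["stale_copyright", "leaked_credentials"]]
print(letter_grade(["stale_copyright"], peers))  # -> "B" for this toy data
```

The probabilistic claim in the interview (Fs are roughly seven to eight times more likely to be breached than As) is an empirical correlation observed on top of a benchmark like this, not something the grade computation itself guarantees.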
>> And how many people do you have as customers? >> We have 50,000 organizations using us, both free and paid. We have a freemium tier where, just like Yelp or a LinkedIn business profile, any company in the world has a right to go claim its score. We never extort companies to fix the score. We never charge a company to see the score or fix it. Any company in the world, without paying us a cent, can go in, understand what we're seeing about them, what a hacker could see about their environment. And then we empower them with the tools to fix it, and they can fix it and the score will go up. Now, companies pay us because they want enterprise capabilities. They want additional modules, insights, which we can talk about. But in total, there's about 50,000 companies, and at any given point in time they're monitoring about a million and a half organizations of the 12 million that we're rating. >> It sounds like Google. >> If you want to look at it that way. >> Sounds like Google Search you've got going on there. You've got a lot of search, and then you create relevance, a score, like a ranking. >> That's precisely it. And that's exactly why Google Ventures invested in us in our Series B round, and they're on our board. They looked and they said, wow, you guys are building like a Google Search engine over some really impressive threat intelligence. And then you're distilling it into a score which anybody in the world can easily understand. >> Yeah. You obviously have PageRank, which changed the organic search business in the late 90s and early 2000s, and the rest is history. AdWords. >> Yeah. >> So you've got a lot of customer growth there potentially with the opt-in customer view, but you're looking at this from the outside in. You're looking at companies and saying, what's your security posture? Getting a feel for what they've got going on and giving them scores. It sounds like it's not hacker-proof. It's just more of an indicator for management and the team. >> It's an indicator. It's an indicator. Because today, when we go look at our vendors, business partners, third parties, we're flying blind. We have no idea how they're doing, how they're performing. So the status quo for the last 20 years has been: perform a risk assessment, send a questionnaire, ask for a pen test and audit evidence. We're trying to break that cycle. Nobody enjoys it. They're long-tail. It's trust without verification. We don't really like that. So we think we can evolve beyond this point-in-time assessment and give a continuous view. Now, historically we've been outside-in. Not intrusive, and we'll show you what a hacker can see about an environment. But we have some cool things percolating under the hood that give more of a 360 view: outside, inside, and also a regulatory compliance view as well. >> Why is the compliance piece of this whole third-party thing that you're engaging with important? Because, obviously, having some way to say who am I dealing with is important. I mean, we hear all kinds of things in the security landscape: oh, zero trust, and then we hear trust, supply chain, software risk, for example.
There's a huge trust factor there. I need to trust this tool or this container. And then you've got zero trust: don't trust anything. And then you've got trust and verify. So you have all these different models and postures, and it just seems hard to keep up with. >> Sam: It's so hard. >> Take us through what that means, 'cause pen tests, SOC reports... I mean, the clouds help with the SOC report, but if you're doing agile, anything DevOps, you'd basically need to do a pen test like every minute. >> It's impossible. The market shifted to the cloud. We watched it happen, and it still is shifting. And that created a lot of complexity. Not to date myself, but when I was starting off as a security practitioner, the data center used to be in the basement, and I would have lunch with the database administrator and we'd talk about how we were protecting the data. Those days are long gone. We outsource a lot of our key business practices. We might use, for example, ADP as a payroll provider, or Dropbox to store our data. But we've shifted, and we no longer know who that person is that's protecting our data. They're sitting in another company, in another area, unknown. And I think about 10, 15 years ago, CISOs had the realization: hey, wait a second, I'm relying on that third party to function and operate and protect my data, but I don't have any insight, visibility, or control over their program. And we were recommended to use questionnaires and audit forms, and those are great. It's good hygiene. It's good practice. Get to know the people that are protecting your data, ask them the questions, get the evidence. The challenge is that it's point-in-time, it's limited. Sometimes the information is inaccurate. Not intentionally; I don't think people intentionally want to go lie. But hey, if there's a $50 million deal we're trying to close and it's dependent on checking this one box, someone might bend a rule a little bit. >> And I've said on theCUBE publicly that I think pen test reports are probably being fudged and dates being replicated, because it's just too fast. And again, today's world is about velocity for developers, trust in the code. So you've got all kinds of trust issues. So I think verification, the blue-check-mark-on-Twitter kind of thing, you're going to see a lot more of that, and I think this is just the beginning. I think what you guys are doing is scratching the surface. I think this outside-in is a good first step, but that's not going to solve the internal problem that's still coming, with big surface areas. So you've got more surface area expanding. I mean, IoT's coming in, the edge is coming fast, never mind hybrid on-premise cloud. What do organizations do to evaluate the risk in a third party? Handshaking, verification, scorecards. Is it like a free look here, or is there more depth to it? Do you double-click on it? Take us through how this evolves. >> John, it's become so disparate and so complex. Because in addition to the market moving to the cloud, we're now completely decentralized. People are working from home or working hybrid, which adds more endpoints. Then what we've learned over time is that it's not just a third-party problem. Because guess what? My third parties behind the scenes are also using third parties. So while I might be relying on them to process my customers' payment information, they're relying on 20 vendors behind the scenes that I don't even know about. I might have an A, they might have an A. It's really important that we expand beyond that.
So coming out of our innovation hub, we've developed a number of key capabilities that allow us to expand the value for the customer. One, you mentioned: outside-in is great, but it's limited. We can see what a hacker sees, and that's helpful. It gives us pointers on where to maybe go ask, double-click, get comfort. But there's a whole other world going on behind the firewall, inside of an organization. And there might be a lot of good things going on that CISOs and security teams need to be rewarded for. So we built an inside module and component that allows teams to start plugging in the tools, the capabilities, the keys to their cloud environments. And that can show anybody who's looking at the scorecard... it's less like a credit score and more like a social platform, where we can go look at someone's profile and say, hey, how are things going on the inside? Do they have two-factor auth? Are their cloud instances configured correctly? And it's not a point in time. This is a live connection that's being made. At any point in time, we can validate that. The other component that we created is called an evidence locker. An evidence locker is like a secure vault in my scorecard, and it allows me to upload things that you don't really scan for or check for: collateral, compliance paperwork, SOC 2 reports. Those things that I always begrudgingly email. I don't want to share my trade secrets, my security policies, and have them sit on someone's exchange server. So instead of having to email the same documents out 300 times a month, I just upload them to my evidence locker. And what's great is that now anybody following my scorecard can proactively see all the great things I'm doing. They see the outside view. They see the inside view. They see the compliance view. And now they have the holy grail view of my environment and can have a more intelligent conversation. >> Access to data and access methods are an interesting innovation area; around data lineage, tracing is becoming a big thing. We're seeing that. I was just talking with the Snowflake co-founder the other day here in theCUBE about data access, and they're building a proprietary mesh on top of the clouds to figure out, hey, I don't want to give just some tool access to data, because I don't know what's on the other side of those tools. Now, they have a robust ecosystem. So I can see this whole vendor-risk supply chain challenge around integration as a huge problem space that you guys are attacking. What's your reaction to that? >> Yeah. Integration is tricky, because we want to be really particular about who we allow access into our environment, or where we're punching holes in the firewall and piping data out of the environment. And that can quickly become unwieldy, even just with what we control directly. Now, if we give access to a third party, we then don't have any control over who they're sharing our information with. When I talk to CISOs today about this challenge, a lot of folks are scratching their heads. A lot of folks treat this as a pet project. Like, how do I control the larger span beyond just the third parties? How do I know that their software partners, the contractors they're working with to build their tools, are doing a good job? And even if I know... meaning, John, you might send me a list of all of your vendors. I don't want to be the bad guy. I don't really have the right to go reach out to my vendors' vendors, knocking on their door saying, hi, I'm Sam, I'm working with John and he's your customer, and I need to make sure that you're protecting my data. It's an awkward chain of conversation.
So we're building some tools that help security teams hold the entire ecosystem accountable. We actually have a capability called automatic vendor discovery. We can go detect who the vendors of a company are based on the connections that we see, the inbound and outbound connections. And what often ends up happening, John, is we're bringing to our customers' attention inbound and outbound connections they had no idea existed. There's the shadow IT and the ghost vendors that were signed on without going through an assessment. We detect those connections, and then they can go triage and reduce the risk accordingly. >> I think that risk assessment of vendors is key. I was just reading a story about this, about how a percentage, I forget the number, but it was pretty large, of applications that aren't even being used are still on in companies. And that becomes a safe haven for bad actors to hang out in and penetrate, 'cause they get overlooked, 'cause no one's using them, but they're still online. And so there's a whole practice, I'd call it cleaning up the old dead applications, that are still connected. >> That happens all the time. Those applications also have applications that are dead, and applications that are alive may also have users that are dead as well. So you have that problem at the application level and at the user level. We also see a permutation of what you describe, which is leftover artifacts due to configuration mistakes. So a company just put up a new data center, a satellite office in Singapore, and they hired a team to go install all the hardware. Somebody accidentally left an administrative portal exposed to the public internet, and nobody knew. The internet works, the lights are on, the office is up and running, but there was something that was supposed to be turned off that was left turned on. So sometimes we bring things to a company's attention and they say, that's not mine, that doesn't belong to me. And we're like, oh, well, we see some reason why it does. >> It's his fault. >> Yeah, and they're like, oh, that was the contractor who set up the thing. They forgot to turn off the administrative portal with the default login credentials. So we shut off those doors. >> Yeah. Sam, this is really something that's not talked about a lot in the industry: we've become so reliant on managed services and other people. CISOs, CIOs, and even all the departments that have applications, even marketing departments, have become reliant on agencies and other parties to do stuff for them, which inherently just increases the risk. So they could inherently be as secure as they can be, yet completely exposed on the other side. >> That's right. We have so many virtual touch points with our partners, our vendors, our managed service providers, suppliers, other third parties, and all the humans that are involved in that mix. It creates just a massive ripple effect. So everybody in a chain can be doing things right, and if there's one bad link, the whole chain breaks. I know it's the cliche analogy, but it rings true. >> Supply chain trust again. Trust who you trust. Let's see how those all reconcile.
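Here is a rough sketch of the automatic vendor discovery Sam described a moment ago: inferring likely vendors from repeated inbound and outbound connections. The log format, domain-to-vendor map, and threshold are all assumptions made up for this example; the real capability obviously draws on far richer signals.

```python
from collections import Counter

# Hypothetical mapping from destination domains to vendor names.
KNOWN_VENDORS = {"api.payroll-example.com": "ExamplePayroll",
                 "sync.filestore-example.net": "ExampleFileStore"}

def discover_vendors(connection_log, min_connections=10):
    """Flag external services a company talks to repeatedly.

    connection_log: iterable of (direction, remote_domain) tuples,
    e.g. ("outbound", "api.payroll-example.com").
    """
    counts = Counter(domain for _, domain in connection_log)
    # Repeated traffic to the same external domain suggests a real dependency.
    return {d: KNOWN_VENDORS.get(d, "unknown vendor")
            for d, n in counts.items() if n >= min_connections}

log = ([("outbound", "api.payroll-example.com")] * 25
       + [("inbound", "sync.filestore-example.net")] * 12
       + [("outbound", "cdn.example.org")] * 3)
print(discover_vendors(log))
# {'api.payroll-example.com': 'ExamplePayroll',
#  'sync.filestore-example.net': 'ExampleFileStore'}
```

Any domain that shows up in output like this but not in the procurement records is a candidate ghost vendor to triage.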
So Sam, I have to ask you. Okay, you're a former CISO. You've seen many movies in this industry, co-founded this company, and you're in the front lines. You've got some cool things happening. I can almost imagine the vision is a lot more than just providing a rating and score. I'm sure there's more vision around intelligence, automation. You mentioned vault, wallet capabilities, exchanging keys. We heard at re:Inforce about automated reasoning, metadata reasoning. You've got all kinds of crypto and quantum. I mean, there's a lot going on that you can tap into. What's your vision for where SecurityScorecard is going? >> When we started the company, the rating was the thing that we sold, and it was a language that helped technical and non-technical folks alike level the playing field, talk about risk, and use it to drive their strategy. Today, the rating just opens the door to that discussion, and there's so much additional value. I think in the next one to two years, we're going to see the rating become standardized. It's going to be more frequently asked for, or even required, or leveraged by key decision makers. When we're doing business, it's going to be like, hey, show me your scorecard. So I'm seeing the rating get baked more and more into the lexicon of risk. But beyond the rating, the goal is really to make the world a safer place. Help transform and raise the tide so all ships can lift. In order to do that, we have to help companies not only identify the risk, but also rectify the risk. So there are tools we build to really understand the full risk. Like we talked about: the inside, the outside, the fourth parties, fifth parties, the real ecosystem. Once we've identified where all the Fs and the bad things are, then what? So, a couple things that we're doing. We've launched a pro-serve arm to help companies. Now, companies don't have to pay to fix the score. Anybody, like I said, can fix the score completely free of charge. But some companies need help. They ask us and they say, hey, I'm looking for a trusted advisor, a Sherpa, a guide to get me to a better place. Or they'll say, hey, I need some pen testing services. So we've added a services arm to help accelerate the remediation efforts. We also partner with different industries that use the rating as part of a larger picture. The cyber rating isn't the end-all be-all. When companies are assessing risk, they may be looking at financial ratings, ESG ratings, KYC, AML, cybersecurity, and they're trying to form a complete risk profile. So we go and integrate into those decision points. Insurance companies, all the top insurers, reinsurers, and brokers, are leveraging SecurityScorecard as an ingredient to help underwrite cyber liability insurance. It's not the only ingredient, but it helps them underwrite, identify the health, and price the risk, so they can push out a policy faster. The first policy is usually the one that's signed, so time to quote is an important metric. We help accelerate that. We partner with credit rating agencies like Fitch, who are talking to board members who are asking, hey, I need a third-party, independent verification of what my CISO is saying. So the CISO is presenting the rating, but so are the proxy advisors and the ratings companies, to the board. So we're helping to inform the boards and evolve how they're thinking about cyber risk. We're helping in the insurance space. I think that, like you said, we're only scratching the surface. Today we have about 50,000 companies engaging with a rating, and there's no reason why it's not going to be in the millions in just the next couple of years here.
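To picture how a rating becomes one "ingredient" in the underwriting workflow Sam just described, here is a toy premium-gating sketch. The function, grade factors, and thresholds are hypothetical and are not any insurer's actual model.

```python
# Toy cyber-liability quote gate: the security rating is one input
# among several, not the whole decision. All numbers are illustrative.
GRADE_FACTOR = {"A": 0.8, "B": 1.0, "C": 1.3, "D": 1.8, "F": 2.5}

def quote_premium(base_premium, grade, annual_revenue_m, prior_incidents):
    """Return (decision, premium) for a cyber liability policy."""
    if grade == "F" and prior_incidents > 0:
        return ("refer to manual underwriting", None)
    size_factor = 1.0 + min(annual_revenue_m / 1000.0, 1.0)  # bigger firms, bigger exposure
    incident_factor = 1.0 + 0.25 * prior_incidents
    premium = base_premium * GRADE_FACTOR[grade] * size_factor * incident_factor
    return ("quote", round(premium, 2))

print(quote_premium(10_000, "B", annual_revenue_m=250, prior_incidents=1))
# ('quote', 15625.0)
```

Because "time to quote" matters, the point of automating the ingredient is that a quote can go out in seconds instead of after a weeks-long questionnaire cycle.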
>> And you've got the capability to bring in more telemetry, see new things, bring that into the index, bring that into the scorecard, and then map that to potentially any vulnerabilities. >> Bingo. >> But like you said, in the old days, when you were dating yourself, you were in a glass room with a door, lock and key, and you could see the two folks in there having lunch, talking databases. No one's going to get hurt. Now that's gone, right? So now you don't know who's out there, and machines. So you've got humans that you don't know, and you've got machines that are turning services on and off, putting containers out there. Who knows what's in those payloads. So a ton of surface area and complexity to weave through. I mean, it's only going to get done with automation. >> It's the only way. Part of our vision includes not attempting to make a faster questionnaire, but to rid ourselves of the process altogether and get more into the continuous assessment mindset. Now look, as a former CISO myself, I don't want another tool to log into. We already have 50 tools we log into every day. Folks don't need a 51st, and that's not the intent. So what we've created today is an automation suite. I call it set-it-and-forget-it; I'm probably dating myself again, but like those old infomercials. And look, you've got what, 50,000 vendors and business partners? Then behind there, there's another hundred thousand that they're using. How are you going to keep track of all those folks? You're not going to log in every day. You're going to set rules and parameters about the things that you care about, and you care depending on the nature of the engagement. If we're exchanging sensitive data on the network layer, you might care about exposed databases. If we're doing it on the app layer, you're going to look at application security vulnerabilities. So what our customers do is they go create rules that say: hey, for any of these companies in my tier-one critical vendor watch list, if they have any of these parameters, if the score drops, if they drop below a B, if they have these issues, take these actions. And the actions could be: send them a questionnaire. We can send the questionnaire for you. You don't have to send pen and paper. Forget about it; you're going to open your email and drag in the Excel spreadsheet? Those days are over. We're done with that. We automate that. You don't want to send a questionnaire? Send a report. We have integrations: notify Slack, create a Jira ticket, pipe it to ServiceNow. Whatever system of record, system of intelligence, or workflow tool companies are using, we write in and allow them to expedite the whole process. We're trying to close the window. We want to close the window of the attack. And in order to do that, we have to bring this to people's attention as quickly as possible. That's not going to happen if it relies on someone logging in every day. So we've got the platform, and then that automation capability on top of it.
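A minimal sketch of what one of those set-it-and-forget-it rules could look like in practice. The rule schema, trigger fields, and action names are invented for illustration; SecurityScorecard's actual rule engine and integrations will differ.

```python
# One watch-list rule: if a tier-one vendor drops below a B, fan out
# actions to the team's existing workflow tools. Schema is hypothetical.
RULE = {
    "watchlist": "tier-1-critical-vendors",
    "trigger": {"grade_below": "B", "issues": ["exposed_database", "leaked_credentials"]},
    "actions": ["send_questionnaire", "create_jira_ticket", "notify_slack"],
}

GRADE_ORDER = "ABCDF"  # earlier letter = better grade

def evaluate(rule, vendor):
    """Return the actions to fire for a vendor snapshot, if any."""
    trig = rule["trigger"]
    dropped = GRADE_ORDER.index(vendor["grade"]) > GRADE_ORDER.index(trig["grade_below"])
    has_issue = bool(set(vendor["issues"]) & set(trig["issues"]))
    return rule["actions"] if (dropped or has_issue) else []

vendor = {"name": "Acme Hosting", "grade": "C", "issues": ["exposed_database"]}
for action in evaluate(RULE, vendor):
    print(f"firing {action} for {vendor['name']}")
```

The point is the fan-out: the alert lands in Slack, Jira, or ServiceNow, wherever the team already works, rather than in yet another console someone has to remember to log into.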
>> I love the vision. I love the utility of a scorecard, a verification mark, something that could be presented: a credential, an image, social proof. Tied to security, with an ongoing way to monitor it, observe it, update it, add value. I think this is only the beginning of what I see as a whole new way to think about credentialing companies. >> I think we're going to reach a point, John, and some of our customers are already doing this, where they're publishing their scorecard in the public domain. Not with the technical details, but an abstracted view. And thought leaders, what they're doing is they're saying: hey, before you send me anything, look at my scorecard, at securityscorecard.com/securityrating and then the name of their company, and it's there. It's in the public domain. If somebody Googles the scorecard for certain companies, it's going to show up in the Google Search results. They can mitigate probably 30 to 40% of inbound requests by just pointing to that thing. So we want to give more of those tools, to turn security from a reactive to a proactive motion. >> Great stuff, Sam. I love it. I'm going to make sure when you hit our site, our company, we've got camouflage sites, so we can make sure you get the right ones. I'm sure we've got some stale copyright dates. >> We can navigate the decoy sites. >> Sam, thanks for coming on. I'm looking forward to speaking more in depth at the upcoming Amazon Startup Showcase, where you guys are going to be presenting. But I really appreciate this conversation. Thanks for sharing what you guys are working on. We really appreciate it. Thanks for coming on. >> Thank you so much, John. Thank you for having me. >> Okay. This is a CUBE Conversation here in Palo Alto, California, coming in from New York City with the co-founder and chief operating officer of securityscorecard.com. I'm John Furrier. Thanks for watching. (gentle music)
Krishna Gade and Amit Paka, Fiddler.ai | AWS Startup Showcase 2021
(upbeat music) >> Hello and welcome to theCUBE as we present the AWS Startup Showcase: The Next Big Thing in AI, Security & Life Sciences, featuring the hottest startups. And today's session is really about the next big thing in AI; the AI track is a big one, most important. And we have a featured company, fiddler.ai. I'm your host, John Furrier, with theCUBE. And we're joined by the founders: Krishna Gade, founder and CEO, and Amit Paka, founder and Chief Product Officer. Great to have the founders on. Gentlemen, thank you for coming on this CUBE segment for the AWS Startup Showcase. >> Thanks, John... >> Good to be here. >> So the topic of this session is staying compliant and accelerating AI adoption with model performance monitoring. Basically, the bottom line is how to be innovative with AI and stay (John laughs) within the rules of the road, if you will. So, super important topic. Everyone knows the benefits of what AI can do. Everyone sees machine learning being embedded in every single application. But the business drivers of compliance and all kinds of new regulations are popping up. So the question is, how do you stay compliant? Which is essentially, how do you not foreclose the future opportunities? That's really the question on everyone's mind these days. So let's get into it. But before we start, let's take a minute to explain what you guys do. Krishna, we'll start with you first. What does fiddler.ai do? >> Absolutely, yeah. Fiddler is a model performance management platform company. We help enterprises and mid-market companies build responsible AI by continuously monitoring their AI, analyzing it, and explaining it, so that they know what's going on with their AI solutions at any given point in time, and they can ensure that their businesses are intact and that they're compliant with all the regulations they have in their industry. >> Everyone thinks AI is a secret sauce. It's magic beans, and it will automatically just change the company. (John laughs) So it's almost a hope. But the reality is there is some value there, though there's something that has to be done first. So let's get into what this model performance management is, because it's a concept that needs to be understood well, but you've also got to implement it properly. There are some foundational things: you've got to crawl before you walk, and walk before you run, kind of thing. So let's get into it. What is model performance management? >> Yeah, that's a great question. So the core software artifact of an AI system is called an AI model. It essentially represents the patterns inside data in a compact manner, so that it can actually predict the future. Now, for example, let's say I'm trying to build an AI-based credit underwriting system. What I would do is look at the historical loans data, you know, good loans and bad loans. And then I would build a model that can capture those patterns, so that when a new customer comes in, I can actually predict how likely they are to default on the loan much more accurately. And this helps me, as a bank or a fintech company, to produce more good loans and to ensure that my customers are getting the right customer service. Now, the problem, though, is that this AI model is a black box. Unlike regular software code, you cannot really open it up, read its code, and see its patterns and how it's doing things. And so that's where the risks around AI models come along. So you need ways to actually explain it. You need to understand it, and you need to monitor it. And this is where a model performance management system like Fiddler can help you look into that black box, understand how it's working, and monitor its predictions continuously, so that you know what these models are doing at any given point in time.
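As a toy version of the credit underwriting model Krishna describes, here is a minimal scikit-learn sketch on fabricated data. The features and the label rule are made up; the point is that the trained artifact predicts default risk without exposing human-readable rules.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# Fabricated historical loans: columns are [income_k, debt_k, fico].
X = rng.normal([70, 20, 680], [25, 12, 60], size=(500, 3))
X[:, 0] = np.clip(X[:, 0], 10, None)  # keep incomes positive for the toy label rule
# Label: 1 = defaulted; high debt-to-income ratios tend to default.
y = (X[:, 1] / X[:, 0] + rng.normal(0, 0.1, 500) > 0.4).astype(int)

model = GradientBoostingClassifier().fit(X, y)

# A new applicant: $55k income, $30k existing debt, 640 FICO.
applicant = np.array([[55.0, 30.0, 640.0]])
print(f"predicted default probability: {model.predict_proba(applicant)[0, 1]:.2f}")
# The fitted ensemble is hundreds of split rules across many trees:
# accurate, but nothing you can read the way you read source code.
```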
>> I mean, I'd love to get your thoughts on this, because on the product side, first of all, totally awesome concept. No one debates that. But now you've got more and more companies integrating with each other, and more data being shared. And so, you know, everyone knows what an app-sec review is, right? But now they're thinking about this concept of how you do a review of models, right? So understanding what's inside the black box is a huge thing. How do you do this? What does it mean? >> Yeah, so typically, just like software, where you would validate code going through QA and analysis, in the case of models you would probe the model at different granularities to really understand how the model is behaving. This could be at the level of a single model prediction: in the case of the loans example Krishna just gave, why is my model assigning high risk to a particular loan? Or it might be explaining groups of loans: for example, why is my model making high-risk predictions for loans made in California, or loans made to all men versus loans made to all women? And it could also be at the global level: what are the key data factors important to my model? So it's the ability to probe the model deeply, really opening up the black box, and then using that knowledge to explain how the model works to non-technical folks in compliance, or to regulators who just want to ensure that they know how the model works, to make sure that it's keeping up with lending regulations, to ensure that it's not biased, and so on. So that's typically the way you would do it with a machine learning model.
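Here is a rough sketch of the group-level probing Amit describes: comparing mean predicted risk across cohorts to spot disparities worth investigating. The model, data, and cohort labels are fabricated; products like Fiddler do this far more rigorously.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Fabricated applicants: two features, plus a cohort label kept out of training.
X = rng.normal(0, 1, size=(400, 2))
groups = np.where(rng.random(400) < 0.5, "cohort_a", "cohort_b")
y = (X[:, 0] + rng.normal(0, 0.5, 400) > 0).astype(int)

model = LogisticRegression().fit(X, y)

def cohort_report(model, X, groups):
    """Print mean predicted risk per cohort; large gaps show where to dig."""
    risk = model.predict_proba(X)[:, 1]
    for g in np.unique(groups):
        mask = groups == g
        print(f"{g}: mean predicted risk {risk[mask].mean():.3f} (n={mask.sum()})")

cohort_report(model, X, groups)
```

A gap between cohorts isn't proof of bias on its own, since cohorts can differ legitimately, but it is exactly the kind of signal a reviewer or regulator will ask about.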
>> Krishna, talk about the potential embarrassments that could happen. You just mentioned some of the use cases you heard from Amit: male, female. I mean, machines aren't that smart. (John laughs) >> Yeah. >> If they don't have the data. >> Yeah. >> And data is fragmented. You've got silos with all kinds of challenges just on the data problem, right? >> Yeah. >> So never mind the machine learning problems. So, this is huge. I mean, the embarrassment opportunities. >> Yeah. >> And the risk management, whether it's a hack or something else. So you've got public embarrassment from doing something that really went wrong, and then you've got the real business impact that could be damaging. >> Absolutely. You know, AI has come forward a lot, right? I mean, you have lots of data these days, you have a lot of computing power and amazing algorithms, so you can actually build really sophisticated models. Some of these models were known to beat humans at image recognition and whatnot. However, the problem is that there are risks in using AI without properly testing it, without properly monitoring it. For example, a couple of years ago, Apple and Goldman Sachs launched a credit card for their users, where they were using algorithms, presumably AI or machine learning algorithms, to set credit limits. What happened was, within the same household, husband and wife got a 10-times difference in the credit limits being set for them. And some of these people had similar FICO scores, similar salary ranges. And some of them went online and complained about it, and that included the likes of Steve Wozniak. >> Yeah. >> These kinds of stories are hugely embarrassing. You could lose customer trust overnight, right? And you have to do a lot of PR damage control. Eventually, there was a regulatory probe into Goldman Sachs. So there are these problems if you're not properly monitoring AI systems, properly validating and testing them before you launch them to users. And that is why tools like Fiddler are coming forward: so that enterprises can do this, so that they can ensure responsible AI for both their organization and their customers. >> That's a great point. I want to get into what this means on the industry side of it, and then how that impacts customers. If you guys don't mind: machine learning operations, a term, MLOps, has been coined in the industry, as you know. Basically, operations around machine learning, which gets into the workflows and development life cycles. But ultimately, as you mentioned, there's this black box and this model being made, with a heavy reliance on data. So Amit, what does this mean? Because now it becomes operational with MLOps. There are now internal workflows and activities and roles and responsibilities. How is this changing organizations? Separate from the embarrassment, which is totally true, now I've got an internal operational aspect, and there's dev involved. What's the issue? >> Yeah, so the whole life cycle of machine learning ops in some ways mirrors the traditional life cycle of DevOps, but in some ways it introduces new complexities. Specifically, because the models can be a black box; that's one thing to watch out for. And secondly, because these models are probabilistic artifacts, which means they are trained on data to capture relationships, with the goal of making high-accuracy predictions. But the data that they see live might actually differ, and that might hurt their performance, especially because machine learning is applied to these high-ROI use cases. So the process of MLOps needs to change to incorporate the fact that machine learning models can be black boxes and that machine learning models can decay. And the second part that's also relevant: because machine learning models can decay, you don't just create one model, you create multiple versions of these models. So you have to constantly stay on top of how your model is deviating from actual reality, and kind of bring it back to that representation of reality. >> So this is interesting, I like this. So now there's a model for the model. You guys have innovated on this model performance management idea. Can you explain the framework and how you guys solve that regulatory compliance piece? Because if you can be a model of the model, if you will... >> Then. >> Then you can have some stability around maintaining the code base, or the integrity, of the model. How does that work? What do you guys offer? Take us through the framework, how it works, and then how it ties to that regulatory piece.
>> So the MPM system, or model performance management system, really sits at the heart of the machine learning workflow: keeping track of the data that is flowing through your ML life cycle, keeping track of the models that are getting created and deployed and how they're performing, keeping track of all the parts of the models. So it gives you a centralized way of managing all of this information in one place, right? It gives you oversight, from a compliance standpoint and from an operational standpoint, of what's going on with your models in production. Imagine you're a bank. You're probably creating hundreds of these models for a variety of use cases: credit risk, fraud, anti-money laundering. How are you going to know which models are actually working well, which models are stale, which models are expired? How do you know which models are underperforming? Are you getting alerts? So this kind of governance, this performance management, is what the system offers. It's a visual interface, lots of dashboards, that developers, operations folks, and compliance folks can go and look into. And they get alerts when things go wrong with respect to their models. In terms of how it can help meet compliance regulations: for example, let's say I'm starting to create a new credit risk model in a bank, and I'm innovating on different AI algorithms here. Immediately, before I even deploy that model, I have to validate it. I have to explain it and create a report, so that I can submit it to my internal risk management team, which can then review it and understand all kinds of risks around it, then potentially share it with the audit team, and keep a log of these reports, so that when a regulator comes and visits, they can share those reports: these are the model reports; this is how the model was created. Fiddler helps them create these reports and keep all of them in one place. And then once the model is deployed, it can help them monitor those models continuously. So they don't just have one ad hoc report from when the model was created up front; they have continuous monitoring, a continuous dashboard, of what it was doing over however many months it has been running. >> You know what? >> Historically, if you were to regulate all AI applications in the U.S., the legacy regulations are the ones that apply today, such as the Equal Credit Opportunity Act or Fed guidance like SR 11-7 that's applicable to all banks. So there is no purpose-built AI regulation, but the EU released a proposed regulation just about three weeks back. It classifies risk within applications, and specifically for high-risk applications it proposes new oversight: mandating explainability, helping teams understand how the models are working, and monitoring to ensure that when a model is trained for high accuracy, it maintains that. So those two mandatory needs for high-risk applications, those are the ones that are solved by Fiddler. >> Yeah. You mentioned explainable AI. Could you just quickly define that for the audience? Because this is a trend we're seeing a lot more of. Take a minute to explain: what is explainable AI? >> Yeah, as I said in the beginning, an AI model is a new kind of software artifact. It is the core of an AI system.
It's what represents all the patterns in the data: it encodes them, and then uses that knowledge to predict the future. Now, how it encodes all of these patterns is black magic, right? >> Yeah. >> You really don't know how the model is working. And so explainable AI is a set of technologies that can help you unlock that black box, quote-unquote debug that model. The model can be introspected, inspected, probed, whatever you want to call it, to understand how it works. For example, let's say I created an AI model that, again, predicts loan risk. Now let's say a person comes to my bank and applies for a $10,000 loan, and the bank, or the model, rejects the loan. Now, why did it do that? That's a question explainable AI can answer. It can answer: hey, the person's salary range is contributing 20% of the loan risk, or this person's previous debt is contributing 30% of the loan risk. So you can get a detailed set of dashboards attributing the composite loan risk across all the inputs the model is observing. And so you now know how the model is treating each of these inputs, and you have an idea of how the person is being affected by this loan risk model. So now as a human, as an underwriter or a lending officer, I have knowledge about how the model is working. I can then layer my human intuition on top of it. I can approve the model's decision sometimes, and I can disapprove it sometimes. I can deliver this feedback to the data science team, the AI team, so they can actually make the model better over time. So unlocking the black box has several benefits throughout the life cycle.
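A back-of-the-envelope version of the attribution Krishna describes can be computed by swapping one input at a time against a baseline value; production explainability tools use more principled methods (for example, Shapley-value-based attributions), but the reading is the same. Everything below, including the model and data, is a toy.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

# Toy loan-risk model over [salary_k, existing_debt_k].
X = rng.normal([60, 15], [20, 10], size=(300, 2))
y = ((X[:, 1] - 0.2 * X[:, 0]) > 0).astype(int)  # risky when debt outpaces salary
model = LogisticRegression().fit(X, y)

def attribute(model, x, baseline, names=("salary", "debt")):
    """Crude per-feature attribution: the change in predicted risk when each
    feature is replaced by a baseline (here, population-average) value."""
    x = np.asarray(x, dtype=float)
    p = model.predict_proba([x])[0, 1]
    contributions = {}
    for i, name in enumerate(names):
        x_swap = x.copy()
        x_swap[i] = baseline[i]
        contributions[name] = p - model.predict_proba([x_swap])[0, 1]
    return p, contributions

risk, contrib = attribute(model, [45, 28], baseline=X.mean(axis=0))
print(f"risk={risk:.2f}", {k: round(v, 2) for k, v in contrib.items()})
```

This leave-one-out style of attribution has known blind spots with correlated features, which is one reason production explainability tools lean on Shapley-style methods instead.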
>> That's awesome. Great definition. Great call. I wanted to get that on the record for the audience; we'll make a clip out of that too. One of the things you brought up, Amit, that I love and want to get into, is this MLOps impact. So as we were talking about earlier: debugging models in production, totally cool, relevant, unpacking the black box. But model decay, that's an interesting concept. Can you explain more? Because this, to me, is potentially a big blind spot for the industry. You know, I talked to Swami at Amazon, who runs their AI group, and they want to make AI easier and ML easier with SageMaker and other tools. But you can fall into a trap of thinking everything's one-and-done. It's iterative. You've got leverage here, but you've got to keep track of the performance of the models, not just debug them. Are they actually working? Is there new data? This is a whole other practice. Could you explain this concept of model decay? >> Yeah, so let's look at the lending example Krishna was just talking about, and think about who you expect your customers to be. Say you have examples in your training set from historical loans made to people between the ages of 40 and, let's say, 70. So you will train your model, and your model will reach its highest accuracy in making loans to those types of applicants. But now let's say you introduce a new loan product targeting, let's say, younger, college-going folks. That model is not trained to work well in those kinds of scenarios. Or it could also happen that you get a lot more older people coming in to apply for these loans. So the data that the model sees live might not represent the data you trained the model with. The model has recognized relationships in the original data, and it might not recognize the relationships in this new data. So this is a constant, I would say ongoing, challenge you face when you have a live model: ensuring that reality meets the representation of reality you had when you trained the model. And this is something that's unique to machine learning models. It has not been a problem historically in the world of DevOps, but it is a very key problem in MLOps.
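One common way to catch the decay Amit describes is to compare live feature distributions against the training distribution and alert when they diverge. Below is a minimal sketch using a Population Stability Index-style score; the binning and the 0.25 alert threshold are conventional rules of thumb, not a standard, and this is not Fiddler's specific method.

```python
import numpy as np

def psi(train_col, live_col, bins=10):
    """Population Stability Index for one feature. Rule of thumb:
    < 0.1 stable, 0.1-0.25 drifting, > 0.25 worth an alert."""
    # Bin over the combined range so out-of-range live values still count.
    edges = np.histogram_bin_edges(np.concatenate([train_col, live_col]), bins=bins)
    expected, _ = np.histogram(train_col, bins=edges)
    actual, _ = np.histogram(live_col, bins=edges)
    # Convert to proportions; the epsilon floor avoids log(0) on empty bins.
    e = np.clip(expected / expected.sum(), 1e-6, None)
    a = np.clip(actual / actual.sum(), 1e-6, None)
    return float(np.sum((a - e) * np.log(a / e)))

rng = np.random.default_rng(3)
train_age = rng.normal(55, 8, 5000)  # trained mostly on middle-aged applicants
live_age = rng.normal(28, 5, 2000)   # new product attracts college-age applicants

score = psi(train_age, live_age)
print(f"PSI={score:.2f} -> {'ALERT: review or retrain' if score > 0.25 else 'ok'}")
```

Running a check like this per feature on every scoring batch is what turns "the model is decaying" from a postmortem finding into an alert.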
>> This is a really great topic. And most people watching might know of some of these problems from when the mainstream press talks about fairness, black versus white skin, and bias in algorithms. I mean, the press tends to cover those kinds of big, high-level topics. But what it really means is that the data (John laughs), in practice, carries fairness issues, bias, skew, and all kinds of new things that come up that the machines just can't handle. This is a big deal, and it's happening to every part of data in an organization. So, great problem statement. I guess the next segue would be: why Fiddler, why now? What are you guys doing? How are you solving these problems? Take us through some use cases, how people engage with you, how you solve the problem, and how you see this evolving. >> Great. So Fiddler is a purpose-built platform to solve for model explainability, model monitoring, and model bias detection. This is the only thing that we do, right? So we are super focused on building this tool to be useful across a variety of AI problems: from financial services to retail, to advertising, to human resources, to healthcare, and so on and so forth. And we have found a lot of commonalities around how data scientists are solving these problems across these industries, and we've created a system that can be plugged into their workflows. For example, I could be a bank creating anti-money-laundering models on a modern AI platform like TensorFlow. Or I could be a retail company building recommendation models in a PyTorch-like library. You can bring all of those models under one umbrella using Fiddler. We can support a variety of heterogeneous types of models, and that is a very, very hard technical problem to solve: to be able to ingest and digest all these different model types and then provide a single pane of glass in terms of how the model is performing, explaining the model, tracking the model life cycle throughout its existence. And so that is the value prop that Fiddler offers the MLOps team, so they can get this oversight. And this plugs in nicely with their MLOps, so they don't have to change anything, and it gives the additional benefit... >> So you're basically creating faster outcomes, because the teams can work on real problems. >> Right. >> And not have to deal with the maintenance of model management. >> Right. >> Whether it's debugging or decay evaluations, right? >> Right. We take care of all of their model operations from a monitoring standpoint, analysis standpoint, debuggability, alerting. So that they can just build the right kind of models for their customers. And we give them all the insights and intelligence to know the problems behind those models and behind their datasets, so that they can actually build more accurate, more responsible models for their customers. >> Okay, Amit, give us the secret sauce. What's going on in the product? How does it all work? What's the secret sauce? >> So there are three key pillars to the Fiddler product. One is, of course, that we leverage the latest research, and we actually productize it in amazing ways, where when you explain models, you get the explanation within a second. And this activates new use cases, like counterfactual analysis: you can not only get explanations for your loan, you can also see, hypothetically, what if this loan applicant had a higher income? What would the model do? So that's one part: productizing the latest research. The second part is infrastructure at scale. We are not just building something that would work for SMBs. We are building something that works at enterprise scale: billions and billions of predictions flowing through the system. We want to make sure that we can handle as large a scale as seamlessly as possible, and make sure we are the best enterprise-grade product on the market. And thirdly, user experience: what you see when you use Fiddler. When we do demos for customers, what they really see is the product. They don't see the scale right then and there. They don't see the deep research. What they see are these beautiful experiences that are very intuitive to them, where we've merged explainability and monitoring and bias detection in a seamless way. So you get the most intuitive experiences, designed not just for the technical user, but also for the non-technical users who are also stakeholders in AI. >> So the scale thing is a huge point, by the way. I think that's something you see in successful companies. That's a differentiator and, frankly, it's the new sustainability. So new lock-in, if you will, not in a bad way but in a good way: you do a good job, you get scale, you get leverage. I want to just point out, and get your guys' thoughts on, your approach in the framework, where you guys are centralized. So as decentralization continues to be a wave, you guys are taking a much more centralized approach. Why is that? Take us through the decision on that. >> Yeah. So, in terms of decentralization, running models in different containers and scoring them across multiple nodes absolutely makes sense from a deployment standpoint, from an inference standpoint. But when it comes to actually understanding how the models are working, visualizing them, monitoring them, knowing what's going on with the models, you need a centralized dashboard that a compliance user, or a head of AI governance inside a bank, can actually use: what are all the models that my team is shipping? Which models carry risk? How were these models performing last week? For this, you need a centralized repository. Otherwise, it will be very, very hard to track these models, right? Because the number of models is going to grow really, really fast. There are so many open-source libraries and open-source model architectures being produced, and so many data scientists coming out of grad schools and whatnot. The number of models in the enterprise is just going to grow many, many fold in the coming years.
Now, how are you going to track all of these things without having a centralized platform? That's what we envisaged a few years ago: that every team will need an oversight tool like Fiddler, which can keep track of all of their models in one place. And that's what we are finding with our customers. >> As long as you don't get in the way of them creating value, which is the goal, right? >> Right. >> And be frictionless, take away the friction. >> Yeah. >> And enable it. Love the concept. I think you guys are onto something big there. Great products, great vision. The question I have for you, to kind of wrap things up here: this is all new, right? And new is all goodness, right? You've got scale in the cloud, all these new benefits. Again, more techies coming out of grad school in Computer Science and Engineering, and data analysis in general is changing, and more people are being brought in, democratized, to contribute. >> Right. >> How do you operationalize it? How do companies get this going? Because you've got a new thing happening. It's a new wave. >> Okay. >> But it's still the same game: make business run better. >> Right. >> So you've got to deploy something new. What's the operational playbook for companies to get started? >> Absolutely. The first step, if a company is trying to adopt AI, is to incorporate AI into their workflow. You know, most companies, I would say, are still in early stages, right? A lot of enterprises are still developing these models; some of them may still be in labs. ML operationalization is starting to happen, and it probably started a year or two ago, right? So when it comes to putting AI into practice: so far, you could have AI models in labs, and they're not going to hurt anyone. They're not going to hurt your business. They're not going to hurt your users. But once you operationalize them, then you have to do it in a proper manner, in a responsible manner, in a trustworthy manner. And we actually have a playbook for how you do this, right? How are you going to test these models? How are you going to analyze and validate them before they are actually deployed? How are you going to analyze and look into data bias, training-set bias, or test-set bias? And once they are deployed to production: are you tracking model performance over time? Are you tracking drifting models? You know, the decay part that we talked about. Do you have alerts in place for when model performance goes all over the place? Now, all of a sudden, you get a lot of false positives in your fraud models; are you able to track them? Do you have the personnel in place, the data scientists, the ML engineers, the MLOps engineers, the governance teams if it's a regulated industry, to use these tools? And then tools like Fiddler will add value, will help them do their job and institutionalize this process of responsible AI, so that they're not only reaping the benefits of this great technology, which is going to be game-changing, no doubt about it, but also doing it in a responsible and trustworthy manner. >> Yeah, it's really: get some wins, get some momentum, see it. This is the cloud way. It gets them some value immediately, and they grow from there. I was talking to a friend the other day, Amit, about IT, and the lecture I gave was: I don't worry about IT and all the cloud. There's no longer IT; IT is dead. It's an AI department now.
(Amit laughs) So this is kind of what you guys are getting at: now it's data, now it's AI. It's kind of like what IT used to be, enabling organizations to be successful. You guys are looking at it from the perspective of enabling success the same way. You provision algorithms instead of servers now. (John laughs) This is the new model. >> Yeah, we believe that all companies in the future, just as it happened with this wave of data, are going to be AI companies, right? So it's really just a matter of time. And the companies that are first movers in this are going to have a significant advantage. We're seeing that in banking already, where the banks that have made the leap into AI models are reaping the benefits of enabling a lot more models at the same risk profile, using deep learning models, as long as they're able to validate these models to ensure that they're meeting the regulations. It's going to give significant advantages to a lot of companies as they move faster with respect to others in the same industry. >> Yeah, and quicker too. I saw a friend on the compliance side... You mentioned trust and transparency with the whole EU thing. Some are saying that, to be a public company, you're going to have to have AI disclosure soon. You're going to have to have disclosure in your public statements around how you're explaining your AI. A fantasy today, but pretty plausible. >> Right, absolutely. I mean, the reality today is that less than 10% of CEOs care about ethical AI, right? And that has to change. And I think that has to change for the better, because at the end of the day, if you are not using AI in a responsible and trustworthy manner, then there is regulatory risk, there is compliance risk, there's operational business risk. You know, customer trust: losing customers' trust can be huge. So we want to provide that insurance, or like a preventative mechanism, so that if you have these tools in place, you're less likely to get into those situations. >> Awesome. Great, great conversation, Krishna, Amit. Thank you for sharing. Both the founders of Fiddler.ai, a great company, on the right side of history in my opinion, the next big thing in AI. AI departments, AI compliance, AI reporting. (John laughs) Explainable AI, ethical AI, all part of this next revolution. Gentlemen, thank you for joining us on theCUBE's Amazon Startup Showcase. >> Thanks for having us, John. >> Okay, it's theCUBE coverage. Thank you for watching. (upbeat music)