Nimrod Vax, BigID | AWS re:Invent 2020 Partner Network Day

>> Announcer: From around the globe, it's theCUBE. With digital coverage of AWS re:Invent 2020. Special coverage sponsored by AWS global partner network. >> Okay, welcome back everyone to theCUBE virtual coverage of re:Invent 2020 virtual. Normally we're in person, this year because of the pandemic we're doing remote interviews and we've got a great coverage here of the APN, Amazon Partner Network experience. I'm your host John Furrier, we are theCUBE virtual. Got a great guest from Tel Aviv remotely calling in and videoing, Nimrod Vax, who is the chief product officer and co-founder of BigID. This is the beautiful thing about remote, you're in Tel Aviv, I'm in Palo Alto, great to see you. We're not in person but thanks for coming on. >> Thank you. Great to see you as well. >> So you guys have had a lot of success at BigID, I've noticed a lot of awards, startup to watch, company to watch, kind of a good market opportunity data, data at scale, identification, as the web evolves beyond web presence identification, authentication is super important. You guys are called BigID. What's the purpose of the company? Why do you exist? What's the value proposition? >> So first of all, best startup to work at based on Glassdoor worldwide, so that's a big achievement too. So look, four years ago we started BigID when we realized that there is a gap in the market between the new demands from organizations in terms of how to protect their personal and sensitive information that they collect about their customers, their employees. The regulations were becoming more strict but the tools that were out there, to the large extent still are there, were not providing to those requirements and organizations have to deal with some of those challenges in manual processes, right? For example, the right to be forgotten. Organizations need to be able to find and delete a person's data if they want to be deleted. That's based on GDPR and later on even CCPA. And organizations have no way of doing it because the tools that were available could not tell them whose data it is that they found. The tools were very siloed. They were looking at either unstructured data and file shares or windows and so forth, or they were looking at databases, there was nothing for Big Data, there was nothing for cloud business applications. And so we identified that there is a gap here and we addressed it by building BigID basically to address those challenges. >> That's great, great stuff. And I remember four years ago when I was banging on the table and saying, you know regulation can stunt innovation because you had the confluence of massive platform shifts combined with the business pressure from society. That's not stopping and it's continuing today. You seeing it globally, whether it's fake news in journalism, to privacy concerns where modern applications, this is not going away. You guys have a great market opportunity. What is the product? What is smallID? What do you guys got right now? How do customers maintain the success as the ground continues to shift under them as platforms become more prevalent, more tools, more platforms, more everything? >> So, I'll start with BigID. What is BigID? So BigID really helps organizations better manage and protect the data that they own. And it does that by connecting to everything you have around structured databases and unstructured file shares, big data, cloud storage, business applications and then providing very deep insight into that data. Cataloging all the data, so you know what data you have where and classifying it so you know what type of data you have. Plus you're analyzing the data to find similar and duplicate data and then correlating them to an identity. Very strong, very broad solution fit for IT organization. We have some of the largest organizations out there, the biggest retailers, the biggest financial services organizations, manufacturing and et cetera. What we are seeing is that there are, with the adoption of cloud and business success obviously of AWS, that there are a lot of organizations that are not as big, that don't have an IT organization, that have a very well functioning DevOps organization but still have a very big footprint in Amazon and in other kind of cloud services. And they want to get visibility and they want to do it quickly. And the SmallID is really built for that. SmallID is a lightweight version of BigID that is cloud-native built for your AWS environment. And what it means is that you can quickly install it using CloudFormation templates straight from the AWS marketplace. Quickly stand up an environment that can scan, discover your assets in your account automatically and give you immediate visibility into that, your S3 bucket, into your DynamoDB environments, into your EMR clusters, into your Athena databases and immediately building a full catalog of all the data, so you know what files you have where, you know where what tables, what technical metadata, operational metadata, business metadata and also classified data information. So you know where you have sensitive information and you can immediately address that and apply controls to that information. >> So this is data discovery. So the use case is, I'm an Amazon partner, I mean we use theCUBE virtuals on Amazon, but let's just say hypothetically, we're growing like crazy. Got S3 buckets over here secure, encrypted and the rest, all that stuff. Things are happening, we're growing like a weed. Do we just deploy smallIDs and how it works? Is that use cases, SmallID is for AWS and BigID for everything else or? >> You can start small with SmallID, you get the visibility you need, you can leverage the automation of AWS so that you automatically discover those data sources, connect to them and get visibility. And you could grow into BigID using the same deployment inside AWS. You don't have to switch migrate and you use the same container cluster that is running inside your account and automatically scale it up and then connect to other systems or benefit from the more advanced capabilities the BigID can offer such as correlation, by connecting to maybe your Salesforce, CRM system and getting the ability to correlate to your customer data and understand also whose data it is that you're storing. Connecting to your on-premise mainframe, with the same deployment connecting to your Google Drive or office 365. But the point is that with the smallID you can really start quickly, small with a very small team and get that visibility very quickly. >> Nimrod, I want to ask you a question. What is the definition of cloud native data discovery? What does that mean to you? >> So cloud native means that it leverages all the benefits of the cloud. Like it gets all of the automation and visibility that you get in a cloud environment versus any traditional on-prem environment. So one thing is that BigID is installed directly from your marketplace. So you could browse, find its solution on the AWS marketplace and purchase it. It gets deployed using CloudFormation templates very easily and very quickly. It runs on a elastic container service so that once it runs you can automatically scale it up and down to increase the scan and the scale capabilities of the solution. It connects automatically behind the scenes into the security hub of AWS. So you get those alerts, the policy alerts fed into your security hub. It has integration also directly into the native logging capabilities of AWS. So your existing Datadog or whatever you're using for monitoring can plug into it automatically. That's what we mean by cloud native. >> And if you're cloud native you got to be positioned to take advantage of the data and machine learning in particular. Can you expand on the role of machine learning in your solution? Customers are leaning in heavily this year, you're seeing more uptake on machine learning which is basically AI, AI is machine learning, but it's all tied together. ML is big on all the deployments. Can you share your thoughts? >> Yeah, absolutely. So data discovery is a very tough problem and it has been around for 20 years. And the traditional methods of classifying the data or understanding what type of data you have has been, you're looking at the pattern of the data. Typically regular expressions or types of kind of pattern-matching techniques that look at the data. But sometimes in order to know what is personal or what is sensitive it's not enough to look at the pattern of the data. How do you distinguish between a date of birth and any other date. Date of birth is much more sensitive. How do you find country of residency or how do you identify even a first name from the last name? So for that, you need more advanced, more sophisticated capabilities that go beyond just pattern matching. And BigID has a variety of those techniques, we call that discovery-in-depth. What it means is that very similar to security-in-depth where you can not rely on a single security control to protect your environment, you can not rely on a single discovery method to truly classify the data. So yes, we have regular expression, that's the table state basic capability of data classification but if you want to find data that is more contextual like a first name, last name, even a phone number and distinguish between a phone number and just a sequence of numbers, you need more contextual NLP based discovery, name entity recognition. We're using (indistinct) to extract and find data contextually. We also apply deep learning, CNN capable, it's called CNN, which is basically deep learning in order to identify and classify document types. Which is basically being able to distinguish between a resume and a application form. Finding financial records, finding medical records. So RA are advanced NLP classifiers can find that type of data. The more advanced capabilities that go beyond the smallID into BigID also include cluster analysis which is an unsupervised machine learning method of finding duplicate and similar data correlation and other techniques that are more contextual and need to use machine learning for that. >> Yeah, and unsupervised that's a lot harder than supervised. You need to have that ability to get that what you can't see. You got to get the blind spots identified and that's really the key observational data you need. This brings up the kind of operational you heard cluster, I hear governance security you mentioned earlier GDPR, this is an operational impact. Can you talk about how it impacts on specifically on the privacy protection and governance side because certainly I get the clustering side of it, operationally just great. Everyone needs to get that. But now on the business model side, this is where people are spending a lot of time scared and worried actually. What the hell to do? >> One of the things that we realized very early on when we started with BigID is that everybody needs a discovery. You need discovery and we actually started with privacy. You need discovery in route to map your data and apply the privacy controls. You need discovery for security, like we said, right? Find and identify sensitive data and apply controls. And you also need discovery for data enablement. You want to discover the data, you want to enable it, to govern it, to make it accessible to the other parts of your business. So discovery is really a foundation and starting point and that you get there with smallID. How do you operationalize that? So BigID has the concept of an application framework. Think about it like an Apple store for data discovery where you can run applications inside your kind of discovery iPhone in order to run specific (indistinct) use cases. So, how do you operationalize privacy use cases? We have applications for privacy use cases like subject access requests and data rights fulfillment, right? Under the CCPA, you have the right to request your data, what data is being stored about you. BigID can help you find all that data in the catalog that after we scan and find that information we can find any individual data. We have an application also in the privacy space for consent governance right under CCP. And you have the right to opt out. If you opt out, your data cannot be sold, cannot be used. How do you enforce that? How do you make sure that if someone opted out, that person's data is not being pumped into Glue, into some other system for analytics, into Redshift or Snowflake? BigID can identify a specific person's data and make sure that it's not being used for analytics and alert if there is a violation. So that's just an example of how you operationalize this knowledge for privacy. And we have more examples also for data enablement and data management. >> There's so much headroom opportunity to build out new functionality, make it programmable. I really appreciate what you guys are doing, totally needed in the industry. I could just see endless opportunities to make this operationally scalable, more programmable, once you kind of get the foundation out there. So congratulations, Nimrod and the whole team. The question I want to ask you, we're here at re:Invent's virtual, three weeks we're here covering Cube action, check out theCUBE experience zone, the partner experience. What is the difference between BigID and say Amazon's Macy? Let's think about that. So how do you compare and contrast, in Amazon they say we love partnering, but we promote our ecosystem. You guys sure have a similar thing. What's the difference? >> There's a big difference. Yes, there is some overlap because both a smallID and Macy can classify data in S3 buckets. And Macy does a pretty good job at it, right? I'm not arguing about it. But smallID is not only about scanning for sensitive data in S3. It also scans anything else you have in your AWS environment, like DynamoDB, like EMR, like Athena. We're also adding Redshift soon, Glue and other rare data sources as well. And it's not only about identifying and alerting on sensitive data, it's about building full catalog (indistinct) It's about giving you almost like a full registry of your data in AWS, where you can look up any type of data and see where it's found across structured, unstructured big data repositories that you're handling inside your AWS environment. So it's broader than just for security. Apart from the fact that they're used for privacy, I would say the biggest value of it is by building that catalog and making it accessible for data enablement, enabling your data across the board for other use cases, for analytics in Redshift, for Glue, for data integrations, for various other purposes. We have also integration into Kinesis to be able to scan and let you know which topics, use what type of data. So it's really a very, very robust full-blown catalog of the data that across the board that is dynamic. And also like you mentioned, accessible to APIs. Very much like the AWS tradition. >> Yeah, great stuff. I got to ask you a question while you're here. You're the co-founder and again congratulations on your success. Also the chief product officer of BigID, what's your advice to your colleagues and potentially new friends out there that are watching here? And let's take it from the entrepreneurial perspective. I have an application and I start growing and maybe I have funding, maybe I take a more pragmatic approach versus raising billions of dollars. But as you grow the pressure for AppSec reviews, having all the table stakes features, how do you advise developers or entrepreneurs or even business people, small medium-sized enterprises to prepare? Is there a way, is there a playbook to say, rather than looking back saying, oh, I didn't do with all the things I got to go back and retrofit, get BigID. Is there a playbook that you see that will help companies so they don't get killed with AppSec reviews and privacy compliance reviews? Could be a waste of time. What's your thoughts on all this? >> Well, I think that very early on when we started BigID, and that was our perspective is that we knew that we are a security and privacy company. So we had to take that very seriously upfront and be prepared. Security cannot be an afterthought. It's something that needs to be built in. And from day one we have taken all of the steps that were needed in order to make sure that what we're building is robust and secure. And that includes, obviously applying all of the code and CI/CD tools that are available for testing your code, whether it's (indistinct), these type of tools. Applying and providing, penetration testing and working with best in line kind of pen testing companies and white hat hackers that would look at your code. These are kind of the things that, that's what you get funding for, right? >> Yeah. >> And you need to take advantage of that and use them. And then as soon as we got bigger, we also invested in a very, kind of a very strong CSO that comes from the industry that has a lot of expertise and a lot of credibility. We also have kind of CSO group. So, each step of funding we've used extensively also to make RM kind of security poster a lot more robust and invisible. >> Final question for you. When should someone buy BigID? When should they engage? Is it something that people can just download immediately and integrate? Do you have to have, is the go-to-market kind of a new target the VP level or is it the... How does someone know when to buy you and download it and use the software? Take us through the use case of how customers engage with. >> Yeah, so customers directly have those requirements when they start hitting and having to comply with regulations around privacy and security. So very early on, especially organizations that deal with consumer information, get to a point where they need to be accountable for the data that they store about their customers and they want to be able to know their data and provide the privacy controls they need to their consumers. For our BigID product this typically is a kind of a medium size and up company, and with an IT organization. For smallID, this is a good fit for companies that are much smaller, that operate mostly out of their, their IT is basically their DevOps teams. And once they have more than 10, 20 data sources in AWS, that's where they start losing count of the data that they have and they need to get more visibility and be able to control what data is being stored there. Because very quickly you start losing count of data information, even for an organization like BigID, which isn't a bigger organization, right? We have 200 employees. We are at the point where it's hard to keep track and keep control of all the data that is being stored in all of the different data sources, right? In AWS, in Google Drive, in some of our other sources, right? And that's the point where you need to start thinking about having that visibility. >> Yeah, like all growth plan, dream big, start small and get big. And I think that's a nice pathway. So small gets you going and you lead right into the BigID. Great stuff. Final, final question for you while I gatchu here. Why the awards? Someone's like, hey, BigID is this cool company, love the founder, love the team, love the value proposition, makes a lot of sense. Why all the awards? >> Look, I think one of the things that was compelling about BigID from the beginning is that we did things differently. Our whole approach for personal data discovery is unique. And instead of looking at the data, we started by looking at the identities, the people and finally looking at their data, learning how their data looks like and then searching for that information. So that was a very different approach to the traditional approach of data discovery. And we continue to innovate and to look at those problems from a different perspective so we can offer our customers an alternative to what was done in the past. It's not saying that we don't do the basic stuffs. The Reg X is the connectivity that that is needed. But we always took a slightly different approach to diversify, to offer something slightly different and more comprehensive. And I think that was the thing that really attracted us from the beginning with the RSA Innovation Sandbox award that we won in 2018, the Gartner Cool Vendor award that we received. And later on also the other awards. And I think that's the unique aspect of BigID. >> You know you solve big problems than certainly as needed. We saw this early on and again I don't think that the problem is going to go away anytime soon, platforms are emerging, more tools than ever before that converge into platforms and as the logic changes at the top all of that's moving onto the underground. So, congratulations, great insight. >> Thank you very much. >> Thank you. Thank you for coming on theCUBE. Appreciate it Nimrod. Okay, I'm John Furrier. We are theCUBE virtual here for the partner experience APN virtual. Thanks for watching. (gentle music)

Published Date : Dec 3 2020

SUMMARY :

Announcer: From around the globe, of the APN, Amazon Partner Great to see you as well. So you guys have had a For example, the right to be forgotten. What is the product? of all the data, so you know and the rest, all that stuff. and you use the same container cluster What is the definition of Like it gets all of the automation of the data and machine and need to use machine learning for that. and that's really the key and that you get there with smallID. Nimrod and the whole team. of the data that across the things I got to go back These are kind of the things that, and a lot of credibility. is the go-to-market kind of And that's the point where you need and you lead right into the BigID. And instead of looking at the data, and as the logic changes at the top for the partner experience APN virtual.

ENTITIES

Entity	Category	Confidence
AWS	ORGANIZATION	0.99+
Nimrod Vax	PERSON	0.99+
Nimrod	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
Palo Alto	LOCATION	0.99+
Tel Aviv	LOCATION	0.99+
2018	DATE	0.99+
Glassdoor	ORGANIZATION	0.99+
BigID	TITLE	0.99+
200 employees	QUANTITY	0.99+
iPhone	COMMERCIAL_ITEM	0.99+
BigID	ORGANIZATION	0.99+
Apple	ORGANIZATION	0.99+
SmallID	TITLE	0.99+
GDPR	TITLE	0.99+
four years ago	DATE	0.98+
billions of dollars	QUANTITY	0.98+
Redshift	TITLE	0.98+
CloudFormation	TITLE	0.97+
both	QUANTITY	0.97+
DynamoDB	TITLE	0.97+
single	QUANTITY	0.97+
CNN	ORGANIZATION	0.97+
this year	DATE	0.97+
EMR	TITLE	0.97+
one thing	QUANTITY	0.97+
One	QUANTITY	0.96+
one	QUANTITY	0.96+
each step	QUANTITY	0.95+
Amazon Partner Network	ORGANIZATION	0.95+
three weeks	QUANTITY	0.95+
APN	ORGANIZATION	0.95+
20 years	QUANTITY	0.95+
S3	TITLE	0.94+
Athena	TITLE	0.94+
office 365	TITLE	0.94+
today	DATE	0.93+
first name	QUANTITY	0.92+
smallIDs	TITLE	0.91+
Gartner Cool Vendor	TITLE	0.91+
Kinesis	TITLE	0.91+
20 data sources	QUANTITY	0.9+
RSA Innovation Sandbox	TITLE	0.88+
CCP	TITLE	0.88+
Invent 2020 Partner Network Day	EVENT	0.88+
smallID	TITLE	0.88+
more than 10,	QUANTITY	0.88+
Macy	ORGANIZATION	0.86+

Recommend Videos

Sentiment Analysis

AWS Comprehend

Search Results for Gartner Cool Vendor: