Yinglian Xie, DataVisor | CUBEConversation, November 2018
(upbeat music)
>> Okay, welcome to theCUBE everyone. This is a CUBE Conversation here in Palo Alto, California in the CUBE studios. I'm John Furrier, the co-founder of SiliconANGLE Media, the host of the CUBE. I'm here with Yinglian Xie. She's the co-founder and CEO of DataVisor, entrepreneur, former Microsoft researcher. Thanks for joining me in CUBE Conversation.
>> My great pleasure to be here.
>> So I'm excited to chat with you because you've got a really hot company, and a very hot space, but also as an entrepreneur, you're out competing against a huge wave of transformation. You've got big clouds out there, you've got IT enterprises moving to some sort of cloud operating model. You have a global IoT market, a huge security problem. You guys are trying to solve that with DataVisor, your company. So take me through the journey. First take a minute to explain what DataVisor is, and I want to ask you about how you got into this business, how it started. So what does DataVisor do? First give a one minute overview of the company.
>> Sure, so DataVisor is a company that uses AI, machine learning, and big data, trying to detect and prevent a variety of fraud and abuse problems for all these consumer facing enterprises. So our mission is to really leverage these advanced technologies that you talk about in many of these, and to help these consumer facing enterprises to establish and restore trust to the end users like you and me, like every one of us.
>> Yes, cyber security and security in general is a global issue. I mean, spear phishing is just so effective, you just come in and just send someone a LinkedIn message or an email, they click on a link and you're done. There's not much technology. People are struggling with this, but you guys have a unique approach that you're taking with DataVisor so I want to dig into it. But first, how did it all start?
When you started this company with your co-founder, did you just wake up one day and say, you know what, we're going to go solve the security problems for the world? Where did the idea come from and how did it all start?
>> So I would say it's probably, if you look at the background of me and my co-founder, it's probably the natural journey to it, because we actually came from a research and academia background. And me spending seven years of my post doc research in Silicon Valley before starting DataVisor, from there when we joined in 2006, actually it was where we kind of just saw this parallel computing paradigm. Like the MapReduce paper just got published, and all the data is available, we have all these security problems and at that time we were partnering with a number of large consumer facing groups in Microsoft, to see how we can use this big data to solve some of the challenges that they face in terms of, for example, online fraud and abuse. And also we saw the industry was rapidly getting into the digital era where we have billions of users online, so everybody sees this unique challenge of, they have a variety of vulnerabilities they face, they're trying to bring more rich features to users. At the same time, they see new fraud coming up also very rapidly. So everybody, when they see new fraud, they are trying to have point solutions. Where they say, let's just tackle this, but then afterwards there's another fraud, or another abuse coming up.
>> Throw another tool at 'em. Build another tool. Buy another tool.
>> Exactly. Kind of an arms race, where they're being reactive, and caught in a cat and mouse game. So we decided, let's just come to see whether we can build something different and leverage AI and machine learning, and then we see what this new cloud computing, big data infrastructure can do.
So let's build something a little bit more proactive, so that, we've been in the security area for so long, that we feel there's something fundamental that can be a game changer. It's only when we don't make assumptions about what kind of attacks we want to detect. But be a little bit more open to say, let's try to build something more robust, that can have the ability to automatically discover and detect these new types of unknown attacks more proactively.
>> Yinglian, I want to talk about that point, about your time at Microsoft. At that time around 2006, I think it's notable because the environment of Microsoft scale was massive. They were powering, the browsers were everywhere, MSN, the online services that Microsoft had were certainly large scale, but they were built on what I would call gen one internet technology. Databases, big large scale. At the time there, the new entrants, Facebook and others, they were building all their own tech. So you had kind of the new entrant who had a clean sheet of paper, and they built their own large scale. And we know the history of that, those kinds of companies that were native at that time. That's the environment that Microsoft had, that a lot of customers today have. They have technologies that have been around, they have to transform very quickly. So when you learned about some of those data collection capabilities at scale of older technologies, and rushing to a new solution, this is a problem that a lot of end user enterprises have. CIOs, cloud architects, data architects, and they've been operating data warehouses for generations. Big fenced off databases, slow, big data lakes turning into swamps. So that's the current situation, how do you guys speak to that? Because this is the number one challenge we see. It's, I have all this data, I've got a data problem. I'm now full of data, I'm being taken advantage of with the fraud. Whether it's spear phishing or some other scams that are going on with email and all this stuff.
How do you guys talk to that customer, that environment?
>> You're definitely very spot on about the challenges and problems that we all face. So while we get into the digital era, everybody has this great sense of trying to collect data and store that data. So the amount of data we collect is tremendous nowadays. The next step everybody was looking at, the big challenge for us, is how to make value of this in a more effective way. And we also talk a lot about AI and machine learning, how they can transform some of the way we did things in the past. The analogy we know is how do we go from manually driven cars to the self driving era of having all the automation intelligence, and making value out of this. So there are still a lot of challenges that you definitely touch upon. First of all, when they have the data there, does that mean we have the data in a consistent, consolidated way? Many times, two different divisions, departments collecting data, they're still in silo mode. So how to bring the data together. And second is, we have the data, we have the computing power, how do we bring the algorithms that operate on top of that, the framework, to have a system that would let the algorithms generate value. Like in the fraud detection space, be able to automatically process a huge amount of data, and make decisions in real time. Instantly, detecting these new types of attacks. So we find that's a problem beyond the silo of just an IT problem, or just a data science problem, or just a business problem. So many times these three groups still sort of work separately, but in the end we need domain knowledge, we need to build a system, and we need good data architecture to solve them together. So that's where DataVisor is building a solution, the ecosystem to consider all of this.
>> Okay, so let's talk about the ecosystem a little bit later. I want to get to the algorithm piece. That seems to be your secret sauce, right?
The algorithms? Is that where the action is for you guys? The secret algorithms, or is it the setup in the environment first? It kind of makes sense, you've got to set the table first, get the data unified or addressable, and then apply software algorithms to them. That's where the AI comes in. What's your secret sauce?
>> Yeah, so that's a good question. A lot of our customers ask us the same question, is the algorithm your secret sauce? And my answer is kind of partially yes, but also at the same time, not completely. Because we're all catching up very rapidly in algorithms, if you look at the new algorithms being published every year. There are a lot of great ideas out there, great algorithms there. So our unique algorithm, the differentiating technology, is called unsupervised machine learning. So unsupervised means we don't need to require customers to have historical loss experience, or need to know the training labels of what past attacks look like. So to proactively discover new types of unknown attacks and automate it away. So that's what the algorithm part is, and it has its merit.
>> And by the way, people who want to know about this supervised and unsupervised machine learning, go Google search, there's some papers out there. But I think, most people know this, or might not know it, it's really hard to do unsupervised machine learning because with supervised you just tell it what to look for, it finds it. Unsupervised is saying be ready for anything, basically. Oversimplifying.
>> Exactly, unsupervised means we want it to make decisions without assumptions. And we want to be able to discover those patterns as the attackers evolve and be very adaptive. So that's definitely a great idea out there. I would say if you Google, like search unsupervised, you would find in academia there are published articles about it. So I wouldn't say it's a completely new concept, it's a concept out there.
>> It's been around for a while, but the compute is the value.
Because now you have the computation to accelerate all those calculations that used to be stalling it, from 10 years ago. I mean it's been around for a couple decades, AI and machine learning, but it's been computation intensive.
>> Very much so, very much so. So if you look at the gap that keeps the academia side of the world's algorithms from where it's working, it is something similar to deep learning, which requires a lot more computational complexity compared to the past algorithms.
>> Yinglian, I've got to ask you, because this comes up and I'll skip back to the reality of the customer. Because I can geek out on this all day long, I love the conversation, and we should certainly do a follow up deep dive with our team. But the reality is customers have been consolidating and outsourcing IT for generations. And only a few years ago did they wake up, and some woke up earlier than others and said, wow I have no intellectual property, I have no competitive advantage, my IT's all outsourced, I am getting killed with requests for top line revenue growth and I'm getting killed with security breaches, and where's my IT staff. So they don't have the luxury of just turning on machine learning. Hey, give me some machine learning guys, and solve the problem. That's really hard to set up. You've got to kind of build a trajectory with economies of scale in IT. This is a huge problem. How do you work with companies that just say, look I've got security problems but I don't have time or the capability to hire machine learning people, because that's an aspiration, that's not viable, not attainable. What do you say to the customers? Can you still work with those customers, are you a good fit for that kind of environment? Talk about that dynamic, because that seems to happen a lot.
>> Yeah, so in that area, you really need to bring a solution to solve their problem. Like us today, we have a lot of infrastructure capability, platforms that they can leverage.
But you definitely talk about the challenge they face. They don't have people to leverage those underlying primitives and build something to immediately address their business challenges.
>> Can you build it for them?
>> That's where DataVisor is, to provide the platform and the service to the customers. Where we take data in, and tell them directly all the types of attacks they face, in real time. Constantly, all the time.
>> I really want to get your opinion on something that I've been talking about publicly lately, and I've been interviewing folks in the industry about it, because if you look at the graphics market around AI, NVIDIA has been doing very, very well. They broke into gaming, obviously, as the vertical, and people are using the graphics cards for blockchain mining. Then NVIDIA kind of walked into these new markets because they had a purpose built processor for floating point and graphics stuff that was very specialized but now becomes very popular. We're seeing the need for something around data, where you want to have agility, but you also want high performance. So people are making trade offs between agility and high performance and if you ask anyone they'll tell you that I'd love to have more performance in data. So no one yet has come out and become the NVIDIA of data. There's no data processing unit out there yet. This is something that we see a need for. So what you're talking about here is customers have all these demands, it's almost like they need a data processing unit.
>> What they need is a solution, like you said, when they have a business solution, they're not looking at something like a generic framework or generic paradigm. They're looking at something to tackle the specific need. For example when we talk about fraud prevention, we're talking about rebuilding a service, the ecosystem that combines the data element, combines the algorithm that addresses their problem right away.
So that's where, with your analogy with NVIDIA, they want something almost like that chip, that directly solves their pain point.
>> And that's what you guys are kind of doing, because let me see if I get this right. You guys have this kind of horizontal view of data, but you're going very vertically, and specializing on the vertical markets because that's where the need is, for the acute nature of the algorithms to be successful. Like say, financial services. Am I getting that right? So it's like horizontally scalable data, but very specialized purpose.
>> Exactly. So horizontally scalable data, but then really mine the data and build the algorithms that are optimized for the detection of these unknown types of fraud in this area.
>> Because they're customized, I mean they have certain techniques that the financial guys will use to attack the banks, right? So you have to be really nimble and agile at the application.
>> Right, so when we build the algorithm, we have in mind the specific application we need to target. So you don't want to be over general in the sense that it can do anything, but in the end it does nothing super, super well. So if we are solving that particular fraud detection problem, in the end everything needs to be optimized. The integration with data, the algorithm, the output, the integration with the customer, needs to be optimized for the scenario. In the long run, can it be even generalized? You talked about the agility, and the nimbleness to broaden out to other areas. Then I would say, we are taking an approach, and I would love to see NVIDIA's approach of gradually expanding to other verticals. That is something we are looking at from the long term perspective. Our view is that we are a layer above all the cloud computing, the data layer. We are the layer that is vertically positioned and targeted to solve these specific business issues. And we want to do that really well. Solve that problem one at a time.
And then leverage that algorithm, the underlying infrastructure we built, to see whether we can expand that to other verticals, other scenarios.
>> So you don't get dependent upon the cloud players? You actually will draft off their success.
>> So we leverage the cloud computing era aggressively. Who doesn't in this scenario? It definitely brings the scale, the agility, and the flexibility to expand. And there's a lot of great technology there.
>> What do you think about the cloud players? When you look at multiple clouds, and hybrid cloud is a trend happening right now. What's your opinion of how that's going? That comes up a lot. CIOs' number one challenge, and cloud architects, and then data architects are all kind of working as the new personas we're seeing. How has the cloud and multi cloud or single cloud approach, for your customers, how do you see that evolving? Because we see trends where, for instance, the Department of Defense is probably going to go all in on Amazon. That's the single cloud solution, but it wasn't sourced as a single cloud. So it turns out that Amazon was better for that, versus spreading things around to multiple clouds. So there's a trade off, what are your thoughts on that as a technologist?
>> Well you touched upon an interesting point, because actually, our position is multi cloud. Multi cloud as well as, we support even on-premises deployment. I will talk about the reason why. The cloud is such a big space, and we see different players there. We definitely see different players, because of their historical working with different vendors, as well as their development you definitely see. Actually our position in this space was driven by the customer need. From that, what we saw is customers have these requirements of their favorite cloud environment. And then there's public cloud versus private cloud. We're not completely there to say there's one cloud that rules all.
And you also see some very conservative areas, particularly financial services where their security is really their top priority, they're conservative. And from that perspective, they still are having on-premises solutions. And we have to be considerate of all these different requirements. And also when we look at the evolution, we also see different geographic landscapes have different cloud deployment landscapes as well. And it's a dynamic environment.
>> It's a new dynamic.
>> It's a new dynamic.
>> Especially the global component, the regions.
>> Exactly, the regions. And the different regions, and we also have the GDPR, with the data residency problem. So that also makes it challenging to say, just deploy your solution on one type of cloud, that's a very rigid model. So definitely from very early days, we basically decided we are going to support multi cloud very early on.
>> And it makes sense, because people don't want to move a lot of data around. They're going to want to have data in multiple clouds, if that's where the app is. Latency and the threats around moving packets from point A to point B are a risk too. Not just latency, but hacks. Alright, great. I'm very impressed with your vision. I'm very impressed with what you guys are doing. I think it's very relevant. Talk about the business. Where are you guys at in terms of customers, what kind of customers do you have, how many customers, can you talk about some of the metrics. How many customers you have, what kind of customers, what are they doing with you, what are the successes? Can you lay out some of the use cases?
>> So we work with many of the largest enterprises in the world, and so they're probably also the ones that face a lot of the challenge of this large scale fraud, at the same time they are the ones aggressively moving forward in adopting new technology solutions. They are a little bit more the early, pioneering adopters.
So our customers can be in three verticals, today. So we take a vertical approach. The first is the large social and commerce-like sector. And some of our customers, for example, are Yelp, Pinterest, that kind of customer. And there is also the second vertical, which is mobile apps. There are a lot of fraudulent installs, where these mobile apps are trying to acquire users aggressively everywhere, but among the users acquired, those installs, there can be a substantial amount that is fraudulent. So that is a separate segment we target. And the third segment, we talked about, and you mentioned, is the financial area, where traditionally people focus on risk control, and fraud detection definitely causes a big problem. Their challenge is when they move from the past existing era to the digital era, going online, a lot of new attacks start coming up, and it's definitely a hugely challenging problem for them as well.
>> So you guys have some great funding, you have some great investors. NEA, New Enterprise Associates, and Sequoia Capital. What's the growth plan for you? What's the goal for the company, what's your growth strategy? What's on your mind now? Hiring obviously, customers, what's the focus? What's the growth plan?
>> So our focus is, we've been working with many of these large service providers. We mentioned our large enterprise customers. So globally today, we've already been protecting over four billion end user accounts in total. So it's a lot of users at this moment. For our next step of growth, we have two thoughts. One is we want to basically make the service even more scalable, and even more standardized in the sense that we can work with more than just the largest ones and be able to make it convenient, to be integrated with as many consumer facing providers.
>> To expand the breadth.
>> To expand the breadth, yes, of customers that we work with.
The second aspect is, when looking at fraud detection, we feel traditionally the fraud market is segmented. When in the offline world, you would see financial sector fraud as very different from somebody working on content. Nowadays, we can consolidate it, so in that area we're trying to build a more holistic ecosystem. Where the device side of solutions and the analytical solutions can be consolidated together, to make it an ecosystem where we can have both sides of use and be able to provide for our customers' different kinds of needs. In the past, it was very much point solutions. You would see data signal providers, then you would see some algorithm providers, focusing on a specific type of fraud, and we wanted to make an ecosystem, so that, to your point in the past on the data, we will be able to connect the data, look at the use at account level and be able to detect a variety of types of fraud. As the enterprises are pushing out new features, and new flavors of these types.
>> And the ecosystem participants will look like what? Ad networks, data services? Who is in the ecosystem that you want to build?
>> Yeah, so that's a great question. In the ecosystem we talk about, for example, cloud providers can be in the ecosystem basically. They actually power the computation layer, all the resources there. We can also partner with data partners. That's another important element, so you're looking at technology, data, and systems all integrated together. At the same time we can also look at the consulting firms that bring a bigger solution to the customers, with fraud being an important component that they want to address with system integrators. And so all these can fit together, and even some of the underlying algorithm solutions in the end can be plugged into the ecosystem to provide different aspects of use and make value out of data. So that different algorithms work together, and become a defense area.
>> It's like a security first strategy.
First we had cloud first, data first, now security first. I mean, you've got to have the security. Well I really appreciate it, we need more algorithms to police the algorithms. Algorithms for algorithms. So maybe that's next for you guys.
>> Well with the business goal in mind we always take an open holistic view. I like you talking about security first. When we look at how to solve that problem more effectively, then we are very open minded to say, what is the best combinations we want to be three ultimately. And that's a single bit of real time, instant decision that is important at that time, because that matters with good users' friction, whether we are able to accurately detect attackers. So we are all optimizing for that, and then all the underlying data consolidation piece, the algorithms in combination working with each other, is just to make the barrier high, make it difficult for the attackers, and to make it easier for all of us good users.
>> Well you're doing amazing things, and I think you're right. There's value in that data, and new ways to use that data for better security is just the beginning of this new trend. Thanks for coming in and sharing your insights and congratulations on a great startup, and good luck to you and your co-founder. Thanks for sharing.
>> Thank you, great to have this conversation.
>> I'm here in theCUBE studios in Palo Alto, I'm John Furrier for CUBE Conversation with hot startup DataVisor, Yinglian Xie, CEO and co-founder. I'm John Furrier, thanks for watching.
(bright music)
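The unsupervised idea discussed in the conversation, finding attacks without labeled examples of past fraud, can be illustrated with a minimal sketch. This is not DataVisor's algorithm, which the interview does not describe. The function name `find_suspicious_groups`, the event fields, and the threshold are all hypothetical, and simple grouping by shared attributes stands in here for whatever technique a real system would use. The only point is that nothing in the code is told what past fraud looked like: a group is flagged purely because many distinct accounts look alike.

```python
from collections import defaultdict


def find_suspicious_groups(events, min_group_size=3):
    """Flag groups of accounts that share a fingerprint, using no labels.

    `events` is an iterable of (account_id, ip_address, device_id) tuples.
    The fingerprint (first three octets of the IP plus the device id) is an
    illustrative assumption, not a real product's feature set.
    """
    groups = defaultdict(set)
    for account_id, ip_address, device_id in events:
        ip_prefix = ".".join(ip_address.split(".")[:3])
        groups[(ip_prefix, device_id)].add(account_id)

    # Keep only fingerprints shared by many distinct accounts.
    return {
        fingerprint: sorted(accounts)
        for fingerprint, accounts in groups.items()
        if len(accounts) >= min_group_size
    }


# Hypothetical sign-up events: three accounts share a subnet and a device.
events = [
    ("a1", "10.0.0.5", "dev-X"),
    ("a2", "10.0.0.6", "dev-X"),
    ("a3", "10.0.0.7", "dev-X"),
    ("b1", "192.168.1.4", "dev-Q"),
    ("c1", "172.16.9.2", "dev-R"),
]

flagged = find_suspicious_groups(events)
print(flagged)
```

In this toy data, the three `a*` accounts are grouped and flagged while the two unrelated accounts are left alone. A supervised model, by contrast, would need earlier events already labeled as fraudulent before it could learn what to look for.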
ENTITIES
Entity | Category | Confidence |
---|---|---|
2006 | DATE | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
John Furrier | PERSON | 0.99+ |
Yinglian Xie | PERSON | 0.99+ |
Silicon Valley | LOCATION | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
seven years | QUANTITY | 0.99+ |
one minute | QUANTITY | 0.99+ |
second aspect | QUANTITY | 0.99+ |
Department of Defense | ORGANIZATION | 0.99+ |
Palo Alto, California | LOCATION | 0.99+ |
November 2018 | DATE | 0.99+ |
nvidia | ORGANIZATION | 0.99+ |
First | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
third segment | QUANTITY | 0.99+ |
first | QUANTITY | 0.99+ |
CUBE | ORGANIZATION | 0.99+ |
three groups | QUANTITY | 0.99+ |
two thoughts | QUANTITY | 0.99+ |
Data Visor | ORGANIZATION | 0.99+ |
second | QUANTITY | 0.99+ |
Yinglian | PERSON | 0.98+ |
few years ago | DATE | 0.98+ |
both sides | QUANTITY | 0.98+ |
single cloud | QUANTITY | 0.98+ |
second vertical | QUANTITY | 0.97+ |
three | QUANTITY | 0.97+ |
Yelp | ORGANIZATION | 0.97+ |
billions of users | QUANTITY | 0.95+ |
one cloud | QUANTITY | 0.95+ |
two different divisions | QUANTITY | 0.94+ |
three verticals | QUANTITY | 0.93+ |
MSN | ORGANIZATION | 0.93+ |
NEA | ORGANIZATION | 0.92+ |
single cloud solution | QUANTITY | 0.91+ |
GDPR | TITLE | 0.91+ |
one day | QUANTITY | 0.88+ |
New Enterprise Associates | ORGANIZATION | 0.87+ |
one type | QUANTITY | 0.82+ |
single bit | QUANTITY | 0.81+ |
10 years ago | DATE | 0.8+ |
SiliconANGLE media | ORGANIZATION | 0.79+ |
Datavisor | ORGANIZATION | 0.79+ |
sequoia | ORGANIZATION | 0.77+ |
theCUBE | ORGANIZATION | 0.71+ |
first strategy | QUANTITY | 0.71+ |
couple decades | QUANTITY | 0.71+ |
gen one | QUANTITY | 0.68+ |
point | OTHER | 0.65+ |
CUBEConversation | EVENT | 0.61+ |
a minute | QUANTITY | 0.56+ |
DataVisor | ORGANIZATION | 0.55+ |