Seth Myers, Demandbase | George Gilbert at HQ
>> This is George Gilbert, we're on the ground at Demandbase, the B2B CRM company based on AI, a very special company that's got some really unique technology. We have the privilege to be with Seth Myers today, Senior Data Scientist and resident wizard, who's going to take us on a journey through some of the technology Demandbase is built on, and some of the technology coming down the road. So Seth, welcome. >> Thank you very much for having me. >> So, we talked earlier with Aman Naimat, Senior VP of Technology, about some of the functionality in Demandbase, and how it's very flexible, reactive, and adaptive in helping guide, or react to, a customer's journey through the buying process. Tell us about what that journey might look like, how it's different, the touchpoints, and the participants, and then how your technology rationalizes that, because we know old CRM packages were really just lists of contact points. So this is something very different. How's it work? >> Yeah, absolutely, so at the highest level, each customer's going to be different, each customer's going to make decisions and look at different marketing collateral, and respond to different marketing collateral in different ways. You know, as companies get bigger and the products they're offering become more sophisticated, that's certainly the case, and also, sales cycles take a long time. You're engaged with an opportunity over many months, and so there's a lot of touchpoints, there's a lot of planning that has to be done, so that actually offers a huge opportunity to be solved with AI, especially in light of recent developments in this thing called reinforcement learning. So reinforcement learning is basically machine learning that can think strategically; it can actually plan ahead in a series of decisions, and it's actually the technology behind AlphaGo, which is the Google system that beat the best Go players in the world.
And what we basically do is we say, "Okay, if we understand you're a customer, we understand the company you work at, we understand the things they've been researching elsewhere on third party sites, then we can actually start to predict the content they'll be likely to engage with." But more importantly, we can start to predict content they're more likely to engage with next, and after that, and after that, and after that. And so what our technology does is it looks at all possible paths that your potential customer can take, all the different content you could ever suggest to them, all the different routes they could take, and it looks at the ones that they're likely to follow, but also the ones that are likely to turn them into an opportunity. And so we basically, in the same way Google Maps considers all possible routes to get you from your office to home, we do the same, and we choose the one that's most likely to convert the opportunity, the same way Google chooses the quickest road home. >> Okay, this is really, that's a great example, because people can picture that, but how do you know what's the best path, is it based on learning from previous journeys from customers? >> Yes. >> And then, if you make a wrong guess, you sort of penalize the engine and say, "Pick the next best, what you thought was the next best path." >> Absolutely, so the nuts and bolts of how it works is we start working with our clients, and they have all this data of different customers, and how they've engaged with different pieces of content throughout their journey. And so the machine learning model, what it's really doing at any moment in time, given any customer in any stage of the opportunity they find themselves in, is saying, what piece of content are they likely to engage with next, and that's based on historical training data, if you will.
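The "Google Maps" framing above can be sketched in a few lines. This is a hypothetical toy, not Demandbase's actual model: the content names, transition probabilities, and conversion rates are all invented for illustration. It exhaustively scores short content journeys and picks the one most likely to convert, the way a router considers every route and picks the fastest.

```python
# Toy path planner: enumerate candidate content journeys and choose the
# one most likely to convert the opportunity. All data here is made up.
from itertools import permutations

# P(prospect engages with next piece | current piece) -- invented numbers
transition = {
    ("start", "whitepaper"): 0.6, ("start", "webinar"): 0.3,
    ("whitepaper", "case_study"): 0.5, ("whitepaper", "demo"): 0.2,
    ("webinar", "demo"): 0.4, ("case_study", "demo"): 0.7,
}
# P(opportunity converts | journey ends at this piece) -- invented numbers
conversion = {"demo": 0.30, "case_study": 0.10,
              "whitepaper": 0.05, "webinar": 0.05}

def journey_score(path):
    """Probability the prospect follows the whole path and then converts."""
    p = 1.0
    for a, b in zip(path, path[1:]):
        p *= transition.get((a, b), 0.0)   # 0 if that step never happens
    return p * conversion.get(path[-1], 0.0)

def best_journey(start, contents, max_len=3):
    """Consider every short journey from `start`, like considering every
    route home, and return the highest-scoring one."""
    candidates = []
    for n in range(1, max_len + 1):
        for combo in permutations(contents, n):
            path = (start,) + combo
            candidates.append((journey_score(path), path))
    return max(candidates)

score, path = best_journey("start",
                           ["whitepaper", "webinar", "case_study", "demo"])
```

A real system replaces the exhaustive enumeration with reinforcement learning, since the space of journeys over a large content library is far too big to score directly.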
And then once we make that decision on a step-by-step basis, then we kind of extrapolate, and we basically say, "Okay, if we showed them this page, or if they engage with this material, what would that do, what situation would we find them in at the next step, and then what would we recommend from there, and then from there, and then from there," and so it's really kind of learning the right move to make at each time, and then extrapolating that all the way to the opportunity being closed. >> The picture that's in my mind is like, the Deep Blue, I think it was chess, where it would map out all the potential moves. >> Very similar, yeah. >> To the end game. >> Very similar idea. >> So, what about if you're trying to engage with a customer across different channels, and it's not just web content? How is that done? >> Well, that's something that we're very excited about, and that's something that we're currently really starting to devote resources to. Right now, we already have a product live that's focused on web content specifically, but yeah, we're working on kind of a multi-channel type solution, and we're all pretty excited about it. >> Okay so, obviously you can't talk too much about it. Can you tell us what channels that might touch? >> I might have to play my cards a little close to my chest on this one, but I'll just say we're excited. >> Alright. Well I guess that means I'll have to come back. >> Please, please. >> So, tell us about the personalized conversations. Is the conversation just another way of saying, this is how we're personalizing the journey? Or is there more to it than that? >> Yeah, it really is about personalizing the journey, right?
Like you know, a lot of our clients now have a lot of sophisticated marketing collateral, and a lot of time and energy has gone into developing content that different people find engaging, that kind of positions products towards pain points, and all that stuff, and so really there's so much low-hanging fruit by just organizing and leveraging all of this material, and actually forming the conversation through a series of journeys through that material. >> Okay, so, Aman was telling us earlier that we have so many sort of algorithms, they're all open source, or they're all published, and they're only as good as the data you can apply them to. So, tell us, where do companies, startups, you know, not the Googles, Microsofts, Amazons, where do they get their proprietary information? Is it that you have algorithms that now are so advanced that you can refine raw information into proprietary information that others don't have? >> Really I think it comes down to, our competitive advantage I think is largely in the source of our data, and so, yes, you can build more and more sophisticated algorithms, but again, you're starting with a public data set, you'll be able to derive some insights, but there will always be a path to those datasets for, say, a competitor. For example, we're currently tracking about 700 billion web interactions a year, and then we're also able to attribute those web interactions to companies, meaning the employees at those companies involved in those web interactions, and so that's able to give us an insight that no amount of public data or processing would ever really be able to achieve. >> How do you, Aman started to talk to us about how, like there were DNS, reverse DNS registries. >> Reverse IP lookups, yes. >> Yeah, so how are those, if they're individuals within companies, and then the companies themselves, how do you identify them reliably? 
>> Right, so reverse IP lookup, we've been doing this for years now, and so we've kind of developed a multi-source solution, and reverse IP lookups is a big one. Also machine learning: you can look at traffic coming from an IP address, and you can start to make some very informed decisions about what the IP address is actually doing, who they are. And so if you're looking at the account level, which is the level we're tracking at, there's a lot to be gleaned from that kind of information. >> Sort of the way, and this may be a weird-sounding analogy, but the way a virus or some piece of malware has a signature in terms of its behavior, you find signatures in terms of users associated with an IP address. >> And we certainly don't de-anonymize individual users, but if we're looking at things at the account level, then you know, the bigger the data, the more signal you can infer, and so if we're looking at company-wide usage of an IP address, then you can start to make some very educated guesses as to who that company is, the things that they're researching, what they're in market for, that type of thing. >> And how do you find out, if they're not coming to your site, and they're not coming to one of your customers' sites, how do you find out what they're touching? >> Right, I mean, I can't really go into too much detail, but a lot of it comes from working with publishers, and a lot of this data is just raw, and it's only because we can identify the companies behind these IP addresses that we're able to actually turn these web interactions into insights about specific companies. >> George: Sort of like how advertisers or publishers would track visitors across many, many sites, by having agreements. >> Yes. Along those lines, yeah. >> Okay.
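The account-level reverse-IP idea described above reduces, at its simplest, to mapping an incoming IP address onto a known corporate network block. The sketch below uses Python's standard `ipaddress` module; the network ranges and company names are entirely fictional (drawn from reserved documentation address space), and a real pipeline would combine this with the machine-learning signals Seth mentions.

```python
# Hedged sketch: attribute web traffic to a company by matching the
# source IP against a registry of corporate network blocks.
import ipaddress

# Hypothetical registry of corporate CIDR blocks (invented data, using
# reserved documentation ranges)
company_blocks = {
    ipaddress.ip_network("203.0.113.0/24"): "Acme Corp",
    ipaddress.ip_network("198.51.100.0/25"): "Globex Inc",
}

def resolve_company(ip: str):
    """Return the company owning this IP's network block, if any."""
    addr = ipaddress.ip_address(ip)
    for net, company in company_blocks.items():
        if addr in net:
            return company
    return None  # unknown / unattributable traffic
```

In practice the registry itself is the hard part, which is why Seth describes it as a multi-source problem rather than a single lookup table.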
So, tell us a little more about natural language processing. I think where most people have become familiar with it is with the B2C capabilities, with the big internet giants, where they're trying to understand all language. You have a more well-scoped problem; tell us how that changes your approach. >> So a lot of really exciting things are happening in natural language processing in general, and in the research, and right now it's being measured against this yardstick of, can it understand language as well as a human can. Obviously we're not there yet, but that doesn't necessarily mean you can't derive a lot of meaningful insights from it, and the way we're able to do that is, instead of trying to understand all of human language, let's understand very specific language associated with the things that we're trying to learn. So obviously we're a B2B marketing company, so it's very important to us to understand what companies are investing in other companies, what companies are buying from other companies, what companies are suing other companies. And so if we said, okay, we only want to be able to infer a competitive relationship between two businesses in an actual document, that becomes a much more solvable and manageable problem, as opposed to, let's understand all of human language. And so we actually started off with these kinds of open source solutions, and with some proprietary solutions that we paid for, and they didn't work because their scope was too broad, and so we said, okay, we can do better by just focusing in on the types of insights we're trying to learn, and then working backwards from them. >> So tell us, how much of the algorithms that we would call building blocks for what you're doing, and others, how much of those are published or open source, and then how much is your secret sauce? Because we talk about data being a key part of the secret sauce, what about the algorithms?
>> I mean yeah, you can treat the algorithms as tools, but you know, a bag of tools a product does not make, right? So our secret sauce becomes how we use these tools, how we deploy them, and the datasets we apply them to. So as mentioned before, we're not trying to understand all of human language, actually the exact opposite. So we actually have a single machine learning algorithm that all it does is learn to recognize when Amazon, the company, is being mentioned in a document. So if you see the word Amazon, is it talking about the river, is it talking about the company? We have a classifier that all it does is fire whenever Amazon the company is being mentioned in a document. And that's a much easier problem to solve than understanding everything, than Siri basically. >> Okay. I still get rather irritated with Siri. So let's talk about, broadly, this topic that sort of everyone lays claim to as their great higher calling, which is democratizing machine learning and AI, and opening it up to a much greater audience. Help set some context, just the way you did by saying, "Hey, if we narrow the scope of a problem, it's easier to solve." What are some of the different approaches people are taking to that problem, and what are their sweet spots? >> Right, so the talk of the data science community, the machine learning community, right now is some of the work coming out of DeepMind, which is a subsidiary of Google. They just built AlphaGo, which solved a strategy game that we thought we were decades away from actually solving, and their approach of restricting the problem to a game, with well-defined rules, with a limited scope, I think that's how they're able to propel the field forward so significantly.
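The narrowly scoped Amazon classifier Seth describes earlier in this exchange can be caricatured in a few lines. This is a toy with hand-picked context cues, not the trained model he is describing; a real classifier would learn such features from labeled data. The cue words are invented for illustration.

```python
# Toy disambiguator: fires only when "Amazon" in a text appears to mean
# the company rather than the river -- a deliberately narrow problem.
COMPANY_CUES = {"aws", "retailer", "e-commerce", "stock", "cloud", "prime"}
RIVER_CUES = {"river", "rainforest", "brazil", "basin", "jungle"}

def mentions_amazon_company(text: str) -> bool:
    words = set(text.lower().replace(",", " ").replace(".", " ").split())
    if "amazon" not in words:
        return False  # the classifier only fires when Amazon appears at all
    company = len(words & COMPANY_CUES)  # evidence for the company sense
    river = len(words & RIVER_CUES)      # evidence for the river sense
    return company > river
```

The point of the example is the scoping: deciding one word's sense in context is tractable in a way that "understand the whole document" is not.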
They started off by playing Atari games, then they moved to long-term strategy games, and now they're doing video strategy games, and I think the idea of, again, narrowing the scope to well-defined rules and well-defined limited settings is how they're actually able to advance the field. >> Let me ask just about playing the video games. I can't remember Star... >> Starcraft. >> Starcraft. Would you call that, like, where the video game is a model, and you're training a model against that other model, so it's almost like they're interacting with each other. >> Right, so it really comes down, you can think of it as pulling levers. You have a very complex machine, and there's certain levers you can pull, and the machine will respond in different ways. If you're trying to, for example, build a robot that can walk around a factory and pick out boxes, how you move each joint, where you look, all the different things you can see and sense, those are all levers to pull, and that gets very complicated very quickly. But if you narrow it down to, okay, there's certain places on the screen I can click, there's certain inputs I can provide in the video game, you basically limit the number of levers, and then optimizing and learning how to work those levers is a much more scoped and reasonable problem, as opposed to learning everything all at once. >> Okay, that's interesting. Now, let me switch gears a little bit. We've done a lot of work at Wikibon about IoT and increasingly edge-based intelligence, because you can't go back to the cloud for your analytics for everything, but one of the things that's becoming apparent is, it's not just the training that might go on in a cloud, but there might be simulations, and then the sort of low-latency response is based on a model that's at the edge. Help elaborate where that applies and how that works.
>> Well in general, when you're working with machine learning, in almost every situation, training the model is really the data-intensive process that requires a lot of extensive computation, and that's something that makes sense to have localized in a single location, where you can leverage resources and optimize it. Then you can say, alright, now that I have this trained model that understands the problem, it becomes a much simpler endeavor to basically put that as close to the device as possible. And so that really is how they're able to say, okay, let's take this really complicated billion-parameter neural network that took days and weeks to train, and let's actually derive insights right at the device level. Recent technology though, like the deep learning I mentioned, just deploying the technology creates new challenges as well, to the point that Google actually invented a new type of chip just to run... >> The tensor processing. >> Yeah, the TPU. The tensor processing unit, just to handle what is now a machine learning algorithm so sophisticated that even deploying it after it's been trained is still a challenge. >> Is there a difference in the hardware that you need for training vs. inferencing? >> So they initially deployed the TPU just for the sake of inference. In general, the way it actually works is that when you're building a neural network, there's one type of mathematical operation you do a whole bunch of, and it's based on the idea of working with matrices. That's absolutely the case with training as well as inference, where you're actually querying the model, and so if you can solve that one mathematical operation well, then you can deploy it everywhere. >> Okay.
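The "one mathematical operation" point above is worth making concrete: once trained, a neural network's inference is essentially a chain of matrix multiplications with cheap nonlinearities in between, which is exactly what hardware like the TPU accelerates. Below is a deliberately tiny sketch in plain Python; the two weight matrices stand in for a trained model and their values are arbitrary.

```python
# Inference as matrix multiplication: a trained two-layer network reduced
# to two (made-up) weight matrices and one workhorse operation.
def matmul(A, B):
    """Plain-Python matrix multiply -- the op that dominates inference."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def relu(M):
    """Elementwise nonlinearity between the matrix multiplies."""
    return [[max(0.0, x) for x in row] for row in M]

W1 = [[0.5, -1.0], [0.25, 0.75]]  # layer-1 weights (arbitrary)
W2 = [[1.0], [-0.5]]              # layer-2 weights (arbitrary)

def infer(x):
    """One query to the 'deployed' model: x is a 1xN row vector."""
    return matmul(relu(matmul(x, W1)), W2)
```

Because the whole forward pass is this one operation repeated, hardware that makes matrix multiply fast speeds up both training and inference, which is the point Seth is making about the TPU.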
So, one of our CTOs was talking about how, in his view, what's going to happen in the cloud is richer and richer simulations, and as you say, querying the model, getting an answer in realtime or near realtime, is out on the edge. What exactly is the role of the simulation? Is that just a model that understands time, and not just time, but many multiple parameters that it's playing with? >> Right, so simulations are particularly important, taking us back to reinforcement learning, where you basically have many decisions to make before you actually see some sort of desirable or undesirable outcome. So, for example, the way AlphaGo trained itself is basically by running simulations of the game being played against itself, and really what those simulations are doing is allowing the artificial intelligence to explore the entire space of possible games. >> Sort of like WarGames, if you remember that movie. >> Yes, with uh... >> Matthew Broderick, and it actually showed all the war game scenarios on the screen, and then figured out, you couldn't really win. >> Right, yes, it's a similar idea. For example, in Go there are more board configurations than there are atoms in the observable universe, and so the way Deep Blue won at chess, which was basically to more or less explore the vast majority of chess moves, really isn't an option; you can't play that same strategy with Go. And so this constant simulation is how it explored the meaningful game configurations it needed to win. >> So in other words, they were scoped down, so the problem space was smaller. >> Right, and in fact, AlphaGo was really kind of two different artificial intelligences working together: one that decided which solutions to explore, which possibilities it should pursue more and which ones to ignore, and then a second piece that said, okay, given a certain board configuration, what's the likely outcome?
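That two-part design, one component narrowing which moves to explore and another scoring the resulting position, can be rendered as a toy. To be clear, this is not AlphaGo's algorithm, just the shape of the idea on an invented trivial game: pick a number that moves a running total toward a target.

```python
# Toy "policy + value" split on an invented game: the policy prunes the
# candidate moves, the value function scores what remains.
TARGET = 10  # arbitrary goal for the toy game

def policy(state, moves, k=2):
    """Narrow the search: keep only the k moves that look most promising,
    so the expensive evaluation never sees the rest."""
    return sorted(moves, key=lambda m: abs((state + m) - TARGET))[:k]

def value(state):
    """Score a position: closer to the target is better."""
    return -abs(state - TARGET)

def choose_move(state, moves):
    # Evaluate only policy-approved moves, pick the best-valued one.
    return max(policy(state, moves), key=lambda m: value(state + m))
```

The pruning is what makes the exhaustive-search problem tractable: the value function only ever runs on the handful of candidates the policy lets through, which is the "working in concert" Seth describes next.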
And so those two working in concert, one that narrows and focuses, and one that comes up with the answer, given that focus, is how it was actually able to work so well. >> Okay. Seth, on that note, that was a very, very enlightening 20 minutes. >> Okay. I'm glad to hear that. >> We'll have to come back and get an update from you soon. >> Alright, absolutely. >> This is George Gilbert, I'm with Seth Myers, Senior Data Scientist at Demandbase, a company I expect we'll be hearing a lot more about, and we're on the ground, and we'll be back shortly.
Aman Naimat, Demandbase, Chapter 2 | George Gilbert at HQ
>> And we're back, this is George Gilbert from Wikibon, and I'm here with Aman Naimat at Demandbase, the pioneers in the next gen AI generation of CRM. So Aman, let's continue where we left off. So we're talking about natural language processing, and I think most people are familiar with it more on the B to C technology, where the big internet providers have sort of accumulated a lot of voice data and have learned how to process it and convert it into text. So tell us how B to B NLP is different, to use a lot of acronyms. In other words, how you're using it to build up a map of relationships between businesses. >> Right, yeah, we call it the demand graph. So it's an interesting question, because firstly, it turns out that, while very different, B to B is also, the language is quite boring. It doesn't evolve as fast as consumer concepts. And so it makes the problem much more approachable from a language understanding point of view. So natural language processing or natural language understanding is all about how machines can understand and store and take action on language. So while we were working on this four or five years ago, and that's my background as well, it turned out the problem was simpler, because human language is very rich, and natural language processing converting voice to text is trivial compared to understanding meaning of things and words, which is much more difficult. Or even the sense of the word, apparently in English each word has six meanings, right? We call them word senses. So the problem was only simpler because B to B language doesn't tend to evolve as fast as regular language, because terms stick in an industry. The challenge with B to B and why it was different is that each industry or sub-industry has a very specific language and jargon and acronyms. So to really understand that industry, you need to come from that industry. 
So if you go back to the CRM example of what happened 10, 20 years ago, you would have a sales person that would come from that industry if you wanted to sell into it. And that still happens in some traditional companies, right? So the idea was to be able to replicate the knowledge that they would have as if they came from that industry. So it's the language, the vocabularies, and then ultimately have a way of storing and taking action on it. It's very analogous to what Google had done with Knowledge Graph. >> Alright, so two questions I guess. First is, it sounds almost like a translation problem, in the sense that you have some base language primitives, like partner, supplier, competitor, customer. But that the language in each industry is different, and so you have to map those down to those sort of primitives. So tell us the process. You don't have on staff people who translate from every industry. >> I mean that was the whole, writing logical rules or expressions for language, which use conventional good old fashioned AI. >> You mean this was the rules-based knowledge engineering? >> That's right. And that clearly did not succeed, because it is impossible to do it. >> The old quip which was, one researcher said, "Every time I fired a rules engineer, "my accuracy score would go up." (chuckles) >> That's right, and now the problem is because language is evolving, and the context is so different. So even pharmaceutical companies in the US or in the Bay Area would use different language than pharma in Europe or in Switzerland. And so it's just impossible to be able to quantify the variations. >> George: To do it manually. >> To do it manually, it's impossible. It's certainly not possible for a small startup. And we did try having it be generated. In the early days we used to have crowdsource workers validate the machine. But it turned out that they couldn't do it either, because they didn't understand the pharmaceutical language either, right? 
So in the end, the only way to do that was to have some sort of model and some seed data to be able to validate it, or to hire experts and to have small samples of data to validate. So going back to the graph, right, it turns out that when we have seen sophisticated AI work, you know, towards complex problems, so for example predicting your next connection on LinkedIn, or your next friend, or what ads should you see on Facebook, they have used network-based data, social graph data, or in the case of Google, it's the Knowledge Graph, of how things are connected. And somehow machine learning and AI systems based on network data tend to be more powerful and more intuitive than other types of models. >> So OK, when you say model, help us with an example of, you're representing a business and who it's connected to and its place in the world. >> So the demand graph is basically as Demandbase, who are our customers, who are their partners, who are their suppliers, who are their competitors. And utilizing that network of companies in a manner that we have network of friends on LinkedIn or Facebook. And it turns out that businesses are extremely social in nature. In fact, we found out that the connections between companies have more signal, and are more predictive of acquisition or predicting the next customer, than even the Facebook social graph. So it's much easier to utilize the business graph, the B to B business graph, to predict the next customer, than to say, predict your next friend on Facebook. >> OK, so that's a perfect analogy. So tell us about the raw material you churn through on the web, and then how you learn what that terminology might be. You've boot-strapped a little bit, now you have all this data, and you have to make sense out of new terms, and then you build this graph of who this business is related to. 
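The network-data point above, that graph structure predicts the next connection or next customer, has a classic minimal form: the common-neighbours heuristic for link prediction. The sketch below is an illustration of that general technique, not Demandbase's model, and the companies in the graph are invented.

```python
# Common-neighbours link prediction on an invented business graph:
# companies sharing many neighbours (partners, suppliers, customers)
# are likely candidates for the next connection.
business_graph = {
    "Acme":     {"Globex", "Initech", "Umbrella"},
    "Globex":   {"Acme", "Initech", "Hooli"},
    "Initech":  {"Acme", "Globex", "Umbrella"},
    "Umbrella": {"Acme", "Initech"},
    "Hooli":    {"Globex"},
}

def predict_next_link(company, graph):
    """Rank the non-neighbours of `company` by how many neighbours they
    already share with it, and return the top candidate."""
    neighbours = graph[company]
    candidates = set(graph) - neighbours - {company}
    return max(candidates, key=lambda c: len(graph[c] & neighbours))
```

Even this crude score captures why network data is so predictive: the graph encodes who moves in the same commercial circles before any content signal is considered.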
>> That's right, and the hardest part is to be able to handle rumors and to be able to handle jokes, like, "Isn't it time for Microsoft to just buy Salesforce?" Question mark, smiley face. You know, so it's a challenging problem. But we were lucky that business language and business press is definitely more boring than, you know, people talking about movies. >> George: Or Reddit. >> Or Reddit, right. So the way we work is we process the entire business internet, or the entire internet. And initially we used to crawl it ourselves, but soon realized that Common Crawl, which is an open source foundation that has crawled the internet and put at least a large chunk of it, and that really enabled us to stop the crawling. And we read the entire internet and look at, ultimately we're interested in businesses, 'cause that's the world we are, in business, B to B marketing and B to B sales. We look at wherever there's a company mentioned or a business person or business title mentioned, and then ignore everything else. 'Cause if it doesn't have a company or a business person, we don't care. Right, so, or a business product. So we read the entire internet, and try to then infer that this is, Amazon is mentioned in it, then we figure out, is it Amazon the company, or is it Amazon the river? So that's problem number one. So we call it the entity linking problem. And then we try to understand and piece together the various expressions of relationships between companies expressed in text. It could be a press release, it could be a competitive analysis, it could be announcement of a new product. It could be a supply chain relationship. It could be a rumor. And then it also turns out the internet's very noisy, so we look at corroboration across multiple disparate sources-- >> George: Interesting, to decide-- >> Is it true? >> George: To signal is it real. >> Right, yeah, 'cause there's a lot of fake news out there. 
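The corroboration step just described, accepting a candidate fact only once it appears in enough independent sources, has a simple core. The sketch below is a hypothetical reduction of that idea; the facts and source names are invented, and a real system would also weigh source quality rather than just count.

```python
# Minimal corroboration filter: keep a candidate fact only if it is seen
# in enough *distinct* sources -- one noisy page or joke is not enough.
from collections import defaultdict

def corroborated(mentions, min_sources=2):
    """mentions: iterable of (fact, source) pairs."""
    seen = defaultdict(set)
    for fact, source in mentions:
        seen[fact].add(source)
    return {f for f, srcs in seen.items() if len(srcs) >= min_sources}

facts = corroborated([
    ("AcmeCorp acquires Globex", "newswire.example"),
    ("AcmeCorp acquires Globex", "techblog.example"),
    ("Initech buys the moon", "satire.example"),   # single-source rumor
    ("Initech buys the moon", "satire.example"),   # same source repeated
])
```

Counting distinct sources rather than raw mentions is the detail that matters: a rumor repeated a hundred times by one site still corroborates nothing.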
(George laughs) So we look at corroboration and the sources to be able to infer if we can have confidence in this. >> I can imagine this could be applied to-- >> A lot of other problems. >> Political issues. So OK, you've got all these sources, give us some specific examples of feeds, of sources, and then help us understand. 'Cause I don't think we've heard a lot about the notion of boot-strapping, and it sounds like you're generalizing, which is not something that most of us are familiar with who have a surface-level familiarity with machine learning. >> I think there was a lot of research like, not to credit Google too much, but... Boot-strapping methods were used by Sergey Brin, I think he was the first person, and then he gave up 'cause they founded Google and they moved on. And since then, in 2003, 2004, there was a lot of research around this topic. You know, it's in the genre of unsupervised machine learning models. And in the real world, because there's less labeled data, we tend to find that to be an extremely effective method to learn language, and obviously now with deep learning, unsupervised methods are being utilized more as well. But the idea is really to, and this was around five years ago when we started building this graph, and I obviously don't know how the Google Knowledge Graph is built, but I can assume it's a similar technique. We don't tend to talk about how commercial products work that much. But the idea is basically to generalize models or learn from a small seed, so let's say I put in a seed like Nike and Adidas, and say they compete, right? And then if you look at the entire internet and look at all the expressions of how Nike and Adidas are expressed together in language, it could be, you know, "I think Nike shoes are better than Adidas."
But we also find cases where somebody's saying, "I bought Nike and Adidas," or, "Nike and Adidas shoes are sold here." So we have to be able to be smart enough to discern when it's something else and not competition. >> OK, so you've told us how this graph gets built out. So the suppliers, the partners, the customers, the competitors, now you've got this foundation-- >> And people and products as well. >> OK, people, products. You've got this really rich foundation. Now you build and application on top of it. Tell us about CRM with that foundation. >> Yeah, I mean we have the demand graph, in which we tie in also things around basic data that you could find from graphics and intent that we've also built. But it also turns out that the knowledge graph itself, our initial intuition was that we'll just expose this to end users, and they'll be able to figure it out. But it was just too complicated. It really needed another level of machinery and AI on top to take advantage of the graph, and to be able to build prescriptive actions. And action could be, or to solve a business problem. A problem could be, I'm an IOT startup, I'm looking for manufacturing companies who will buy my product. Or it could be, I am a venture capital firm, I want to understand what other venture capital firms are investing in. Or, hey, I'm Tesla, and I'm looking for a new supplier for the new Tesla screen. Or you know, things of that nature. So then we apply and build specific models, more machine learning, or layers of machine learning, to then solve specific business problems. Like the reinforcement learning to understand next best action. >> And are these models associated with one of your customers? >> No, they're general purpose, they're packaged applications. >> OK, tell us more, so what was the base level technology that you started with in terms of the being able to manage a customer conversation, a marketing conversation, and then how did that get richer over time? 
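The seed-based bootstrapping loop described across this exchange can be sketched end to end: start from one pair known to compete, harvest the textual patterns that connect them, then match those patterns elsewhere to propose new pairs. This is a bare-bones illustration of the general technique, not Demandbase's system; the sentences are invented, and note that it naively learns the "sold here" co-mention pattern too, which is exactly the kind of template a real system must filter out as non-competitive.

```python
# Toy bootstrapped relation extraction: seed pair -> learned templates
# -> new candidate pairs. Corpus sentences are invented.
import re

corpus = [
    "I think Nike shoes are better than Adidas.",
    "I think Pepsi shoes are better than Coke.",  # new pair, same pattern
    "Nike and Adidas shoes are sold here.",       # co-mention, not rivalry
]

def learn_patterns(a, b, sentences):
    """Blank the seed entities out of every sentence containing both,
    turning each into a reusable template."""
    patterns = set()
    for s in sentences:
        if a in s and b in s:
            patterns.add(s.replace(a, "{A}").replace(b, "{B}"))
    return patterns

def extract_pairs(patterns, sentences):
    """Match the learned templates to propose new entity pairs."""
    pairs = set()
    for p in patterns:
        regex = (re.escape(p)
                 .replace(r"\{A\}", r"(\w+)")
                 .replace(r"\{B\}", r"(\w+)"))
        for s in sentences:
            m = re.fullmatch(regex, s)
            if m:
                pairs.add(m.groups())
    return pairs

patterns = learn_patterns("Nike", "Adidas", corpus)
pairs = extract_pairs(patterns, corpus)
```

One seed pair generalizes to Pepsi/Coke with no labeled data, which is the appeal of the method; the price is that every learned template, including the spurious ones, must survive the kind of validation and corroboration discussed earlier.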
>> Yeah, I mean we take our proprietary data sets that we've accumulated over the years and manufactured over the years, and then co-mingle them with customer data, which we keep private, 'cause they own the data. And the technology is generic, but you're right, the model being generated by the machine is specific to every customer. So obviously the next best action model for a pharmaceutical company is based on doctors visiting, and is this person an oncologist, or what they're researching online. And that model is very different from a model for Demandbase for example, or Salesforce. >> Is it that the algorithm's different, or it's trained on different data? >> It's trained on different data. It's the same code, I mean we only have 20, 30 data scientists, so we're obviously not going to build custom code for... So the idea is it's the same meta model, trained on different data. So public data, but also customers' private data. >> And how much does the customer, let's say your customer's Tesla, how much of it is them running some of their data through this boot-strapping process, versus how much of it is, your model is set up, and once you've boot-strapped it, it automatically starts learning from the interactions with Tesla itself, from all the different partners and customers? >> Right, I think, you know, we have found, most startups are just learning over small data sets, which are customer-centric. What we have found is real magic happens when you take private data and combine it with large amounts of public data. So at Demandbase, we have massive amounts of public and proprietary data. And then we plug in, and we have to tell it that our client is Tesla, so it understands the localized graph, and knows the Tesla ecosystem, and that's based on public data sets and our proprietary data. Then we also bring in your private slice whenever possible. >> George: Private...? >> Slice of data.
So we have code that can plug into your web site, and then start understanding interactions that your customers are having. And then based on that, we're able to train our models. As much as possible, we try to automate the data capture process, so in essence using a sensor or using a pixel on your web site, and then we take that private stream of data and include it in our graph and merge it in, and that's where we find... Our data by itself is not as powerful as our data mixed with your private data. >> So I guess one way to think about it would be, there's a skeletal graph, and that may be sounding too minimalistic, there's a graph. But let's say you take Tesla as the example, you tell them what data you need from them, and that trains the meta models, and then it fleshes out the graph of the Tesla ecosystem. >> Right, whatever data we couldn't get or infer, from the outside. And we have a lot of proprietary data, where we see online traffic, business traffic, what people are reading, who's interested in what, for hundreds of millions of people. We have developed that technology. So we know a lot without actually getting people's private slice. But you know, whenever possible, we want the maximum impact. >> So... >> It's actually simple, and let's divorce the words graphs for a second. It's really about, let's say that I know you, right, and there's some information you can tell me about you. But imagine if I google your name, and I read every document about you, every video you have produced, every blog you have written, then I have the best of both knowledge, right, your private data from maybe your social graph on Facebook, and then your public data. And then if I knew, you know... If I partnered with Forbes and they told me you logged in and read something on Forbes, then they'll get me that data, so now I really have a deep understanding of what you're interested in, who you are, what's your language, you know, what are you interested in. 
It's that, sort of simplified, but similar, at a much larger scale. >> Alright, let's take a pause at this point and then we'll come back with part three. >> Excellent.
Aman Naimat, Demandbase, Chapter 1 | George Gilbert at HQ
>> Hi, this is George Gilbert. We have an extra-special guest today on our CUBEcast, Aman Naimat, Senior Vice President and CTO of Demandbase started with a five-person startup, Spiderbook. Almost like a reverse IPO, Demandbase bought Spiderbook, but it sounds like Spiderbook took over Demandbase. So Aman, welcome. >> Thank you, excited to be here. Always good to see you. >> So, um, Demandbase is a Next Gen CRM program. Let's talk about, just to set some context. >> Yes. >> For those who aren't intimately familiar with traditional CRM, what problems do they solve? And how did they start, and how did they evolve? >> Right, that's a really good question. So, for the audience, CRM really started as a contact manager, right? And it was replicating what a salesperson did in their own private notebook, writing contact phone numbers in an electronic version of it, right? So you had products that were really built for salespeople on an individual basis. But it slowly evolved, particularly with Siebel, into more of a different twist. It evolved into more of a management tool or reporting tool because Tom Siebel was himself a sales manager, ran a sales team at Oracle. And so, it actually turned from an individual-focused product to an organization management reporting product. And I've been building this stuff since I was 19. And so, it's interesting that, you know, the products today, we're going, actually pivoting back into products that help salespeople or help individual marketers and add value and not just focus on management reporting. >> That's an interesting perspective. So it's more now empowering as opposed to, sort of, reporting. >> Right, and I think some of it is cultural influence. You know, over the last decade, we have seen consumer apps actually take a much more, sort of predominant position rather than in the traditional, earlier in the 80s and 90s, the advanced applications were corporate applications, your large computers and companies. 
But over the last year, as consumer technology has taken off, and actually, I would argue has advanced more than even enterprise technology, so in essence, that's influencing the business. >> So, even ERP was a system of record, which is the state of the enterprise. And this is much more an organizational productivity tool. >> Right. >> So, tell us now, the mental leap, the conceptual leap that Demandbase made in terms of trying to solve a different problem. >> Right, so, you know, Demandbase started on the premise or around marketing automation and marketing application which was around identifying who you are. As we move towards more digital transaction and Web was becoming the predominant way of doing business, as people say that's 70 to 80 percent of all businesses start using online digital research, there was no way to know it, right? The majority of the Internet is this dark, unknown place. You don't know who's on your website, right? >> You're referring to the anonymity. >> Exactly. >> And not knowing who is interacting with you until very late. >> Exactly, and you can't do anything intelligent if you don't know somebody, right? So if you didn't know me, you couldn't really ask. What will you do? You'll ask me stupid questions around the weather. And really, as humans, I can only communicate if you know somebody. So the sort of innovation behind Demandbase was, and it still continues to be to actually bring around and identify who you're talking to, be it online on your website and now even off your website. And that allows you to have a much more sort of personalized conversation. Because ultimately in marketing and perhaps even in sales, it comes down to having a personal conversation. So that's really what, which if you could have a billion people who could talk to every person coming to your website in a personalized manner, that would be fantastic. But that's just not possible. 
>> So, how do you identify a person before they even get to a vendor's website so that you can start on a personalized level? >> Right, so Demandbase has been building this for a long time, but really, it's a hard problem. And it's harder now than ever before because of security and privacy, lots of hackers out there. People are actually trying to hide, or at least prevent this from leaking out. So, eight, nine years ago, we could buy registries or reverse DNS. But now with ISBs, and we are behind probably Comcast or Level 3. So how do you even know who this IP address is even registered to? So about eight years ago, we started mapping IP addresses, 'cause that's how you browse the Internet, to companies that they work at, right? But it turned out that was no longer effective. So we have built over the last eight years proprietary methods that know how companies relate to the IP addresses that they have. But we have gone to doing partnerships. So when you log into certain websites, we partner with them to identify you if you self-identify at Forbes.com, for example. So when you log in, we do a deal. And we have hundreds of partners and data providers. But now, the state of the art where we are is we are now looking at behavioral signals to identify who you are. >> In other words, not just touch points with partners where they collect an identity. >> Right. >> You have a signature of behavior. >> That's right. >> It's really interesting that humans are very unique. And based on what they're reading online and what they're reading about, you can actually identify a person and certainly identify enough things about them to know that this is an executive at Tesla who's interested in IOT manufacturing. >> Ah, so you don't need to resolve down to the name level. >> No. >> You need to know sort of the profile. >> Persona, exactly. >> The persona. >> The persona, and that's enough for marketing. 
So if I knew that this is a C-level supply chain executive from Tesla who lives in Palo Alto and has interests in these areas or problems, that's enough for Siemens to then have an intelligent conversation with this person, even if they're anonymous on their website or if they call on the phone or anything else. >> So, okay, tell us the next step. Once you have a persona, is it Demandbase that helps them put together a personalized? >> Profile. >> Profile, and lead it through the conversation? >> Yeah, so earlier, well, not earlier, but very recently, building this technology was just a very hard problem. To identify now hundreds of millions of people, I think around 700 million businesspeople globally, which is the majority of the business world. But we realized that in AI, making recommendations or giving you data in advanced analytics is just not good enough, because you need a way to actually take action and have a personalized conversation, because there are 100 thousand people on your website. Making recommendations, it's just overwhelming for humans to get that much data. So the better sort of idea now that we're working on is to just take the action. So if somebody from Tesla visits your website, and they are an executive who will buy your product, take them to the right application. If they go back and leave your website, then display them the right message in a personalized ad. So it's all about taking actions. And then obviously, whenever possible, guiding humans towards a personalized conversation that will maximize your relationship.
So we make recommendations, but we've found that it's just too overwhelming. >> Ah, so in other words, when you're talking about recommendations, you're talking about recommendations for Demandbase for? >> Or our clients, employees, or salespeople, right? >> Okay. >> But whenever possible, we are looking to now build systems that in essence are in autopilot mode, and they take the action. They drive themselves. >> Give us some examples of the actions. >> That's right, so some actions could be if you know that a qualified person came to your website, notify the salesperson and open a chat window saying, "This is an executive. "This is similar to a person who will buy "a product from you. "They're looking for this thing. "Do you want to connect with a salesperson?" And obviously, only the people that will buy from you. Or, the action could be, send them an email automatically based on something they will be interested in, and in essence, have a conversation. Right? So it's all about conversation. An ad or an email or a person are just ways of having a conversation, different channels. >> So, it sounds like there was an intermediate marketing automation generation. >> Right. >> After traditional CRM which was reporting. >> Right, that's true. >> Where it was basically, it didn't work until you registered on the website. >> That's right. >> And then, they could email you. They could call you. The inside sales reps. >> That's right. >> You know, if you took a demo, >> That's right. >> you had to put an idea in there. >> And that's still, you know, so when Demandbase came around, that was the predominant between the CRM we were talking about. >> George: Right. >> There was a gap. There was a generation which started to be marketing. It was all about form fills. >> George: Yeah. >> And it was all about nurturing, but I think that's just spam. And today, their effectiveness is close to nothing. >> Because it's basically email or outbound calls. >> Yeah, it's email spam. 
Do you know we all have email boxes filled with this stuff? And why doesn't it work? Because, not only because it's becoming ineffective and that's one reason. Because they don't know me, right? And it boils down to if the email was really good and it related to what you're looking for or who you are, then it will be effective. But spam, or generic email is just not effective. So it's to some extent, we lost the intimacy. And with the new generation of what we call account-based marketing, we are trying to build intimacy at scale. >> Okay, so tell us more. Tell us first the philosophy behind account-based marketing and then the mechanics of how you do it. >> Sure, really, account-based marketing is nothing new. So if you walk into a corporation, they have these really sophisticated salespeople who understand their clients, and they focus on one-on-one, and it's very effective. So if you had Google as a client or Tesla as a client, and you are Siemens, you have two people working and keeping that relationship working 'cause you make millions of dollars. But that's not a scalable model. It's certainly not scalable for startups here to work with or to scale your organization, be more effective. So really, the idea behind account-based marketing is to scale that same efficacy, that same personalized conversation but at higher volume, right? And maximize, and the only way to really do that is using artificial intelligence. Because in essence, we are trying to replicate human behavior, human knowledge at scale. Right? And to be able to harvest and know what somebody who knows about pharma would know. >> So give me an example of, let's stay in pharma for a sec. >> Sure. >> And what are the decision points where based on what a customer does or responds to, you determine the next step or Demandbase determines what next step to take? >> Right. >> What are some of those options? Like a decision tree maybe? >> You can think of it, it's quite faddish in our industry now. 
It's reinforcement learning which is what Google used in the Go system. >> George: Yeah, AlphaGo. >> AlphaGo, right, and we were inspired by that. And in essence, what we are trying to do is predict not only what will keep you going but where you will win. So we give rewards at each point. And the ultimate goal is to convert you to a customer. So it looks at all your possible futures, and then it figures out in what possible futures you will be a customer. And then it works backwards to figure out where it should take you next. >> Wow, okay, so this is very different from >> They play six months ahead. So it's a planning system. >> Okay. >> Cause your sales cycles are six months ahead. >> So help us understand the difference between the traditional statistical machine learning that is a little more mainstream now. >> Sure. >> Then the deep learning, the neural nets, and then reinforcement learning. >> Right. >> Where are the sweet spots? What are the sweet spots for the problems they solve? >> Yeah, I mean, you know, there's a lot of fad and things out there. In my opinion, you can achieve a lot and solve real-world problems with simpler machine learning algorithms. In fact, for the data science team that I run, I always say, "Start with like the most simplest algorithm." Because if the data is there and you have the intuition, you can get to a 60% F-score or quality with the most naive implementation. >> George: 60% meaning? >> Like accuracy of the model. >> Confidence. >> Confidence. Sure, how good the model is, how precise it is. >> Okay. >> And sure, then you can make it better by using more advanced algorithms. The reinforcement learning, the interesting thing is that its ability to plan ahead. Most machine learning can only make a decision. They are classifiers of sorts, right? They say, is this good or bad? Or, is this blue? Or, is this a cat or not? They're mostly Boolean in nature or you can simulate that in multi-class classifiers. 
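The 60% F-score mentioned above is the harmonic mean of precision and recall, a standard way to say how good a model is. A quick sketch of how it is computed for a binary classifier, on dummy labels:

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Dummy predictions from a naive model:
truth = [1, 1, 1, 0, 0, 1]
pred  = [1, 0, 1, 1, 0, 1]
print(f1_score(truth, pred))  # 3 TP, 1 FP, 1 FN -> 0.75
```

A naive first model hitting 0.6 or so on a metric like this is exactly the signal Aman suggests looking for before investing in anything fancier.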
But reinforcement learning allows you to sort of plan ahead. And in CRM, or as humans, we're always planning ahead. You know, a really good salesperson knows that for this stage opportunity, or this person in pharma, I need to invite them to the dinner 'cause their friends are coming, and they know that last year when they did that, then in the future, that person converted. Right, if they go to the next stage and they, so it plans ahead the possible futures and figures out what to do next. >> So, for those who are familiar with the term AB testing. >> Sure. >> And who are familiar with the notion that most machine learning models have to be trained on data where the answer exists, and they test it out, train it on one set of data >> Sure. >> Where they know the answers, then they hold some back and test it and see if it works. So, how does reinforcement learning change that? >> I mean, it's still testing on supervised models to know. It can be used to derive. You still need data to understand what the reward function would be. Right? And you still need to have historical data to understand what you should give it. And sure, have humans influence it as well, right? At some point, we always need data. Right? If you don't have the data, you're nowhere. And if you don't have, but it also turns out that most of the time, there is a way to either derive the data from some unsupervised method, or have a proxy for the data that you really need. >> So pick a key feature in Demandbase, and then where you can derive the data you need to make a decision, just as an example. >> Yeah, that's a really good question. We derive data all the time, right? So, let me use something quite, quite interesting that I wish more companies and people used, which is Internet data, right? The Internet today is the largest source of human knowledge, and it actually knows more than you could imagine. And even simple queries, so we use the Bing API a lot.
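The "look at all possible futures and work backwards" idea Aman describes is, in spirit, value iteration over a model of the buyer journey. A toy sketch; the states, actions, probabilities, and rewards below are invented, where a real system would learn them from data:

```python
# Simplified buyer journey; "customer" is the rewarded terminal state.
# Transition model: state -> action -> list of (probability, next_state).
TRANSITIONS = {
    "anonymous_visit": {
        "show_ad":    [(0.3, "engaged"), (0.7, "anonymous_visit")],
        "do_nothing": [(0.05, "engaged"), (0.95, "lost")],
    },
    "engaged": {
        "invite_demo": [(0.4, "opportunity"), (0.6, "engaged")],
        "send_email":  [(0.2, "opportunity"), (0.8, "lost")],
    },
    "opportunity": {
        "sales_call": [(0.5, "customer"), (0.5, "lost")],
    },
}
REWARD = {"customer": 1.0, "lost": 0.0}
GAMMA = 0.95  # discount: sooner conversions are worth more

def plan(horizon=12):
    """Value iteration: expected reward per state, and the best action."""
    value = {s: 0.0 for s in TRANSITIONS}
    value.update(REWARD)
    policy = {}
    for _ in range(horizon):
        for state, actions in TRANSITIONS.items():
            best = None
            for action, outcomes in actions.items():
                q = GAMMA * sum(p * value[nxt] for p, nxt in outcomes)
                if best is None or q > best[1]:
                    best = (action, q)
            policy[state], value[state] = best
    return value, policy

value, policy = plan()
print(policy["engaged"])  # -> invite_demo
```

The backward sweep is the "works backwards from the futures where you win" part of the transcript: each state's value is propagated back from the conversion reward, and the next best action falls out of it.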
And to know, so one of the simple problems we ran into many years ago, and that's when we realized how we should be using Internet data, which in academia has been used, but not as much as it should be. So you know, you can buy APIs from Bing. And I wish Google would give their API, but they don't. So, that's our next best choice. We wanted to understand who people are. So there are common names, right? So, George Gilbert is a common name, or Alan Fletcher, who's my co-founder. And, you know, is that a common name? And if you search that, just that name, you get that name in various contexts. Or co-occurring with other words, you can see that there are many Alan Fletchers, right? Versus if you type in my name, Aman Naimat, you will always find the same kind of context. So you will know it's one person or it's a unique name. >> So, it sounds to me that reinforcement learning is online learning where you're using context. It's not perfectly labeled data. >> Right. I think there is no perfectly labeled data. So there's a misunderstanding of data scientists coming out of perfectly labeled data courses from Stanford, or whatever machine learning program. And we realized very quickly that the world doesn't have any perfectly labeled data. We think we are going to crowdsource that data. And it turns out, we've tried it multiple times, and after a year, we realized that it's just a waste of time. You can't pay, you know, 20 cents or 25 cents per item to a worker somewhere to hand-label data of any quality for you. So, it's much more effective to, and we were a startup, so we didn't have money like Google to pay. And even if you had the money, it generally never works out. We find it more effective to bootstrap or reuse unsupervised models to actually create data. >> Help us. Elaborate on that, the unsupervised and the bootstrapping, where maybe it's sort of like a lawnmower where you give it that first. >> That's right. >> You know, tug.
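The common-name test described a few turns back, where "Alan Fletcher" shows up in many unrelated contexts while a unique name shows up in consistently similar ones, can be approximated by measuring word overlap between the snippets a name appears in. The snippets here are fabricated stand-ins for search results:

```python
def context_overlap(snippets):
    """Average pairwise Jaccard overlap of word sets around a name.
    High overlap suggests one person; low overlap, many namesakes."""
    word_sets = [set(s.lower().split()) for s in snippets]
    pairs, total = 0, 0.0
    for i in range(len(word_sets)):
        for j in range(i + 1, len(word_sets)):
            union = word_sets[i] | word_sets[j]
            total += len(word_sets[i] & word_sets[j]) / len(union)
            pairs += 1
    return total / pairs if pairs else 0.0

# Fabricated "search snippets" for a unique vs. a common name.
unique_name = [
    "data science executive at a marketing startup in san francisco",
    "marketing startup executive speaks on data science",
    "san francisco data science and marketing talk",
]
common_name = [
    "scored the winning goal in the league final",
    "published a paper on medieval poetry",
    "opened a bakery in a small town",
]
print(context_overlap(unique_name) > context_overlap(common_name))
```

A production system would use far better features than raw word sets, but the intuition is the same: context diversity is the tell.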
>> I mean, we've used it extensively. So let me give you an example. Let's say you wanted to create a list of cities, right? Or a list of authors; the classic example actually was a paper written by Sergey Brin. I think he was trying to figure out the names of all authors in the world, and this is 1998. And basically if you search on Google, the term "has written the book," just the term "has written the book," these are called patterns, or Hearst patterns, I think. Then you can imagine that it's also always preceded by a name of a person who's an author. So, "George Gilbert has written the book," and then the name of the book, right? Or "William Shakespeare has written the book X." And you seed it with William Shakespeare, and you get some books. Or you put Shakespeare and you get some authors, right? And then, you use it to learn other patterns that also co-occurred between William Shakespeare and the book. >> And then you learn more patterns and you use it to extract more authors. >> And in the case of Demandbase, that's how you go from learning, starting bootstrapping within, say, pharma terminology. >> Yes. >> And learning the rest of pharma terminology. >> And then, using generic terminology to enter an industry, and then learning terminology that we ourselves don't yet understand. For example, I always used this example where if we read a sentence like "Takeda has in-licensed a molecule from Roche," it may mean nothing to us, but it means that they partnered and bought a product, in pharma lingo. So we use it to learn new language. And it's a common technique. We use it extensively, both. So it comes down to, while we do use highly sophisticated algorithms for some problems, I think most problems can be solved with simple models and thinking through how to apply domain expertise and data intuition, and having the data to do it.
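The seed-and-pattern loop Aman attributes to Brin (the DIPRE idea from that 1998 paper) alternates between using known pairs to find patterns and using patterns to find new pairs. A toy version over an in-memory corpus of invented sentences:

```python
import re

# Toy corpus standing in for web text (sentences are invented).
CORPUS = [
    "William Shakespeare has written the book Hamlet",
    "Jane Austen has written the book Emma",
    "Jane Austen is the author of Emma",
    "Mary Shelley is the author of Frankenstein",
]

def bootstrap(seed_author, seed_book, rounds=2):
    """Alternate: known (author, book) pairs -> patterns -> new pairs."""
    pairs = {(seed_author, seed_book)}
    patterns = set()
    for _ in range(rounds):
        # 1. Learn patterns from sentences that contain a known pair.
        for author, book in list(pairs):
            for sent in CORPUS:
                if author in sent and book in sent:
                    middle = sent.split(author, 1)[1].rsplit(book, 1)[0]
                    patterns.add(middle.strip())
        # 2. Apply every learned pattern to extract new pairs.
        for pat in patterns:
            for sent in CORPUS:
                m = re.match(r"(.+?)\s+%s\s+(.+)" % re.escape(pat), sent)
                if m:
                    pairs.add((m.group(1), m.group(2)))
    return pairs, patterns

pairs, patterns = bootstrap("William Shakespeare", "Hamlet")
print(sorted(pairs))
```

Starting from one seed pair, the first round learns "has written the book" and finds Jane Austen; the second round learns "is the author of" from her and reaches Mary Shelley, which is the generalization step the transcript describes.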
>> Because that sounds like a rich vein to explore. So this is George Gilbert on the ground at Demandbase. We'll be right back in a few minutes.
Aman Naimat, Demandbase, Chapter 3 | George Gilbert at HQ
>> This is George Gilbert from Wikibon. We're back on the ground with Aman Naimat at Demandbase. >> Hey. >> And we're having a really interesting conversation about building next-gen enterprise applications. >> It's getting really deep. (laughing) >> So, so let's look ahead a little bit. >> Sure. >> We've talked in some detail about the foundation technologies. >> Right. >> And you told me before that we have so much technology, you know, still to work with. >> Yeah. >> That is unexploited. That we don't need, you know, a whole lot of breakthroughs, but we should focus on customer needs that are unmet. >> Yeah. >> Let's talk about some problems yet to be solved, but that are customer facing with, as you have told me, existing technology. >> Right, can solve. >> Yes. >> Absolutely, I mean, there's a lot of focus in Silicon Valley about, like, scaling machine learning and investing in, you know, GPUs and what have you. But I think there's enough technology there. So where's the gap? The real gap is in understanding how to build AI applications, and how to monetize it, because it is quite different from building traditional applications. It has different characteristics. You know, so it's much more experimental in nature. Although, you know, with lean engineering, we've moved towards iterative to (mumbles) development, for example. Like, for example, 90% of the time, I, you know, after 20 years of building software, I'm quite confident I can build software. It turns out, in the world of data science and AI-driven, or AI applications, you can't have that much confidence. It's a lot more like discovering molecules in pharma. So you have to experiment more often, and methods have to be discovered; there's more discovery and less engineering in the early stages. >> Is the discovery centered on do you have the right data? >> Yeah, or are you measuring the right thing, right?
If you thought you were going to work the model to maximize revenue, but really, maybe the objective function should be increasing engagement with the customer. So, often, we don't know the end objective function, or we guess the wrong objective function. The only way to handle that is to be able to build an end-to-end system in days, and then iterate through the different models in hours and days, as quickly as possible, with the end goal and the customer in mind. >> This is really fascinating, because some of the research we're doing is on the really primitive capabilities of the, sort of, analytic data pipeline. >> Yes. >> Where, you know, all the work that has to do with coming up with the features. >> Yeah. >> And then plugging that into a model, and then managing the model's life cycle. That whole process is so fragmented. >> Yeah. >> And it's, you know, chewing gum and baling wire. >> Sure. >> And I imagine that slows that experimentation process dramatically. >> I mean, it slows it down, but it's also mindset, right? >> Okay. >> So, now that we have built, you know, we probably have a hundred machine learning models now at Demandbase that I've contributed to building with our data scientists, and in the end we've found that you can actually do something in a day or two with an extremely small amount of data, using Python and SKLearn, today, very quickly, and then, you know, build some simple UI that a human can evaluate and give feedback on, or whatever action you're trying to get to. And get to that as quickly as possible, rather than worrying about the pipelines, rather than worrying about everything else, because in 80% of the cases it will fail anyway. Or you will realize that either you don't have the right data, or nobody wants it, or it can never be done, or you need to approach it completely differently, from a completely different objective function. 
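The day-one prototype loop Aman describes — a tiny dataset, Python with SKLearn, and output a human can eyeball — might look like this minimal sketch. The data here is synthetic stand-in data, not Demandbase's; the point is the shape of the workflow, not the model.

```python
# Minimal sketch of the "prototype in a day" workflow: a few hundred rows,
# scikit-learn, and predictions a human can evaluate directly.
# The dataset and features are synthetic stand-ins for illustration.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# ~200 rows is enough to see whether any signal exists at all
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
print("holdout accuracy:", model.score(X_test, y_test))

# Surface a handful of predictions for a human to judge; if these look
# useless to a person, scaling to a billion rows won't fix it.
for features, pred in zip(X_test[:5], model.predict(X_test[:5])):
    print(pred, features)
```

The expensive pipeline work only starts once this kind of throwaway script has shown the idea is worth building.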
>> Let me parse what you've said in a different way. >> Sure. >> And see if I understand it. Traditional model building is based not on sampling, but on the full data set. >> That's right. >> And what you're saying, in terms of experimentation. >> Start doing that, yes. >> Is to go back to samples. >> That's right. Go back to, there's a misunderstanding that we need, you know, while Demandbase processes close to a trillion rows of data today, we found that almost all big data AI solutions can be initially proven with very small amounts of data, and a small number of features. If you cannot take a hundred rows of data, have a human look at some rows and make a judgment, then it's most likely not possible with one billion, or with ten billion. Now, there are exceptions to this, but in 90% of the cases, the solution has to show promise at, you know, a few thousand or a million rows of data. And the easy, you know, libraries and open-source stuff that's out there is all designed to be workable with small amounts of data. So, what we don't want to do is build this whole massive infrastructure, which is getting easier, and worry about data pipelines and putting it all together, only to realize that this is not going to work. Or, more often, that it doesn't solve any problem. >> So, if I were to sort of boil that down into product terms. >> Yeah. >> The notion that you could have something like Spark running on your laptop. >> Yeah. >> And scaling out to a big cluster. >> Yeah, just run it on the laptop. >> That, yeah. >> In fact, you don't even need Spark. >> Or, I was going to say, not even Spark. >> No. >> Just use Python. >> Just scikit-learn is much better for something like this. >> It's almost like, so it's back to Visual Basic. You know, you're not going to build a production app in >> I wouldn't go that far. >> Well >> It's a prototype. 
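The "prove it on a hundred rows first" discipline above can be as simple as sampling before any pipeline work exists. A small sketch (the table and columns are invented stand-ins for a much larger dataset):

```python
import numpy as np
import pandas as pd

# Stand-in for a huge table: the idea is to prove value on a sample first.
rng = np.random.default_rng(0)
big = pd.DataFrame({
    "visits": rng.integers(0, 50, size=10_000),
    "converted": rng.integers(0, 2, size=10_000),
})

# A hundred rows a human can actually read and judge
sample = big.sample(n=100, random_state=0)

# A quick aggregate a person can sanity-check by eye: do converters
# look different from non-converters at all?
print(sample.groupby("converted")["visits"].mean())
```

If nothing interesting shows up at this scale, that is a cheap negative result before any cluster or pipeline has been built.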
>> No, I meant you'd do the prototype GUI app in Visual Basic, and then, you know, when you're going to build a production one, you use Microsoft Foundation Class. >> Because most often, right, more often than not, you don't have the right data, you have the wrong objective function, or your customer is not happy with the results or wants to modify them. And that's true for conventional business applications, the old-school internet applications. But it is more true here, because the data is much more noisy, the problems are much more complex, and ultimately you need to be able to take real-world action. So build something that can take that real-world action, be it for a very narrow problem or use case. And get to it, even without any model. And the first model that I recommend, or I do, or my data scientists do, is just do it yourself by hand. Just label the data and say, let's pretend that this was the right answer, and we can take this action and the workflow works; like, did something good happen? You know, will it be something that will satisfy some problem? And if that's not true, then why build it? And you can do that manually, right? So I guess it's no different from any other entrepreneurial endeavor. But it's more true in data science projects, firstly because they're more likely to be wrong. I think we have now learned how to build good software. >> Imperative software. >> Imperative software. And data science is called data science for a reason. It's much more experimental, right? Like, in science, you don't know. A negative experiment is a fine experiment. >> Of all that we've been talking about, this might sound the most abstract, but it's also the most profound, because what you're saying is this elaborate process, and the technology to support it, you know, this whole pipeline, you only build that once you've proven the prototype. >> That's right. And get the prototype in a day. 
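The "do it yourself by hand" first model can be sketched concretely: pretend the hand labels are the model's output, take the action, and check whether anything good happened. The labels and outcomes below are invented for illustration.

```python
# Hand-labeling as "model v0": a human's yes/no judgments stand in for
# predictions, and we measure whether acting on them produced wins.
# Both lists here are hypothetical example data.
hand_labels = [1, 0, 1, 1, 0, 1]       # human says: act on these
outcome_if_acted = [1, 0, 1, 1, 0, 0]  # did acting actually help?

# Look only at cases where the "model" (the human) said to act
acted = [o for label, o in zip(hand_labels, outcome_if_acted) if label == 1]
success_rate = sum(acted) / len(acted)
print(f"success rate when acting on hand labels: {success_rate:.0%}")
# If even a careful human's labels can't produce wins through the
# workflow, an ML model built on the same workflow won't either.
```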
>> You don't want that elaborate structure and process when you're testing something out. >> No, yeah, exactly. And, you know, when we build our own machine learning models, obviously coming out of academia, you know, there was a class project where it took us a year or six months to really design the best models, and test them, and prove them out with intrinsic testing, and we knew they were working. But what we do now is we build models and do experiments daily. We get, in essence, the patient with our molecule every day. So, you know, we have the advantage, given that we're in marketing, that we can test our molecules or drugs on a daily basis. And we have enough data to test with, and we have enough customers, thankfully, to test with. And some of them are collaborating with us. So, we get to the end solution on a daily basis. >> So, now I understand why you said we don't need these radical algorithmic breakthroughs or, you know, new super, turbo-charged processors. So, with this approach of really fast prototyping, what are some of the unmet needs? Is it just a matter of cycling through these experiments? >> Yeah, so I think one of the biggest unmet needs today, we're able to understand language, we're able to predict who you should do business with and what you should talk about, but I think natural language generation, or creating a personalized email, really personalized and really beautifully written, is still something that we haven't quite, you know, got a full grasp on. And to be able to communicate at human-level personalization, to be able to talk, you know, we can generate ads today, but that's not really, you know, language, right? It is language, but not as sophisticated as what we're talking about here. Or to be able to generate text or have a bot speak to you, right? 
We can have a bot that understands and responds in text, but one that really speaks to you fluently, with context about you, is definitely an area we're heavily investing in, or looking to invest in, in the near future. >> And with existing technology. >> With existing technology. We think if you can narrow it down, we can generate emails that are much better than what a salesperson would write. In fact, we already have a product that can personalize a website automatically, using AI, reinforcement learning, and all the data we have. And it can rewrite a website to be customized for each visitor, personalized to each visitor. >> Give us an example of that. >> So, you know, for example, if you go to Siemens or SAP and you come from pharma, it will surface different content about pharmaceuticals. And, you know, in essence, at some point you can generate a whole page that's personalized: if somebody from pharma comes as a CFO versus an IT person, it will change the entire page content, right? In essence, the entire buyer journey could be personalized. Because, you know, today, buying in B2B is quite jarring, it's filled with spam, it's, you know, not a pleasant experience. It's not a concierge-level experience. And really, in an ideal world, you want B2B marketing to be personalized. You want it to be like you're being, you know, guided through; if you need something, you can ask a question and you have a personalized assistant talking to you about it. >> So the journey is not coded in. >> It isn't, yeah. >> The journey, or the conversation, responds to the >> To the customer. >> To the customer. >> Right, and B2B buyers want, you know, they want something like that. They don't have time to waste on it. Who wants to be lost on a website? >> Right. >> You know, you go to any Fortune 500 company's website and it's a mess. >> Okay, so let's back up to Demandbase in the Bay Area software ecosystem. 
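A toy version of the page personalization Aman describes — swapping content by a visitor's industry and role — might look like the rule-based sketch below. The content table and attribute names are invented for illustration; the actual product learns these mappings with reinforcement learning over visitor data rather than hard-coding them.

```python
# Hypothetical content table keyed by (industry, role). A real system
# would learn which content works per segment instead of hard-coding it.
CONTENT = {
    ("pharma", "cfo"): "ROI case studies for pharmaceutical finance leaders",
    ("pharma", "it"): "Integration and compliance docs for pharma IT teams",
}
DEFAULT = "Generic product overview"

def personalize(industry: str, role: str) -> str:
    """Pick page content for a visitor, falling back to a generic page."""
    return CONTENT.get((industry.lower(), role.lower()), DEFAULT)

print(personalize("Pharma", "CFO"))   # pharma finance content
print(personalize("retail", "cmo"))   # unknown segment -> generic page
```

Even this crude lookup shows the structural difference from a static site: the page is a function of who is visiting, not a fixed artifact.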
>> Sure. >> So, Salesforce is a big company. >> Yes. >> Marketing is one of their pillars. >> Yes. >> Tell us, what is it about this next-gen technology that is so, we touched on this before, but so anathema to the way traditional software companies build their products? >> Yeah, I mean, Salesforce is a very close partner, they're a customer, we work with them very closely. I think they're also a small investor in Demandbase. We have a deep relationship with them. And I, myself, come from a traditional software background, you know, I've been building CRM, so I'll talk about myself, because I've seen how different it is, and, you know, I had to transition at a very early stage from a human-centric CRM to a data-driven CRM, or human-driven versus data-driven. And you have to think about things differently. So, one difference is that, when you look at data in a human-driven CRM, you trust it implicitly, because somebody in your org put it in. You may challenge it, it's old, it's stale, but there's no fear that it's a machine recommending and driving you. And it requires the interfaces to be much different. You have to think about how you build trust with the person, you know, who's being driven; a Tesla has a similar problem. And, you know, how do you give them the controls so they can turn off the autopilot, right? And how do you, you know, take feedback from humans to improve the models? So, the human interface becomes different, and simpler. The other interesting thing is that if you look at traditional applications, they're quite complicated. They have all these fields because, you know, you just enter all this data and type it in. But the way you interact with our application is that we already know everything, or a lot. So, why bother asking you? 
We already know where you are, who you are, what you should do, so we are, in essence, guiding you. Using the Tesla autopilot example, it already knows where you are. It knows you're sitting in the car, and it knows that you need to brake because, you know, you're going to crash, so it'll just brake by itself. So, you know, the interface is. >> That's really an interesting analogy. Tesla is a data-driven piece of software. >> It is. >> Whereas, you know, my old BMW or whatever is a human-driven piece of software. >> And there are some things in the middle. So, recently, I mean, I've been looking at cars, I just had a baby, and Volvo is something in the middle. Where, if you're going to have an accident or somebody comes close, it blinks. So, it's like advanced analytics, right? Which is analogous to that. Tesla just stops if you're going to have an accident. And that's the right idea, because if I'm going to have an accident, you don't want to rely on me to look at some light; what if I'm talking on the phone or looking at my kid? You know, some blinking light over there. Which is why advanced analytics hasn't been as successful as it should be. >> Because the handoff between the data-driven and the human-driven is a very difficult handoff. >> It's a very difficult handoff. And whenever possible, the right answer for us today is, if you know everything and you can take the action, like if you're going to have an accident, just stop. Or, if you need to go, go, right? So if you come out in the morning, you know, and you go to work at 9 am, it should just pull itself out, like, you know, why wait for a human? Get rid of all the monotonous problems that we ourselves have, right? >> That's a great example. On that note, let's break. This is George Gilbert. I'm having a great conversation with Aman Naimat, Senior VP and CTO of Demandbase, and we will be back shortly with a member of the data science team. >> Thank you, George.