Aman Naimat, Demandbase, Chapter 3 | George Gilbert at HQ
>> This is George Gilbert from Wikibon. We're back on the ground with Aman Naimat at Demandbase. >> Hey. >> And we're having a really interesting conversation about building next-gen enterprise applications. >> It's getting really deep. (laughing) >> So, let's look ahead a little bit. >> Sure. >> We've talked in some detail about the foundation technologies. >> Right. >> And you told me before that we have so much technology, you know, still to work with. >> Yeah. >> That is unexploited. That we don't need, you know, a whole lot of breakthroughs, but we should focus on customer needs that are unmet. >> Yeah. >> Let's talk about some problems yet to be solved, but that are customer facing with, as you have told me, existing technology. >> Right, can solve. >> Yes. >> Absolutely, I mean, there's a lot of focus in Silicon Valley on, like, scaling machine learning and investing in, you know, GPUs and what have you. But I think there's enough technology there. So where's the gap? The real gap is in understanding how to build AI applications, and how to monetize them, because it is quite different than building traditional applications. It has different characteristics. You know, it's much more experimental in nature. Although, you know, with lean engineering, we've moved towards iterative (mumbles) development, for example. Like, for example, 90% of the time, you know, after 20 years of building software, I'm quite confident I can build software. It turns out, in the world of data science and AI-driven applications, you can't have that much confidence. It's a lot more like discovering molecules in pharma. So you have to experiment more often, and methods have to be discovered; there's more discovery and less engineering in the early stages. >> Is the discovery centered on: do you have the right data? >> Yeah, or are you measuring the right thing, right?
If you thought you were going to work the model to maximize revenue, but really, maybe the objective function should be increasing engagement with the customer. So, often, we don't know the end objective function, or we incorrectly guess the objective function. The only way to handle that is to be able to build an end-to-end system in days, and then iterate through the different models in hours and days, as quickly as possible, with the end goal and customer in mind. >> This is really fascinating, because some of the research we're doing is on the really primitive capabilities of the, sort of, analytic data pipeline. >> Yes. >> Where, you know, all the work that has to do with coming up with the features. >> Yeah. >> And then plugging that into a model, and then managing the model's life cycle. That whole process is so fragmented. >> Yeah. >> And it's, you know, chewing gum and baling wire. >> Sure. >> And I imagine that that slows that experimentation process dramatically. >> I mean, it slows it down, but it's also mindset, right? >> Okay. >> So, now that we have built, you know, probably a hundred machine learning models at Demandbase that I've contributed to building with our data scientists, in the end we've found that you can actually do something in a day or two with an extremely small amount of data, using Python and scikit-learn, very quickly, and then, you know, build some simple UI that a human can evaluate and give feedback on, or whatever action you're trying to get to. And get to that as quickly as possible, rather than worrying about the pipelines, rather than worrying about everything else, because in 80% of the cases it will fail anyway. Or you will realize that either you don't have the right data, or nobody wants it, or it can never be done, or you need to approach it from a completely different objective function.
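(A minimal sketch of what a day-one prototype like the one described here might look like, using Python and scikit-learn as mentioned above. The data, labels, and task are invented for illustration; the point is only that a classifier plus a human-readable report takes minutes, not a pipeline.)

```python
# Hypothetical day-one prototype: a few hundred labeled rows, a basic
# classifier, and a report a human can eyeball -- no pipeline required.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Toy stand-in data: short account descriptions labeled "good fit" (1) or not (0).
texts = [
    "global pharma company expanding analytics team",
    "small bakery looking for point of sale",
    "enterprise IT group evaluating cloud migration",
    "local florist shop",
] * 50
labels = [1, 0, 1, 0] * 50

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=0)

vec = TfidfVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(X_train), y_train)
preds = clf.predict(vec.transform(X_test))

# Print a report a human can scan and judge in seconds.
print(classification_report(y_test, preds))
```

If a human can't look at this output and the underlying rows and say "yes, acting on this would help someone," the speaker's argument is that no amount of infrastructure will fix it.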
>> Let me parse what you've said in a different way. >> Sure. >> And see if I understand it. Traditional model building is based, not on sampling, but on the full data set. >> That's right. >> And what you're saying, in terms of experimentation. >> Start doing that, yes. >> Is to go back to samples. >> That's right. Go back to, there's a misunderstanding that we need all the data; while Demandbase processes close to a trillion rows of data today, we've found that almost all big data AI solutions can be initially proven with a very small amount of data and a small number of features. And if they don't work at that scale, if you cannot take a hundred rows of data, have a human look at some rows and make a judgment, then it's most likely not possible with one billion, or with ten billion. Now, there are exceptions to this, but in 90% of the cases, if the solution doesn't work at, you know, a few thousand or a million rows of data, it won't work at scale. Now, the thing is that all the easy, you know, libraries and open-source stuff that's out there is all designed to be workable on small amounts of data. So, what we don't want to do is build this whole massive infrastructure, which is getting easier, and worry about data pipelines and putting it all together, only to realize that this is not going to work. Or, more often, that it doesn't solve any problem. >> So, if I were to sort of boil that down into product terms. >> Yeah. >> The notion that you could have something like Spark running on your laptop. >> Yeah. >> And scaling out to a big cluster. >> Yeah, just run it on a laptop. >> That, yeah. >> In fact, you don't even need Spark. >> Or, I was going to say, not even Spark. >> No. >> Just use Python. >> Just scikit-learn is much better for something like this. >> It's almost like, so it's back to Visual Basic. You know, you're not going to build a production app in >> I wouldn't go that far. >> Well >> It's a prototype.
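(A sketch of the sampling claim above: prove the idea on a small sample before building for scale. The synthetic data stands in for a real feed; the exact accuracies are illustrative, not measurements from Demandbase.)

```python
# Train on 500 rows and on all the rows; if the small model is hopeless,
# the big one usually will be too. Synthetic data for illustration only.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=100_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

accs = []
for n in (500, len(X_train)):  # small sample vs. "all" the data
    clf = LogisticRegression(max_iter=1000).fit(X_train[:n], y_train[:n])
    acc = accuracy_score(y_test, clf.predict(X_test))
    accs.append(acc)
    print(f"{n:>6} training rows -> accuracy {acc:.3f}")
```

On a problem like this the 500-row model lands within a few points of the full-data model, which is the speaker's point: the cheap experiment tells you whether the expensive one is worth running.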
>> No, I meant for the prototype GUI app you'd use Visual Basic, and then, you know, when you're going to build a production one, you use Microsoft Foundation Classes. >> Because most often, right, more often than not, you don't have the right data, you have the wrong objective function, or your customer is not happy with the results or wants to modify them. And that's true for conventional business applications, the old-school internet applications. But it is more true here, because the data is much more noisy, the problems are much more complex, and ultimately you need to be able to take real-world action. So build something that can take the real-world action, be it for a very narrow problem or use case. And get to it, even without any model. And the first model that I recommend, or that I do, or my data scientists do, is to just do it yourself by hand. Just label the data and say, let's pretend that this was the right answer, and we can take this action and the workflow works. Like, did something good happen? You know, will it be something that will satisfy some problem? And if that's not true, then why build it? And you can do that manually, right? So I guess it's no different than any other entrepreneurial endeavor. But it's more true in data science projects, firstly because they're more likely to be wrong, whereas I think we have now learned how to build good software. >> Imperative software. >> Imperative software. And data science is called data science for a reason. It's much more experimental, right? Like, in science, you don't know. A negative experiment is a fine experiment. >> Of all that we've been talking about, this might sound the most abstract, but it's also the most profound, because what you're saying is this elaborate process and the technology to support it, you know, this whole pipeline, you only do that once you've proven the prototype. >> That's right. And get the prototype in a day.
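(The "do it yourself by hand" idea above can be sketched as a wizard-of-oz test: hand-label a handful of rows, pretend those are the model's answers, and run the downstream action to see whether anything useful would happen. All the names and the routing logic below are invented for illustration, not a real Demandbase API.)

```python
# "Model zero" is you: the hand labels stand in for model predictions.
hand_labeled = [
    {"account": "Acme Pharma", "predicted_intent": "high"},
    {"account": "Bob's Bikes",  "predicted_intent": "low"},
    {"account": "Initech",      "predicted_intent": "high"},
]

def take_action(row):
    """The real-world action the eventual model would trigger."""
    if row["predicted_intent"] == "high":
        return f"route {row['account']} to a sales rep"
    return f"keep {row['account']} in nurture campaign"

for row in hand_labeled:
    print(take_action(row))
```

If even the perfect-answer version of this workflow doesn't make anyone's day better, the speaker's question applies: why build the model at all?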
>> You don't want that elaborate structure and process when you're testing something out. >> No, yeah, exactly. And, you know, when we built our own machine learning models coming out of academia, there was a class project where it took us a year, or six months, to really design the best models, test them, and prove them out with intrinsic testing, and we knew it was working. But what we should really have done, and what we do now, is build models and do experiments daily. We get, in essence, the patient with our molecule every day. So, you know, we have the advantage, given that we're in marketing, that we can test our molecules or drugs on a daily basis. And we have enough data to test it, and we have enough customers, thankfully, to test it. And some of them are collaborating with us. So, we get to an end solution on a daily basis. >> So, now I understand why you said we don't need these radical algorithmic breakthroughs or, you know, new super, turbo-charged processors. So, with this approach of really fast prototyping, what are some of the unmet needs? Is it just a matter of cycling through these experiments? >> Yeah, so I think one of the biggest unmet needs today: we're able to understand language, we're able to predict who you should do business with and what you should talk about, but natural language generation, or creating a personalized email, really personalized and really beautifully written, is still something that we haven't quite, you know, got a full grasp on. And to be able to communicate at human-level personalization, to be able to talk, you know, we can generate ads today, but that's not really, you know, language, right? It is language, but not as sophisticated as what we're talking about here. Or to be able to generate text or have a bot speak to you, right?
We can have a bot, we can now understand and respond in text, but really speaking to you fluently, with context about you, is definitely an area we're heavily investing in, or looking to invest in, in the near future. >> And with existing technology. >> With existing technology. We think if you can narrow it down, we can generate emails that are much better than what a salesperson would write. In fact, we already have a product that can personalize a website automatically, using AI, reinforcement learning, all the data we have. And it can rewrite a website to be customized for each visitor, personalized to each visitor. >> Give us an example. >> So, you know, for example, if you go to Siemens or SAP and you come from pharma, it will surface different content about pharmaceuticals. And, you know, in essence, at some point you can generate a whole page that's personalized: if somebody comes from pharma as a CFO versus an IT person, it will change the entire page content, right? So, in essence, the entire buyer journey could be personalized. Because, you know, today buying in B2B is quite jarring, it's filled with spam, it's, you know, not a pleasant experience. It's not a concierge-level experience. And really, in an ideal world, you want B2B marketing to be personalized. You want it to be like you're being, you know, guided through; if you need something, you can ask a question and you have a personalized assistant talking to you about it. >> So the journey is not coded in. >> It isn't, yeah. >> The journey, or the conversation, responds to the >> To the customer. >> To the customer. >> Right, and B2B buyers want, you know, they want something like that. They don't have time to waste on it. Who wants to be lost on a website? >> Right. >> You know, you go to any Fortune 500 company's website and it's a mess. >> Okay, so let's back up to Demandbase in the Bay Area software ecosystem.
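(The Siemens/SAP example above, reduced to its simplest possible form: route content by visitor industry and role. The lookup table below is invented for illustration; the actual product described learns this mapping with reinforcement learning rather than hard-coding it.)

```python
# Toy content-selection table keyed on (industry, role) firmographics.
CONTENT = {
    ("pharma", "cfo"): "ROI case studies for pharmaceutical finance leaders",
    ("pharma", "it"):  "Validation and compliance docs for pharma IT",
    ("retail", "cfo"): "Retail margin-improvement whitepaper",
}
DEFAULT = "Generic product overview"

def page_for(industry: str, role: str) -> str:
    """Pick the page variant for a visitor; fall back when we know nothing."""
    return CONTENT.get((industry.lower(), role.lower()), DEFAULT)

print(page_for("Pharma", "CFO"))   # a tailored page
print(page_for("Finance", "CEO"))  # falls back to the default
```

The learned version replaces the static table with a policy that updates from visitor engagement, but the interface, firmographics in, page variant out, stays the same shape.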
>> Sure. >> So, Salesforce is a big company. >> Yes. >> Marketing is one of their pillars. >> Yes. >> Tell us, what is it about this next-gen technology that is so, we touched on this before, but so anathema to the way traditional software companies build their products? >> Yeah, I mean, Salesforce is a very close partner, they're a customer, we work with them very closely. I think they're also an investor, a small investor, in Demandbase. We have a deep relationship with them. And I, myself, come from the traditional software background, you know, I've been building CRM, so I'll talk about myself, because I've seen how different it is, and, you know, I had to transition at a very early stage from a human-centric CRM to a data-driven CRM, human-driven versus data-driven. And you have to think about things differently. So, one difference is that when you look at data in a human-driven CRM, you trust it implicitly, because somebody in your org put it in. You may challenge it, it's old, it's stale, but there's no fear that it's a machine recommending to you and driving you. And it requires the interfaces to be much different. You have to think about how you build trust with the person, you know, who's being driven; in a Tesla, also, similar problem. And, you know, how do you give them the controls so they can turn off the autopilot, right? And how do you, you know, take feedback from humans to improve the models? So, the human interface becomes different, and simpler. The other interesting thing is that if you look at traditional applications, they're quite complicated. They have all these fields because, you know, you just enter all this data and you type it in. But the way you interact with our application is that we already know everything, or a lot. So, why bother asking you?
We already know where you are, who you are, what you should do, so we are, in essence, guiding you. Using the Tesla autopilot example, it already knows where you are. It knows you're sitting in the car, and it knows that you need to brake because, you know, you're going to crash, so it'll just brake by itself. So, you know, the interface is. >> That's really an interesting analogy. Tesla is a data-driven piece of software. >> It is. >> Whereas, you know, my old BMW or whatever is a human-driven piece of software. >> And there are some things in the middle. So, recently, I mean, looking at cars, I just had a baby, and Volvo is something in the middle. Where, if you're going to have an accident or somebody comes close, it blinks. So, it's like advanced analytics, right? Which is analogous to that. Tesla just stops if you're going to have an accident. And that's the right idea, because if I'm going to have an accident, you don't want to rely on me to look at some light. What if I'm talking on the phone or looking at my kid? You know, some blinking light over there. Which is why advanced analytics hasn't been as successful as it should be. >> Because the hand-off between the data-driven and the human-driven is a very difficult hand-off. >> It's a very difficult hand-off. And whenever possible, the right answer for us today is: if you know everything and you can take the action, like if you're going to have an accident, just stop. Or, if you need to go, go, right? So if you come out in the morning, you know, and you go to work at 9 am, it should just pull itself out, like, you know, why wait for a human? You know, get rid of all the monotonous problems that we ourselves have, right? >> That's a great example. On that note, let's break. This is George Gilbert, and I'm having a great conversation with Aman Naimat, Senior VP and CTO of Demandbase. We will be back shortly with a member of the data science team. >> Thank you, George.