Claudia Perlich, Dstillery - Women in Data Science 2017 - #WiDS2017 - #theCUBE
>> Narrator: Live from Stanford University, it's theCUBE covering the Women in Data Science Conference 2017. >> Hi, welcome back to theCUBE, I'm Lisa Martin and we are live at Stanford University at the second annual Women in Data Science one-day tech conference. We are joined by one of the speakers for the event today, Claudia Perlich, the Chief Scientist at Dstillery. Claudia, welcome to theCUBE. >> Claudia: Thank you so much for having me. It's exciting. >> It is exciting! It's great to have you here. You are quite the prolific author, you've won data mining competitions and awards, you speak at conferences all around the world. Talk to us about what you're currently doing as the Chief Scientist for Dstillery. Who's Dstillery? What's the Chief Scientist's role and how are you really leveraging data and science to be a change agent for your clients? >> Claudia: I joined Dstillery when it was still called Media6Degrees as a very small startup in the New York ad tech space. It was very exciting. I came out of the IBM Watson Research Lab and really found this a new challenging application area for my skills. What does a Chief Scientist do? It's a good question, I think it actually took the CEO about two years to finally give me a job description, (laughter) and the conclusion at that point was something like, okay there is technical contribution, so I sit down and actually code things and I build prototypes and I play around with data. There is also what is referred to as Intellectual Leadership, so I work a lot with the teams just kind of scoping problems, brainstorming what may work or doesn't, and finally, that's what I'm here for today, is what they consider an Ambassador for the company, so being the face to talk about the more scientific aspects of what's happening now in ad tech, which brings me to what we actually do, right.
One of the things that happened over the recent past in advertising is it became an incredible playground for data science because the available data is incomparable to many other fields that I have seen. And so Dstillery was a pioneer in that space starting to look at initially social data, things that people shared, but over the years it has really grown into getting a sense of the digital footprint of what people do. And our primary business model was to bring this to marketers to help them on a much more individualized basis identify who their current as well as future customers are. Really get a very different understanding than these broad middle-aged soccer mom kind of categories to honor the individual tastes and preferences and actions that really truly reflect the variety of what people do. I'm many things as you mentioned, I publish, I'm a mom, and I have a horse, so there are many different parts to me. I don't think any single one description fully captures that and we felt that advertising is a great space to explore how you can translate that and help both sides, the people that are being interacted with, as well as the brands that want to make sure that they reach the right individuals. >> Lisa: Very interesting. Well, as the buyer's journey has changed to mostly online, >> Exactly. >> You're right, it's an incredibly rich opportunity for companies to harness more of that behavioral information and probably see things that they wouldn't have predicted. We were talking to Walmart Labs earlier and one of the interesting insights that they shared was that, especially in Silicon Valley where people spend too much time in the car commuting-- (laughter) You have a long commute as well by train. >> Yes.
>> And you'd think that people would want, I want my groceries to show up on my doorstep, I don't want to have to go into the store, and they actually found the opposite, that people in such a cosmopolitan area as Silicon Valley actually want to go into the store and pick up-- >> Claudia: Yep. >> Their groceries, so it's very interesting how the data actually can sometimes really change. It's really the scientific method on a very different scale >> Claudia: Much smaller. >> But really using the behavior insights to change the shopping experience, but also to change the experience of companies that are looking to sell their products. >> I think that the last part of the puzzle is, the question is no longer what is the right video for the Super Bowl, I mean we have the Super Bowl coming up, right? >> Lisa: Right. Right. >> They did a study like when do people pay attention to the Super Bowl. You can actually tell, cuz you know what people don't do when they pay attention to the Super Bowl? >> Lisa: Mm-hmm. >> They're not playing around with their phones. They're actually not playing-- >> Lisa: Of course. >> Candy Crush and all these things, so what we see in the ad tech environment, we actually see that the demand for the digital ads goes down when people really focus on what's going on on the big screen. But that was a diversion ... >> Lisa: It's very interesting (laughter) though cuz it's something that's very tangible and very ... It's a real-world application. Question for you about data science and your background. You mentioned that you worked with IBM Watson. Forbes has just said that Data Scientist is the best job to apply for in 2017. What is your vision? Talk to us about your team, how you've grown that up, how you're using big data and science to really optimize the products that you deliver to your customers. >> Data Science is really many, many different flavors and in some sense I became a Data Scientist long before the term really existed.
Back then I was just a particularly weird kind of geek. (laughter) You know all of a sudden it's-- >> Now it has a name. (laughter) >> Right and the reputation to be fun and so you see really many different application areas depending on very different skillsets. What is originally the focus of our company has always been around, can we predict what people are going to do? That was always the primary focus and now you see that it's very nicely reflected at the event too. All of a sudden communicating this becomes much bigger a part of the puzzle where people say, "Okay, I realize that you're really good at predicting, but can you tell me why, what is it, these nuggets of insight--" >> Interpretation, right. >> "--that you mentioned. Can you visualize what's going on?" And so we grew a team initially from a small group of really focused machine learning and predictive skills over to the broader can you communicate it. Can you explain to the customer, to the brands, what happened here? Can you visualize data? That's kind of the broader shift and I think the most challenging part that I can tell in the broader picture of where there is a bit of a shortcoming in skillset, we have a lot of people who are really good today at analyzing data and coding, so that part has caught up. There are so many Data Science programs. What I still am looking for is how do you bring management and corporate culture to the place where they can truly take advantage of it. >> Lisa: Right. >> This kind of disconnect that we still have-- >> Lisa: Absolutely. >> How do we educate the management level to be comfortable evaluating what their data science group actually did. Whether they are working on the right problems that really ultimately will have impact. I think that layer of education needs to receive a lot more emphasis compared to what we already see in terms of this increased skillset on just the sheer technical side of it. >> You mentioned that you teach-- >> Claudia: Mm-hmm.
>> Before we went live here, that you teach at NYU, but you're also teaching Data Science to the business folks. I would love for you to expand a little bit more upon that and how are you helping to educate these people to understand the impact. Cuz that's really, really a change agent within the company. That's a cultural change, which is really challenging-- >> Claudia: Very much so. >> Lisa: What's their perception? What's their interest in understanding how this can really drive value? >> What you see, I've been teaching this course for almost six years now, and originally it was really kind of the hardcore coders who also happened to get a PhD on the side, who came to the course. Now you increasingly have a very broad collection of business-minded people. I typically teach in the part-time, meaning they all have day jobs and they've realized in their day jobs, I need this. I need that. That skill. That knowledge. We're trying to get on the ground where without having to teach them Python and R and whatever the new toys are there. How can you identify opportunities? How do you know which of the many different flavors of Data Science, from prediction towards visualization to just analyzing historical data to maybe even causality. Which of these tools is appropriate for the task at hand and then being able to evaluate whether the level of support that a machine can only bring, is it even sufficient? Because often just because you can analyze data doesn't mean that the reliability of the model is truly sufficient to support then a downstream business project. Being able to really understand those trade-offs without necessarily being able to sit down and code it yourself. That knowledge has become a lot more valuable and I really enjoy the brainstorming when we're just trying to scope a project when they come with problems from their day job and say, "Hey, we're trying to do that." And saying, "Are you really trying to do that?"
"What are you actually able to execute? What kind of decisions can you make?" This is almost like the brainstorming in my own company now brought out to much broader people working in hospitals, people working in banking, so I get exposed to all of these kinds of problem sets and that makes it really exciting for me. >> Lisa: Interesting. When Dstillery is talking to customers or prospective customers, is this now something that you're finding is a board-level conversation within businesses? >> Claudia: No, I never get bored of that, so there is a part of the business that is pretty well understood and executed. You come to us, you give us money, and we will execute a digital campaign, either on mobile phones, on video, and you tell me what it is that you want me to optimize for. Do you want people to click on your ad? Please don't say yes, that's the worst possible thing you may ask me to do-- (laughter) But let's talk about what you're going to measure, whether you want people to show up in your store, whether you really care about signing up for a test drive, and then the system automatically will build all the models that then do all the real-time bidding. Advertising, I'm not sure how many people are aware, as your New York Times page loads, every single ad slot on that site is sold in a real-time auction. About 50 billion times a day, we receive a request whether we want to bid on the opportunity to show somebody an ad. >> Lisa: Wow. >> So that piece, I can't make 50 billion decisions a day. >> Lisa: Right. >> It is entirely automated. There's this fully automated machine learning that just serves that purpose. What makes it interesting for me now that ... Now this is kind of standard fare, if you want to move over to the more interesting parts. Well, can you for instance predict which of the 15 different creatives I have for Chobani, should I show you? >> Lisa: Mm-hmm.
>> The one with the woman running, or the one with the kid opening, so there are new nuances to it and exploring these new challenges or going into totally new areas talking about, for instance churn prediction, I know an awful lot about people, I can predict very many things and a lot of them go far beyond just how you interact with ads, it's almost the most boring part. We can see people researching diabetes. We can provide snapshots to pharma telling them here's really where we see a rise of activity on a certain topic and maybe this is something of interest to understand which population is driving those changes. These kinds of conversations are really making it exciting for me to bring the knowledge of what I see back to many different constituents and see what kind of problems we can possibly support with that. >> Lisa: It's interesting too. It sounds like more, not just providing ad technology to customers-- >> Claudia: Yeah. >> You're really helping them understand where they should be looking to drive value for their businesses. >> Claudia: That's really been the focus increasingly and I enjoy that a lot. >> Lisa: I can imagine that, that's quite interesting. Want to ask you a little bit before we wrap up here about your talk today. I was looking at your, the title of your abstract is, "Beware what you ask for: The secret life of predictive models". (laughter) Talk to us about some of the lessons you learn when things have gone a little bit, huh, I didn't expect that. >> I'm a huge fan of predictive modeling. I love the capabilities and what this technology can do. This being said, it's a collection of aha moments where you're looking at this and this, this doesn't really smell right. To give you an example from ad tech, and I alluded to this, when people say, "Okay we want a high click-through rate." Yes, that means I have to predict who will click on an ad.
And then you realize that no matter what the campaign, no matter what the product, the model always chooses to show the ad on the flashlight app. Yeah, because that's when people fumble in the dark. The model's really, really good at predicting when people are likely to click on an ad, except that's really not what you intended-- >> Right. >> When you asked me to do that. >> Right. >> So it's almost that they're so good and powerful that they move off into a sidetracked direction you didn't even know existed. Something similar happened with one of these competitions that I won, for Siemens Medical, where you had to identify in fMRI images of breasts, which of these regions are most likely benign or which ones have cancer. In both models we did really, really well, all was good. Until we realized that the patient ID was by far the most predictive feature. Now this really shouldn't happen. Your social security number shouldn't be able to predict-- >> Lisa: Right. >> Anything really. It wasn't the social security number, but when we started looking a little bit deeper, we realized what had happened is the data set was a sample from different sources, and one was a treatment center, and one was a screening center and they had certain ranges of patient IDs, so the model had learned where the machine stood, not what the image actually contained about the probability of having cancer. Whoever assembled the data set possibly didn't think about the downstream effect this can have on modeling-- >> Right. >> Which brings us back to the data science skill as really comprehensive starting all the way from the beginning of where the data is collected, all the way down to be extremely skeptical about your own work and really make sure that it truly reflects what you want it to do. You asked earlier like what makes really good Data Scientists.
The intuition to feel when something is wrong and to be able to pinpoint and trace it back with the curiosity of really needing to understand everything about the whole process. >> Lisa: And also not only being able to communicate it, but probably being willing to fail. >> Claudia: That is the number one really requirement. If you want to have a data-driven culture, you have to embrace failure, because otherwise you will fail. >> Lisa: How do you find the reception (laughter) to that fact by your business students. Is that something that they're used to hearing or does it sound like a foreign language to them? >> I think the majority of them are in junior enough positions that they-- >> Lisa: Okay. >> Truly embrace that and if at all, they have come across the fact that they weren't allowed to fail as often as they had wanted to. I think once you go into the higher levels of conversation and we see that a lot in the ad tech industry where you have incentive problems. We see a lot of fraud being targeted. At the end of the day, the ad agency doesn't want to confess to the client that yeah they just wasted five million dollars-- >> Lisa: Right. >> Of ad spend on bots, and even the CMO might not be feeling very comfortable confessing that to the CEO-- >> Right. >> Claudia: Being willing to truly face up to the truth that sometimes data forces into your face, that can be quite difficult for a company or even an industry. >> Lisa: Yes, it can. It's quite revolutionary. As is this event, so Claudia Perlich we thank you so much for joining us-- >> My pleasure. >> Lisa: On theCUBE today and we know that you're going to be mentoring a lot of people that are here. We thank you for watching theCUBE. We are live at Stanford University from the Women in Data Science Conference. I am Lisa Martin and we'll be right back (upbeat music)
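Editor's sketch: the patient ID story in the interview is an instance of what is often called target leakage, and a rough way to check for it can be illustrated in a few lines. This is not the competition data or Claudia's method. Everything below is a hypothetical toy: the ID ranges, the cancer rates per source, the `image_signal` feature, and the function names are all assumptions chosen only to mimic "a screening center and a treatment center with different ranges of patient IDs". The idea is to score each feature on its own and be suspicious when an identifier that should carry no information beats the real signal.

```python
import random


def auc(scores, labels):
    """Chance that a random positive scores above a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic_patients(n_per_source=300, seed=0):
    """Toy data: two hypothetical sources with disjoint ID ranges and different cancer rates."""
    rng = random.Random(seed)
    rows = []
    # (lowest ID, highest ID + 1, cancer rate): assumed screening center, then treatment center.
    sources = [(0, 100_000, 0.02), (100_000, 200_000, 0.60)]
    for low, high, cancer_rate in sources:
        for _ in range(n_per_source):
            label = 1 if rng.random() < cancer_rate else 0
            patient_id = rng.randrange(low, high)
            # A deliberately weak "real" signal standing in for what the image contains.
            image_signal = rng.gauss(0.5 * label, 1.0)
            rows.append((patient_id, image_signal, label))
    return rows


rows = make_synthetic_patients()
labels = [label for _, _, label in rows]

# Score each feature alone. The ID should be near 0.5 if it carried no information.
id_auc = auc([pid for pid, _, _ in rows], labels)
image_auc = auc([sig for _, sig, _ in rows], labels)

print(f"AUC using patient ID alone:   {id_auc:.2f}")
print(f"AUC using image signal alone: {image_auc:.2f}")
if id_auc > image_auc:
    print("Warning: the identifier out-predicts the real signal; check how the data was assembled.")
```

In this toy setup the ID wins only because it encodes which center a record came from, which is the same effect described in the interview: the model learns where the machine stood, not what the image contains.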
ENTITIES
Entity | Category | Confidence |
---|---|---|
Claudia Perlich | PERSON | 0.99+ |
Lisa Martin | PERSON | 0.99+ |
Lisa | PERSON | 0.99+ |
Claudia | PERSON | 0.99+ |
2017 | DATE | 0.99+ |
Candy Crush | TITLE | 0.99+ |
Silicon Valley | LOCATION | 0.99+ |
Siemens Medical | ORGANIZATION | 0.99+ |
Dstillery | ORGANIZATION | 0.99+ |
New York | LOCATION | 0.99+ |
Super Bowl | EVENT | 0.99+ |
Walmart Labs | ORGANIZATION | 0.99+ |
IBM Watson Research Lab | ORGANIZATION | 0.99+ |
Jobani | PERSON | 0.99+ |
five million dollars | QUANTITY | 0.99+ |
both models | QUANTITY | 0.99+ |
both sides | QUANTITY | 0.99+ |
single | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
15 different creatives | QUANTITY | 0.98+ |
One | QUANTITY | 0.97+ |
#WiDS2017 | EVENT | 0.97+ |
about two years | QUANTITY | 0.97+ |
ARM | ORGANIZATION | 0.97+ |
Women in Data Science Conference 2017 | EVENT | 0.97+ |
Women in Data Science Conference | EVENT | 0.97+ |
Women in Data Science | EVENT | 0.96+ |
one | QUANTITY | 0.96+ |
Media6Degrees | ORGANIZATION | 0.96+ |
About 50 billion times a day | QUANTITY | 0.95+ |
Forbes | ORGANIZATION | 0.95+ |
Stanford University | ORGANIZATION | 0.93+ |
50 billion decisions a day | QUANTITY | 0.92+ |
Women in Data Science 2017 | EVENT | 0.92+ |
Beware what you ask for: The secret life of predictive models | TITLE | 0.9+ |
IBM Watson | ORGANIZATION | 0.89+ |
theCUBE | ORGANIZATION | 0.89+ |
almost six years | QUANTITY | 0.88+ |
one day | QUANTITY | 0.86+ |
Stanford University | ORGANIZATION | 0.84+ |
NYU | ORGANIZATION | 0.82+ |
single ad | QUANTITY | 0.72+ |
python | ORGANIZATION | 0.66+ |
second annual | QUANTITY | 0.62+ |
one of the speakers | QUANTITY | 0.61+ |
New York Times | TITLE | 0.6+ |
dozen | QUANTITY | 0.56+ |