Tanmay Bakshi, IBM Honorary Cloud Advisor | Open Source Summit 2017


 

>> Announcer: Live from Los Angeles, it's theCUBE, covering Open Source Summit North America 2017. Brought to you by the Linux Foundation and Red Hat.
>> Hello everyone, welcome back to theCUBE's live coverage of the Open Source Summit North America, part of the Linux Foundation. I'm John Furrier, your host, with Stu Miniman, our co-host. Our next guest is Tanmay Bakshi, who is an IBM honorary cloud advisor, algorithmist, and CUBE alumnus. Great to see you.
>> Thank you very much! Glad to be here!
>> You get taller every year. It was what, three years ago, two years ago?
>> I believe, yeah, two years ago, Interconnect 2016.
>> IBM show... doing a lot of great stuff. You're an IBM VIP, you're doing a lot of work with them. IBM Champion.
>> Thank you.
>> Congratulations.
>> Thank you.
>> What's new? You're pushing any code today?
>> Definitely! Now today, getting ready for my BoF that I've got tonight, it's been absolutely great. I've been working on a lot of new projects that I'm going to be talking about today and tomorrow at my keynote. Like, I've been working on AskTanmay; of course, you know, Interconnect 2016 was the very first time I presented AskTanmay. Since then, a lot has changed. I've incorporated real, custom deep learning algorithms, with TensorFlow, into AskTanmay. AskTanmay now thinks about what it's actually looking at, using Watson as well; it's really interesting. And of course, there are new projects that I'm working on, including DeepSPADE, which basically helps online communities to detect, and of course report and flag, spam from different websites. For example, Stack Overflow, which I'm working on right now.
>> So you're doing some deep learning stuff
>> Tanmay: Yes.
>> with IBM Watson, the team, everything else.
>> Tanmay: Exactly, yes.
>> What's the coolest thing you've worked on since we last talked? (laughing)
>> Well, it would have to be a tie between AskTanmay, DeepSPADE, and advancement to the Cognitive Story.
As you know from last time, I've been working on lots of interesting projects, like AskTanmay, with some great new updates that you'll hear about today. DeepSPADE itself, though, I'd like to get a little bit more into. I mean, of course, everyone listening right now has used Stack Overflow or Stack Exchange at one point in their lives. And so they've probably noticed that, here and there, you'd see a spam message on Stack Overflow, on a comment or post. And of course there are methods to try and prevent spam on Stack Overflow, but they aren't very effective. And that's why a group of programmers, known as Charcoal SE, actually went ahead and started creating, basically, this suite to try and prevent spam on Stack Exchange. And they call it SmokeDetector. And it helps them to find and remove spam on Stack Exchange.
>> This is so good until it goes out, and the battery needs to be replaced, and you got to get on a chair. But this whole SmokeDetector, this is a real way they help create a good, healthy community.
>> Yes, exactly. So they basically try to find spam and report it to moderators, and if enough alarms are set off, they try to report it, or flag it automatically, via other people's accounts. And so basically, what I'm trying to do is, I mean, a few weeks ago, when I found out about what they're doing, I found out that they use regular expressions to try and find spam. And so they have, you know, years of people gathering experience; they're experts in this field. And they keep, you know, adding more regular expressions to try and find spam. And since I, you know, am really, really passionate about deep learning, I thought, why not try and help them out by trying to augment SmokeDetector with deep learning? And so they graciously donated their data set to me, which has a good amount of training rows for me to actually train a deep learning system to classify a post as spam or non-spam.
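To illustrate the regular-expression approach described here, the following is a minimal sketch in Python. The patterns, the function names, and the `min_alarms` threshold are hypothetical and chosen only for illustration; they are not SmokeDetector's actual rules or code.

```python
import re
from typing import List

# Hypothetical rules for illustration only; NOT SmokeDetector's real patterns.
SPAM_PATTERNS = [
    # A sales phrase such as "buy cheap" or "order now".
    re.compile(r"\b(?:buy|order)\s+(?:cheap|now)\b", re.IGNORECASE),
    # Something shaped like a phone number: digits with spaces, dots, dashes or parentheses.
    re.compile(r"\+?\d[\d\s().-]{8,}\d"),
    # A link whose address contains a suspicious keyword.
    re.compile(r"https?://\S*(?:pills|casino)\S*", re.IGNORECASE),
]


def matched_rules(post: str) -> List[int]:
    """Return the index of every rule that fires on the post."""
    return [i for i, pattern in enumerate(SPAM_PATTERNS) if pattern.search(post)]


def is_spam(post: str, min_alarms: int = 1) -> bool:
    """Flag the post when at least `min_alarms` rules fire."""
    return len(matched_rules(post)) >= min_alarms


if __name__ == "__main__":
    for text in [
        "Buy cheap watches, call +1 (555) 010-9999 now",
        "How do I reverse a list in Python?",
        "Amazing deals on watches at my site",
    ]:
        print(is_spam(text), matched_rules(text), text)
```

In this sketch the third post reads like an advertisement but contains none of the listed phrases, so no rule fires. That is the kind of gap a learned classifier is meant to cover.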
And you'll be hearing a lot more about the model architecture, the CNN plus GRU model that I've got running in Keras, tonight during my BoF.
>> Now, machine learning could be a real benefit to spam detection, 'cause of the patterns.
>> Tanmay: Exactly.
>> Spammers tend to have their own patterns,
>> Tanmay: Exactly.
>> as do bots.
>> Tanmay: Yes, exactly, exactly. And eventually, you realize that hey, maybe we're not using the same words in every post, but there's a specific pattern of words, or a specific type of word, that always appears in a spam message. And machine learning would help us combat that. And of course, in this case, maybe we don't actually have a word, or a specific website, or a specific phone number, that would trigger a regular expression alarm. But from the context in which this website appears, machine learning can tell us that, "hey, yeah, this is probably a spam post." There are lots of really interesting places where machine learning can tie in with this and help out with the accuracy. In fact, I've been able to reach around 98% accuracy on around 15 thousand testing rows. So I'm very happy with the results so far, and of course, I'm continuing to do all this brand retuning and everything...
>> Alright, so how old are you this year? I can't keep the numbers straight. Are you 13, 14?
>> Well, originally, at Interconnect 2016, I was 12, but now I'm 13 years old, and I'm going to be 14 in October, October 16th.
>> Okay, so you're knocking on 14?
>> Tanmay: Uh, not just yet there, I'll be 14...
>> So, Tanmay, you're 14, your time's done at this point. But one of your missions, to be serious, is helping to inspire the next generation. Especially here, at the Open Source Summit, give us a preview of what we're going to see in your keynote.
>> Sure, definitely. And now, as you mentioned, in fact, I actually have a goal.
Which is really to reach out to and help 100 thousand aspiring coders along their journey of learning to code, and of course then applying that code in lots of different fields. In fact, I'm already around 4,500 people there. Which I'm very, very excited about. But today, during my BoF, as I mentioned, I'm going to be talking a lot about the in-depth details of the DeepSPADE and AskTanmay projects I've been working on. But tomorrow, during my keynote, you'll be hearing a lot about, generally, all the projects that I've been working on, and how they're impacting lots of different fields. Like healthcare, utility, and security, via artificial intelligence and machine learning.
>> So, when you first talked to us about AskTanmay, it's been what, almost 18 months, I think. What's changed, what's accelerating? I hear you throw out things like TensorFlow, not something we were talking about two years ago.
>> Tanmay: Yeah.
>> What have been some of the key learnings you've had, as you've really dug into this?
>> Sure. In fact, this is actually something that I'm going to be covering tonight. And that is that AskTanmay, you could say, gets its DNA, well, from AskMSR, which was made in 2002. And I took that, revived it, and basically made it into AskTanmay. In its DNA, there were specific elements; for example, it really relies on data redundancy. If there's no data redundancy, then AskTanmay doesn't do well. If you were to ask it where the Open Source Summit North America is going to be held, it wouldn't answer correctly, because the answer is not redundant enough on the internet. It's mentioned once or twice, but not more than that. And so I learned that it's currently very, I guess you could say, naive in how it actually understands the data that it's collecting. However, over the past, I'd say, six or seven months, I've been able to implement BiDAF, or Bi-Directional Attention Flow, that was created by Allen AI.
It's completely open-source, and it uses something called the SQuAD data set, or Stanford Question Answering Dataset, in order to actually take paragraphs and questions and try to return answers as snippets from the paragraphs. And so again, integrating this into AskTanmay allows me to really reduce the data redundancy requirement, and to merge very similar answers to have, you know, better answers at the top of the list. And of course I'm able to make it smarter; it's not as naive. It actually understands the content that it's gathering from search engines. For example, Google and Bing, which I've also added search support for. So again, a lot has changed using deep learning, but still, sort of the key points of AskTanmay remain: it requires very little computational power, and it's very, very cross-platform; it runs on any operating system, including iOS, Android, etc. And of course, from there, it's completely open-source.
>> So how has your life changed since all the... you've been really in the spotlight, and well-deserved, I think. It's been great to have you on theCUBE multiple times, thanks for coming on.
>> Thank you. No, definitely, of course.
>> Dave Vellante was just calling. He wants to ask you a few questions himself. Dave, if you're watching, we'll get you on, just call right now. What's going on, what are you going to do when... Are you, like, happy right now? Are you cool with everything? Or is there a point where you say, "Hey, I want to play a little bit with different tools," you want more freedom? What's going on?
>> Well, you see, right now I'm very, very excited, I'm very happy with what I'm doing. Because of course, I mean, my life generally has changed quite a bit since the last Interconnect, you could say. From Interconnect 2016 to '17, to now. Of course, since then, I've been able to go into lots of different fields. Not only am I working with general deep learning at IBM Watson, now I'm working with lots of different tools.
And I'm working especially, in terms of, like, for example, Linux. What I've been doing with open-source and everything. I've been able to create, for example... AskTanmay now integrates Keras and TensorFlow. DeepSPADE is actually built entirely on TensorFlow and Keras. And now I've also been able to venture into lots of different APIs as well. Not just IBM Watson, but also things like the Dandelion API. AskTanmay also relies on Dandelion, which provides text similarity services for semantic and syntactic text similarity. Which, again, we'll be talking about tonight as well. So, yeah, a lot has changed, and of course, with all this sort of new stuff that I'm able to show, or new media through which I'm able to share my knowledge, for example, all these, you know, CUBE interviews I've been doing, and of course all these keynotes, I'm able to really spread my message about AI, and why I believe it's not only our future, but also our present. Like, for example, I also mentioned this last time. If you were to just open up your phone right now, you'd already see that half of your phone is powered by AI. It's detecting that, hey, you're at your home right now, you just drove back from work, and it's this time on this day, so you probably want to open up this application. It predicts that, and provides you with that. Apart from that, things like Siri and Google Now are all powered by AI; they're already an integral part of our lives. And of course, what they're going to be doing in our lives to come is just absolutely great. With, like, healthcare, providing artificial communication ability for people who can't communicate naturally. I think it's going to be really, really interesting.
>> Tanmay, it's always great to have you on theCUBE. Congratulations.
>> Tanmay: Thank you very much.
>> AskTanmay, good projects. Let's stay in touch; as we start to produce more collaboration, we'd love to keep promoting your work. Great job.
And you're an inspiration to many.
>> Tanmay: Thank you very much, glad to be here.
>> Thanks for coming on theCUBE. This is theCUBE's live coverage from the Open Source Summit in Los Angeles. I'm John Furrier, with Stu Miniman. We'll be back with more live coverage after this short break. (upbeat music)
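
The data-redundancy idea mentioned in the interview (an answer is trusted when many search snippets repeat it) can be sketched roughly as follows. This is a simplified illustration in Python, assuming snippets have already been fetched; the stopword list and the function names are hypothetical, and this is not the actual AskMSR or AskTanmay code.

```python
import re
from collections import Counter
from typing import List, Set, Tuple

# A tiny, hypothetical stopword list; a real system would use a much larger one.
STOPWORDS = {"the", "a", "an", "of", "in", "is", "was", "to", "and", "at", "on", "for", "it"}


def tokens(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def candidate_ngrams(snippet: str, question_words: Set[str], max_n: int = 2) -> Set[str]:
    """Collect word n-grams that contain no stopwords and no words from the question."""
    words = tokens(snippet)
    found = set()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if any(w in STOPWORDS or w in question_words for w in gram):
                continue
            found.add(" ".join(gram))
    return found


def vote(question: str, snippets: List[str], max_n: int = 2) -> List[Tuple[str, int]]:
    """Rank candidates by how many snippets mention them (each snippet votes once)."""
    question_words = set(tokens(question))
    counts = Counter()
    for snippet in snippets:
        counts.update(candidate_ngrams(snippet, question_words, max_n))
    return counts.most_common()


if __name__ == "__main__":
    ranked = vote(
        "Where is the Eiffel Tower?",
        [
            "The Eiffel Tower is in Paris.",
            "Paris is home to the Eiffel Tower.",
            "Visit the Eiffel Tower in Paris, France.",
        ],
    )
    print(ranked[0])
```

With three snippets that all mention the same place, that candidate collects three votes and ranks first. With a single snippet, every candidate ties at one vote, which matches the weakness described in the interview for answers that appear only once or twice online.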

Published Date : Sep 11 2017

ENTITIES

Entity | Category | Confidence
Dave Vallante | PERSON | 0.99+
Tanmay | PERSON | 0.99+
Stu Miniman | PERSON | 0.99+
John Furrer | PERSON | 0.99+
Tanmay Bakshi | PERSON | 0.99+
Red Hat | ORGANIZATION | 0.99+
Dave | PERSON | 0.99+
Linux Foundation | ORGANIZATION | 0.99+
2002 | DATE | 0.99+
John Furrier | PERSON | 0.99+
Siri | TITLE | 0.99+
Los Angeles | LOCATION | 0.99+
tomorrow | DATE | 0.99+
IBM | ORGANIZATION | 0.99+
CUBE | ORGANIZATION | 0.99+
three years ago | DATE | 0.99+
today | DATE | 0.99+
tonight | DATE | 0.99+
two years ago | DATE | 0.99+
13 | QUANTITY | 0.99+
iOS | TITLE | 0.99+
Linux | TITLE | 0.99+
Google | ORGANIZATION | 0.99+
100 thousand | QUANTITY | 0.99+
Android | TITLE | 0.99+
North America | LOCATION | 0.99+
CNN | ORGANIZATION | 0.99+
twice | QUANTITY | 0.98+
AskTanmay | ORGANIZATION | 0.98+
Open Source Summit | EVENT | 0.98+
14 | QUANTITY | 0.98+
theCUBE | ORGANIZATION | 0.98+
first | QUANTITY | 0.97+
12 | QUANTITY | 0.97+
around 98% | QUANTITY | 0.97+
Interconnect | ORGANIZATION | 0.97+
IBM Watson | ORGANIZATION | 0.97+
DeepSpade | TITLE | 0.97+
Stanford | ORGANIZATION | 0.97+
Bing | ORGANIZATION | 0.97+
around 4,500 people | QUANTITY | 0.96+
Open Source Summit North America 2017 | EVENT | 0.96+
Open Source Summit 2017 | EVENT | 0.96+
GRU | ORGANIZATION | 0.95+
first time | QUANTITY | 0.95+
Dandelion | TITLE | 0.95+
Stack Overflow | TITLE | 0.95+
once | QUANTITY | 0.94+
Keras | TITLE | 0.93+
Open Source Summit North America | EVENT | 0.92+
one point | QUANTITY | 0.92+
this year | DATE | 0.92+
around 15 thousand testing rows | QUANTITY | 0.91+
around six | QUANTITY | 0.9+
Interconnect | TITLE | 0.9+