Satish Puranam & Rebecca Riss, Ford | KubeCon + CloudNativeCon NA 2022

(bright music) (crowd talking indistinctly in the background) >> Hey guys, welcome back to Detroit, Michigan. theCUBE is live at KubeCon + CloudNativeCon 2022. You might notice something really unique here. Lisa Martin with our newest co-host of theCUBE, Savannah Peterson! Savannah, it's great to see you. >> It's so good to be here with you (laughs). >> I know, I know. We have a great segment coming up. I always love talking couple things, cars, one, two, with companies that have been around for a hundred plus years and how they've actually transformed. >> Oh yeah. >> Ford is here. You have a great story about how you, about Ford. >> Ford brought me to Detroit the first time. I was here at the North American International Auto Show. Some of you may be familiar, and the fine folks from Ford brought me out to commentate just like this, as they were announcing the Ford Bronco. >> Satish: Oh wow. >> Which I am still lusting after. >> You don't have one yet? >> For the record. No, I don't. My next car's got to be an EV. Although, ironically, there's a Ford EV right behind us here on set today. >> I know, I know. >> Which we were both just contemplating before we went live. >> It's really shiny. >> We're going to have to go check it out. >> I have to check it out. Yep, we'll do that. Yeah. Well, please welcome our two guests from Ford: Satish Puranam is here, the Technical Leader at Cloud, and Rebecca Riss, Principal Architect, developer relations. We are so excited to have you guys on the program. >> Clearly. >> Thanks for joining us. (all laugh) >> Thank you for having us. >> I love you're Ford enthusiasts! Yeah, that's awesome. >> I drive a Ford. >> Oh, awesome! Thank you. >> I can only say that's one car company here. >> That's great. >> Yes, yes. >> Great! Thank you a lot. >> Thank you for your business! >> Absolutely. 
(all laugh) >> So, Satish, talk to us a little bit about- I mean I think of Ford as a car company but it seems like it's a technology company that makes cars. >> Yes. Talk to us about Ford as a Cloud first, technology driven company, and then we're going to talk about what you're doing with Red Hat and Boston University. >> Yeah, I'm like everything that all these cars that you're seeing, beautiful right behind us, it's all built on, around, and with technology, right? So there's so much code goes into these cars these days, it's probably, it's mind boggling to think that probably your iPhones might be having less code as opposed to these cars. Everything from control systems, everything is code. We don't do any more clay models. Everything is done digital, 3D, virtual reality and all that stuff. So all that takes code, all of that takes technology. And we have been in that journey for the last- since 2016 when we started our first mobile app and all that stuff. And of late we have been like, heavily invested in Google. Moving a lot of these experiences, data acquisition systems, AI/ML modeling for like all the autonomous cars. It's all technology and like from the day it is conceived, to the day it is marketed, to the day when you show up for a servicing, and hopefully soon how you can buy and you know, provide feedback to us, is all technology that drives all of this stuff. So it's amazing for us to see everything that we go and immerse ourselves in the technology. There is a real life thing that we can see what we all do for it, right? So- >> Yes, we're only sorry that our audience can't actually see the car, >> Yep. >> but we'll get some B-roll for you later on. Rebecca, talk a little bit about your role. Here we are at KubeCon, Savannah and I and John were talking when we went live this morning, that this is huge. That the show floor is massive, a lot bigger than last year. 
The collaboration and the spirit of the community is not only alive and well, as we heard in the keynote this morning, it's thriving. >> Yeah. >> Talk about developer relations at Ford and what you are helping to drive in your role. >> Yeah, so my team is all about helping developers work faster with different platforms that my team curates and produces, so that our developers don't have to deal with all of the details of setting up their environments to actually code. And we have really great people, kind of the top software developers in the company, are part of my team to produce those products that other people can use, and accelerate their development. And we have a great relationship with the developers in the company and outside with the different vendor relationships that we have, to make sure that we're always producing the next platform with the next tech stack that our developers will want to continue to use to produce the really great products that we are all about making at Ford. >> Let's dig in there a little bit because I'm curious and I suspect you both had something to do with it. How did you approach your Cloud Native transformation and how do you evaluate new technologies for the team? >> It's sometimes- many a times I would say it's like dogfooding and like experimentation. >> Yeah. Isn't anything in innovation a lot of- >> Yeah, a lot of experimentation. We started our, as I said, the Cloud Native journey back in 2016 with Cloud Foundry and things, technologies around that. Soon realized, that there was like a lot of buzz around that time. Twelve-Factor was a thing, Stateless was a thing. And then all those Stateful needs to drive the Stateless. So where do we do that thing? And the next logical iteration was Kubernetes was bursting upon the scene at that time. So we started doing a lot of experimentation. >> Like the Kool-Aid man, burst on the Kubernetes scene- >> Exactly right. >> Through the wall. 
>> So, the question is like, why can't we do? I think we were like crazy enough to say that Kubernetes people are talking about our serverless or Twelve-Factor on Kubernetes. We are crazy enough to do Stateful on Kubernetes and we've been doing it successfully for five years. So it's a lot about experimentation. I think good chunk of experiments that we do do not yield the results that we get, but many a times, some of them are like Gangbusters. Like, other aspects that we've been doing of late is like partnering with Becky and rest of the organization, right? Because they are the people who are like closest to the developers. We are somewhat behind the scenes doing some things but it is Becky and the rest of the architecture teams who are actually front and center with the customers, right? So it is the collaborative effort that we've been working through past few years that has been really really been useful and coming around and helping us to make some of these products really beautiful. >> Yeah, well you make a lot of beautiful products. I think we've all, I think we've all seen them. Something that I think is really interesting and part of why I was so excited for this interview, and kind of nudged John out, was because you've been- Ford has been investing in technology in a committed way for decades and I don't think most people are aware of that. When I originally came out to Dearborn, I learned that you've had a head of VR who happens to be a female. For what it's worth, Elizabeth, who's been running VR for you for two and a half decades, for 25 years. >> Satish: Yep. >> That is an impressive commitment. What is that like from a culture perspective inside of Ford? What is the attitude around innovation and technology? >> So I've been a long time Ford employee. I just celebrated my 29th year. >> Oh, wow! >> Congratulations! >> Wow, congrats! That's a huge deal. >> Yeah, it's a huge deal. 
I'm so proud of my career and all that Ford has brought to me and it's just a testament. I have many colleagues like me who've been there for their whole career or have done other things and come to Ford and then spent another 20 years with us because we foster the culture that makes you want to stay. We have development programs to allow you to upskill and change your role and learn new things and play with the new technologies that people are interested in doing and really make an impact to our community of developers at Ford or the company itself and the results that we're delivering. So to have that, you know, culture for so many years that people really love to work. They love to work with the people that they're working with. They love to stay engaged and they love the fact that you can have many different careers within the same umbrella, which we call the "blue oval". And that's really why I've been there for so long. I think I probably had 13 very unique and different jobs along the way. It's as if I left, and you know shopped around my skills elsewhere. But I didn't ever have to leave the company. It's been fabulous. >> The cultural change and adoption of- embracing modern technology- Cloud Native automotive software is impressive because a lot of historied companies, you guys have been there a long time, have challenges with that because it's really hard to get an entire moving, you'll call it the blue oval, to change and adapt- >> Savannah: I love that. >> and be willing to experiment. So that is impressive. Talk about, you go by Becky, so I'll call you Becky, >> Rebecca/Becky: Yeah. >> The developer culture in terms of the developers really being the center of the nucleus of influencing the direction in which the company's going. I imagine that they probably are fairly influential. >> Yeah, so I had a very- one of the unique positions I held was a culture change for our department, Information Technology in 2016. >> Satish: Yeah. 
>> As Satish was involved with moving us to the cloud, I was responsible- >> You are the transformation team! This is beautiful. I love this. We've got the right people on the show. >> Yeah, we do. >> I was responsible for changing the culture to orient our employees to pay attention to what do we want to create for tomorrow? What are the kind of skills we need to trust each other to move quickly. And that was completely unique. >> Satish: Yeah. >> Like, I had been in the trenches delivering software before that, and then plucked out because they wanted someone, you know who had authentic experience with our development team to be that voice. And it was such a great investment that Ford continues to do is invest in our culture transformation. Because with each step forward that we do, we have to refine what our priorities are. And you do that through culture transformation and culture management. And that's been, I think really, the key to our successful pivots that we've made over the last six years that we've been able to continue to refine and hone where we really want to go through that culture movement. >> Absolutely. I think if I could add another- >> Please. >> spotlight to it is like the biggest thing about Ford has been a very startup-like culture, right? So the idea is that we encourage people to think outside the box, right? >> Savannah: Or outside the oval? >> Right! (laughs) >> Lisa: Outside the oval, yes! >> Absolutely! Right. >> So the question is like, you can experiment with various things, new technologies and you will get all the leadership support to go along with it. I think that is very important too and like we can be in the trenches and talk about all of these nice little things but who the heck would've thought that, you know, Kubernetes was announced in 2015. In late 2016, we have early dev Kubernetes clusters already running. 2017, we are live with workloads on Kubernetes! >> Savannah: Early adopters over here. >> Yeah. >> Yeah. 
>> I'm like all of this thing doesn't happen without lot of foresight and support from the leadership, but it's also the grassroot efforts that is encouraged all along to be on the front end of all of these things and try different things. Some of them may not work. >> Savannah: Right. >> But that's okay. But how do we know we are doing something, if you're not failing? We have to fail in order to do something, right? >> Lisa: I always say- >> So I think that's been a great thing that is encouraged very often and otherwise I would not be doing, I've done a whole bunch of stuff at Ford. Without that kind of ability to support and have an appetite for, some of those things would not have been here at all. >> I always say failure is not a bad F-word. >> Satish: Yep. >> Savannah: I love that. >> But what you're talking about there is kind of like driving this hot wheel of experimentation. You have to have the right culture and the mindset- >> Satish: Absolutely. >> to do that. Try fail, move on, learn, iterate, go. >> Satish: Correct. >> You guys have a great partnership with Red Hat and Boston University. You're speaking about that later today. >> Satish: Yes. >> Unpack that for us. What, from a technical perspective, what are you doing and what's it resulting in? >> Yeah, I think the biggest thing is Becky was talking about as during this transformation journey, is lot has changed in very small amount of time. So we traditionally been like, "Hey, here's a spreadsheet of things I need you to deliver for me" to "Here is a catalog of things, you can get it today and be successful with it". That is frightening to several of our developers. The goal, one of the things that we've been working with Kube By Example, Red Hat and all the thing, is that how can we lower the bar for the developers, right? Kubernetes is great. It's also a wall of YAML. >> It's extremely complex, number one complaint. >> The question is how can I zero in? 
I'm like, if we go back think like when we talk about in cars with human-machine interfaces, which parts do I need to know? Here's the steering wheel, here's the gas pedal, or here's the brake. As long as you know these two, three different things you should be fairly be okay to drive those things, right? So the idea of some of the things with enablement we are trying to do is like reduce that barrier, right? Reduce- lower the bar so that more people can participate in it. >> One of the ways that you did that was Kube By Example, right, KBE? >> Satish: Yes, Yes. >> Can you tell us a little bit more about that as you finish this answer? >> Yeah, I think the biggest thing with Kube By Example is like Kube By Example gives you the small bite-sized things about Kubernetes, right? >> Savannah: Great place to start. >> But what we wanted to do is that we wanted to reinforce that learning by turning into a real world living example app. We took part info, we said, Hey, what does it look like? How do I make sure that it is highly available? How do I make sure that it is secure? Here is an example YAML of it that you can literally verbatim copy and paste into your editor and click run and then you will get an instant gratification feedback loop. >> I was going to say, yeah, they feel like you're learning too! >> Yes. Right. So the idea would be is like, and then instead of giving you just a boring prose text to read, we actually drop links to relevant blog posts saying that, hey you can just go there. And that has been inspirational in terms of like and reinforcing the learning. So that has been where we started working with the Boston University, Red Hat and the community around all of that stuff. >> Talk a little bit about, Becky, about some of the business outcomes. You mentioned things like upskilling the workforce which is really nice to hear that there's such a big focus on it. 
But I imagine too, there's more participation in the community, but also from an end customer perspective. Obviously, everything Ford's doing is to serve the end customers. >> Becky: Right. >> How does this help the end customer have that experience that they really, these days, demand with patience being something that, I think, is gone because of the pandemic? >> Right? Right. So one of the things that my team does is we create the platforms that help accelerate developers be successful and it helps educate them more quickly on appropriate use of the platforms and helps them by adopting the platforms to be more secure which inherently leads to the better results for our end customers because their data is secure because the products that they have are well created and they're tested thoroughly. So we catch all those things earlier in the cycle by using these platforms that we help curate and produce. And that's really important because, like you had mentioned, this steep learning curve associated with Kubernetes, right? >> Savannah: Yeah. >> So my team is able to kind of help with that abstraction so that we solve kind of the higher complex problems for them so that developers can move faster and then we focus our education on what's important for them. We use things like Kube By Example, as a source instead of creating that content ourselves, right? We are able to point them to that. So it's great that there's that community and we're definitely involved with that. But that's so important to help our developers be successful in moving as quickly as they want and not having 20,000 people solve the same problems. >> Satish: (chuckles) Yeah. >> Each individually- >> Savannah: you don't need to! >> and sometimes differently. >> Savannah: We're stronger together, you know? >> Exactly. >> The water level rises together and Ford is definitely a company that illustrates that by example. >> Yeah, I'm like, we can't make a better round wheel right? >> Yeah! 
So, we have to build upon what has already been built ahead of us. And I think a lot of it is also about how can we give back and participate in the community, right? So I think that is paramount for us as like, here we are in Detroit so we're trying to recruit and show people that you know, everything that we do is not just old car and sheet metal >> Savannah: Combustion. >> and everything and right? There's a lot of tech goes and sometimes it is really, really cool to do that. And biggest thing for us is like how can we involve our community of developers sooner, earlier, faster without actually encumbering them and saying that, hey here is a book, go master it. We'll talk two months later. So I think that has been another journey. I think that has been a biggest uphill challenge for us is that how can we actually democratize all of these things for everybody. >> Yeah. Well no one better to try than you I would suspect. >> We can only try and hope everything turns out well, right? >> You know, as long as there's room for the bumpers on the lane for if you fail. >> Exactly. >> It sounds like you're driving the program in the right direction. Closing question for you, what's next? Is electric the future? Is Kubernetes the future? What's Ford all in on right now, looking forward? (crowd murmuring in the background) >> Data is the king, right? >> Savannah: Oh, okay, yes! >> Data is a new currency. We use that for several things: to improve the cars, improve the quality of autonomous driving. Is Level 5 driving here? Maybe will be here soon, we'll see. But we are all working towards it, right? So machine learning, AI feedback. How do you actually post sale experience for example? So all of these are all areas that we are working to. We are, may not be getting like Kubernetes in a car but we are putting Kubernetes in plants. Like you order a Mach-E or you order a Bronco, you see that here. Here's where in the assembly line your car is. It's taking pictures. 
It's actually taking pictures on Kubernetes platform. >> That's pretty cool. >> And it is tweeting for you on the Twitter and the social media platform. So there's a lot of that. So it is real and we are doing it. We need more help. A lot of the community efforts that we are seeing and a lot of the innovation that is happening on the floor here, it's phenomenal. The question is how we can incorporate those things into our workflows. >> Yeah, well you have the right audience for that here. You also have the right attitude, >> Exactly. >> the right appetite, and the right foundation. Becky, last question for you. Top three takeaways from your talk today. If you're talking to the developer community you want to inspire: Come work for us! What would you say? >> If you're ready to invest in yourself and upskill and be part of something that is pretty remarkable, come work for us! We have many, many different technical career paths that you can follow. We invest in our employees. When you master something, it's time for you to move on. We have career growth for you. It's been a wonderful gift to me and my family and I encourage everyone to check us out at careers.ford.com or stop by our booth if you happen to be here in person. >> Satish: Absolutely! >> We have our curated job openings that are specific for this community, available. >> Satish: Absolutely. >> Love it. Perfect close. Nailed pitch there. I'm sure you're all going to check out their job page. (all laugh) >> Exactly! And what you talked about, the developer experience, the customer experience are inextricably linked and you guys are really focused on that. Congratulations on all the work that you've done. We got to go get a selfie with that car, girl. >> Yes, we do. >> Absolutely. >> We got to show them, we got to show the audience what it looks like on the inside too. We'll do a little IG video. (Lisa laughs) >> Absolutely. >> We will show you that for our guests and my cohost, Savannah Peterson. 
Lisa Martin here live in Detroit with theCUBE at KubeCon and CloudNativeCon 2022. The one and only John Furrier, who you know gets FOMO, is going to be back with me next. So stick around. (all laugh) (bright music)

Published Date : Oct 27 2022



KubeCon + CloudNativeCon 2022 Preview w/ @Stu

>>KubeCon + CloudNativeCon kicks off in Detroit on October 24th, and we're pleased to have Stu Miniman, who's the director of Market Insights for hybrid platforms at Red Hat, back in the studio to help us understand the key trends to look for at the event. Stu, welcome back, like old, old, old >>Home. Thank you, David. It's great to, great to see you and always love doing these previews, even though Dave, come on. How many years have I told you CloudNativeCon, it's a hoodie crowd. They're gonna totally call you out for wearing a tie and things like that. I, I know you want to be an ESPN sportscaster, but you know, I I, I, I still don't think even after, you know, this show's been around for so many years that there's gonna be too many ties in Detroit. I >>Know I left the hoodie in my office, I'm sorry folks, but hey, we'll just have to go for it. Okay. Containers generally, and Kubernetes specifically continue to show very strong spending momentum in the ETR survey data. So let's bring up this slide that shows the ETR sectors, all the sectors in the taxonomy with net score or spending velocity in the vertical axis and pervasiveness on the horizontal axis. Now, that red dotted line that you see, that marks the elevated 40% mark, anything above that is considered highly elevated in terms of momentum. Now, for years, the big four areas of momentum that shine above all the rest have been cloud, containers, RPA, and ML/AI. For the first time in 10 quarters, ML and AI and RPA have dropped below the 40% line, leaving only cloud and containers in rarefied air. Now, Stu, I'm sure this data doesn't surprise you, but what do you make of this? >>Yeah, well, well, Dave, I, I did an interview with Deepak, who owns all the container and open source activity at Amazon, earlier this year, and his comment was, the default deployment mechanism in Amazon is containers. 
So when I look at your data and I see containers and cloud going in sync, yeah, that, that's, that's how we see things. We're helping lots of customers in their overall adoption. And this cloud native ecosystem is still, you know, we're still in that Cambrian explosion of new projects, new opportunities, AI's a great workload for these type of technologies. So it's really becoming pervasive in the marketplace. >>And, and I feel like the cloud and containers go hand in hand, so it's not surprising to see those two above >>The 40%. You know, there, there's nothing to say that, Look, can I run my containers in my data center and not do the public cloud? Sure. But in the public cloud, the default is the container. And one of the hot discussions we've been having in this ecosystem for a number of years is edge computing. And of course, you know, I want something that that's small and lightweight and can do things really fast. A lot of times it's an AI workload out there, and containers is a great fit at the edge too. So wherever it goes, containers is a good fit, which has been keeping my group at Red Hat pretty busy. >>So let's talk about some of those high level stats that we put together and preview for the event. So it's really around the adoption of open source software and Kubernetes. Here's, you know, a few fun facts. So according to the state of enterprise open source report, which was published by Red Hat, although it was based on a blind survey, nobody knew that Red Hat was, you know, initiating it. 80% of IT execs expect to increase their use of enterprise open source software. Now, the CNCF community has currently more than 120,000 developers. That's insane when you think about that developer resource. 73% of organizations in the most recent CNCF annual survey are using Kubernetes. Now, despite the momentum, according to that same Red Hat survey, adoption barriers remain for some organizations. 
Stu, I'd love you to talk about this specifically around skill sets, and then we've highlighted some of the other trends that we expect to see at the event. Stu, I'd love to, again, get your thoughts on the preview. You've done a number of these events, automation, security, governance, governance at scale, edge deployments, which you just mentioned among others. Now Kubernetes is eight years old, and I always hear people talking about there's something coming beyond Kubernetes, but it looks like we're just getting started. Yeah, >>Dave, it, it is still relatively early days. The CNCF survey, I think said, you know, 96% of companies when they, when CNCF surveyed them last year, were either deploying Kubernetes or had plans to deploy it. But when I talked to enterprises, nobody has said like, Hey, we've got every group on board and all of our applications are on. It is a multi-year journey for most companies and plenty of them. If you, you look at the general adoption of technology, we're still working through kind of that early majority. We, you know, passed the, the chasm a couple of years ago. But to a point, you and I we're talking about this ecosystem, there are plenty of people in this ecosystem that could care less about containers and Kubernetes. Lots of conversations at this show won't even talk about Kubernetes. You've got, you know, big security group that's in there. >>You've got, you know, certain workloads like we talked about, you know, AI and ML and that are in there. And automation absolutely is playing a, a good role in what's going on here. So in some ways, Kubernetes kind of takes a, a backseat because it is table stakes at this point. So lots of people involved in it, lots of activities still going on. I mean, we're still at a cadence of three times a year now. 
We slowed it down from four times a year as an industry, but there's, there's still lots of innovation happening, lots of adoption, and oh my gosh, Dave, I mean, there's just no shortage of new projects and new people getting involved. And what's phenomenal about it is there's, you know, end user practitioners that aren't just contributing. But many of the projects were spawned out of work by the likes of Intuit and Spotify and, and many others that created some of the projects that sit alongside or above the, the, you know, the container orchestration itself. >>So before we talked about some of that, it's, it's kind of interesting. It's like Kubernetes is the big dog, right? And it's, it's kind of maturing after, you know, eight years, but it's still important. I wanna share another data point that underscores the traction that containers generally, and Kubernetes specifically, have. So this is data from the latest ETR survey and shows the spending breakdown for Kubernetes in the ETR data set, and it's cut for respondents with 50 or more citations by the IT practitioners. That lime green is new adoptions, the forest green is spending 6% or more relative to last year. The gray is flat spending year on year, and those little pink bars, that's spending down 6% or more, and the bright red is retirements. So they're leaving the platform. And the blue dots are net score, which is derived by subtracting the reds from the greens. And the yellow dots are pervasiveness in the survey relative to the sector. So the big takeaway here is that there is virtually no red, essentially zero churn across all sectors, large companies, public companies, private firms, telcos, finance, insurance, et cetera. So again, sometimes I hear there are things beyond Kubernetes, you've mentioned several, but it feels like Kubernetes is still a driving force, but a lot of other projects around Kubernetes, which we're gonna hear about at the show. >>Yeah. So, so, so Dave, right? 
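Editor's note: the net score arithmetic Dave describes for the ETR chart (greens minus reds) can be sketched in a few lines of Python. This is a minimal illustration, not ETR's methodology or tooling. The function names `net_score` and `is_elevated`, the `ELEVATED_THRESHOLD` constant, and the sample percentages are all assumptions made up for this note; only the "subtract the reds from the greens" rule and the 40% line come from the conversation.

```python
# Hypothetical sketch of the net score arithmetic described in the transcript.
# All names and sample numbers are illustrative assumptions, not ETR data.

ELEVATED_THRESHOLD = 40  # the 40% "red dotted line" mentioned in the conversation


def net_score(new_adoption, spending_more, flat, spending_less, retiring):
    """Return greens minus reds, in percentage points.

    Each argument is the percentage of respondents in that bucket;
    the five buckets are expected to add up to 100.
    """
    total = new_adoption + spending_more + flat + spending_less + retiring
    if abs(total - 100) > 1e-6:
        raise ValueError(f"bucket percentages must sum to 100, got {total}")
    greens = new_adoption + spending_more  # lime green + forest green
    reds = spending_less + retiring        # pink + bright red
    return greens - reds


def is_elevated(score):
    """True when a net score sits above the 40% line."""
    return score > ELEVATED_THRESHOLD


# Made-up survey split: 10% new, 45% spending more, 40% flat,
# 4% spending less, 1% retiring.
example = net_score(10, 45, 40, 4, 1)
print(example, is_elevated(example))
```

Note that the flat (gray) bucket never enters the subtraction; it only limits how large the greens and reds can be.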
First of all, there was for a number of years, like, oh wait, you know, don't waste your time on containers because serverless is gonna rule the world. Well, serverless is now a little bit of a broader term. Can I do a serverless viewpoint for my developers so that they don't need to think about the infrastructure but still have containers underneath it? Absolutely. So our friends at Amazon have a solution called Fargate, their proprietary offering to kind of hide that piece of it. And in the open source world, there's a project called Knative. I think it's the second or third KnativeCon that's gonna happen at the CNCF. And even if you use this, I can still call things over on Lambda and use some of those functions. So we know, Dave, it is additive, and nothing ever dominates the entire world and nothing ever dies. >>So we have a long runway of activities still to go on in containers and Kubernetes. We're always looking for what that next thing is. And what's great about this ecosystem is most of it tends to be additive and plug into the pieces there. There's certain tools that, you know, span beyond what can happen in the container world and aren't limited to it. And there's others that are specific for it. And to talk about the industries, Dave, you know, we have a community event that we run that's gonna happen at KubeCon called OpenShift Commons. And when you look at, like, who's speaking there? Oh, we've got, you know, Ford, Lockheed Martin, University of Michigan, and ING Bank all speaking there. So you look and it's like, okay, cool, I've got automotive, I've got, you know, public sector, I've got, you know, university education, and I've got finance. So, you know, there is not an industry that is not touched by this. And the general wave of software adoption is the reason why, you know, not just adoption, but the creation of new software is one of the differentiators for companies.
And that's the reason why I do containers. It isn't because it's some cool technology and Kubernetes is great to put on my resume, but that it can actually accelerate my developers and help me create technology that makes me respond to my business and my ultimate end users. >>Well, and you know, as you know, we've been talking about the Supercloud a lot, and Kubernetes is clearly an enabler to Supercloud. But I wanted to go back. You and John Furrier have done so many of, you know, the KubeCons, but go back to DockerCon, before Kubernetes was even a thing. And so you sort of saw this, you know, grow. I think there's what, how many projects are in CNCF now? I mean, hundreds. Hundreds, okay. And so will we hear things in Detroit, things like, you know, new projects like, you know, Argo and capabilities around Sigstore and things like that? Are we gonna hear a lot about that, or is it just too much to cover? >>So, I mean, the good news, Dave, is that the CNCF really is a good steward for this community, and new things can get in. So there's so much going on with the existing projects that some of the new ones sometimes have a little bit of a harder time making a little bit of buzz. One of the more interesting ones is a project that's been around for a while, that I think back to the first couple of KubeCons that John and I did: service mesh and Istio, which was created by Google, but lived under basically, I guess you would say, a Google-dominated governance for a number of years, is now finally under the CNCF Foundation. So I talked to a number of companies over the years, and definitely many of the contributors over the years, that didn't love that it was a Google-run thing, and now it is finally part. >>So just like Kubernetes is, we have Istio, and also Knative that I mentioned before also came outta Google, and those are all in the CNCF. So will there be new projects? Yes.
The CNCF sometimes, they do matchmaking. So in some of the observability space, there were a couple of projects that they said, hey, maybe you can go merge down the road. And they ended up doing that. So you look at all these projects, and if I was an end user saying, oh my God, there is so much change and so many projects, you know, I can't spend the time and the effort to learn about all of these. And that's one of the challenges, and something obviously at Red Hat we spend a lot of time figuring out: you know, not to make winners, but which are the things that customers need, where can we help make them run in production for our customers, and help bring some stability and a little bit of security for the overall ecosystem. >>Well, speaking of security, security and skill sets, we've talked about those two things and they sort of go hand in hand. When I go to security events, I mean, we were at re:Inforce last summer, we were just recently at the CrowdStrike event, a lot of the discussion is sort of best practice because it's so complicated. And I presume you're gonna hear a lot of that here, because securing containers now, you know, the whole shift left thing and shield right, is a complicated matter, especially when, as you saw with the earlier data from the Red Hat survey, the gaps are around skill sets. People don't have the skill. So should we expect to hear a lot about that, a lot of sort of how to take advantage of some of these new capabilities? >>Yeah, Dave, absolutely. So, you know, one of the conversations going on in the community right now is, you know, has DevOps maybe played out as we expect to see it? There's a newer term called platform engineering, and how much do I need to do there? Something that I know your team's written a lot about, Dave, is how much do you need to know versus what can you shift to just a platform or a service that I can consume?
I've talked a number of times with you since I've been at Red Hat about the cloud services that we offer. So you want to use our offering in the public cloud. Our first recommendation is, hey, we've got cloud services. How much Kubernetes do you really want to learn, versus you want to do what you can build on top of it, modernize the pieces, and have less running the plumbing and electric and more, you know, taking advantage of the technologies there. So that's a big thing we've seen. You know, we've got a big SRE team that can manage that for you, so that you have to spend less time worrying about what really is undifferentiated heavy lifting and spend more time on what's important to your business and your customers. >>So, and that's through a managed service. >>Yeah, absolutely. >>That whole space has just taken off. All right, Stu, I'll give you the final word. You know, what are you excited about for this upcoming event, and Detroit? Interesting choice of venue? >>Yeah, look, first off, easy flight. I've never been to Detroit, so I'm willing to give it a shot, and hopefully, you know, that awesome airport. There's some good things there to learn. The show itself is really a choose your own adventure because there's so much going on. The main show of KubeCon and CloudNativeCon is Wednesday through Friday, but a lot of really interesting stuff happens on Monday and Tuesday. So we talked about things like OpenShift Commons. In the security space, there's Cloud Native Security Day, which is actually two days, and a Sigstore event. There's a GitOps show, there's, you know, Knative day. There's so many things that if you want to go deep on a topic, you can go spend, like, a workshop in some of those, you can get hands on too. And then at the show itself, there's so much, and again, you can learn from your peers.
>>So it was good to see. During the pandemic, it tilted a little bit more vendor heavy, because I think most practitioners were pretty busy focused on what they could work on and less, okay, hey, I'm gonna put together a presentation, and maybe I'm restricted at going to a show. We definitely saw that last year. When I went to LA, I was disappointed how few customer sessions there were. It's back. When I go look through the schedule now, there's way more end users sharing their stories, and it's phenomenal to see that. And the hallway track, Dave, I didn't go to Valencia, but I hear it was really hopping and felt way more like it was pre-pandemic. And while there's a few people that probably won't come because it's Detroit, what I've heard from the CNCF team is they are expecting a sizable group up there. I know a lot of the hotels right near where it's being held are all sold out. So it should be a lot of fun. Good thing I'm speaking on an edge panel. First time I get to be a speaker at the show, Dave. It's kind of interesting to be in a little bit of a different role at the show. >>So yeah, Detroit's super convenient, as I said. Awesome airport, too. Good luck at the show. So it's a full week. theCUBE will be there for three days, Tuesday, Wednesday, Thursday. Thanks for coming. >>Wednesday, Thursday, Friday, sorry. >>Wednesday, Thursday, Friday is theCUBE, right? So thank you for that. >>And no ties from the host. >>No ties, only hoodies. All right, Stu, thanks. Appreciate you coming in. Awesome. And thank you for watching this preview of KubeCon + CloudNativeCon with Stu, which again starts the 24th of October, three days of broadcasting. Go to thecube.net and you can see all the action. We'll see you there.

Published Date : Oct 4 2022


ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
David	PERSON	0.99+
Lockheed Martin	ORGANIZATION	0.99+
6%	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
Detroit	LOCATION	0.99+
50	QUANTITY	0.99+
CNCF	ORGANIZATION	0.99+
October 24th	DATE	0.99+
40%	QUANTITY	0.99+
Stuart Miniman	PERSON	0.99+
Friday	DATE	0.99+
Google	ORGANIZATION	0.99+
96%	QUANTITY	0.99+
two days	QUANTITY	0.99+
University of Michigan	ORGANIZATION	0.99+
Stu	PERSON	0.99+
CMC F	ORGANIZATION	0.99+
80%	QUANTITY	0.99+
Tuesday	DATE	0.99+
John	PERSON	0.99+
Wednesday	DATE	0.99+
eight years	QUANTITY	0.99+
Monday	DATE	0.99+
last year	DATE	0.99+
three days	QUANTITY	0.99+
Red Hat	ORGANIZATION	0.99+
second	QUANTITY	0.99+
73%	QUANTITY	0.99+
Thursday	DATE	0.99+
LA	LOCATION	0.99+
more than 120,000 developers	QUANTITY	0.99+
two things	QUANTITY	0.99+
John Furrier	PERSON	0.99+
hundreds	QUANTITY	0.99+
Hundreds	QUANTITY	0.99+
first time	QUANTITY	0.99+
two	QUANTITY	0.99+
24th of October	DATE	0.99+
one	QUANTITY	0.98+
KubeCon	EVENT	0.98+
CubeCon	EVENT	0.98+
CNCF Foundation	ORGANIZATION	0.98+
cube.net	OTHER	0.98+
last summer	DATE	0.98+
Valencia	LOCATION	0.98+
third	QUANTITY	0.98+
Spotify	ORGANIZATION	0.98+
Intuit	ORGANIZATION	0.98+
last year	DATE	0.98+
One	QUANTITY	0.98+
cloud Native Security Day	EVENT	0.97+
Kubernetes	TITLE	0.97+
QAN	EVENT	0.97+
ESPN	ORGANIZATION	0.97+

Regina Manfredi, Teradata | Amazon re:MARS 2022

(light techno music) >> Okay, welcome back, everyone, to theCUBE's coverage of AWS re:MARS here in Las Vegas. Back in person, I'm John Furrier, host of theCUBE. Re:MARS stands for Machine learning, Automation, Robotics, and Space. And we're covering all the action, two days, day two. And we're here with Regina Manfredi, who's the VP of global CSPs, Cloud Service Providers Alliances with Teradata. Great to see you. Cloud service providers or- >> Cloud services providers, the hyperscalers. >> Hyperscalers, the big guys. All the CapEx, Amazon. >> Yes. >> The big guys. >> Indeed, thanks for having me. >> Yeah, thanks for coming on. So tell us about your role. So alliances, you're here with AWS. What's the role with AWS and Teradata? >> So AWS and Teradata have recently entered into a strategic collaboration agreement where we're really focused on building solutions together, leveraging AWS services, as well as Teradata's outstanding architecture, as it relates to the data analytics platform that we provide for our customers in the cloud today. And we're really trying to drive better outcomes for data scientists, business analysts, etc. >> You know, just recently, did a CUBE conversation with Teradata, and I was really surprised to find, not shocked, but kind of surprised, the scale of the computation that's going on in some of the cloud things you're doing. And you have the legacy on-premises data warehouse traditional business as well. >> Regina: We do. >> And there's a huge shift going on. A lot of the kind of upstarts, "Oh, data warehouse, old school. Data warehouse, it's antiquated, old," but that's not true. You guys have a lot of cloud action. >> We do, we have substantial cloud action that's occurring with our customers today. We actually just released earlier this year an announcement around 1,000 node tests in the cloud together with AWS, and had success, no downtime, no failures at all.
And so we're pretty proud about that, and excited about what that's going to hold for our customers who need that level of scale. >> Well, Regina, I got to tell you, I have a little bit of a confession here. I'm a cloud data nerd by my training. And, you know, I've always watched all the different kind of levels of transformation with the industry, and you know, this is going to change that, that's going to kill that. Everything's going to be killed and then it never dies, but it just changes. Even today, SQL is still like the prominent language, it's never going to, in fact it's amplified further because that's what people like. So that just proves that things don't always get replaced. And so I wanted to ask you this because as we're here at this event at re:MARS, you have space, you have all these ambitious positive goals, and they just need to do some machine learning. They need some cloud, they need to have the solutions. >> Regina: Yes. >> They're not going to, like, get in the weeds and say, "Oh, this is a better Hadoop cluster than this Kubernetes cluster." So it's not about sometimes the tech, it's about the solution. >> It is, and one of the things that was interesting for us in our session earlier this week was the fact that we had so many customers approach us after that session and say, "I just need help preparing my data. Running my models, training my models, and making sure that they run and can be deployed. And I don't want to move all this data all the time and have all this failure rate that I'm experiencing." And so it was very basic requirements and needs as people begin their journey on AI/ML for their business. And so it was reaffirming that we're on the right track and driving the right tools for them. >> I want to get your perspective on what you're thinking about the show, but first, I want to ask this since you brought that up.
Swami was on stage and he said, "You can spend your entire time and your career just trying to figure out what's going on, machine learning." >> Regina: Yup. >> "Which open source framework's going to be better than the other one." I mean, it's just a lot of work to even figure it out. We just had the Fiddler AI CEO on, who worked at all the hyperscalers, say Facebook tend to, you know, real, you know, super alpha geek, if you will. And he was saying, and we were talking about open source, free software, integrations are a big part of where cloud scale, and the value is being captured for companies and people who are doing projects. Integrating some managed services, so this is where I see you guys going right now with Teradata, having all these cloud services built on the install base. >> Right. >> Which is not, doesn't hurt that at all. It just only helps it as they would migrate to cloud, it's integrations, so you take a little bit of Amazon here, a little bit of Teradata there. >> Regina: Absolutely. >> What's your perspective, what's your reaction to that? >> So, I agree. And we think that's part of our secret sauce. You know, what we want to have is a data analytics platform in the cloud that allows data scientists, and architects, etc., to bring their own tools. So whatever they're utilizing today, we want them to be able to utilize it in Vantage, and make sure that, A, we can drive some efficiencies, and also, some better, smarter economics, as it relates to their particular projects. And so I agree with you 100%, and would tell you that we view that as somewhat our competitive advantage. It's not about being all proprietary. We want those integrations, and we've got dozens of them with AWS, and- >> Can you give example, can you give a couple examples of some integrations that highlight that?
>> Sure, so right now we've got an integration with SageMaker today that allows our customers or data scientists to come in, prepare the data, and actually leverage SageMaker to build and train the models, and then deploy very quickly and easily without having to do all the data movement within their architecture. >> It's just so fascinating. I can't wait to have more conversation with you guys about this because I just think the world's spinning in a direction where, with low code, no code, >> Regina: Yup. >> you can see code, companion whisperer, that they have CodeWhisperer they launched today, they're writing subroutines for machine learning. And so it's not autocomplete, it's subroutine. So you're seeing all these advances on the technology. So it comes back to the building blocks, the integration. It just seems like it's going to be like a plug and play. Those are all old words. Mix and match, plug and play, interoperability, were old words, like, in the old days. Now they're becoming more relevant. What's your take on all that? >> Yeah, I would agree. I don't think that we should be competing against the algorithms, and neither do we. We want to just actually build out the toolsets that drive the enablement based on what a customer's requirements and needs are, and based on the investments that they've already made within their own enterprises. >> You know, what's interesting about this event, I love to get your reaction to what re:MARS means to you because it's machine learning, automation, robotics, and space. Not your typical tech conference. >> Regina: No. >> Okay, little bit of a mixed bag there, so to speak. I love it. I think it's like super alpha geek, very nerdy, super nerds are here. And the topics kind of reflect the future. For the people that are watching that aren't here, what's your vibe on the show? What's your takeaway?
How would you explain what's going on here from a market perspective, from a vibe perspective, what's happening? >> This is my first re:MARS actually, and I would have to tell you that I feel like, just general observation, a few things. One, the conversations are more meaningful and we're getting into the meat of what a data scientist truly needs in order to be successful in their role and help drive their enterprise. That's number one. So I think, to your point, we're all kind of geeking out together here. The other thing that I think is pretty exciting is the amount of use cases, and ways in which we are driving impact. AWS and Teradata driving impact for the business analysts in the enterprise environment, but also for the people, their customers. That's pretty exciting to see. >> You know, it's interesting. When I first was kind of like thinking about the show and what I was going to expect, it kind of overexceeded my expectations in the sense of what I was thinking about IoT, industrial, and digital innovation. 'Cause that's going to scale. I think now we're at a tipping point with machine learning, that the industrial IoT market is going to explode 'cause machine learning's ready. But there was a whole positive, save the earth angle >> Regina: Yes. >> that caught my attention. >> Regina: Yes. >> You know, the discoveries from space are going to potentially have impact for the good, not just some cliche sustainability messaging. It was actually real. >> Right, I think that that's exciting, and an area which we're excited to explore. We're doing a lot of work behind the scenes around sustainability and ESG initiatives for our customers, but also for the greater good. It's about driving outcomes for the greater good and being responsible with how we approach that. >> You know, the other thing I noticed too from a robotics standpoint, given I live in California, is there's a huge robotics culture there, you know.
It's like bigger than football and baseball, and some sports. They provide A and B team and people get cut from the B team. There's so much demand to be on the robotics team. It's not a club, it's a team. >> Regina: Right. >> And so, you look at what's going on in robotics, it's so exciting in the sense that if you're young and you're into tech, this is like- >> Regina: This is the place to be. >> I mean, why wouldn't you be hanging out here? >> Yeah, well, and I visited the booth over at University of Michigan, and how they're driving robotics to help support the human body to go further distances, and to drive better performance and health for individuals, and was really impressed with the work that they're doing, and even saw a use case and a need where I thought, you know, I have a quadriplegic sister-in-law, who I thought, "Wow, someday, maybe she'll be upright and walking again." >> John: Yeah. >> And those were exciting conversations to have while I was here. >> The advances on the material management robots, I think it's fascinating to see that growth. Well, let's get back to Teradata real quick to kind of close out future of what's next. Obviously, a lot of migration to the cloud happening. What's the outlook on the landscape and where do you see it evolving? Because you're seeing what the hyperscalers are doing, the cloud service providers, they're providing the CapEx. In fact, we coined the term supercloud, last re:Invent, that's become a thing. And Charles Fitzgerald would think it's not a thing, he debates us online all the time on Twitter. But you can build on top of a CapEx. >> Regina: Yup. >> They did all the heavy lifting. You know, Snowflake, Databricks, the list goes on and on. So building on top of that to build proprietary advantages or even just sustainable advantages is now easier to do. So superclouds are kind of in play. So that means whoever's got the playbook can win.
So you guys seem to be executing that playbook of having the installed base, and then working with AWS >> Regina: Yes. >> to ride that wave. Tell us about the migration strategies you're seeing, and what are your customers doing specifically, and take us through a customer that's leaning into the cloud and driving. >> So when I think about specific customers that are leaning in, you know, the first and most important thing that we're hearing is, you've got to be able to scale. I've got 1,000 nodes or 100 nodes, or whatnot. And so we're addressing that because we think that there's a place for hybrid cloud. We think everyone's moving and rushing towards the cloud, but even one of our competitors last week announced that there's a place for on-prem, and we would agree. >> John: Yeah. >> So that is something that we're really focused on, and you take, for example, the automotive industry. We're seeing a lot of work being done together with our joint customers, AWS and Teradata, and some of these auto manufacturers who are experiencing supply chain issues and challenges today, and also need to drive better quality control measures within their own lines, in the manufacturing lines. And so we're working together with them to look at what type of machine learning and AI can we be leveraging together as part of the overall solution to drive those analytics, and make sure that they have better quality control. >> You know, that's really good insight about the on-premise thing. And I think that supports what we're seeing around hybrid. We see hybrid as a steady state going forward, period. >> Regina: Yeah. >> And that will evolve into a multi thing. Multi-cloud, you want to call it, or superclouds, and more things. Basically, distributed computing. So if you look at the edge here, the edge is just on-premise. What is the premise? It's an edge or big device, small device, data center is a large edge. >> Regina: Right.
>> And so if you're using cloud hybrid, the distinction kind of goes away. And I think this is where we're going to see the winners emerge in data. Because remember, you go back to 2010, Hadoop was the big thing, big data. And that kind of crashed and burned. And then now you're seeing Databricks picking up a lot of that. Snowflake, you guys are there. And so it's still going on, this transformation in data. >> Regina: It is. >> And I think hybrid's a huge deal. What are customers saying around that? Because I think they're just trying to figure out cloud scale. >> I think they're trying to figure out cloud scale, I think they're also trying to figure out security. And so, you know, when we're talking to our customers, that absolutely is critical. And I would also suggest that the customer base is really looking for, "Hey, don't just help me migrate, I really need to modernize." And so driving the right use cases for the customer is important. >> You know, another thing that you guys have a lot of core expertise in is governance. And we've seen how that has played in all the compliance, and all these conversations are kind of converging. Do you have closed, do you have open? Machine learning needs more data, how do you protect it? So that's a hot area that I see as well. And that's something that's emerging, 'cause cyber's also involved too, like, you have cyber security threats on code, so I'm curious to see how that turns out. What's Teradata's perspective on the security, open, closed perspective? Any- >> Security is a priority for us. And I don't think that we've officially made that determination yet, right? We're still exploring, and we're going to do whatever our customers require of us. In terms of an open, closed perspective, I think we want to be flexible. Again, like I said before, it's about being open and supportive of whatever the customer requirement is, especially across the different industries.
>> Well, Regina, great to have you on theCUBE. Thanks for coming. I really appreciate it. Great insight, great to catch up on Teradata, cloud play. Very strong move. I think it's a good one. Final question I want to ask you though, is a little bit more about the personnel in the industry, like, obviously, if you're young, you're seeing all this space here, machine learning's not obvious. I know schools now are training it, but you start to see new personas come into the workforce. Where are the gaps? I mean, obviously, we have a lot of new opportunities, like, cybersecurity has a lot of job openings. Are there any observations that you have around, or advice to younger folks coming in, from a career standpoint? Because a lot of job openings are skills that weren't even taught in school. >> Regina: Right, that's- >> You know. >> And then you got the women in tech, and you have all kinds of opportunities now that aren't just engineering, right? >> Regina: Yes. >> It's not just engineering. It's computer science, so there's a whole in-migration of new talent coming in the industry. >> Yes, I think maintaining a curious mind is really critical, and taking time to invest in learning. You know, there are so many resources available to us at our disposal that don't cost us a dime. And so my advice to anybody who is curious, remain curious, dig in, and get some experience, and don't be afraid to stick your neck out, and try it. >> Well, in this conference we have robots welcome, you know, in this out there. >> Yeah. (laughs) >> Regina, thanks for coming out here. Really appreciate it. >> John, thank you, it's a pleasure. >> CUBE coverage here in Las Vegas for Amazon re:MARS. I'm John Furrier, your host. Stay with more live coverage after this short break. (upbeat bright music)

Published Date : Jun 23 2022


ENTITIES

Entity	Category	Confidence
Regina Manfredi	PERSON	0.99+
California	LOCATION	0.99+
Regina	PERSON	0.99+
AWS	ORGANIZATION	0.99+
John	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
100%	QUANTITY	0.99+
John Furrier	PERSON	0.99+
Charles Fitzgerald	PERSON	0.99+
Teradata	ORGANIZATION	0.99+
last week	DATE	0.99+
Las Vegas	LOCATION	0.99+
Facebook	ORGANIZATION	0.99+
Las Vegas	LOCATION	0.99+
first	QUANTITY	0.99+
2010	DATE	0.99+
today	DATE	0.99+
Databricks	ORGANIZATION	0.99+
Snowflake	ORGANIZATION	0.98+
one	QUANTITY	0.98+
Swami	PERSON	0.98+
1,000 nodes	QUANTITY	0.98+
two days	QUANTITY	0.98+
CapEx	ORGANIZATION	0.98+
earlier this year	DATE	0.98+
Twitter	ORGANIZATION	0.97+
100 nodes	QUANTITY	0.97+
earlier this week	DATE	0.96+
SageMaker	TITLE	0.91+
day two	QUANTITY	0.88+
Fiddler	ORGANIZATION	0.87+
around 1,000 node tests	QUANTITY	0.86+
dozens	QUANTITY	0.84+
SQL	TITLE	0.8+
MARS	TITLE	0.78+
earth	LOCATION	0.77+
ESG	ORGANIZATION	0.74+
Michigan	LOCATION	0.69+

Brett McMillen, AWS | AWS re:Invent 2020

>>From around the globe, it's theCUBE with digital coverage of AWS re:Invent 2020, sponsored by Intel and AWS. >>Welcome back to theCUBE's coverage of AWS re:Invent 2020. I'm Lisa Martin. Joining me next is one of our CUBE alumni. Brett McMillen is back, the director of U.S. federal for AWS. Brett, it's great to see you. Glad that you're safe and well. >>Great. It's great to be back. I think last year when we did theCUBE, we were on the convention floor. It feels very different this year. Here at re:Invent, it's gone virtual, and yet it's still true to how re:Invent has always been. It's a learning conference, and we're releasing a lot of new products and services for our customers. >>Yes, a lot of content, as you say. The one thing I would say about this re:Invent, one of the things that's different, is that it's so quiet around us. Normally we're talking loudly over tens of thousands of people on the showroom floor, but it's great that AWS is still able to connect in an even bigger way with its customers. So during Teresa Carlson's keynote, and I want to get your opinion on this or some info, she talked about the AWS Open Data Sponsorship Program, and that you guys are going to be hosting the National Institutes of Health, NIH, Sequence Read Archive data. The former biologist in me gets really excited about that. Talk to us about that, because especially during the global health crisis that we're in, that sounds really promising. >>It very much is. I am so happy that we're working with NIH on this and multiple other initiatives. So the Sequence Read Archive, or SRA, essentially what it is, it's a very large data set of sequenced genomic data. And it's a wide variety of genomic data. It's not only human genetic data; all life forms, all branches of life, are in SRA, including viruses. And that's really important here during the pandemic.
It's one of the largest and oldest sequenced genomic data sets out there, and yet it's very modern. It has been designed for next generation sequencing. So it's growing, it's modern and it's well used. It's one of the more important ones out there. One of the reasons this is so important is that we know that to find cures for human ailments and disease and death, by studying the genomic code, we can come up with the answers, or the scientists can come up with the answers. And that's what Amazon is doing: we're putting in the hands of the scientists the tools so that they can help cure heart disease and diabetes and cancer and depression and, yes, even viruses that can cause pandemics. >>So making this data, sorry, making this data available to those scientists worldwide is incredibly important. Talk to us about that. >>Yeah, it is. And so, within NIH, we're working with NCBI. When you're dealing with NIH, there's a lot of acronyms, and at NIH, it's the National Center for Biotechnology Information. And so we're working with them to make this available as an open data set. Why this is important is it's all about increasing the speed of scientific discovery. I personally think that in the fullness of time, the scientists will come up with cures for just about all of the human ailments that are out there. And it's our job at AWS to put into the hands of the scientists the tools they need to make things happen quickly, or in our lifetime. And I'm really excited to be working with NIH on that. When we start talking about it, there's multiple things the scientists need. One is access to these data sets, and SRA. >>It's a very large data set. It's 45 petabytes and it's growing. I personally believe that it's going to double every year, year and a half. So it's a very large data set and it's hard to move that data around.
It's so much easier if you just go into the cloud, compute against it and do your research there in the cloud. And so it's super important. 45 petabytes, to give you an idea, if it were all human data, that's equivalent to about seven and a half million people or, put another way, 90% of everybody living in New York City. So that's how big this is. But then also what AWS is doing is we're bringing compute. So in the cloud, you can scale up your compute, scale it down, and then the third leg of the stool is giving the scientists easy access to the specialized tool sets they need. >>And we're doing that in a few different ways. One is that the people who design these toolsets design a lot of them on AWS, but then we also make them available through something called AWS Marketplace. So they can just go into Marketplace, get a catalog, go in there and say, I want to launch this workload, and it launches the infrastructure underneath. And it speeds the ability for those scientists to come up with the cures that they need. So SRA is stored in Amazon S3, which is a very popular object store, not just in the scientific community; virtually every industry uses S3. And by making this available as these public data sets, we're giving the scientists the ability to speed up their research. >>One of the things that jumps out to me too is that, in addition to enabling them to speed up research, it's also facilitating collaboration globally, because now you've got the cloud to drive all of this, which allows researchers in completely different parts of the world to be working together almost in real time. So I can imagine the incredible power that this is going to provide to that community. So I have to ask you though, you talked about this being all life forms, including viruses, COVID-19. What are some of the things that you think we can expect this to facilitate?
>>So earlier in the year we took the genetic code, or NIH took the genetic code, and they put it in an SRA-like format, and that's now available on AWS. And here's what's great about it: you can now make it so anybody in the world can go to this open data set and start doing their research. One of our goals here is to get back to a democratization of research. For example, the very first vaccine that came out was for smallpox. It's a vaccine that was developed by a rural country doctor using essentially test tubes and a microscope. It's gotten hard to do that, because data sets are so large and you need so much compute. By using the power of the cloud, we've really democratized it, and now anybody can do it. So for example, with the SRA data set that was done by NIH, organizations like the University of British Columbia, their Cloud Innovation Center is doing research. And so what they've done is they've scanned the SRA database. Think about it. They scanned 11 million entries for coronavirus sequencing. And that's really hard to do in a typical on-premise data center. It's relatively easy to do on AWS. So by making this available, we can have a larger number of scientists working on the problems that we need to have solved. >>Well, and as we all know, in the U.S., Operation Warp Speed, that "warp speed" term alone really signifies how quickly we all need this to be progressing forward. But this is not the first partnership that AWS has had with the NIH. Talk to me about what some of the other things are that you're doing together. >>We've been working with NIH for a very long time. Back in 2012, we worked with NIH on what was called the 1000 Genomes data set. This is another really important data set, and it's a large number of sequenced human genomes.
And we moved that into, again, an open data set on AWS, and what's happened in the last eight years is many scientists have been able to compute on it. And the wonderful power of the cloud is that, over time, we continue to bring out tools to make it easier for people to work. So whether they're computing using our instance types, which we call Elastic Compute Cloud, or they're doing some high performance computing using EMR, Elastic MapReduce, they can do that. And then we've brought out new things that really take it to the next level, like Amazon SageMaker. >>And this makes it really easy for the scientists to launch machine learning algorithms on AWS. So we've done the 1000 Genomes data set. There's a number of other areas within NIH that we've been working on. So for example, over at the National Cancer Institute, we've been providing some expert guidance on best practices for how you can architect and work on these COVID-related workloads. NIH does things in collaboration with many different universities, over 2,500 academic institutions, and they do that through grants. And so we've been working with the Office of the Director, and they run their grant management applications in the RFA on AWS, and that allows it to scale up and to work very efficiently. And then we entered with NIH into this program called STRIDES. STRIDES is a program for not only NIH, but also all these other institutions that work with NIH, to use the power of the commercial cloud for scientific discovery. And when we started that back in July of 2018, long before COVID happened, it was so great that we had that up and running, because now we're able to help them out through the STRIDES program. >>Right. Can you imagine if, let's not even go there. I was going to say, but so, okay.
So the SRA data is available through the AWS Open Data Sponsorship Program. You talked about STRIDES. What are some of the other ways that AWS is assisting? >>Yeah. So STRIDES is, you know, wide ranging through multiple different institutes. So, for example, over at the National Heart, Lung, and Blood Institute, NHLBI, and I said there's a lot of acronyms, at NHLBI they've been working on harmonizing genomic data. And so, working with the University of Michigan, they've been analyzing through a program that they call TOPMed. We've also been working with NIH on establishing best practices, making sure everything's secure. So we've been providing AWS Professional Services that are showing them how to do this. So one portion of STRIDES is getting the right data set and the right compute and the right tools in the hands of the scientists. The other area that we've been working on is making sure the scientists know how to use it. And so we've been developing these cloud learning pathways. We started this quite a while back, and it's been so helpful here during COVID. So scientists can now go on and do self-paced online courses, which has been really helping here during the pandemic. And they can learn how to maximize their use of cloud technologies through these pathways that we've developed for them. >>Well, that education is imperative. I mean, you think about all of the knowledge that they have within their scientific discipline, and being able to leverage technology in a way that's easy is absolutely imperative to the timing. So let's talk about other data sets that are available. So you've got the SRA available. What other data sets are available through this program?
>>We've got a wide range of data sets that we're making open data sets and, in general, these data sets are improving the human condition or improving the world in which we live. And so I've talked about a few things. There's a few more. So for example, there's The Cancer Genome Atlas that we've been working on with the National Cancer Institute, as well as the National Human Genome Research Institute. And that's a very important data set that's being computed against throughout the world. Commonly within the scientific community, that data set is called TCGA. Then we also have some data sets that are focused on certain groups. So for example, Kids First is a data set that's looking at a lot of the challenges in diseases that kids get, everything from very rare pediatric cancers to heart defects, et cetera. >>And so we're working with them, but it's not just on the medical side. We have open data sets with, for example, NOAA, the National Oceanic and Atmospheric Administration, to better understand what's happening with climate change and to slow the rate of climate change. Within the Department of the Interior, they have a Landsat database that is looking at satellite pictures of the earth, so we can better understand the world we live in. Similarly, NASA has a lot of data that we put out there and, over in the Department of Energy, there's data sets there that the scientists are researching against to make sure that we have better clean, renewable energy sources. But it's not just government agencies that we work with when we find a data set that's important. >>We also work with nonprofit organizations. Nonprofit organizations are not flush with cash, and they're trying to make every dollar work.
And so we've worked with organizations like the Child Mind Institute or the Allen Institute for Brain Science. And these are largely neuroimaging data. And we made that available via our open data set program. So there's a wide range of things that we're doing. And what's great about it is that when we do it, you democratize science and you allow many, many more scientists to work on these problems. They're so critical for us. >>The availability is incredible, but also the breadth and depth of what you just spoke about. It's not just government, for example. You've got about 30 seconds left. I'm going to ask you to summarize some of the announcements that you think are really, really critical for federal customers to be paying attention to from re:Invent 2020. >>Yeah. So one of the things that these federal government customers have been coming to us on is they've had to have new ways to communicate with their customers, with the public. And so we have a product that we've had for a while called Amazon Connect, and it's been used very extensively throughout government customers. And it's used in industry too. We've had a number of announcements this week. Jasmine made multiple announcements on enhancements to Amazon Connect, or additional services, everything from helping to verify that that's the right person, with Connect ID, to making sure that the customer gets a good customer experience, to Connect Wisdom, to making sure that the managers of these call centers can manage the call centers better. And so I'm really excited that we're putting in the hands of both government and industry a cloud-based solution to make their connections to the public better.
>>It's all about connections these days. I wish we had more time, because I know we could unpack so much more with you, but thank you for joining me on theCUBE today, sharing some of the insights, some of the impacts and the availability that AWS is enabling for the scientific and other federal communities. It's incredibly important, and we appreciate your time. >>Thank you, Lisa. >>For Brett McMillen, I'm Lisa Martin. You're watching theCUBE's coverage of AWS re:Invent 2020.

Published Date : Dec 10 2020

SUMMARY :

It's the cube with digital coverage of AWS It's great to see you glad that you're safe and well. It's great to be back. Talk to us about that because especially during the global health crisis that we're in, One of the reasons this is so important is that we know to find cures So making this data, sorry, I'm just going to making this data available to those scientists. And so, um, within NIH, we're working with, um, the, So in the cloud, you can scale up your compute, scale it down, and then kind of the third they're. And it speeds the ability for those scientists One of the things that Springs jumps out to me too, is it's in addition to enabling them to speed up research, And that's really hard to do in a typical on-premise data center. Talk to me about what you guys, take it to the next layer, like level like, uh, Amazon SageMaker. in the RFA on AWS, and that allows it to scale up and to work very efficiently. So the SRA data is available through the AWS open data sponsorship And so working with the university of Michigan, they've been analyzing absolutely imperative to the timing. And so, um, And so we're working with them, but it's not just in the, um, uh, medical side. And these are largely like neuro imaging, um, data. I'm going to ask you to summarize some of the announcements that's the right person from AWS connect ID to making sure that that customer's And we appreciate your time.

ENTITIES

Entity	Category	Confidence
NIH	ORGANIZATION	0.99+
Lisa Martin	PERSON	0.99+
Brett McMillan	PERSON	0.99+
Brett McMillen	PERSON	0.99+
AWS	ORGANIZATION	0.99+
NASA	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
July of 2018	DATE	0.99+
2012	DATE	0.99+
Theresa Carlson	PERSON	0.99+
Jasmine	PERSON	0.99+
Lisa	PERSON	0.99+
90%	QUANTITY	0.99+
New York	LOCATION	0.99+
Allen Institute	ORGANIZATION	0.99+
SRA	ORGANIZATION	0.99+
last year	DATE	0.99+
Breton McMillan	PERSON	0.99+
NCBI	ORGANIZATION	0.99+
45 petabytes	QUANTITY	0.99+
SRE	ORGANIZATION	0.99+
seven and a half million people	QUANTITY	0.99+
third leg	QUANTITY	0.99+
One	QUANTITY	0.99+
Intel	ORGANIZATION	0.99+
earth	LOCATION	0.99+
over 2,500	QUANTITY	0.99+
SRA	TITLE	0.99+
S3	TITLE	0.98+
pandemic	EVENT	0.98+
first partnership	QUANTITY	0.98+
one	QUANTITY	0.98+
child mind Institute	ORGANIZATION	0.98+
U S	LOCATION	0.98+
this year	DATE	0.98+
pandemics	EVENT	0.98+
national cancer Institute	ORGANIZATION	0.98+
both	QUANTITY	0.98+
national heart lung and blood Institute	ORGANIZATION	0.98+
NOAA	ORGANIZATION	0.97+
national human genomic research Institute	ORGANIZATION	0.97+
today	DATE	0.97+
Landsat	ORGANIZATION	0.96+
first	QUANTITY	0.96+
11 million entries	QUANTITY	0.96+
about 30 seconds	QUANTITY	0.95+
year and a half	QUANTITY	0.94+
AWS connect	ORGANIZATION	0.93+
university of British Columbia	ORGANIZATION	0.92+
COVID	EVENT	0.91+
COVID-19	OTHER	0.91+
over tens of thousands of people	QUANTITY	0.91+

Jamil Jaffer, IronNet Cybersecurity | AWS re:Inforce 2019

>> Live from Boston, Massachusetts, it's theCUBE, covering AWS re:Inforce 2019. Brought to you by Amazon Web Services and its ecosystem partners. >> Well, welcome back, everyone. This is theCUBE's live coverage here in Boston, Massachusetts, for AWS re:Inforce, Amazon Web Services' first inaugural conference around security. It's not a summit. It's a branded event. Big-time ecosystem developing. We have returning here CUBE alumni Jamil Jaffer, VP of strategy and partnerships at IronNet Cybersecurity. Welcome back. >> Thanks. >> General Keith Alexander was on a week and a half ago at the Public Sector Summit. Good to see you. >> Good to see you. Thanks for having me back. >> I want to get into some of the Iran cyber conversations. We had General Keith Alexander on. He was the original commander of the division. So there are important discussions to have around that. But first I want to get your take on the event. You guys are building a business. IronNet is involved in public sector. This is commercial, private partnership, public relations coming together. Your model is sharing, so bringing public and private together is important.
So I don't know whether they announced this capability where they're doing the announcement yesterday or today. So I forget which one so I'll leave that leave that leave that once pursued peace out. But the main thing is, they're announcing couple of new technology plays way our launch party with them on the civility place. So we're gonna be able to do what we were only wanted to do on Prem. We're gonna be able to do in the cloud with AWS in the cloud formation so that we'll deliver the same kind of guy that would deliver on prime customers inside their own cloud environments and their hybrid environment. So it's a it's a it's a sea change for us. The company, a sea change for a is delivering that new capability to their customers and really be able to defend a cloud network the way you would nonpregnant game changer >> described that value, if you would. >> Well, so you know, one of the key things about about a non pregnant where you could do you could look at all the flows coming past you. You look at all the data, look at in real time and develop behavior. Lana looks over. That's what we're doing our own prime customers today in the cloud with his world who looked a lox, right? And now, with the weight of your capability, we're gonna be able to integrate that and do a lot Maur the way we would in a in a in a normal sort of on Prem environment. So you really did love that. Really? Capability of scale >> Wagon is always killed. The predictive analytics, our visibility and what you could do. And too late. Exactly. Right. You guys solve that with this. What are some of the challenges that you see in cloud security that are different than on premise? Because that's the sea, So conversation we've been hearing. Sure, I know on premise. I didn't do it on premises for awhile. What's the difference between the challenge sets, the challenges and the opportunities they provide? >> Well, the opportunities air really neat, right? 
Because you've got that even they have a shared responsibility model, which is a little different than you officially have it. When it's on Prem, it's all yours essential. You own that responsibility and it is what it is in the cloud. Its share responsible to cloud provider the data holder. Right? But what's really cool about the cloud is you could deliver some really interesting Is that scale you do patch updates simultaneously, all your all your back end all your clients systems, even if depending how your provisioning cloud service is, you could deliver that update in real time. You have to worry about. I got to go to individual systems and update them, and some are updated. Summer passed. Some aren't right. Your servers are packed simultaneously. You take him down, you're bringing back up and they're ready to go, right? That's a really capability that for a sigh. So you're delivering this thing at scale. It's awesome now, So the challenge is right. It's a new environment so that you haven't dealt with before. A lot of times you feel the hybrid environment governed both an on Prem in sanitation and class sensation. Those have to talkto one another, right? And you might think about Well, how do I secure those those connections right now? And I think about spending money over here when I got all seduced to spend up here in the cloud. And that's gonna be a hard thing precisely to figure out, too. And so there are some challenges, but the great thing is, you got a whole ecosystem. Providers were one of them here in the AWS ecosystem. There are a lot here today, and you've got eight of us as a part of self who wants to make sure that they're super secure, but so are yours. Because if you have a problem in their cloud, that's a challenge. Them to market this other people. You talk about >> your story because your way interviews A couple weeks ago, you made a comment. I'm a recovering lawyer, kind of. You know, we all laughed, but you really start out in law, right? 
>> How did you end up here? >> Yeah, well, the truth is, I grew up sort of a technologist myself. My first computer was a Trash-80, a TRS-80 Color Computer from RadioShack, with 4K of RAM on board, right? >> A true TRS-80. I know what you're saying. >> It was a beautiful system, right? We stored programs on cassette tapes, right? And when we upgraded from 4K to 16K, we were the talk of the Rainbow Computer Club in Santa Monica, California. >> Game changer. >> It was a game changer, walking in with 16K of on-board RAM. I mean, this is what you're going to do. And so, you know, I went from that and I... >> Got in trouble or something, and had to go to law school? >> I mean, you know, look. So my dad was a chemist, right? So he loved computers, loved science. But he also had an unrequited political bone in his body. He grew up in East Africa, in Tanzania. It was always thought that he might be a minister in government. The socialists came to power, and they had to leave at the end of the day. And he came to the States and was doing chemistry, which is of course what he studied. But he still loved politics. So he raised us on NPR. So when I went to college, I studied political science. But I paid my way through college doing computer support for the life sciences department at the last moment. And I ran 10BASE-T. I climbed through ceilings and pulled network cable, did punch-down blocks, a little bit of fiber splicing. So, you know, I was still a nerd. >> Writing software on the side? >> One major, major error. And that was when the web first came out and we had Lynx. Don't you remember? That was a text-based browser, right? And I remember looking at it and thinking, this is terrible. Who would use HTTP? I'm going back to Gopher. Gopher's awesome. Well, turns out I was totally wrong about Mosaic and Netscape. After that, it was all hands on deck. >> You've got a great career.
You've been involved a lot in the confluence of policy, politics and tech, which is actually the perfect skill set for the challenge we're dealing with. So I've got to ask you, what are some of the most important conversations that should be on the table right now? Because there have been a lot of conversations going on around this. Technology has been around for many decades. This has been a policy problem. It's been a societal problem. But now there's a really acute focus on a lot of key things. What are some of the most important things that you think should be on the table for techies, for policymakers, for business people, for lawmakers? >> One, I think we've got to figure out how to get real technology knowledge into the hands of policymakers, right? You watch the Facebook hearings on Capitol Hill. I mean, it was a joke. It was concerning, right? I mean, anybody with a technology background would be concerned about what they saw there, and it's not the lawmakers' fault. I mean, you know, we've got to empower them with that. And so we've got to take technologists, figure out how to get them to talk policy, and get them up on the Hill and in the administration talking to folks, right? And one of the big outcomes, I think, that has to come out of that conversation is: what do we do about national-level cybersecurity? Right, because we assume today that it's the role of the private sector to provide cybersecurity for their own companies, but in no other circumstance do we expect that when it's a nation-state attacker, right? We don't expect Target or Walmart or any other company, J.P. Morgan, to have surface-to-air missiles on the roofs of their warehouses or their buildings to defend against Russian Bear bombers. Why? That's the job of the government.
But when it comes to cyberspace, we expect private companies to defend against everything from a script kiddie in his basement to the criminal hacker in Eastern Europe to the nation state, whether Russia, China, Iran or North Korea, and these nation states have virtually unlimited resources. >> Armies, sophisticated R&D, technology, and it's powerful, like a nuclear weaponry kind of impact for digital. >> Exactly. And how can we expect private companies to defend themselves? It's not a fair fight. And so the government has to have some role. The question is, what role? How does that square with our values, our principles, right? And how do we ensure that the Internet remains free and open, while still ensuring that the government is not hampered in doing its job out there? >> And I love this topic. We talk a lot about the future of warfare, and that's really what we're talking about. You go back to Stuxnet, which opened Pandora's box, and the 2016 election hack, where you had, you know, the Russians trying to control the memes, control the narrative. As you pointed out in that one video we did, control the belief system and you control the population without firing a shot. 2020 is going to be really interesting. And now you see the U.S. retaliate against Iran in cyberspace, right? Allegedly. And I was saying that we had a conversation with Robert Gates a couple of years ago, and I asked him, I said, should we be taking more of an offensive posture? And he said, well, we have more to lose than the other guys. The glass house problem. What are your thoughts? >> Look, certainly we rely intimately, inherently on the cyber infrastructure that sort of is at the core of our economy, at the core of the world economy, increasingly, today. That being said, because it's so important to us, all the more reason why we can't let attacks go unresponded to, right?
And so if you're being attacked in cyberspace, you have to respond at some level, because if you don't, you'll just keep getting punched. It's like the kid on the playground, right? If the bully keeps punching him and nobody does anything, not the school administration, not the kid himself, well, then the bully's going to keep doing what he's doing. And so it's not surprising that we're being tested by Iran, by North Korea, by Russia, by China, and they're getting more and more aggressive, because when we don't punch back, that's going to happen. Now, we don't have to punch back in cyberspace, right? A common sort of fetish about cyber is... >> The response to the issue is going to be to respond to the bully, in this case. >> Exactly. >> The playground. >> Exactly. >> Let's talk about Iran. >> So if I can, the response could be, hey, we could do this. Let them know you could. >> Yes. >> And it's your move. >> Well, and this is the key: it's not just responding, right? So Bob Gates told you we can't talk about what we're doing. And even in the latest series of alleged responses to Iran, the reason we keep saying alleged is the U.S. has not publicly acknowledged it, but the word has gotten out. Well, of course, it's not a particularly effective deterrent if you do something but nobody knows you did it, right? You've got to let it out that you did it. And frankly, you've got to own it and say, hey, look, that guy punched me, I punched him back in the teeth. So you'd better not come after me, right? We don't do that, in part because these capabilities grew up in the intelligence community, at NSA and the like, and we're very sensitive about that. But the truth is, you have to talk about your offensive capabilities. You can talk about your abilities. You can say, here are my red lines. If you cross them, I'm going to punch you back. If you do that, then by the way, you've got to punch back. You can't let red lines be crossed and then not respond.
And then you're gonna talk about some level of capabilities. It can't all be secret. Can't all be classified. >> Where are we in this debate? Me first. Well, you're referring to the Thursday online attack against the Iranian intelligence community for the tanker and the drone strike that they got together. The drone takedown of an unarmed surveillance drone. >> But where are we >> in this debate of having this conversation where the government should protect and serve its people? And that's the role. Because if an army rolled in, a physical army dropped on the shores of Manhattan, I don't think Citibank would be sending their people out to fight. Right? Right. So, like, this is really happening. >> Where are we >> on this? Like, is it just sitting there on the >> table? What's happening? What's amazing about it? Hi. This was getting it going well, that that's a Q. What's been amazing? It's been happening since 2012, 2011, right? We know about the Las Vegas Sands attack, right, by Iran. We know about North Korea's. We know about all these. They're going on here in the United States against private sector companies, not against the government. And there's largely been no response. Now we've seen Congress get more active. Congress just last year passed legislation that gave Cyber Command the authority, on the president's or the Secretary of Defense's orders, to take action against Russia, Iran, North Korea and China if certain cyber attacks happen. That's a good thing, right, to be giving them the clear authority, right, and it appears the president is willing to make some steps in that direction, so that's a positive step. Now, on the back end, though, you talk about what we do to harden ourselves, if that's gonna happen, right, and the government isn't ready today to defend the nation, even though the Constitution is about providing for the common defense, and we know that the Department of Defense for long.
For a long time, since Secretary Panetta, it has said that it is our mission to defend the nation, right? But we know they're not fully doing that. How do they empower private sector defense? And one of the keys to that has got to be, look, if you're the intelligence community or the U.S. government, you're collecting tremendous amounts of data about what you're seeing in foreign space, about what the enemy is doing, what they're preparing for. You have got to share that in real time, at machine speed, with industry. And if you're not doing that and you're still counting on industry to be the first line of defense, well, then you're not empowering that defense. And if you're not empowering the defense, how do you expect them to defend themselves against the nation-state threats? That's a real cry. >> So much tighter public-private relationship. >> Absolutely, absolutely. And it doesn't have to be the government standing on the front lines of the U.S. Internet, as though you could even determine the boundaries of the U.S. Internet, right? Nobody wants NSA or someone out there doing that. But what you do want is, if you're gonna put the private sector in the first line of defense, we gotta empower that defense. If you're not doing that, then the government isn't doing its job. And so we're gonna talk about this for a long time. I worked on that first piece of information sharing legislation with the House Intelligence Chairman Mike Rogers and Dutch Ruppersberger from Maryland, right, congressmen from both sides of the aisle, working together to get information sharing legislation done. That got done in 2015. But that's just a first step. The government's got to be willing to share classified information at scale, at speed. We're still not seeing that. >> Yeah. How do people get involved? I mean, like, I'm not a political person. I'm a moderate in the middle. But >> how do I, how do people get involved?
How does the technology industry, not the >> policy budgets and the talk that goes on at the top tech companies, how do tech workers or people who love tech and are patriots and want freedom get involved? What's the best approach? >> Well, that's a great question. I think part of it is learning how to talk policy. How do we get in front of policymakers? Right. I run a think tank on the side, the National Security Institute at George Mason University's Antonin Scalia Law School. We have a program funded by the Hewlett Foundation where we're bringing in technologists, about 25 of them. Actually, our second event in this series is gonna be in Chicago this weekend. We're training these technologists, these are data scientists, engineers and the like, to talk policy, right? These are people who said, we want to be involved, we just don't know how to get involved. And so we're training them up. That's a small program. There's a great program called TechCongress, also funded by the Hewlett Foundation, that places technologists in policy positions in Congress. That's really cool. There's a lot of work going on, but those are small things, right? We need to do this at scale. And so, you know, what I would say is that if there are technologists out there who want to get involved, reach out to us, let us know. We'll work with our partners to help you get your information and data about what's going on. Get your voice heard. There are a lot of organizations, too, that wanna get technologists involved. That's another opportunity to get in. Get in the building. >> It's a story that we want to help tell and be involved in. Dave and I feel passionate about this. It's a data problem, so there's some real tech goodness in there. Absolutely. People like to solve hard problems, right? I mean, we got a couple days of them. You've got big, hard problems. It's also for all the people out there who are DevOps, cloud people, who like to work on solving hard problems. >> We got a lot >> of them. Let's do it. So what's going on at IronNet?
Give us the update. Quick plug for the company. Keith Alexander, founder, a great guy, great guest, having been on theCUBE. Give the quick... >> Thanks so much. So, you know, we have done two rounds of funding, about $110 million all in, so excited. We have partners like Kleiner Perkins, ForgePoint, C5, all supporting us. And now it's all about... We just got a new co-CEO in Bill Welch, from Zscaler and Duo. So he grew Zscaler to a billion-dollar valuation; he came in to Duo, you know, they also had a great, great exit. Also, we got him. We got Sean Foster in from industry also. So Bill and Sean came together. We're now making this business move more rapidly. We're moving to the mid market. We're moving to a cloud platform more aggressively, and so exciting times at IronNet. We're coming to big and small companies near you. We've got the capability. We're bringing advanced, persistent defense to bear on these hard problems with network threat analytics and collective defense. That's the key to our operation. We're excited >> to be doing it. I call it NSA as a service, but that's not politically correct. But this is theCUBE, so >> Well, look, if you want to defend at scale, right, you want to do that. You know, Keith knows how to do that. Keith Alexander was at the forefront of that when he was in >> the government. Well, you guys are certainly on the cutting edge, riding that wave of common societal change, technology impact for good, for defense, for just betterment, not making a quick buck. Well, you know, look, it's a good business model, by the way, to be in that business. >> I mean, it's on our business cards. And General Alexander means it. Our business. I'd say the Michigan T knows that he really means that, right? Rather private sector. We're looking to help companies to do the right thing and protect the nation, right? You know, and protect themselves >> better. Well, our mission's to turn the lights on. Get those voices out there. Thanks for coming on.
Sharing the insights. CUBE coverage here. Day one of two days of coverage. AWS re:Inforce here in Boston. Stay with us for more Day one after this short break.

Published Date : Jun 25 2019



Bill Raduchel | Automation Anywhere Imagine 2018

>> From Times Square, in the heart of New York City, it's theCUBE. Covering Imagine 2018. Brought to you by Automation Anywhere. >> Hey welcome back everybody, Jeff Frick here with theCUBE. We're in Manhattan at the Automation Anywhere Imagine 2018. 1100 people milling around looking at the ecosystem, looking at all the offers that all the partners have. And we're excited to have one of the strategic advisors from the company, he's Bill Raduchel. Strategic advisor, been in the industry for >> 50 years, 40 years, 50 years, whatever. Forever. >> So Bill, thanks for takin' a few minutes. >> My pleasure. >> So how did you get involved with Automation Anywhere? >> Oh the way most things happen in life, friends, right? You get involved, and got to talking to Mihir, and we got, we see the world much the same way. And see the importance of bots and bringing productivity back to the economy. And no other way to do it. So just ya know, it grew. >> It grew. So it's interesting right? Cause I thought ERP was supposed to have wrung out all the efficiency that, and waste in the system, but clearly that was not the case. >> I won both CIO of the year and CTO of the year, and I put in an ERP system, and I understand it. It also failed three times going in. It was incredibly painful, but it produced over a billion dollars in cash savings. So it did. The problem is the world changes. And the world changes now at a pace far faster than you can possibly change your ERP system. >> Right. >> I mean ERP systems are built to be changed every I don't know, 15 to 25 years. And the world in 25 years is gonna look very different than the world does today. So we just have a huge disconnect between how fast we can create and deploy software, and how fast the world is changing to which that software has to relate. >> Right. And still so many of the processes that people actually do in their day job, are still spreadsheet based, you know, my goodness.
How much of the world's computational horsepower is used on Excel on stand alone little reports and projects? >> Another question to ask is how many errors are in those spreadsheets? >> That's right. Not enough copy paste. >> I mean, I was on a study for the National Academy of Sciences, and we looked at why productivity growth wasn't happening. And one answer, which we just talked about, is legacy software. I mean, you just couldn't change it, you couldn't, you know when you had to rewrite the software all productivity growth just slowed to a crawl. The other thing is something that economists call lore. And lore is basically oral tradition. But it's the way the company really works. >> Right. >> You have all these processes and all these procedures but when you get down and you start talking and sort of like, what is it the secret boss show? I mean, you learn the little things that the people down at the bottom know. Well, so far, automation has never really penetrated that. And yet that becomes the barrier to almost all change. So what RPA does, is RPA actually begins to go after lore. RPA allows companies to begin to understand lore, and understand how to optimize it. Understand how to record it. I mean, you know, it's not written down. It's below the level that people bother to document and yet, if you don't change the lore, you're not gonna matter. >> You're not changing anything. >> You're not changing anything. So this is why this is so exciting because for the first time, companies, organizations, people, I mean we see all this stuff coming out just to help us in our everyday lives. You get to go at the lore. I mean, you know that, well you don't put that field in, no you wait 20 seconds after you filled in this field before you go and do that, because it takes that long for that and you get an error over here. That's how things really work. And this is the kind of technology that can actually address that.
And so from that point of view it's really revolutionary because we've never been able, I mean, oral tradition has never been subject to a whole lot of scientific studies. >> Well the other thing is just so impressive when you've been in the business a long time, you know we're talking about AOL before we turn on the cameras and shipping CDs around. >> Right. >> As we get closer and closer to ya know, infinite compute, infinite storage, infinite networking, 5G just around the corner. At a price point that keeps asymptotically getting closer and closer to zero, the opportunity for things like AI, and to really apply a lot more horsepower to these problems, opens up a whole different opportunity. >> Two comments to that. One is, about 15 years ago the National Science Foundation funded Monica Lam at Stanford to do a project on the open mobile internet, POMI. And one of their conclusions was that at some point in the future, which may be happening now, we would all have a digital butler. And everybody would have, basically a bot. They would be living 24/7 operating on our behalf, doing the things that help make our life better. And that is you know, really what's gonna happen. Now you see AI, and if you saw there was a report that got a lot of news from the speech given at the Federal Reserve Bank at Dallas, I think. Where the guy said well productivity is fine, it's just that the AI technology hasn't been able to find a way to be effective, or made real. Well the way it's gonna be made real is these bots because you still got your ERP system. Now granted I can have AI over here, but if it doesn't talk to the ERP system, how is the order gonna get placed? How is the product gonna get mailed? How is it gonna get shipped? So something has to go bring these together. So again, you're not gonna have impact from AI unless you have an impact from bots. Because they're the interface to the real world. >> Well the other huge thing that happened, right, was this mobile.
And the Googles and the Amazons of the world resetting our expectations of the way we should be interacting with our technology. And you know, it's funny but there's little things that are in our day all the time. I mean, Waze is just a phenomenal example, right? And auto fill on an address. You know, this is the address you typed in, this is the one that USPS says is the official address from your home. So it's all these little tiny things that are just happening >> Spell check. >> Without even, spellcheck. >> Spell check, I mean, the inventor of spell check is John Seely Brown. And he was giving a speech at the University of Michigan 15 years ago and the graduates weren't pleased. Here was a computer scientist gonna come talk to them and it's at the Michigan stadium, and they're throwing beach balls and no one's paying any attention. And the person who introduced him said and I wanna introduce John Seely Brown, the man who invented spell check. And he had a standing ovation from 100,000 people because that got their attention. They all knew that that was really important. No you're right. I mean, the iPhone is 10 years old. Well I mean smart phones are 20 years old. The iPhone is 10 years old, 10 and a half now. I mean, it's changed how we live our lives, how we do business, how everything goes. Anybody who thinks that the next 10 years is gonna be less change >> No, it's only accelerating. >> There's so many vectors. I mean a year ago, a friend coined the CAMBRIC Extinction, basically a play on words on the Cambrian Extinction. And it's Cloud, AI, mobile, big data, robotics, Internet of Things, and cyber security. And he pointed out that any one of those would be incredibly disruptive, they were all hitting at the same time. The thing that's amazing is that's a two year old comment. Blockchain wasn't around. >> Right. >> And today, blockchain may be more disruptive than any of those.
And yet, how do all of those connect to the legacy systems for some long period of time? It's what's going on in this room. >> Right. Well cause I was gonna ask you, cause you advise a ton of companies, so you've seen it and you continue to see it across a large spectrum. What's special about this company? What's special about this leadership team that keeps you excited, that keeps you involved? >> It's the people side of this, right. I mean, I have been to more computer related conferences in my life than I can count. I've never seen as much enthusiasm as there is here. Maybe, at a Mac conference. But I mean it's that same level of enthusiasm, it's passion. How does technology get adopted when you have to go invest in it? It takes passion. You gotta get people who believe. People who are committed. People who wanna go and do something with it. And that's what they've been able to do. That's what Mihir has done. And it's been brilliant in bringing that on board. >> Yeah, you can certainly feel it here in the room. Especially when it's still relatively intimate. >> Right. >> You know, people are sharing ideas, you know they're excited. It's really not kind of a competitive vendor fair, it's more of a community that's really trying to help each other out. >> Well that, I mean, they're at that stage. It may get a little bit, you know this, well no I'm not gonna tell you about my bot. It's a great bot and it does great things, but nope, I'm not gonna tell you how it works. >> Right. So just last parting word, you know as you see kind of the bot economy. We've seen they got the bot store, I guess they have a hundred bots, they've only had it open for a very short period of time. You can buy, sell, free. What do you see kind of the next short term evolution of this space? >> I think that bots are probably worth somewhere around a point in productivity growth. Well, a point >> Not a basis point, but a point point. >> A point.
That's what McKinsey says, that's what, I mean because this is allowing you to capture benefits that you should have and you haven't. A point in global productivity is about a trillion dollars. So then your question for the bot economy is okay, if the value of the bots is a trillion dollars, what portion of that can the bot economy capture? And that you know, I mean 20, 30 percent is certainly a reasonable number to go look at. The real world lives over here, all this technology change lives over here, and bots are gonna be the bridge by which you bring those two things together. So yes, it should be big and growing for a long time. >> Well Bill, thanks for taking a minute. I really appreciate the conversation. >> Great, thank you. >> Alright, he's Bill, I'm Jeff. You're watchin' theCUBE from Automation Anywhere Imagine 2018. Thanks for watching. (electronic music)

Published Date : Jun 1 2018



Data Science for All: It's a Whole New Game

>> There's a movement that's sweeping across businesses everywhere here in this country and around the world. And it's all about data. Today businesses are being inundated with data. To the tune of over two and a half million gigabytes that'll be generated in the next 60 seconds alone. What do you do with all that data? To extract insights you typically turn to a data scientist. But not necessarily anymore. At least not exclusively. Today the ability to extract value from data is becoming a shared mission. A team effort that spans the organization extending far more widely than ever before. Today, data science is being democratized. >> Data Science for All: It's a Whole New Game. >> Welcome everyone, I'm Katie Linendoll. I'm a technology expert writer and I love reporting on all things tech. My fascination with tech started very young. I began coding when I was 12. Received my networking certs by 18 and a degree in IT and new media from Rochester Institute of Technology. So as you can tell, technology has always been a sure passion of mine. Having grown up in the digital age, I love having a career that keeps me at the forefront of science and technology innovations. I spend equal time in the field being hands on as I do on my laptop conducting in depth research. Whether I'm diving underwater with NASA astronauts, witnessing the new ways in which mobile technology can help rebuild the Philippines' economy in the wake of super typhoons, or sharing a first look at the newest iPhones on The Today Show, yesterday, I'm always on the hunt for the latest and greatest tech stories. And that's what brought me here. I'll be your host for the next hour as we explore the new phenomenon that is taking businesses around the world by storm. And data science continues to become democratized and extends beyond the domain of the data scientist. And why there's also a mandate for all of us to become data literate. Now that data science for all drives our AI culture.
And we're going to be able to take to the streets and go behind the scenes as we uncover the factors that are fueling this phenomenon and giving rise to a movement that is reshaping how businesses leverage data. And putting organizations on the road to AI. So coming up, I'll be doing interviews with data scientists. We'll see real world demos and take a look at how IBM is changing the game with an open data science platform. We'll also be joined by legendary statistician Nate Silver, founder and editor-in-chief of FiveThirtyEight. Who will shed light on how a data driven mindset is changing everything from business to our culture. We also have a few people who are joining us in our studio, so thank you guys for joining us. Come on, I can do better than that, right? Live studio audience, the fun stuff. And for all of you during the program, I want to remind you to join that conversation on social media using the hashtag DSforAll, it's data science for all. Share your thoughts on what data science and AI means to you and your business. And, let's dive into a whole new game of data science. Now I'd like to welcome my co-host General Manager IBM Analytics, Rob Thomas. >> Hello, Katie. >> Come on guys. >> Yeah, seriously. >> No one's allowed to be quiet during this show, okay? >> Right. >> Or, I'll start calling people out. So Rob, thank you so much. I think you know this conversation, we're calling it a data explosion happening right now. And it's nothing new. And when you and I chatted about it. You've been talking about this for years. You have to ask, is this old news at this point? >> Yeah, I mean, well first of all, the data explosion is not coming, it's here. And everybody's in the middle of it right now. What is different is the economics have changed. And the scale and complexity of the data that organizations are having to deal with has changed. And to this day, 80% of the data in the world still sits behind corporate firewalls. So, that's becoming a problem. 
It's becoming unmanageable. IT struggles to manage it. The business can't get everything they need. Consumers can't consume it when they want. So we have a challenge here. >> It's challenging in the world of unmanageable. Crazy complexity. If I'm sitting here as an IT manager of my business, I'm probably thinking to myself, this is incredibly frustrating. How in the world am I going to get control of all this data? And probably not just me thinking it. Many individuals here as well. >> Yeah, indeed. Everybody's thinking about how am I going to put data to work in my organization in a way I haven't done before. Look, you've got to have the right expertise, the right tools. The other thing that's happening in the market right now is clients are dealing with multi cloud environments. So data behind the firewall in private cloud, multiple public clouds. And they have to find a way. How am I going to pull meaning out of this data? And that brings us to data science and AI. That's how you get there. >> I understand the data science part but I think we're all starting to hear more about AI. And it's incredible that this buzz word is happening. How do businesses adapt to this AI growth and boom and trend that's happening in this world right now? >> Well, let me define it this way. Data science is a discipline. And machine learning is one technique. And then AI puts both machine learning into practice and applies it to the business. So this is really about getting your business where it needs to go. And to get to an AI future, you have to lay a data foundation today. I love the phrase, "there's no AI without IA." That means you're not going to get to AI unless you have the right information architecture to start with. >> Can you elaborate though in terms of how businesses can really adopt AI and get started? >> Look, I think there's four things you have to do if you're serious about AI. One is you need a strategy for data acquisition.
Two is you need a modern data architecture. Three is you need pervasive automation. And four is you got to expand job roles in the organization. >> Data acquisition. First pillar in this you just discussed. Can we start there and explain why it's so critical in this process? >> Yeah, so let's think about how data acquisition has evolved through the years. 15 years ago, data acquisition was about how do I get data in and out of my ERP system? And that was pretty much solved. Then the mobile revolution happens. And suddenly you've got structured and non-structured data. More than you've ever dealt with. And now you get to where we are today. You're talking terabytes, petabytes of data. >> [Katie] Yottabytes, I heard that word the other day. >> I heard that too. >> Didn't even know what it meant. >> You know how many zeros that is? >> I thought we were in Star Wars. >> Yeah, I think it's a lot of zeroes. >> Yodabytes, it's new. >> So, it's becoming more and more complex in terms of how you acquire data. So that's the new data landscape that every client is dealing with. And if you don't have a strategy for how you acquire that and manage it, you're not going to get to that AI future. >> So a natural segue, if you are one of these businesses, how do you build for the data landscape? >> Yeah, so the question I always hear from customers is we need to evolve our data architecture to be ready for AI. And the way I think about that is it's really about moving from static data repositories to more of a fluid data layer. >> And we continue with the architecture. New data architecture is an interesting buzz word to hear. But it's also one of the four pillars. So if you could dive in there. >> Yeah, I mean it's a new twist on what I would call some core data science concepts. For example, you have to leverage tools with a modern, centralized data warehouse. But your data warehouse can't be stagnant to just what's right there. 
So you need a way to federate data across different environments. You need to be able to bring your analytics to the data because it's most efficient that way. And ultimately, it's about building an optimized data platform that is designed for data science and AI. Which means it has to be a lot more flexible than what clients have had in the past. >> All right. So we've laid out what you need for driving automation. But where does the machine learning kick in? >> Machine learning is what gives you the ability to automate tasks. And I think about machine learning. It's about predicting and automating. And this will really change the roles of data professionals and IT professionals. For example, a data scientist cannot possibly know every algorithm or every model that they could use. So we can automate the process of algorithm selection. Another example is things like automated data matching. Or metadata creation. Some of these things may not be exciting but they're hugely practical. And so when you think about the real use cases that are driving return on investment today, it's things like that. It's automating the mundane tasks. >> Let's go ahead and come back to something that you mentioned earlier because it's fascinating to be talking about this AI journey, but also significant is the new job roles. And what are those other participants in the analytics pipeline? >> Yeah I think we're just at the start of this idea of new job roles. We have data scientists. We have data engineers. Now you see machine learning engineers. Application developers. What's really happening is that data scientists are no longer allowed to work in their own silo. And so the new job roles is about how does everybody have data first in their mind? And then they're using tools to automate data science, to automate building machine learning into applications. So roles are going to change dramatically in organizations. 
>> I think that's confusing though because we have several organizations who are saying, is that highly specialized roles, just for data science? Or is it applicable to everybody across the board? >> Yeah, and that's the big question, right? 'Cause everybody's thinking how will this apply? Do I want this to be just a small set of people in the organization that will do this? But, our view is data science has to be for everybody. It's about bringing data science to everybody as a shared mission across the organization. Everybody in the company has to be data literate. And participate in this journey. >> So overall, group effort, has to be a common goal, and we all need to be data literate across the board. >> Absolutely. >> Done deal. But at the end of the day, it's kind of not an easy task. >> It's not. It's not easy but it's maybe not as big of a shift as you would think. Because you have to put data in the hands of people that can do something with it. So, it's very basic. Give access to data. Data's often locked up in a lot of organizations today. Give people the right tools. Embrace the idea of choice or diversity in terms of those tools. That gets you started on this path. >> It's interesting to hear you say essentially you need to train everyone though across the board when it comes to data literacy. And I think people that are coming into the work force don't necessarily have a background or a degree in data science. So how do you manage? >> Yeah, so in many cases that's true. I will tell you some universities are doing amazing work here. One example, University of California Berkeley. They offer a course for all majors. So no matter what you're majoring in, you have a course on foundations of data science. How do you bring data science to every role? So it's starting to happen. We at IBM provide data science courses through CognitiveClass.ai. It's for everybody. It's free. 
And look, if you want to get your hands on code and just dive right in, you go to datascience.ibm.com. The key point is this though. It's more about attitude than it is aptitude. I think anybody can figure this out. But it's about the attitude to say we're putting data first and we're going to figure out how to make this real in our organization. >> I also have to give a shout out to my alma mater because I have heard that there is an offering in MS in data analytics. And they are always on the forefront of new technologies and new majors and on trend. And I've heard that the placement behind those jobs, people graduating with the MS is high. >> I'm sure it's very high. >> So go Tigers. All right, tangential. Let me get back to something else you touched on earlier because you mentioned that a number of customers ask you how in the world do I get started with AI? It's an overwhelming question. Where do you even begin? What do you tell them? >> Yeah, well things are moving really fast. But the good thing is most organizations I see, they're already on the path, even if they don't know it. They might have a BI practice in place. They've got data warehouses. They've got data lakes. Let me give you an example. AMC Networks. They produce a lot of the shows that I'm sure you watch Katie. >> [Katie] Yes, Breaking Bad, Walking Dead, any fans? >> [Rob] Yeah, we've got a few. >> [Katie] Well you taught me something I didn't even know. Because it's amazing how we have all these different industries, but yet media in itself is impacted too. And this is a good example. >> Absolutely. So, AMC Networks, think about it. They've got ads to place. They want to track viewer behavior. What do people like? What do they dislike? So they have to optimize every aspect of their business from marketing campaigns to promotions to scheduling to ads. 
And their goal was to transform data into business insights and really take the burden off of their IT team that was heavily burdened by obviously a huge increase in data. So their VP of BI took the approach of using machine learning to process large volumes of data. They used a platform that was designed for AI and data processing. It's the IBM analytics system where it's a data warehouse, data science tools are built in. It has in memory data processing. And just like that, they were ready for AI. And they're already seeing that impact in their business. >> Do you think a movement of that nature kind of presses other media conglomerates and organizations to say we need to be doing this too? >> I think it's inevitable that everybody, you're either going to be playing, you're either going to be leading, or you'll be playing catch up. And so, as we talk to clients we think about how do you start down this path now, even if you have to iterate over time? Because otherwise you're going to wake up and you're going to be behind. >> One thing worth noting is we've talked about analytics to the data. It's analytics first to the data, not the other way around. >> Right. So, look. We as a practice, we say you want to bring analytics to where the data sits. Because it's a lot more efficient that way. It gets you better outcomes in terms of how you train models and it's more efficient. And we think that leads to better outcomes. Other organizations will say, "Hey move the data around." And everything becomes a big data movement exercise. But once an organization has started down this path, they're starting to get predictions, they want to do it where it's really easy. And that means analytics applied right where the data sits. >> And worth talking about the role of the data scientist in all of this. It's been called the hot job of the decade. And Harvard Business Review even dubbed it the sexiest job of the 21st century. >> Yes. >> I want to see this on the cover of Vogue. 
Like I want to see the first data scientist. Female preferred, on the cover of Vogue. That would be amazing. >> Perhaps you can. >> People agree. So what changes for them? Is this challenging in terms of we talk data science for all. Where do all the data science, is it data science for everyone? And how does it change everything? >> Well, I think of it this way. AI gives software super powers. It really does. It changes the nature of software. And at the center of that is data scientists. So, a data scientist has a set of powers that they've never had before in any organization. And that's why it's a hot profession. Now, on one hand, this has been around for a while. We've had actuaries. We've had statisticians that have really transformed industries. But there are a few things that are new now. We have new tools. New languages. Broader recognition of this need. And while it's important to recognize this critical skill set, you can't just limit it to a few people. This is about scaling it across the organization. And truly making it accessible to all. >> So then do we need more data scientists? Or is this something you train like you said, across the board? >> Well, I think you want to do a little bit of both. We want more. But, we can also train more and make the ones we have more productive. The way I think about it is there's kind of two markets here. And we call it clickers and coders. >> [Katie] I like that. That's good. >> So, let's talk about what that means. So clickers are basically somebody that wants to use tools. Create models visually. It's drag and drop. Something that's very intuitive. Those are the clickers. Nothing wrong with that. It's been valuable for years. There's a new crop of data scientists. They want to code. They want to build with the latest open source tools. They want to write in Python or R. These are the coders. And both approaches are viable. Both approaches are critical. 
Organizations have to have a way to meet the needs of both of those types. And there's not a lot of things available today that do that. >> Well let's keep going on that. Because I hear you talking about the data scientists role and how it's critical to success, but with the new tools, data science and analytics skills can extend beyond the domain of just the data scientist. >> That's right. So look, we're unifying coders and clickers into a single platform, which we call IBM Data Science Experience. And as the demand for data science expertise grows, so does the need for these kind of tools. To bring them into the same environment. And my view is if you have the right platform, it enables the organization to collaborate. And suddenly you've changed the nature of data science from an individual sport to a team sport. >> So as somebody that, my background is in IT, the question is really is this an additional piece of what IT needs to do in 2017 and beyond? Or is it just another line item to the budget? >> So I'm afraid that some people might view it that way. As just another line item. But, I would challenge that and say data science is going to reinvent IT. It's going to change the nature of IT. And every organization needs to think about what are the skills that are critical? How do we engage a broader team to do this? Because once they get there, this is the chance to reinvent how they're performing IT. >> [Katie] Challenging or not? >> Look it's all a big challenge. Think about everything IT organizations have been through. Some of them were late to things like mobile, but then they caught up. Some were late to cloud, but then they caught up. I would just urge people, don't be late to data science. Use this as your chance to reinvent IT. Start with this notion of clickers and coders. This is a seminal moment. Much like mobile and cloud was. So don't be late. >> And I think it's critical because it could be so costly to wait. 
And Rob and I were even chatting earlier how data analytics is just moving into all different kinds of industries. And I can tell you even personally being affected by how important the analysis is in working in pediatric cancer for the last seven years. I personally implement virtual reality headsets to pediatric cancer hospitals across the country. And it's great. And it's working phenomenally. And the kids are amazed. And the staff is amazed. But the phase two of this project is putting in little metrics in the hardware that gather the breathing, the heart rate to show that we have data. Proof that we can hand over to the hospitals to continue making this program a success. So just in-- >> That's a great example. >> An interesting example. >> Saving lives? >> Yes. >> That's also applying a lot of what we talked about. >> Exciting stuff in the world of data science. >> Yes. Look, I'd just add this is an existential moment for every organization. Because what you do in this area is probably going to define how competitive you are going forward. And think about if you don't do something. What if one of your competitors goes and creates an application that's more engaging with clients? So my recommendation is start small. Experiment. Learn. Iterate on projects. Define the business outcomes. Then scale up. It's very doable. But you've got to take the first step. >> First step always critical. And now we're going to get to the fun hands on part of our story. Because in just a moment we're going to take a closer look at what data science can deliver. And where organizations are trying to get to. All right. Thank you Rob and now we've been joined by Siva Anne who is going to help us navigate this demo. First, welcome Siva. Give him a big round of applause. Yeah. All right, Rob break down what we're going to be looking at. You take over this demo. >> All right. So this is going to be pretty interesting. So Siva is going to take us through. 
So he's going to play the role of a financial adviser. Who wants to help better serve clients through recommendations. And I'm going to really illustrate three things. One is how do you federate data from multiple data sources? Inside the firewall, outside the firewall. How do you apply machine learning to predict and to automate? And then how do you move analytics closer to your data? So, what you're seeing here is a custom application for an investment firm. So, Siva, our financial adviser, welcome. So you can see at the top, we've got market data. We pulled that from an external source. And then we've got Siva's calendar in the middle. He's got clients on the right side. So page down, what else do you see down there Siva? >> [Siva] I can see the recent market news. And in here I can see that JP Morgan is calling for a US dollar rebound in the second half of the year. And, I have upcoming meeting with Leo Rakes. I can get-- >> [Rob] So let's go in there. Why don't you click on Leo Rakes. So, you're sitting at your desk, you're deciding how you're going to spend the day. You know you have a meeting with Leo. So you click on it. You immediately see, all right, so what do we know about him? We've got data governance implemented. So we know his age, we know his degree. We can see he's not that aggressive of a trader. Only six trades in the last few years. But then where it gets interesting is you go to the bottom. You start to see predicted industry affinity. Where did that come from? How do we have that? >> [Siva] So these green lines and red arrows here indicate the trending affinity of Leo Rakes for particular industry stocks. What we've done here is we've built machine learning models using customer's demographic data, his stock portfolios, and browsing behavior to build a model which can predict his affinity for a particular industry. >> [Rob] Interesting. So, I like to think of this, we call it celebrity experiences. 
So how do you treat every customer like they're a celebrity? So to some extent, we're reading his mind. Because without asking him, we know that he's going to have an affinity for auto stocks. So we go down. Now we look at his portfolio. You can see okay, he's got some different holdings. He's got Amazon, Google, Apple, and then he's got RACE, which is the ticker for Ferrari. You can see that's done incredibly well. And so, as a financial adviser, you look at this and you say, all right, we know he loves auto stocks. Ferrari's done very well. Let's create a hedge. Like what kind of security would interest him as a hedge against his position for Ferrari? Could we go figure that out? >> [Siva] Yes. Given I know that he's got an affinity for auto stocks, and I also see that Ferrari has got some tremendous gains, I want to lock in these gains by hedging. And I want to do that by picking an auto stock which has got negative correlation with Ferrari. >> [Rob] So this is where we get to the idea of in-database analytics. 'Cause you start clicking that and immediately we're getting instant answers of what's happening. So what did we find here? We're going to compare Ferrari and Honda. >> [Siva] I'm going to compare Ferrari with Honda. And what I see here instantly is that Honda has got a negative correlation with Ferrari, which makes it a perfect mix for his stock portfolio. Given he has an affinity for auto stocks and it correlates negatively with Ferrari. >> [Rob] These are very powerful tools at the hand of a financial adviser. You think about it. As a financial adviser, you wouldn't think about federating data, machine learning, pretty powerful. >> [Siva] Yes. So what we have seen here is that using the common SQL engine, we've been able to federate queries across multiple data sources. Db2 Warehouse in the cloud, IBM's Integrated Analytics System, and Hortonworks powered Hadoop platform for the news feeds. 
We've been able to use machine learning to derive innovative insights about his stock affinities. And drive the machine learning into the appliance. Closer to where the data resides to deliver high performance analytics. >> [Rob] At scale? >> [Siva] We're able to run millions of these correlations across stocks, currency, other factors. And even score hundreds of customers for their affinities on a daily basis. >> That's great. Siva, thank you for playing the role of financial adviser. So I just want to recap briefly. 'Cause this is really powerful technology that's really simple. So we federated, we aggregated multiple data sources from all over the web and internal systems. And public cloud systems. Machine learning models were built that predicted Leo's affinity for a certain industry. In this case, automotive. And then you see when you deploy analytics next to your data, even a financial adviser, just with the click of a button is getting instant answers so they can go be more productive in their next meeting. This whole idea of celebrity experiences for your customer, that's available for everybody, if you take advantage of these types of capabilities. Katie, I'll hand it back to you. >> Good stuff. Thank you Rob. Thank you Siva. Powerful demonstration on what we've been talking about all afternoon. And thank you again to Siva for helping us navigate. Should we give him one more round of applause? We're going to be back in just a moment to look at how we operationalize all of this data. But first, here's a message from me. If you're a part of a line of business, your main fear is disruption. You know data is the new gold that can create huge amounts of value. So does your competition. And they may be beating you to it. You're convinced there are new business models and revenue sources hidden in all the data. You just need to figure out how to leverage it. But with the scarcity of data scientists, you really can't rely solely on them. 
You may need more people throughout the organization that have the ability to extract value from data. And as a data science leader or data scientist, you have a lot of the same concerns. You spend way too much time looking for, prepping, and interpreting data and waiting for models to train. You know you need to operationalize the work you do to provide business value faster. What you want is an easier way to do data prep. And rapidly build models that can be easily deployed, monitored and automatically updated. So whether you're a data scientist, data science leader, or in a line of business, what's the solution? What'll it take to transform the way you work? That's what we're going to explore next. All right, now it's time to delve deeper into the nuts and bolts. The nitty gritty of operationalizing data science and creating a data driven culture. How do you actually do that? Well that's what these experts are here to share with us. I'm joined by Nir Kaldero, who's head of data science at Galvanize, which is an education and training organization. Tricia Wang, who is co-founder of Sudden Compass, a consultancy that helps companies understand people with data. And last, but certainly not least, Michael Li, founder and CEO of Data Incubator, which is a data science training company. All right guys. Shall we get right to it? >> All right. >> So data explosion happening right now. And we are seeing it across the board. I just shared an example of how it's impacting my philanthropic work in pediatric cancer. But you guys each have so many unique roles in your business life. How are you seeing it just blow up in your fields? Nir, your thing? >> Yeah, for example like in Galvanize we train many Fortune 500 companies. And just by looking at the demand of companies that wants us to help them go through this digital transformation is mind-blowing. Data point by itself. >> Okay. 
Well what we're seeing what's going on is that data science like as a theme, is that it's actually for everyone now. But what's happening is that it's actually meeting non technical people. But what we're seeing is that when non technical people are implementing these tools or coming at these tools without a base line of data literacy, they're often times using it in ways that distance themselves from the customer. Because they're implementing data science tools without a clear purpose, without a clear problem. And so what we do at Sudden Compass is that we work with companies to help them embrace and understand the complexity of their customers. Because often times they are misusing data science to try and flatten their understanding of the customer. As if you can just do more traditional marketing. Where you're putting people into boxes. And I think the whole ROI of data is that you can now understand people's relationships at a much more complex level at a greater scale than before. But we have to do this with basic data literacy. And this has to involve technical and non technical people. >> Well you can have all the data in the world, and I think it speaks to, if you're not doing the proper movement with it, forget it. It means nothing at the same time. >> No absolutely. I mean, I think that when you look at the huge explosion in data, that comes with it a huge explosion in data experts. Right, we call them data scientists, data analysts. And sometimes they're people who are very, very talented, like the people here. But sometimes you have people who are maybe re-branding themselves, right? Trying to move up their title one notch to try to attract that higher salary. And I think that that's one of the things that customers are coming to us for, right? They're saying, hey look, there are a lot of people that call themselves data scientists, but we can't really distinguish. 
So, we have sort of run a fellowship where we help companies hire from a really talented group of folks, who are also truly data scientists and who know all those kind of really important data science tools. And we also help companies internally. Fortune 500 companies who are looking to grow that data science practice that they have. And we help clients like McKinsey, BCG, Bain, train up their customers, also their clients, also their workers to be more data talented. And to build up those data science capabilities. >> And Nir, this is something you work with a lot. A lot of Fortune 500 companies. And when we were speaking earlier, you were saying many of these companies can be in a panic. >> Yeah. >> Explain that. >> Yeah, so you know, not all Fortune 500 companies are fully data driven. And we know that the winners in this fourth industrial revolution, which I like to call the machine intelligence revolution, will be companies who navigate and transform their organization to unlock the power of data science and machine learning. And the companies that are not like that. Or not utilize data science and predictive power well, will pretty much get shredded. So they are in a panic. >> Tricia, companies have to deal with data behind the firewall and in the new multi cloud world. How do organizations start to become data driven right to the core? >> I think the most urgent question to become data driven that companies should be asking is how do I bring the complex reality that our customers are experiencing on the ground in to a corporate office? Into the data models. So that question is critical because that's how you actually prevent any big data disasters. And that's how you leverage big data. Because when your data models are really far from your human models, that's when you're going to do things that are really far off from how, it's going to not feel right. That's when Tesco had their terrible big data disaster that they're still recovering from. 
And so that's why I think it's really important to understand that when you implement big data, you have to further embrace thick data. The qualitative, the emotional stuff, that is difficult to quantify. But then comes the difficult art and science that I think is the next level of data science. Which is that getting non technical and technical people together to ask how do we find those unknown nuggets of insights that are difficult to quantify? Then, how do we do the next step of figuring out how do you mathematically scale those insights into a data model? So that actually is reflective of human understanding? And then we can start making decisions at scale. But you have to have that first. >> That's absolutely right. And I think that when we think about what it means to be a data scientist, right? I always think about it in these sort of three pillars. You have the math side. You have to have that kind of stats, hardcore machine learning background. You have the programming side. You don't work with small amounts of data. You work with large amounts of data. You've got to be able to type the code to make those computers run. But then the last part is that human element. You have to understand the domain expertise. You have to understand what it is that I'm actually analyzing. What's the business proposition? And how are the clients, how are the users actually interacting with the system? That human element that you were talking about. And I think having somebody who understands all of those and not just in isolation, but is able to marry that understanding across those different topics, that's what makes a data scientist. >> But I find that we don't have people with those skill sets. And right now the way I see teams being set up inside companies is that they're creating these isolated data unicorns. These data scientists that have graduated from your programs, which are great. But, they don't involve the people who are the domain experts. 
They don't involve the designers, the consumer insight people, the people, the salespeople. The people who spend time with the customers day in and day out. Somehow they're left out of the room. They're consulted, but they're not a stakeholder. >> Can I actually >> Yeah, yeah please. >> Can I actually give a quick example? So for example, we at Galvanize train the executives and the managers. And then the technical people, the data scientists and the analysts. But in order to actually see all of the ROI behind the data, you also have to have a creative fluid conversation between non technical and technical people. And this is a major trend now. And there's a major gap. And we need to increase awareness and kind of like create a new, kind of like environment where technical people also talks seamlessly with non technical ones. >> [Tricia] We call-- >> That's one of the things that we see a lot. Is one of the trends in-- >> A major trend. >> data science training is it's not just for the data science technical experts. It's not just for one type of person. So a lot of the training we do is sort of data engineers. People who are more on the software engineering side learning more about the stats and math. And then people who are sort of traditionally on the stat side learning more about the engineering. And then managers and people who are data analysts learning about both. >> Michael, I think you said something that was of interest too because I think we can look at IBM Watson as an example. And working in healthcare. The human component. Because often times we talk about machine learning and AI, and data and you get worried that you still need that human component. Especially in the world of healthcare. And I think that's a very strong point when it comes to the data analysis side. Is there any particular example you can speak to of that? 
>> So I think that there was this really excellent paper a while ago talking about all the neural net stuff and trained on textual data. So looking at sort of different corpuses. And they found that these models were highly, highly sexist. They would read these corpuses and it's not because neural nets themselves are sexist. It's because they're reading the things that we write. And it turns out that we write kind of sexist things. And they would sort of find all these patterns in there that were sort of latent, that had a lot of sort of things that maybe we would cringe at if we sort of saw. And I think that's one of the really important aspects of the human element, right? It's being able to come in and sort of say like, okay, I know what the biases of the system are, I know what the biases of the tools are. I need to figure out how to use that to make the tools, make the world a better place. And like another area where this comes up all the time is lending, right? So the federal government has said, and we have a lot of clients in the financial services space, so they're constantly under these kind of rules that they can't make discriminatory lending practices based on a whole set of protected categories. Race, sex, gender, things like that. But, it's very easy when you train a model on credit scores to pick that up. And then to have a model that's inadvertently sexist or racist. And that's where you need the human element to come back in and say okay, look, you're using the classic example would be zip code, you're using zip code as a variable. But when you look at it, zip code's actually highly correlated with race. And you can't do that. So you may inadvertently by sort of following the math and being a little naive about the problem, inadvertently introduce something really horrible into a model and that's where you need a human element to sort of step in and say, okay hold on. Slow things down. This isn't the right way to go. 
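[Editor's sketch, not part of the talk.] The zip code point above can be made concrete with a small proxy check. This is a minimal sketch under stated assumptions: the function name `proxy_strength`, the zip codes, and the group labels "A" and "B" are all made up for illustration, and the measure used here (how much better you can guess a protected attribute once you know the feature) is one simple choice among many, not the method any of the speakers or their companies use.

```python
from collections import Counter, defaultdict


def proxy_strength(feature_values, protected_values):
    """Return (baseline, with_feature) accuracy for guessing a protected attribute.

    baseline:     always guess the most common protected value overall.
    with_feature: guess the most common protected value within each feature value.
    A large gap suggests the feature acts as a proxy for the protected attribute.
    """
    n = len(protected_values)
    baseline = Counter(protected_values).most_common(1)[0][1] / n

    # Count protected values separately for each distinct feature value.
    per_feature = defaultdict(Counter)
    for feature, protected in zip(feature_values, protected_values):
        per_feature[feature][protected] += 1

    # Correct guesses if we pick the majority protected value per feature value.
    correct = sum(counts.most_common(1)[0][1] for counts in per_feature.values())
    return baseline, correct / n


# Hypothetical toy records: zip code per applicant and a protected group label.
zip_codes = ["10001", "10001", "10001", "10002", "10002", "10002", "10003", "10003"]
group_labels = ["A", "A", "A", "B", "B", "B", "A", "B"]

baseline, with_zip = proxy_strength(zip_codes, group_labels)
print(f"baseline={baseline:.3f} with_zip={with_zip:.3f}")
# prints: baseline=0.500 with_zip=0.875
```

In this toy data, knowing only the zip code lifts the guess from coin-flip accuracy to 87.5%, which is the kind of signal a human reviewer would want to see before a variable like this goes into a lending model.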
>> And the people who have -- >> I feel like, I can feel her ready to respond. >> Yes, I'm ready. >> She's like let me have at it. >> And the people here it is. And the people who are really great at providing that human intelligence are social scientists. We are trained to look for bias and to understand bias in data. Whether it's quantitative or qualitative. And I really think that we're going to have less of these kind of problems if we had more integrated teams. If it was a mandate from leadership to say no data science team should be without a social scientist, ethnographer, or qualitative researcher of some kind, to be able to help see these biases. >> The talent piece is actually the most crucial-- >> Yeah. >> one here. If you look about how to enable machine intelligence in organization there are the pillars that I have in my head which is the culture, the talent and the technology infrastructure. And I believe and I saw in working very closely with the Fortune 100 and 200 companies that the talent piece is actually the most important crucial hard to get. >> [Tricia] I totally agree. >> It's absolutely true. Yeah, no I mean I think that's sort of like how we came up with our business model. Companies were basically saying hey, I can't hire data scientists. And so we have a fellowship where we get 2,000 applicants each quarter. We take the top 2% and then we sort of train them up. And we work with hiring companies who then want to hire from that population. And so we're sort of helping them solve that problem. And the other half of it is really around training. Cause with a lot of industries, especially if you're sort of in a more regulated industry, there's a lot of nuances to what you're doing. And the fastest way to develop that data science or AI talent may not necessarily be to hire folks who are coming out of a PhD program. 
It may be to take folks internally who have a lot of that domain knowledge that you have and get them trained up on those data science techniques. So we've had large insurance companies come to us and say hey look, we hire three or four folks from you a quarter. That doesn't move the needle for us. What we really need is take the thousand actuaries and statisticians that we have and get all of them trained up to become a data scientist and become data literate in this new open source world. >> [Katie] Go ahead. >> All right, ladies first. >> Go ahead. >> Are you sure? >> No please, fight first. >> Go ahead. >> Go ahead Nir. >> So this is actually a trend that we have been seeing in the past year or so that companies kind of like start to look how to upskill and look for talent within the organization. So they can actually move them to become more literate and navigate 'em from analyst to data scientist. And from data scientist to machine learner. So this is actually a trend that is happening already for a year or so. >> Yeah, but I also find that after they've gone through that training in getting people skilled up in data science, the next problem that I get is executives coming to say we've invested in all of this. We're still not moving the needle. We've already invested in the right tools. We've gotten the right skills. We have enough scale of people who have these skills. Why are we not moving the needle? And what I explain to them is look, you're still making decisions in the same way. And you're still not involving enough of the non technical people. Especially from marketing, which is now, the CMO's are much more responsible for driving growth in their companies now. But often times it's so hard to change the old way of marketing, which is still like very segmentation. You know, demographic variable based, and we're trying to move people to say no, you have to understand the complexity of customers and not put them in boxes. 
>> And I think underlying a lot of this discussion is this question of culture, right? >> Yes. >> Absolutely. >> How do you build a data driven culture? And I think that that culture question, one of the ways that comes up quite often in especially in large, Fortune 500 enterprises, is that they are very, they're not very comfortable with sort of example, open source architecture. Open source tools. And there is some sort of residual bias that that's somehow dangerous. So security vulnerability. And I think that that's part of the cultural challenge that they often have in terms of how do I build a more data driven organization? Well a lot of the talent really wants to use these kind of tools. And I mean, just to give you an example, we are partnering with one of the major cloud providers to sort of help make open source tools more user friendly on their platform. So trying to help them attract the best technologists to use their platform because they want and they understand the value of having that kind of open source technology work seamlessly on their platforms. So I think that just sort of goes to show you how important open source is in this movement. And how much large companies and Fortune 500 companies and a lot of the ones we work with have to embrace that. >> Yeah, and I'm seeing it in our work. Even when we're working with Fortune 500 companies, is that they've already gone through the first phase of data science work. Where I explain it was all about the tools and getting the right tools and architecture in place. And then companies started moving into getting the right skill set in place. Getting the right talent. And what you're talking about with culture is really where I think we're talking about the third phase of data science, which is looking at communication of these technical frameworks so that we can get non technical people really comfortable in the same room with data scientists. 
That is going to be the phase, that's really where I see the pain point. And that's why at Sudden Compass, we're really dedicated to working with each other to figure out how do we solve this problem now? >> And I think that communication between the technical stakeholders and management and leadership. That's a very critical piece of this. You can't have a successful data science organization without that. >> Absolutely. >> And I think that actually some of the most popular trainings we've had recently are from managers and executives who are looking to say, how do I become more data savvy? How do I figure out what is this data science thing and how do I communicate with my data scientists? >> You guys made this way too easy. I was just going to get some popcorn and watch it play out. >> Nir, last 30 seconds. I want to leave you with an opportunity to, anything you want to add to this conversation? >> I think one thing to conclude is to say that companies that are not data driven is about time to hit refresh and figure how they transition the organization to become data driven. To become agile and nimble so they can actually see what opportunities from this important industrial revolution. Otherwise, unfortunately they will have hard time to survive. >> [Katie] All agreed? >> [Tricia] Absolutely, you're right. >> Michael, Trish, Nir, thank you so much. Fascinating discussion. And thank you guys again for joining us. We will be right back with another great demo. Right after this. >> Thank you Katie. >> Once again, thank you for an excellent discussion. Weren't they great guys? And thank you for everyone who's tuning in on the live webcast. As you can hear, we have an amazing studio audience here. And we're going to keep things moving. I'm now joined by Daniel Hernandez and Siva Anne. And we're going to turn our attention to how you can deliver on what they're talking about using data science experience to do data science faster. >> Thank you Katie. 
Siva and I are going to spend the next 10 minutes showing you how you can deliver on what they were saying using the IBM Data Science Experience to do data science faster. We'll demonstrate through new features we introduced this week how teams can work together more effectively across the entire analytics life cycle. How you can take advantage of any and all data no matter where it is and what it is. How you could use your favorite tools from open source. And finally how you could build models anywhere and employ them close to where your data is. Remember the financial adviser app Rob showed you? To build an app like that, we needed a team of data scientists, developers, data engineers, and IT staff to collaborate. We do this in the Data Science Experience through a concept we call projects. When I create a new project, I can now use the new GitHub integration feature. We're doing for data science what we've been doing for developers for years. Distributed teams can work together on analytics projects. And take advantage of GitHub's version management and change management features. This is a huge deal. Let's explore the project we created for the financial adviser app. As you can see, our data engineer Joane, our developer Rob, and others are collaborating on this project. Joane got things started by bringing together the trusted data sources we need to build the app. Taking a closer look at the data, we see that our customer and profile data is stored on our recently announced IBM Integrated Analytics System, which runs safely behind our firewall. We also needed macro economic data, which she was able to find in the Federal Reserve. And she stored it in our Db2 Warehouse on Cloud. And finally, she selected stock news data from NASDAQ.com and landed that in a Hadoop cluster, which happens to be powered by Hortonworks. 
We added a new feature to the Data Science Experience so that when it's installed with Hortonworks, it automatically uses the native security and governance controls within the cluster so your data is always secure and safe. Now we want to show you the news data we stored in the Hortonworks cluster. This is the main administrative console. It's powered by an open source project called Ambari. And here's the news data. It's in Parquet files stored in HDFS, which happens to be a distributed file system. To get the data from NASDAQ into our cluster, we used IBM's BigIntegrate and BigQuality to create automatic data pipelines that acquire, cleanse, and ingest that news data. Once the data's available, we use IBM's Big SQL to query that data using SQL statements that are much like the ones we would use for any relational data, including the data that we have in the Integrated Analytics System and Db2 Warehouse on Cloud. This and the federation capabilities that Big SQL offers dramatically simplify data acquisition. Now we want to show you how we support a brand new tool that we're excited about. Since we launched last summer, the Data Science Experience has supported Jupyter and R for data analysis and visualization. In this week's update, we deeply integrated another great open source project called Apache Zeppelin. It's known for having great visualization support, advanced collaboration features, and is growing in popularity amongst the data science community. This is an example of Apache Zeppelin and the notebook we created through it to explore some of our data. Notice how wonderful and easy the data visualizations are. Now we want to walk you through the Jupyter notebook we created to explore our customer preference for stocks. We use notebooks to understand and explore data. To identify the features that have some predictive power. Ultimately, we're trying to assess what is driving customer stock preference. 
Here we did the analysis to identify the attributes of customers that are likely to purchase auto stocks. We used this understanding to build our machine learning model. For building machine learning models, we've always had tools integrated into the Data Science Experience. But sometimes you need to use tools you already invested in. Like our very own SPSS as well as SAS. Through a new import feature, you can easily import those models created with those tools. This helps you avoid vendor lock-in, and simplify the development, training, deployment, and management of all your models. To build the models we used in the app, we could have coded, but we prefer a visual experience. We used our customer profile data in the Integrated Analytics System. Used the Auto Data Preparation to cleanse our data. Chose the binary classification algorithms. Let the Data Science Experience evaluate between logistic regression and gradient boosted tree. It's doing the heavy work for us. As you can see here, the Data Science Experience generated performance metrics that show us that the gradient boosted tree is the best performing algorithm for the data we gave it. Once we save this model, it's automatically deployed and available for developers to use. Any application developer can take this endpoint and consume it like they would any other API inside of the apps they built. We've made training and creating machine learning models super simple. But what about the operations? A lot of companies are struggling to ensure their model performance remains high over time. In our financial adviser app, we know that customer data changes constantly, so we need to always monitor model performance and ensure that our models are retrained as is necessary. This is a dashboard that shows the performance of our models and lets our teams monitor and retrain those models so that they're always performing to our standards. 
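The monitor-and-retrain idea described here can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how the Data Science Experience dashboard is implemented: the class name `ModelMonitor`, the rolling window, and the accuracy threshold are all hypothetical choices made for the example.

```python
from collections import deque


class ModelMonitor:
    """Track accuracy over the most recent predictions (hypothetical helper)."""

    def __init__(self, window=100, threshold=0.8):
        # Keep only the latest `window` outcomes; older ones fall off.
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold

    def record(self, predicted, actual):
        """Store whether one prediction matched the observed outcome."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Rolling accuracy, or None if nothing has been recorded yet."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """Flag retraining once a full window falls below the threshold."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough evidence yet
        return self.accuracy() < self.threshold


monitor = ModelMonitor(window=4, threshold=0.75)
for predicted, actual in [(1, 1), (0, 0), (1, 1), (0, 0), (1, 0), (0, 1)]:
    monitor.record(predicted, actual)
print(monitor.accuracy(), monitor.needs_retraining())
```

Waiting for a full window before raising the flag is a deliberate choice in this sketch, so that a couple of early misses do not trigger a retrain.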
So far we've been showing you the Data Science Experience available behind the firewall that we're using to build and train models. Through a new publish feature, you can build models and deploy them anywhere. In another environment, private, public, or anywhere else with just a few clicks. So here we're publishing our model to the Watson machine learning service. It happens to be in the IBM cloud. And also deeply integrated with our Data Science Experience. After publishing and switching to the Watson machine learning service, you can see that our stock affinity model that we just published is there and ready for use. So this is incredibly important. I just want to say it again. The Data Science Experience allows you to train models behind your own firewall, take advantage of your proprietary and sensitive data, and then deploy those models wherever you want with ease. So to summarize what we just showed you. First, IBM's Data Science Experience supports all teams. You saw how our data engineer populated our project with trusted data sets. Our data scientists developed, trained, and tested a machine learning model. Our developers used APIs to integrate machine learning into their apps. And how IT can use our Integrated Model Management dashboard to monitor and manage model performance. Second, we support all data. On premises, in the cloud, structured, unstructured, inside of your firewall, and outside of it. We help you bring analytics and governance to where your data is. Third, we support all tools. The data science tools that you depend on are readily available and deeply integrated. This includes capabilities from great partners like Hortonworks. And powerful tools like our very own IBM SPSS. And fourth, and finally, we support all deployments. You can build your models anywhere, and deploy them right next to where your data is. Whether that's in the public cloud, private cloud, or even on the world's most reliable transaction platform, IBM z. 
So see for yourself. Go to the Data Science Experience website, take us for a spin. And if you happen to be ready right now, our recently created Data Science Elite Team can help you get started and run experiments alongside you with no charge. Thank you very much. >> Thank you very much Daniel. It seems like a great time to get started. And thanks to Siva for taking us through it. Rob and I will be back in just a moment to add some perspective right after this. All right, once again joined by Rob Thomas. And Rob obviously we got a lot of information here. >> Yes, we've covered a lot of ground. >> This is intense. You got to break it down for me cause I think we zoom out and see the big picture. What better data science can deliver to a business? Why is this so important? I mean we've heard it through and through. >> Yeah, well, I heard it a couple times. But it starts with businesses have to embrace a data driven culture. And it is a change. And we need to make data accessible with the right tools in a collaborative culture because we've got diverse skill sets in every organization. But data driven companies succeed when data science tools are in the hands of everyone. And I think that's a new thought. I think most companies think just get your data scientist some tools, you'll be fine. This is about tools in the hands of everyone. I think the panel did a great job of describing about how we get to data science for all. Building a data culture, making it a part of your everyday operations, and the highlights of what Daniel just showed us, that's some pretty cool features for how organizations can get to this, which is you can see IBM's Data Science Experience, how that supports all teams. You saw data analysts, data scientists, application developer, IT staff, all working together. Second, you saw how we support all tools. And your choice of tools. So the most popular data science libraries integrated into one platform. 
And we saw some new capabilities that help companies avoid lock-in, where you can import existing models created from specialist tools like SPSS or others. And then deploy them and manage them inside of Data Science Experience. That's pretty interesting. And lastly, you see we continue to build on this best of open tools. Partnering with companies like H2O, Hortonworks, and others. Third, you can see how you use all data no matter where it lives. That's a key challenge every organization's going to face. Private, public, federating all data sources. We announced new integration with the Hortonworks data platform where we deploy machine learning models where your data resides. That's been a key theme. Analytics where the data is. And lastly, supporting all types of deployments. Deploy them in your Hadoop cluster. Deploy them in your Integrated Analytic System. Or deploy them in z, just to name a few. A lot of different options here. But look, don't believe anything I say. Go try it for yourself. Data Science Experience, anybody can use it. Go to datascience.ibm.com and look, if you want to start right now, we just created a team that we call Data Science Elite. These are the best data scientists in the world that will come sit down with you and co-create solutions, models, and prove out a proof of concept. >> Good stuff. Thank you Rob. So you might be asking what does an organization look like that embraces data science for all? And how could it transform your role? I'm going to head back to the office and check it out. Let's start with the perspective of the line of business. What's changed? Well, now you're starting to explore new business models. You've uncovered opportunities for new revenue sources and all that hidden data. And being disrupted is no longer keeping you up at night. As a data science leader, you're beginning to collaborate with a line of business to better understand and translate the objectives into the models that are being built. 
Your data scientists are also starting to collaborate with the less technical team members and analysts who are working closest to the business problem. And as a data scientist, you stop feeling like you're falling behind. Open source tools are keeping you current. You're also starting to operationalize the work that you do. And you get to do more of what you love. Explore data, build models, put your models into production, and create business impact. All in all, it's not a bad scenario. Thanks. All right. We are back and coming up next, oh this is a special time right now. Cause we got a great guest speaker. New York Magazine called him the spreadsheet psychic and number crunching prodigy who went from correctly forecasting baseball games to correctly forecasting presidential elections. He even invented a proprietary algorithm called PECOTA for predicting future performance by baseball players and teams. And his New York Times bestselling book, The Signal and the Noise was named by Amazon.com as the number one best non-fiction book of 2012. He's currently the Editor in Chief of the award winning website, FiveThirtyEight and appears on ESPN as an on air commentator. Big round of applause. My pleasure to welcome Nate Silver. >> Thank you. We met backstage. >> Yes. >> It feels weird to re-shake your hand, but you know, for the audience. >> I had to give the intense firm grip. >> Definitely. >> The ninja grip. So you and I have crossed paths kind of digitally in the past, which is really interesting, is I started my career at ESPN. And I started as a production assistant, then later back on air for sports technology. And I go to you to talk about sports because-- >> Yeah. >> Wow, has ESPN upped their game in terms of understanding the importance of data and analytics. And what it brings. Not just to MLB, but across the board. >> No, it's really infused into the way they present the broadcast. You'll have win probability on the bottom line. 
And they'll incorporate FiveThirtyEight metrics into how they cover college football for example. So, ESPN ... Sports is maybe the perfect, if you're a data scientist, like the perfect kind of test case. And the reason being that sports consists of problems that have rules. And have structure. And when problems have rules and structure, then it's a lot easier to work with. So it's a great way to kind of improve your skills as a data scientist. Of course, there are also important real world problems that are more open ended, and those present different types of challenges. But it's such a natural fit. The teams. Think about the teams playing the World Series tonight. The Dodgers and the Astros are both like very data driven, especially Houston. Golden State Warriors, the NBA Champions, extremely data driven. New England Patriots, relative to an NFL team, it's shifted a little bit, the NFL bar is lower. But the Patriots are certainly very analytical in how they make decisions. So, you can't talk about sports without talking about analytics. >> And I was going to save the baseball question for later. Cause we are moments away from game seven. >> Yeah. >> Is everyone else watching game seven? It's been an incredible series. Probably one of the best of all time. >> Yeah, I mean-- >> You have a prediction here? >> You can mention that too. So I don't have a prediction. FiveThirtyEight has the Dodgers with a 60% chance of winning. >> [Katie] LA Fans. >> So you have two teams that are about equal. But the Dodgers pitching staff is in better shape at the moment. The end of a seven game series. And they're at home. >> But the statistics behind the two teams is pretty incredible. >> Yeah. It's like the first World Series in I think 56 years or something where you have two 100 win teams facing one another. There have been a lot of parity in baseball for a lot of years. Not that many offensive overall juggernauts. 
But this year, and last year with the Cubs and the Indians too really. But this year, you have really spectacular teams in the World Series. It kind of is a showcase of modern baseball. Lots of home runs. Lots of strikeouts. >> [Katie] Lots of extra innings. >> Lots of extra innings. Good defense. Lots of pitching changes. So if you love the modern baseball game, it's been about the best example that you've had. If you like a little bit more contact, and fewer strikeouts, maybe not so much. But it's been a spectacular and very exciting World Series. >> It's amazing to talk. MLB is huge with analysis. I mean, hands down. But across the board, if you can provide a few examples. Because there's so many teams in front offices putting such an, just a heavy intensity on the analysis side. And where the teams are going. And if you could provide any specific examples of teams that have really blown your mind. Especially over the last year or two. Because every year it gets more exciting if you will. >> I mean, so a big thing in baseball is defensive shifts. So if you watch tonight, you'll probably see a couple of plays where if you're used to watching baseball, a guy makes really solid contact. And there's a fielder there that you don't think should be there. But that's really very data driven where you analyze where's this guy hit the ball. That part's not so hard. But also there's game theory involved. Because you have to adjust for the fact that he knows where you're positioning the defenders. He's trying therefore to make adjustments to his own swing and so that's been a major innovation in how baseball is played. You know, how bullpens are used too. Where teams have realized that actually having a guy, across all sports pretty much, realizing the importance of rest. And of fatigue. And that you can be the best pitcher in the world, but guess what? After four or five innings, you're probably not as good as a guy who has a fresh arm necessarily. 
So I mean, it really is like, these are not subtle things anymore. It's not just oh, on base percentage is valuable. It really affects kind of every strategic decision in baseball. The NBA, if you watch an NBA game tonight, see how many three point shots are taken. That's in part because of data. And teams realizing hey, three points is worth more than two, once you're more than about five feet from the basket, the shooting percentage gets really flat. And so it's revolutionary, right? Like teams that will shoot almost half their shots from the three point range nowadays. Larry Bird, who wound up being one of the greatest three point shooters of all time, took only eight three pointers his first year in the NBA. It's quite noticeable if you watch baseball or basketball in particular. >> Not to focus too much on sports. One final question. In terms of Major League Soccer, and now in NFL, we're having the analysis and having wearables where it can now showcase if they wanted to on screen, heart rate and breathing and how much exertion. How much data is too much data? And when does it ruin the sport? >> So, I don't think, I mean, again, it goes sport by sport a little bit. I think in basketball you actually have a more exciting game. I think the game is more open now. You have more three pointers. You have guys getting higher assist totals. But you know, I don't know. I'm not one of those people who thinks look, if you love baseball or basketball, and you go in to work for the Astros, the Yankees or the Knicks, they probably need some help, right? You really have to be passionate about that sport. Because it's all based on what questions am I asking? As I'm a fan or I guess an employee of the team. Or a player watching the game. And there isn't really any substitute I don't think for the insight and intuition that a curious human has to kind of ask the right questions. 
So we can talk at great length about what tools do you then apply when you have those questions, but that still comes from people. I don't think machine learning could help with what questions do I want to ask of the data. It might help you get the answers. >> If you have a mid-fielder in a soccer game though, not exerting, only 80%, and you're seeing that on a screen as a fan, and you're saying could that person get fired at the end of the day? One day, with the data? >> So we found that actually some in soccer in particular, some of the better players are actually more still. So Leo Messi, maybe the best player in the world, doesn't move as much as other soccer players do. And the reason being that A) he kind of knows how to position himself in the first place. B) he realizes that you make a run, and you're out of position. That's quite fatiguing. And particularly soccer, like basketball, is a sport where it's incredibly fatiguing. And so, sometimes the guys who conserve their energy, that kind of old school mentality, you have to hustle at every moment. That is not helpful to the team if you're hustling on an irrelevant play. And therefore, on a critical play, can't get back on defense, for example. >> Sports, but also data is moving exponentially as we're just speaking about today. Tech, healthcare, every different industry. Is there any particular that's a favorite of yours to cover? And I imagine they're all different as well. >> I mean, I do like sports. We cover a lot of politics too. Which is different. I mean in politics I think people aren't intuitively as data driven as they might be in sports for example. It's impressive to follow the breakthroughs in artificial intelligence. It started out just as kind of playing games and playing chess and poker and Go and things like that. But you really have seen a lot of breakthroughs in the last couple of years. But yeah, it's kind of infused into everything really. 
>> You're known for your work in politics though. Especially presidential campaigns. >> Yeah. >> This year, in particular. Was it insanely challenging? What was the most notable thing that came out of any of your predictions? >> I mean, in some ways, looking at the polling was the easiest lens to look at it. So I think there's kind of a myth that last year's result was a big shock and it wasn't really. If you did the modeling in the right way, then you realized that number one, polls have a margin of error. And so when a candidate has a three point lead, that's not particularly safe. Number two, the outcome between different states is correlated. Meaning that it's not that much of a surprise that Clinton lost Wisconsin and Michigan and Pennsylvania and Ohio. You know I'm from Michigan. Have friends from all those states. Kind of the same types of people in those states. Those outcomes are all correlated. So what people thought was a big upset for the polls I think was an example of how data science done carefully and correctly where you understand probabilities, understand correlations. Our model gave Trump a 30% chance of winning. Other models gave him a 1% chance. And so that was interesting in that it showed that number one, that modeling strategies and skill do matter quite a lot. When you have someone saying 30% versus 1%. I mean, that's a very very big spread. And number two, that these aren't like solved problems necessarily. Although again, the problem with elections is that you only have one election every four years. So I can be very confident that I have a better model. Even one year of data doesn't really prove very much. Even five or 10 years doesn't really prove very much. And so, being aware of the limitations to some extent intrinsically in elections when you only get one kind of new training example every four years, there's not really any way around that. There are ways to be more robust to sparse data environments. 
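The point about correlated state outcomes can be illustrated with a small simulation. Everything numeric here is an assumption for the sketch and is not FiveThirtyEight's model: a three point lead in each of four states, a four point polling error, and a split of that error into a shared national part and a state-specific part. The function name `upset_probability` is hypothetical.

```python
import random


def upset_probability(lead, total_sd, shared_sd, n_states, trials, seed=0):
    """Estimate the chance the trailing candidate flips every state.

    Each state's polling error is shared + local, where `shared` is one
    draw common to all states and `local` is drawn per state. The local
    spread is chosen so the total error spread per state is `total_sd`.
    """
    rng = random.Random(seed)
    local_sd = (total_sd ** 2 - shared_sd ** 2) ** 0.5
    upsets = 0
    for _ in range(trials):
        shared = rng.gauss(0.0, shared_sd) if shared_sd > 0 else 0.0
        # An upset needs the error to exceed the lead in every state.
        if all(shared + rng.gauss(0.0, local_sd) > lead for _ in range(n_states)):
            upsets += 1
    return upsets / trials


# Same lead and same per-state error; only the correlation differs.
independent = upset_probability(3.0, 4.0, 0.0, n_states=4, trials=20000, seed=1)
correlated = upset_probability(3.0, 4.0, 3.0, n_states=4, trials=20000, seed=1)
print(f"independent errors: {independent:.3f}, correlated errors: {correlated:.3f}")
```

Under these assumptions, treating the states as independent makes a sweep look far less likely than it does when the errors move together, which is the kind of gap that separates a small upset probability from a substantial one.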
But if you're identifying different types of business problems to solve, figuring out what's a solvable problem where I can add value with data science is a really key part of what you're doing. >> You're such a leader in this space. In data and analysis. It would be interesting to kind of peel back the curtain, understand how you operate but also how large is your team? How you're putting together information. How quickly you're putting it out. Cause I think in this right now world where everybody wants things instantly-- >> Yeah. >> There's also, you want to be first too in the world of journalism. But you don't want to be inaccurate because that's your credibility. >> We talked about this before, right? I think on average, speed is a little bit overrated in journalism. >> [Katie] I think it's a big problem in journalism. >> Yeah. >> Especially in the tech world. You have to be first. You have to be first. And it's just pumping out, pumping out. And there's got to be more time spent on stories if I can speak subjectively. >> Yeah, for sure. But at the same time, we are reacting to the news. And so we have people that come in, we hire most of our people actually from journalism. >> [Katie] How many people do you have on your team? >> About 35. But, if you get someone who comes in from an academic track for example, they might be surprised at how fast journalism is. That even though we might be slower than the average website, the fact that there's a tragic event in New York, are there things we have to say about that? A candidate drops out of the presidential race, are there things we have to say about that. In periods ranging from minutes to days as opposed to kind of weeks to months to years in the academic world. The corporate world moves faster. What is a little different about journalism is that you are expected to have more precision where people notice when you make a mistake. In corporations, you have maybe less transparency. 
If you make 10 investments and seven of them turn out well, then you'll get a lot of profit from that, right? In journalism, it's a little different. If you make kind of 10 predictions or say 10 things, and seven of them are very accurate and three of them aren't, you'll still get criticized a lot for the three. Just because that's kind of the way that journalism is. And so the kind of combination of needing, not having that much tolerance for mistakes, but also needing to be fast. That is tricky. And I criticize other journalists sometimes including for not being data driven enough, but the best excuse any journalist has, this is happening really fast and it's my job to kind of figure out in real time what's going on and provide useful information to the readers. And that's really difficult. Especially in a world where literally, I'll probably get off the stage and check my phone and who knows what President Trump will have tweeted or what things will have happened. But it really is a kind of 24/7. >> Well because it's 24/7 with FiveThirtyEight, one of the most well known sites for data, are you feeling micromanagey on your people? Because you do have to hit this balance. You can't have something come out four or five days later. >> Yeah, I'm not -- >> Are you overseeing everything? >> I'm not by nature a micromanager. And so you try to hire well. You try and let people make mistakes. And the flip side of this is that if a news organization that never had any mistakes, never had any corrections, that's raw, right? You have to have some tolerance for error because you are trying to decide things in real time. And figure things out. I think transparency's a big part of that. Say here's what we think, and here's why we think it. If we have a model to say it's not just the final number, here's a lot of detail about how that's calculated. In some case we release the code and the raw data. Sometimes we don't because there's a proprietary advantage. 
But quite often we're saying we want you to trust us and it's so important that you trust us, here's the model. Go play around with it yourself. Here's the data. And that's also I think an important value. >> That speaks to open source. And your perspective on that in general. >> Yeah, I mean, look, I'm a big fan of open source. I worry that I think sometimes the trends are a little bit away from open source. But by the way, one thing that happens when you share your data or you share your thinking at least in lieu of the data, and you can definitely do both is that readers will catch embarrassing mistakes that you made. By the way, even having open sourceness within your team, I mean we have editors and copy editors who often save you from really embarrassing mistakes. And by the way, it's not necessarily people who have a training in data science. I would guess that of our 35 people, maybe only five to 10 have a kind of formal background in what you would call data science. >> [Katie] I think that speaks to the theme here. >> Yeah. >> [Katie] That everybody's kind of got to be data literate. >> But yeah, it is like you have a good intuition. You have a good BS detector basically. And you have a good intuition for hey, this looks a little bit out of line to me. And sometimes that can be based on domain knowledge, right? We have one of our copy editors, she's a big college football fan. And we had an algorithm we released that tries to predict what the human being selection committee will do, and she was like, why is LSU rated so high? Cause I know that LSU sucks this year. And we looked at it, and she was right. There was a bug where it had forgotten to account for their last game where they lost to Troy or something and so -- >> That also speaks to the human element as well. >> It does. 
In general as a rule, if you're designing a kind of regression based model, it's different in machine learning where you have more, when you kind of build in the tolerance for error. But if you're trying to do something more precise, then so much of it is just debugging. It's saying that looks wrong to me. And I'm going to investigate that. And sometimes it's not wrong. Sometimes your model actually has an insight that you didn't have yourself. But fairly often, it is. And I think kind of what you learn is like, hey if there's something that bothers me, I want to go investigate that now and debug that now. Because the last thing you want is where all of a sudden, the answer you're putting out there in the world hinges on a mistake that you made. Cause you never know if you have so to speak, 1,000 lines of code and they all perform something differently. You never know when you get in a weird edge case where this one decision you made winds up being the difference between your having a good forecast and a bad one. In a defensible position and an indefensible one. So we definitely are quite diligent and careful. But it's also kind of knowing like, hey, where is an approximation good enough and where do I need more precision? Cause you could also drive yourself crazy in the other direction where you know, it doesn't matter if the answer is 91.2 versus 90. And so you can kind of go 91.2, three, four and it's like kind of A) false precision and B) not a good use of your time. So that's where I do still spend a lot of time is thinking about which problems are "solvable" or approachable with data and which ones aren't. And when they're not by the way, you're still allowed to report on them. We are a news organization so we do traditional reporting as well. And then kind of figuring out when do you need precision versus when is being pointed in the right direction good enough?
>> I would love to get inside your brain and see how you operate on just like an everyday walking to Walgreens movement. It's like oh, if I cross the street in .2-- >> It's not, I mean-- >> Is it like maddening in there? >> No, not really. I mean, I'm like-- >> This is an honest question. >> If I'm looking for airfares, I'm a little more careful. But no, part of it's like you don't want to waste time on unimportant decisions, right? I will sometimes, if I can't decide what to eat at a restaurant, I'll flip a coin. If the chicken and the pasta both sound really good-- >> That's not high tech Nate. We want better. >> But that's the point, right? It's like both the chicken and the pasta are going to be really darn good, right? So I'm not going to waste my time trying to figure it out. I'm just going to have an arbitrary way to decide. >> Serious and business, how organizations in the last three to five years have just evolved with this data boom. How are you seeing it as from a consultant point of view? Do you think it's an exciting time? Do you think it's a you must act now time? >> I mean, we do know that you definitely see a lot of talent among the younger generation now. That so FiveThirtyEight has been at ESPN for four years now. And man, the quality of the interns we get has improved so much in four years. The quality of the kind of young hires that we make straight out of college has improved so much in four years. So you definitely do see a younger generation for which this is just part of their bloodstream and part of their DNA. And also, particular fields that we're interested in. So we're interested in people who have both a data and a journalism background. We're interested in people who have a visualization and a coding background. A lot of what we do is very much interactive graphics and so forth. And so we do see those skill sets coming into play a lot more. 
And so the kind of shortage of talent that had I think frankly been a problem for a long time, I'm optimistic based on the young people in our office, it's a little anecdotal but you can tell that there are so many more programs that are kind of teaching students the right set of skills that maybe weren't taught as much a few years ago. >> But when you're seeing these big organizations, ESPN as a perfect example, moving more towards data and analytics than ever before. >> Yeah. >> You would say that's obviously true. >> Oh for sure. >> If you're not moving that direction, you're going to fall behind quickly. >> Yeah and the thing is, if you read my book or I guess people have a copy of the book. In some ways it's saying hey, there are a lot of ways to screw up when you're using data. And we've built bad models. We've had models that were bad and got good results. Good models that got bad results and everything else. But the point is that the reason to be out in front of the problem is so you give yourself more runway to make errors and mistakes. And to learn kind of what works and what doesn't and which people to put on the problem. I sometimes do worry that a company says oh we need data. And everyone kind of agrees on that now. We need data science. Then they have some big test case. And they have a failure. And they maybe have a failure because they didn't know really how to use it well enough. But learning from that and iterating on that. And so by the time that you're on the third generation of kind of a problem that you're trying to solve, and you're watching everyone else make the mistake that you made five years ago, I mean, that's really powerful. But that does mean that getting invested in it now, getting invested both in technology and the human capital side is important. >> Final question for you as we run out of time. 2018 beyond, what is your biggest project in terms of data gathering that you're working on? >> There's a midterm election coming up.
That's a big thing for us. We're also doing a lot of work with NBA data. So for four years now, the NBA has been collecting player tracking data. So they have 3D cameras in every arena. So they can actually kind of quantify for example how fast a fast break is, for example. Or literally where a player is and where the ball is. For every NBA game now for the past four or five years. And there hasn't really been an overall metric of player value that's taken advantage of that. The teams do it. But in the NBA, the teams are a little bit ahead of journalists and analysts. So we're trying to have a really truly next generation stat. It's a lot of data. Sometimes I now more oversee things than I once did myself. And so you're parsing through many, many, many lines of code. But yeah, so we hope to have that out at some point in the next few months. >> Anything you've personally been passionate about that you've wanted to work on and kind of solve? >> I mean, the NBA thing, I am a pretty big basketball fan. >> You can do better than that. Come on, I want something real personal that you're like I got to crunch the numbers. >> You know, we tried to figure out where the best burrito in America was a few years ago. >> I'm going to end it there. >> Okay. >> Nate, thank you so much for joining us. It's been an absolute pleasure. Thank you. >> Cool, thank you. >> I thought we were going to chat World Series, you know. Burritos, important. I want to thank everybody here in our audience. Let's give him a big round of applause. >> [Nate] Thank you everyone. >> Perfect way to end the day. And for a replay of today's program, just head on over to ibm.com/dsforall. I'm Katie Linendoll. And this has been Data Science for All: It's a Whole New Game. Test one, two. One, two, three. Hi guys, I just want to quickly let you know as you're exiting. A few heads up. Downstairs right now there's going to be a meet and greet with Nate. 
And we're going to be doing that with clients and customers who are interested. So I would recommend before the game starts, and you lose Nate, head on downstairs. And also the gallery is open until eight p.m. with demos and activations. And tomorrow, make sure to come back too. Because we have exciting stuff. I'll be joining you as your host. And we're kicking off at nine a.m. So bye everybody, thank you so much. >> [Announcer] Ladies and gentlemen, thank you for attending this evening's webcast. If you are not attending all cloud and cognitive summit tomorrow, we ask that you recycle your name badge at the registration desk. Thank you. Also, please note there are two exits on the back of the room on either side of the room. Have a good evening. Ladies and gentlemen, the meet and greet will be on stage. Thank you.

Published Date : Nov 1 2017



Jags Ramnarayan, SnappyData - Spark Summit 2017 - #SparkSummit - #theCUBE

(techno music) >> Narrator: Live from San Francisco, it's theCUBE, covering Spark Summit 2017. Brought to you by Databricks. >> You are watching the Spark Summit 2017 coverage by theCUBE. I'm your host David Goad, and I'm joined by George Gilbert. How you doing George? >> Good to be here. >> And honored to introduce our next guest, the CTO from SnappyData, wow we were lucky to get this guy. >> Thanks for having me >> David: Jags Ramnarayan, Jags thanks for joining us. >> Thanks, thanks for having me. >> And for people who may not be familiar, maybe tell us what does SnappyData do? >> So SnappyData in a nutshell, is taking Spark, which is a compute engine, and in some sense augmenting the guts of Spark so that Spark truly becomes a hybrid database. A single data store that's capable of taking Spark streams, doing transactions, providing mutable state management in Spark, but most importantly being able to turn around, and run analytical queries on that state that is continuously emerging. That's in a nutshell. Let me just say a few things, SnappyData itself is a startup that was spun out of Pivotal. We've been out of Pivotal for roughly about a year, so the technology itself was to a great degree, incubated within Pivotal. It's a product called GemFire within VMware and Pivotal. So we took the guts of GemFire, which is an in-memory database, designed for transactional low-latency, high-concurrency scenarios, and we are sort of fusing it, that's the key thing, fusing it into Spark, so that now Spark becomes significantly richer, not just as a compute platform, but as a store. >> Great, and we know this is not your first Spark Summit, right? How many have you been to? Lost count? >> Boy, let's see, three, four now, Spark Summits, if I include the Spark Summit, this year, four to five. >> Great, so an active part of the community. What were you expecting to learn this year, and have you been surprised by anything?
>> You know, it's always wonderful to see, I mean, every time I come to Spark, it's just a new set of innovations, right? I mean, when I first came to Spark, it was a mix of, let's talk about data frames, all of these, let's optimize my priorities. Today you come, I mean there is such a wide spectrum of amazing new things that are happening. It's just mind boggling. Right from AI techniques, structured streaming, and the real-time paradigm, and sort of this confluence that Databricks brings more to it. How can I create a confluence through a unified mechanism, where it is really brilliant, is what I think. >> Okay, well let's talk about how you're innovating at SnappyData. What are some of the applications or current projects you're working on? >> So a number of things, I mean, GE is an investor in SnappyData. So we're trying to work with GE on the industrial IoT space. We're working with large health care companies, also on their IoT space. So the pattern with SnappyData is one that has a lot of high velocity streams of data emerging where the streams could be, for instance, Kafka streams driving Spark streams, but streams could also be operational databases. Your Postgres instance and your Cassandra database instance, and they're all generating continuous changes to data that's emerging in an operational world, can I suck that in and almost create a replica of that state that might be emerging in the SQL operational environment, and still allow interactive analytics at scale for a number of concurrent users on live data. Not cube data, not pre-aggregated data, but on live data itself, right? Being able to almost give you Google-like speeds to live data. >> George, we've heard people talking about this quite a bit.
>> Yeah, so Jags, as you said upfront, Spark was conceived as sort of a general purpose, I guess, analytic compute engine, and adding DBMS to it, like sort of not bolting it on, but deeply integrating it, so that the core data structures now have DBMS properties, like transactionality, that must make a huge change in the scope of applications that are applicable. Can you describe some of those for us? >> Yeah. The classic paradigm today that we find time and again is the so-called SMACK stack, right? I mean lambda stack, now there's a SMACK stack. Which is really about Spark running on Mesos, but really using Spark streaming as an ingestion capability, and there is continuous state that is emerging that I want to write into Cassandra. So what we find very quickly is that the moment the state is emerging, I want to throw in a business intelligence tool on top and immediately do live dashboarding on that state that is continuously changing and emerging. So what we find is that the first part, which is the high-speed writes, the ability to transform these data sets, cleanse the data sets, get the cleansed data into Cassandra, works really well. What is missing is this ability to say, well, how am I going to get insight? How can I ask you interesting, insightful questions, get responses immediately on that live data, right? And so the common problem there is the moment I have Cassandra working, let's say, with Spark, every time I run an analytical query, you only have two choices. One is use the parallel connector to pull in the data sets from Cassandra, right, and now unfortunately, when you do analytics, you're working with large volumes. And every time I run even a simple query, all of a sudden I could be pulling in 10 gigabytes, 20 gigabytes of data into Spark to run the computation. Hundreds of seconds lost. Nothing like interactive, it's all about batch querying.
So how can I turn around and say that if stuff changes in Cassandra, I can have an immediate real-time reflection of that mutable state in Spark on which I can run queries rapidly. That's a very key aspect to us. >> So you were telling me earlier that you didn't see, necessarily, a need to replace entirely, the Cassandra in the SMACK stack, but to complement it. >> Jags: That's right. >> Elaborate on that. >> So our focus, much like Spark, is all about in-memory, state management in-memory processing. And Cassandra, realistically, is really designed to say how can I scale to petabytes, right, for key value operations, semi-structured data, what have you. So we think there are a number of scenarios where you still want Cassandra to be your store, because in some sense a lot of these guys have already adopted Cassandra in a fairly big way. So you want to say, hey, leave your petabyte-level volume in there, and you can essentially work with the real-time state, which could still be many terabytes of state, essentially in main memory, that's going to work with specializing it. And we're also, I mean I can touch on this approximate query processing technology, which is the other key part here, to say hey, I can't really afford 1,000 cores and 1,000 machines just so that you can do your job really well, so one of the techniques we are adopting, which even the Databricks guys started with BlinkDB, essentially, it's an approximate query processing engine, we have our own approximate query processing engine, as an adjunct, essentially, to our store. What that essentially means is to say, can I take a billion records and synthesize something really, really small, using smart sampling techniques, sketching techniques, essentially statistical structures, that can be stored along with Spark, in Spark memory itself, and fuse it with the Spark Catalyst query engine.
So that as you run your query, we can very smartly figure out, can I use the approximate data structures to answer the questions extremely quickly. Even when the data would be in petabyte volume, I have these data structures that are now taking maybe gigabytes of storage only. >> So hopefully not getting too, too technical, so the Spark Catalyst query optimizer, like an Oracle query optimizer, it knows about the data that it's going to query, only in your case, you're taking what Catalyst knows about Spark, and extending it with what's stored in your native, also Spark native, data structures. >> That's right, exactly. So think about it: an optimizer always takes a query plan and says, here are all the possible plans you can execute, and here is the cost estimate for these plans, we essentially inject more plans into that and hopefully, our plan is even more optimized than the plans that the Spark Catalyst engine came up with. And Spark is beautiful because the Catalyst engine is a very pluggable engine. So you can essentially augment that engine very easily. >> So you've been out in the marketplace, whether in alpha, beta, or now, production, for enough time so that the community is aware of what you've done. What are some of the areas that you're being pulled in that are, that people didn't associate Spark with? >> So more often, we land up in situations where they're looking at SAP HANA, as an example, maybe MemSQL, maybe just Postgres, and all of a sudden, there are these hybrid workloads, which is the Gartner term of HTAP, so there's a lot of HTAP use cases, where we get pulled into. So there's no Spark, but we get pulled into it because we're just a hybrid database. That's how people look at us, essentially. >> Oh, so you pull Spark in because that's just part of your solution. >> Exactly, right. So think about Spark is not just data frames and rich API, but also it has a SQL interface, right. I can essentially execute, SQL, select SQL.
Of course we augment that SQL so that now you can do what you expect from a database, which is an insert, an update, a delete, can I create a view, can I run a transaction? So all of a sudden, it's not just a Spark API but what we provide looks like a SQL database itself. >> Okay, interesting. So tell us, in the work with GE, they're among the first that have sort of educated the world that in that world there's so much data coming off devices, that we have to be intelligent about what we filter and send to cloud, we train models, potentially, up there, we run them closer to the edge, so that we get low latency analytics, but you were telling us earlier that there are alternatives, especially when you have such an intelligent database, working both at the edge and in the cloud. >> Right, so that's a great point. See what's happening with sort of a lot of these machine learning models is that these models are learned on historical data sets. And quite often, especially if you look at predictive maintenance, those class of use cases, in industrial IoT, the patterns could evolve very rapidly, right? Maybe because of climate changes and let's say, for a windmill farm, there are few windmills that are breaking down so rapidly it's affecting everything else, in terms of the power generation. So being able to sort of alter the model itself, incrementally and near real-time, is becoming more and more important. >> David: Wow. >> It's still a fairly academic research kind of area, but for instance, we are working very closely with the University of Michigan to sort of say, can we use some of these approximate techniques to incrementally also learn a model. Right, sort of incrementally augment a model, potentially at the edge, or even inside the cloud, for instance. >> David: Wow.
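[Editor's illustration] The idea of incrementally updating a model as new readings arrive, rather than retraining on the full history, can be sketched roughly as below. This is not SnappyData's, GE's, or the University of Michigan's method; it is a minimal online linear model updated by stochastic gradient descent, and the class name `OnlineLinearModel` and its `learning_rate` parameter are hypothetical names chosen for this sketch.

```python
class OnlineLinearModel:
    """Hypothetical one-feature model y ~ w * x + b, updated one reading at a time."""

    def __init__(self, learning_rate=0.1):
        self.w = 0.0
        self.b = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.w * x + self.b

    def update(self, x, y):
        """Take one gradient step on the squared error for a single (x, y) reading."""
        error = self.predict(x) - y
        self.w -= self.learning_rate * error * x
        self.b -= self.learning_rate * error
        return error


if __name__ == "__main__":
    model = OnlineLinearModel()
    # Simulated sensor stream following y = 2x + 1; each reading nudges the model.
    for i in range(20000):
        x = (i % 100) / 100.0
        model.update(x, 2.0 * x + 1.0)
    print(round(model.w, 3), round(model.b, 3))
```

Each call to `update` touches only the current reading, so the same loop could in principle run at the edge on a stream, with no stored history.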
>> So if you're doing it at the edge, would you be updating the instance of the model associated with that locale, and then would the model in the cloud be sort of like the master, and then that gets pushed down, so that you have an instance and a master? >> That's right. See, most typically what will happen is you have computed a model using a lot of historical data. You typically use supervised techniques to compute a model. And you take that model and inject it, potentially, into the edge, so that it can run that model, which is the easy part; everybody does that. So you continue to do that, right, because you really want the data scientists to be poring through those paradigms, looking at and sort of tweaking those models. But for a certain number of models, even among the models injected into the edge, can I re-tweak that model in an unsupervised way? That is kind of the play we're also venturing into slowly, but that's all in the future. >> But if you're doing it unsupervised, do you need metrics that sort of flag, like, what is the champion and what is the challenger, and figure out-- >> I should say that, I mean, not all of these models can work in this very real-time manner. So, for instance, we've been looking at saying, can we take NBC, the naive Bayes classifier, and essentially do incremental classification, or incrementally learn the model? Clustering approaches can actually be done in an unsupervised way in an incremental fashion. Things like that. There's a whole spectrum of algorithms that really need to be thought through for approximate algorithms to actually apply. So it's still an area of active research. >> Really great discussion, guys. We've just got about a minute to go before the break. Really great stuff, I don't want to interrupt you. But maybe switch real quick to business drivers. Maybe with SnappyData or with other peers you've talked to today: what business drivers do you think are going to affect the evolution of Spark the most?
>> I mean, for us, as a small company, the single biggest challenge we have, it's like what one of you guys, the analysts, said: it's raining databases out there. And the need to constantly educate people on how you can essentially realize a next-generation data pipeline in a very simplified manner is the challenge we are running into, right? I mean, I think the business model for us is primarily about how many people are going to go and say, yes, batch-related analytics is important, but incrementally, for competitive reasons, we want to be playing that real-time analytics game a lot more than before, right? So that's going to be big for us, and hopefully we can play a big part there, along with Spark and Databricks. >> Great, well, we appreciate you coming on the show today and sharing some of the interesting work that you're doing. George, thank you so much, and Jags, thank you so much for being on theCUBE. >> Thanks for having me on, I appreciate it. Thanks, George. >> And thank you all for tuning in. Once again, we have more to come, today and tomorrow, here at Spark Summit 2017. Thanks for watching. (techno music)
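Editor's sketch: earlier in the conversation, the optimizer was described as weighing candidate plans by estimated cost, with extra plans injected that answer from compact approximate structures. The toy below illustrates only that shape of decision. It does not use Spark's Catalyst API and is not SnappyData's implementation; `Plan`, `exact_sum_plan`, `sampled_sum_plan`, and `choose_plan` are hypothetical names, cost is simply the number of rows scanned, and a plain every-k-th-row sample stands in for whatever synopsis structures the product actually keeps. It assumes a non-empty list of numeric rows.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Plan:
    name: str
    cost: float                 # estimated cost, here just rows scanned
    run: Callable[[], float]    # executes the plan and returns the answer
    approximate: bool = False


def exact_sum_plan(rows: List[float]) -> Plan:
    """The baseline plan: scan every row."""
    return Plan("full_scan", cost=len(rows), run=lambda: float(sum(rows)))


def sampled_sum_plan(rows: List[float], step: int) -> Plan:
    """The injected plan: sum every `step`-th row and scale the result back up."""
    sample = rows[::step]
    scale = len(rows) / len(sample)
    return Plan("sample_scan", cost=len(sample),
                run=lambda: sum(sample) * scale, approximate=True)


def choose_plan(plans: List[Plan], allow_approximate: bool) -> Plan:
    """Pick the cheapest plan the caller is willing to accept."""
    candidates = [p for p in plans if allow_approximate or not p.approximate]
    return min(candidates, key=lambda p: p.cost)
```

With 1,000 rows and `step=7`, the sampled plan scans 143 rows instead of 1,000, so it wins on cost whenever the caller accepts an approximate answer; otherwise the full scan is the only candidate. A real system would also attach an error estimate to the approximate answer, which this sketch omits.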

Published Date : Jun 6 2017

SUMMARY :

Brought to you by Databricks. How you doing George? And honored to introduce our next guest, and in some sense augmenting the guts of Spark if I include the Spark Summit, this year, four to five. and have you been surprised by anything? and the real-time paradigm, and sort of this confluence So the part done with SnappyData is one about this quite a bit. so that the core data structures now have DBMS properties, that the moment the state is emerging, the Cassandra in the SMACK stack, but to complement it. So that as you run your query and we can very So you can essentially augment that engine very easily. What are some of the areas that you're being pulled in maybe just Postgres, and all of a sudden, Oh, so you pull Spark in because So all of a sudden, it's not just a Spark API that have sort of educated the world So being able to sort of order the model itself, but for instance, we are working very closely in the cloud be sort of like the master, So you continue to do that, right, because you that really need to be thought through is the challenge we are running into, right. and sharing some of the interesting work that you're doing. And thank you all for tuning in.

ENTITIES

Entity	Category	Confidence
George Gilbert	PERSON	0.99+
David Goad	PERSON	0.99+
George	PERSON	0.99+
University of Michigan	ORGANIZATION	0.99+
1,000 machines	QUANTITY	0.99+
20 gigabytes	QUANTITY	0.99+
GE	ORGANIZATION	0.99+
1,000 cores	QUANTITY	0.99+
10 gigabytes	QUANTITY	0.99+
David	PERSON	0.99+
Spark	TITLE	0.99+
San Francisco	LOCATION	0.99+
SQL	TITLE	0.99+
Spark	ORGANIZATION	0.99+
Jags Ramnarayan	PERSON	0.99+
first	QUANTITY	0.99+
first part	QUANTITY	0.99+
two choices	QUANTITY	0.99+
SAP HANA	TITLE	0.99+
tomorrow	DATE	0.99+
Hundreds of seconds	QUANTITY	0.99+
Gartner	ORGANIZATION	0.99+
this year	DATE	0.99+
Spark Summit 2017	EVENT	0.99+
Jags	PERSON	0.99+
One	QUANTITY	0.98+
today	DATE	0.98+
Today	DATE	0.98+
both	QUANTITY	0.98+
Databricks	ORGANIZATION	0.98+
Spark Summit	EVENT	0.97+
single	QUANTITY	0.97+
Kafka	TITLE	0.97+
Oracle	ORGANIZATION	0.97+
Google	ORGANIZATION	0.96+
about a year	QUANTITY	0.96+
Blink	ORGANIZATION	0.95+
single data	QUANTITY	0.93+
SnappyData	ORGANIZATION	0.93+
Mesos	TITLE	0.91+
three	QUANTITY	0.91+
a billion records	QUANTITY	0.91+
#SparkSummit	EVENT	0.91+
Spark Summits	EVENT	0.9+
four	QUANTITY	0.89+
theCUBE	ORGANIZATION	0.89+
Postgres	TITLE	0.89+
one	QUANTITY	0.88+
Cassandra	TITLE	0.87+
