Supercloud Applications & Developer Impact | Supercloud2

(gentle music) >> Okay, welcome back to Supercloud 2, live here in Palo Alto, California for our live stage performance. Supercloud 2 is our second Supercloud event. We're going to get these out as fast as we can every couple months. It's our second one, you'll see two and three this year. I'm John Furrier, with my co-host, Dave Vellante. We have a panel here to break down the Supercloud momentum, the wave, and the developer impact. We're bringing back Vittorio Viarengo, who's a VP for Cross-Cloud Services at VMware, and Sarbjeet Johal, industry influencer and Analyst at Stackpane, his company, Cube alumni and Influencer. Sarbjeet, great to see you. Vittorio, thanks for coming back. >> Nice to be here. >> My pleasure. >> Vittorio, you just gave a keynote where we unpacked the cross-cloud services, what VMware is doing, how you guys see it, not just from VMware's perspective, but VMware looking out broadly at the industry and developers came up and you were like, "Developers, developers, developers", kind of a goof on the Steve Ballmer famous meme that everyone's seen. This is a huge star, sorry, I mean a big piece of it. The developers are the canary in the coal mines. They're the ones who are being asked to code the digital transformation, which is fully business transformation and with the market the way it is right now in terms of the accelerated technology, every enterprise grade business model's changing. The technology is evolving, the builders are kind of, they want to go faster. I'm saying they're stuck in a way, but that's my opinion, but there's a lot of growth. >> Yeah. >> The impact, they got to get released up and let it go. Those developers need to accelerate faster. It's been a big part of productivity, and the conversations we've had. So developer impact is huge in Supercloud. What's your, what do you guys think about this? We'll start with you, Sarbjeet. >> Yeah, actually, developers are the masons of the digital empires I call 'em, right? 
They lay every brick and build all these big empires. On the left side of the SDLC, or the, you know, when you look at the system operations, developer is number one cost from economic side of things, and from technology side of things, they are tech hungry people. They are developers for that reason because developer nights are long, hours are long, they forget about when to eat, you know, like, I've been a developer, I still code. So you want to keep them happy, you want to hug your developers. We always say that, right? Vittorio said that right earlier. The key is to, in this context, in the Supercloud context, is that developers don't mind mucking around with platforms or APIs or new languages, but they hate the infrastructure part. That's a fact. They don't want to muck around with servers. It's friction for them, it is like they don't want to muck around even with the VMs. So they want the programmability to the nth degree. They want to automate everything, so that's how they think and cloud is the programmable infrastructure, industrialization of infrastructure in many ways. So they are happy with where we are going, and we need more abstraction layers for some developers. By the way, I have this sort of thinking frame for last year or so, not all developers are same, right? So if you are a developer at an ISV, you behave differently. If you are a developer at a typical enterprise, you behave differently or you are forced to behave differently because you're not writing software.- >> Well, developers, developers have changed, I mean, Vittorio, you and I were talking earlier on the keynote, and this is kind of the key point is what is a developer these days? If everything is software enabled, I mean, even hardware interviews we do with Nvidia, and Amazon and other people building silicon, they all say the same thing, "It's software on a chip." So you're seeing the role of software up and down the stack and the role of the stack is changing. 
The old days of full stack developer, what does that even mean? I mean, the cloud is a half a stack kind of right there. So, you know, developers are certainly more agile, but cloud native, I mean VMware is the epitome of operations, IT operations, and the Tanzu initiative, you guys started, you went after the developers to look at them, and ask them questions, "What do you need?", "How do you transform the Ops from virtualization?" Again, back to your point, so this hardware abstraction, what is software, what is cloud native? It's kind of a messy equation these days. How do you guys grapple with that? >> I would argue that developers don't want the Supercloud. I dropped that up there, so, >> Dave: Why not? >> Because developers, they, once they get comfortable in AWS or Google, because they're doing some AI stuff, which is, you know, very trendy right now, or they are in IBM, any of the hyperscalers, professional developers, system developers, they love that stuff, right? Yeah, they don't, the infrastructure gets in the way, but they're just, the problem is, and I think the Supercloud should be driven by the operators because as we discussed, the operators have been left behind because they're busy with day-to-day jobs, and in most cases IT is centralized, developers are in the business units. >> John: Yeah. >> Right? So they get the mandate from the top, say, "Our bank, they're competing against". They gave teenagers or like young people the ability to do all these new things online, and Venmo and all this integration, where are we? "Oh yeah, we can do it", and then build it, and then deploy it, "Okay, we caught up." 
but now the operators are back in the private cloud trying to keep the backend system running and so I think the Supercloud is needed primarily, initially, for the operators to get in front of the developers, fit in the workflow, but lay the foundation so it is secure.- >> So, so I love this thinking because I love the riff, because the riff points to what is the target audience for the value proposition and if you're a developer, Supercloud enables you so you shouldn't have to deal with Supercloud. >> Exactly. >> What you're saying is get the operating environment or operating system done properly, whether it's architecture, building the platform, this comes back to architecture platform conversations. What is the future platform? Is it a vendor supplied or is it customer created platform? >> Dave: So developers want best of breed, is what you just said. >> Vittorio: Yeah. >> Right and operators, they, 'cause developers don't want to deal with governance, they don't want to deal with security, >> No. >> They don't want to deal with spinning up infrastructure. That's the role of the operator, but that's where Supercloud enables, to John's point, the developer, so to your question, is it a platform where the platform vendor is responsible for the architecture, or is it an architectural standard that spans multiple clouds that has to emerge? Based on what you just presented earlier, Vittorio, you are the determinant of the architecture. It's got to be open, but you guys determine that, whereas the nirvana is, "Oh no, it's all open, and it just kind of works." >> Yeah, so first of all, let's all level set on one thing. You cannot tell developers what to do. >> Dave: Right, great >> At least great developers, right? Cannot tell them what to do. >> Dave: So that's what, that's the way I want to sort of, >> You can tell 'em what's possible. 
>> There's a bottle on that >> If you tell 'em what's possible, they'll test it, they'll look at it, but if you try to jam it down their throat, >> Yeah. >> Dave: You can't tell 'em how to do it, just like your point >> Let me answer your answer the question. >> Yeah, yeah. >> So I think we need to build an architect, help them build an architecture, but it cannot be proprietary, has to be built on what works in the cloud and so what works in the cloud today is Kubernetes, is you know, number of different open source project that you need to enable and then provide, use this, but when I first got exposed to Kubernetes, I said, "Hallelujah!" We had a runtime that works the same everywhere only to realize there are 12 different distributions. So that's where we come in, right? And other vendors come in to say, "Hey, no, we can make them all look the same. So you still use Kubernetes, but we give you a place to build, to set those operation policy once so that you don't create friction for the developers because that's the last thing you want to do." >> Yeah, actually, coming back to the same point, not all developers are same, right? So if you're ISV developer, you want to go to the lowest sort of level of the infrastructure and you want to shave off the milliseconds from to get that performance, right? If you're working at AWS, you are doing that. If you're working at scale at Facebook, you're doing that. At Twitter, you're doing that, but when you go to DMV and Kansas City, you're not doing that, right? So your developers are different in nature. They are given certain parameters to work with, certain sort of constraints on the budget side. They are educated at a different level as well. Like they don't go to that end of the degree of sort of automation, if you will. So you cannot have the broad stroking of developers. We are talking about a citizen developer these days. That's a extreme low, >> You mean Low-Code. 
>> Yeah, Low-Code, No-code, yeah, on the extreme side. On one side, that's citizen developers. On the left side is the professional developers, when you say developers, your mind goes to the professional developers, like the hardcore developers, they love the flexibility, you know, >> John: Well, app developers too, I mean. >> App developers, yeah. >> You're right a lot of, >> Sarbjeet: Infrastructure platform developers, app developers, yes. >> But there are a lot of customers, it's a spectrum, you're saying. >> Yes, it's a spectrum >> There's a lot of customers that don't want to deal with that muck. >> Yeah. >> You know, like you said, AWS, Twitter, the sophisticated developers do, but there's a whole suite of developers out there >> Yeah >> That just want tools that are abstracted. >> Within a company, within a company. Like how I see the Supercloud is there shouldn't be anything which blocks the developers, like their view of the world, of the future. Like if you're blocked as a developer, like something comes in front of you, you are not a developer anymore, believe me, (John laughing) so you'll go somewhere else >> John: First of all, I'm, >> You'll leave the company by the way. >> Dave: Yeah, you got to quit >> Yeah, you will quit, you will go where the action is, where there's no sort of blockage there. So like if you put in front of them like a huge amount of a distraction, they don't like it, so they don't, >> Well, the idea of a developer, >> Coming back to that >> Let's get into 'cause you mentioned platform. You hear the term platform engineering now. >> Yeah. >> Platform developer. You know, I remember back in, and I think there's still a term used today, but when I graduated my computer science degree, we were called "Software engineers," right? Do people use that term "Software engineering", or is it "Software development", or are they the same, are they different? >> Well, >> I think there's a, >> So, who's engineering what? 
Are they engineering or are they developing? Or both? >> Well, I think, you made a great point. There is a factor of, I had the, I was blessed to work with Adam Bosworth, that is the guy that created some of the abstraction layer, like Visual Basic and Microsoft Access and he had so, he made his whole career thinking about this layer, and he always talked about the professional developers, the developers that, you know, give him a user manual, maybe just go at the APIs, he'll build anything, right, from system engine, go down there, and then through abstraction, you get the more the procedural logic type of engineers, the people that used to be able to write procedural logic in Visual Basic and so on and so forth. I think those developers right now are a little cut out of the picture. There's some No-code, Low-Code environment that are maybe gaining some traction, I caught up with Adam Bosworth two weeks ago in New York and I asked him "What's happening to these higher level developers?" and you know what he told me, and he is always a little bit out there, so I'm going to use his thought process here. He says, "ChatGPT", I mean, they will get to a point where this high level procedural logic will be written by, >> John: Computers. >> Computers, and so we may not need as many at the high level, but we still need the engineers down there. The point is the operation needs to get in front of them >> But, wait, wait, you seen the ChatGPT meme, I dunno if it's a Dilbert thing where it's like, "Time to tic" >> Yeah, yeah, yeah, I did that >> "Time to develop the code >> Five minutes, time to decode", you know, to debug the codes like five hours. So you know, the whole equation >> Well, this ChatGPT is a hot wave, everyone's been talking about it because I think it illustrates something that's NextGen, feels NextGen, and it's just getting started so it's going to get better. I mean people are throwing stones at it, but I think it's amazing. 
It's the equivalent of me seeing the browser for the first time, you know, like, "Wow, this is really compelling." This is game-changing, it's not just keyword chat bots. It's like this is real, this is next level, and I think the Supercloud wave that people are getting behind points to that and I think the question of Ops and Dev comes up because I think if you limit the infrastructure opportunity for a developer, I think they're going to be handicapped. I mean that's a general, my opinion, the thesis is you give more aperture to developers, more choice, more capabilities, more good things could happen, policy, and that's why you're seeing the convergence of networking people, virtualization talent, operational talent, get into the conversation because I think it's an infrastructure engineering opportunity. I think this is a seminal moment in a new stack that's emerging from an infrastructure, software virtualization, low-code, no-code layer that will be completely programmable by things like the next ChatGPT or something different, but yet still the mechanics and the plumbing will still need engineering. >> Sarbjeet: Oh yeah. >> So there's still going to be more stuff coming on. >> Yeah, we have, with the cloud, we have made the infrastructure programmable and you give the programmability to the programmer, they will be very creative with that and so we are being very creative with our infrastructure now and on top of that, we are being very creative with the silicon now, right? So we talk about that. That's part of it, by the way. So you write the code to the particular silicon now, and on the flip side, the silicon is built for certain use cases for AI Inference and all that. >> You saw this at CES? >> Yeah, I saw at CES, the scenario is this, the Bosch, I spoke to Bosch, I spoke to John Deere, I spoke to AWS guys, >> Yeah. >> They were showcasing their technology there and I spoke to Azure guys as well. So the Bosch is a good example. 
So they are building, they are right now using AWS. I have that interview on camera, I will put it online sometime later. So they're using AWS on the back end now, but Bosch is the number one, number one or number two depending on what day it is of the year, supplier of the componentry to the auto industry, and they are creating a platform for our auto industry, so is Qualcomm actually by the way, with the Snapdragon. So they told me that customers, their customers, BMW, Audi, all the manufacturers, they demand the diversity of the backend. Like they don't want all, they, all of them don't want to go to AWS. So they want the choice on the backend. So whatever they cook in the middle has to work, they have to sprinkle the data for the data sovereignty side because they have Chinese car makers as well, and for, you know, for other reasons, competitive reasons and like use. >> People don't go to, aw, people don't go to AWS either for political reasons or like competitive reasons or specific use cases, but for the most part, generally, I haven't met anyone who hasn't gone first choice with either, but that's me personally. >> No, but they're building. >> Point is the developer wants choice at the back end is what I'm hearing, but then finish that thought. >> Their developers want the choice, they want the choice on the back end, number one, because the customers are asking for, in this case, the customers are asking for it, right? But the customers' requirements actually drive, their economics drives that decision making, right? So in the middle they have to, they're forced to cook up some solution which is vendor neutral on the backend or multicloud in nature. So >> Yeah, >> Every >> I mean I think that's nirvana. I don't think, I personally don't see that happening right now. I mean, I don't see the parity with clouds. So I think that's a challenge. I mean, >> Yeah, true. 
>> I mean the fact of the matter is if the development teams get fragmented, we had this chat with Kit Colbert last time, I think he's going to come on and I think he's going to talk about his keynote in a few, in an hour or so, development teams is this, the cloud is heterogenous, which is great. It's complex, which is challenging. You need skilled engineering to manage these clouds. So if you're a CIO and you go all in on AWS, it's hard. Then to then go out and say, "I want to be completely multi-vendor neutral" that's a tall order on many levels and this is the multicloud challenge, right? So, the question is, what's the strategy for me, the CIO or CISO, what do I do? I mean, to me, I would go all in on one and start getting hedges and start playing and then look at some >> Crystal clear. Crystal clear to me. >> Go ahead. >> If you're a CIO today, you have to build a platform engineering team, no question. 'Cause if we agree that we cannot tell the great developers what to do, we have to create a platform engineering team that using pieces of the Supercloud can build, and let's make this very pragmatic and give examples. First you need to be able to lay down the run time, okay? So you need a way to deploy multiple different Kubernetes environment in depending on the cloud. Okay, now we got that. The second part >> That's like table stakes. >> That are table stake, right? But now what is the advantage of having a Supercloud service to do that is that now you can put a policy in one place and it gets distributed everywhere consistently. So for example, you want to say, "If anybody in this organization across all these different buildings, all these developers don't even know, build a PCI compliant microservice, They can only talk to PCI compliant microservice." Now, I sleep tight. The developers still do that. Of course they're going to get their hands slapped if they don't encrypt some messages and say, "Oh, that should have been encrypted." So number one. 
The second thing I want to be able to say, "This service that this developer built over there better satisfy this SLA." So if the SLA is not satisfied, boom, I automatically spin up multiple instances to satisfy the SLA. Developers unencumbered, they don't even know. So this for me is like, CIO build a platform engineering team using one of the many Supercloud services that allow you to do that and lay down. >> And part of that is that the vendor behavior is such, 'cause the incentive is that they don't necessarily always work together. (John chuckling) I'll give you an example, we're going to hear today from Western Union. They're an AWS shop, but they want to go to Google, they want to use some of Google's AI tools 'cause they're good and maybe they're even arguably better, but they're also a Snowflake customer and what you'll hear from them is Amazon and Snowflake are working together so that SageMaker can be integrated with Snowflake but Google said, "No, you want to use our AI tools, you got to use BigQuery." >> Yeah. >> Okay. So they say, "Ah, forget it." So if you have a platform engineering team, you can maybe solve some of that vendor friction and get competitive advantage. >> I think that the future proximity concept that I talk about is like, when you're doing one thing, you want to do another thing. Where do you go to get that thing, right? So that is very important. Like your question, John, is that your point is that AWS is ahead of the pack, which is true, right? They have the >> breadth of >> Infrastructure by a lot >> infrastructure service, right? The breadth of services, right? So, how do you, when do you bring in other cloud providers, right? So I believe that you should standardize on one cloud provider, like that's your primary, and for others, bring them in on an as-needed basis, in the subsection or sub portfolio of your applications or your platforms, whatever you can. 
>> So yeah, the Google AI example >> Yeah, I mean, >> Or the Microsoft collaboration software example. I mean there's always or the M and A. >> Yeah, but- >> You're going to get to run Windows, you can run Windows on Amazon, so. >> By the way, Supercloud doesn't mean that you cannot do that. So the perfect example is say that you're using Azure because you have a SQL Server intensive workload. >> Yep >> And you're using Google for ML, great. If you are using some differentiated feature of this cloud, you'll have to go somewhere and configure this widget, but what you can abstract with the Supercloud is the lifecycle management of the service that runs on top, right? So how does the service get deployed, right? How do you monitor performance? How do you lifecycle it? How do you secure it? That you can abstract, and that's the value, and eventually value will win. So the customers will find what is the value, abstracting and making it uniform or going deeper? >> How about identity? Like take identity for instance, you know, that's an opportunity to abstract. Whether I use Microsoft Identity or Okta, and I can abstract that. >> Yeah, and then we have APIs and standards that we can use so eventually I think where there is enough pain, the right open source will emerge to solve that problem. >> Dave: Yeah, I can use abstract things like object store, right? That's pretty simple. >> But back to the engineering question though, is that developers, developers, developers, one thing about developers' psychology is if something's not right, they say, "Go get it fixed. I'm not touching it until you fix it." They're very sticky about, if something's not working, they're not going to do it again, right? So you got to get it right for developers. I mean, they'll maybe tolerate something new, but is the "juice worth the squeeze" as they say, right? So you can't go to direct say, "Hey, it's, what's a work in progress? 
We're going to get our infrastructure together and the world's going to be great for you, but just hang tight." They're going to be like, "Get your shit together then talk to me." So I think that to me is the question. It's an Ops question, but where's that value for the developer in Supercloud where the capabilities are there, there's less friction, it's simpler, it solves the complexity problem. I don't need this high skilled labor to manage Amazon. I got services exposed. >> That's what we talked about earlier. It's like the Walmart example. They basically, they took away from the developer the need to spin up infrastructure and worry about all the governance. I mean, it's not completely there yet. So the developer could focus on what he or she wanted to do. >> But there's a big, like in our industry, there's a big sort of flaw or the contention between developers and operators. Developers want to be on the cutting edge, right? And operators want to be on the stability, you know, like we want governance. >> Yeah, totally. >> Right, so they want to control, developers are like these little bratty kids, right? And they want Legos, like they want toys, right? Some of them want toys by the way. They want Legos, they want to build there and they want to make a mess out of it. So you got to make sure. My number one advice in this context is that divvy up your application portfolio and, or your platform portfolio if you are an ISV, right? So if you are an ISV, most probably you're building a platform these days, divvy it up in a way that you can say this portion of our applications and our platform will adhere to what you are saying, standardization, you know, like Kubernetes, like slam dunk, you know, it works across clouds and in your data center hybrid, you know, whole nine yards, but there is some subset on the next door systems of innovation. Everybody has, it doesn't matter if you're DMV of Kansas or you are, you know, metaverse, right? 
Or Meta company, right, which is Facebook, they have it, they are building something new. For that, give them some freedom to choose different things like play with non-standard things. So that is the mantra for moving forward, for any enterprise. >> Do you think developers are happy with the infrastructure now or are they wanting people to get their act together? I mean, what's your reaction, or you think. >> Developers are happy as long as they can do their stuff, which is running code. They want to write code and innovate. So to me, when Ballmer said, "Developers, developers, developers," what he meant was, all you other people get your act together so these developers can do their thing, and to me the Supercloud is the way for IT to get there and let developer be creative and go fast. Why not, without getting in trouble. >> Okay, let's wrap up this segment with a super clip. Okay, we're going to do a sound bite that we're going to make into a short video for each of you >> All right >> On you guys summarizing why Supercloud's important, why this next wave is relevant for the practitioners, for the industry and we'll turn this into an Instagram reel, YouTube short. So we'll call it a "Super clip." >> Alright, >> Sarbjeet, you want, you want some time to think about it? You want to go first? Vittorio, you want. >> I just didn't mind. (all laughing) >> No, okay, okay. >> I'll do it again. >> Go back. No, we got a fresh one. We already got that one in the can. >> I'll go. >> Sarbjeet, you go first. >> I'll go >> What's your super clip? >> In software systems, abstraction is your friend. I always say that. Abstraction is your friend, even if you're super professional developer, abstraction is your friend. We saw from the MFC library from C++ days till today. Abstract, use abstraction. Do not try to reinvent what's already being invented. Leverage cloud, leverage the platform side of the cloud. 
Not just infrastructure service, but platform as a service side of the cloud as well, and Supercloud is a meta platform built on top of these infrastructure services from three or four or five cloud providers. So use that and embrace the programmability, embrace the abstraction layer. That's the key actually, and developers who are true developers or professional developers as you said, they know that. >> Awesome. Great super clip. Vittorio, another shot at the plate here for super clip. Go. >> Multicloud is awesome. There's a reason why multicloud happened, is because it gave our developers the ability to innovate faster than ever before. So if you are embarking on a digital transformation journey, which I call a survival journey, if you're not innovating and transforming, you're not going to be around in business three, five years from now. You have to adopt the Supercloud so the developer can be developer and keep building great, innovative digital experiences for your customers and IT can get in front of it and not get in trouble together. >> Building those super apps with Supercloud. That was a great super clip. Vittorio, thank you for sharing. >> Thanks guys. >> Sarbjeet, thanks for coming on talking about the developer impact Supercloud 2. On our next segment, coming up right now, we're going to hear from Walmart enterprise architect, how they are building and they are continuing to innovate, to build their own Supercloud. Really informative, instructive from a practitioner doing it in real time. Be right back with Walmart here in Palo Alto. Thanks for watching. (gentle music)

Published Date : Feb 17 2023


ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Dave	PERSON	0.99+
BMW	ORGANIZATION	0.99+
Walmart	ORGANIZATION	0.99+
John	PERSON	0.99+
Sarbjeet	PERSON	0.99+
John Furrier	PERSON	0.99+
Bosch	ORGANIZATION	0.99+
Vittorio	PERSON	0.99+
Nvidia	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Audi	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Steve Ballmer	PERSON	0.99+
Qualcomm	ORGANIZATION	0.99+
Adam Bosworth	PERSON	0.99+
Palo Alto	LOCATION	0.99+
Facebook	ORGANIZATION	0.99+
New York	LOCATION	0.99+
Vittorio Viarengo	PERSON	0.99+
Kit Colbert	PERSON	0.99+
Ballmer	PERSON	0.99+
four	QUANTITY	0.99+
Sarbjeet Johal	PERSON	0.99+
five hours	QUANTITY	0.99+
VMware	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Palo Alto, California	LOCATION	0.99+
Microsoft	ORGANIZATION	0.99+
Five minutes	QUANTITY	0.99+
NextGen	ORGANIZATION	0.99+
StackPayne	ORGANIZATION	0.99+
Visual Basic	TITLE	0.99+
second part	QUANTITY	0.99+
12 different distributions	QUANTITY	0.99+
CES	EVENT	0.99+
First	QUANTITY	0.99+
Twitter	ORGANIZATION	0.99+
Kansas City	LOCATION	0.99+
second one	QUANTITY	0.99+
three	QUANTITY	0.99+
both	QUANTITY	0.99+
Kansas	LOCATION	0.98+
first time	QUANTITY	0.98+
Windows	TITLE	0.98+
last year	DATE	0.98+

KubeCon Preview with Madhura Maskasky

(upbeat music) >> Hello, everyone. Welcome to theCUBE here, in Palo Alto, California for a Cube Conversation. I'm John Furrier, host of theCUBE. This is a KubeCon preview conversation. We got a great guest here, in studio, Madhura Maskasky, Co-Founder and VP of Product, Head of Product at Platform9. Madhura, great to see you. Thank you for coming in and sharing this conversation about, this cube conversation about KubeCon, a KubeCon conversation. >> Thanks for having me. >> A light nice play on words there, a little word play, but the fun thing about theCUBE is, we were there at the beginning when OpenStack was kind of on its transition, Kubernetes was just starting. I remember talking to Lou Tucker back in, I think Seattle or some event and Craig McLuckie was still working at Google at the time. And Google was debating on putting the paper out and so much has happened. Being present at creation, you guys have been there too with Platform9. At creation, the Kubernetes wave was not obvious; only a few insiders kind of got the big picture. We were one of 'em. We saw this as a big wave. Docker containers at that time was a unicorn funded company. Now they've gone back to their roots a few years ago. I think four years ago, they went back and recapped and now they're all pure open source. Since then Docker containers and containers have really powered the Kubernetes wave. Combined with the amazing work of the CNCF and KubeCon which we've been covering every year. You saw the maturation, you saw the wave, the early days, end user projects being contributed. Like Envoy's been a huge success. And then the white spaces filling in on the map, you got observability, you've got run time, you got all the things, still some white spaces in there but it's really been great to watch this growth. So I have to ask you, what do you expect this year? You guys have some cutting edge technology. You got Arlon announced and a lot's going on with Kubernetes this year. 
It's going mainstream. You're starting to see the traditional enterprises embrace it, and some are scaling faster than others: managed services, a plethora of choices. What do you expect this year at KubeCon North America in Detroit? >> Yeah, so I think you summarized kind of that life cycle or lifeline of Kubernetes pretty well. I think I remember the times when, just at the very beginning of Kubernetes, after it was released, we were sitting I think with Box, box.com, and they were describing to us why they were early adopters of Kubernetes. And we were just sitting down taking notes, trying to understand this new project and what value it adds, right? And then flash forward to today, where there are Dilbert strips written about Kubernetes. That's how popular it has become. So, I think as that has happened, I think one of the things that's also happened is the enterprises that adopted it relatively early are running it at a massive scale or looking to run it at massive scale. And so I think at-scale cloud-native is going to be the most important theme. At-scale governance, at-scale manageability are going to be top of the mind. And the third factor, I think, that's going to be top of the mind is cost control at scale. >> Yeah, and one of the things that we've seen is that with the incubated projects, a lot more are being incubated now, and you got the combination of end user and company contributed open source. You guys are contributing RLO >> RLO. >> and open source. >> Yeah. >> That's been part of your game plan there. So you guys are no strangers to open source. How do you see this year's momentum? Is it more white space being filled? What's new coming out of the block? What do you think is going to come out of this year? What's rising in terms of traction? What do you see emerging as more notable that might not have been there last year? >> Yeah, so I think it's all about filling that white space, some level of consolidation, et cetera. 
That's usually the trend in the cloud-native space. And I think it's going to continue to be on that and it's going to be tooling that lets users simplify their lives, now that Kubernetes is part of your day to day. And so observability, et cetera, have always been top of the mind, but I think starting this year it's going to be at the next level. Which is, gone are the times of just running your Prometheus at individual cluster level, just to take that as an example. Now you need a solution- >> Yep. >> that operates at this massive scale across different distributions and your edge locations. So, it's taking those same problems but taking them to that next order of management. >> I'm looking at my notes here and I see orchestration and service mesh, which Envoy does. And you're seeing other solutions come out as well, like Linkerd and whatnot. Some are more popular than others. What areas do you see are most needed? If you could go in there and be program chair for a day, and you've got a day job as VP of product at Platform9. So you kind of have to have that future view of the roadmap and looking back at where you've come, what would you want to prioritize if you could bring your VP of product skills to the open source and say, hey, can I point out some needs here? What would you say? >> Yeah, I think just more tooling that lets people make sense of and reduce some of the chaos that this sprawling ecosystem of cloud-native creates. Which is tooling that is not just adding more tooling; covering white space is great, but introducing abilities that let you better manage what you have today is probably absolutely top of the mind. And I think that's really not covered today in terms of tools that are around. >> You know, I've been watching the top five incubated projects in CNCF, Argo cracked the top five. I think they got close to 12,000 GitHub stars. They have a conference now, ArgoCon here in California. What is that about? >> Yeah. 
>> Why is that so popular? I mean, I know it's kind of about obviously workflows and dealing with good pipeline, but why is that so popular right now? >> I think it's very interesting and I think Argo's journey and it's just climbed up in terms of its Github stars for example. And I think it's because as these scale factors that we talk about on one end number of nodes and clusters growing, and on the other end number of sites you're managing grows. I think that CD or continuous deployment of applications it used to kind of be something that you want to get to, it's that north star, but most enterprises wouldn't quite be there. They would either think that they're not ready and it's not needed enough to get there. But now when you're operating at that level of scale and to still maintain consistency without sky rocketing your costs, in terms of ops people, CD almost becomes a necessity. You need some kind of manageable, predictable way of deploying apps without having to go out with new releases that are going out every six months or so you need to do that on a daily basis, even hourly basis. And that's why. >> Scales the theme again, >> Yep. >> back to scale. >> Yep. >> All right, final question. We'll wrap up this preview for KubeCon in Detroit. Whereas we start getting the lay of the land and the focus. If you had to kind of predict the psychology of the developer that's going to be attending in person and they're going to have a hybrid event. So, they will be not as good as being in person. Us, it's going to be the first time kind of post pandemic when I think everyone's going to be together in LA it was a weird time in the calendar and Valencia was the kind of the first international one but this is the first time in North America. So, we're expecting a big audience. >> Mhm. >> If you could predict or what's your view on the psychology of the attendee this year? Obviously pumped to be back. But what do you think they're going to be thinking about? 
What's on their mind? What are they going to be piqued on? What's the focus? Where will be the psychology? Where will be the mindset? What are people going to be looking for this year? If you had to make a prediction on what the attendees are going to be thinking about, what would you say? >> Yeah. So there's always a curiosity in terms of what's new, what new cool tools are coming out that are going to help address some of the gaps. What can I try out? That's always, as I go back to my development roots, first in mind, but then very quickly it comes down to what's going to help me do my job easier, better, faster, at lower cost. And I think again, I keep going back to that theme of automation, declarative automation, automation at scale, governance at scale, these are going to be top of the mind for both developers and ops teams. >> We'll be there covering it like a blanket like we always do from day one, present at creation at KubeCon. We are going to be covering it again for another consecutive year. We love the CNCF. We love what they do. We thank the developers. This year, again, it continues going mainstream, closer and closer to the front lines, as the company is the application, as we say here on theCUBE. We'll be there bringing you all the signal. Thanks for coming in and sharing your thoughts on KubeCon 2022. >> Thank you for having me. >> Okay. I'm John Furrier here in theCUBE in Palo Alto, California. Thanks for watching. (upbeat music)
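
Editor's sketch: the "declarative automation at scale" theme from this conversation can be illustrated with a small reconciliation loop, the general idea behind continuous deployment tools of the kind discussed. This is a minimal sketch under stated assumptions, not how Argo or any Platform9 product is implemented. The names `Action` and `plan_sync`, the cluster names, and the version strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    """One step needed to bring a cluster in line with the desired state."""
    cluster: str
    app: str
    operation: str  # "deploy", "upgrade", or "remove"
    version: Optional[str] = None


def plan_sync(desired, actual):
    """Compare the desired app versions with what each cluster reports.

    desired: dict mapping app name -> version that should run everywhere.
    actual:  dict mapping cluster name -> {app name -> running version}.
    Returns the list of actions needed, in a stable (sorted) order.
    """
    actions = []
    for cluster in sorted(actual):
        running = actual[cluster]
        # Anything declared but missing or out of date gets deployed or upgraded.
        for app, version in sorted(desired.items()):
            if app not in running:
                actions.append(Action(cluster, app, "deploy", version))
            elif running[app] != version:
                actions.append(Action(cluster, app, "upgrade", version))
        # Anything running that is no longer declared gets removed.
        for app in sorted(running):
            if app not in desired:
                actions.append(Action(cluster, app, "remove"))
    return actions
```

Because the desired state is declared once and the plan is computed per cluster, adding a hundredth cluster or edge site adds no manual steps, which is the consistency-without-extra-ops-headcount argument made above.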

Published Date : Sep 7 2022


ENTITIES

Entity	Category	Confidence
Madhura Maskasky	PERSON	0.99+
Craig McLuckie	PERSON	0.99+
John Furrier	PERSON	0.99+
Lou Tucker	PERSON	0.99+
California	LOCATION	0.99+
LA	LOCATION	0.99+
North America	LOCATION	0.99+
Madhura	PERSON	0.99+
Detroit	LOCATION	0.99+
Google	ORGANIZATION	0.99+
last year	DATE	0.99+
first time	QUANTITY	0.99+
Argo	ORGANIZATION	0.99+
Palo Alto, California	LOCATION	0.99+
four years ago	DATE	0.99+
KubeCon	EVENT	0.99+
Seattle	LOCATION	0.99+
one	QUANTITY	0.98+
this year	DATE	0.98+
both	QUANTITY	0.98+
third factor	QUANTITY	0.98+
today	DATE	0.98+
Platform9	ORGANIZATION	0.97+
12,000	QUANTITY	0.97+
a day	QUANTITY	0.97+
Kubernetes	TITLE	0.97+
Prometheus	TITLE	0.97+
Linkerd	ORGANIZATION	0.95+
CNCF	ORGANIZATION	0.95+
first	QUANTITY	0.94+
KubeCon 2022	EVENT	0.93+
Dilbert	PERSON	0.93+
theCUBE	ORGANIZATION	0.92+
wave	EVENT	0.91+
day one	QUANTITY	0.91+
few years ago	DATE	0.9+
Valencia	LOCATION	0.9+
ArgoCon	EVENT	0.9+
five incubated projects	QUANTITY	0.87+
KubeCon	ORGANIZATION	0.86+
Envoy	ORGANIZATION	0.85+
first international	QUANTITY	0.84+
six months	QUANTITY	0.81+
Kubecon	ORGANIZATION	0.74+
top five	QUANTITY	0.71+
Docker	ORGANIZATION	0.69+
north star	LOCATION	0.63+
Arlo	TITLE	0.54+
pandemic	EVENT	0.52+
Github	ORGANIZATION	0.5+
OpenStack	TITLE	0.49+
Kubernetes	PERSON	0.47+
GitHub	TITLE	0.47+
Cube	EVENT	0.3+

Nik Kalyani, WhenHub & TryCrypto | DevNet Create 2019

(lively pop music) >> Live from Mountain View, California. It's the Cube covering DevNet Create 2019. Brought to you by Cisco. >> Okay welcome back everyone, we're here at day two coverage live of coverage at Mountain View, Cube coverage of Cisco's DevNet Create. I'm John Furrier, your host, where all the action is in the creation side of two communities, DevNet, Cisco developers and then the open cloud native world entrepreneurship coming together to create products. Our next guest is Nik Kalyani, co-founder of WhenHub and TryCrypto. He's a builder, he's a creator, he's an entrepreneur. Welcome to The Cube, thanks for coming on >> John, thanks for having me. >> You just gave a long talk so I'll let you breathe a little bit. You're an entrepreneur, you're an inventor, you see things early. You got a lot of your hands on lots of good stuff here. This is the perfect place for you to be giving talks and hanging out. >> Absolutely I love the fact that people are here to learn. They're here to find out about the new innovative things that they can experience hands-on. I just gave a workshop on smart contracts, on the blockchain and I loved the questions I got and the energy that's there. >> What sort of questions were you getting? What was the interest? Where are people going at it? Because networking's a supply chain problem you can almost imagine applying blockchain to networking constructs. >> Yeah absolutely, you know blockchain is one of those technologies that is misunderstood quite a bit and some of the questions I got really helped me, help reinforce that. Ultimately, what I was trying to do is make sure that people understand that blockchain is not a solution for everything. There are certain things where there are scenarios where there are multiple un-trusted parties where blockchain is great, but otherwise it's just a slow database. 
So you want to make sure that you use it in the right scenarios and supply chain is a very common example where it's used, especially private blockchains. >> If latency's not a concern blockchain might be a solution if other things line up. Great point, I'm glad you brought that up. I want to just ask you because your profile as a person you're a visionary, you see things early. The part of the show here that's interesting is it's not like there's this research kind of thinking, although researchers tends to think about the waves coming. It's about what's here and now and what's coming but it's also making things real and creating. So a lot of the conversations are fun, exploratory, discovery orientated but also there's a lot of reality kind of grounded in it. You know entrepreneurs make some mistakes if you're too early, you're misunderstood for a long time. It's got to be a little bit early at the right time, timing's everything. Talk about the dynamic of timing and building and creating with big waves that are coming. You got cloud, you got blockchain, you got AI, you got machine learning. Talk about this dynamic. >> Absolutely, yeah so timing is so important, especially when you have start-ups right? You could have the greatest technology and maybe the market's not ready for it and so yeah it fails. My first start up was like that. I created something that the market was not ready for but fortunately the stuff I'm working on the market is ready for. So I think one of the things that developers, engineers can do is really look at how not necessarily how a technology is being marketed but what the adoption rate is. If there are more people jumping on it, and a good way to look at that is to look at GitHub and see how many people are creating samples, boilerplates, how many people are writing blog posts et cetera. That I think is a better indicator of whether a technology is ready for prime time or if it's just all vaporware. 
>> Tell me about what you're working on now. You're working on some very interesting projects. Where are they? What's the status, size of the team, collaborative open source? What's going on? >> So I have two start-ups I'm working on. The first one is called WhenHub. So we have a product called Interface that allows anyone to be an expert on any topic, and promote themselves through the platform. And it allows anyone who's looking for expertise on any topic to find them and then pay for them and do a video call, get their questions answered, and the whole transaction is handled via blockchain with either our cryptocurrency or you can use Apple Pay or Google Pay. So we launched a few months ago, we have about 75,000 users, it's growing very fast. We are just at the point right now where we are trying to scale up. Our crypto token is called WHEN token. It's listed on five different exchanges. So that's one thing. While building that product, one thing became very clear to me. Mainstream users have a very challenging time with using anything blockchain or cryptocurrency related. And it's through no fault of theirs; the ecosystem has been created for developers by developers, and the tools lack empathy for the users. And that led me to create an open source project called TryCrypto. The mission is to create free open source content and tools to make blockchain and cryptocurrency more accessible to users. >> To mainstream, not the killer dorks and the guys coding. >> Yeah, we want it to be like non-technical folks. >> Is it the wallet that's the problem or is it just overall too techy? >> You know what John, the very word wallet is the problem. (John laughs) Because it gives this idea that there's something within it. As we were talking earlier, you know about blockchain, there's nothing in a wallet. It's just a placeholder for all of your addresses, right? 
So in fact, I'm trying to solve that problem with a new tool I've created called Photoblock, where I use a photo and emojis to replace that. Yes, wallets are problems. The fact that it requires you to have all these parts in place before you can do anything useful, that's a big problem also. People really need to step back and look at the user experience and say, what are the friction points and how can we eliminate them, and that needs to happen before blockchain and cryptocurrency can have mass adoption. >> Talk about the choice of smart contract language used. Ethereum, which was the hottest, development oriented, the most traction. A lot of ICOs kind of watered that down, it's still under 300. Other ones are emerging, NEO, EOS, a bunch of other ones. It seems to be kind of like a NASCAR race, one's in the lead, someone's coming up. How do you look at that marketplace as other developers start to kick the tires? As people start building these real-world apps, is that important to have a selector? Does it matter? What's your thoughts on selection? >> That's a great question. I think going back to what I said about how to evaluate a technology. You can see that Ethereum still continues to be the leader, by far. So while EOS and other blockchains have what appears to be a lot of momentum, if you dig down below the surface you don't find as much. So I continue to remain a big fan of Ethereum. Which doesn't mean I don't care for the other blockchains, but I find that right now Serenity and Ethereum are a good way to move forward. I think EOS is also a good platform to build on, but I think their developer tools need to reach some level of maturity. On Ethereum, the folks that have created the Truffle stack, the Truffle and Ganache packages, have done a great service for developers because they make them so simple and easy. Something like that needs to evolve. 
>> Yeah, and your point earlier, I think it's important to know for the developers out there: don't confuse the protocol and the token selection on smart contracts with blockchain. Again, you don't have to do everything on blockchain 'cause it's a slow database. You're doing smart contracts, which doesn't really require a lot of overhead. I mean it's a contract, it does. You want to have it reliable, but you're not doing zillions of contracts per second. The IOPS are not that high. >> Yeah, actually smart contracts is also a very misunderstood term. In fact, someone asked me, is it legal contracts or medical contracts, what is it? A smart contract is really just an application. Program code that runs on the virtual machine on the blockchain. They call it a contract because once it's out there it's immutable. Which means the rules are defined, known and fixed and can't be changed. So when you create a smart contract, really what you're doing is handling a very small amount of data that you want to persist forever that runs with some rules. >> And in a decentralized world, as we call it in our community, it's a digital handshake. You agreed that we would do this, there it is, it's un-hackable. What are the cool things you're working on? What else you got? Open source project's awesome. You got a lot going on. Life's good. >> Life is good. As I mentioned, Photoblock is the thing that I'm really excited about. Another app that we are building is called Public Record. The problem we are solving there is that in areas where there is strife, or maybe there's dictators et cetera, sometimes when you have people who have photos of some crime occurring or some event occurring, they are reluctant to share it because it could be traced back and have adverse consequences. With Public Record we are building a smart contract driven blockchain app, where you can just take a photo and it will push that photo on to IPFS. 
Which stands for the InterPlanetary File System, which is a decentralized file system. It will anonymize the photo. It will strip all the stuff that your camera puts on there like GPS, the camera model et cetera. It'll manipulate that photo and it will then put a hash of that on the blockchain and make it available by location. So you can go to any location look at all the photos that people have taken there that are completely anonymous and impossible to track back to the >> And what about tampering proof? You have origination data, you strip out the real origination data, that's really important for some of these countries where people get killed for sharing or trying to get the backdoor out of the country for political revolution or just simply I don't want anybody to know. How about tamper proof? >> It is, it's on IPFS, which is immutable file system. What we also do is we manipulate the colors and tones of the photo a little bit so it's impossible to even use AI to go back and reverse engineer and figure out who created the photo. The location, the time and the actual content of the photo is not tampered. So Public Record will do that. >> Just a little quick Q and A on your company. Did you do an ICO, did you finance it yourself? >> With WhenHub we did do an ICO, but it was at a time when the market was at its bare things so our ICO was moderately successful. In addition to the ICO funds, we are primarily funded by one of my co-founders, Scott Adams, the creator of the Dilbert comic strip. We are doing quite well. >> He's a cool guy to hang out with, huh? >> He is. >> Never a dull moment? >> Never a dull moment, I learn quite a bit. >> Congratulations. How do people find out how to hang out with you? You got some good things going on here. Where do you hang out? What do you do for fun? What events do you go to? What's going on with you? >> I'm on Twitter quite a bit. >> Say your Twitter handle. >> It's @techbubble. I'm there. I like to blog. 
on TryCrypto and also my own personal blog. I go to meet-up events here in Silicon Valley and I do make an effort to speak at least five to six conferences each year. >> Aim it forward. >> Yep. >> A lot more action going on in crypto and token economics not just from an ICO standpoint always been some negative scams out there and global fraud, but generally, blockchain and token economics is real and getting more traction and soon I think it will be clearer. Your thoughts on that, if you could share your perspective in terms of the opportunities around those two areas. >> Like any other new and exciting technology goes through the hype cycle, they've gone through that now. I think there's really two types of people in this ecosystem. The ones that are focused on the cryptocurrency and the pricing around it et cetera. But I'd really like to separate that from the blockchain aspect of it. Blockchain is a very real technology, it's a really different technology that the world has never seen before. Yes, it's very true that not everything is a good candidate for the blockchain. But there are many, many scenarios where there are multiple un-trusted parties that are excellent for blockchain. I think what needs to happen is persons in leadership position need to really evaluate: what are the scenarios where there are un-trusted entities involved? And limit their blockchain involvement, test pilots, all of that they're more likely to see more success. Versus just throwing blockchain into it, replace the database, 'cause that's guaranteed to be a fail. >> Nik, great to have you on. I totally agree with you. The team here we were in Puerto Rico, we've been in the Bahamas, we've been Toronto we've been to all the blockchain events. Consensus is coming up in New York. We might be there, May 14th. Patrick, getting ready to head down to New York. Maybe go down there. Great to have your perspective. 
Great to see the blockchain conversation coming in here as the emerging tech and the creation here at DevNet Create continues. Thanks for coming out. >> Thank you so much. I appreciate you having me here. >> More Cube coverage here coming live here at Mountain View after this short break. (pop music plays)
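
Editor's sketch: the Public Record flow described in this conversation (strip identifying camera metadata, hash the photo, record the hash by location in an append-only store) can be illustrated in a few lines. This is a minimal sketch under stated assumptions, not WhenHub's or TryCrypto's actual code. The names `IDENTIFYING_KEYS`, `strip_metadata`, `content_hash`, and `PublicLedger` are hypothetical, the in-memory ledger is only a stand-in for a smart contract, and a plain SHA-256 digest stands in for a real IPFS content identifier, which is computed differently.

```python
import hashlib
from collections import defaultdict

# Hypothetical metadata keys that could identify the photographer or device.
IDENTIFYING_KEYS = {"gps", "camera_model", "device_id", "owner"}


def strip_metadata(photo):
    """Return a copy of the photo record without identifying metadata."""
    kept = {k: v for k, v in photo["metadata"].items() if k not in IDENTIFYING_KEYS}
    return {"pixels": photo["pixels"], "metadata": kept}


def content_hash(data: bytes) -> str:
    """Fingerprint the content; any change to the bytes changes the digest."""
    return hashlib.sha256(data).hexdigest()


class PublicLedger:
    """Append-only index of photo hashes by location (stand-in for a chain)."""

    def __init__(self):
        self._by_location = defaultdict(list)

    def publish(self, location, photo):
        """Anonymize the photo, hash it, and record the hash under a location."""
        clean = strip_metadata(photo)
        digest = content_hash(clean["pixels"])
        self._by_location[location].append(digest)
        return digest

    def photos_at(self, location):
        """Return the recorded hashes for a location as a read-only tuple."""
        return tuple(self._by_location.get(location, ()))
```

The ledger exposes no update or delete method, which mirrors the "rules are defined, known and fixed" property attributed to smart contracts above. The color and tone manipulation mentioned in the interview is deliberately left out of this sketch.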

Published Date : Apr 26 2019


ENTITIES

Entity	Category	Confidence
Nik Kalyani	PERSON	0.99+
Scott Adams	PERSON	0.99+
John	PERSON	0.99+
Puerto Rico	LOCATION	0.99+
Silicon Valley	LOCATION	0.99+
Patrick	PERSON	0.99+
New York	LOCATION	0.99+
May 14th	DATE	0.99+
Bahamas	LOCATION	0.99+
John Furrier	PERSON	0.99+
Cisco	ORGANIZATION	0.99+
Toronto	LOCATION	0.99+
TryCrypto	ORGANIZATION	0.99+
Mountain View, California	LOCATION	0.99+
WhenHub	ORGANIZATION	0.99+
Mountain View	LOCATION	0.99+
two communities	QUANTITY	0.99+
two types	QUANTITY	0.99+
two areas	QUANTITY	0.98+
five different exchanges	QUANTITY	0.98+
about 75,000 users	QUANTITY	0.98+
Nik	PERSON	0.98+
one	QUANTITY	0.98+
under 300	QUANTITY	0.98+
one thing	QUANTITY	0.97+
Google Pay	TITLE	0.97+
Apple Pay	TITLE	0.97+
six conferences	QUANTITY	0.97+
Photoblock	TITLE	0.96+
2019	DATE	0.95+
Public Record	TITLE	0.93+
first one	QUANTITY	0.91+
first start	QUANTITY	0.89+
DevNet Create	TITLE	0.88+
Ethereum	ORGANIZATION	0.86+
each year	QUANTITY	0.85+
TryCrypto	TITLE	0.84+
two start-ups	QUANTITY	0.84+
NASCAR	ORGANIZATION	0.83+
Twitter	ORGANIZATION	0.8+
@techbubble	ORGANIZATION	0.8+
DevNet	ORGANIZATION	0.8+
zillions of contracts per second	QUANTITY	0.79+
few months ago	DATE	0.79+
DevNet Create 2019	TITLE	0.79+
Ethereum	TITLE	0.78+
Dilbert	PERSON	0.75+
Photoblock	ORGANIZATION	0.75+
GitHub	ORGANIZATION	0.73+
EO	TITLE	0.71+
day two coverage	QUANTITY	0.68+
DevNet Create	ORGANIZATION	0.67+
Serenity	ORGANIZATION	0.63+
InterPlanetary File System	OTHER	0.61+
Opensource	ORGANIZATION	0.57+
co	QUANTITY	0.54+
least five	QUANTITY	0.5+
Cube	COMMERCIAL_ITEM	0.42+
Cube	ORGANIZATION	0.37+
Cube	TITLE	0.36+
Cube	PERSON	0.31+

Marc Farley, Vulcancast - Google Next 2017 - #GoogleNext17 - #theCUBE

>> Narrator: Live from the Silicon Valley, it's theCUBE. (bright music) Covering Google Cloud Next 17. >> Hi, and welcome to the second day of live coverage here of theCUBE covering Google Next 2017. We're at the heart of Silicon Valley here at our 4,500 square foot new studio in Palo Alto. We've got a team of reporters and analysts up in San Francisco checking out everything that's happening in Google. I was up there for the day two keynote, and happy to have with me is the first guest of the day, friend of theCUBE, Marc Farley, Vulcancast, guy that knows clouds, worked for one of the big three in the past and going to help me break down some of what's going on in the marketplace. Marc, it's great to see you. >> Oh, it's really nice to be here, Stu, thanks for asking me on. >> Always happy to have you-- >> And what a lot of fun stuff to get into. >> Oh my god, yeah, this is what we love. We talked about, I wonder, Amazon re:Invent is like the Super Bowl of the industry there. What's Google there if, you know-- >> Well, Google pulls a lot of resources for this. And they can put on a very impressive show. So if re:Invent is the Super Bowl, then maybe this, maybe Next is the college championship game. I hate to call it college, but it's got that kind of draw, it's a big deal. >> Is it that, I don't want to say, arena football, it's the up and coming-- >> Oh, it's a lot better than that. Google really does some spectacular things at events. >> They're Google, come on, we all use Google, we all know Google, 10,000 people showed up, there's a lot of excitement. So what's your take of the show so far in Google's positioning in cloud? >> It's nothing like the introduction of Glass. And of course, Google Glass is a thing of the past, but I don't know if you remember when they introduced that, when they had the sky diver. Sky divers diving out of an airplane and then climbing up the outside of the building and all that, it was really spectacular. 
Nobody can ever reach that mark again, probably not even the Academy Awards. But you asked the second part of the question, what's Google position with cloud, I think that's going to be the big question moving forward. They are obviously committed to doing it, and they're bringing unique capabilities into cloud that you don't see from either Amazon or Microsoft. >> Yeah. I mean, coming into it, there's certain things that we've been hearing forever about Google, and especially when you talk about Google in the enterprise. Are they serious, is this just beta, are they going to put the money in? I thought Eric Schmidt did a real good job yesterday in the close day keynote, he's like, "Look, I've been telling Google to push hard "in the enterprise for 17 years. "Look, I signed a check for 30 billion dollars." >> 30 billion! >> Yeah, and I talked to some people, they're a little skeptical, and they're like, "Oh, you know, that's not like it all went to build "the cloud, some of it's for their infrastructure, "there's acquisitions, there's all these other things." But I think it was infrastructure related. Look, there shouldn't be a question that they're serious. And Diane Greene said, in a Q&A she had with the press, that thing about, we're going to tinker with something and then kill it, I want to smash that perception because there's certain things you can do in the consumer side that you cannot get away with on the enterprise side, and she knows that, they're putting a lot of effort to transform their support, transform the pricing, dig in with partners and channels. And some of it is, you know, they've gotten the strategy together, they've gotten the pieces together, we're moving things from beta to GA, and they're making good progress. I think they have addressed some of the misperceptions, that being said, everybody usually, it's like, "I've been hearing this for five years, "it's probably going to take me a couple of years "to really believe it." 
>> Yeah, but you know, the things is, for people that know Diane Greene and have watched VMware over the years, and then her being there at Google is a real commitment. And she's talking about commitment when she talks about that business. It's full pedal to the metal, this is a very serious, the things that's interesting about it, it's a lot more than infrastructure as a service. >> Yeah. >> The kinds of APIs and apps and everything that they're bringing, this is a lot more than just infrastructure, this is Google developed, Google, if you will, proprietary technology now that they're turning to the external world to use. And there's some really sophisticated stuff in there. >> Yes, so before we get into some of the competitive landscape, some of the things you were pretty impressed with, I think everybody was, the keynote this morning definitely went out much better, day one keynote, a little rocky. Didn't hear, the biggest applauses were around some of the International Women's Day, which is great that they do that, but it's nice when they're like, "Oh, here's some cool new tech," or they're like, oh, wow, this demo that they're doing, some really cool things and products that people want to get their hands on. So what jumped out at you at the keynote this morning? >> I'm trying to remember what it's called. The stuff from around personal identifiable information. >> Yeah, so that's what they call DLP or it's the Data Loss Prevention API. Thank goodness for my Evernote here, which I believe runs on Google cloud, keeping up to date, so I'm-- >> Data loss prevention shouldn't be so hard to remember. >> And by the way, you said proprietary stuff. One thing about Google is, that Data Loss Prevention, it's an API, they want to make it easy to get in, a lot of what they do is open source. They feel that that's one of their differentiations, is to be, we always used to say on the infrastructure side, it's like everybody's pumping their chest. 
Who's more open than everybody else? Google. Lots of cool stuff, everything from the TensorFlow and Kubernetes that's coming out, where some of us are like, "Okay, how will they actually make money on some of this, "will it be services?" But yeah, Data Loss Prevention API, which was a really cool demo. It's like, okay, here's a credit card, the video kind of takes it and it redacts the number. It can redact social security numbers, it's got that kind of machine learning AI with the video and all those things built in to try to help security encrypt and protect what you're doing. >> It's mind boggling. You think about, they do the facial recognition, but they're doing content recognition also. And you could have a string of numbers there that might not be a phone number, it might not be a social security number, and the question is, what DLP flagged that to, who knows, it doesn't really matter. What matters is that they can actually do this. And as a storage person, you're getting involved, and compliance and risk and mitigation, all these kinds of things over the years. And it's hard for software to go in and scan a lot of data to just look for text. Not images of numbers on a photograph, but just text in a document, whether it's a Word file or something. And you say, "Oh, it's not so hard," but when you try to do that at scale, it's really hard at scale. And that's the thing that I really wonder about DLP, are they going to be able to do this at large scale? And you have to think that that is part of the consideration for them, because they are large scale. And if they can do that, Stu, that is going to be wildly impressive. >> Marc, everything that Google does tends to be built for scale, so you would think they could do that. And I'd think about all the breaches, it was usually, "Oh, oops, we didn't realize we had this information, "didn't know where it was," or things like that. 
So if Google can help address that, they're looking at some of those core security issues they talked about, they've got second-factor authentication with a little USB tab that can go into your computer, end to end encryption if you've got Android and Chrome devices, so a lot of good sounding things on encryption and security. >> One of the other things they announced, I don't know if this was part of the same thinking, but they talk about 64 core servers, and they talk about, or VMs, I should say, 64 core VMs, and they're talking about getting the latest and greatest from Intel. What is it, Skylink, Sky-- >> Stu: Skylake. >> Skylake, yeah, thanks. >> They had Raejeanne actually up on stage, Raejeanne Skillern, Cube alum, know her well, was happy to see her up on stage showing off what they're doing. Not only just the chipset, but Intel's digging in, doing development on Kubernetes, doing development on TensorFlow to really help with performance. And we've seen Intel do this, they did this with virtualization with the extensions that they did, they're doing it with containers. Intel gets involved in these software pieces and makes sure that the chipset's going to be optimized, and great to see them working with Google on it. >> My guess is they're going to be using a lot of cycles for these security things also. The security is really hard, it's front and center in our lives these days, and just everything. I think Google's making a really interesting play, they take their own internal technology, this security technology that they've been using, and they know it's compute heavy. The whole thing about DLP, it's extremely compute heavy to do this stuff. Okay, let's get the biggest, fastest technology we can to make it work, and then maybe it can all seem seamless. I'm really impressed with how they've figured out to take the assets that they have in different places, like from YouTube. 
These other things that you would think, is YouTube really an enterprise app? No, but there's technology in YouTube that you can use for enterprise cloud services. Very smart, I give them a lot of credit for looking broadly throughout their organization which, in a lot of respects, traditionally has been a consumer oriented experience, and they're taking some of these technologies now and making it available to enterprise. It's really, really hard. >> Absolutely. They did a bunch of enhancements on the G Suite product line. It felt at times a little bit, it's like, okay, wait, I've got the cloud and I've got the applications. There are places that they come together, places that data and security flow between them, but it still feels like a couple of different parts, and how they put together the portfolio, but building a whole solution for the enterprise. We see similar things from Microsoft, not as much from Amazon. I'm curious what your take is as to how Google stacks up against Microsoft who, disclaimer, you did work for one time on the infrastructure side. >> Yeah, that's a whole interesting thing. Google really wants to try to figure out how to get enterprises that run on Microsoft technology moving to Google cloud, and I think it's going to be very tough for them. Satya Nadella and Microsoft are very serious about making a seamless experience for end users and administrators and everybody along managing the systems and using their systems. Okay, can Google replicate that? Maybe on the user side they can, but certainly not on the administration side. And there are hooks between the land-based technology and the cloud-based technology that Microsoft's been working on for years. Question is, can Google come close to replicating those kinds of things, and on Microsoft's side, do customers get enough value, is there enough magic there to make that automation of a hybrid IT experience valuable to their customers. 
I just have to think though that there's no way Google's going to be able to beat Microsoft at hybrid IT for Microsoft apps. I just don't believe it. >> Yeah, it's interesting. I think one of the not so secret weapons that Google has there is what they're doing with Kubernetes. They've gotten Kubernetes in all the public clouds, it's getting into a lot of on-premises environments. Everything from we were at the KubeCon conference in Seattle a couple of months ago. I hear DockerCon and OpenStack Summit are going to have strong Kubernetes discussions there, and it's growing, it's got a lot of buzz, and that kind of portability and mobility of workload has been something that, especially as guys that have storage background, we have a little bit of skepticism because physics and the size of data and that whole data gravity thing. But that being said, if I can write applications and have ways to be able to do similar things across multiple environments, that gives Google a way to spread their wings beyond what they can do in their Google cloud. So I'm curious what you think about containers, Kubernetes, serverless type activity that they're doing. >> I think within the Google cloud, they'll be able to leverage that technology pretty effectively. I don't think it's going to be very effective, though, in enterprise data centers. I think the OpenStack stuff's been a really hard road, and it's a long time coming, I don't know if they'll ever get there. So then you've got a company like Microsoft that is working really hard on the same thing. It's not clear to me what Microsoft's orchestrator is going to be, but they're going to have one. >> Are you bullish on Azure Stack that's coming out later this year? >> No, not really. >> Okay. >> I think Azure Stack's a step in the right direction, and Microsoft absolutely has to have it, not so much for Google, but for AWS, to compete with AWS. I think it's a good idea, but it's such a constrained system at this point. 
It's going to take a while to see what it is. You're going to have HPE and Lenovo and Cisco, all have, and Dell, all having the same basic thing. And so you ask yourself, what is the motivation for any of these companies to really knock it out of the park when Microsoft is nailing everybody's feet to the floor on what the options are to offer this? And I understand Microsoft wanting to play it safe and saying, "We want to be able to support this thing, "make sure that, when customers install it, "they don't have problems with it." And Microsoft always wants to foist the support burden onto somebody else anyway, we've all been working for Microsoft our whole lives. >> It was the old Dilbert cartoon, as soon as you open that software, you're all of a sudden Microsoft's pool boy. >> (laughs) I love that, yeah. Azure Stack's going to be pretty constrained, and they keep pushing it further out. So what's the reality of this? And Azure Pack right now is a zombie, everybody's waiting for Azure Stack, but Azure Stack keeps moving out and Azure Stack's going to be small and constrained. This stuff is hard. There's a reason why it's taking everybody a long time to get it out, there's a reason why OpenStack hasn't had the adoption that people first expected, there's going to be a reason why I think Azure Stack does not have the adoption that Microsoft hoped for either. It's going to be an interesting thing to watch over what will play out over the next five or six years. >> Yeah, but for myself, I've seen this story play out a few times on the infrastructure side. I remember the original precursor, the Vblock with Acadia and the go-to-market. 
VMware, when they did the VSAN stuff, the generation one of Evo really went nowhere, and they had to go, a lot of times it takes 18 to 24 months to sort out some of those basic pricing, packaging, partnering, positioning type things, and even though Azure Stack's been coming for a while, I want to say TP3 is like here, and we're talking about it, and it's going to GA this summer, but it's once we really start getting this customer environment, people start selling it, that we're going to find out what it is and what it isn't. >> It's interesting. You know how important that technology is to Microsoft. It's, in many respects, Satya's baby. And it's so important to them, and at the same time, it's not there, it's not coming, it's going to be constrained. >> So Marc, unfortunately, you and I could talk all day about stuff like this, and we've had many times, at conferences, that we spend a long time. I want to give you just the final word. Wrap up the intro for today on what's happening at Google Next and what's interesting you in the industry. >> Well, I think the big thing here is that Google is showing that they put their foot down and they're not letting up. They're serious about this business, they made this commitment. And we sort of talk and we give lip service, a little bit, to the big three, we got Azure, we got Amazon, and then there's Google. I think every year it's Google does more, and they're proving themselves as a more capable cloud service provider. They're showing the integration with HANA is really interesting, SAP, I should say, not HANA but SAP. They're going after big applications, they've got big customers. Every year that they do this, it's more of an arrival. And I think, in two years time, that idea of the big three is actually going to be big three. It's not going to be two plus one. 
And that is going to accelerate more of the movement into cloud faster than ever, because the options that Google is offering are different than the others, these are all different clouds with different strengths. Of the three of them, Google, I have to say, has the most, if you will, computer science behind it. It's not that Microsoft doesn't have it, but Google is going to have a lot more capability in machine learning than I think what you're going to see out of Amazon ever. They are just going to take off and run with that, and Microsoft is going to have to figure out how they're going to try to catch up or how they're going to parlay what they have in machine learning. It's not that they haven't made an investment in it, but it's not like Google has made investment in it. Google's been making investment in it over the years to support their consumer applications on Google. And now that stuff is coming, like I said before, the stuff is coming into the enterprise. I think there is a shift now, and we sort of wonder, is machine learning going to happen, when it's going to happen? It's going to happen, and it's going to come from Google. >> All right, well, great way to end the opening segment here. Thank you so much, Marc Farley, for joining us. We've got a full day of coverage here from our 4,500 square foot studio in the heart of Silicon Valley. You're watching theCUBE. (bright music)

Published Date : Mar 9 2017



Next-Generation Analytics Social Influencer Roundtable - #BigDataNYC 2016 #theCUBE

>> Narrator: Live from New York, it's the Cube, covering Big Data New York City 2016. Brought to you by headline sponsors, Cisco, IBM, NVIDIA, and our ecosystem sponsors, now here's your host, Dave Vellante. >> Welcome back to New York City, everybody, this is the Cube, the worldwide leader in live tech coverage, and this is a Cube first, we've got a nine person, actually eight person panel of experts, data scientists, all alike. I'm here with my co-host, James Kobielus, who has helped organize this panel of experts. James, welcome. >> Thank you very much, Dave, it's great to be here, and we have some really excellent brain power up there, so I'm going to let them talk. >> Okay, well thank you again-- >> And I'll interject my thoughts now and then, but I want to hear them. >> Okay, great, we know you well, Jim, we know you'll do that, so thank you for that, and appreciate you organizing this. Okay, so what I'm going to do to our panelists is ask you to introduce yourself. I'll introduce you, but tell us a little bit about yourself, and talk a little bit about what data science means to you. A number of you started in the field a long time ago, perhaps data warehouse experts before the term data science was coined. Some of you started probably after Hal Varian said it was the sexiest job in the world. (laughs) So think about how data science has changed and or what it means to you. We're going to start with Greg Piatetsky, who's from Boston. A Ph.D., KDnuggets, Greg, tell us about yourself and what data science means to you. >> Okay, well thank you Dave and thank you Jim for the invitation. Data science in a sense is the second oldest profession. 
I think people have this built-in need to find patterns and whatever we find we want to organize the data, but we do it well on a small scale, but we don't do it well on a large scale, so really, data science takes our need and helps us organize what we find, the patterns that we find that are really valid and useful and not just random, I think this is a big challenge of data science. I've actually started in this field before the term Data Science existed. I started as a researcher and organized the first few workshops on data mining and knowledge discovery, and the term data mining became less fashionable, became predictive analytics, now it's data science and it will be something else in a few years. >> Okay, thank you, Yves Mulkers, Yves, I of course know you from Twitter. A lot of people know you as well. Tell us about your experiences and what data science means to you. >> Well, data science to me is if you take the two words, the data and the science, the science it holds a lot of expertise and skills there, it's statistics, it's mathematics, it's understanding the business and putting that together with the digitization of what we have. It's not only the structured data or the unstructured data what you store in the database try to get out and try to understand what is in there, but even video what is coming on and then trying to find, like Greg already said, the patterns in there and bringing value to the business but looking from a technical perspective, but still linking that to the business insights and you can do that on a technical level, but then you don't know yet what you need to find, or what you're looking for. >> Okay great, thank you. Craig Brown, Cube alum. How many people have been on the Cube actually before? >> I have. >> Okay, good. I always like to ask that question. So Craig, tell us a little bit about your background and, you know, data science, how has it changed, what's it all mean to you? 
>> Sure, so I'm Craig Brown, I've been in IT for almost 28 years, and that was obviously before the term data science, but I've evolved from, I started out as a developer. And evolved through the data ranks, as I called it, working with data structures, working with data systems, data technologies, and now we're working with data pure and simple. Data science to me is an individual or team of individuals that dissect the data, understand the data, help folks look at the data differently than just the information that, you know, we usually use in reports, and get more insights on, how to utilize it and better leverage it as an asset within an organization. >> Great, thank you Craig, okay, Jennifer Shin? Math is obviously part of being a data scientist. You're good at math I understand. Tell us about yourself. >> Yeah, so I'm a senior principal data scientist at the Nielsen Company. I'm also the founder of 8 Path Solutions, which is a data science, analytics, and technology company, and I'm also on the faculty in the Master of Information and Data Science program at UC Berkeley. So math is part of it, I teach statistics for data science actually this semester, and I think for me, I consider myself a scientist primarily, and data science is a nice day job to have, right? Something where there's industry need for people with my skill set in the sciences, and data gives us a great way of being able to communicate sort of what we know in science in a way that can be used out there in the real world. I think the best benefit for me is that now that I'm a data scientist, people know what my job is, whereas before, maybe five, ten years ago, no one understood what I did. Now, people don't necessarily understand what I do now, but at least they understand kind of what I do, so it's still an improvement. >> Excellent. Thank you Jennifer. 
Joe Caserta, you're somebody who started in the data warehouse business, and saw that snake swallow a basketball and grow into what we now know as big data, so tell us about yourself. >> So I've been doing data for 30 years now, and I wrote the Data Warehouse ETL Toolkit with Ralph Kimball, which is the best selling book in the industry on preparing data for analytics, and with the big paradigm shift that's happened, you know for me the past seven years has been, instead of preparing data for people to analyze data to make decisions, now we're preparing data for machines to make the decisions, and I think that's the big shift from data analysis to data analytics and data science. >> Great, thank you. Miriam, Miriam Friedel, welcome. >> Thank you. I'm Miriam Friedel, I work for Elder Research, we are a data science consultancy, and I came to data science, sort of through a very circuitous route. I started off as a physicist, went to work as a consultant and software engineer, then became a research analyst, and finally came to data science. And I think one of the most interesting things to me about data science is that it's not simply about building an interesting model and doing some interesting mathematics, or maybe wrangling the data, all of which I love to do, but it's really the entire analytics lifecycle, and a value that you can actually extract from data at the end, and that's one of the things that I enjoy most is seeing a client's eyes light up or a wow, I didn't really know we could look at data that way, that's really interesting. I can actually do something with that, so I think that, to me, is one of the most interesting things about it. >> Great, thank you. Justin Sadeen, welcome. >> Absolutely, thank you, thank you. So my name is Justin Sadeen, I work for Morph EDU, an artificial intelligence company in Atlanta, Georgia, and we develop learning platforms for non-profit and private educational institutions. 
So I'm a Marine Corps veteran turned data enthusiast, and so what I think about data science is the intersection of information, intelligence, and analysis, and I'm really excited about the transition from big data into smart data, and that's what I see data science as. >> Great, and last but not least, Dez Blanchfield, welcome mate. >> Good day. Yeah, I'm the one with the funny accent. So data science for me is probably the funniest job I've ever had to describe to my mom. I've had quite a few different jobs, and she's never understood any of them, and this one she understands the least. I think a fun way to describe what we're trying to do in the world of data science and analytics now is it's the equivalent of high altitude mountain climbing. It's like the extreme sport version of the computer science world, because we have to be this magical unicorn of a human that can understand plain English problems from C-suite down and then translate it into code, either solo or as teams of developers. And so there's this black art that we're expected to be able to transmogrify from something that we just in plain English say I would like to know X, and we have to go and figure it out, so there's this neat extreme sport view I have of rushing down the side of a mountain on a mountain bike and just dodging rocks and trees and things occasionally, because invariably, we do have things that go wrong, and they don't quite give us the answers we want. But I think we're at an interesting point in time now with the explosion in the types of technology that are at our fingertips, and the scale at which we can do things now, once upon a time we would sit at a terminal and write code and just look at data and watch it in columns, and then we ended up with spreadsheet technologies at our fingertips. 
Nowadays it's quite normal to instantiate a small high performance distributed cluster of computers, effectively a super computer in a public cloud, and throw some data at it and see what comes back. And we can do that on a credit card. So I think we're at a really interesting tipping point now where this coinage of data science needs to be slightly better defined, so that we can help organizations who have weird and strange questions that they want to ask, tell them solutions to those questions, and deliver on them in, I guess, a commodity deliverable. I want to know xyz and I want to know it in this time frame and I want to spend this amount of money to do it, and I don't really care how you're going to do it. And there's so many tools we can choose from and there's so many platforms we can choose from, it's this little black art of computing, if you'd like, we're effectively making it up as we go in many ways, so I think it's one of the most exciting challenges that I've had, and I think I'm pretty sure I speak for most of us in that we're lucky that we get paid to do this amazing job. That we get to make up on a daily basis in some cases. >> Excellent, well okay. So we'll just get right into it. I'm going to go off script-- >> Do they have unicorns down under? I think they have some strange species right? >> Well we put the pointy bit on the back. You guys have it on the front. >> So I was at an IBM event on Friday. It was a chief data officer summit, and I attended what was called the Data Divas' breakfast. It was a women in tech thing, and one of the CDOs, she said that 25% of chief data officers are women, which is much higher than you would normally see in the profile of IT. We happen to have 25% of our panelists are women. Is that common? Miriam and Jennifer, is that common for the data science field? Or is this a higher percentage than you would normally see-- >> James: Or a lower percentage? 
>> I think certainly for us, we have hired a number of additional women in the last year, and they are phenomenal data scientists. I don't know that I would say, I mean I think it's certainly typical that this is still a male-dominated field, but I think like many male-dominated fields, physics, mathematics, computer science, I think that that is slowly changing and evolving, and I think certainly, that's something that we've noticed in our firm over the years at our consultancy, as we're hiring new people. So I don't know if I would say 25% is the right number, but hopefully we can get it closer to 50. Jennifer, I don't know if you have... >> Yeah, so I know at Nielsen we have actually more than 25% of our team is women, at least the team I work with, so there seems to be a lot of women who are going into the field. Which isn't too surprising, because with a lot of the issues that come up in STEM, one of the reasons why a lot of women drop out is because they want real world jobs and they feel like they want to be in the workforce, and so I think this is a great opportunity with data science being so popular for these women to actually have a job where they can still maintain that engineering and science view background that they learned in school. >> Great, well Hillary Mason, I think, was the first data scientist that I ever interviewed, and I asked her what are the sort of skills required and the first question that we wanted to ask, I just threw other women in tech in there, 'cause we love women in tech, is about this notion of the unicorn data scientist, right? It's been put forth that the skill sets required to be a data scientist are so numerous that it's virtually impossible to have a data scientist with all those skills. >> And I love Dez's extreme sports analogy, because that plays into the whole notion of data science, we like to talk about the theme now of data science as a team sport. 
Must it be an extreme sport is what I'm wondering, you know. The unicorns of the world seem to be... Is that realistic now in this new era? >> I mean when automobiles first came out, they were concerned that there wouldn't be enough chauffeurs to drive all the people around. Is there an analogy with data, to be a data-driven company. Do I need a data scientist, and does that data scientist, you know, need to have these unbelievable mixture of skills? Or are we doomed to always have a skill shortage? Open it up. >> I'd like to have a crack at that, so it's interesting, when automobiles were a thing, when they first brought cars out, and before they, sort of, were modernized by the likes of Ford's Model T, when we got away from the horse and carriage, they actually had human beings walking down the street with a flag warning the public that the horseless carriage was coming, and I think data scientists are very much like that. That we're kind of expected to go ahead of the organization and try and take the challenges we're faced with today and see what's going to come around the corner. And so we're like the little flag-bearers, if you'd like, in many ways of this is where we're at today, tell me where I'm going to be tomorrow, and try and predict the day after as well. It is very much becoming a team sport though. But I think the concept of data science being a unicorn has come about because the coinage hasn't been very well defined, you know, if you were to ask 10 people what a data scientist was, you'd get 11 answers, and I think this is a really challenging issue for hiring managers and C-suites when they generally say I want data science, I want big data, I want an analyst. They don't actually really know what they're asking for. Generally, if you ask for a database administrator, it's a well-described job spec, and you can just advertise it and some 20 people will turn up and you interview to decide whether you like the look and feel and smell of 'em. 
When you ask for a data scientist, there's 20 different definitions of what that one data science role could be. So we don't initially know what the job is, we don't know what the deliverable is, and we're still trying to figure that out, so yeah. >> Craig what about you? >> So from my experience, when we talk about data science, we're really talking about a collection of experiences with multiple people I've yet to find, at least from my experience, a data science effort with a lone wolf. So you're talking about a combination of skills, and so you don't have, no one individual needs to have all that makes a data scientist a data scientist, but you definitely have to have the right combination of skills amongst a team in order to accomplish the goals of data science team. So from my experiences and from the clients that I've worked with, we refer to the data science effort as a data science team. And I believe that's very appropriate to the team sport analogy. >> For us, we look at a data scientist as a full stack web developer, a jack of all trades, I mean they need to have a multitude of background coming from a programmer from an analyst. You can't find one subject matter expert, it's very difficult. And if you're able to find a subject matter expert, you know, through the lifecycle of product development, you're going to require that individual to interact with a number of other members from your team who are analysts and then you just end up well training this person to be, again, a jack of all trades, so it comes full circle. >> I own a business that does nothing but data solutions, and we've been in business 15 years, and it's been, the transition over time has been going from being a conventional wisdom run company with a bunch of experts at the top to becoming more of a data-driven company using data warehousing and BI, but now the trend is absolutely analytics driven. 
So if you're not becoming an analytics-driven company, you are going to be behind the curve very very soon, and it's interesting that IBM is now coining the phrase of a cognitive business. I think that is absolutely the future. If you're not a cognitive business from a technology perspective, and an analytics-driven perspective, you're going to be left behind, that's for sure. So in order to stay competitive, you know, you need to really think about data science think about how you're using your data, and I also see that what's considered the data expert has evolved over time too where it used to be just someone really good at writing SQL, or someone really good at writing queries in any language, but now it's becoming more of a interdisciplinary action where you need soft skills and you also need the hard skills, and that's why I think there's more females in the industry now than ever. Because you really need to have a really broad width of experiences that really wasn't required in the past. >> Greg Piatetsky, you have a comment? >> So there are not too many unicorns in nature or as data scientists, so I think organizations that want to hire data scientists have to look for teams, and there are a few unicorns like Hillary Mason or maybe Usama Fayyad, but they generally tend to start companies and very hard to retain them as data scientists. What I see is another evolution, automation, and you know, steps like IBM Watson, the first platform is eventually a great advance for data scientists in the short term, but probably what's likely to happen in the longer term kind of more and more of those skills becoming subsumed by machine learning layer within the software. How long will it take, I don't know, but I have a feeling that the paradise for data scientists may not be very long lived. >> Greg, I have a follow up question to what I just heard you say. 
When a data scientist, let's say a unicorn data scientist starts a company, as you've phrased it, and the company's product is built on data science, do they give up becoming a data scientist in the process? It would seem that they become a data scientist of a higher order if they've built a product based on that knowledge. What is your thoughts on that? >> Well, I know a few people like that, so I think maybe they remain data scientists at heart, but they don't really have the time to do the analysis and they really have to focus more on strategic things. For example, today actually is the birthday of Google, 18 years ago, so Larry Page and Sergey Brin wrote a very influential paper back in the '90s about PageRank. Have they remained data scientist, perhaps a very very small part, but that's not really what they do, so I think those unicorn data scientists could quickly evolve to have to look for really teams to capture those skills. >> Clearly they come to a point in their career where they build a company based on teams of data scientists and data engineers and so forth, which relates to the topic of team data science. What is the right division of roles and responsibilities for team data science? >> Before we go, Jennifer, did you have a comment on that? >> Yeah, so I guess I would say for me, when data science came out and there was, you know, the Venn Diagram that came out about all the skills you were supposed to have? I took a very different approach than all of the people who I knew who were going into data science. Most people started interviewing immediately, they were like this is great, I'm going to get a job. 
I went and learned how to develop applications, and learned computer science, 'cause I had never taken a computer science course in college, and made sure I trued up that one part where I didn't know these things or have the skills from school, so I went headfirst and just learned it, and now I actually have a lot of technology patents as a result of that. So to answer Jim's question, actually, I started my company about five years ago, and it originally started out as a consulting firm slash data science company, then it evolved, and one of the reasons I went back in the industry and now I'm at Nielsen is because you really can't do the same sort of data science work when you're actually doing product development. It's a very very different sort of world. You know, when you're developing a product you're developing a core feature or functionality that you're going to offer clients and customers, so I think definitely you really don't get to have that wide range of sort of looking at 8 million models and testing things out. That flexibility really isn't there as your product starts getting developed. >> Before we go into the team sport, the hard skills that you have, are you all good at math? Are you all computer science types? How about math? Are you all math? >> What were your GPAs? (laughs) >> David: Anybody not math oriented? Anybody not love math? You don't love math? >> I love math, I think it's required. >> David: So math yes, check. >> You dream in equations, right? You dream. >> Computer science? Do I have to have computer science skills? At least the basic knowledge? >> I don't know that you need to have formal classes in any of these things, but I think certainly as Jennifer was saying, if you have no skills in programming whatsoever and you have no interest in learning how to write SQL queries or R or Python, you're probably going to struggle a little bit. >> James: It would be a challenge. >> So I think yes, I have a Ph.D. 
in physics, I did a lot of math, it's my love language, but I think you don't necessarily need to have formal training in all of these things, but I think you need to have a curiosity and a love of learning, and so if you don't have that, you still want to learn and however you gain that knowledge I think, but yeah, if you have no technical interests whatsoever, and don't want to write a line of code, maybe data science is not the field for you. Even if you don't do it every day. >> And statistics as well? You would put that in that same general category? How about data hacking? You got to love data hacking, is that fair? Yves, you have a comment? >> Yeah, I think so, while we've been discussing that, for me, the most important part is that you have a logical mind and you have the capability to absorb new things and the curiosity you need to dive into that. While I don't have an education in IT or whatever, I have a background in chemistry and those things that I learned there, I apply to information technology as well, and from a part that you say, okay, I'm a tech-savvy guy, I'm interested in the tech part of it, you need to speak that business language and if you can do that crossover and understand what other skill sets or parts of the roles are telling you, I think the communication in that aspect is very important. >> I'd like to throw in just something really quickly, and I think there's an interesting thing that happens in IT, particularly around technology. We tend to forget that we've actually solved a lot of these problems in the past. If we look in history, if we look around the Second World War, and Bletchley Park in the UK, where you had a very similar experience as humans that we're having currently around the whole issue of data science, so there was an interesting challenge with the Enigma and the Shark code, right? 
And there was a bunch of men put in a room and told, you're mathematicians and you come from universities, and you can crack codes, but they couldn't. And so what they ended up doing was running these ads, and putting challenges, they actually put, I think it was crossword puzzles in the newspaper, and this deluge of women came out of all kinds of different roles without math degrees, without science degrees, but could solve problems, and they were thrown at the challenge of cracking codes, and invariably, they did the heavy lifting on a daily basis, converting messages from one format to another, so that this very small team at the end could actually get in and play with the sexy piece of it. And I think we're going through a similar shift now with what we refer to as data science in the technology and business world, where the people who are doing the heavy lifting aren't necessarily what we'd think of as the traditional data scientists, and so, there have been some unicorns and we've championed them, and they're great. But I think the shift's going to be to accountants, actuaries, and statisticians who understand the business, and come from an MBA-style background, who can learn the relevant pieces of math and models that we need to apply to get the data science outcome. I think we've already been here, we've solved this problem, we've just got to learn not to try and reinvent the wheel, 'cause the media hypes this whole thing of data science as exciting and new, but we've been here a couple times before, and there's a lot to be learned from that, my view. >> I think we had Joe next. >> Yeah, so I was going to say that, data science is a funny thing. To use the word science is kind of a misnomer, because there is definitely a level of art to it, and I like to use the analogy, when Michelangelo would look at a block of marble, everyone else looked at the block of marble to see a block of marble. 
He looks at a block of marble and he sees a finished sculpture, and then he figures out what tools do I need to actually make my vision? And I think data science is a lot like that. We hear a problem, we see the solution, and then we just need the right tools to do it, and I think part of consulting and data science in particular, it's not so much what we know out of the gate, but it's how quickly we learn. And I think everyone here, what makes them brilliant, is how quickly they could learn any tool that they need to see their vision get accomplished. >> David: Justin? >> Yeah, I think you make a really great point, for me, I'm a Marine Corps veteran, and the reason I mention that is 'cause I work with two veterans who are problem solvers. And I think that's what data scientists really are, in the long run are problem solvers, and you mentioned a great point that, yeah, I think just problem solving is the key. You don't have to be a subject matter expert, just be able to take the tools and intelligently use them. >> Now when you look at the whole notion of team data science, what is the right mix of roles, like role definitions within a high-quality or a high-performing data science team? Now IBM, with, of course, our announcement of Project DataWorks and so forth, we're splitting the role division, in terms of data scientist versus data engineer versus application developer versus business analyst, is that the right breakdown of roles? Or what would the panelists recommend in terms of understanding what kind of roles make sense within, like I said, a high performing team that's looking for trying to develop applications that depend on data, machine learning, and so forth? Anybody want to? >> I'll tackle that. 
So the teams that I have created over the years, these data science teams that I brought into customer sites, have a combination of developer capabilities and some of them are IT developers, but some of them were developers of things other than applications. They designed buildings, they did other things with their technical expertise besides building technology. The other piece besides the developer is the analytics, and analytics can be taught as long as they understand how algorithms work and the code behind the analytics, in other words, how are we analyzing things, and from a data science perspective, we are leveraging technology to do the analyzing through the tool sets, so ultimately as long as they understand how tool sets work, then we can train them on the tools. Having that analytic background is an important piece. >> Craig, is it easier to, I'll go to you in a moment Joe, is it easier to cross train a data scientist to be an app developer, than to cross train an app developer to be a data scientist or does it not matter? >> Yes. (laughs) And not the other way around. It depends on the-- >> It's easier to cross train a data scientist to be an app developer than-- >> Yes. >> The other way around. Why is that? >> Developing code can be as difficult as the tool set one uses to develop code. Today's tool sets are very user friendly, whereas it is very difficult to teach a person to think along the lines of developing code when they don't have any idea of the aspects of code, of building something. >> I think it was Joe, or you next, or Jennifer, who was it? >> I would say that one of the reasons for that is data scientists will probably know if the answer's right after you process data, whereas a data engineer might be able to manipulate the data but may not know if the answer's correct. So I think that is one of the reasons why having a data scientist learn the application development skills might be an easier time than the other way around. 
>> I think Miriam had a comment? Sorry. >> I think that what we're advising our clients to do is to not think, before data science and before analytics became so required by companies to stay competitive, it was more of a waterfall, you have a data engineer build a solution, you know, then you throw it over the fence and the business analyst would have at it, where now, it must be agile, and you must have a scrum team where you have the data scientist and the data engineer and the project manager and the product owner and someone from the chief data office all at the table at the same time and all accomplishing the same goal. Because all of these skills are required, collectively, in order to solve this problem, and it can't be done daisy chained anymore, it has to be a collaboration. And that's why I think Spark is so awesome, because you know, Spark is a single interface that a data engineer can use, a data analyst can use, and a data scientist can use. And now with what we've learned today, having a data catalog on top so that the chief data office can actually manage it, I think is really going to take Spark to the next level. >> James: Miriam? >> I wanted to comment on your question to Craig about is it harder to teach a data scientist to build an application or vice versa, and one of the things that we have worked on a lot in our data science team is incorporating a lot of best practices from software development, agile, scrum, that sort of thing, and I think particularly with a focus on deploying models, that we don't just want to build an interesting data science model, we want to deploy it, and get some value. 
You need to really incorporate these processes from someone who might know how to build applications and that, I think for some data scientists can be a challenge, because one of the fun things about data science is you get to get into the data, and you get your hands dirty, and you build a model, and you get to try all these cool things, but then when the time comes for you to actually deploy something, you need deployment-grade code in order to make sure it can go into production at your client site and be useful for instance, so I think that there's an interesting challenge on both ends, but one of the things I've definitely noticed with some of our data scientists is it's very hard to get them to think in that mindset, which is why you have a team of people, because everyone has different skills and you can mitigate that. >> Dev-ops for data science? >> Yeah, exactly. We call it insight ops, but yeah, I hear what you're saying. Data science is becoming increasingly an operational function as opposed to strictly exploratory or developmental. Did someone else have a, Dez? >> One of the things I was going to mention, one of the things I like to do when someone gives me a new problem is take all the laptops and phones away. And we just end up in a room with a whiteboard. And developers find that challenging sometimes, so I had this one line where I said to them don't write the first line of code until you actually understand the problem you're trying to solve, right? And I think where the data science focus has changed the game for organizations who are trying to get some systematic repeatable process that they can throw data at and just keep getting answers and things, no matter what the industry might be, is that developers will come with a particular mindset on how they're going to codify something without necessarily getting the full spectrum and understanding the problem in the first place. 
What I'm finding is the people that come at data science tend to have more of a hacker ethic. They want to hack the problem, they want to understand the challenge, and they want to be able to get it down to plain English simple phrases, and then apply some algorithms and then build models, and then codify it, and so most of the time we sit in a room with whiteboard markers just trying to build a model in a graphical sense and make sure it's going to work and that it's going to flow, and once we can do that, we can codify it. I think when you come at it from the other angle from the developer ethic, and you're like I'm just going to codify this from day one, I'm going to write code. I'm going to hack this thing out and it's just going to run and compile. Often, you don't truly understand what you're trying to get to at the end point, and you can just spend days writing code and I think someone made the comment that sometimes you don't actually know whether the output is actually accurate in the first place. So I think there's a lot of value being provided from the data science practice of understanding the problem in plain English at a team level, so what am I trying to do from the business consulting point of view? What are the requirements? How do I build this model? How do I test the model? How do I run a sample set through it? Train the thing and then make sure what I'm going to codify actually makes sense in the first place, because otherwise, what are you trying to solve in the first place? >> Wasn't that Einstein who said if I had an hour to solve a problem, I'd spend 55 minutes understanding the problem and five minutes on the solution, right? It's exactly what you're talking about. >> Well I think, I will say, getting back to the question, the thing with building these teams, I think a lot of times people don't talk about is that engineers are actually very very important for data science projects and data science problems. 
For instance, if you were just trying to prototype something or just come up with a model, then data science teams are great, however, if you need to actually put that into production, that code that the data scientist has written may not be optimal, so as we scale out, it may be actually very inefficient. At that point, you kind of want an engineer to step in and actually optimize that code, so I think it depends on what you're building and that kind of dictates what kind of division you want among your teammates, but I do think that a lot of times, the engineering component is really undervalued out there. >> Jennifer, it seems that the data engineering function, data discovery and preparation and so forth is becoming automated to a greater degree, but if I'm listening to you, I don't hear that data engineering as a discipline is becoming extinct in terms of a role that people can be hired into. You're saying that there's a strong ongoing need for data engineers to optimize the entire pipeline to deliver the fruits of data science in production applications, is that correct? So they play that very much operational role as the backbone for... >> So I think a lot of times businesses will go to a data scientist to build a better model, to build a predictive model, but that model may not be something that you really want to implement out there when there's like a million users coming to your website, 'cause it may not be efficient, it may take a very long time, so I think in that sense, it is important to have good engineers, and your whole product may fail, you may build the best model, it may have the best output, but if you can't actually implement it, then really what good is it? >> What about calibrating these models? How do you go about doing that and sort of testing that in the real world? Has that changed over time? Or is it... 
>> So one of the things that I think can happen, and we found with one of our clients is when you build a model, you do it with the data that you have, and you try to use a very robust cross-validation process to make sure that it's robust and it's sturdy, but one thing that can sometimes happen is after you put your model into production, there can be external factors, societal or whatever, things that have nothing to do with the data that you have or the quality of the data or the quality of the model, which can actually erode the model's performance over time. So as an example, we think about cell phone contracts, right? Those have changed a lot over the years, so maybe five years ago, the type of data plan you had might not be the same that it is today, because a totally different type of plan is offered, so if you're building a model on that to say predict who's going to leave and go to a different cell phone carrier, the validity of your model over time is going to completely degrade based on nothing that you have, that you put into the model or the data that was available, so I think you need to have this sort of model management and monitoring process to take these factors into account and then know when it's time to do a refresh. >> Cross-validation, even at one point in time, for example, there was an article in the New York Times recently that they gave the same data set to five different data scientists, this is survey data for the presidential election that's upcoming, and five different data scientists came to five different predictions. They were all high quality data scientists, the cross-validation showed a wide variation about who was on top, whether it was Hillary or whether it was Trump, so that shows you that even at any point in time, cross-validation is essential to understand how robust the predictions might be. Does somebody else have a comment? Joe? 
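The model management and monitoring process described in this exchange (watch a deployed model's performance and know when it is time to refresh) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not something the panelists presented: the `DriftMonitor` class, the window size, the tolerance, and the "stay"/"churn" labels are all hypothetical, and rolling accuracy is just one of several signals a team might track.

```python
from collections import deque


class DriftMonitor:
    """Hypothetical helper: tracks rolling accuracy of a deployed model."""

    def __init__(self, baseline_accuracy, window=100, tolerance=0.05):
        self.baseline = baseline_accuracy     # accuracy measured at deployment time
        self.tolerance = tolerance            # allowed drop before suggesting a refresh
        self.outcomes = deque(maxlen=window)  # 1 = correct prediction, 0 = wrong

    def record(self, predicted, actual):
        """Store whether one production prediction turned out to be correct."""
        self.outcomes.append(1 if predicted == actual else 0)

    def rolling_accuracy(self):
        """Accuracy over the most recent window, or None if nothing is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_refresh(self):
        """True once a full window falls below baseline minus tolerance."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # too few observations to judge
        return self.rolling_accuracy() < self.baseline - self.tolerance


# Usage: a churn model that was 90% accurate at launch, then starts missing.
monitor = DriftMonitor(baseline_accuracy=0.90, window=10, tolerance=0.05)
for _ in range(10):
    monitor.record("stay", "stay")    # predictions still match reality
for _ in range(5):
    monitor.record("stay", "churn")   # the market changed, predictions now miss
print(monitor.rolling_accuracy(), monitor.needs_refresh())
```

Waiting for a full window before raising the flag is a deliberate choice in this sketch: it avoids triggering a retrain on a handful of unlucky predictions.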
>> I just want to say that this even drives home the fact that having the scrum team for each project and having the engineer and the data scientist, data engineer and data scientist working side by side because it is important that whatever we're building we assume will eventually go into production, and we used to have in the data warehousing world, you'd get the data out of the systems, out of your applications, you do analysis on your data, and the nirvana was maybe that data would go back to the system, but typically it didn't. Nowadays, the applications are dependent on the insight coming from the data science team. With the behavior of the application and the personalization and individual experience for a customer is highly dependent, so it has to be, you said is data science part of the dev-ops team, absolutely now, it has to be. >> Whose job is it to figure out the way in which the data is presented to the business? Where's the sort of presentation, the visualization plan, is that the data scientist role? Does that depend on whether or not you have that gene? Do you need a UI person on your team? Where does that fit? >> Wow, good question. >> Well usually that's the output, I mean, once you get to the point where you're visualizing the data, you've created an algorithm or some sort of code that produces that to be visualized, so at the end of the day that the customers can see what all the fuss is about from a data science perspective. But it's usually post the data science component. >> So do you run into situations where you can see it and it's blatantly obvious, but it doesn't necessarily translate to the business? >> Well there's an interesting challenge with data, and we throw the word data around a lot, and I've got this fun line I like throwing out there. If you torture data long enough, it will talk. So the challenge then is to figure out when to stop torturing it, right? 
And it's the same with models, and so I think in many other parts of organizations, we'll take something, if someone's doing a financial report on performance of the organization and they're doing it in a spreadsheet, they'll get two or three peers to review it, and validate that they've come up with a working model and the answer actually makes sense. And I think we're rushing so quickly at doing analysis on data that comes to us in various formats and high velocity that I think it's very important for us to actually stop and do peer reviews of the models and the data and the output as well, because otherwise we start making decisions very quickly about things that may or may not be true. It's very easy to get the data to paint any picture you want, and you gave the example of the five different attempts at that thing, and I had this shoot out thing as well where I'll take in a team, I'll get two different people to do exactly the same thing in completely different rooms, and come back and challenge each other, and it's quite amazing to see the looks on their faces when they're like, oh, I didn't see that, and then go back and do it again until, and then just keep iterating until we get to the point where they both get the same outcome, in fact there's a really interesting anecdote about when the UNIX operating system was being written, and a couple of the authors went away and wrote the same program without realizing that each other were doing it, and when they came back, they actually had line for line, the same piece of C code, 'cause they'd actually gotten to a truth. A perfect version of that program, and I think we need to often look at, when we're building models and playing with data, if we can't come at it from different angles, and get the same answer, then maybe the answer isn't quite true yet, so there's a lot of risk in that. 
And it's the same with presentation, you know, you can paint any picture you want with the dashboard, but who's actually validating when the dashboard's painting the correct picture? >> James: Go ahead, please. >> There is a science actually, behind data visualization, you know if you're doing trending, it's a line graph, if you're doing comparative analysis, it's a bar graph, if you're doing percentages, it's a pie chart, like there is a certain science to it, it's not that much of a mystery as the novice thinks there is, but what makes it challenging is that you also, just like any presentation, you have to consider your audience. And your audience, whenever we're delivering a solution, either insight, or just data in a grid, we really have to consider who is the consumer of this data, and actually cater the visual to that person or to that particular audience. And that is part of the art, and that is what makes a great data scientist. >> The consumer may in fact be the source of the data itself, like in a mobile app, so you're tuning their visualization and then their behavior is changing as a result, and then the data on their changed behavior comes back, so it can be a circular process. >> So Jim, at a recent conference, you were tweeting about the citizen data scientist, and you got emasculated by-- >> I spoke there too. >> Okay. >> TDWI on that same topic, I got-- >> Kirk Borne I hear came after you. >> Kirk meant-- >> Called foul, flag on the play. >> Kirk meant well. I love Claudia Imhoff too, but yeah, it's a controversial topic. >> So I wonder what our panel thinks of that notion, citizen data scientist. >> Can I respond about citizen data scientists? >> David: Yeah, please. >> I think this term was introduced by a Gartner analyst in 2015, and I think it's a very dangerous and misleading term. 
I think definitely we want to democratize the data and have access to more people, not just data scientists, but managers, BI analysts, but there is already a term for such people, we can call them business analysts, because it implies some training, some understanding of the data. If you use the term citizen data scientist, it implies that without any training you take some data and then you find something there, and I think, as Dez mentioned, we've seen many examples, it's very easy to find completely spurious random correlations in data. So we don't want citizen dentists to treat our teeth or citizen pilots to fly planes, and if data's important, having citizen data scientists is equally dangerous, so I'm hoping that, I think actually Gartner did not use the term citizen data scientist in their 2016 hype cycle, so hopefully they will put this term to rest. >> So Gregory, you apparently are defining citizen to mean incompetent as opposed to simply self-starting. >> Well self-starting is very different, but that's not what I think was the intention. I think what we see in terms of data democratization, there is a big trend of automation. There are many tools, for example there are many companies like DataRobot, probably IBM, has interesting machine learning capability towards automation, so I think I recently started a page on KDnuggets for automated data science solutions, and there are already 20 different firms that provide different levels of automation. So one can deliver in full automation maybe some expertise, but it's very dangerous to have part of an automated tool and at some point then ask citizen data scientists to try to take the wheels. >> I want to chime in on that. >> David: Yeah, pile on. >> I totally agree with all of that. 
I think the comment I just want to quickly put out there is that the space we're in is a very young, and rapidly changing world, and so what we haven't had yet is this time to stop and take a deep breath and actually define ourselves, so if you look at computer science in general, a lot of the traditional roles have sort of had 10 or 20 years of history, and so through the hiring process, and the development of those spaces, we've actually had time to breathe and define what those jobs are, so we know what a systems programmer is, and we know what a database administrator is, but we haven't yet had a chance as a community to stop and breathe and say, well what do we think these roles are, and so to fill that void, the media creates coinages, and I think this is the risk we've got now that the concept of a data scientist was just a term that was coined to fill a void, because no one quite knew what to call somebody who didn't come from a data science background if they were tinkering around data science, and I think that's something that we need to sort of sit up and pay attention to, because if we don't own that and drive it ourselves, then somebody else is going to fill the void and they'll create these very frustrating concepts like data scientist, which drives us all crazy. >> James: Miriam's next. >> So I wanted to comment, I agree with both of the previous comments, but in terms of a citizen data scientist, and I think whether or not you're a citizen data scientist or an actual data scientist, whatever that means, I think one of the most important things you can have is a sense of skepticism, right? Because you can get spurious correlations and it's like wow, my predictive model is so excellent, you know? And being aware of things like leaks from the future, right? 
This actually isn't predictive at all, it's a result of the thing I'm trying to predict, and so I think one thing I know that we try and do is if something really looks too good, we need to go back in and make sure, did we not look at the data correctly? Is something missing? Did we have a problem with the ETL? And so I think that a healthy sense of skepticism is important to make sure that you're not taking a spurious correlation and trying to derive some significant meaning from it. >> I think there's a Dilbert cartoon that I saw that described that very well. Joe, did you have a comment? >> I think that in order for citizen data scientists to really exist, I think we do need to have more maturity in the tools that they would use. My vision is that the BI tools of today are all going to be replaced with natural language processing and searching, you know, just be able to open up a search bar and say give me sales by region, and to take that one step into the future even further, it should actually say what are my sales going to be next year? And it should trigger a simple linear regression or be able to say which features of the televisions are actually affecting sales and do a clustering algorithm, you know I think hopefully that will be the future, but I don't see anything of that today, and I think in order to have a true citizen data scientist, you would need to have that, and that is pretty sophisticated stuff. >> I think for me, the idea of citizen data scientist I can relate to that, for instance, when I was in graduate school, I started doing some research on FDA data. It was an open source data set about 4.2 million data points. Technically when I graduated, the paper was still not published, and so in some sense, you could think of me as a citizen data scientist, right? 
I wasn't getting funding, I wasn't doing it for school, but I was still continuing my research, so I'd like to hope that with all the new data sources out there that there might be scientists or people who are maybe kept out of a field, people who wanted to be in STEM and for whatever life circumstance couldn't be in it. That they might be encouraged to actually go and look into the data and maybe build better models or validate information that's out there. >> So Justin, I'm sorry you had one comment? >> It seems data science was termed before academia adopted formalized training for data science. But yeah, you can make, like Dez said, you can make data work for whatever problem you're trying to solve, whatever answer you see, you want data to work around it, you can make it happen. And I kind of consider that like in project management, like data creep, so you're so hyper focused on a solution you're trying to find the answer that you create an answer that works for that solution, but it may not be the correct answer, and I think the crossover discussion works well for that case. >> So but the term comes up 'cause there's a frustration I guess, right? That data science skills are not plentiful, and it's potentially a bottleneck in an organization. Supposedly 80% of your time is spent on cleaning data, is that right? Is that fair? So there's a problem. How much of that can be automated and when? >> I'll have a shot at that. So I think there's a shift that's going to come about where we're going to move from centralized data sets to data at the edge of the network, and this is something that's happening very quickly now where we can't just haul everything back to a central spot. When the internet of things actually wakes up. Things like the Boeing Dreamliner 787, that thing's got 6,000 sensors in it, produces half a terabyte of data per flight. There are 87,400 flights per day in domestic airspace in the U.S. 
That's 43.5 petabytes of raw data, now that's about three years worth of disk manufacturing in total, right? We're never going to copy that across to one place, we can't process it, so I think the challenge we've got ahead of us is looking at how we're going to move the intelligence and the analytics to the edge of the network and pre-cook the data in different tiers, so have a look at the raw material we get, and boil it down to a slightly smaller data set, bring a metadata version of that back, and eventually get to the point where we've only got the very minimum data set and data points we need to make key decisions. Without that, we're already at the point where we have too much data, and we can't munch it fast enough, and we can't spin off enough tin even if we switch the cloud on, and that's just this never ending deluge of noise, right? And you've got that signal versus noise problem, so then we're now seeing a shift where people are looking at how do we move the intelligence back to the edge of the network, which we actually solved some time ago in the security space. You know, spam filtering, if an email hits Google on the west coast of the U.S. and they create a checksum for that spam email, it immediately goes into a database, and nothing gets on the opposite side of the coast, because they already know it's spam. They recognize that email coming in, that's evil, stop it. So we've already fixed it in security with intrusion detection, we've fixed it in spam, so we now need to take that learning, and bring it into business analytics, if you like, and see where we're finding patterns and behavior, and brew that out to the edge of the network, so if I'm seeing a demand over here for tickets on a new sale of a show, I need to be able to see where else I'm going to see that demand and start responding to that before the demand comes about. 
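The back-of-the-envelope arithmetic behind the Dreamliner figure, and the idea of boiling data down tier by tier at the edge, can be checked with a short Python sketch. The per-flight and flights-per-day numbers are the speaker's and are not independently verified here; the calculation also assumes, as the quote implicitly does, that every flight produces the same volume. The function names and the 10% keep-fractions per tier are hypothetical.

```python
# Figures as quoted by the speaker (not independently verified).
TB_PER_FLIGHT = 0.5        # half a terabyte of sensor data per flight
FLIGHTS_PER_DAY = 87_400   # quoted daily flights in U.S. domestic airspace
TB_PER_PB = 1_000          # decimal units: 1 PB = 1,000 TB


def daily_volume_pb(tb_per_flight, flights_per_day):
    """Raw data produced per day, in petabytes."""
    return tb_per_flight * flights_per_day / TB_PER_PB


def reduced_volume_pb(raw_pb, keep_fractions):
    """Volume left after each edge tier keeps only a fraction of its input."""
    volume = raw_pb
    for fraction in keep_fractions:
        volume *= fraction
    return volume


raw = daily_volume_pb(TB_PER_FLIGHT, FLIGHTS_PER_DAY)
# Hypothetical: two edge tiers that each pass on 10% of what they receive.
central = reduced_volume_pb(raw, [0.10, 0.10])
print(f"raw: {raw:.1f} PB/day, reaching the center: {central:.3f} PB/day")
```

With the quoted inputs the product comes to roughly 43.7 PB per day in decimal units, in line with the figure of about 43.5 PB given in the discussion.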
I think that's a shift that we're going to see quickly, because we'll never keep up with the data munching challenge and the volume's just going to explode. >> David: We just have a couple minutes. >> That does sound like a great topic for a future Cube panel which is data science on the edge of the fog. >> I got a hundred questions around that. So we're wrapping up here. Just got a couple minutes. Final thoughts on this conversation or any other pieces that you want to punctuate. >> I think one thing that's been really interesting for me being on this panel is hearing all of my co-panelists talking about common themes and things that we are also experiencing which isn't a surprise, but it's interesting to hear about how ubiquitous some of the challenges are, and also at the announcement earlier today, some of the things that they're talking about and thinking about, we're also talking about and thinking about. So I think it's great to hear we're all in different countries and different places, but we're experiencing a lot of the same challenges, and I think that's been really interesting for me to hear about. >> David: Great, anybody else, final thoughts? >> To echo Dez's thoughts, it's about we're never going to catch up with the amount of data that's produced, so it's about transforming big data into smart data. >> I could just say that with the shift from normal data, small data, to big data, the answer is automate, automate, automate, and we've been talking about advanced algorithms and machine learning for the science for changing the business, but there also needs to be machine learning and advanced algorithms for the backroom where we're actually getting smarter about how we ingest data and how we fix data as it comes in. Because we can actually train the machines to understand data anomalies and what we want to do with them over time. And I think the further upstream we get of data correction, the less work there will be downstream. 
And I also think that the concept of being able to fix data at the source is gone, that's behind us. Right now the data that we're using to analyze to change the business, typically we have no control over. Like Dez said, they're coming from sensors and machines and internet of things and if it's wrong, it's always going to be wrong, so we have to figure out how to do that in our laboratory. >> Eves, final thoughts? >> I think it's a mind shift being a data scientist if you look back at the time why did you start developing or writing code? Because you like to code, whatever, just for the sake of building a nice algorithm or a piece of software, or whatever, and now I think with the spirit of a data scientist, you're looking at a problem and say this is where I want to go, so you have more the top down approach than the bottom up approach. And have the big picture and that is what you really need as a data scientist, just look across technologies, look across departments, look across everything, and then on top of that, try to apply as many skills as you have available, and that's the kind of unicorn that they're trying to look for, because it's pretty hard to find people with that wide vision on everything that is happening within the company, so you need to be aware of technology, you need to be aware of how a business is run, and how it fits within a cultural environment, you have to work with people, and all those things together, to my belief, make it very difficult to find those good data scientists. >> Jim? Your final thought? >> My final thought is this is an awesome panel, and I'm so glad that you've come to New York, and I'm hoping that you all stay, of course, for the IBM Data First launch event that will take place this evening about a block over at Hudson Mercantile, so that's pretty much it. Thank you, I really learned a lot. >> I want to second Jim's thanks, really, great panel. 
Awesome expertise, really appreciate you taking the time, and thanks to the folks at IBM for putting this together. >> And I'm a big fan of most of you, all of you, on this session here, so it's great just to meet you in person, thank you. >> Okay, and I want to thank Jeff Frick for being a human curtain there with the sun setting here in New York City. Well thanks very much for watching, we are going to be across the street at the IBM announcement, we're going to be on the ground. We open up again tomorrow at 9:30 at Big Data NYC, Big Data Week, Strata + Hadoop World, thanks for watching everybody, that's a wrap from here. This is the Cube, we're out. (techno music)
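
The edge-of-network argument made in the panel (half a terabyte per flight, 87,400 flights a day, boiling raw data down in tiers, and sharing a spam checksum so other sites can block the same message) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and class names are hypothetical, SHA-256 stands in for the unspecified "checksum", and the flight figures are the ones quoted by the panelist, not verified numbers. With decimal units the quoted figures multiply out to about 43.7 petabytes per day, close to the 43.5 mentioned on stage.

```python
import hashlib

# Figures as quoted in the panel (assumptions, not verified data).
TB_PER_FLIGHT = 0.5
FLIGHTS_PER_DAY = 87_400


def daily_raw_petabytes(tb_per_flight: float, flights_per_day: int) -> float:
    """Raw daily volume in petabytes, using decimal units (1 PB = 1000 TB)."""
    return tb_per_flight * flights_per_day / 1000


def summarize_at_edge(readings: list[float]) -> dict:
    """First 'pre-cook' tier: reduce raw sensor readings to a small summary
    that can be sent back instead of the raw data (hypothetical helper)."""
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": sum(readings) / len(readings),
    }


class SharedBlocklist:
    """Toy stand-in for a checksum database shared between edge sites."""

    def __init__(self) -> None:
        self._known: set[str] = set()

    @staticmethod
    def checksum(message: str) -> str:
        # SHA-256 is an assumption; the panelist only said "checksum".
        return hashlib.sha256(message.encode("utf-8")).hexdigest()

    def report_spam(self, message: str) -> None:
        """Called by the site that first classifies the message as spam."""
        self._known.add(self.checksum(message))

    def is_known_spam(self, message: str) -> bool:
        """Called by any other site before accepting the same message."""
        return self.checksum(message) in self._known


if __name__ == "__main__":
    print(daily_raw_petabytes(TB_PER_FLIGHT, FLIGHTS_PER_DAY))
    print(summarize_at_edge([1.0, 2.0, 3.0]))
    blocklist = SharedBlocklist()
    blocklist.report_spam("win a free cruise")
    print(blocklist.is_known_spam("win a free cruise"))
```

An exact-match checksum only catches byte-identical messages, so a real filter would need fuzzier matching; the sketch is meant to show the shape of the idea, where a decision made at one edge is shared as a tiny fingerprint rather than as the raw data.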

Published Date : Sep 28 2016


