Ash Naseer, Warner Bros. Discovery | Busting Silos With Monocloud
(vibrant electronic music) >> Welcome back to SuperCloud2. You know, this event, and the Super Cloud initiative in general, it's an open industry-wide collaboration. Last August at SuperCloud22, we really honed in on the definition, which of course we've published. And there's this shared doc, which folks are still adding to and refining, in fact, just recently, Dr. Nelu Mihai added some critical points that really advanced some of the community's initial principles, and today at SuperCloud2, we're digging further into the topic with input from real world practitioners, and we're exploring that intersection of data, data mesh, and cloud, and importantly, the realities and challenges of deploying technology to drive new business capability, and I'm pleased to welcome Ash Naseer to the program. He's a Senior Director of Data Engineering at Warner Bros. Discovery. Ash, great to see you again, thanks so much for taking time with us. >> It's great to be back, these conversations are always very fun. >> I was so excited when we met last spring, I guess, so before we get started I wanted to play a clip from that conversation, it was June, it was at the Snowflake Summit in Las Vegas. And it's a comment that you made about your company but also data mesh. Guys, roll the clip. >> Yeah, so, when people think of Warner Bros., you always think of the movie studio. But we're more than that, right, I mean, you think of HBO, you think of TNT, you think of CNN. We have 30 plus brands in our portfolio, and each have their own needs. So the idea of a data mesh really helps us because what we can do is we can federate access across the company, so that CNN can work at their own pace, you know, when there's election season, they can ingest their own data. And they don't have to bump up against, as an example, HBO, if Game of Thrones is goin' on. 
>> So-- Okay, so that's pretty interesting, so you've got these sort of different groups that have different data requirements inside of your organization. Now data mesh, it's a relatively new concept, so you're kind of ahead of the curve. So Ash, my question is, when you think about getting value from data, and how that's changed over the past decade, you've had pre-Hadoop, Hadoop, what do you see that's changed, now you got the cloud coming in, what's changed? What had to be sort of fixed? What's working now, and where do you see it going? >> Yeah, so I feel like in the last decade, we've gone through quite a maturity curve. I actually like to say that we're in the golden age of data, because the tools and technology in the data space, particularly and then broadly in the cloud, they allow us to do things that we couldn't do way back when, like you suggested, back in the Hadoop era or even before that. So there's certainly a lot of maturity, and a lot of technology that has come about. So in terms of the good, bad, and ugly, so let me kind of start with the good, right? In terms of bringing value from the data, I really feel like we're in this place where the folks that are charged with unlocking that value from the data, they're actually spending the majority of their time actually doing that. And what do I mean by that? If you think about it, 10 years ago, the data scientist was the person that was going to sort of solve all of the data problems in a company. But what happened was, companies asked these data scientists to come in and do a multitude of things. And what these data scientists found out was, they were spending most of their time on, really, data wrangling, and less on actually getting the value out of the data. And in the last decade or so, I feel like we've made the shift, and we realize that data engineering, data management, data governance, those are as important practices as data science, which is sort of getting the value out of the data. 
And so what that has done is, it has freed up the data scientist and the business analyst and the data analyst, and the BI expert, to really focus on how to get value out of the data, and spend less time wrangling data. So I really think that that's the good. In terms of the bad, I feel like, there's a lot of legacy data platforms out there, and I feel like there's going to be a time where we'll be in that hybrid mode. And then the ugly, I feel like, with all the data and all the technology, creates another problem of itself. Because most companies don't have arms around their data, and making sure that they know who's using the data, what they're using it for, and how can the company leverage the collective intelligence. That is a bigger problem to solve today than 10 years ago. And that's where technologies like the data mesh come in. >> Yeah, so when I think of data mesh, and I say, you're an early practitioner of data mesh, you mentioned legacy technology, so the concept of data mesh is inclusive. In theory anyway, you're supposed to be including the legacy technologies. Whether it's a data lake or data warehouse or Oracle or Snowflake or whatever it is. And when you think about Zhamak Dehghani's principles, it's domain-centric ownership, data as product. And that creates challenges around self-serve infrastructure and automated governance, and then when you start to combine these different technologies. You got legacy, you got cloud. Everything's different. And so you have to figure out how to deal with that, so my question is, how have you dealt with that, and what role has the cloud played in solving those problems, in particular, that self-serve infrastructure, and that automated governance, and where are we in terms of solving that problem from a practitioner's standpoint? >> Yeah, I always like to say that data is a team sport, and we should sort of think of it as such, and that's, I feel like, the key of the data mesh concept, is treating it as a team sport. 
A lot of people ask me, they're like, "Oh hey, Ash, I've heard about this thing called data mesh. Where can I buy one?" or, "What's the technology that I use to get a data mesh?" And the reality is that there isn't one technology, you can't really buy a data mesh. It's really a way of life, it's how organizations decide to approach data, like I said, back to a team sport analogy, making sure that everyone has a seat at the table, making sure that we embrace the fact that we have a lot of data, we have a lot of data problems to solve. And the way we'll be successful is to make everyone inclusive. You know, you think about the old days, data silos or shadow IT, some might call it. That's been around for decades. And what hasn't changed was this notion that, hey, everything needs to be sort of managed centrally. But with the cloud and with the technologies that we have today, we have the right technology and the tooling to democratize that data, and democratize not only just the access, but also sort of building building blocks and sort of taking building blocks which are relevant to your product or your business. And adding to the overall data mesh. We've got all that technology. The challenge is for us to really embrace it, and make sure that we implement it from an organizational standpoint. >> So, thinking about super cloud, there's a layer that lives above the clouds and adds value. And you think about your brands, you got 30 brands, you mentioned shadow IT. If, let's say, one of those brands, HBO or TNT, whatever. They want to go, "Hey, we really like Google's analytics tools," and they maybe go off and build something, I don't know if that's even allowed, maybe it's not. But then you build this data mesh. My question is around multi-cloud, cross cloud, super cloud if you will. Is that an advantage for you as a practitioner, or does that just make things more complicated? >> I really love the idea of a multi-cloud. 
I think it's great, I think that it should have been the norm, not the exception, I feel like people talk about it as if it's the exception. That should have been the case. I will say, though, I feel like multi-cloud should evolve organically, so back to your point about some of these different brands, and, you know, different brands or different business units. Or even in a mergers and acquisitions situation, where two different companies or multiple different companies come together with different technology stacks. You know, I feel like that's an organic evolution, and making sure that we use the concepts and the technologies around the multi-cloud to bring everyone together. That's where we need to be, and again, it talks to the fact that each of those business units and each of those groups have their own unique needs, and we need to make sure that we embrace that and we enable that, rather than stifling everything. Now where I have a little bit of a challenge with the multi-cloud is when technology leaders try to build it by design. So there's a notion there that, "Hey, you need to sort of diversify and don't put all your eggs in one basket." And so we need to have this multi-cloud thing. I feel like that is just sort of creating more complexity where it doesn't need to be, we can all sort of simplify our lives, but where it evolves organically, absolutely, I think that's the right way to go. >> But, so Ash, if it evolves organically don't you need some kind of cloud interpreter, to create a common experience across clouds, does that exist today? What are your thoughts on that? >> There is a lot of technology that exists today, and that helps go between these different clouds, a lot of these sort of cloud agnostic technologies that you talked about, the Snowflakes and the Databricks and so forth of the world, they operate in multiple clouds, they operate in multiple regions, within a given cloud and multiple clouds. 
So they span all of that, and they have the tools and technology, so, I feel like the tooling is there. There does need to be more of an evolution around the tooling and I think the market's need are going to dictate that, I feel like the market is there, they're asking for it, so, there's definitely going to be that evolution, but the technology is there, I think just making sure that we embrace that and we sort of embrace that as a challenge and not try to sort of shut all of that down and box everything into one. >> What's the biggest challenge, is it governance or security? Or is it more like you're saying, adoption, cultural? >> I think it's a combination of cultural as well as governance. And so, the cultural side I've talked about, right, just making sure that we give these different teams a seat at the table, and they actually bring that technology into the mix. And we use the modern tools and technologies to make sure that everybody sort of plays nice together. That is definitely, we have ways to go there. But then, in terms of governance, that is another big problem that most companies are just starting to wrestle with. Because like I said, I mean, the data silos and shadow IT, that's been around there, right? The only difference is that we're now sort of bringing everything together in a cloud environment, the collective organization has access to that. And now we just realized, oh we have quite a data problem at our hands, so how do we sort of organize this data, make sure that the quality is there, the trust is there. When people look at that data, a lot of those questions are now coming to the forefront because everything is sort of so transparent with the cloud, right? And so I feel like, again, putting in the right processes, and the right tooling to address that is going to be critical in the next years to come. >> Is sharing data across clouds, something that is valuable to you, or even within a single cloud, being able to share data. 
And my question is, not just within your organization, but even outside your organization, is that something that has sort of hit your radar or is it mature or is that something that really would add value to your business? >> Data sharing is huge, and again, this is another one of those things which isn't new. You know, I remember back in the '90s, when we had to share data externally, with our partners or our vendors, they used to physically send us stacks of these tapes, or physical media on some truck. And we've evolved since then, right, I mean, it went from that to sharing files online and so forth. But data sharing as a concept and as a concept which is now very frictionless, through these different technologies that we have today, that is very new. And that is something, like I said, it's always been going on. But that needs to be really embraced more as well. We as a company heavily leverage data sharing between our own different brands and business units, that helps us make that data mesh, so that when CNN, as an example, builds their own data model based on election data and the kinds of data that they need, compare that with other data in the rest of the company, sports, entertainment, and so forth and so on. Everyone has their unique data, but that data sharing capability brings it together wherever there is a need. So you think about having a Tiger Woods documentary, as an example, on HBO Max and making sure that you reach the audiences that are interested in golf and interested in sports and so forth, right? That all comes through the magic of data sharing, so, it's really critical, internally, for us. And then externally as well, because just understanding how our products are doing on our partners' networks and different distribution channels, that's important, and then just understanding how our consumers are consuming it off properties, right, I mean, we have brands that transcend just the screen, right? 
We have a lot of physical merchandise that you can buy in the store. So again, understanding who's buying the Batman action figures after the Batman movie was released, that's another critical insight. So it all gets enabled through data sharing, and something we rely heavily on. >> So I wanted to get your perspective on this. So I feel like the nirvana of data mesh is if I want to use Google BigQuery, an Oracle database, or a Microsoft database, or Snowflake, Databricks, Amazon, whatever. That that's a node on the mesh. And in the perfect world, you can share that data, it can be governed, I don't think we're quite there today, so. But within a platform, maybe it's within Google or within Amazon or within Snowflake or Databricks. If you're in that world, maybe even Oracle. You actually can do some levels of data sharing, maybe greater with some than others. Do you mandate as an organization that you have to use this particular data platform, or are you saying "Hey, we are architecting a data mesh for the future where we believe the technology will support that," or maybe you've invented some technology that supports that today, can you help us understand that? >> Yeah, I always feel like mandate is a strong area, and it breeds the shadow IT and the data silos. So we don't mandate, we do make sure that there's a consistent set of governance rules, policies, and tooling that's there, so that everyone is on the same page. However, at the same time our focus is really operating in a federated way, that's been our solution, right? Is to make sure that we work within a common set of tooling, which may be different technologies, which in some cases may be different clouds. Although we're not that multi-cloud. 
So what we're trying to do is making sure that everyone who has that technology already built, as long as it sort of follows certain standards, it's modern, it has the capabilities that will eventually allow us to be successful and eventually allow for that data sharing, amongst those different nodes, as you put it. As long as that's the case, and as long as there's a governance layer, a master governance layer, where we know where all that data is and who has access to what and we can sort of be really confident about the quality of the data, as long as that case, our approach to that is really that federated approach. >> Sorry, did I hear you correctly, you're not multi-cloud today? >> Yeah, that's correct. There are certain spots where we use that, but by and large, we rely on a particular cloud, and that's just been, like I said, it's been the evolution, it was our evolution. We decided early on to focus on a single cloud, and that's the direction we've been going in. >> So, do you want to go to a multi-cloud, or, you mentioned organic before, if a business unit wants to go there, as long as they're adhering to those standards that you put out, maybe recommendations, that that's okay? I guess my question is, does that bring benefit to your business that you'd like to tap, or do you feel like it's not necessary? >> I'll go back to the point of, if it happens organically, we're going to be open about it. Obviously we'll have to look at every situations, not all clouds are created equal as well, so there's a number of different considerations. But by and large, when it happens organically, the key is time to value, right? How do you quickly bring those technologies in, as long as you could share the data, they're interconnected, they're secured, they're governed, we are confident on the quality, as long as those principles are met, we could definitely go in that direction. 
But by and large, we're sort of evolving in a singular direction, but even within a singular cloud, we're a global company. And we have audiences around the world, so making sure that even within a single cloud, those different regions interoperate as one, that's a bigger challenge that we're having to solve as well. >> Last question is kind of to the future of data and cloud and how it's going to evolve, do you see a day when companies like yours are increasingly going to be offering data, their software, services, and becoming more of a technology company, sort of pointing your tooling and your proprietary knowledge at the external world, as an opportunity, as a business opportunity? >> That's a very interesting concept, and I know companies have done that, and some of them have been extremely successful, I mean, Amazon is the biggest example that comes to mind, right-- >> Yeah. >> When they launched AWS, something that they had that expertise they had internally, and they offered it to the world as a product. But by and large, I think it's going to be far and few between, especially, it's going to be focused on companies that have technology as their DNA, or almost like in the technology sector, building technology. Most other companies have different markets that they are addressing. And in my opinion, a lot of these companies, what they're trying to do is really focus on the problems that we can solve for ourselves, I think there are more problems than we have people and expertise. So my guess is that most large companies, they're going to focus on solving their own problems. A few, like I said, more tech-focused companies, that would want to be in that business, would probably branch out, but by and large, I think companies will continue to focus on serving their customers and serving their own business. >> Alright, Ash, we're going to leave it there, Ash Naseer. 
Thank you so much for your perspectives, it was great to see you, I'm sure we'll see you face-to-face later on this year. >> This is great, thank you for having me. >> Ah, you're welcome, alright. Keep it right there for more great content from SuperCloud2. We'll be right back. (gentle percussive music)
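Editor's sketch: the federated model Ash describes (each brand owns and publishes its own data, shares it without copying, and a master governance layer knows where the data is and who has access to what) can be illustrated with a small Python model. This is only a toy under stated assumptions, not Warner Bros. Discovery's implementation or any vendor's API. The class names `DataProduct` and `GovernanceLayer`, the method names, and the `election_model` product are all hypothetical; only the brand names come from the conversation.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A dataset published by the domain (brand) that owns it. Hypothetical model."""
    name: str
    owner: str  # owning domain, e.g. "cnn"
    shared_with: set = field(default_factory=set)


class GovernanceLayer:
    """Hypothetical central registry: where each product lives and who may read it."""

    def __init__(self):
        self._products = {}

    def publish(self, product):
        # The owning domain registers its product once; nothing is copied.
        self._products[product.name] = product

    def share(self, name, granting_domain, consumer):
        product = self._products[name]
        # Only the owning domain may grant access to its own product.
        if product.owner != granting_domain:
            raise PermissionError(f"{granting_domain} does not own {name}")
        product.shared_with.add(consumer)

    def can_read(self, name, domain):
        product = self._products[name]
        return domain == product.owner or domain in product.shared_with

    def readers(self, name):
        # Answers "who has access to what" for a single product.
        product = self._products[name]
        return sorted({product.owner} | product.shared_with)


mesh = GovernanceLayer()
mesh.publish(DataProduct("election_model", owner="cnn"))
mesh.share("election_model", granting_domain="cnn", consumer="hbo")
print(mesh.readers("election_model"))
```

The design choice the sketch tries to show is that ownership stays with the domain while visibility is central: CNN decides who may read its product, and the registry can still report every reader.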
Mitesh Shah, Alation & Ash Naseer, Warner Bros Discovery | Snowflake Summit 2022
(upbeat music) >> Welcome back to theCUBE's continuing coverage of Snowflake Summit '22 live from Caesars Forum in Las Vegas. I'm Lisa Martin, my cohost Dave Vellante, we've been here the last day and a half unpacking a lot of news, a lot of announcements, talking with customers and partners, and we have another great session coming for you next. We've got a customer and a partner talking tech and data mesh. Please welcome Mitesh Shah, VP of market strategy at Alation. >> Great to be here. >> And Ash Naseer, great to have you, senior director of data engineering at Warner Bros. Discovery. Welcome guys. >> Thank you for having me. >> It's great to be back in person and to be able to really get to see and feel and touch this technology, isn't it? >> Yeah, it is. I mean two years or so. Yeah. Great to feel the energy in the conference center. >> Yeah. >> Snowflake was virtual, I think for two years and now it's great to kind of see the excitement firsthand. So it's wonderful. >> The excitement, but also the boom and the number of customers and partners and people attending. They were saying the first, or the summit in 2019 had about 1900 attendees. And this is around 10,000. So a huge jump in a short time period. Talk a little bit about the Alation-Snowflake partnership and probably some of the acceleration that you guys have been experiencing as a Snowflake partner. >> Yeah. As a Snowflake partner. I mean, Snowflake became an investor in Alation early last year, and we've been a partner for longer than that. And good news. We have been awarded Snowflake partner of the year for data governance, just earlier this week. And that's in fact, our second year in a row for winning that award. So, great news on that front as well. >> Repeat, congratulations. >> Repeat. Absolutely. And we're going to hope to make it a three-peat as well. 
And we've also been awarded industry competency badges in five different industries, those being financial services, healthcare, retail, technology, and media and telecom. >> Excellent. Okay. Going to get right into it. Data mesh. You guys actually have a data mesh and you've presented at the conference. So, take us back to the beginning. Why did you decide that you needed to implement something like data mesh? What was the impetus? >> Yeah. So when people think of Warner Bros., you always think of like the movie studio, but we're more than that, right? I mean, you think of HBO, you think of TNT, you think of CNN, we have 30 plus brands in our portfolio and each have their own needs. So the idea of a data mesh really helps us because what we can do is we can federate access across the company so that, you know, CNN can work at their own pace. You know, when there's election season, they can ingest their own data and they don't have to, you know, bump up against as an example, HBO, if Game of Thrones is going on. >> So, okay. So the impetus was to serve those lines of business better. Actually, given that you've got these different brands, it was probably easier than most companies. 'Cause if you're, let's say you're a big financial services company, and now you have to decide who owns what. CNN owns its own data products, HBO. Now, do they decide within those different brands, how to distribute even further? Or is it really, how deep have you gone in that decentralization? >> That's a great question. It's a very close partnership, because there are a number of data sets, which are used by all the brands, right? You think about people browsing websites, right? You know, CNN has a website, Warner Bros. has a website. So for us to ingest that data for each of the brands to ingest that data separately, that means five different ways of doing things and you know, a big environment, right? So that is where our team comes into play. 
We ingest a lot of the common data sets, but like I said, any unique data sets, data sets regarding theatrical as an example, you know, Warner Bros. does it themselves, you know, for streaming, HBO Max does it themselves. So we kind of operate in partnership. >> So do you have a centralized data team and also decentralized data teams, right? >> That's right. >> So I love this conversation because that was heresy 10 years ago, five years ago, even, 'cause that's inefficient. But you've, I presume you've found that it's actually more productive in terms of the business output, explain that dynamic. >> You know, you bring up such a good point. So I, you know, I consider myself as one of the dinosaurs who started like 20 plus years ago in this industry. And back then, we were all taught to think of the data warehouse as like a monolithic thing. And the reason for that is the technology wasn't there. The technology didn't catch up. Now, 20 years later, the technology is way ahead, right? But like, our mindset's still the same because we think of data warehouses and data platforms still as a monolithic thing. But if you really sort of remove that sort of mental barrier, if you will, and if you start thinking about, well, how do I sort of, you know, federate everything and make sure that you let folks who are building, or are closest to the customer or are building their products, let them own that data and have a partnership. The results have been amazing. And if we were only sort of doing it as a centralized team, we would not be able to do a tenth of what we do today. So it's that massive scale in our company as well. >> And I should have clarified, when we talk about data mesh are we talking about implementing in practice the Dehghani sort of framework, or is this sort of your own sort of terminology? >> Well, so the interesting part is four years ago, we didn't have- >> It didn't exist. >> Yeah. It didn't exist. 
And so we, our principle was very simple, right? When we started out, we said, we want to make sure that our brands are able to operate independently with some oversight and guidance from our technology teams, right? That's what we set out to do. We did that with Snowflake by design because Snowflake allows us to, you know, separate those brands into different accounts. So that was done by design. And then the magic, I think, is the Snowflake data sharing, which allows us to sort of bring data in here once, and then share it with whoever needs it. So think about HBO Max. On HBO Max, you not only have HBO Max content, but content from CNN, from Cartoon Network, from Warner Bros., right? All the movies, right? So to see how The Batman movie did in theaters and then on streaming, you don't need, you know, Warner Bros. doesn't need to ingest the same streaming data. HBO Max does it. HBO Max shares it with Warner Bros., you know, store once, share many times, and everyone works at their own pace. >> So they're building data products. Those data products are discoverable APIs, I presume, or I guess maybe just, I guess the Snowflake cloud, but very importantly, they're governed. And is that correct, where Alation comes in? >> That's precisely where Alation comes in, as sort of this central flexible foundation for data governance. You know, you mentioned data mesh. I think what's interesting is that it's really an answer to the bottlenecks created by centralized IT, right? There's this notion of decentralizing the data engineers and making the data domain owners, the people that know the data the best, have them be in control of publishing the data to the data consumers. There are other popular concepts actually happening right now, as we speak, around modern data stack, around data fabric, that are also in many ways underpinned by this notion of decentralization, right? 
These are concepts that are underpinned by decentralization and as the pendulum swings, sort of between decentralization and centralization, as we go back and forth in the world of IT and data, there are certain constants that need to be centralized over time. And one of those I believe is very much a centralized platform for data governance. And that's certainly, I think where we come in. Would love to hear more about how you use Alation. >> Yeah. So, I mean, Alation helps us sort of, as you guys say, sort of, map, the treasure map of the data, right? So for consumers to find where their data is, that's where Alation helps us. It helps us with the data cataloging, you know, storing all the metadata and, you know, users can go in, they can sort of find, you know, the data that they need and they can also find how others are using data. So it's, there's a little bit of a crowdsourcing aspect that Alation helps us to do whereby you know, you can see, okay, my peer in the other group, well, that's how they use this piece of data. So I'm not going to spend hours trying to figure this out. You're going to use the query that they use. So yeah. >> So you have a master catalog, I presume. And then each of the brands has their own sub catalogs, is that correct? >> Well, for the most part, we have that master catalog and then the brands sort of use it, you know, separately themselves. The key here is all that catalog, that catalog isn't maintained by a centralized group as well, right? It's again, maintained by the individual teams and not only in the individual teams, but the folks that are responsible for the data, right? So I talked about the concept of crowdsourcing, whoever sort of puts the data in, has to make sure that they update the catalog and make sure that the definitions are there and everything is sort of in line. >> So HBO, CNN, and each have their own, sort of access to their catalog, but they feed into the master catalog. 
Is that the right way to think about it? >> Yeah. >> Okay. And they have their own virtual data warehouses, right? They have ownership over that? They can spin 'em up, spin 'em down as they see fit? Right? And they're governed. >> They're governed. And what's interesting is it's not just governed, right? Governance is a big word. It's a bit nebulous, but what's really being enabled here is this notion of self-service as well, right? There's two big sort of rockets that need to happen at the same time in any given organization. There's this notion that you want to put trustworthy data in the hands of data consumers, while at the same time mitigating risk. And that's precisely what Alation does. >> So I want to clarify this for the audience. So there's four principles of data mesh. This came after you guys did it. And I wonder how it aligns. Domain ownership, give data, as you were saying, to the domain owners who have context, data as product, you guys are building data products, and that creates two problems. How do you give people self-service infrastructure and how do you automate governance? So the first two, great. But then it creates these other problems. Does that align with your philosophy? Where's alignment? What's different? >> Yeah. Data products is exactly where we're going. And that sort of, that domain based design, that's really key as well. In our business, you think about who the customer is, as an example, right? Depending on who you ask, it's going to be, the answer might be different, you know, to the movie business, it's probably going to be the person who watches a movie in a theater. To the streaming business, to HBO Max, it's the streamer, right? To others, someone watching live CNN on their TV, right? There's yet another group. Think about all the franchising we do. So you see Batman action figures and T-shirts, and Warner Bros. branded stuff in stores, that's yet another business unit. 
But at the end of the day, it's not a different person, it's you and me, right? We do all these things. So the domain concept makes sure that you ingest data and you bring in data relevant to the context, however, without making it so stringent that it cannot integrate, and then you integrate it at a higher level to create that 360. >> And it's discoverable. So the point is, I don't have to go tap Ash on the shoulder and say, how do I get this data? Is it governed? Do I have access to it? Give me the rules of it. I just go grab it, right? And the system computationally automates whether or not I have access to it. And it's, as you say, self-service. >> In this case, exactly right. It enables people to just search for data and know, when they find the data, whether it's trustworthy or not, through trust flags and the like. It's doing both of those things at the same time. >> How is it an enabler of solving some of the big challenges that the media and entertainment industry is going through? We've seen so much change the last couple of years. The rising consumer expectations aren't going to go back down. They're only going to go up. We want you to serve us up content that's relevant, that's personalized, that makes sense. I'd love to understand from your perspective, Mitesh, from an industry challenges perspective, how does this technology help customers like Warner Bros. Discovery meet business customers where they are and reduce the volume on those challenges? >> It's a great question. And as I mentioned earlier, we had five industry competency badges that were awarded to us by Snowflake. And one of those is for Media and Telecom. And the reason for that is we're helping media companies understand their audiences better, and ultimately serve up better experiences for their audiences. But we've got Ash right here, who can tell us how that's happening in practice. >> Yeah, tell us. >> So I'll share a story. I always like to tell stories, right?
Once upon a time, before we had Alation in place, it was like, who you knew was how you got access to the data. So if I knew you and I knew you had access to a certain kind of data, your access to the right kind of data was based on the network you had at the company- >> I had to trust you. >> Yeah. >> I might not want to give up my data. >> That's it. And so that's where Alation helps us democratize it, but, you know, it puts the governance and controls in place, right? There are certain sensitive things as well, such as viewership, such as subscriber accounts, which are very important. So making sure that the right people have access to it, that's the other problem that Alation helps us solve. >> That's precisely part of our integration with Snowflake in particular: being able to define and manage policies within Alation. Saying, you know, certain people should have access to certain rows, doing column-level masking. And having those policies actually enforced at the Snowflake data layer is precisely part of our value proposition. >> And that's automated. >> And all that's automated. Exactly. >> Right. So I don't have to think about it. I don't have to go through the tap on the shoulder. What has been the impact, Ash, on data quality as you've pushed it down into the domains? >> That's a great question. So it has definitely improved, but data quality is a very interesting subject, because back to my example of, you know, when we started doing things, the centralized IT team always said, well, it has to be like this, right? And if it doesn't fit in this, then it's bad quality. Well, sometimes context changes. Businesses change, right? You have to be able to react to it quickly. So making sure that a lot of that quality is managed at the decentralized level, at the place where you have that business context, that ensures you have the most up-to-date quality. We're talking about the media industry changing so quickly.
I mean, would we have thought three years ago that people would watch a lot of these major movies on streaming services? But here's the reality, right? You have to react and, you know, having it at that level just helps you react faster. >> So, if I play that back, data quality is not a static framework. It's flexible based on the business context, and the business owners can make those adjustments, 'cause they own the data. >> That's it. That's exactly it. >> That's awesome. Wow. That's amazing progress that you guys have made. >> On quality, if I could just add, it also changes depending on where you are in your data pipeline stage, right? Data quality, data observability, this is a very fast-evolving space at the moment, and if I look to my left right now, I bet you I can probably see a half-dozen quality and observability vendors. And so given that, and given the fact that Alation still is sort of a central hub to find trustworthy data, we've actually announced an open data quality initiative, allowing best-of-breed data quality vendors to integrate with the platform. So whoever they are, whatever tool folks want to use, they can use that particular tool of choice. >> And this all runs in the cloud, or is it a hybrid sort of? >> Everything is in the cloud. We're all in the cloud. And you know, again, it helps us go faster. >> Let me ask you a question. I could go on forever on this topic. One of the concepts that was put forth is that whether it's a Snowflake data warehouse or a Databricks data lake or an Oracle data warehouse, they should all be inclusive. They should just be a node on the mesh. Like, wow, that sounds good. But I haven't seen it yet. Right? I'm guessing that Snowflake and Alation enable all the self-serve, all this automated governance, and that including those other items, it's got to be a one-off at this point in time.
Do you ever see yourself expanding that scope, or is it better off to just kind of leave it in the Snowflake Data Cloud? >> It's a good question. You know, I feel like where we're at today, especially in terms of technology giving us so many options, I don't think there's a one-size-fits-all. Right? Even though we are very heavily invested in Snowflake and we use Snowflake consistently across the organization, you could, theoretically, have an architecture that blends those two, right? Have different types of data platforms, like a Teradata or an Oracle, and bring it all together. Today we have the technology, you know, and all sorts of things that can make sure that you can query across different databases. So I don't think the technology is the problem, I think it's the organizational mindset. I think that that's what gets in the way. >> Oh, interesting. So I was going to ask you, will hybrid tables help you solve that problem? And maybe not; what you're saying is, it's the organization that owns the Oracle database saying, hey, we have our system. It processes, it works, you know, go away. >> Yeah. Well, you know, hybrid tables, I think, is a great next step in Snowflake's evolution. In my opinion, I think it's a game changer, but yeah. I mean, they can still exist. You could do hybrid tables right on Snowflake, or you could, you know, kind of coexist as well. >> Yeah. But, do you have a thought on this? >> Yeah, I do. I mean, we're always going to live in a time where you've got data distributed throughout the organization and around the globe. And that could be even if you're all in on Snowflake: you could have data in Snowflake here, you could have data in Snowflake in EMEA, in Europe somewhere. It could be anywhere. By the same token, you might be using, every organization is using, on-premises systems. They have data, they naturally have data everywhere.
And so, you know, one solution to this is really centralizing, as I mentioned, not just governance, but also metadata about all of the data in your organization, so that you can enable people to search and find and discover trustworthy data no matter where it is in your organization. >> Yeah. That's a great point. I mean, if you have the data about the data, then you can treat these as independent nodes. That's just that. Right? And maybe there's some advantages of putting it all in the Snowflake cloud, but to your point, organizationally, that's just not feasible. Unfortunately, sorry, Snowflake, all the world's data is not going to go into Snowflake, but they play a key role in accelerating, from what I'm hearing, your vision of data mesh. >> Yeah, absolutely. I think going forward in the future, we have to stop thinking about data platforms as just one place where you sort of dump all the data. That's where the mesh concept comes in. It is going to be a mesh. It's going to be distributed, and organizations have to be okay with that. And they have to embrace the tools. I mean, you know, Facebook developed a tool called Presto many years ago that helps them solve exactly the same problem. So I think the technology is there. I think the organizational mindset needs to evolve. >> Yeah. Definitely. >> Culture. Culture is one of the hardest things to change. >> Exactly. >> Guys, this was a masterclass in data mesh, I think. Thank you so much for coming on and talking. >> We appreciate it. Thank you so much. >> Of course. What Alation is doing with Snowflake and with Warner Bros. Discovery, keep that content coming. I got a lot of stuff I got to catch up on watching. >> Sounds good. Thank you for having us. >> Thanks guys. >> Thanks, you guys. >> For Dave Vellante, I'm Lisa Martin. You're watching theCUBE live from Snowflake Summit '22. We'll be back after a short break. (upbeat music)
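(Editor's sketch) The conversation describes governance policies that say "certain people should have access to certain rows" and apply "column-level masking", defined centrally and enforced automatically at the data layer. The short Python sketch below illustrates only that idea. It is not Alation's or Snowflake's API: the `Policy` class, the role names, the sample rows, and the `query` function are all hypothetical, chosen to mirror the brands mentioned in the interview.

```python
from dataclasses import dataclass

# Hypothetical policy model for illustration only.
# None of these names come from Alation or Snowflake.
@dataclass(frozen=True)
class Policy:
    allowed_brands: frozenset   # row-level rule: which brands' rows a role may see
    masked_columns: frozenset   # column-level rule: which columns are hidden

# Policies are defined once, centrally, and keyed by role (assumed roles).
POLICIES = {
    "hbo_analyst": Policy(frozenset({"HBO"}), frozenset({"subscriber_id"})),
    "cnn_analyst": Policy(frozenset({"CNN"}), frozenset({"subscriber_id"})),
    "central_data_eng": Policy(frozenset({"HBO", "CNN"}), frozenset()),
}

# Made-up sample data standing in for a shared viewership table.
ROWS = [
    {"brand": "HBO", "subscriber_id": "s-001", "minutes_watched": 120},
    {"brand": "CNN", "subscriber_id": "s-002", "minutes_watched": 45},
]

def query(role, rows):
    """Return the rows a role may see, with masked columns replaced by '***'."""
    policy = POLICIES[role]
    visible = []
    for row in rows:
        # Row-level access: skip rows that belong to brands the role cannot see.
        if row["brand"] not in policy.allowed_brands:
            continue
        # Column-level masking: hide sensitive values but keep the column present.
        visible.append(
            {col: ("***" if col in policy.masked_columns else val)
             for col, val in row.items()}
        )
    return visible

if __name__ == "__main__":
    print(query("hbo_analyst", ROWS))
    print(query("central_data_eng", ROWS))
```

In this sketch the consumer never asks a person for access; the role's policy decides what comes back, which is the "I don't have to tap anyone on the shoulder" point made in the interview. A real deployment would express such rules in the warehouse itself rather than in application code.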
SUMMARY :
Mitesh Shah of Alation and Ash Naseer of Warner Bros. Discovery join Lisa Martin and Dave Vellante at Snowflake Summit '22 to discuss data mesh in practice. Naseer explains that the brands own and ingest their own data while a central team handles common data sets, that the catalog is maintained by the teams responsible for the data, and that data quality is managed in the domains where the business context lives. Shah describes Alation as a central hub for governance and metadata, with row-level and column-masking policies defined in Alation and enforced in Snowflake. Both argue that the technology for a distributed mesh already exists and that organizational mindset is the harder thing to change.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Vellante | PERSON | 0.99+ |
Lisa Martin | PERSON | 0.99+ |
CNN | ORGANIZATION | 0.99+ |
HBO | ORGANIZATION | 0.99+ |
Mitesh Shah | PERSON | 0.99+ |
Ash Naseer | PERSON | 0.99+ |
Europe | LOCATION | 0.99+ |
Mitesh | PERSON | 0.99+ |
Elation | ORGANIZATION | 0.99+ |
TNT | ORGANIZATION | 0.99+ |
Warner brothers | ORGANIZATION | 0.99+ |
EMEA | LOCATION | 0.99+ |
second year | QUANTITY | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
2019 | DATE | 0.99+ |
two years | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
Cartoon Network | ORGANIZATION | 0.99+ |
Game of Thrones | TITLE | 0.99+ |
two problems | QUANTITY | 0.99+ |
two | QUANTITY | 0.99+ |
Warner Brothers | ORGANIZATION | 0.99+ |
10th | QUANTITY | 0.99+ |
first | QUANTITY | 0.99+ |
Snowflake | ORGANIZATION | 0.99+ |
Snowflake Summit '22 | EVENT | 0.99+ |
Warner brothers | ORGANIZATION | 0.99+ |
each | QUANTITY | 0.99+ |
four | QUANTITY | 0.99+ |
Las Vegas | LOCATION | 0.99+ |
Median Telcom | ORGANIZATION | 0.99+ |
20 years later | DATE | 0.98+ |
both | QUANTITY | 0.98+ |
five different industries | QUANTITY | 0.98+ |
10 years ago | DATE | 0.98+ |
30 plus brands | QUANTITY | 0.98+ |
Alation | PERSON | 0.98+ |
four years ago | DATE | 0.98+ |
today | DATE | 0.98+ |
20 plus years ago | DATE | 0.97+ |
Warner Brothers Discovery | ORGANIZATION | 0.97+ |
One | QUANTITY | 0.97+ |
five years ago | DATE | 0.97+ |
Snowflake Summit 2022 | EVENT | 0.97+ |
three years ago | DATE | 0.97+ |
five different ways | QUANTITY | 0.96+ |
earlier this week | DATE | 0.96+ |
Snowflake | TITLE | 0.96+ |
Max | TITLE | 0.96+ |
early last year | DATE | 0.95+ |
about 1900 attendees | QUANTITY | 0.95+ |
Snowflake | EVENT | 0.94+ |
Ash | PERSON | 0.94+ |
three-peat | QUANTITY | 0.94+ |
around 10,000 | QUANTITY | 0.93+ |
Mitesh Shah, Alation & Ash Naseer, Warner Bros Discovery | Snowflake Summit 2022
(upbeat music) >> Welcome back to theCUBE's continuing coverage of Snowflake Summit '22 live from Caesar's Forum in Las Vegas. I'm Lisa Martin, my cohost Dave Vellante, we've been here the last day and a half unpacking a lot of news, a lot of announcements, talking with customers and partners, and we have another great session coming for you next. We've got a customer and a partner talking tech and data mash. Please welcome Mitesh Shah, VP in market strategy at Elation. >> Great to be here. >> and Ash Naseer great, to have you, senior director of data engineering at Warner Brothers Discovery. Welcome guys. >> Thank you for having me. >> It's great to be back in person and to be able to really get to see and feel and touch this technology, isn't it? >> Yeah, it is. I mean two years or so. Yeah. Great to feel the energy in the conference center. >> Yeah. >> Snowflake was virtual, I think for two years and now it's great to kind of see the excitement firsthand. So it's wonderful. >> Th excitement, but also the boom and the number of customers and partners and people attending. They were saying the first, or the summit in 2019 had about 1900 attendees. And this is around 10,000. So a huge jump in a short time period. Talk a little bit about the Elation-Snowflake partnership and probably some of the acceleration that you guys have been experiencing as a Snowflake partner. >> Yeah. As a snowflake partner. I mean, Snowflake is an investor of us in Elation early last year, and we've been a partner for, for longer than that. And good news. We have been awarded Snowflake partner of the year for data governance, just earlier this week. And that's in fact, our second year in a row for winning that award. So, great news on that front as well. >> Repeat, congratulations. >> Repeat. Absolutely. And we're going to hope to make it a three-peat as well. 
And we've also been awarded industry competency badges in five different industries, those being financial services, healthcare, retail technology, and Median Telcom. >> Excellent. Okay. Going to right get into it. Data mesh. You guys actually have a data mesh and you've presented at the conference. So, take us back to the beginning. Why did you decide that you needed to implement something like data mesh? What was the impetus? >> Yeah. So when people think of Warner brothers, you always think of like the movie studio, but we're more than that, right? I mean, you think of HBO, you think of TNT, you think of CNN, we have 30 plus brands in our portfolio and each have their own needs. So the idea of a data mesh really helps us because what we can do is we can federate access across the company so that, you know, CNN can work at their own pace. You know, when there's election season, they can ingest their own data and they don't have to, you know, bump up against as an example, HBO, if Game of Thrones is going on. >> So, okay. So the, the impetus was to serve those lines of business better. Actually, given that you've got these different brands, it was probably easier than most companies. Cause if you're, let's say you're a big financial services company, and now you have to decide who owns what. CNN owns its own data products, HBO. Now, do they decide within those different brands, how to distribute even further? Or is it really, how deep have you gone in that decentralization? >> That's a great question. It's a very close partnership, because there are a number of data sets, which are used by all the brands, right? You think about people browsing websites, right? You know, CNN has a website, Warner brothers has a website. So for us to ingest that data for each of the brands to ingest that data separately, that means five different ways of doing things and you know, a big environment, right? So that is where our team comes into play. 
We ingest a lot of the common data sets, but like I said, any unique data sets, data sets regarding theatrical as an example, you know, Warner brothers does it themselves, you know, for streaming, HBO Max, does it themselves. So we kind of operate in partnership. >> So do you have a centralized data team and also decentralized data teams, right? >> That's right. >> So I love this conversation because that was heresy 10 years ago, five years ago, even, cause that's inefficient. But you've, I presume you've found that it's actually more productive in terms of the business output, explain that dynamic. >> You know, you bring up such a good point. So I, you know, I consider myself as one of the dinosaurs who started like 20 plus years ago in this industry. And back then, we were all taught to think of the data warehouse as like a monolithic thing. And the reason for that is the technology wasn't there. The technology didn't catch up. Now, 20 years later, the technology is way ahead, right? But like, our mindset's still the same because we think of data warehouses and data platforms still as a monolithic thing. But if you really sort of remove that sort of mental barrier, if you will, and if you start thinking about, well, how do I sort of, you know, federate everything and make sure that you let folks who are building, or are closest to the customer or are building their products, let them own that data and have a partnership. The results have been amazing. And if we were only sort of doing it as a centralized team, we would not be able to do a 10th of what we do today. So it's that massive scale in, in our company as well. >> And I should have clarified, when we talk about data mesh are we talking about the implementing in practice, the octagon sort of framework, or is this sort of your own sort of terminology? >> Well, so the interesting part is four years ago, we didn't have- >> It didn't exist. >> Yeah. It didn't exist. 
And, and so we, our principle was very simple, right? When we started out, we said, we want to make sure that our brands are able to operate independently with some oversight and guidance from our technology teams, right? That's what we set out to do. We did that with Snowflake by design because Snowflake allows us to, you know, separate those, those brands into different accounts. So that was done by design. And then the, the magic, I think, is the Snowflake data sharing where, which allows us to sort of bring data in here once, and then share it with whoever needs it. So think about HBO Max. On HBO Max, You not only have HBO Max content, but content from CNN, from Cartoon Network, from Warner Brothers, right? All the movies, right? So to see how The Batman movie did in theaters and then on streaming, you don't need, you know, Warner brothers doesn't need to ingest the same streaming data. HBO Max does it. HBO Max shares it with Warner brothers, you know, store once, share many times, and everyone works at their own pace. >> So they're building data products. Those data products are discoverable APIs, I presume, or I guess maybe just, I guess the Snowflake cloud, but very importantly, they're governed. And that's correct, where Elation comes in? >> That's precisely where Elation comes in, is where sort of this central flexible foundation for data governance. You know, you mentioned data mesh. I think what's interesting is that it's really an answer to the bottlenecks created by centralized IT, right? There's this notion of decentralizing that the data engineers and making the data domain owners, the people that know the data the best, have them be in control of publishing the data to the data consumers. There are other popular concepts actually happening right now, as we speak, around modern data stack. Around data fabric that are also in many ways underpinned by this notion of decentralization, right? 
These are concepts that are underpinned by decentralization and as the pendulum swings, sort of between decentralization and centralization, as we go back and forth in the world of IT and data, there are certain constants that need to be centralized over time. And one of those I believe is very much a centralized platform for data governance. And that's certainly, I think where we come in. Would love to hear more about how you use Elation. >> Yeah. So, I mean, elation helps us sort of, as you guys say, sort of, map, the treasure map of the data, right? So for consumers to find where their data is, that's where Elation helps us. It helps us with the data cataloging, you know, storing all the metadata and, you know, users can go in, they can sort of find, you know, the data that they need and they can also find how others are using data. So it's, there's a little bit of a crowdsourcing aspect that Elation helps us to do whereby you know, you can see, okay, my peer in the other group, well, that's how they use this piece of data. So I'm not going to spend hours trying to figure this out. You're going to use the query that they use. So yeah. >> So you have a master catalog, I presume. And then each of the brands has their own sub catalogs, is that correct? >> Well, for the most part, we have that master catalog and then the brands sort of use it, you know, separately themselves. The key here is all that catalog, that catalog isn't maintained by a centralized group as well, right? It's again, maintained by the individual teams and not only in the individual teams, but the folks that are responsible for the data, right? So I talked about the concept of crowdsourcing, whoever sort of puts the data in, has to make sure that they update the catalog and make sure that the definitions are there and everything sort of in line. >> So HBO, CNN, and each have their own, sort of access to their catalog, but they feed into the master catalog. 
Is that the right way to think about it? >> Yeah. >> Okay. And they have their own virtual data warehouses, right? They have ownership over that? They can spin 'em up, spin 'em down as they see fit? Right? And they're governed. >> They're governed. And what's interesting is it's not just governed, right? Governance is a, is a big word. It's a bit nebulous, but what's really being enabled here is this notion of self-service as well, right? There's two big sort of rockets that need to happen at the same time in any given organization. There's this notion that you want to put trustworthy data in the hands of data consumers, while at the same time mitigating risk. And that's precisely what Elation does. >> So I want to clarify this for the audience. So there's four principles of database. This came after you guys did it. And I wonder how it aligns. Domain ownership, give data, as you were saying to the, to the domain owners who have context, data as product, you guys are building data products, and that creates two problems. How do you give people self-service infrastructure and how do you automate governance? So the first two, great. But then it creates these other problems. Does that align with your philosophy? Where's alignment? What's different? >> Yeah. Data products is exactly where we're going. And that sort of, that domain based design, that's really key as well. In our business, you think about who the customer is, as an example, right? Depending on who you ask, it's going to be, the answer might be different, you know, to the movie business, it's probably going to be the person who watches a movie in a theater. To the streaming business, to HBO Max, it's the streamer, right? To others, someone watching live CNN on their TV, right? There's yet another group. Think about all the franchising we do. So you see Batman action figures and T-shirts, and Warner brothers branded stuff in stores, that's yet another business unit. 
But at the end of the day, it's not a different person, it's you and me, right? We do all these things. So the domain concept, make sure that you ingest data and you bring data relevant to the context, however, not sort of making it so stringent where it cannot integrate, and then you integrate it at a higher level to create that 360. >> And it's discoverable. So the point is, I don't have to go tap Ash on the shoulder, say, how do I get this data? Is it governed? Do I have access to it? Give me the rules of it. Just, I go grab it, right? And the system computationally automates whether or not I have access to it. And it's, as you say, self-service. >> In this case, exactly right. It enables people to just search for data and know that when they find the data, whether it's trustworthy or not, through trust flags, and the like, it's doing both of those things at the same time. >> How is it an enabler of solving some of the big challenges that the media and entertainment industry is going through? We've seen so much change the last couple of years. The rising consumer expectations aren't going to go back down. They're only going to come up. We want you to serve us up content that's relevant, that's personalized, that makes sense. I'd love to understand from your perspective, Mitesh, from an industry challenges perspective, how does this technology help customers like Warner Brothers Discovery, meet business customers, where they are and reduce the volume on those challenges? >> It's a great question. And as I mentioned earlier, we had five industry competency badges that were awarded to us by Snowflake. And one of those four, Median Telcom. And the reason for that is we're helping media companies understand their audiences better, and ultimately serve up better experiences for their audiences. But we've got Ash right here that can tell us how that's happening in practice. >> Yeah, tell us. >> So I'll share a story. I always like to tell stories, right? 
Once once upon a time before we had Elation in place, it was like, who you knew was how you got access to the data. So if I knew you and I knew you had access to a certain kind of data and your access to the right kind of data was based on the network you had at the company- >> I had to trust you. >> Yeah. >> I might not want to give up my data. >> That's it. And so that's where Elation sort of helps us democratize it, but, you know, puts the governance and controls, right? There are certain sensitive things as well, such as viewership, such as subscriber accounts, which are very important. So making sure that the right people have access to it, that's the other problem that Elation helps us solve. >> That's precisely part of our integration with Snowflake in particular, being able to define and manage policies within Elation. Saying, you know, certain people should have access to certain rows, doing column level masking. And having those policies actually enforced at the Snowflake data layer is precisely part of our value product. >> And that's automated. >> And all that's automated. Exactly. >> Right. So I don't have to think about it. I don't have to go through the tap on their shoulder. What has been the impact, Ash, on data quality as you've pushed it down into the domains? >> That's a great question. So it has definitely improved, but data quality is a very interesting subject, because back to my example of, you know, when we started doing things, we, you know, the centralized IT team always said, well, it has to be like this, Right? And if it doesn't fit in this, then it's bad quality. Well, sometimes context changes. Businesses change, right? You have to be able to react to it quickly. So making sure that a lot of that quality is managed at the decentralized level, at the place where you have that business context, that ensures you have the most up to date quality. We're talking about media industry changing so quickly. 
I mean, would we have thought three years ago that people would watch a lot of these major movies on streaming services? But here's the reality, right? You have to react and, you know, having it at that level just helps you react faster. >> So data, if I play that back, data quality is not a static framework. It's flexible based on the business context and the business owners can make those adjustments, cause they own the data. >> That's it. That's exactly it. >> That's awesome. Wow. That's amazing progress that you guys have made. >> In quality, if I could just add, it also just changes depending on where you are in your data pipeline stage, right? Data, quality data observability, this is a very fast evolving space at the moment, and if I look to my left right now, I bet you I can probably see a half-dozen quality observability vendors right now. And so given that and given the fact that Elation still is sort of a central hub to find trustworthy data, we've actually announced an open data quality initiative, allowing for best-of-breed data quality vendors to integrate with the platform. So whoever they are, whatever tool folks want to use, they can use that particular tool of choice. >> And this all runs in the cloud, or is it a hybrid sort of? >> Everything is in the cloud. We're all in the cloud. And you know, again, helps us go faster. >> Let me ask you a question. I could go on forever in this topic. One of the concepts that was put forth is whether it's a Snowflake data warehouse or a data bricks, data lake, or an Oracle data warehouse, they should all be inclusive. They should just be a node on the mesh. Like, wow, that sounds good. But I haven't seen it yet. Right? I'm guessing that Snowflake and Elation enable all the self-serve, all this automated governance, and that including those other items, it's got to be a one-off at this point in time. 
Do you ever see you expanding that scope or is it better off to just kind of leave it into the, the Snowflake data cloud? >> It's a good question. You know, I feel like where we're at today, especially in terms of sort of technology giving us so many options, I don't think there's a one size fits all. Right? Even though we are very heavily invested in Snowflake and we use Snowflake consistently across the organization, but you could, theoretically, could have an architecture that blends those two, right? Have different types of data platforms like a teradata or an Oracle and sort of bring it all together today. We have the technology, you know, that and all sorts of things that can make sure that you query on different databases. So I don't think the technology is the problem, I think it's the organizational mindset. I think that that's what gets in the way. >> Oh, interesting. So I was going to ask you, will hybrid tables help you solve that problem? And, maybe not, what you're saying, it's the organization that owns the Oracle database saying, Hey, we have our system. It processes, it works, you know, go away. >> Yeah. Well, you know, hybrid tables I think, is a great sort of next step in Snowflake's evolution. I think it's, in my opinion, I, think it's a game changer, but yeah. I mean, they can still exist. You could do hybrid tables right on Snowflake, or you could, you know, you could kind of coexist as well. >> Yeah. But, do you have a thought on this? >> Yeah, I do. I mean, we're always going to live in a time where you've got data distributed in throughout the organization and around the globe. And that could be even if you're all in on Snowflake, you could have data in Snowflake here, you could have data in Snowflake in EMEA and Europe somewhere. It could be anywhere. By the same token you might be using. Every organization is using on-premises systems. They have data, they naturally have data everywhere. 
And so, you know, the one solution to this is really centralizing, as I mentioned, not just governance, but also metadata about all of the data in your organization, so that you can enable people to search and find and discover trustworthy data no matter where it is in your organization. >> Yeah. That's a great point. I mean, if you have the data about the data, then you can treat these as independent nodes. That's just that. Right? And maybe there's some advantages of putting it all in the Snowflake cloud, but to your point, organizationally, that's just not feasible. Unfortunately, sorry, Snowflake, all the world's data is not going to go into Snowflake, but they play a key role in accelerating, from what I'm hearing, your vision of data mesh. >> Yeah, absolutely. I think going forward, we have to stop thinking about data platforms as just one place where you sort of dump all the data. That's where the mesh concept comes in. It is going to be a mesh. It's going to be distributed, and organizations have to be okay with that. And they have to embrace the tools. I mean, you know, Facebook developed a tool called Presto many years ago that helps them solve exactly the same problem. So I think the technology is there. I think the organizational mindset needs to evolve. >> Yeah. Definitely. >> Culture. Culture is one of the hardest things to change. >> Exactly. >> Guys, this was a masterclass in data mesh, I think. Thank you so much for coming on and talking. >> We appreciate it. Thank you so much. >> Of course. What Alation is doing with Snowflake and with Warner Bros. Discovery, keep that content coming. I've got a lot of stuff I've got to catch up on watching. >> Sounds good. Thank you for having us. >> Thanks, guys. >> Thanks, you guys. >> For Dave Vellante, I'm Lisa Martin. You're watching theCUBE live from Snowflake Summit '22. We'll be back after a short break. (upbeat music)
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Vellante | PERSON | 0.99+ |
Lisa Martin | PERSON | 0.99+ |
CNN | ORGANIZATION | 0.99+ |
HBO | ORGANIZATION | 0.99+ |
Mitesh Shah | PERSON | 0.99+ |
Ash Naseer | PERSON | 0.99+ |
Europe | LOCATION | 0.99+ |
ORGANIZATION | 0.99+ | |
Mitesh | PERSON | 0.99+ |
Elation | ORGANIZATION | 0.99+ |
TNT | ORGANIZATION | 0.99+ |
Warner brothers | ORGANIZATION | 0.99+ |
EMEA | LOCATION | 0.99+ |
second year | QUANTITY | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
2019 | DATE | 0.99+ |
two years | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
Cartoon Network | ORGANIZATION | 0.99+ |
Game of Thrones | TITLE | 0.99+ |
two problems | QUANTITY | 0.99+ |
two | QUANTITY | 0.99+ |
Warner Brothers | ORGANIZATION | 0.99+ |
10th | QUANTITY | 0.99+ |
first | QUANTITY | 0.99+ |
Snowflake | ORGANIZATION | 0.99+ |
Snowflake Summit '22 | EVENT | 0.99+ |
Warner brothers | ORGANIZATION | 0.99+ |
each | QUANTITY | 0.99+ |
four | QUANTITY | 0.99+ |
Las Vegas | LOCATION | 0.99+ |
Median Telcom | ORGANIZATION | 0.99+ |
20 years later | DATE | 0.98+ |
both | QUANTITY | 0.98+ |
five different industries | QUANTITY | 0.98+ |
10 years ago | DATE | 0.98+ |
30 plus brands | QUANTITY | 0.98+ |
Alation | PERSON | 0.98+ |
four years ago | DATE | 0.98+ |
today | DATE | 0.98+ |
20 plus years ago | DATE | 0.97+ |
Warner Brothers Discovery | ORGANIZATION | 0.97+ |
One | QUANTITY | 0.97+ |
five years ago | DATE | 0.97+ |
Snowflake Summit 2022 | EVENT | 0.97+ |
three years ago | DATE | 0.97+ |
five different ways | QUANTITY | 0.96+ |
earlier this week | DATE | 0.96+ |
Snowflake | TITLE | 0.96+ |
Max | TITLE | 0.96+ |
early last year | DATE | 0.95+ |
about 1900 attendees | QUANTITY | 0.95+ |
Snowflake | EVENT | 0.94+ |
Ash | PERSON | 0.94+ |
three-peat | QUANTITY | 0.94+ |
around 10,000 | QUANTITY | 0.93+ |
Kapil Thangavelu & Umair Khan, Stacklet | Kubecon + Cloudnativecon Europe 2022
>>theCUBE presents KubeCon and CloudNativeCon Europe 2022, brought to you by Red Hat, the Cloud Native Computing Foundation, and its ecosystem partners. >>Welcome to Valencia, Spain, and KubeCon CloudNativeCon Europe 2022. I'm your host, Keith Townsend. And we're continuing the conversation with community, with startups, with people building cloud native. We have a CUBE alum joined by a CTO. And as the CTO Advisor, I really appreciate talking to CTOs. Kapil Thangavelu, forgive me if I murder the name, that's a tough one. I'm getting warmed up to theCUBE, but don't worry. When we get to the technical parts, it's gonna be fun. And then a CUBE alum, Umair Khan, Director of Marketing. Kapil, you're the CTO, so we'll start out with you. What's the problem statement? What are you guys doing? >>So, we're building on top of an open source project, Cloud Custodian, that is in the CNCF, and that I built when I was at Capital One, just as they were taking those first few steps, as a large regulated enterprise, into the cloud. And the challenge that I saw was, you know, how do we enable developers to pick whatever tools and technologies they want, if they wanna use Terraform or CloudFormation or Ansible? I mean, the cloud gives us APIs, and we wanna be able to enable people to use those APIs in innovative ways. But at the same time, we wanna make sure that, regardless of what choices those developers make, the organization is being well managed, that all those resources, all that infrastructure, is complying with the organization's policies. And what we saw at the time was that we were getting impediments around our velocity into the cloud, because we had to cover off on all of the compliance and regulation aspects. >>And we were doing them as one-offs. 
And so, taking a step back, I realized that what we really needed was a way to go faster on the compliance side, and Cloud Custodian was born out of that effort, a side-of-desk project that we took enterprise-wide. And it was really about accelerating the velocity around compliance, but doing it in the same way that we do application and infrastructure as code. So doing policy as code in a very simple, readable YAML DSL, because, you know, anytime we write code, more people are gonna read that code than are going to need to be able to write it. And so being able to make it really easy to understand, both for the developers that are in the environment and for the compliance folks or auditors or security folks that might wanna review it, was super important. And then, at the time, we saw lots of vendor products, and they were all just big walls of red in somebody's corner office, and actually getting that information back into the hands of developers so that they can fix things was problematic. So being able to do real-time remediation and real-time collaboration and communication back to developers: hey, you put a database on the internet. It's okay, we fixed it for you, and here's the corporate policy on how to do it better in the future. >>So this is an area of focus of mine that people, I think, don't get right a lot. The technology is hard enough by itself. The transformation to cloud is not just about adopting new technologies, but adopting new processes. The data and information is there automatically. But when I go to an auditor or compliance and say, hey, we've changed the process for how we do change control for our software stack, I get a blank stare. It's, what do you mean? We've been doing it this way for the past 15, 20 years. That's resistance, it's a pain point, and projects fail due to this issue. So talk to me about that initial customer engagement. 
What's that conversation like? >>So we start off by deploying our platform on top of Cloud Custodian for our customers, and we give them a view of all the things that are in their cloud, what their baseline is, so to speak. But I think you bring up a good point about communication. The larger challenge for enterprises in the cloud, especially with compliance, is understanding that it is not a steady state. There's always something new in the backlog. And one of the challenges for larger orgs is just being able to communicate out what that is. I remember changing a tag policy and spending the next two years explaining to people what the actual tag policy was. And so being able to actually inform them, you know, via email, via Slack, via any communication mechanism, as they're doing things is so powerful, to be able to help the organization grow together and move and get alignment about what the new things are. >>And then additionally, you know, from the perspective of tooling that is built for the real world, as those new policies come into play, being able to say, okay, we're going to segment into stopping the bleeding on the net new, and then being able to take action on what's already deployed that now needs to come into compliance, is really important. But coming back to your question on customer engagements: we'll go in and we'll deploy the Stacklet platform for them. We'll basically show them all of the things that are already there and extant. We provide a real-time SQL interface that customers can use, which is an asset inventory of all their cloud assets. And then we provide policy packs that sort of cover off on compliance, security, and cost optimization opportunities for them. 
And then we help them, through GitOps around those policies, deploy remediation activities and capabilities for their environment. >>So walk me through some of the detail of the process, and where the software helps and where people need to step in. I'm talking to my security auditor, and he's saying, you know what, Keith, I understand that with the application VM talking to the Oracle database, there is a firewall rule that says that can happen. Show me that rule in Cloud Custodian. And you're trying to explain, well, there's no longer a firewall. There's a service, and the service is talking to that. Where do Cloud Custodian and Stacklet come in to help with the conversation, and where do I inject more of my experience and my ability to negotiate with the auditor? >>So, Stacklet, from that perspective, and if we take a step back, we talk about governance as code and the four pillars around compliance, security, cost optimization, and operations that we help organizations with. But if we take a step back, what is Cloud Custodian? Cloud Custodian is really a cloud orchestrator, a resource orchestrator. What <inaudible> provides on top of that is UI/UX, policy packs, and at-scale execution across thousands of accounts. But in the context of an auditor, what we're really providing is: here's the policy that we're enforcing, here's the evidence, the attestation over time, and here's the resource database with history that shows how we got here. Were we compliant last year with this policy that we just wrote today? >>So, shifting the conversation, you just mentioned operations. One of the larger conversations that I have with CIOs and CTOs is, where do I put my people? Like, this is a really tough challenge. 
When you look at moving to something like an SRE model, or, let's say, even focus on the SRE, where does the SRE sit in an organization? How does Stacklet, if at all, help me make those types of strategic decisions if I'm talking about governance overall? >>So I think in terms of personas, if you look at it, there's a cloud engineer, then the SRE. I think that what Stacklet and Cloud Custodian provide at their core is a centralized engine, right? So your cost policies, your compliance policies, your security policies are not in a silo anymore. It's one tool. It's one repository that everyone can collaborate on as well. And even engineering, a lot of engineering teams run Custodian and adopt Custodian as well. So in terms of personas, Stacklet really helps bring it together. All teams have the same simple YAML DSL file in which they can write their policies, share their policies, and communicate and collaborate better as well. >>Yeah. So I mean, cloud transformation for an enterprise is a deeper topic. I think, you know, there's a lot of good best practices: establishing a cloud center of excellence, investing in training for people, getting certification so everyone can speak the same language when it comes to cloud is a key aspect. When it comes to the operations aspect, I very much believe that you should try to devolve and get the developers writing some of the DevOps. And so having SREs around for the actual application teams is valuable, but you still have a core cloud infrastructure engineering group that's doing potentially any of your core networking, any of your, you know, IAM authentication aspects. And so what we found is that Stacklet and Cloud Custodian primarily get deployed by one of three groups. >>You know, you've got the CIO buyer within that cloud infrastructure engineering team. 
And what we found is that that group, because they're working with the application teams in a read-write way, is very much more used to doing, and open to doing, remediation in real time. And then we also have the CISO teams that want to get to a secure, compliant state, and be able to audit and validate that all the environments are, you know, secure, frankly. And then we get to the CFO groups, and this sometimes is part of the cloud center of excellence, and so it has to be this cross-team collaboration. And they're really focused on that cost optimization: finding the over-provisioned, underutilized things, establishing workflows for dev environments to turn them off at night, and of course respective of time zones, 'cause we're all global these days. And so those are sort of the three groups that we see that really want to engage with us, because we can provide value for them to help accelerate their business goals. >>So that's an expansive view: cost, compliance, security, operations. That's a lot. I'm thinking about all the tools, all the information that feeds into that. Where does Cloud Custodian start and stop? Like, am I putting Cloud Custodian agents on servers or pods? How am I interacting with this? >>So the core Cloud Custodian is just a CLI. It's stateless, it's designed to be operationally simple. And so you can run it in Kubernetes, in Jenkins. We've seen people use GitLab. We've seen people run it just as an interactive query tool, from an investigations perspective, on their laptop. But when you write a policy, a policy really consists of, you know, a couple of core elements. You identify a resource you want to target, say an S3 bucket or a Google Cloud VM. And then you establish a set of filters. 
I want to look for all the EC2 instances that are on public subnets with an IAM role attached that has the ability to create another IAM user. And so, you know, you filter down, you ask the arbitrary questions to filter to the interesting set of things you want, and then you take a set of actions on them. >>So you might take an action like stopping an EC2 instance, and you might use it as an incident response. You might use it for off hours in that type of policy. So you get this library of filters and actions that you can combine to form, you know, millions of different types of policies. Now, we also have this notion of an execution mode. So you might say, let's operate in real time. Whenever someone launches this instance, whenever there's an API call, we want to introspect what that API call is doing and make sure that it's compliant with policy. Now, when you do that and you run it with the CLI, Custodian will actually provision a Lambda function and hook up the event sources to it. And sorry, not just Lambda, really serverless: we bind into the serverless-native capabilities of the underlying cloud provider. So Google Cloud Functions, Azure serverless functions, and, natively, AWS Lambda. And so now that policy is effectively hermetically sealed, running in the serverless runtime of that cloud and responding to API calls in real time, all with, you know, structured outputs and logs and metrics to the native cloud provider capabilities around those. And that really ensures that, you know, it effectively becomes operations-free from the perspective of the user having to maintain infrastructure >>for it. So let's talk about >>Agentless and API-based. >>Let's talk about a non-developer use case, specifically finance. Absolutely. You have the ability to deploy SAP in an EC2 instance, but it's very expensive. 
Do it only when you absolutely need to do it, but you have the rights to do it. And I wanna run a check to see if anyone's doing it. Like, this isn't a coder or developer. What is their experience? >>So primarily we focus on the infrastructure. So load balancers, VMs, you know, encryption at rest on disks. When we get into the application workloads running on those instances, that's not our target focus area. We can do it, and it really depends on the underlying cloud provider's capabilities. So in Amazon, there's a system called Systems Manager, and it's basically running an agent on the box. We're not running the agent, but we can communicate with that agent. We can inspect the inventory that's running on that box. We can send commands to that box through those serverless functions and through those policies. And so we see it commonly used for, like, incident response from a security perspective, where you might wanna take a memory snapshot of the instance before, yeah, putting it into a forensic cloud. And adding >>to that, these days we're seeing the emerging personas of a FinOps engineer or a FinOps director as well, because cost in cloud is totally different. So what Custodian and Stacklet allow them to do is, again, use the simple policy files. Even if they have a non-developer background, they can understand this DSL, they can create policies, they can better target developers and better get them to take actions on policy as well, if they're overspending in the cloud or underspending in the cloud. Especially with Stacklet, they get a lot of out-of-the-box dashboards and policy packs too. So they can really understand how the cost has been consumed. They can have the developers take actions, because a lot of the FinOps and finance people complain, like, my developers do not understand it. Right. 
How do we get them to take action and make sure we are not overspending? Right. So with Custodian policies, they're able to send them educational messages on Slack or open a Jira ticket, and really get them to take action as well and start saving cost. >>If you imagine Cloud Custodian as, you know, cleaning staff for your cloud environment: if you go to a typical cloud account, you're gonna see chairs that are 10 feet tall sitting at the table, because it's been over-provisioned, and obviously no one can use it. You're gonna find the trash is overflowing, because no one set up a log retention policy on the log group or set up S3 lifecycle rules on their buckets. And so you just have this explosion of things. Beyond application functioning, beyond, you know, getting to high-performance, DR-capable SLAs around your application model, you now have to worry about the lifecycle of all those resources, and helping people manage that lifecycle and making sure that they're using just the resources and consumption that they need, because we're all utilization-based in the cloud. And so getting that to be more in line with what the application actually needs is really where we can help organizations in the CFO cost context. >>So, Umair, you've got 10 seconds to tell me why you brought me a comic book. >>(laughs) We created this comic book to explain the concept of governance as code in a simplified fashion. I know, Keith, you like comic books, I believe. So it's a simple way of describing what we do and why it's important for FinOps and SecOps teams. And it talks about Custodian and Stacklet as well. >>Well, I'm more of an Iron Man type of guy, or Batman. Cloud governance, cloud native governance, is a very tough problem. 
I can't overemphasize how many projects get stalled or fail from a perception perspective, even if you've technically delivered what you were asked to deliver. That's where a lot of these conversations are going. We're gonna talk to a bunch of startups that are solving these tough problems. From Valencia, Spain, I'm Keith Townsend, and you're watching theCUBE, the leader in high-tech coverage.
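
The policy anatomy described in this conversation (pick a resource type, narrow it with filters, then apply actions) can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not Cloud Custodian's actual code or its YAML DSL: the `Policy` class, the filter and action helpers, and the inventory records are all hypothetical names invented for illustration, and the "stop" action only flips a field on a dict rather than calling any cloud API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# A resource is modeled as a plain dict; real cloud inventories are richer.
Resource = Dict[str, Any]
Filter = Callable[[Resource], bool]
Action = Callable[[Resource], None]


@dataclass
class Policy:
    """Toy policy: a target resource type, filters to narrow it, actions to apply."""
    name: str
    resource_type: str
    filters: List[Filter] = field(default_factory=list)
    actions: List[Action] = field(default_factory=list)

    def select(self, inventory: List[Resource]) -> List[Resource]:
        # Keep only resources of the target type that pass every filter.
        return [
            r for r in inventory
            if r["type"] == self.resource_type and all(f(r) for f in self.filters)
        ]

    def run(self, inventory: List[Resource]) -> List[Resource]:
        # Apply each action, in order, to every matched resource.
        matched = self.select(inventory)
        for resource in matched:
            for action in self.actions:
                action(resource)
        return matched


# Hypothetical filters mirroring the example given in the interview.
def on_public_subnet(resource: Resource) -> bool:
    return resource.get("subnet") == "public"


def role_can_create_users(resource: Resource) -> bool:
    return "iam:CreateUser" in resource.get("role_permissions", [])


# Hypothetical action: mark the instance as stopped (no cloud API is called).
def stop(resource: Resource) -> None:
    resource["state"] = "stopped"


# Made-up inventory records for the demonstration.
inventory = [
    {"id": "i-1", "type": "ec2", "subnet": "public",
     "role_permissions": ["iam:CreateUser"], "state": "running"},
    {"id": "i-2", "type": "ec2", "subnet": "private",
     "role_permissions": ["iam:CreateUser"], "state": "running"},
    {"id": "b-1", "type": "s3", "subnet": None,
     "role_permissions": [], "state": "active"},
]

policy = Policy(
    name="stop-risky-public-instances",
    resource_type="ec2",
    filters=[on_public_subnet, role_can_create_users],
    actions=[stop],
)

matched = policy.run(inventory)
print([r["id"] for r in matched])  # prints ['i-1']
```

Composing small filters and actions this way is what lets a modest library cover many different policies. The execution mode discussed in the interview (a periodic run versus reacting to API calls) would decide when `run` is invoked, which this sketch leaves out.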
ENTITIES
Entity | Category | Confidence |
---|---|---|
Laura | PERSON | 0.99+ |
Lisa Martin | PERSON | 0.99+ |
Stu Miniman | PERSON | 0.99+ |
2015 | DATE | 0.99+ |
John Troyer | PERSON | 0.99+ |
Umair Khan | PERSON | 0.99+ |
Laura Dubois | PERSON | 0.99+ |
Keith Townsend | PERSON | 0.99+ |
1965 | DATE | 0.99+ |
Keith | PERSON | 0.99+ |
Laura Dubois | PERSON | 0.99+ |
Dell | ORGANIZATION | 0.99+ |
Emil | PERSON | 0.99+ |
Cloud Native Computing Foundation | ORGANIZATION | 0.99+ |
Fidelity | ORGANIZATION | 0.99+ |
Lisa | PERSON | 0.99+ |
1946 | DATE | 0.99+ |
10 seconds | QUANTITY | 0.99+ |
2020 | DATE | 0.99+ |
2019 | DATE | 0.99+ |
Amr Abdelhalem | PERSON | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Red Hat | ORGANIZATION | 0.99+ |
Kapil Thangavelu | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
San Diego | LOCATION | 0.99+ |
10 feet | QUANTITY | 0.99+ |
Avamar | ORGANIZATION | 0.99+ |
Amr | PERSON | 0.99+ |
One | QUANTITY | 0.99+ |
San Diego, California | LOCATION | 0.99+ |
12 months | QUANTITY | 0.99+ |
one tool | QUANTITY | 0.99+ |
Fidelity Investments | ORGANIZATION | 0.99+ |
tens of thousands | QUANTITY | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
thousands | QUANTITY | 0.99+ |
one repository | QUANTITY | 0.99+ |
Lambda | TITLE | 0.99+ |
Dell Technologies | ORGANIZATION | 0.99+ |
Tens of thousands | QUANTITY | 0.99+ |
six month | QUANTITY | 0.99+ |
8000 people | QUANTITY | 0.99+ |
next year | DATE | 0.99+ |
10,000 developers | QUANTITY | 0.99+ |
last year | DATE | 0.99+ |
214 | OTHER | 0.99+ |
six months later | DATE | 0.99+ |
C two | TITLE | 0.99+ |
today | DATE | 0.99+ |
fourth year | QUANTITY | 0.99+ |
three | QUANTITY | 0.99+ |
NoSQL | TITLE | 0.99+ |
CNCF | ORGANIZATION | 0.99+ |
one | QUANTITY | 0.99+ |
150,000 | QUANTITY | 0.99+ |
79% | QUANTITY | 0.99+ |
KubeCon | EVENT | 0.99+ |
2022 | DATE | 0.99+ |
OpenVMS | TITLE | 0.99+ |
Networker | ORGANIZATION | 0.99+ |
GitOps | TITLE | 0.99+ |
DOD | ORGANIZATION | 0.99+ |
Rob Thomas, IBM | IBM Data and AI Forum
>>Live from Miami, Florida, it's theCUBE, covering IBM's Data and AI Forum, brought to you by IBM. >>Welcome back to the Port of Miami, everybody. You're watching theCUBE, the leader in live tech coverage. We're here covering the IBM Data and AI Forum. Rob Thomas is here. He's the general manager for Data and AI at IBM. Great to see you again. >>Great to see you here in Miami. Beautiful week here on the beach area. It's >>nice. Yeah. This is quite an event. I mean, I had thought it was gonna be, like, roughly 1,000 people. It's oversold: more than 1,700 people here. This is a learning event, right? I mean, people here, they're here to absorb best practice, you know, learn from technical, hands-on presentations. Tell us a little bit more about how this event has evolved. >>It started as a really small training event, like you said, which goes back five years. And what we saw was that those people weren't looking for the normal kind of conference. They wanted to be hands-on. They want to build something. They want to come here and leave with something they didn't have when they arrived. So it started as a little small builder conference and now somehow continues to grow every year, which we're very thankful for. And we continue to kind of expand the sessions. We've had to add hotels this year, so it's really taken off. >>You, in your title, have two of the three superpowers: data and AI. And of course, cloud is the third superpower, which is part of IBM's portfolio. But people want to apply those superpowers, and you used that metaphor in your keynote today, to really transform their business. But you pointed out that AI is only about 4 to 10% penetrated within organizations, and you talked about some of the barriers to that, but there is a real appetite to learn, isn't there? >>There is. Let's talk about the superpower for a bit. AI does give employees superpowers, because they can do things now that they couldn't do before. But you think about superheroes. 
They all have an origin story. They always have somewhere where they started. And applying AI in an organization, it's actually not about doing something completely different. It's about accentuating what you already do, doing something massively better that's kind of in your DNA already. So we're encouraging all of our clients this week: use the time to understand what you're great at, what your value proposition is, and then how do you use AI to accentuate that? Because your superpower is only gonna last if it starts with who you are as a company or as a >>person. Who was your favorite superhero as a kid? Let's see. I was >>kind of into the whole Hall of Justice, Superman, that kind of thing. That was probably my cartoon. >>I was a Batman guy. And the reason I love that movie is the combination of tech. It kind of reminds me of what's happening here today in the marketplace. People are taking data, they're taking AI, they're applying machine intelligence to that data to create new insights, which they couldn't have before. But to your point, there's an issue with the quality of data, and there's a skills gap as well. So let's start with the data quality problem. Describe that problem, and how are you guys attacking it? >>Your AI is only as good as your data. I'd say that's the fundamental problem in the organizations we've worked with: 80% of the projects get slowed down or get stopped because the company has a data problem. That's why we introduced this idea of the AI Ladder, which is all of the steps that a company has to think about for how they get to a level of data maturity that supports AI. So how they collect their data, organize their data, analyze their data, and ultimately begin to infuse AI into business processes. So every organization needs to climb that ladder, and they're all at different spots. So for some it might be, we've got to focus on organization, a data catalog. 
For others, it might be, we've got to do a better job of data collection and data management. That's for every organization to figure out. But you need a methodical approach to how you attack the data problem. >>So I wanna ask you about the AI Ladder. You have these verbs, and the verbs overlay on building blocks. I went back to some of my notes on the original AI Ladder conversation that you introduced a while back. It was data and information architecture at the base, and then building on that, analytics, machine learning, AI. And now you've added the verbs: collect, organize, analyze, and infuse. Should we think of this as a maturity model, or as building blocks and verbs that you can apply depending on where you are in that maturity model? >>I would think of it as building blocks and the methodology, which is, you've got to decide: do we focus on our data collection and doing that right? Is that our weakness, or is it data organization, or is it the sexy stuff, the AI, the data science stuff? This is just a tool to help organizations organize themselves on what's important. I ask every company I visit, do you have a data strategy? You wouldn't believe the looks you get when you ask that question. You get either, well, she's got one, he's got one, so we've got seven. Or you get, no, we've never had one. Or, hey, we just hired a CDO, so we hope to have one. But we use the AI Ladder just as a tool to encourage companies to think about their data strategy. >>I want to follow up on that data strategy, because you see a lot of tactical data strategies: well, we use data for this initiative or that initiative, maybe in sales or marketing, or maybe in R&D. Increasingly, are organizations developing, and should they develop, a holistic data strategy, or should they be trying to just get kind of quick wins? What are you seeing in the marketplace? >>It depends on where you are in your maturity cycle. 
I do think it behooves every company to say, we understand where we are and we understand where we want to go. That could be the high-level data strategy: what are our focus and priorities gonna be? Once you understand focus and priorities, the best way to get things into production is through a bunch of small experiments, to your point. So I don't think it's an either-or, but I think it's really valuable to have an overarching data strategy, and I recommend companies think about a hub-and-spokes model for this. Have a centralized chief data officer, but your business units also need a chief data officer. So strategy in one place, execution in another. There's a best practice to going about this. >>Next, you asked the question, what is AI? You get that question a lot, and you said it's about predicting, automating, and optimizing. Can we unpack that a little bit? What's behind those three items? >>People overreact to hype on topics like AI. And they think, well, I'm not ready for robots, or I'm not ready for self-driving vehicles. Those may or may not happen. Don't know. But with AI, let's think more basic. It's about, can we make better predictions for the business? Every company wants to see the future. They want the proverbial crystal ball. AI helps you make better predictions if you have the data to do that. It helps you automate tasks, automate the things that you don't want to do. There's a lot of work that has to happen every day that nobody really wants to do; you use software to automate that. And there's optimization: how do you optimize processes to drive greater productivity? So this is not black magic. This is not some far-off thing. We're talking about basics: better predictions, better automation, better optimization. >>Now, interesting, you used the term black magic, because a lot of AI is black box, and IBM has always made a point of, we're trying to make AI transparent.
You talk a lot about taking the bias out, or at least understanding when bias makes sense and when it doesn't make sense. Talk about the black box problem and how you're addressing that. >>It starts with one simple idea: AI's not magic. I say that over and over again. This is just computer science. Then you have to look at what the components are inside the proverbial black box. With Watson, we have a few things. We've got tools for clients that want to build their own AI. Think of it as a toolbox you can choose from: do you want a hammer, do you want a screwdriver, do you want a nail? You go build your own AI using Watson. We also have applications, so it's basically an end-user application that puts AI into practice: things like Watson Assistant to create a virtual agent for customer service, or Watson Discovery, or things like OpenPages with Watson for governance, risk, and compliance. So AI for Watson is about tools if you want to build your own, applications if you want to consume an application, but we've also got embedded AI capability, so you can pick up Watson and put it inside of any software product in the world. >>You also mentioned that Watson was built with a lot of open source components, which a lot of people might not know. What's behind Watson? >>85% of the work that happens in Watson today is open source. Most people don't know that. It's Python, it's R, it's deploying into TensorFlow. What we've done, where we've focused our efforts, is how do you make AI easier to use? So we've introduced AutoAI into Watson Studio. So if you're building models in Python, you can use AutoAI to automate things like feature engineering, algorithm selection, the kind of thing that's hard for a lot of data scientists. So we're not trying to create our own language. We're using open source, but then we make that better so that a data scientist can do their job better. >>So again, come back to AI adoption.
We talked about three things: quality, trust, and skills. We talked about the data quality piece, we talked about the black box, you know, challenge. Now let's talk about skills. You mentioned there's a 250,000-person gap in data science skills. How is IBM approaching, how are customers and IBM approaching, closing that gap? >>So think about this in basic economic terms. We have a supply-demand mismatch: massive demand for data scientists, not enough supply. The way that we address that is twofold. One is we've created a team called Data Science Elite. They've done a lot of work for the clients that were on stage with me. They help a client get to their first big win with AI. It's that simple. We go in for 4 to 6 weeks. It's an elite team. It's not a long project; we're gonna get you to your first success. The second piece is, the other way to solve a demand and supply mismatch is through automation. So I talked about AutoAI. But we also do things like using AI for building data catalogs, metadata creation, data matching, so making that data prep process automated through AI can also help that supply-demand mismatch. The way that you solve this is we put skills in the field to help clients, and we do a lot of automation in software. That's how we can help clients navigate this. >>So the Data Science Elite team, I love that concept, because we first picked up on it a couple of years ago, at least. It's one of the best freebies in the business. But of course you're doing it with the customers that you want to have deeper relationships with, and I'm sure it leads to follow-on business. What are some of the things that you're most proud of from the Data Science Elite team that you might be able to share with us? >>The client stories are amazing. I talked in the keynote about origin stories. Royal Bank of Scotland, automating 40% of their customer service. Now customer sat's going up 20% because they put their customer service reps on those hardest problems.
That's Data Science Elite helping them get to a first success. Now they scale it out. Wunderman Thompson, on stage, part of WPP, a big advertising agency. They're using AI to comb through customer records; they're using AutoAI. That's the Data Science Elite team that went in for literally four weeks and gave them the confidence that they could then do this on their own once we left. We've got countless examples where this team has gone in for very short periods of time. And clients don't talk about this because they have to; they talk about it because they're like, we can't believe what this team did. So we're really excited by it. >>The interesting thing about the RBS example to me, Rob, was that you basically applied AI to remove a lot of these mundane tasks that weren't really driving value for the organization, and RBS was able to shift the skill sets to more strategic areas. We always talk about that, but I love the example. Can you talk a little bit more about where that shift was? What did they go from, what did they apply it to, and how did it impact their business? I think it was a 20% improvement in NPS. >>They realized the inquiries they had coming in were in two categories. There were ones that were really easy, and there were ones that were really hard, and they were spreading those equally among their employees. So what you get is a lot of unhappy customers. And then once they said, we can automate all the easy stuff and put all of our people on the hardest things, customer sat shot through the roof. Now, what does a virtual agent do? Let's decompose that a bit. We have a thing called intent classification as part of Watson Assistant, which is a model that understands customer intent, and it's trained based on the data from Royal Bank of Scotland. So this model after 30 days is not very good. After 90 days, it's really good.
After 180 days, it's excellent, because at the core of this is that we understand the intent of customers engaging with them. We use natural language processing. It really becomes a virtual agent that's done all in software, and you can only do that with things like AI. >>And what is the role of the human element in that? How does it interact with that virtual agent? Is it sort of an attended agent, or is it unattended? What is that like? >>So it's two pieces. For the easiest stuff, no humans needed; we just go do that in software. For the harder stuff, we've now given the RBS customer service agents superpowers, because they've got Watson Assistant at their fingertips. The hardest thing for a customer service agent is finding the right data to solve a problem. Watson Discovery is embedded in Watson Assistant, so they can basically comb through all the data in the bank to answer a question. So we're giving their employees superpowers. So on one hand, it's augmenting the humans. In another case, we're just automating the stuff the humans don't want to do in the first place. >>I'm gonna shift gears a little bit. Let's talk about, uh, Red Hat and OpenShift. Obviously a huge acquisition last year, $34 billion, the next chapter, kind of, in IBM strategy. A couple of things you're doing with OpenShift: Watson is now available on OpenShift, so that means you're bringing Watson to the data. I want to talk about that, and then Cloud Pak for Data is also on OpenShift. So what has that Red Hat acquisition done for you? You obviously know a lot about M&A, but now you're in the position where you've got to take advantage of that, and you are taking advantage of it. So give us an update on what you're doing there. >>So look at the cloud market for a moment. You've got around $600 billion of opportunity in traditional IT, on premises. You've got another 600 billion that's public clouds, dedicated clouds. And you've got about 400 billion that's private cloud.
So the cloud market is fragmented between public, private, and traditional IT. The opportunity we saw was, if we can help clients integrate across all of those clouds, that's a great opportunity for us. What Red Hat OpenShift is, it's a liberator. It says: write your application once, deploy it anywhere, because you build it on Red Hat OpenShift. Now we've brought Cloud Pak for Data, our data platform, onto Red Hat OpenShift, certified on that. Watson now runs on Red Hat OpenShift. What that means is you can have the best data platform, the best AI, and you can run it on Google, AWS, Azure, your own private cloud. You get the best AI with Watson from IBM and run it in any of those places. >>So the reason why that's so powerful is because you're able to bring those capabilities to the data without having to move the data around. Jennifer showed an example, or no, maybe it was tail >>whenever he was showing Burt analyzing the data. >>And so the beauty of that is I don't have to move any data. Talk about the importance of not having to move that data. And I want to understand what the client prerequisite is to really take advantage of that. >>This is one of the greatest inventions out of IBM Research in the last 10 years that hasn't gotten a lot of attention, which is data virtualization. Data federation, traditional federation, has been around forever. The issue is it doesn't perform. Our data virtualization performs 500% faster than anything else in the market. So what Jennifer showed in that demo was, I'm training a model, and I'm gonna virtualize a data set from Redshift on AWS and an on-premises repository, a MySQL database. We don't have to move the data. We just virtualize those data sets into Cloud Pak for Data, and then we can train the model in one place. This is actually breaking down data silos that exist in every organization. And it's really unique.
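The idea described above, querying data sets where they live instead of copying them into one place first, can be illustrated with a toy example. This is only a minimal sketch under stated assumptions: two in-memory SQLite databases stand in for the cloud warehouse and the on-premises database mentioned in the demo, and the `VirtualTable` class, the table names, and the sample rows are all hypothetical. It does not show how Cloud Pak for Data actually implements data virtualization.

```python
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create an independent in-memory database that plays a remote source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


# Hypothetical stand-in for a cloud data warehouse.
warehouse = make_source(
    "CREATE TABLE visits (patient_id INTEGER, cost REAL)",
    "INSERT INTO visits VALUES (?, ?)",
    [(1, 120.0), (2, 80.0), (1, 40.0)],
)

# Hypothetical stand-in for an on-premises relational database.
onprem = make_source(
    "CREATE TABLE patients (patient_id INTEGER, region TEXT)",
    "INSERT INTO patients VALUES (?, ?)",
    [(1, "north"), (2, "south")],
)


class VirtualTable:
    """Holds a connection and a query, not a copy of the rows."""

    def __init__(self, conn, sql):
        self.conn = conn
        self.sql = sql

    def rows(self):
        # Rows are read from the source only when a consumer iterates.
        yield from self.conn.execute(self.sql)


def cost_by_region(patients, visits):
    """Join the two virtual tables in one place and total cost per region."""
    region_of = dict(patients.rows())
    totals = {}
    for patient_id, cost in visits.rows():
        region = region_of[patient_id]
        totals[region] = totals.get(region, 0.0) + cost
    return totals


result = cost_by_region(
    VirtualTable(onprem, "SELECT patient_id, region FROM patients"),
    VirtualTable(warehouse, "SELECT patient_id, cost FROM visits"),
)
print(result)
```

Nothing is copied ahead of time; each source is read only when `cost_by_region` iterates over it. A production system would also push filters and joins down to the sources, which this sketch does not attempt.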
>>It was a very cool demo, because what she did is she was pulling data from different data stores, doing joins. It was a health care application, really trying to understand where the bias was, peeling the onion, right? You know, sometimes bias is okay; you've just got to know whether or not it's actionable. And so that was very cool, without having to move any of the data. What is the prerequisite for clients? What do they have to do to take advantage of this? >>Start using Cloud Pak for Data. We've got something on the web called Cloud Pak Experiences. Anybody can go try this in less than two minutes. I just say go try it. Because Cloud Pak for Data will just insert right onto any public cloud you're running, or in your private cloud environment. You just point to the sources and it will instantly begin to create what we call schema folding: a skinny version of the schema from your source, right in Cloud Pak for Data. This is like instant access to your data. >>It sounds like magic. OK, last question. What are the big takeaways you want people to leave this event with? >>We are trying to inspire clients to give AI a shot. Adoption is 4 to 10% for what is the largest economic opportunity we will ever see in our lives. That's not an acceptable rate of adoption. So we're encouraging everybody: go try things. Don't do one AI experiment. Do a hundred AI experiments in the next year. If you do 100, 50 of them probably won't work. This is where you have to change the cultural idea that comes into it: be prepared that half of them aren't gonna work. But then for the 50 that do work, you double down. Then you triple down. Everybody will be successful with AI if they have this iterative mindset. >>And with cloud it's very inexpensive to actually do those experiments. Rob Thomas, thanks so much for coming on theCUBE. Great to see you. >>Great to see you. >>All right, keep it right there, everybody. We'll be back with our next guest.
Right after this short break. We're here in Miami at the IBM Data and AI Forum. We'll be right back.
Jeff Moncrief, Cisco | Cisco Live US 2019
>> Announcer: Live from San Diego, California it's The Cube! Covering Cisco Live US 2019. Brought to you by Cisco and its ecosystem partners. >> Welcome back to The Cube's coverage of Cisco Live Day 2 from sunny San Diego. I'm Lisa Martin joined by Dave Vellante. Dave and I have an alumni, a Cube alumni back with us, Jeff Moncrief, consulting systems engineer from Cisco. Jeff, welcome back! >> Thank you very much, it's great to be back! >> So, we're in the DevNet Zone, loads of buzz going on behind us. This community is nearly 600,000 strong. We want to talk with you about Stealthwatch. You did a very interesting talk yesterday. You said it had a couple hundred folks in there. War stories from real networks. War stories ... strong descriptor. Talk to us about what that means, what some of those war stories are, and how Stealthwatch can help customers learn from that and eradicate those. >> So it's called Saved by Stealthwatch. It was a really good session. This is the third Cisco Live that I've presented this session at. And it's really just stories from actual customer networks where I've actually deployed Stealthwatch. I've been selling Stealthwatch for about five years now. And I've compiled quite a list of stories, right? And it really ... if you think about advanced threats and insider threats and those kinds of exciting things, the presentation was really about getting back to fundamentals. Getting back to the fact that in all these years that I've been working with customers and using Stealthwatch, a lot of the scary things that I have found have nothing to do with that. With the advanced type threat stuff. It really has to do with the fact that they're forgetting the basics. Their firewalls are wide open, their networks are flat. Their segmentation boundaries aren't being adhered to. So it's allowed us to come in and expose a lot of scary things that were going on and they were just completely oblivious to it. >> Why are those gaps there?
Is it because of a change management issue? Technology's moving so quickly? Lack of automation? >> Yeah, I think there's a couple reasons that I've seen. It's a recurring theme really. Limited resources ... number one. Number two, limited budgets, so your priorities have to shift. But I think a big one that I've seen a lot is turnover and attrition. A lot of times we'll go in with Stealthwatch and we'll kick off an evaluation or whatnot and the customer will say, I just don't know what's there. I don't know if I have 100 machines that need visibility or a thousand. And I'm a Stealthwatch Cloud consulting systems engineer, so the cloud world is where I spend a lot of my time now, and what I'm seeing as it relates to the cloud realm is that's exponentially worse now. Because now you've got things like DevOps and shadow IT that are all playing in the customer's public cloud environment, deploying workloads, deploying instances and building things that the security team has no awareness of. So there's a lot of things that are living and breathing on the network that they just don't know about. >> And so the tribal knowledge leaves the building, how do you guys help solve that problem? >> So we come in ... and you know the last time that you and I spoke, you used the term cockroaches, I think, which I loved. I actually have used that a lot since then, so thank you for that. >> Dave: Yeah, you're welcome. >> No, but, you know ... we come in and we actually, we turn the customer's network infrastructure ... whether it's on-prem or in the public cloud ... into a giant security sensor grid. And we leverage something called NetFlow, which you've probably heard of. And it's essentially allowing us to account for every conversation throughout the entire infrastructure, whether or not it's on-prem or in the public cloud or maybe even in a private cloud. We've got you covered in that area. And it allows us to expose every one of those living, breathing things.
And then we can just query the system. So think of us like a giant network DVR on steroids. We see everything, you can't hide from us, because we're using the network to look at everything. And then we can just set little trip wires up. And that's kind of what I go into in my presentation also, is how you can set these trip wires ahead of time to find things that are going on that you just didn't know about and frankly, they're probably going to scare ya. >> One of the stories that you shared in your talk yesterday. You talk about people really forgetting the basics. A university that had a vending machine breach. You just think, a vending machine in a cafeteria? >> Jeff: That's right. >> Really? Tell us about that. What kind of data was exposed from a vending machine? >> So that's one of my favorite stories to tell. We had gone in and we'd installed Stealthwatch at a small university in the US. And they had a very small team. Okay, you're going to see that recurring theme. Limited staff. And they really just had a firewall. Okay, that was what they were doing for security. So we came in, we enabled NetFlow, we kind of let Stealthwatch do its thing for a couple of days, and I just queried the system. Okay, it's not rocket science, it's not AI a lot of times, it's really the fundamentals. And I just said, tell me anything talking on remote desktop protocols inside the network out to the internet. And lo and behold, there was one IP address that had communication from it to every bad country you can imagine ... actively. And I said to them ... I said, what is this IP address? What's it doing? And that was in the conference room in the university with their staff and the guy looked it up in the asset inventory system, and he looked at me and he goes, that's a vending machine. And I said, a vending machine? And he said, yeah. And then I was like, okay, well that's a first, I've never heard of that before. And he goes wait a minute, it's a dirty tray return machine.
You ever heard of one of those? >> Lisa: No. >> I hadn't either. >> Lisa: Explain. >> So for loss prevention, I guess universities and other public institutions, they will buy these unique vending machines that are designed for loss prevention. So that the college students don't go around and you know, steal or throw away the trays from the cafeteria. You have to return the tray to get a coin. There's a common supermarket chain that does the same thing with their shopping carts. And it's for loss prevention. So I said, okay, that's pretty strange. Even stranger than just a vending machine. And I said, well did you realize that it was talking to a remote desktop all over the world? And he said no. And I said so, can you tell me what it has access to? So he looked it up in the firewall manager right there and he said, it has access to the entire network. Flat network, no segmentation. No telling how long this had been going on, and we exposed it. >> And Stealthwatch exposes those gaps with just kind of old school knock on the door. >> Yeah, it really is. We're talking about fundamental network telemetry that we're gathering off the route/switch infrastructure itself. You know, obviously, we're at Cisco Live, we work really well with Cisco gear. Cisco actually invented NetFlow about 20 years ago. And we leverage that to give a visibility footprint that allows us to expose things like the vending machine. I've found hospital x-ray machines that were scanning all the US military, for instance. I find things in the cloud that are just completely wide open from a security ACL standpoint. So we've got that fundamental level of visibility with Stealthwatch, and then we kick in some really cool machine learning and statistical analytics, and that allows us to look for anomalies that would be indicators of compromise.
So we're taking that visibility footprint and we're taking it to that next level, looking for threats that might be in the customer's environment. >> So before we get to the machine intelligence, I presume that cloud and containers only make this problem worse. What are you seeing in the field? How are you dealing with that? >> So we're in a landscape today where we've got a lot of customers that might be cloud averse. But we've also got a lot of customers that are on the other side of that spectrum and they're very cloud progressive. And a lot of them are doing things like serverless, microservices, containers and, when you think of containers you think of container orchestration ... Kubernetes. So Stealthwatch Cloud is actually in that realm right now today, able to protect and illuminate those environments. That's really the Wild West right now, trying to protect those very abstract serverless and containerized environments, but yeah, we come in, we are able to deploy inside Kubernetes clusters or AWS or Azure or GCP, and tell the Stealthwatch story in those environments, find segmentation violations, find firewall holes just like we would on premise, and then look for anomalies that would be interesting. >> So the security paradigm for those three you mentioned, those three cloud vendors, and your on-prem, and maybe even some of your partners, there's a lot of variability there. How should customers deal with maintaining the edicts of the organization and sort of busting down those silos? >> Yeah, so you think about like Stealthwatch Cloud, which is the product that I'm a CSE for, we're really focusing on automation, high efficacy and accuracy. All right, we're not going to be triggering hundreds or thousands of alerts whenever you plug us in. That's going to further bog down a limited team. They've got limited time and they have to change their priorities constantly.
This solution is designed to work immediately out of the box and quickly deploy within a matter of hours. It's all SaaS-based, so actually it lives in the cloud. And it really takes that burden off of the organization of having to go and set a bunch of policies and trip wires and alerts. It does it automatically. It's going to let you know when you need to take a look at it so that you can focus on your other priorities. >> So I'm curious where your conversations are within an organization, whether it's a hospital or a university, when what you're finding is in this multi-cloud world that we live in, where there's attrition and all of these other factors contributing to organizations that don't know what they have. With multi-cloud and edge comes this very amorphous perimeter, right? Where are those conversations? Because if data is the lifeblood of an organization, if it's not secure and protected, if it's exposed, there's a waterfall of problems that could come with that. So is this being elevated into the C-suite of an organization? How do you start those conversations? >> So it's not just the C-suite and the executive type structure that we're having to talk to now. Traditionally we would go in with the Stealthwatch opportunity and talk to the teams in the organization; it's going to be the InfoSec team, right? As we move to the cloud though, we're talking about a whole bunch of different teams. You've got the InfoSec team, you've got the network operations team, now they're deploying those workloads. The big one though that we've really got to think about and what we've really got to educate our customers on is the DevOps teams. Because the DevOps teams, they're really the ones that are deploying those cloud workloads now. You've got to think about ... they've got API access, they've got direct console login access. So you've got multiple different entry points now into all these different heterogeneous environments.
And a lot of times, we'll go in and we'll turn on Stealthwatch and we show the organization, yeah, you knew that DevOps was in the VPCs deploying things, but you didn't know the extent that they were deploying them. >> Lights up like a Christmas tree? >> Yeah, lights up like a Christmas tree, and like a conversation I had last week with a customer. I asked them, I said, all right so you're in AWS, are we talking do you have 50 instances or do you have 500? He said, I have no idea. Because I'm not the one deploying these instances. I'm just lucky enough to get permission to have access to them to let you plug your stuff in to show me what's going on in that environment. But yet they're in charge of securing that data. So it's quite frightening. >> So you've got discovery, you've got ways to expose the gaps, and then you're obviously advising on remediation activity. And you're also bringing in machine intelligence. So what's the endgame there? Is it automation? Is it systems of agency where the machine is actually taking action? Can you explain that? >> So when the statistical analysis comes in and the anomaly detection comes in, it's really that network DVR, so we've got the data, now let's do some really cool things with it. And that's where, actually, for every single one of these entities, and I do stress entities because the days of operating systems and IP addresses are going away. Face it, it's happening. Things are becoming more and more abstract. You know, API keys, user accounts, Lambdas and runtime compute, we have to think about those. So what we do for all these different entities is we build a model for each one of these, and that model, that's where all the math and the AI comes in. We're going to learn Known Good for it. Who do they talk to? How much data's sent or received? And then we start looking for activity in that infrastructure as it relates to that entity that's outside of that Known Good model.
So that would be the anomaly detection and you know, our anomaly detection, it really can be attributed to two different major categories. Number one is going to be, we're looking for things that cross the cyber kill chain. So those different IOCs as a threat actually manifests. That's what the anomaly detection's doing. And then we're also looking for just straight compliance and configuration violations in the customer's cloud infrastructure, for instance, that would just be a flat out security risk today, day one, forget baselining and anomaly detection, it should just not be configured that way. >> Let's see, roughly 25% of Cisco's revenue is in services, what role does the customer service team play in all this? How do you interact ... how do the product guys and the service guys work together?
first I got to ask you, you're a security guy, right? Have you always been a security guy? >> Yeah, security for about 20 years now, dating back to Internet Security Systems. >> The question I often ask security guys is who's your favorite superhero? >> My favorite superhero ... I'd say Batman. >> Dave: Batman? >> Yeah. >> I like Batman. (chuckles) The reason I ask is that somebody told me one time that true security guys, they love superheroes because they grew up kind of wanting to save the world and protect the innocent. So ... just had to ask. >> Yeah there you go ... Batman. >> I'm sensing a tattoo coming. Last question for you Jeff is in terms of time to business impact, the vending machine story is just so polarizing because it's such a shocking massive exposure point, did they ever discover how long it had been open and in terms of being able to remedy that, how quickly can Stealthwatch come in, identify these- >> So very quick operation wise. So like the vending machine story, that's something that if you turn on Flow, and you send it to Stealthwatch right now, we can pick that up in 10 minutes. That quick to visibility and value. Now how long has it been going on? A lot of times they can't answer that question because they've never had anything to illuminate that to begin with. But moving forward, now they've got a forensic incident response audit trail capability with Stealthwatch which is actually a pretty common use case. Especially if you think about things like PCI that have got audit requirements and whatnot. A lot of organizations if they're not using a Flow based security analytics tool, they can't always meet those audit and forensic requirements. So at least from the point of installing Stealthwatch they'll be good to go from that point forward. >> So if they can find an anomaly that needs to be rectified in 10 minutes, what's the next step for them to actually completely close that gap?
>> So like with Cisco Identity Services Engine, we've got a great integration there where we can actually take action, shut off that machine instantly. We can shut off a switch port. We can isolate that machine to an isolated sandboxed VLAN, get it off the network, and then in the cloud, we can do things like automated remediation. We can use things like Amazon Lambda to actually shut off an instance that might be compromised. We can actually use Lambdas to insert firewall rules. So if we find a hole, we can plug it. Very easily, automated- >> Ship a function to it and plug a hole. >> Batman slash detective. I think you need a tattoo and a badge. >> I can work on that, I like it. >> Jeff thank you so much for joining Dave and me on The Cube this afternoon. >> My pleasure. >> Really interesting stuff, we appreciate your time. >> Absolutely. >> For Dave Vellante, I'm Lisa Martin. You're watching The Cube's second day of coverage of Cisco Live from San Diego. Thanks for watching. (upbeat music)
Phill Ring, TT Games | E3 2018
>> [Announcer] Live from Los Angeles, it's The Cube, covering E3 2018, brought to you by SiliconANGLE Media. >> Hey welcome back everybody, Jeff Frick here with The Cube, we're at E3 at the LA Convention Center, 68,000 people milling around, but we've got kind of the backdoor access here to the Warner Brothers Games booth, so we're really excited to be back in here, the inner sanctum, talking about some of the new games coming out and we got Phill Ring, he's the Executive Producer of TT Games, Phill, great to see you. >> No, thank you very much for having me. >> Absolutely, so you're in charge of this wonderful game, that we've got on the wall behind us, the Lego DC Super-Villains? >> Sure, yeah, I'm lucky enough to be one of the incredibly talented team, 'cause we're really excited about this game, Lego DC Super-Villains is something we've actually been playing around with as an idea for a while, you get to be the villains, you get to be the bad guys, so we're really excited we actually finally get to show and talk about it. >> Right, after doing what, three games of Batman, so now you get to flip over, you get to be the Riddler or the Joker? >> Yeah, this is it, so with the kind of DC universe, we did the Lego Batman titles, but DC has amazing villains, you've got Joker, you've got Harley, you've got Lex and we were like you know what? Let's play as those, let's do something really cool, let's do a story where we're focusing on the villains, because we've never done it before, we think it'll be quite fun and hopefully people are gonna really enjoy it. >> Great, so it's coming out, so give the particulars for everybody at home, who's waiting to get their order in.
>> Sure, so it's available October 16th, it's actually available for pre-order now and depending where you're pre-ordering it from, there's actually a really cool Lex Luthor power-suit mini figure you can get, so it features in the game and then you can actually have that model sat on your desk, so I'm really excited, I'm gonna run off and pre-order it as soon as I can, 'cause I want that figure. >> Well, that's cool, but the other feature you talked about before we turned on the cameras, you can actually make yourself into a Lego figure, right? >> This is really cool, yeah. So when we were looking at this game, we were sat there thinking, okay, villains are really, really cool, but I wonder what it would be like if I could put myself into this world, what happens if I'm playing with Joker and with Lex, so we decided to put the Character Customizer in, so right at the very beginning of the game, Commissioner Gordon's heading to find out some information about this new character and then you customize that character and that's your character, so you make whoever you want, as crazy as you want, there's loads of kind of depth to the Customizer, you can change decors, colors, torsos, facial features, hair pieces and then that character appears throughout the story, so they walk out in a cut scene and that's really cool and then that character unlocks new powers and abilities and becomes stronger as you play through the game. >> Right, so I'm just curious on kind of the evolution of the game, again you did some earlier versions, that weren't the same game, but you know, this one is kind of built onto that, what did you discover, in terms of how people play the game? One of my favorite topics is degree of difficulty, >> Sure. 
>> How do you figure out the degree of difficulty, to make it difficult enough that I'm excited to attack a challenge and conquer it, but not so difficult, where I'm just banging my head against a wall and throw my controller out the window and say, I just can't get through this thing. >> So that's something that the team do really, really well. We always look at it and go, okay, we know that these games are for a younger audience or at least to start with, so we want something that an eight-year old kid, who may have never played a Lego game before can come along, have loads of fun with this world, so we're making sure that we're kind of educating the player, we have a new tutorial system in this game, where we can show little videos to go, so you've just unlocked this cool power, this is how it works. So we can kind of educate people, but then we know that we're gonna have like either fans of Lego games, but also like DC Comics fans, like we have people kind of telling us, "Oh, I play this with my wife and things," so they want a bit more of a challenge and that's when we get to go into like the Free Play world, so once you're playing the story, you can then go explore all these locations and you find the slightly trickier puzzles, where it's like, oh, I need to figure out what I need to do here, what character do I need, what ability do I need to use? So having that kind of accessibility, so it's really accessible to get into the game, but then there's loads of depth to it, >> Right. >> so that's really cool for us and it's one of these things that we're really kind of happy with, 'cause we also find that the eight-year old kids run around doing all the hard puzzles and we struggle with them, so sometimes it swings, so. >> I was gonna say, so what are some of the things you measure to see if you're hitting that objective? Is it time in a level? Is it time being in there?
I mean, what are some of the factors, that you guys are actually looking and measuring to see if you maybe have to make an adjustment, based on the actual behavior? >> So we love getting people to play the game, so we bring kids in and we'll sit there, then we see them playing it and if they're getting stuck, if there's something that's not really kind of standing out to them, if they're spending too much time in an area, not knowing what they're doing, we'll go okay, right, we need to change that, we need to signpost that differently, we need to turn round and say, how can we make it clearer to the player, so they know what they do, but also keep the rewards, so that they feel like they've achieved, that they feel like they've figured it out. >> Right. >> So that's one of the things, like if someone's getting stuck on a level and they're there for like three, four, five minutes and they don't know what to do, we don't want that experience for people, so we'll sit there and go, okay, how can we make that clearer? Is there something we can do? Is there something we can maybe flash a piece of Lego or something and sit there and go, these Lego bricks, maybe you wanna smash those up and that's also really cool, 'cause villains get to smash things up. >> Right, right. >> and go, okay, if I break that, I can make that clearer, then the player will then know what to do and they'll be able to progress. >> So it's really signaling is really the big kind of, way to help them get over that, versus completely changing that piece of the play? 
>> Yeah, we really do think that we can hopefully change the puzzles to be able to do that, we have had instances though, where we sit there and go, actually, no one gets this, this is too complicated, back to the drawing board and so we'll rip a puzzle out and sit there and go, actually, how do we change this, this is overly complicated, it's too confusing, let's do something different, let's do something that's really cool and it also means that we get to go, let's have a second stab at it and sometimes we get really cool results from it and some of the puzzles are even better than what we had previously, so. >> And the other piece I think is really interesting is clearly these are very well-known brands, Lego's a very well-known brand, DC is a very well-known brand, so you've got a narrative, you've got a story, you have kind of the look and feel, at the same time you want players to be able to do all kinds of things and you don't necessarily know where they're gonna go, how they're gonna interact, so how do you kind of balance the play with the narrative? >> So one of the great things about this game is from a story point of view and a narrative, we actually, it's an original creation and we worked really closely with DC and that allows us to kind of really help with the kind of pacing of the adventure, so as you're playing through and you start off on the first level, when you're breaking out of a prison, you then get dropped into the Open World Hub and we get to signpost people and say, hey, you can go over here to continue the story, but if you wanna go off and explore, you do that, go for it, go see what you can find and then we kind of have something that allows players to keep coming back, because these worlds, we know that there are massive fans of them, so if you turn round to someone and say, you can go to Gotham City, they'll know where they wanna go, like if I'm a Batman fan, I'm like, I'm going to the Iceberg Lounge, I wanna see what it is. 
So we give players that freedom to really explore it, but then always kind of let them be able to kind of return to the story path and that's another thing that we think is really important, because when people are playing these games, we want them to be able to make the choices of how they play the game. >> Right, great, that's interesting, so if there is a place, that they want to go to, 'cause they love Gotham City, they're big fans of Batman and it's not there, you guys hear a lot of feedback? I mean, do people come back, so that you've got to pump that into the next iteration of the game and the next update? >> Yeah, we do, we listen to what fans do and we've been doing that for years, so ever since we've been doing these DC titles, we sit there and go, what do people wanna do, what do people wanna see? One of the things that I love is that we have massive DC fans in the office, so a lot of the stuff, we'll sit there and we'll see like requests coming in on social media going, I really hope this character's there and we get to look at our character list and go, yep, he's there, who put it in? And then we go chat with them and they go, of course I'm gonna include that character, I love them and some of them are really obscure. >> Right. >> But yeah, we love listening to feedback and seeing what people expect and what they want to see from this world. >> It's really interesting balance, 'cause you get all the leverage from those known brands, those known characters, those known stories, >> Sure. >> but at the same time, as you said, you've got a lot of people, that are really into it and they're gonna hold you to a standard, >> Yeah. >> to make sure, that you're representing everything as they think it really should be. 
>> Yeah, very much so and this is the other thing about having fans in the office is we keep ourselves to that high standard as well, we sit there and go, it needs to be right, like I am a fan of Gorilla Grodd, he needs to do everything I want him to do, because I know this character inside and out and so when we have people, who are that passionate about the game on staff, we just wanna be able to share that with the world and so when we hear feedback, that people go, "Oh, we love it, it's exactly what I wanted," it's like we love that, it's incredible to know that we kind of feel like we've got it right, we've got these characters right. >> It's so cool though, just the integration of the Legos with all these other brands and just the, and it's not even the Lego blocks, the Lego people and how well it's been able to be integrated with all these other brands and the integration just seems to work so, so, so well. >> Yeah, no, I've been lucky enough to be with TT for over 11 years now, so being able to work on these games and see how we can do a Lego version of these stories and these worlds and these universes, I'm so privileged to be able to do that and the Lego version is different, so Lego DC Super-Villains is a world of DC, that you won't see anywhere else, because it's our take on it, >> Right. >> it's the developer and working with DC, being able to go, let's make something cool and working really closely with Lego and going, what sets are you making? Let's put those in, that's really cool, so. >> It's awesome, alright, well Phill, thanks for taking a few minutes, congratulations on the game and good luck on October 16th. >> Great, thank you very much, thank you. >> Alright, he's Phill, I'm Jeff, you're watching The Cube from E3 and LA Convention center. Thanks for watching. (dynamic music)
Dr. Nic Williams, Stark & Wayne | Cloud Foundry Summit 2018
(electronic music) >> Announcer: From Boston, Massachusetts, it's theCUBE. Covering Cloud Foundry Summit 2018. Brought to you by the Cloud Foundry Foundation. >> I'm Stu Miniman, and this is theCUBE's coverage of Cloud Foundry Summit 2018, here in beautiful Boston, Massachusetts. Happy to welcome to the program first-time guest, Dr. Nic Williams, CEO of Stark and Wayne. Dr. Nic, thanks for joining me. >> Thank you very much. I think you must've come to the conference from a different direction than I came. >> I'm a local, and I'm trying to get more people to come to the Boston area. We've been doing theCUBE now for, coming up on our ninth year of doing it, and it's only the third time I've done something in this convention center, so please, more tech shows to this area, Boston, the Hynes Convention Center, and things like that. >> There's plenty of tech people. I was at the Nero Cafe, everyone seemed like they were a tech person. >> Oh no, the Seaport region here is exploding. I've done two interviews today with companies here in Boston or Cambridge. There's a great tech scene. For some reason, you and I were joking, it's like, do we really need another conference in Vegas? I mean really. >> Dr. Nic: Right, no, I like the regional. >> But yeah, the weather here is unseasonably cold. It was snowing and sleeting this morning, which is not the Spring weather. >> It is April, it is mid-April, and it's almost snowing outside. >> Alright, so Dr. Nic, first of all, you get props for the T-shirt. You've got Iron Man and Doctor Doom, and we're saying that there is a connection between the superheroes and Stark and Wayne. >> Right, so Stark and Wayne is founded by two fictional superheroes. The best founders are the fictional ones, they don't go to meetings, they're too busy making, you know, films. >> Yes, but everybody knows that Tony Stark is Iron Man, but nobody's supposed to know that Bruce Wayne was Batman. >> Nic: Right, right.
>> But I've heard Stark and Wayne mentioned a number of times by customers here at the conference. So, for our audience that doesn't know, what does Stark and Wayne do, and how are you involved in the Cloud Foundry ecosystem? >> So Stark and Wayne, I first found BOSH, I founded Stark and Wayne. Earlier than that I discovered BOSH, six years ago, when it was first released, became like, I claimed to be the world's first evangelist for BOSH, and still probably the number one evangelist. And so Stark and Wayne came out of that. I was VMware Pivotal's go-to person for standing things up and then customers grew, and you know. Yeah, people want to know who to go to, and when it comes to running Cloud Foundry, that's us. >> Yeah well, there's always that discussion, right? We've got all these wonderful platforms and these things that go together, but a lot of times there's services and people that help to get those up. Pivotal, just had a great discussion with a Pivotal person, talking about the reason they bought Pivotal Labs originally was like, wow, when people got stuck, that's what Pivotal Labs helps with that whole application development, so you're doing similar things with BOSH? >> Correct. No it's, we have our mental model around what it is to run operations of a platform, where you're running complex software, but you have an end user who expects everything just to work. And they never want to talk to you, and you don't want to talk to them. So it's this new world of IT where they get what they want instantly, that's the platform and it has to keep working. >> Dr. Nic, is it an unreasonable thing for people to say that, yeah I want the things to work, and it shouldn't go down, and you know-- >> What is shadow IT? Shadow IT is the rebellion against corporate IT, so we want to bring back, well, we want to bring the wonders of public services to corporate environments. >> Okay, so-- >> That's the Cloud Foundry's story.
>> Yeah, so talk to me a little bit about your users. We've watched this ecosystem mature since the early days, you know, things are more mature, but what's working well? What are the challenges? What are some of the prime things that have people calling up your team? >> So our scope, our users, or our customers, are people, they're the GEs and the Fords of the world running either as a service or internally large Cloud Foundry installations. And whilst Cloud Foundry is getting better and better, the security model is better, the upgrades seem to be flawless, it does keep getting more complex. You know, you can't just add container to container networking and it not get more complicated, right? So, yeah, trying to keep up-to-date with not just the core, but even the community of projects going on is part of the novelty, but also it's trying to bring it to customers and be successful. >> Yeah, I go to a number of these shows that are open source and every time you come there, it's like, "Well, here's the main things we're talking about "but here's six other projects that come up." How's that impact some of what you were just talking about? But, maybe elaborate as to how you deal with the pace of change, and those big companies, how are they help integrate those into what they're doing, or do they, you know-- >> So my Twitter is different from your Twitter. So my Twitter is 10 years worth of collecting of people who talk about interesting things, putting in a URL, just referencing an idea they're having, so they tend to be the thought leaders. They might be wrong, or like, let's put Docker into production, like, it doesn't make it wrong, but you've got to be wary of people who are too early. And you just start to piece together a picture of what's being built, and you start to know which groups and which individuals are machines, and make great stuff, and you sort of track their work.
Like HashiCorp, Mitchell Hashimoto, I knew him before HashiCorp, and he is a monster, and so you tend to track their work. >> So your Twitter and my Twitter might be more alike than you think. >> Nic: No maybe, right. >> I interviewed Armon at the KubeCon show last year. My Twitter blowing up the show was a bunch of people arguing about whether Serverless was going to eradicate this whole ecosystem. >> Well, we can argue about that if you like, I guess. >> But love, one of the things coming into this show, was, you know, how does the whole Kubernetes discussion fit into Cloud Foundry? We've heard at this show, Microsoft, Google, many others, talking about, look, open source communities, they're going to work together. >> Well Windows is going to track things 'cause they think they need to sell them, right? But then Microsoft has Service Fabric, which they've owned and operated internally for 10 years, and so, I think some really interesting products may be built on top of Service Fabric, because of what it is. Whereas, you know, Kubernetes will run things, Service Fabric may build net new projects. And then Cloud Foundry's a different experience altogether, so some people, it's what problems they experienced, comes to the solution they find, and unless you've tried to run a platform for people, you might not think the solution's a platform. You might think it's Kubernetes, but-- >> Yeah, so one of the things we always look at when we talk about platforms, is what do they get stood up for? How many applications do you get to stand up there? What don't they work for? Maybe you could help give us a little bit of color as to what you see? >> I'm pretty good at jamming anything into Cloud Foundry, so I have a pretty small scope of what doesn't fit, but typically the idea of Cloud Foundry is the assumption the user is a developer who has 10 iterations a day. Alright, so they want to deploy, test, deploy, test, and then layer pipelines on top of that.
You also get, you're going to get the backend of long, stable apps, but the value is, for many people, is that deploy experience. And then, you know, but whilst, you're going to get those apps that live forever, we still get to replace the underlying core of it. So you still maintain a security model even for the things that are relatively unloved. And this is really valuable, like the nice, clean separation of the security, the package, CVEs, and the base OS, then the apps is part of the-- >> Yeah, absolutely, there's been an interesting kind of push and pull lately. We need to take some of those old applications, and we may need to lift and shift them. It doesn't mean that I can necessarily take advantage of all the cool stuff, and there are some things that I can't do with them when I get them on to that new platform. But absolutely, you need to worry about security, you know, data's like the center of everything. >> If you're lifting and shifting, there probably is no developer looking after it, so it's more of an operator function, and they can put it anywhere they like. They're looking after it now, whereas the Cloud Foundry experience is that developer-led experience that has an operations backend. If you're lifting and shifting, if it fits in Cloud Foundry, great, if it fits in Kubernetes, great. It's your responsibility. >> Yeah, what interaction do you have with your clients, with some of the kind of cultural and operational changes that they need to go through? So thinking specifically, you've got the developers doing things, you know, the operators, whether they're involved, whether that be devops or not, but I'm curious-- >> So the biggest change when it comes to helping people who are running platforms. And I know many people want to talk about the cloud transformation, but let's talk about the operations transformation, is to become a service-orientated group who are there to provide a service.
Yes you're internal, yes they all have the same email address that you do, but you're a service-orientated organization, and that is not technology, that is a mental model. And if you're not service-orientated, shadow IT occurs, because they can go to Amazon and get a support organization that will respond to them, and so you're competing with Amazon, and Google, and you need to be pretty good. >> Yeah, you mentioned that, you know, your typical client is kind of a large, maybe I'm putting words in your mouth, the Fortune 1000 type companies, does this sort of-- >> We haven't got Berkshire. We haven't got Berkshire, and so if we're going to go Fortune 5, you know, we'd like, I've read my Warren Buffett biography, I reckon the FA are here to meet him I reckon. >> Right, so one of the questions, is this only for the enterprise? Can it be used for smaller businesses, for newer businesses? >> What's interesting is people think about Cloud Foundry as like, "Oh you run it on your infrastructure." Like, I did a talk in 2014, 15, when Docker was starting to be frothy, was, before you think you want to build your own PaaS, ring me on the hotline. Like, argue with me about why you wouldn't just use Heroku, or Pivotal Web Services, or IBM Cloud, like a public PaaS. Please, I beg of you, before you go down any path of running on-prem anything, answer solidly the question of why you just wouldn't use a public service. And yeah, so it really starts at that point. It's like, use someone else's, and then if you have to run your own. So, who's really going to have all these rules? It's large organizations that have these, "Oh, no, no, we have to run our own." >> Well doctor, one of the things we've said for a while, is there's lots of things that enterprise suck at, that they need to realize that they shouldn't be doing.
So start at the most basic level, there's like five companies in the world that are good at building data centers, nobody else should build data centers, if you're using somebody else that can do that. So as you go up and up the stack, you want to get rid of the undifferentiated lifting, things like that, so-- >> I like to joke that every CIO, the moment they get that job, like that's their ticket to get to build their own data center. It's like, what else was the point of becoming a CIO? I want to build my own data center. >> No, not anymore, please-- >> Not anymore, but you know, plus they've been around a little longer than-- >> So, what is that line? What should companies be able to consume a platform, versus where do they add the value, and do you help customers kind of understand that that-- >> By the time they're talking to us, they're pretty far along having convinced themselves about what they're doing. And they have their rules. They have their isolation rules, their data-ownership rules, and they'll have their level of comfort. So they might be comfortable on Amazon, Google, Azure, or they might still not be comfortable with public cloud, and they want the vSphere, but they still have that notion of we're going to run this ourselves. And most of them it's not running one, because that idea of we need our own, propagates throughout the entire organization, and they'll start wanting their own Cloud Foundry-- >> Look, I find that when I talk to users, we, the vendors, and those that watch the industry, always try to come up with these multi-cloud hybrid cloud-type discussion. Users, have a cloud strategy, and it's usually often siloed just like everything else, and right, they're using-- >> Developers-- >> I have some data service, and it's running on Google-- >> Developers just want to have a nice life. >> Microsoft apps. >> They just want to get their work done. 
They want to feel like, "Alright this is a great job, like, I'm respected, I get interesting work, we get to ship it, it actually goes into production." I think if you haven't ever had a project you've worked on that didn't go into production, you haven't worked long enough. Many of us work on something for it not to be shipped. Get it into production as quick as possible and-- >> So, do you have your, you know, utopian ideal world though as to, this is the step-- >> Oh, absolutely-- >> And this is how it'll be simple. >> Tell developers what the business problems are. Get them as close to the business problems, and give them responsibility to solve them. Don't put them behind layers of product managers, and IT support-- >> But Dr. Nic, the developers, they don't have the budget-- >> Speak for utopian-- >> How do we sort through that, because, right, the developer says they want to do this, but they're not tied to the person that has the budget, or they're not working with the operators, I mean, how do we sort through that? >> How do we get to utopia? >> Stu: Yeah. >> Well, Facebook, Google, Microsoft, they all solved utopia, right? So, this is, think more like them, and perhaps the CEO of the company shouldn't come from sales, perhaps it should be an IT person. >> Well, yeah, what's the T-shirt for the show? It was like running at scale, when you reach a certain point of scale, you either need to solve some of these things, or you will break? >> Right, alright look, hire great sales organizations, but if you don't have empathy for what your company needs to look like in five years time, you're probably not going to allow your organization to become that. The power games, alright? If everyone assumes that the marketing department becomes the top of the organization, or the, you know, then the good people are going to leave to go to organizations where they might become CEO one day. >> Alright, Dr. Nic, want to give you the final word. 
For the people that haven't been able to come to the sessions, check out the environment, what are they missing at this show? What is exciting you the most in this ecosystem? >> Like any conference you go to, you come, the learning is all put online. Your show is put online, or every session is put online. You don't come just to learn. You get the energy. I live in Australia, I work from a coffee shop, my staff are all in America, and so to come and just to get the energy that you're doing the right thing, that you get surrounded by a group of people, and certainly no one walks away from a CF Summit feeling like they're in the wrong career. >> Excellent. Well, Dr. Nic, appreciate you helping us understand the infinity wars of cloud environments here. Stark and Wayne, thanks so much for joining us. I'm Stu Miniman, and you're watching theCUBE. >> Dr. Nic: Thanks Stu. (electronic music)
SUMMARY :
Brought to you by the Cloud Foundry Foundation. I'm Stu Miniman, and this is theCUBE's coverage I think you must've come to the conference and it's only the third time everyone seemed like they were a tech person. For some reason, you and I were joking, It was snowing and sleeting this morning, and it's almost snowing outside. you get props for the T-shirt. they're too busy making, you know, films. but nobody's supposed to know that Bruce Wayne was Batman. and how are you involved in the Cloud Foundry ecosystem? and then customers grew, and you know. talking about the reason they bought Pivotal Labs originally and you don't want to talk to them. Shadow IT is the rebellion against corporate IT, Yeah, so talk to me a little bit about your users. You know, you can't just add and every time you come there, and he is a monster, and so you tend to track their work. than you think. I interviewed Armon at the Cube-Con show last year. was, you know, how does the whole Kubernetes discussion Whereas, you know, Kubernetes will run things, is that the deploy experience. But absolutely, you need to worry about security, and they can put it anywhere they like. and you need to be pretty good. and so if we're going to go Fortune 5, you know, we'd like, and then if you have to run your own. that they need to realize that they shouldn't be doing. the moment they get that job, By the time they're talking to us, and right, they're using-- I think if you haven't ever had a project and give them responsibility to solve them. But Dr. Nic, the developers, and perhaps the CEO of the company but if you don't have empathy Alright, Dr. Nic, want to give you the final word. and so to come and just to get the energy Well, Dr. Nic, appreciate you helping us understand Dr. Nic: Thanks Stu.
ENTITIES
Entity | Category | Confidence |
---|---|---|
2014 | DATE | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Boston | LOCATION | 0.99+ |
 | ORGANIZATION | 0.99+ |
Cambridge | LOCATION | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
 | ORGANIZATION | 0.99+ |
Australia | LOCATION | 0.99+ |
America | LOCATION | 0.99+ |
Bruce Wayne | PERSON | 0.99+ |
Stark | PERSON | 0.99+ |
Cloud Foundry Foundation | ORGANIZATION | 0.99+ |
Nic | PERSON | 0.99+ |
Vegas | LOCATION | 0.99+ |
five companies | QUANTITY | 0.99+ |
10 years | QUANTITY | 0.99+ |
Stu Miniman | PERSON | 0.99+ |
Mitchell Hashimoto | PERSON | 0.99+ |
last year | DATE | 0.99+ |
Nic Williams | PERSON | 0.99+ |
Stark and Wayne | ORGANIZATION | 0.99+ |
ninth year | QUANTITY | 0.99+ |
Boston, Massachusetts | LOCATION | 0.99+ |
six years ago | DATE | 0.99+ |
Dr. | PERSON | 0.99+ |
Cloud Foundry | TITLE | 0.99+ |
Pivotal Web Services | ORGANIZATION | 0.99+ |
Warren Buffett | PERSON | 0.99+ |
Tony Stark | PERSON | 0.99+ |
Batman | PERSON | 0.99+ |
April | DATE | 0.99+ |
Wayne | PERSON | 0.99+ |
Cloud Foundry Summit 2018 | EVENT | 0.99+ |
five years | QUANTITY | 0.99+ |
third time | QUANTITY | 0.98+ |
GEs | ORGANIZATION | 0.98+ |
theCUBE | ORGANIZATION | 0.98+ |
two interviews | QUANTITY | 0.98+ |
Pivotal | ORGANIZATION | 0.98+ |
one | QUANTITY | 0.98+ |
Bosh | ORGANIZATION | 0.98+ |
Stu | PERSON | 0.98+ |
Hynes Convention Center | LOCATION | 0.98+ |
Pivotal Labs | ORGANIZATION | 0.98+ |
first | QUANTITY | 0.98+ |
15 | DATE | 0.98+ |
six other projects | QUANTITY | 0.98+ |
first-time | QUANTITY | 0.98+ |
Berkshire | LOCATION | 0.98+ |
today | DATE | 0.97+ |
10 iterations a day | QUANTITY | 0.97+ |
Azure | ORGANIZATION | 0.96+ |
CF Summit | EVENT | 0.96+ |
Iron Man | PERSON | 0.96+ |
Fords | ORGANIZATION | 0.96+ |
Heroku | ORGANIZATION | 0.95+ |
Mary O'Brien, IBM Securities | IBM Think 2018
>> Announcer: Live from Las Vegas, it's The Cube. Covering IBM Think 2018, brought to you by IBM. >> Welcome back to IBM Think 2018. My name is Dave Vellante and you're watching The Cube, the leader in live tech coverage. This is IBM's inaugural Think event. Companies consolidated about six major events into one. We're trying to figure it out, 30-40,000 people, there's too many people to count, it's just unbelievable. Mary O'Brien is here, she is the vice president of research and development at IBM in from Cork, Ireland. Mary, great to see you, thanks for coming on The Cube. >> Thank you, Dave. >> So tell us a little bit more about your role at IBM as head of research and development. >> Okay so I'm head of research and development for IBM Security explicitly so in that capacity I manage a worldwide team of researchers and developers and we take products from, you know, incubation, initial ideas all the way through to products in the field. Products that help defend businesses against cyber crime. >> So, Jenny was talking today about, you know, security is one of the tenets of your offerings at the core. >> Mary: Yes. >> So, everybody talks about security. You can't bolt it on, you know, there's a lot of sort of conversations around that. What does that mean, security at the core from a design and R & D perspective? >> That actually means that the developers of applications are actually aware of security best practices as they design, as they architect and design their applications. So that they don't deliver applications to the field that have vulnerabilities that can be exploited. So, instead of trying to secure a perimeter of an application or a product or, you know, a perimeter, full stop, they actually design security into the application. It makes it a much more efficient, much cheaper way to deliver security and also, you know, much stronger security base there. 
>> So, I wonder if you could relate, sort of, what you guys are doing in security with what's happened in the market over the last 10 or 15 years. So, it used to be security was, you know, hacktivists and you know throw some malware in and maybe do some disruption, has become cyber criminals, you know, big business now and then of course you've got nation states. >> Mm-hmm. >> How have you had to respond specifically within the R & D organization to deal with those threats? >> So, you know, you have described the evolution of cyber crime over the last years and for sure it's no longer kids in a basement you know, hacking to, for the fun of it. Cyber crime is big business and, you know, there's money to be made for cyber criminals. So, as a result they are looking to hack in and get high value assets out of enterprises, and of course, we as an organization and as a security business unit have had to respond to that. By really understanding, you know, what constitutes a very mature set of security competencies and practices and you know how we break down this massive problem into you know, bite sized consumable pieces that any business can consume and work into their enterprise in order to protect them. So, we have developed a portfolio of products that look at protecting all parts of your enterprise. You know, by infusing security everywhere, you know, on your devices, on the, you know, the perimeter of your business. Protecting your data, protecting all sorts, and we also have developed a huge practice of security professionals who actually will go out and do it for you or will, you know, assess your security posture and tell you where you've got problems and how to fix them. >> I remember a piece that our head of research, Peter Burris, wrote years ago and it was entitled something like "Bad User Behavior will Trump Good Security Every Time" and so my understanding is phishing is obviously one of the big problems today. 
How do you combat that, can you use machine intelligence to help people, you know, users that aren't security conscious sort of avoid the mistakes that they've been making? >> So, before I get into the, the complicated, advanced, you know, machine learning and artificial intelligence practices that we are bringing to bear now, you know, it's important to be clear that you know, a vast number of breaches come from the inside. So, they come from either the sloppy employee who doesn't change their password often or uses the same password for work and play and the same password everywhere. Or, you know, the unfortunate employee who clicks on a malicious link and you know, takes in some malware into their devices and malware that can actually you know, move horizontally through the business. Or it can come from you know, the end user or the insider with malicious intent. Okay, so, it's pretty clear to all of us that basic security hygiene is the fundamental so actually making sure that your laptop, your devices are patched. They have the latest security patches on board. Security practices are understood. Basic password hygiene and et cetera, that's kind of the start. >> Uh oh. >> Okay keep going. >> Okay, so-- >> I'm starting to sweat. >> So, you know, and of course, you know, in this era of cyber crime as we've seen it evolve in the last few years, the security industry has reached a perfect storm because it's well known that by 2020 there will be 1.2 million unfilled security professional roles, okay? Now, couple that with the fact that there are in the region, in the same time frame, in the region of 50 billion connected devices in the internet of things. So what's happening is the attack landscape and you know, the attack surface is increasing. The opportunity for the cyber criminalist to attack is increasing and the number of professionals available to fight that crime is not increasing because of this huge shortage. 
So, you know, you heard Jenny this morning talking about the era of man assisted by machine so infusing artificial intelligence and machine learning into security products and practices is another instantiation of man being assisted by machine and that is our, our tool and our new practice in the fight against cyber crime. >> So when I talk to security professionals consistently they tell us that they have more demand for their services than supply to chase down, you know, threats. They have, they struggle to prioritize. They struggle with just too many false positives and they need help. They're not as productive as they'd like to be. Can machine intelligence assist there? >> Absolutely, so computers, let's face it, computers are ideally placed to pour over vast quantities of data looking for trends, anomalies, and really finding the needle in the haystack. They have such a vast capacity to do this that's way out, you know, that really surpasses what a human can do and so you know, with, in this era of machine learning you can actually you know, equip a computer with a set of basic rules and you know, set it loose on vast quantities of data and let it test and iterate those rules with this data and become increasingly knowledgeable you know, about the data. The trends in the data, what the data, what good data looks like, what anomalous data looks like and at speed point out the anomalies and find that needle in the haystack. >> So, there's a stat, depending on which, you know, firm you look at or which organization you believe, but it's scary none the less. That the average penetration is only detected 250 or 350 days after the infiltration, and that is a scary stat, it would take a year to find out that somebody has infiltrated my organization or whatever it is, 200 days. Is that number shrinking, is the industry as a whole, not just IBM, attacking that figure? First of all, is it a valid figure, and are you able to attack that? 
>> Well, the figure is definitely scary. I don't know whether your figure is exactly-- >> Yeah, well the latest figure but it's a scary figure. >> Yeah. >> And it's well known that attackers will get in. So, of course, there's, uh there's the various phases of, you know, protecting yourself. So, you're going to try to avoid the attackers getting in in the first place. Using the various hygienic means of you know, keeping your devices, you know, clean and free from vulnerabilities and so on. But you've also got to be aware that the attacker does get in so now you've got to make sure that you limit the damage that they can cause when they're in. So, of course, you know security is a, you know you can take a layered approach to security. So you've got to firstly understand what is your most valuable data, where are your most valuable assets and layer up the levels of security around those first. So you make sure that if the attacker gets in, they don't get there and you limit the damage they can do and then of course you limit their ability to exfiltrate data and get anything out of your organization. Because I mean if they are just in there, of course they can do some damage. But, the real damage happens when they can manage to exfiltrate data and do something with that. >> So again Mary, it makes sense that artificial intelligence or machine intelligence could help with this but specifically what do you see as the future role of Watson as it relates to cyber security? >> So, I mentioned the shortage of security professionals and that growing problem, okay so Watson in our cyber security space acts as an assistant to the security analyst. So, we have taught Watson the language of cyber security, and Watson manages to ingest vast troves of unstructured security data, that means blogs and you know, written text of security data from, that's available on the internet and out there all day, everyday. 
It just ingests this and fills a corpus of knowledge with this, with these jewels of information. And, basically that information and that corpus of knowledge is now available to a security analyst who, you know, a junior security analyst could take years to become very efficient and to really be able to recognize the needle in the haystack themselves. But with the Watson assistant they can embellish their understanding and what they see and all of the, all of the relationships and the data that augments the detail about a cyber incident you know, fairly instantaneous. And it, you know, really augments their own knowledge with the knowledge that would take years to generate, you know. >> So, I wonder if we could talk about collaboration a little bit because this is good versus evil. You guys are like one of the super heroes and your competitors are also sort of super heroes. >> Of course. >> You got Batman, you got Superman, Catwoman, and Spiderman, et cetera. How do you guys collaborate and share in a, highly competitive industry? >> Well, there are various fora, you know, appearing for sharing, okay, so firstly you absolutely nailed the importance for sharing because you know, the cyber criminals share on the dark web. They actually share, they sell their wares, they trade, you know so very important for us to share as well. So, you know, there are various industry forums for sharing and also organizations like IBM have created collaborative capabilities like we have our X-force Exchange which is basically a sharing portal. So, any of our competitors or other security organizations or interested parties can create you know, a piece of work describing a particular incident that they are investigating or a particular event that's happening and others can add to it and they can share information. Now, historically people have not been keen to share in this space so it is an evolving event. 
>> So speaking of super heroes I got to ask ya, a lot of security professionals that I talk to say well when I was a kid I read comic books. You know, I envisioned saving the world. So, how did you, how did you get into this, and was that you as a kid? Did you like-- >> No, it wasn't. I'm not a long term security professional. But, I've been in technology and evolving products for, you know, in the telecommunication business and now security over many years. So, I got into this to bring that capability of delivering quality software and hardware products to the field back in 2013 when a part of our IBM security business needed some leadership. So, I had the opportunity to take my family to Atlanta, Georgia to lead a part of the IBM security business then. >> Well, it's a very challenging field. It's one of those, you know, never ending, you know, missions so thank you for your hard work and congratulations on all the success. >> Thank you David. >> Alright, appreciate you coming on The Cube, Mary. >> Thank you. >> Keep it right there everybody, we will be back with our next guest, you're watching The Cube. We're live from IBM Think 2018 in Las Vegas, be right back. (pleasant music)
SUMMARY :
Covering IBM Think 2018, brought to you by IBM. Mary O'Brien is here, she is the vice president about your role at IBM as head of research and development. and we take products from, you know, So, Jenny was talking today about, you know, You can't bolt it on, you know, there's of an application or a product or, you know, So, it used to be security was, you know, So, you know, you have described the evolution you know, users that aren't security conscious malware that can actually you know, and of course, you know, in this era to chase down, you know, threats. with a set of basic rules and you know, you know, firm you look at or which organization Well, the figure is definitely scary. the various phases of, you know, protecting yourself. a security analyst who, you know, a junior You guys are like one of the super heroes the importance for sharing because you know, the a lot of security professionals that I talk to products for, you know, in the telecommunication you know, missions so thank you for your Alright, appreciate you coming Keep it right there everybody, we will be back
ENTITIES
Entity | Category | Confidence |
---|---|---|
Jenny | PERSON | 0.99+ |
David | PERSON | 0.99+ |
Mary O'Brien | PERSON | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Mary | PERSON | 0.99+ |
Peter Burris | PERSON | 0.99+ |
Dave | PERSON | 0.99+ |
2013 | DATE | 0.99+ |
Superman | PERSON | 0.99+ |
Las Vegas | LOCATION | 0.99+ |
250 | QUANTITY | 0.99+ |
200 days | QUANTITY | 0.99+ |
Cork, Ireland | LOCATION | 0.99+ |
Batman | PERSON | 0.99+ |
Spiderman | PERSON | 0.99+ |
350 days | QUANTITY | 0.99+ |
2020 | DATE | 0.99+ |
Catwoman | PERSON | 0.99+ |
Atlanta, Georgia | LOCATION | 0.99+ |
today | DATE | 0.98+ |
IBM Securities | ORGANIZATION | 0.98+ |
one | QUANTITY | 0.98+ |
first | QUANTITY | 0.98+ |
Watson | PERSON | 0.98+ |
a year | QUANTITY | 0.97+ |
The Cube | TITLE | 0.97+ |
30-40,000 people | QUANTITY | 0.96+ |
Trump | PERSON | 0.95+ |
1.2 million unfilled security professional roles | QUANTITY | 0.93+ |
years ago | DATE | 0.92+ |
First | QUANTITY | 0.92+ |
firstly | QUANTITY | 0.91+ |
this morning | DATE | 0.9+ |
50 billion connected devices | QUANTITY | 0.9+ |
six major events | QUANTITY | 0.89+ |
too many people | QUANTITY | 0.88+ |
IBM Think 2018 | EVENT | 0.87+ |
Think | EVENT | 0.82+ |
last few years | DATE | 0.8+ |
X-force Exchange | TITLE | 0.75+ |
2018 | DATE | 0.72+ |
Cube | TITLE | 0.7+ |
Watson | TITLE | 0.69+ |
Think 2018 | EVENT | 0.65+ |
last | DATE | 0.62+ |
15 years | QUANTITY | 0.61+ |
last years | DATE | 0.58+ |
10 | QUANTITY | 0.54+ |
tenants | QUANTITY | 0.39+ |
Think | COMMERCIAL_ITEM | 0.33+ |
Data Science: Present and Future | IBM Data Science For All
>> Announcer: Live from New York City it's The Cube, covering IBM data science for all. Brought to you by IBM. (light digital music) >> Welcome back to data science for all. It's a whole new game. And it is a whole new game. >> Dave Vellante, John Walls here. We've got quite a distinguished panel. So it is a new game-- >> Well we're in the game, I'm just happy to be-- (both laugh) Have a swing at the pitch. >> Well let's see what we have here. Five distinguished members of our panel. It'll take me a minute to get through the introductions, but believe me they're worth it. Jennifer Shin joins us. Jennifer's the founder of 8 Path Solutions, the director of the data science of Comcast and part of the faculty at UC Berkeley and NYU. Jennifer, nice to have you with us, we appreciate the time. Joe McKendrick an analyst and contributor of Forbes and ZDNet, Joe, thank you for being here as well. Another ZDNetter next to him, Dion Hinchcliffe, who is a vice president and principal analyst of Constellation Research and also contributes to ZDNet. Good to see you, sir. To the back row, but that doesn't mean anything about the quality of the participation here. Bob Hayes with a killer Batman shirt on by the way, which we'll get to explain in just a little bit. He runs the Business over Broadway. And Joe Caserta, who is the founder of Caserta Concepts. Welcome to all of you. Thanks for taking the time to be with us. Jennifer, let me just begin with you. Obviously as a practitioner you're very involved in the industry, you're on the academic side as well. We mentioned Berkeley, NYU, steep experience. So I want you to kind of take your foot in both worlds and tell me about data science. I mean where do we stand now from those two perspectives? How have we evolved to where we are? And how would you describe, I guess the state of data science? >> Yeah so I think that's a really interesting question. There's a lot of changes happening. 
In part because data science has now become much more established, both in the academic side as well as in industry. So now you see some of the bigger problems coming out. People have managed to have data pipelines set up. But now there are these questions about models and accuracy and data integration. So the really cool stuff from the data science standpoint. We get to get really into the details of the data. And I think on the academic side you now see undergraduate programs, not just graduate programs, but undergraduate programs being involved. UC Berkeley just did a big initiative that they're going to offer data science to undergrads. So that's a huge news for the university. So I think there's a lot of interest from the academic side to continue data science as a major, as a field. But I think in industry one of the difficulties you're now having is businesses are now asking that question of ROI, right? What do I actually get in return in the initial years? So I think there's a lot of work to be done and just a lot of opportunity. It's great because people now understand better with data sciences, but I think data sciences have to really think about that seriously and take it seriously and really think about how am I actually getting a return, or adding a value to the business? >> And there's lot to be said is there not, just in terms of increasing the workforce, the acumen, the training that's required now. It's a still relatively new discipline. So is there a shortage issue? Or is there just a great need? Is the opportunity there? I mean how would you look at that? >> Well I always think there's opportunity to be smart. If you can be smarter, you know it's always better. It gives you advantages in the workplace, it gets you an advantage in academia. The question is, can you actually do the work? The work's really hard, right? You have to learn all these different disciplines, you have to be able to technically understand data. 
Then you have to understand it conceptually. You have to be able to model with it, you have to be able to explain it. There's a lot of aspects that you're not going to pick up overnight. So I think part of it is endurance. Like are people going to feel motivated enough and dedicate enough time to it to get very good at that skill set. And also of course, you know in terms of industry, will there be enough interest in the long term that there will be a financial motivation. For people to keep staying in the field, right? So I think it's definitely a lot of opportunity. But that's always been there. Like I tell people I think of myself as a scientist and data science happens to be my day job. That's just the job title. But if you are a scientist and you work with data you'll always want to work with data. I think that's just an inherent need. It's kind of a compulsion, you just kind of can't help yourself, but dig a little bit deeper, ask the questions, you can't not think about it. So I think that will always exist. Whether or not it's an industry job in the way that we see it today, and like five years from now, or 10 years from now. I think that's something that's up for debate. >> So all of you have watched the evolution of data and how it effects organizations for a number of years now. If you go back to the days when data warehouse was king, we had a lot of promises about 360 degree views of the customer and how we were going to be more anticipatory in terms and more responsive. In many ways the decision support systems and the data warehousing world didn't live up to those promises. They solved other problems for sure. And so everybody was looking for big data to solve those problems. And they've begun to attack many of them. We talked earlier in The Cube today about fraud detection, it's gotten much, much better. Certainly retargeting of advertising has gotten better. But I wonder if you could comment, you know maybe start with Joe. 
As to the effect that data and data sciences had on organizations in terms of fulfilling that vision of a 360 degree view of customers and anticipating customer needs. >> So. Data warehousing, I wouldn't say failed. But I think it was unfinished in order to achieve what we need done today. At the time I think it did a pretty good job. I think it was the only place where we were able to collect data from all these different systems, have it in a single place for analytics. The big difference between what I think, between data warehousing and data science is data warehouses were primarily made for the consumer to human beings. To be able to have people look through some tool and be able to analyze data manually. That really doesn't work anymore, there's just too much data to do that. So that's why we need to build a science around it so that we can actually have machines actually doing the analytics for us. And I think that's the biggest stride in the evolution over the past couple of years, that now we're actually able to do that, right? It used to be very, you know you go back to when data warehouses started, you had to be a deep technologist in order to be able to collect the data, write the programs to clean the data. But now your average casual IT person can do that. Right now I think we're back in data science where you have to be a fairly sophisticated programmer, analyst, scientist, statistician, engineer, in order to do what we need to do, in order to make machines actually understand the data. But I think part of the evolution, we're just in the forefront. We're going to see over the next, not even years, within the next year I think a lot of new innovation where the average person within business and definitely the average person within IT will be able to do as easily say, "What are my sales going to be next year?" As easy as it is to say, "What were my sales last year." Where now it's a big deal. 
Right now in order to do that you have to build some algorithms, you have to be a specialist on predictive analytics. And I think, you know as the tools mature, as people using data matures, and as the technology ecosystem for data matures, it's going to be easier and more accessible. >> So it's still too hard. (laughs) That's something-- >> Joe C.: Today it is yes. >> You've written about and talked about. >> Yeah no question about it. We see this citizen data scientist. You know we talked about the democratization of data science but the way we talk about analytics and warehousing and all the tools we had before, they generated a lot of insights and views on the information, but they didn't really give us the science part. And that's, I think that what's missing is the forming of the hypothesis, the closing of the loop. We now have use of this data, but are we changing, are we thinking about it strategically? Are we learning from it and then feeding that back into the process. I think that's the big difference between data science and the analytics side. But, you know just like Google made search available to everyone, not just people who had highly specialized indexers or crawlers. Now we can have tools that make these capabilities available to anyone. You know going back to what Joe said I think the key thing is we now have tools that can look at all the data and ask all the questions. 'Cause we can't possibly do it all ourselves. Our organizations are increasingly awash in data. Which is the life blood of our organizations, but we're not using it, you know this whole concept of dark data. And so I think the concept, or the promise of opening these tools up for everyone to be able to access those insights and activate them, I think that, you know, that's where it's headed. >> This is kind of where the T shirt comes in right? So Bob if you would, so you've got this Batman shirt on. 
We talked a little bit about it earlier, but it plays right into what Dion's talking about. About tools and, I don't want to spoil it, but you go ahead (laughs) and tell me about it. >> Right, so. Batman is a super hero, but he doesn't have any supernatural powers, right? He can't fly on his own, he can't become invisible on his own. But the thing is he has the utility belt and he has these tools he can use to help him solve problems. For example he has the bat ring when he's confronted with a building that he wants to get over, right? So he pulls it out and uses that. So as data professionals we have all these tools now that these vendors are making. We have IBM SPSS, we have Data Science Experience, IBM Watson, that these data pros can now use as part of their utility belt and solve problems that they're confronted with. So if you're ever confronted with like a churn problem and you have somebody who has access to that data they can put that into IBM Watson, ask a question and it'll tell you what's the key driver of churn. So it's not that you have to be a superhuman to be a data scientist, but these tools will help you solve certain problems and help your business go forward. >> Joe McKendrick, do you have a comment? >> Does that make the Batmobile the Watson? (everyone laughs) Analogy? >> I was just going to add that, you know all of the billionaires in the world today and none of them decided to become Batman yet. It's very disappointing. >> Yeah. (Joe laughs) >> Go ahead Joe. >> And I just want to add some thoughts to our discussion about what happened with data warehousing. I think it's important to point out as well that data warehousing, as it existed, was fairly successful but for larger companies. Data warehousing is a very expensive proposition, it remains an expensive proposition. Something that's in the domain of the Fortune 500. But today's economy is based on a very entrepreneurial model. The Fortune 500s are out there of course it's ever shifting. 
But you have a lot of smaller companies a lot of people with start ups. You have people within divisions of larger companies that want to innovate and not be tied to the corporate balance sheet. They want to be able to go through, they want to innovate and experiment without having to go through finance and the finance department. So there's all these open source tools available. There's cloud resources as well as open source tools. Hadoop of course being a prime example where you can work with the data and experiment with the data and practice data science at a very low cost. >> Dion mentioned the C word, citizen data scientist last year at the panel. We had a conversation about that. And the data scientists on the panel generally were like, "Stop." Okay, we're not all of a sudden going to turn everybody into data scientists however, what we want to do is get people thinking about data, more focused on data, becoming a data driven organization. I mean as a data scientist I wonder if you could comment on that. >> Well I think so the other side of that is, you know there are also many people who maybe didn't, you know follow through with science, 'cause it's also expensive. A PhD takes a lot of time. And you know if you don't get funding it's a lot of money. And for very little security if you think about how hard it is to get a teaching job that's going to give you enough of a pay off to pay that back. Right, the time that you took off, the investment that you made. So I think the other side of that is by making data more accessible, you allow people who could have been great in science, have an opportunity to be great data scientists. And so I think for me the idea of citizen data scientist, that's where the opportunity is. I think in terms of democratizing data and making it available for everyone, I feel as though it's something similar to the way we didn't really know what KPIs were, maybe 20 years ago. 
People didn't use it as readily, didn't teach it in schools. I think maybe 10, 20 years from now, some of the things that we're building today from data science, hopefully more people will understand how to use these tools. They'll have a better understanding of working with data and what that means, and just data literacy right? Just being able to use these tools and be able to understand what data's saying and actually what it's not saying. Which is the thing that most people don't think about. But you can also say that data doesn't say anything. There's a lot of noise in it. There's too much noise to be able to say that there is a result. So I think that's the other side of it. So yeah I guess for me, in terms of a citizen data scientist, I think it's a great idea to have that, right? But at the same time of course everyone kind of emphasized you don't want everyone out there going, "I can be a data scientist without education, "without statistics, without math," without understanding of how to implement the process. I've seen a lot of companies implement the same sort of process from 10, 20 years ago just on Hadoop instead of SQL. Right and it's very inefficient. And the only difference is that you can build more tables wrong than they could before. (everyone laughs) Which is, I guess-- >> For less. >> It's an accomplishment, and for less, it's cheaper, yeah. >> It is cheaper. >> Otherwise we're like I'm not a data scientist but I did stay at a Holiday Inn Express last night, right? >> Yeah. (panelists laugh) And there's like a little bit of pride that like they used 2,000, you know they used 2,000 computers to do it. Like a little bit of pride about that, but you know of course maybe not a great way to go. I think 20 years ago we couldn't do that, right? One computer was already an accomplishment to have that resource. 
So I think you have to think about the fact that if you're doing it wrong, you're going to just make that mistake bigger, which is also the other side of working with data. >> Sure, Bob. >> Yeah I have a comment about that. I've never liked the term citizen data scientist or citizen scientist. I get the point of it and I think employees within companies can help in the data analytics problem by maybe being a data collector or something. I mean I would never have just somebody become a scientist based on a few classes he or she takes. It's like saying like, "Oh I'm going to be a citizen lawyer." And so you come to me with your legal problems, or a citizen surgeon. Like you need training to be good at something. You can't just be good at something just 'cause you want to be. >> John: Joe you wanted to say something too on that. >> Since we're in New York City I'd like to use the analogy of a real scientist versus a data scientist. So a real scientist requires tools, right? And the tools are not new, like microscopes and a laboratory and a clean room. And these tools have evolved over years and years, and since we're in New York we could walk within a 10 block radius and buy any of those tools. It doesn't make us a scientist because we use those tools. I think with data, you know, making the tools evolve and become easier to use, you know like Bob was saying, it doesn't make you a better data scientist, it just makes the data more accessible. You know we can go buy a microscope, we can go buy Hadoop, we can buy any kind of tool in a data ecosystem, but it doesn't really make you a scientist. I'm very involved in the NYU data science program and the Columbia data science program, like these kids are brilliant. You know these kids are not someone who is, you know just trying to run a day to day job, you know in corporate America. I think the people who are running the day to day job in corporate America are going to be the recipients of data science. 
Just like people who take drugs, right? As a result of a smart data scientist coming up with a formula that can help people, I think we're going to make it easier to distribute the data that can help people with all the new tools. But it doesn't really make it, you know the access to the data and tools available doesn't really make you a better data scientist. Without, like Bob was saying, without better training and education. >> So how-- I'm sorry, how do you then, if it's not for everybody, but yet I'm the user at the end of the day at my company and I've got these reams of data before me, how do you make it make better sense to me then? So that's where machine learning comes in or artificial intelligence and all this stuff. So how at the end of the day, Dion? How do you make it relevant and usable, actionable to somebody who might not be as practiced as you would like? >> I agree with Joe that many of us will be the recipients of data science. Just like you had to be a computer scientist at one point to develop programs for a computer, now we can get the programs. You don't need to be a computer scientist to get a lot of value out of our IT systems. The same thing's going to happen with data science. There's far more demand for data science than there ever could be produced by, you know having an ivory tower filled with data scientists. Which we need those guys, too, don't get me wrong. But we need to productize it and make it available in packages such that it can be consumed. The outputs and even some of the inputs can be provided by mere mortals, whether that's machine learning or artificial intelligence or bots that go off and run the hypotheses and select the algorithms maybe with some human help. We have to productize it. This is the concept of data science as a service, which is becoming a thing now. It's, "I need this, I need this capability at scale. "I need it fast and I need it cheap." The commoditization of data science is going to happen. 
>> That goes back to what I was saying about, the recipient of data science is also machines, right? Because I think the other thing that's happening now in the evolution of data is that, you know the data is, it's so tightly coupled. Back when you were talking about data warehousing you have all the business transactions then you take the data out of those systems, you put them in a warehouse for analysis, right? Maybe they'll make a decision to change that system at some point. Now the analytics platform and the business application are very tightly coupled. They become dependent upon one another. So you know people who are using the applications are now able to take advantage of the insights of data analytics and data science, just through the app. Which never really existed before. >> I have one comment on that. You were talking about how do you get the end user more involved, well like we said earlier data science is not easy, right? As an end user, I encourage you to take a stats course, just a basic stats course, understanding what a mean is, variability, regression analysis, just basic stuff. So you as an end user can get more, or glean more insight from the reports that you're given, right? If you go to France and don't know French, then people can speak really slowly to you in French, you're not going to get it. You need to understand the language of data to get value from the technology we have available to us. >> Incidentally French is one of the languages that you have the option of learning if you're a mathematician. So math PhDs are required to learn a second language. France being the country of algebra, that's one of the languages you could actually learn. Anyway tangent. But going back to the point. So statistics courses, definitely encourage it. I teach statistics. And one of the things that I'm finding as I go through the process of teaching it I'm actually bringing in my experience. 
And by bringing in my experience I'm actually kind of making the students think about the data differently. So the other thing people don't think about is the fact that like statisticians typically were expected to do, you know, just basic sort of tasks. In a sense that their knowledge is specialized, right? But the day to day operations was they ran some data, you know they ran a test on some data, looked at the results, interpreted the results based on what they were taught in school. They didn't develop that model, a lot of times they just understood what the tests were saying, especially in the medical field. So when you think about things like, we have words like population, census. Which is when you take data from every single, you have every single data point versus a sample, which is a subset. It's a very different story now that we're collecting faster than it used to be. It used to be the idea that you could collect information from everyone. Like it happens once every 10 years, we built that in. But nowadays you know, you hear about Facebook, for instance, I think they claimed earlier this year that their data was more accurate than the census data. So now there are these claims being made about which data source is more accurate. And I think the other side of this is now statisticians are expected to know data in a different way than they were before. So it's not just changing as a field in data science, but I think the sciences that are using data are also changing their fields as well. >> Dave: So is sampling dead? >> Well no, because-- >> Should it be? (laughs) >> Well if you're sampling wrong, yes. That's really the question. >> Okay. You know it's been said that the data doesn't lie, people do. Organizations are very political. Oftentimes you know, lies, damned lies and statistics, Benjamin Disraeli. Are you seeing a change in the way in which organizations are using data in the context of the politics? 
So, some strong P&L manager say gets data and crafts it in a way that he or she can advance their agenda. Or they'll maybe attack a data set that is, probably should drive them in a different direction, but might be antithetical to their agenda. Are you seeing data, you know we talked about democratizing data, are you seeing that reduce the politics inside of organizations? >> So you know we've always used data to tell stories at the top level of an organization, that's what it's all about. And I still see very much that no matter how much data science or, the access to the truth through looking at the numbers, that story telling is still the political filter through which all that data still passes, right? But with the advent of things like blockchain, more and more corporate records and corporate information is going to end up in these open and shared repositories where there is no alternate truth. It'll come back to whoever tells the best stories at the end of the day. So I still see the organizations are very political. We are seeing now more open data though. Open data initiatives are a big thing, both in government and in the private sector. It is having an effect, but it's slow and steady. So that's what I see. >> Um, um, go ahead. >> I was just going to say as well. Ultimately I think data driven decision making is a great thing. And it's especially useful at the lower tiers of the organization where you have the routine day to day decisions that could be automated through machine learning and deep learning. The algorithms can be improved on a constant basis. On the upper levels, you know that's why you pay executives the big bucks in the upper levels to make the strategic decisions. And data can help them, but ultimately, data, IT, technology alone will not create new markets, it will not drive new businesses, it's up to human beings to do that. The technology is the tool to help them make those decisions. 
But creating businesses, growing businesses, is very much a human activity. And that's something I don't see ever getting replaced. Technology might replace many other parts of the organization, but not that part. >> I tend to be a foolish optimist when it comes to this stuff. >> You do. (laughs) >> I do believe that data will make the world better. I do believe that data doesn't lie, people lie. You know I think as we start, I'm already seeing trends in industries, all different industries where, you know conventional wisdom is starting to get trumped by analytics. You know I think it's still up to the human being today to ignore the facts and go with what they think in their gut and sometimes they win, sometimes they lose. But generally if they lose the data will tell them that they should have gone the other way. I think as we start relying more on data and trusting data through artificial intelligence, as we start making our lives a little bit easier, as we start using smart cars for safety, before replacement of humans. As we start, you know, using data really and analytics and data science really as the bumpers, instead of the vehicle, eventually we're going to start to trust it as the vehicle itself. And then it's going to make lying a little bit harder. >> Okay, so great, excellent. Optimism, I love it. (John laughs) So I'm going to play devil's advocate here a little bit. There's a couple elephant in the room topics that I want to explore a little bit. >> Here it comes. >> There was an article today in Wired. And it was called, Why AI is Still Waiting for Its Ethics Transplant. And, I will just read a little segment from there. It says, new ethical frameworks for AI need to move beyond individual responsibility to hold powerful industrial, government and military interests accountable as they design and employ AI. 
When tech giants build AI products, too often user consent, privacy and transparency are overlooked in favor of frictionless functionality that supports profit driven business models based on aggregate data profiles. This is from Kate Crawford and Meredith Whittaker who founded AI Now. And they're calling for sort of, almost clinical trials on AI, if I could use that analogy. Before you go to market you've got to test the human impact, the social impact. Thoughts. >> And also have the ability for a human to intervene at some point in the process. This goes way back. Is everybody familiar with the name Stanislav Petrov? He's the Soviet officer who back in 1983, he was in the control room, I guess somewhere outside of Moscow in the control room, which detected a nuclear missile attack against the Soviet Union coming out of the United States. Ordinarily I think if this was an entirely AI driven process we wouldn't be sitting here right now talking about it. But this gentleman looked at what was going on on the screen and, I'm sure he's accountable to his authorities in the Soviet Union. He probably got in a lot of trouble for this, but he decided to ignore the signals, ignore the data coming from the Soviet satellites. And as it turned out, of course he was right. The Soviet satellites were seeing glints of the sun and they were interpreting those glints as missile launches. And I think that's a great example why, you know every situation of course doesn't mean the end of the world, (laughs) it was in this case. But it's a great example why there needs to be a human component, a human ability for human intervention at some point in the process. >> So other thoughts. I mean organizations are driving AI hard for profit. Best minds of our generation are trying to figure out how to get people to click on ads. Jeff Hammerbacher is famous for saying it. >> You can use data for a lot of things, data analytics, you can solve, you can cure cancer. 
You can make customers click on more ads. It depends on what your goal is. But, there are ethical considerations we need to think about. When we have data that will have a racial bias against blacks and have them have higher prison sentences or so forth or worse credit scores, so forth. That has an impact on a broad group of people. And as a society we need to address that. And as scientists we need to consider how are we going to fix that problem? Cathy O'Neil in her book, Weapons of Math Destruction, excellent book, I highly recommend that your listeners read that book. And she talks about these issues about if AI, if algorithms have a widespread impact, if they adversely impact a protected group. And I forget the last criteria, but like we need to really think about these things as a people, as a country. >> So I always think the idea of ethics is interesting. So I had this conversation come up a lot of times when I talk to data scientists. I think as a concept, right as an idea, yes you want things to be ethical. The question I always pose to them is, "Well in the business setting "how are you actually going to do this?" 'Cause I find the most difficult thing working as a data scientist, is to be able to make the day to day decision of when someone says, "I don't like that number," how do you actually get around that. If that's the right data to be showing someone or if that's accurate. And say the business decides, "Well we don't like that number." Many people feel pressured to then change the data, or change what the data shows. So I think being able to educate people to be able to find ways to say what the data is saying, but not going past some line where it's a lie, where it's unethical. 'Cause you can also say what data doesn't say. You don't always have to say what the data does say. You can leave it as, "Here's what we do know, "but here's what we don't know." There's a don't know part that many people will omit when they talk about data. 
So I think, you know especially when it comes to things like AI it's tricky, right? Because I always tell people I don't know why everyone thinks AI's going to be so amazing. I started in industry by fixing problems with computers that people didn't realize computers had. For instance when you have a system, a lot of bugs, we all have bug reports that we've probably submitted. I mean really it's nowhere near the point where it's going to start dominating our lives and taking over all the jobs. Because frankly it's not that advanced. It's still run by people, still fixed by people, still managed by people. I think with ethics, you know a lot of it has to do with the regulations, what the laws say. That's really going to be what's involved in terms of what people are willing to do. A lot of businesses, they want to make money. If there are no rules that say they can't do certain things to make money, then there's no restriction. I think the other thing to think about is we as consumers, like everyday in our lives, we shouldn't separate the idea of data as a business. We think of it as a business person, from our day to day consumer lives. Meaning, yes I work with data. Incidentally I also always opt out of my credit card, you know when they send you that information, they make you actually mail them, like old school mail, snail mail like a document that says, okay I don't want to be part of this data collection process. Which I always do. It's a little bit more work, but I go through that step of doing that. Now if more people did that, perhaps companies would feel more incentivized to pay you for your data, or give you more control of your data. Or at least you know, if a company's going to collect information, I'd want there to be certain processes in place to ensure that it doesn't just get sold, right? For instance if a start up gets acquired what happens with that data they have on you? You agreed to give it to the start up. But I mean what are the rules on that? 
So I think we have to really think about the ethics from not just, you know, someone who's going to implement something but as consumers what control we have for our own data. 'Cause that's going to directly impact what businesses can do with our data. >> You know you mentioned data collection. So slightly on that subject. All these great new capabilities we have coming. We talked about what's going to happen with media in the future and what 5G technology's going to do to mobile and these great bandwidth opportunities. The internet of things and the internet of everywhere. And all these great inputs, right? Do we have an arms race like are we keeping up with the capabilities to make sense of all the new data that's going to be coming in? And how do those things square up in this? Because the potential is fantastic, right? But are we keeping up with the ability to make it make sense and to put it to use, Joe? >> So I think data ingestion and data integration is probably one of the biggest challenges. I think, especially as the world is starting to become more dependent on data. I think you know, just because we're dependent on numbers we've come up with GAAP, which is generally accepted accounting principles that can be audited and proven whether it's true or false. I think in our lifetime we will see something similar to that, where we will have formal checks and balances of data that we use that can be audited. Getting back to you know what Dave was saying earlier about, I personally would rather trust a machine that was programmed to do the right thing than trust a politician or some leader that may have their own agenda. And I think the other thing about machines is that they are auditable. You know you can look at the code and see exactly what it's doing and how it's doing it. Human beings not so much. So I think getting to the truth, even if the truth isn't the answer that we want, I think is a positive thing. 
It's something that we can't do today that once we start relying on machines to do we'll be able to get there. >> Yeah I was just going to add that we live in exponential times. And the challenge is that the way that we're structured traditionally as organizations is not allowing us to absorb advances exponentially, it's linear at best. Everyone talks about change management and how are we going to do digital transformation. Evidence shows that technology's forcing the leaders and the laggards apart. There's a few leading organizations that are eating the world and they seem to be somehow rolling out new things. I don't know how Amazon rolls out all this stuff. There's all this artificial intelligence and the IOT devices, Alexa, natural language processing and that's just a fraction, it's just a tip of what they're releasing. So it just shows that there are some organizations that have path found the way. Most of the Fortune 500 from the year 2000 are gone already, right? The disruption is happening. And so we have to find some way to adopt these new capabilities and deploy them effectively or the writing is on the wall. I spent a lot of time exploring this topic, how are we going to get there and all of us have a lot of hard work is the short answer. >> I read that there's going to be more data, or it was predicted, more data created in this year than in the past, I think it was five, 5,000 years. >> Forever. (laughs) >> And then, to mix the statistics, that we're analyzing currently less than 1% of the data. So taking those numbers and hearing what you're all saying it's like, we're not keeping up, it seems like we're, it's not even linear. I mean that gap is just going to grow and grow and grow. How do we close that? >> There's a guy out there named Chris Dancy, he's known as the human cyborg. He has 700 sensors all over his body. And his theory is that data's not new, having access to the data is new. 
You know we've always had a blood pressure, we've always had a sugar level. But we were never able to actually capture it in real time before. So now that we can capture and harness it, now we can be smarter about it. So I think that being able to use this information is really incredible like, this is something that over our lifetime we've never had and now we can do it. Hence the big explosion in data. But I think how we use it and have it governed I think is the challenge right now. It's kind of cowboys and indians out there right now. And without proper governance and without rigorous regulation I think we are going to have some bumps in the road along the way. >> The data's the oil; the question is how are we actually going to operationalize around it? >> Or find it. Go ahead. >> I will say the other side of it is, so if you think about information, we always have the same amount of information right? What we choose to record however, is a different story. Now if you wanted to know things about the Olympics, but you decide to collect information every day for years instead of just the Olympic year, yes you have a lot of data, but did you need all of that data? For that question about the Olympics, you don't need to collect data during years there are no Olympics, right? Unless of course you're comparing it relative. But I think that's another thing to think about. Just 'cause you collect more data does not mean that data will produce more statistically significant results, it does not mean it'll improve your model. You can be collecting data about your shoe size trying to get information about your hair. I mean it really does depend on what you're trying to measure, what your goals are, and what the data's going to be used for. If you don't factor the real world context into it, then yeah you can collect data, you know an infinite amount of data, but you'll never process it. Because you have no question to ask, you're not looking to model anything. 
There is no universal truth about everything, that just doesn't exist out there. >> I think she's spot on. It comes down to what kind of questions are you trying to ask of your data? You can have one given database that has 100 variables in it, right? And you can ask it five different questions, all valid questions and that data may have those variables that'll tell you what's the best predictor of Churn, what's the best predictor of cancer treatment outcome. And if you can ask the right question of the data you have then that'll give you some insight. Just data for data's sake, that's just hype. We have a lot of data but it may not lead to anything if we don't ask it the right questions. >> Joe. >> I agree but I just want to add one thing. This is where the science in data science comes in. Scientists often will look at data that's already been in existence for years, weather forecasts, weather data, climate change data for example that go back to data charts and so forth going back centuries if that data is available. And they reformat, they reconfigure it, they get new uses out of it. And the potential I see with the data we're collecting is it may not be of use to us today, because we haven't thought of ways to use it, but maybe 10, 20, even 100 years from now someone's going to think of a way to leverage the data, to look at it in new ways and to come up with new ideas. That's just my thought on the science aspect. >> Knowing what you know about data science, why did Facebook miss Russia and the fake news trend? They came out and admitted it. You know, we miss it, why? Could they have, is it because they were focused elsewhere? Could they have solved that problem? (crosstalk) >> It's what you said which is are you asking the right questions and if you're not looking for that problem in exactly the way that it occurred you might not be able to find it. >> I thought the ads were paid in rubles. 
Shouldn't that be your first clue (panelists laugh) that something's amiss? >> You know red flag, so to speak. >> Yes. >> I mean Bitcoin maybe it could have hidden it. >> Bob: Right, exactly. >> I would think too that what happened last year actually was the end of an age of optimism. I'll bring up the Soviet Union again, (chuckles). It collapsed back in 1991, 1990, 1991, Russia was reborn. And I think there was a general feeling of optimism in the '90s through the 2000s that Russia is now being well integrated into the world economy as other nations all over the globe, all continents are being integrated into the global economy thanks to technology. And technology is lifting entire continents out of poverty and ensuring more connectedness for people. Across Africa, India, Asia, we're seeing those economies, they're very different countries than 20 years ago and that extended into Russia as well. Russia is part of the global economy. We're able to communicate as a global, a global network. I think as a result we kind of overlooked the dark side that occurred. >> John: Joe? >> Again, the foolish optimist here. But I think that... It shouldn't be the question like how did we miss it? It's do we have the ability now to catch it? And I think without data science without machine learning, without being able to train machines to look for patterns that involve corruption or result in corruption, I think we'd be out of luck. But now we have those tools. And now hopefully, optimistically, by the next election we'll be able to detect these things before they become public. >> It's a loaded question because my premise was Facebook had the ability and the tools and the knowledge and the data science expertise if in fact they wanted to solve that problem, but they were focused on other problems, which is how do I get people to click on ads? >> Right they had the ability to train the machines, but they were giving the machines the wrong training. >> Looking under the wrong rock. 
>> (laughs) That's right. >> It is easy to play armchair quarterback. Another topic I wanted to ask the panel about is IBM Watson. You guys spend time in the Valley, I spend time in the Valley. People in the Valley pooh-pooh Watson. Ah, Google, Facebook, Amazon, they've got the best AI. Watson, and some of that's fair criticism. Watson's a heavy lift, very services oriented, you've just got to apply it in a very focused way. At the same time Google's trying to get you to click on ads, as is Facebook, Amazon's trying to get you to buy stuff. IBM's trying to solve cancer. Your thoughts on that sort of juxtaposition of the different AI suppliers, and there may be others. Oh, nobody wants to touch this one, come on. I told you, elephant in the room questions. >> Well I mean you're looking at two different, very different types of organizations. One which has really spent decades in applying technology to business, and these other companies are ones that are primarily into the consumer, right? When we talk about things like IBM Watson you're looking at a very different type of solution. You used to be able to buy IT and once you installed it you pretty much could get it to work and store your records or, you know, do whatever it is you needed it to do. But these types of tools, like Watson, actually try to learn your business. And it needs to spend time doing that, watching the data and having its models tuned. And so you don't get the results right away. And I think that's been kind of the challenge that organizations like IBM have had. Like this is a different type of technology solution, one that has to actually learn first before it can provide value. And so I think you know you have organizations like IBM that are much better at applying technology to business, and then they have the further hurdle of having to try to apply these tools that work in very different ways. There's education too on the side of the buyer.
>> I'd have to say that you know I think there's plenty of businesses out there also trying to solve very significant, meaningful problems. You know with Microsoft AI and Google AI and IBM Watson, I think it's not really the tool that matters, like we were saying earlier. A fool with a tool is still a fool, regardless of who the manufacturer of that tool is. And I think you know having a thoughtful, intelligent, trained, educated data scientist using any of these tools can be equally effective. >> So do you not see core AI competence, and I left out Microsoft, as a strategic advantage for these companies? Is it going to be so ubiquitous and available that virtually anybody can apply it? Or is all the investment in R&D and AI going to pay off for these guys? >> Yeah, so I think there's different levels of AI, right? So there's AI where you can actually improve the model. I remember when I was invited, when Watson was kind of first out, by IBM to a private sort of presentation. And my question was, "Okay, so when do I get to access the corpus?" The corpus being sort of the foundation of NLP, which is natural language processing. So it's what you use as almost like a dictionary. Like how you're actually going to measure things, or look things up. And they said, "Oh you can't." "What do you mean I can't?" It's like, "We do that." "So you're telling me as a data scientist you're expecting me to rely on the fact that you did it better than me and I should rely on that." I think over the years after that IBM started opening it up and offering different ways of being able to access the corpus and work with that data. But I remember at the first Watson hackathon there were only two corpora available. It was either the travel or medicine. There was no other foundational data available.
So I think one of the difficulties was, you know, IBM being a little bit more on the forefront of it, they kind of had that burden of having to develop these systems and learning kind of the hard way that if you don't have the right models and you don't have the right data and you don't have the right access, that's going to be a huge limiter. I think with things like medical, medical information, that's extremely difficult data to start with. Partly because you know anything that you do find or don't find, the impact is significant. If I'm looking at things like what people clicked on, the impact of using that data wrong, it's minimal. You might lose some money. If you do that with healthcare data, if you do that with medical data, people may die, like this is a much more difficult data set to start with. So I think from a scientific standpoint it's great to have any information about a new technology, new process. That's the nice thing, is that IBM's obviously invested in it and collected information. I think the difficulty there though is just 'cause you have it you can't solve everything. And I feel like, from someone who works in technology, I think in general when you appeal to developers you try not to market. And with Watson it's very heavily marketed, which tends to turn off people who are more from the technical side. Because I think they don't like it when it's gimmicky, in part because they do the opposite of that. They're always trying to build up the technical components of it. They don't like it when you're trying to convince them that you're selling them something when you could just give them the specs and look at it. So it could be something as simple as communication. But I do think it is valuable to have had a company who leads on the forefront of that and tried to do so, so we can actually learn from what IBM has learned from this process. >> But you're an optimist. (John laughs) All right, good. >> Just one more thought. >> Joe go ahead first.
>> Joe: I want to see how Alexa or Siri do on Jeopardy. (panelists laugh) >> All right. Going to go around for a final thought, give you a second. Let's just think about like your 12 month crystal ball. In terms of either challenges that need to be met in the near term or opportunities you think will be realized. 12, 18 month horizon. Bob you've got the microphone headed up, so I'll let you lead off and let's just go around. >> I think a big challenge for business, for society is getting people educated on data and analytics. There's a study that was just released I think last month by ServiceNow, I think, or some vendor, or Qlik. They found that only 17% of the employees in Europe have the ability to use data in their job. Think about that. >> 17. >> 17. Less than 20%. So these people don't have the ability to understand or use data intelligently to improve their work performance. That says a lot about the state we're in today. And that's Europe. It's probably a lot worse in the United States. So that's a big challenge I think. To educate the masses. >> John: Joe. >> I think we probably have a better chance of improving technology over training people. I think using data needs to be iPhone easy. And I think, you know, which means that a lot of innovation is in the years to come. I do think that a keyboard is going to be a thing of the past for the average user. We are going to start using voice a lot more. I think augmented reality is going to be one of the things that becomes a real reality. Where we can hold our phone in front of an object and it will have an overlay of prices, where it's available. If it's a person, I think that we will see within an organization holding a camera up to someone and being able to see what is their salary, what sales did they do last year, some key performance indicators. I hope that we are beyond the days of everyone around the world walking around like this and we start actually becoming more social as human beings through augmented reality.
I think it has to happen. I think we're going through kind of foolish times at the moment in order to get to the greater good. And I think the greater good is using technology in a very, very smart way. Which means that you shouldn't have to be, sorry to contradict, but maybe it's good to counterpoint. I don't think you need to have a PhD in SQL to use data. Like I think that's 1990. I think as we evolve it's going to become easier for the average person. Which means people like the brain trust here need to get smarter and start innovating. I think the innovation around data is really at the tip of the iceberg, we're going to see a lot more of it in the years to come. >> Dion why don't you go ahead, then we'll come down the line here. >> Yeah so I think over that time frame two things are likely to happen. One is somebody's going to crack the consumerization of machine learning and AI, such that it really is available to the masses and we can do much more advanced things than we could. We see that industries tend to reach an inflection point and then there's an explosion. No one's quite cracked the code on how to really bring this to everyone, but somebody will. And that could happen in that time frame. And then the other thing that I think almost has to happen is that the forces for openness, open data, data sharing, open data initiatives, things like blockchain, are going to run headlong into data protection, data privacy, customer privacy laws and regulations that have to come down and protect us. Because the industry's not doing it, the government is stepping in and it's going to re-silo a lot of our data. It's going to make it recede and make it less accessible, making data science harder for a lot of the most meaningful types of activities. Patient data for example is already all locked down. We could do so much more with it, but health startups are really constrained about what they can do. 'Cause they can't access the data.
We can't even access our own health care records, right? So I think that's the challenge, is we have to have that battle next to be able to go and take the next step. >> Well I see, with the growth of data, a lot of it's coming through IoT, the Internet of Things. I think that's a big source. And we're going to see a lot of innovation. New types of Ubers or Airbnbs. Uber's so 2013 though, right? We're going to see new companies with new ideas, new innovations, they're going to be looking at the ways this data can be leveraged, all this big data. Or data coming in from the IoT can be leveraged. You know there's some examples out there. There's a company for example that is outfitting tools, putting sensors in the tools. Industrial sites can therefore track where the tools are at any given time. This is an expensive, time-consuming process, constantly losing tools, trying to locate tools. Assessing whether the tool's being applied to the production line or the right tool is at the right torque and so forth. With the sensors implanted in these tools, it's now possible to be more efficient. And there's going to be innovations like that. Maybe small startup type things or smaller innovations. We're going to see a lot of new ideas and new types of approaches to handling all this data. There's going to be new business ideas. The next Uber, we may be hearing about it a year from now, whatever that may be. And that Uber is going to be applying data, probably IoT type data, in some new, innovative way. >> Jennifer, final word. >> Yeah so I think with data, you know it's interesting, right, for one thing I think one of the things that's made data more available, and just people more open to the idea, has been startups. But what's interesting about this is a lot of startups have been acquired. And a lot of people at startups that got acquired, now these people work at bigger corporations.
Which was the way it was maybe 10 years ago, data wasn't available and open, companies kept it very proprietary, you had to sign NDAs. It was like within the last 10 years that open source, all of those initiatives, became much more popular, much more open, an acceptable sort of way to look at data. I think that what I'm kind of interested in seeing is what people do within the corporate environment. Right, 'cause they have resources. They have funding that startups don't have. And they have backing, right? Presumably if you're acquired you went in at a higher title in the corporate structure, whereas if you had started there you probably wouldn't be at that title at that point. So I think you have an opportunity where people who have done innovative things and have proven that they can build really cool stuff can now be in that corporate environment. I think part of it's going to be whether or not they can really adjust to sort of the corporate, you know the corporate landscape, the politics of it or the bureaucracy. I think every organization has that. Being able to navigate that is a difficult thing, in part 'cause it's a human skill set, it's a people skill, it's a soft skill. It's not the same thing as just being able to code something and sell it. So you know it's going to really come down to people. I think if people can figure out for instance, what people want to buy, what people think, in general that's where the money comes from. You know you make money 'cause someone gave you money. So if you can find a way to look at data or even look at technology and understand what people are doing, aren't doing, what they're happy about, unhappy about, there's always opportunity in collecting the data in that way and being able to leverage that. So you build cooler things, and offer things that haven't been thought of yet. So it's a very interesting time I think, with the corporate resources available, if you can do that.
You know, who knows what we'll have in like a year. >> I'll add one. >> Please. >> The majority of companies in the S&P 500 have a market cap that's greater than their revenue. The reason is 'cause they have IP related to data that's of value. But most of those companies, most companies, the vast majority of companies don't have any way to measure the value of that data. There's no GAAP accounting standard. So they don't understand the value contribution of their data in terms of how it helps them monetize. Not the data itself necessarily, but how it contributes to the monetization of the company. And I think that's a big gap. If you don't understand the value of the data that means you don't understand how to refine it, if data is the new oil, and how to protect it and so forth and secure it. So that to me is a big gap that needs to get closed before we can actually say we live in a data driven world. >> So you're saying I've got an asset, I don't know if it's worth this or this. And they're missing that great opportunity. >> So devolve to what I know best. >> Great discussion. Really, really enjoyed the, the time has flown by. Joe if you get that augmented reality thing to work on the salary, point it toward that guy not this guy, okay? (everyone laughs) It's much more impressive if you point it over there. But Joe thank you, Dion, Joe and Jennifer and Batman. We appreciate and Bob Hayes, thanks for being with us. >> Thanks, you guys. >> Really enjoyed >> Great stuff. >> the conversation. >> And a reminder, coming up at the top of the hour, six o'clock Eastern time, IBMgo.com featuring the live keynote which is being set up just about 50 feet from us right now. Nate Silver is one of the headliners there, John Thomas as well, or rather Rob Thomas. John Thomas we had on earlier on The Cube. But a panel discussion as well coming up at six o'clock on IBMgo.com, six to 7:15. Be sure to join that live stream. That's it from The Cube. We certainly appreciate the time.
Glad to have you along here in New York. And until the next time, take care. (bright digital music)
ENTITIES
Entity | Category | Confidence |
---|---|---|
Jeff Hammerbacher | PERSON | 0.99+ |
Dave | PERSON | 0.99+ |
Dion Hinchcliffe | PERSON | 0.99+ |
John | PERSON | 0.99+ |
Jennifer | PERSON | 0.99+ |
Joe | PERSON | 0.99+ |
Comcast | ORGANIZATION | 0.99+ |
Chris Dancy | PERSON | 0.99+ |
Jennifer Shin | PERSON | 0.99+ |
Cathy O'Neil | PERSON | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Stanislav Petrov | PERSON | 0.99+ |
Joe McKendrick | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Nate Silver | PERSON | 0.99+ |
John Thomas | PERSON | 0.99+ |
100 variables | QUANTITY | 0.99+ |
John Walls | PERSON | 0.99+ |
1990 | DATE | 0.99+ |
Joe Caserta | PERSON | 0.99+ |
Rob Thomas | PERSON | 0.99+ |
Uber | ORGANIZATION | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
UC Berkeley | ORGANIZATION | 0.99+ |
1983 | DATE | 0.99+ |
1991 | DATE | 0.99+ |
2013 | DATE | 0.99+ |
Constellation Research | ORGANIZATION | 0.99+ |
Europe | LOCATION | 0.99+ |
ORGANIZATION | 0.99+ | |
Bob | PERSON | 0.99+ |
ORGANIZATION | 0.99+ | |
Bob Hayes | PERSON | 0.99+ |
United States | LOCATION | 0.99+ |
360 degree | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
New York | LOCATION | 0.99+ |
Benjamin Israeli | PERSON | 0.99+ |
France | LOCATION | 0.99+ |
Africa | LOCATION | 0.99+ |
12 month | QUANTITY | 0.99+ |
Soviet Union | LOCATION | 0.99+ |
Batman | PERSON | 0.99+ |
New York City | LOCATION | 0.99+ |
last year | DATE | 0.99+ |
Olympics | EVENT | 0.99+ |
Meredith Whittaker | PERSON | 0.99+ |
iPhone | COMMERCIAL_ITEM | 0.99+ |
Moscow | LOCATION | 0.99+ |
Ubers | ORGANIZATION | 0.99+ |
20 years | QUANTITY | 0.99+ |
Joe C. | PERSON | 0.99+ |
Nick Ducoff, Infochimps - SxSWi 2011 - theCUBE
>> Hello, welcome back. Mark "Rizzn" Hopkins here at South by Southwest 2011, and I'm here with Nick Ducoff from Infochimps, who, where I'm from, I'm pretty familiar with, because I'm in a tech center and I hear about these guys all the time. You may or may not, you probably should, know who these people are, but if you're not, Nick, I'm just going to have you start off with a little bit of an elevator pitch. Talk about what your company does and acquaint them. Hopefully they can hear you over the, whatever that is, a keynote or contest, whatever is going on out there. >> Sure, thank you. Infochimps is a marketplace to find, share, and build on data. We have two big customer bases. One is the developer community, where we're just really focused on making it super easy for developers to build applications. You know, an application is really two things, right? It's code and it's a database. And there's lots of folks out there that help developers get access to code, such as GitHub, but there's really not a centralized repository for structured information, data, and so that's what we're building, and we're really excited about it. The other part of our business is our marketplace, where we have data sets that are published and can be downloaded as flat files. So if you're, you know, mom and pop, or, you know, a non-technical user, and, you know, data for you is, you know, viewable in Microsoft Excel, that's the place for you. The beautiful thing is it's all found at the same place, and that's infochimps.com. >> I was going to talk a little bit about your recent announcement. And Michelle, a former contributor to SiliconANGLE, if you're watching this video you probably know who Michelle Greer is, has been excitedly talking in hushed tones, "Don't tell anybody till we announce, but take a look at this, it's really cool": your API Explorer and the launch of, is it 1,000 APIs? 1,000? 2,000 data sets? So I've never really dug as deep into your data sets as I have in the last couple of weeks while you've been turning on the
API Explorer and uploading these new things. So tell me, first of all, broadly about the data sets and the API Explorer, how that works, and then we'll dive deeper into a couple of these that are really cool. >> Thanks, and, you know, sorry to steal Michelle from you, but she's a rock star and we love her. So we recently published 2,000 new API calls, and, you know, that's pretty exciting for us. We're trying to make, you know, as much data available in one place as there is on the internet. And these 2,000 API calls range from social media data to weather data to stock data. And really, you know, our key focus here was just to try to think of what are the building blocks for an application, and how can we provide data sets that, you know, can inspire developers to build applications without ever having to bring data down onto their own server. The API Explorer makes it super easy for anybody to come and see, you know, after they pass through an input, what the output looks like within their web browser. So they don't have to go and start coding to figure out what the output is going to look like; they can, you know, get a few samples right there in the browser. >> So, as someone who is a lightweight developer these days but was a heavy coder back in my early days, the API Explorer is what really makes it real, in my opinion. Because you can look at the documentation all day long, and we spoke to somebody earlier today that's in the documentation business. As soon as you hear that, you know, it's a snore, right? You know, I don't want some ads either. You're thinking about it: someone has to write the documentation, which is a big task always, or someone's got to read it. Unless you need it like five minutes ago, you're not going to be hitting the books. But being able to just see a little box, and like, okay, here's what I put into this box, and hit the button and see what comes out the other end, that's what makes it real. So that's, I think,
something that makes what you guys are doing pretty exciting now. But one of the ones that Michelle showed me was Qwerly, which is another company that uses you as the platform to publish the data and the API. And so talk a little bit about what Qwerly does. I can see a hundred uses for this for applications we're developing. So talk a little bit about what that does, and in as much depth as you can, about how they get their data and all that. >> So Qwerly is a company run by Max Niederhofer, based in London, UK, and he was previously at Atlas Venture. He was a VC, you know, came back to the bright side of things and started his own company. What Qwerly does is a database across social identities. So, you know, who are you online? Who am I online? I'm Nick Ducoff on Twitter, I'm /ducoff on Facebook, I'm /nick-ducoff on LinkedIn. And, you know, it's hard to sometimes find in a programmatic fashion, you know, all of the identities for a person online. And so what Qwerly's done is, you can pass through whatever you've got, a Twitter handle or Facebook account or a LinkedIn account, and it will help map across all of the other social networks and help you find your Flickr account, the YouTube account, your LinkedIn account, so that, you know, developers can help build, you know, any number of applications. >> We, well, we're based out of the Cloudera office; our Palo Alto group is based out of the Cloudera office. So a lot of what we do is using Hadoop to bring structure into unstructured data, and I know that API right there, I think, saved us probably about three months' worth of development on one aspect. So we're going to be using it, just so you know. But I mean, being able to surface content in a way that, like, being able to access, you know, the people that are around an event like South by Southwest: you can troll feeds, find people that are there at South by Southwest, but you don't always have access to all the content they're publishing, because they may not have an
auto feed going. But, you know, with something like Qwerly, you can pull all their other feeds and then, you know, just filter it based on location or date range or whatever it is you're doing, and really come up with something useful. >> You know, to speak a little bit about what they do, and I'm happy to also introduce you to Max, he's coming into Austin for South by Southwest, but I hope you get it through us and not them. But so what Max does is, you know, they use indicators, you know, strong links across your various profiles, to see, okay, is @nickducoff really the same guy as facebook/nickducoff? Right? You know, am I linking to my Facebook profile from my Twitter profile? Or, you know, in my Facebook, have I mentioned, you know, back to my Twitter profile or my "about me" profile or something else? Right? So that they can see, okay, well, is this person really this person? >> Well, and then this kind of links into the other API we were discussing earlier, which is the Twitter profile search. That, combined with maybe the Qwerly search, would be a great way of surfacing, like, authority nodes, you know, amongst content providers. So talk about the differences between Twitter's native profile search, we ran it on Batman, Batman comics, my thing, versus the profile search that you guys have. >> So we're really moving to having, you know, the data store of choice for us is Elasticsearch. It's an incredibly powerful tool that allows you to do essentially Boolean searches across large data files. For instance, the Twitter profile search is a hunt across 100 million nodes, and what we've got now is the ability to search across those 100 million users, you know, with the keywords that they use in their profile. And that can be, you know, obviously name, it can be how they describe themselves, what they like. >> Whereas with Twitter, the way that they do it, based on just a couple searches that we ran, it looks like they have some kind of method of looking both at the tweets
themselves as well as potentially other keywords around what you need. Charlie in character, Gotham news, and all kinds of crazy stuff; none of it had to do with Batman comics per se, other than being loosely associated with Batman. So I guess if you're into that, there you go, but if you want an exact match, this would be the way to go. So it's not all social data you've got. I know there's some sports-related ones in there, there's the raw word searches, was it the British National Corpus? You've got a couple other ones that escape me at the moment, and, well, with 2,000... But so, lots of interesting data to be able to search through. So let's look a little bit broader. Where was the inspiration for this? What was the M.O.? Because big data is a focus for us editorially for the foreseeable future, whatever that ends up being. Because we covered a couple of conferences recently, Strata, Hadoop, amazing viewership, where we were just talking about the concepts behind big data, and it resonated with both our consumer-oriented audiences, developers of course, but also enterprise, because big data is something that affects them too. And it's not just all about social and mobile and, you know, the fun stuff that Mashable and TechCrunch and the Web 2.0 blogs like to talk about, but it's crossed over to IT. So what was your aha moment that led you to pursue the path that Infochimps has? Because you're positioned at a good nexus for enterprise and all the consumer-facing data stores. So just talk a little bit about that journey. >> Sure. So Flip Kromer, another one of our co-founders and CTO, was pursuing his PhD in physics at UT, and in the course of his research, you know, spent a lot of time, you know, finding and munging data. The kind of aha moment for him was, it's a pain in the butt to find data online. You know, Google does a wonderful job of indexing, you know, blobs, unstructured information on web pages, but they don't do a great job of
indexing structured information. And so Flip set out to solve this problem and asked around his fellow PhD candidates if anybody might be interested in pursuing this mission, and found Dhruv Bansal, and kind of from there, you know, we've built up to 15 chimps trying to democratize access to structured information. >> So talk about the process of, like, data sanitization. I know it's a mix of automated and hand washing of the data. So if you can talk about that, it may be part of your secret sauce, but if you could talk a little about that process, I'd like to learn more. >> Sure. So one of our kind of core philosophies is we take data and we publish it in a structured format. We don't necessarily cleanse it. When there's clearly articulated demand for a very high-quality data set, we'll either find it through a third-party supplier or we'll build it ourselves. But unless there's clearly articulated demand, we publish it the same way that we find it. The only change that we make is we identify columns and rows so that you can have that, you know, in a machine-readable format. >> Okay, and also part of the role is documentation of that, which is your next big... but you can only do so much with 15 people at one time. So you've got all the data published, and part of that role is actually making it searchable, curated, and findable. >> Yeah, so we absolutely want to continue to work on cleaning up the metadata, you know, around the data. One of the things that we've been working on is a unified format of metadata, and so that's something that we're pretty far along on and really excited about, and I think it will really help with scalability. Because, you know, our data team can ingest data, you know, pretty quickly at this point. You know, we're pulling in, you know, hundreds of gigabytes a week or more, probably closer to terabytes a week. But, you know, we've got to make sure that we keep up with respect to, you know, documentation, like you were saying, and
making it easily findable, or we end up in the same place that we were before we started Infochimps. And so what we've done is we've loaded all of the metadata into Elasticsearch, as well as some of the data. Obviously our search algorithm is part of our special sauce, but we try to make, you know, the data set that's most relevant to you adjacent to the data that you either have or otherwise were looking for. >> So search is really becoming an everything-old-is-new-again thing. That's one of the themes: people going back to search and reapplying it to problems that Google, you know, doesn't need to work on, right? Everybody thinks Google has solved search, and I think they'd probably be the first to tell you that they've got ninety-five percent of it down, but I think it may be more than that, really, because there are so many different aspects of search that haven't been tackled. I mean, you've got the semantic side, you've got different organizations that are trying to patch holes in microsite search, you know, or whitelisted topic-specific search, and you're working on a couple of different approaches to structured data search. So that's one of the things I'm seeing as an emerging theme. Just stepping back, I mean, I've spent like a day and a half here at South by Southwest, but you've probably been exposed to the prep a little bit longer than I have, being local to Austin. What are some of the themes you're seeing emerge out of the conference here? >> So, you know, it's all about location, right? You know, location, local, and, you know, the data that powers that. And so with respect to location, you know, one of the important themes is places: where am I standing right now? And there's a number of folks out there that, you know, might even tell you different things about where you're standing. And so over the next couple of months we're pretty excited to announce some partnerships that, you know, we'll save for another story, to
really make it easy for developers to build location-based applications. And obviously a big part of that will be, you know, retail inventory and other things about where you are, right? Happy hour specials, you know, ratings and reviews, all the kinds of stuff that folks ask for all the time. You know, can you scrape Citysearch? Can you scrape Yelp? And, you know, we won't necessarily, but we'll work with a lot of folks who have similar databases, or those companies themselves, to make it available to our developer community. >> So that's a good position to delve into a little bit, because I think the fear with companies that sit in a position like you do, where you envelop so much of an ecosystem, is that you will compete with that ecosystem eventually. We see it with Twitter, you see it with Facebook, and, you know, the evangelists for those organizations will tell you, okay, we're not really competing, but we know they are. I mean, either they are or they're just really bad at communicating how they don't want to compete with their own ecosystem. So you leave the data sanitization, scraping, and otherwise organizing to other people, and you're just organizing the organization of the data. >> That's an interesting point to elaborate on. For instance, a good number of those two thousand data sets were where we took a factual corpus of data sets and published them as APIs, right? So we took what was, you know, structured data and published it in an application programming interface, right? And that was something that hadn't been done before, and now it's even easier to build on top of those databases, right? So, you know, they existed in the wild and we just made them easier to find and easier to access, and that's really what we're trying to do. >> Very cool stuff. Big data, a theme; search, a theme. South by Southwest 2011. I am Mark "Rizzn" Hopkins, and we've been chatting with Infochimps, a company to watch. Keep an eye on these guys,
play with the API Explorer. I'm not getting paid by these guys to say this, I just really like it. I played with it, I really liked it, so I think you should too. Stay tuned to SiliconANGLE.com and SiliconANGLE.tv. We'll have more coverage coming out of the conference, so don't go away.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Michelle Greer | PERSON | 0.99+ |
Austin | LOCATION | 0.99+ |
Margaret Ann Hopkins | PERSON | 0.99+ |
Nick Ducoff | PERSON | 0.99+ |
Michelle | PERSON | 0.99+ |
15 people | QUANTITY | 0.99+ |
ninety five percent | QUANTITY | 0.99+ |
Michele | PERSON | 0.99+ |
Nick | PERSON | 0.99+ |
Charlie | PERSON | 0.99+ |
100 million users | QUANTITY | 0.99+ |
ORGANIZATION | 0.99+ | |
youtube | ORGANIZATION | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
mark risen Hopkins | PERSON | 0.99+ |
a day and a half | QUANTITY | 0.99+ |
Nick Duke | PERSON | 0.98+ |
ORGANIZATION | 0.98+ | |
API Explorer | TITLE | 0.98+ |
TechCrunch | ORGANIZATION | 0.98+ |
UK | LOCATION | 0.98+ |
one time | QUANTITY | 0.98+ |
ORGANIZATION | 0.97+ | |
ORGANIZATION | 0.97+ | |
five minutes ago | DATE | 0.97+ |
ORGANIZATION | 0.97+ | |
ORGANIZATION | 0.97+ | |
two things | QUANTITY | 0.97+ |
ORGANIZATION | 0.96+ | |
one | QUANTITY | 0.96+ |
London UK | LOCATION | 0.96+ |
flickr | ORGANIZATION | 0.96+ |
two thousand API | QUANTITY | 0.96+ |
Hadoop | TITLE | 0.96+ |
both | QUANTITY | 0.95+ |
Mashable | ORGANIZATION | 0.95+ |
Mac Schneider Hoffer | PERSON | 0.95+ |
100 million nodes | QUANTITY | 0.94+ |
one place | QUANTITY | 0.94+ |
Atlas | ORGANIZATION | 0.94+ |
first | QUANTITY | 0.93+ |
github | TITLE | 0.93+ |
one aspect | QUANTITY | 0.93+ |
about three months | QUANTITY | 0.93+ |
Nick Duke | PERSON | 0.92+ |
two thousand new API calls | QUANTITY | 0.92+ |
UT | ORGANIZATION | 0.9+ |
hundreds of gigabytes a week | QUANTITY | 0.89+ |
two | QUANTITY | 0.89+ |
terabytes a week | QUANTITY | 0.89+ |
Cromer | PERSON | 0.89+ |
2011 | DATE | 0.88+ |
earlier today | DATE | 0.86+ |
1000 | QUANTITY | 0.86+ |
two thousand data sets | QUANTITY | 0.85+ |
last couple of weeks | DATE | 0.85+ |
a hundred uses | QUANTITY | 0.85+ |
API Explorer | TITLE | 0.84+ |
British | OTHER | 0.83+ |
Batman | PERSON | 0.82+ |
lot of folks | QUANTITY | 0.8+ |
theCUBE | ORGANIZATION | 0.8+ |
2000 | DATE | 0.79+ |
next couple months | DATE | 0.79+ |
SiliconANGLE | ORGANIZATION | 0.75+ |
up to 15 chimps | QUANTITY | 0.74+ |
Batman Batman | TITLE | 0.73+ |
1000 2000 data sets | QUANTITY | 0.72+ |
lots of folks | QUANTITY | 0.71+ |
number of folks | QUANTITY | 0.71+ |