Lie 1, The Most Effective Data Architecture Is Centralized | Starburst
(bright upbeat music) >> In 2011, early Facebook employee and Cloudera co-founder Jeff Hammerbacher famously said, "The best minds of my generation are thinking about how to get people to click on ads, and that sucks!" Let's face it. More than a decade later, organizations continue to be frustrated with how difficult it is to get value from data and build a truly agile and data-driven enterprise. What does that even mean, you ask? Well, it means that everyone in the organization has the data they need when they need it in a context that's relevant to advance the mission of an organization. Now, that could mean cutting costs, could mean increasing profits, driving productivity, saving lives, accelerating drug discovery, making better diagnoses, solving supply chain problems, predicting weather disasters, simplifying processes, and thousands of other examples where data can completely transform people's lives beyond manipulating internet users to behave a certain way. We've heard the prognostications about the possibilities of data before and in fairness we've made progress, but the hard truth is the original promises of master data management, enterprise data warehouses, data marts, data hubs, and yes even data lakes were broken and left us wanting more. Welcome to The Data Doesn't Lie... Or Does It? A series of conversations produced by theCUBE and made possible by Starburst Data. I'm your host, Dave Vellante, and joining me today are three industry experts. Justin Borgman is the co-founder and CEO of Starburst, Richard Jarvis is the CTO at EMIS Health, and Teresa Tung is cloud first technologist at Accenture. Today, we're going to have a candid discussion that will expose the unfulfilled, and yes, broken promises of a data past. We'll expose data lies: big lies, little lies, white lies, and hidden truths. And we'll challenge age-old data conventions and bust some data myths. We're debating questions like: is the demise of a single source of truth inevitable? Will the data warehouse ever have feature parity with the data lake or vice versa? Is the so-called modern data stack simply centralization in the cloud, AKA the old guard's model in new cloud clothes? How can organizations rethink their data architectures and regimes to realize the true promises of data? Can and will an open ecosystem deliver on these promises in our lifetimes? We're spanning much of the Western world today. Richard is in the UK, Teresa is on the West Coast, and Justin is in Massachusetts with me. I'm in theCUBE studios, about 30 miles outside of Boston. Folks, welcome to the program. Thanks for coming on. >> Thanks for having us. >> Okay, let's get right into it. You're very welcome. Now, here's the first lie. The most effective data architecture is one that is centralized with a team of data specialists serving various lines of business. What do you think, Justin? >> Yeah, definitely a lie. My first startup was a company called Hadapt, which was an early SQL engine for Hadoop that was acquired by Teradata. And when I got to Teradata, of course, Teradata is the pioneer of that central enterprise data warehouse model. One of the things that I found fascinating was that not one of their customers had actually lived up to that vision of centralizing all of their data into one place. They all had data silos. They all had data in different systems. They had data on prem, data in the cloud. Those companies were acquiring other companies and inheriting their data architecture.
So despite being the industry leader for 40 years, not one of their customers truly had everything in one place. So I think definitely history has proven that to be a lie. >> So Richard, from a practitioner's point of view, what are your thoughts? I mean, there's a lot of pressure to cut cost, keep things centralized, serve the business as best as possible from that standpoint. What does your experience show? >> Yeah, I mean, I think I would echo Justin's experience really that we as a business have grown up through acquisition, through storing data in different places sometimes to do information governance in different ways to store data in a platform that's close to data experts people who really understand healthcare data from pharmacies or from doctors. And so, although if you were starting from a greenfield site and you were building something brand new, you might be able to centralize all the data and all of the tooling and teams in one place. The reality is that businesses just don't grow up like that. And it's just really impossible to get that academic perfection of storing everything in one place. >> Teresa, I feel like Sarbanes-Oxley have kind of saved the data warehouse, right? (laughs) You actually did have to have a single version of the truth for certain financial data, but really for some of those other use cases I mentioned, I do feel like the industry has kind of let us down. What's your take on this? Where does it make sense to have that sort of centralized approach versus where does it make sense to maybe decentralize? >> I think you got to have centralized governance, right? So from the central team, for things like Sarbanes-Oxley, for things like security, for certain very core data sets having a centralized set of roles, responsibilities to really QA, right? To serve as a design authority for your entire data estate, just like you might with security, but how it's implemented has to be distributed. Otherwise, you're not going to be able to scale, right? So being able to have different parts of the business really make the right data investments for their needs. And then ultimately, you're going to collaborate with your partners. So partners that are not within the company, right? External partners. We're going to see a lot more data sharing and model creation. And so you're definitely going to be decentralized. >> So Justin, you guys last, jeez, I think it was about a year ago, had a session on data mesh. It was a great program. You invited Zhamak Dehghani. Of course, she's the creator of the data mesh. One of our fundamental premises is that you've got this hyper specialized team that you've got to go through if you want anything. But at the same time, these individuals actually become a bottleneck, even though they're some of the most talented people in the organization. So I guess, a question for you Richard. How do you deal with that? Do you organize so that there are a few sort of rock stars that build cubes and the like or have you had any success in sort of decentralizing with your constituencies that data model? >> Yeah. So we absolutely have got rockstar data scientists and data guardians, if you like. People who understand what it means to use this data, particularly the data that we use at EMIS is very private, it's healthcare information. And some of the rules and regulations around using the data are very complex and strict. 
So we have to have people who understand the usage of the data, then people who understand how to build models, how to process the data effectively. And you can think of them like consultants to the wider business because a pharmacist might not understand how to structure a SQL query, but they do understand how they want to process medication information to improve patient lives. And so that becomes a consulting type experience from a set of rock stars to help a more decentralized business who needs to understand the data and to generate some valuable output. >> Justin, what do you say to a customer or prospect that says, "Look, Justin. I got a centralized team and that's the most cost effective way to serve the business. Otherwise, I got duplication." What do you say to that? >> Well, I would argue it's probably not the most cost effective, and the reason being really twofold. I think, first of all, when you are deploying a enterprise data warehouse model, the data warehouse itself is very expensive, generally speaking. And so you're putting all of your most valuable data in the hands of one vendor who now has tremendous leverage over you for many, many years to come. I think that's the story at Oracle or Teradata or other proprietary database systems. But the other aspect I think is that the reality is those central data warehouse teams, as much as they are experts in the technology, they don't necessarily understand the data itself. And this is one of the core tenets of data mesh that Zhamak writes about is this idea of the domain owners actually know the data the best. And so by not only acknowledging that data is generally decentralized, and to your earlier point about Sarbanes-Oxley, maybe saving the data warehouse, I would argue maybe GDPR and data sovereignty will destroy it because data has to be decentralized for those laws to be compliant. But I think the reality is the data mesh model basically says data's decentralized and we're going to turn that into an asset rather than a liability. And we're going to turn that into an asset by empowering the people that know the data the best to participate in the process of curating and creating data products for consumption. So I think when you think about it that way, you're going to get higher quality data and faster time to insight, which is ultimately going to drive more revenue for your business and reduce costs. So I think that that's the way I see the two models comparing and contrasting. >> So do you think the demise of the data warehouse is inevitable? Teresa, you work with a lot of clients. They're not just going to rip and replace their existing infrastructure. Maybe they're going to build on top of it, but what does that mean? Does that mean the EDW just becomes less and less valuable over time or it's maybe just isolated to specific use cases? What's your take on that? >> Listen, I still would love all my data within a data warehouse. I would love it mastered, would love it owned by a central team, right? I think that's still what I would love to have. That's just not the reality, right? The investment to actually migrate and keep that up to date, I would say it's a losing battle. Like we've been trying to do it for a long time. Nobody has the budgets and then data changes, right? There's going to be a new technology that's going to emerge that we're going to want to tap into. There's going to be not enough investment to bring all the legacy, but still very useful systems into that centralized view. 
So you keep the data warehouse. I think it's a very, very valuable, very high performance tool for what it's there for, but you could have this new mesh layer that still takes advantage of the things I mentioned: the data products in the systems that are meaningful today, and the data products that actually might span a number of systems. Maybe either those that either source systems with the domains that know it best, or the consumer-based systems or products that need to be packaged in a way that'd be really meaningful for that end user, right? Each of those are useful for a different part of the business and making sure that the mesh actually allows you to use all of them. >> So, Richard, let me ask you. Take Zhamak's principles back to those. You got the domain ownership and data as product. Okay, great. Sounds good. But it creates what I would argue are two challenges: self-serve infrastructure, let's park that for a second, and then in your industry, one of the most regulated, most sensitive, computational governance. How do you automate and ensure federated governance in that mesh model that Teresa was just talking about? >> Well, it absolutely depends on some of the tooling and processes that you put in place around those tools to centralize the security and the governance of the data. And I think although a data warehouse makes that very simple 'cause it's a single tool, it's not impossible with some of the data mesh technologies that are available. And so what we've done at EMIS is we have a single security layer that sits on top of our data mesh, which means that no matter which user is accessing which data source, we go through a well audited, well understood security layer. That means that we know exactly who's got access to which data field, which data tables. And then everything that they do is audited in a very kind of standard way regardless of the underlying data storage technology. So for me, although storing the data in one place might not be possible, understanding where your source of truth is and securing that in a common way is still a valuable approach, and you can do it without having to bring all that data into a single bucket so that it's all in one place. And so having done that and investing quite heavily in making that possible has paid dividends in terms of giving wider access to the platform, and ensuring that only data that's available under GDPR and other regulations is being used by the data users. >> Yeah. So Justin, we always talk about data democratization, and up until recently, they really haven't been line of sight as to how to get there, but do you have anything to add to this because you're essentially doing analytic queries with data that's all dispersed all over. How are you seeing your customers handle this challenge? >> Yeah, I mean, I think data products is a really interesting aspect of the answer to that. It allows you to, again, leverage the data domain owners, the people who know the data the best, to create data as a product ultimately to be consumed. And we try to represent that in our product as effectively, almost eCommerce like experience where you go and discover and look for the data products that have been created in your organization, and then you can start to consume them as you'd like. And so really trying to build on that notion of data democratization and self-service, and making it very easy to discover and start to use with whatever BI tool you may like or even just running SQL queries yourself. 
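For illustration only, here is a minimal sketch of the kind of federated consumption Justin is describing, using the open-source Trino engine that Starburst's platform is built on. The host, catalog, schema and table names are assumptions made up for the example, not details from the conversation.

```python
# Hypothetical sketch: one SQL query over a "data product" whose underlying data is
# dispersed across systems, run through Trino's Python client rather than copying the
# data into a central warehouse first. All names below are invented for illustration.
import trino

conn = trino.dbapi.connect(
    host="trino.example.internal",  # assumed coordinator endpoint
    port=8080,
    user="analyst",
    catalog="hive",                 # e.g. a data-lake catalog
    schema="sales",
)

cur = conn.cursor()
# Join lake data with an operational Postgres system in a single federated query.
cur.execute("""
    SELECT c.region, SUM(o.amount) AS revenue
    FROM hive.sales.orders o
    JOIN postgresql.crm.customers c ON o.customer_id = c.id
    GROUP BY c.region
""")
for region, revenue in cur.fetchall():
    print(region, revenue)
```

The point of the sketch is the consumption model Justin describes: the data product is discovered once, then queried in place with a BI tool or plain SQL, without moving the underlying data.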
>> Okay guys, grab a sip of water. After the short break, we'll be back to debate whether proprietary or open platforms are the best path to the future of data excellence. Keep it right there. (bright upbeat music)
Simon Guest and Nils Muller-Sheffer | AWS Executive Summit 2021
(upbeat music) >> Welcome back to theCUBE's presentation of the AWS Executive Summit at re:Invent 2021 made possible by Accenture. My name is Dave Vellante. We're going to look at how digital infrastructure is helping to transform consumer experiences, specifically how an insurance company is changing its industry by incentivizing and rewarding consumers who change their behavior to live healthier lives, a real passion of mine, and getting to the real root cause of health. With me now are Simon Guest, who's the Chief Executive Officer of Generali Vitality GmbH, and Nils Muller-Sheffer, who's the Managing Director and the Cloud First Application Engineering Lead for the European market at Accenture. Gentlemen, welcome to theCUBE. >> Thanks for having us. >> You're very welcome Simon. Simon, Generali Vitality is a really interesting concept that you guys have envisioned and now put it into practice. Tell us, how does it all work? >> Sure. No problem. And thanks for having us on, David, pleasure to be here. So look, Generali Vitality is at its core a pretty simple concept. It's a program that you have on your phone. And the idea of this program is that it's a wellness coach for you as an individual, and it's going to help you to understand your health and where you are in terms of the state of your health at the moment, and it's going to take you on a journey to improve your lifestyle and your wellness, and hopefully help you to live a healthier and a more sort of mindful life, I guess, is the best way of summarizing it. From our point of view as an insurance company, of course, our historical role has always been to be the company that's there if something goes wrong. So if unfortunately you pass away or you have sickness in your life or your family's life, that's historically been our role. But what we see with Generali Vitality is something a little bit different. So it's a program that really is supposed to be with you every day of your life to help you to live a healthier life. It's something that we already have in four European markets and in fact, in five from this week, I'm a little bit behind the times. So we're live already in Germany, in France, in Austria, in Italy and in Spain. And fundamentally what we do, Dave, is to say to customers, "Look, if you want to understand your health, if you want to improve it by moving a little bit more, or by visiting the doctor more, by eating healthier, by healthy choices on a daily basis, we're going to help you to do that. And we're going to incentivize you for going on this journey and making healthy choices. And we're going to reward you for doing the same." So, we partner up with great companies like Garmin, like Adidas, like big brands that are, let's say, invested in this health and wellness space so that we can produce really an ecosystem for customers that's all about live well, make good choices, be healthy, have an insurance company that partners you along that journey. And if you do that, we're going to reward you for that. So, we're here not just in difficult times, which of course is one of our main roles, but we're here as a partner, as a lifetime partner to you to help you feel better and live a better life. >> I love it, I mean, it sounds so simple, but I'm sure it's very complicated to make the technology simple for the user.
You've got mobile involved, you've got the back end and we're going to get into some of the tech, but first I want to understand the member engagement and some of the lifestyle changes Simon that you've analyzed. What's the feedback that you're getting from your customers? What does the data tell you? How do the incentives work as well? What is the incentive for the member to actually do the right thing? >> Sure, I think actually that the COVID situation that we've had in the last sort of two years is really crystallized the fact that this is something that we really ought to be doing and something that our customers really value. Just to give you a bit of a sort of information about how it works for our customers. So what we try to do with them, is to get customers to understand their current health situation, using their phone. So, we asked our customers to go through a sort of health assessments around how they live, what they eat, how they sleep, and to go through that sort of process and to give them all the Vitality age, which is a sort of actuarial comparison with their real age. So I'm 45, but unfortunately my Vitality age is 49 and it means I have some work to do to bring that back together. And what we see is that, two thirds of our customers take this test every year because they want to see how they are progressing on an annual basis in terms of living a healthier life. And if what they are doing is having an impact on their life expectancy and their lifespan and their health span. So how long are they going to live healthier for? So you see them really engaging in this approach of understanding that current situation. Then what we know actually, because the program is built around this model that's really activity and moving, and exercise is the biggest contributors to living a healthier life. We know that the majority of deaths are caused by lifestyle illnesses like poor nutrition and smoking and drinking alcohol and not exercising. And so a lot of the program is really built around getting people to move more. And it's not about being an athlete. It's about, getting off the underground one station earlier and walking home or making sure you do your 10,000 steps a day. And what we see is that that sort of 40% of our customers are on a regularly basis linking either their phone or their exercise device to our program and downloading that data so that they can see how much they are exercising. And at the same time, what we do is we set our customers weekly challenges to say, look, if you can move a little bit more than last week, we are go into to reward you for that. And we see that almost half of our customers are achieving this weekly goal every week. And it's really a level of engagement that normally as an insurer, we don't see. The way that rewards work is pretty simple. It's similar in a way to an airline program. So every good choice you make every activity to every piece of good food that you eat. When you check your on your health situation, we'll give you points. And the more points you get, you go through through a sort of status approach of starting off at the bottom status and ending up at a golden and a platinum status. And the higher up you get in the status, the higher the value of the rewards that we give you. So almost a quarter of our customers now, and this has accelerated through COVID have reached that platinum status. So they are the most engaged customers that we have and those ones who are really engaging in the program. 
And what we really tried to create is this sort of virtuous circle that says, if you live well, you make good choices, you improve your health, you progress through the program and we give you better and stronger and more valuable rewards for doing that. And some of those rewards are around health and wellness. So it might be that you get a discount on gym gear from Adidas, it might be that you get a discount on a device from Garmin, or it might be actually on other things. We also give people Amazon vouchers. We also give people discounts on holidays. And another thing that we did actually in the last year, which we found really powerful is that we've given the opportunity for our customers to convert those rewards into charitable donations. Because we work, in Generali, with a sort of campaign called The Human Safety Net, which is helping out the poorest people in society. And so what our customers do a lot of the time is instead of taking those financial rewards for themselves, they convert it into a charitable donation. So we're actually also linking wellness and feeling good and insurance and some societal good. So we're really trying to create a virtuous circle of engagement with our customers. >> That's a powerful cocktail. I love it. You've got the data, because if I see the data, then I can change my behavior. You've got the gamification piece. You actually have hard dollar rewards. You could give those to charities and you've got the most important, which is priceless, you can't put a value on good health. I got one more question for Simon, and Nils, I'd love for you to chime in as well on this question. How did you guys decide, Simon, to engage with Accenture and AWS and the cloud to build out this platform? What's the story behind that collaboration? Was there unique value that you saw that you wanted to tap, that you feel like they bring to the table? What was your experience? >> Yeah, we work with Accenture as well because the sort of construct of this Vitality proposition is a pretty complex one. So you mentioned that the idea is simple, but the build is not so simple and that's the case. So Accenture has been part of that journey from the beginning. They are one of the partners that we work with, but specifically around the topic of rewards, we're primarily a European focused organization, but when you take those countries that I mentioned, even though we're next to each other geographically, we're quite diverse. And what we wanted to create was really a sustainable and reusable and consistent customer experience that allowed us to get to market with an increasing amount of efficiency. And to do that, we needed to work with somebody who understood our business, has this historical, let's say investment in the Vitality concept and so knows how to bring it to life, but then could really support us in making what can be a complex piece of work as simple and as replicable as possible across multiple markets, because we don't want to go reinventing the wheel every time we move to a new market. So we need to find a balance between having a consistent product, a consistent technology offer, a consistent customer experience with the fact that we operate in quite diverse markets. So this was, let's say, the reason for more deeply engaging with Accenture on this journey. >> Thank you very much. Nils, why don't you comment on that as well?
I'd love to get your thoughts and really what is kind of your role here: an Accenture global SI, deep expertise in industry, but also technology. What are your thoughts on this topic? >> Yeah, I'd love to comment. So when we started the journey, it was pretty clear from the outset that we would need to build this on cloud in order to get this scalability and this ability to roll out to different markets, have a central solution that can act as a template for the different markets, but then also have the opportunity to localize different languages, different partners for the rewards, there's different reward partners in the different markets. So we needed to build an asset basically that could work as a template, centrally standardizing things, but also leaving enough flexibility to then localize in the individual markets. And if we talk about some of the most specific requirements, one thing that gave us headaches in the beginning was the authentication of the users, because each of the markets has their own systems of record where, basically, the authentication needs to happen. And we somehow needed to still find a holistic solution that comes through the central platform, and we were able to do that in the end through the AWS Cognito service, sort of wrapping the individual markets' local IDP systems. And by now we've even extended that solution to have a standalone cloud native kind of IDP solution in place for markets that do not have a local IDP solution in place, or don't want to use it for this purpose. >> So you had data, you had the integration, you've got local laws, you mentioned the flexibility, you're building ecosystems that are unique to the local, both language and cultures. Please, you had another comment, I interrupted you. >> No, I just wanted to expand basically on the requirements. So that was the central one, being able to roll this out in a standardized way across the markets, but then there were further requirements. For example, being able to operate the platform with very low operations overhead. There is no large IT team behind Generali Vitality that works this service or can act as this backbone support. So we needed to have basically a solution that runs itself, that runs on autopilot. And that was another big, big driver for, first of all, going to cloud, but second of all, making specific choices within cloud. So we specifically chose to build this as a cloud native solution using, for example, managed database services, with automatic backup, with automatic ability to restore data, that scales automatically, that has all this built in which usually a database administrator would take care of. And we applied that concept basically to every component, to everything we looked at, we applied this requirement of how can this run on autopilot? How can we make this as much managed by itself within the cloud as possible, and then land it on these services. For example, we also use the API gateway from AWS for our API services. That also came in handy when, for example, we had some response time issues with the third party we needed to call. And then we could just, with a flick of a button basically, introduce caching on the level of the API gateway and really improve the user experience, because the data wasn't updated so much, so it was easier to cache. So these are all experiences I think that proved in the end that we made the right choices here and the requirements that drove that to have a good user experience.
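As a purely illustrative aside, flipping on that API gateway caching "with a flick of a button" can also be done programmatically; the sketch below uses boto3 against a REST API stage. The API ID, stage name, cache size and TTL are invented values, and the team may equally have used the console or infrastructure-as-code.

```python
# Hypothetical sketch: turn on response caching for an existing Amazon API Gateway
# (REST) stage. All values are assumptions for illustration only.
import boto3

apigw = boto3.client("apigateway")

apigw.update_stage(
    restApiId="a1b2c3d4e5",   # assumed API ID
    stageName="prod",
    patchOperations=[
        # Provision a cache cluster for the stage...
        {"op": "replace", "path": "/cacheClusterEnabled", "value": "true"},
        {"op": "replace", "path": "/cacheClusterSize", "value": "0.5"},
        # ...and enable caching for all methods with a short TTL, since the
        # third-party reward data changes infrequently.
        {"op": "replace", "path": "/*/*/caching/enabled", "value": "true"},
        {"op": "replace", "path": "/*/*/caching/ttlInSeconds", "value": "300"},
    ],
)
```

Stage-level caching like this trades a small amount of data freshness for response time, which is the trade-off Nils describes.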
>> Would you say that the architecture, the data architecture specifically, is a decentralized data architecture with sort of federated, centralized governance? Or is it more of a centralized view? Wonder if you could talk about that. >> Yeah, it's actually a centralized platform basically. So the core product is the same for all the markets and we run them as different tenants basically on top of the infrastructure. So the data is separated in a way, obviously by the different tenants, but it's in a central place and we can analyze it in a central fashion if the need arises from the business. >> And the reason I asked that, Simon, is because essentially I look at this as largely a data offering for your customers. And so Nils, you were talking about the local language and Simon as well. I would imagine that the local business lines have specific requirements and specific data requirements. And so you've got to build an architecture that is flexible enough to meet those needs yet at the same time can ensure data quality and governance and security. And that's not a trivial challenge. I wonder if you both could comment on that. >> Yeah, maybe I'll give a start and then Simon can chime in. So what we're specifically doing is managing the rewards experience, so our solution will take care of tracking what rewards have been earned for what customer, what rewards have been redeemed, what rewards can be unlocked on the next level, and we foreshadow a little bit to motivate, incentivize the customer, and all that data sits in an AWS database in a per-tenant fashion. And you can run analysis on top of that. Maybe what you're getting into is also the, let's say the exercise data, the fitness device tracking data; that is not specifically part of what my team has built, but I'm sure Simon can comment a little bit on that angle as well. >> Yeah, please. >> Yeah, sure. I think the topic of data and how we use it in our business is a very interesting one because it's not historically been seen, let's say, as the remit of insurance to go beyond the data that you need to underwrite policies or process claims or whatever it might be. But actually we see that this is a whole point around being able to create some shared value in this kind of product. And what I mean by that is, if you are a customer and you're buying an insurance policy, it might be a life insurance or health insurance policy from Generali, and we're now giving you access to this program. And through that program, you are living a healthier life and that might have a positive impact on Generali in terms of, maybe we're going to increase our market share, or maybe we are going to see lower claims, or we're going to generate value off that then. One of the points of this program is we then share that value back with customers, through the rewards on the platform that we've built here. And of course, being able to understand that data and to quantify it and to value that data is an important part of the different stages of how much value you are creating. And it's also interesting to know that, in a couple of our markets, we operate in the corporate space. So not with retail customers, but with organizations. And one of the reasons that those companies give Vitality to their employees is that they want to see things like the improved health of a workforce.
They want to see higher presenteeism, lower absenteeism of employees, and of course, being able to demonstrate that there's a sort of correlation between participation in the Vitality program and things like that is also important. And as we've said, the markets are very different. So we need to be able to take the data that we have out of the Vitality Program and be able, in the company that I'm managing, to interpret that data so that in our insurance businesses, we are able to make good decisions about the kind of insurance product we have. I think what's interesting to make clear is that actually the kind of health data that we generate stays purely within the Vitality business itself, and what we do inside the Vitality business is to analyze that data and say, is this also helping our insurance businesses to drive better top line and bottom line in the relevant business lines? And this is different per company. Being able to interrogate that data, understand it, apply it in different markets, in different distribution systems and different kinds of approaches to insurance is an important one, yes. >> It's an excellent example of a digital business and we talked about digital transformation. What does that mean? This is what it means. It must be really interesting board discussions because you're transforming an industry, you're lowering overall costs. I mean, if people are getting less sick, that's more profit for your company and you can choose to invest that in new products, you can give back some to your corporate clients, you can play that balancing act, you can gain market share. And you've got some knobs to turn, some levers, for your stakeholders, which is awesome. Nils, something that I'm interested in, it must've been really important for you to figure out how to determine and measure success. Obviously it's up to Generali Vitality to get adoption for their customers, but at the same time, the efficacy of your solution is going to determine the ease of delivery and consumption. So, how did you map to the specific goals? What were some of the key KPIs in terms of mapping to their aggressive goals? >> Besides the things we already touched on, I think one thing I would mention is the timeline. So, we started the team ramping in January, February, and then within six months basically, we had the solution built and then we went through an extensive test phase. And within the next six months we had the product rolled out to three markets. So this speed to value, speed to market that we were able to achieve, I think is one of the key criteria that also Simon and team gave to us. There was a timeline and that timeline was not going to move. So we needed to make a plan, adjust to that timeline. And I think it's both a testament to the team's work that we met this timeline, but it also is enabled by the technology stack, by cloud. I have to say, if I go back five years, 10 years, if you had to build a solution like this on a corporate data center across so many different markets and each managed locally, there would've been no way to do this in 12 months, that's for sure. >> Yeah, Simon, you're a technology company. I mean, insurance has always been a tech heavy industry, but as Nils just mentioned, if you had to do that with IT departments in each region.
So my question is now you've got this, it's almost like nonrecurring engineering costs, it took one year to actually get the first one done, how fast are you able to launch into new markets just from a technology perspective, not withstanding local regulations and figuring out the go to market? Is that compressed? >> So you asked specifically technology-wise I think we would be able to set up a new market, including localizations that often involves translation of, because in Europe you have all the different languages and so on, I would say four to six weeks, we probably could stand up a localized solution. In reality, it takes more like six to nine months to get it rolled out because there's many other things involved, obviously, but just our piece of the solution, we can pretty quickly localize it to a new market. >> But Simon, that means that you can spend time on those other factors, you don't have to really worry so much about the technology. And so you've launched in multiple European markets, what do you see for the future of this program? Come to America. >> You can find that this program in America Dave, but with one of our competitors, we're not operating so much in the US, but you can find it if you want to become a customer for sure. But yes, you're right. I think from our perspective, to put this kind of business into a new market is not an easy thing because what we're doing is not offering it just as a service on a standalone basis to customers, we want to link it with insurance business. In the end, we are an insurance business, and we want to see the value that comes from that. So there's a lot of effort that has to go into making sure that we land it in the right way, also from a customer proposition points of view with our distribution, they are all quite different. Coming to the question of what's next? It comes in three stages for me. So as I mentioned, we are in five markets already. In the first half of 2022, we'll also come to the Czech Republic and Poland, which we're excited to do. And that will basically mean that we have this business in the seven main Generali markets in Europe related to life and health business, which is the most natural at let's say fit for something like Vitality. Then, the sort of second part of that is to say, we have a program that is very heavily focused around activity and rewards, and that's a good place to start, but, wellness these days is not just about, can you move a bit more than you did historically, it's also about mental wellbeing, it's about sleeping good, it's about mindfulness, it's about being able to have a more holistic approach to wellbeing and COVID has taught us, and customer feedback has taught is actually that this is something where we need to go. And here we need to have the technology to move there as well. So to be able to work with partners that are not just based on physical activity, but also on mindfulness. So this is how one other way we will develop the proposition. And I think the third one, which is more strategic and we are really looking into is, there's clearly something in the whole perception of incentives and rewards, which drives a level of engagement between an insurer like Generali and its customers that it hasn't had historically. So I think we need to learn, forgetting about the specific one or Vitality being a wellness program, but if there's an insurer, there's a role for us to play where we offer incentives to customers to do something in a specific way and reward them for doing that. 
And it creates value for us as an insurer, then this is probably a place that we'd want to investigate more. And to be able to do that in other areas means we need to have the technology available, that is, as I said before, replicable faster market can adapt quickly to other ideas that we have, so we can go and test those in different markets. So yes, we have to, we have to complete our scope on Vitality, We have to get that to scale and be able to manage all of this data at scale, all of those rewards that real scale, and to have the technology that allows us to do that without thinking about it too much. And then to say, okay, how do we widen the proposition? And how do we take the concept that sits behind Vitality to see if we can apply it to other areas of our business. And that's really what the future is going to look like for us. >> The isolation era really taught us that if you're not a digital business, you're out of business, and pre COVID, a lot of these stories were kind of buried, but the companies that have invested in digital are now thriving. And this is an awesome example, and another point is that Jeff Hammerbacher, one of the founders of Cloudera, early Facebook employee, famously said about 10, 12 years ago, "The best and greatest engineering minds of my generation are trying to figure out how to get people to click on ads." And this is a wonderful example of how to use data to change people's lives. So guys, congratulations, best of luck, really awesome example of applying technology to create an important societal outcome. Really appreciate your time on theCUBE. Thank you. >> Bye-bye. >> All right, and thanks for watching this segment of theCUBE's presentation of the AWS Executive Summit at re:Invent 2021 made possible by Accenture. Keep it right there for more deep dives. (upbeat music)
Fadzi Ushewokunze and Ajay Vohora | Io Tahoe Enterprise Digital Resilience on Hybrid and Multicloud
>> Announcer: From around the globe, it's theCUBE presenting Enterprise Digital Resilience on Hybrid and multicloud brought to you by io/tahoe >> Hello everyone, and welcome to our continuing series covering data automation brought to you by io/tahoe. Today we're going to look at how to ensure enterprise resilience for hybrid and multicloud, let's welcome in Ajay Vohora who's the CEO of io/tahoe Ajay, always good to see you again, thanks for coming on. >> Great to be back David, pleasure. >> And he's joined by Fadzi Ushewokunze, who is a global principal architect for financial services, the vertical of financial services at Red Hat. He's got deep experiences in that sector. Welcome Fadzi, good to see you. >> Thank you very much. Happy to be here. >> Fadzi, let's start with you. Look, there are a lot of views on cloud and what it is. I wonder if you could explain to us how you think about what is a hybrid cloud and how it works. >> Sure, Yeah. So, a hybrid cloud is an IT architecture that incorporates some degree of workload portability, orchestration and management across multiple clouds. Those clouds could be private clouds or public clouds or even your own data centers. And how does it all work? It's all about secure interconnectivity and on demand allocation of resources across clouds. And separate clouds can become hybrid when you're seamlessly interconnected. And it is that interconnectivity that allows the workloads to be moved and how management can be unified and orchestration can work. And how well you have these interconnections has a direct impact of how well your hybrid cloud will work. >> Okay, so well Fadzi, staying with you for a minute. So, in the early days of cloud that term private cloud was thrown around a lot. But it often just meant virtualization of an on-prem system and a network connection to the public cloud. Let's bring it forward. What, in your view does a modern hybrid cloud architecture look like? >> Sure, so, for modern hybrid clouds we see that teams or organizations need to focus on the portability of applications across clouds. That's very important, right. And when organizations build applications they need to build and deploy these applications as a small collections of independently loosely coupled services. And then have those things run on the same operating system, which means in other words, running it all Linux everywhere and building cloud native applications and being able to manage it and orchestrate these applications with platforms like Kubernetes or Red Hat OpenShift, for example. >> Okay, so, Fadzi that's definitely different from building a monolithic application that's fossilized and doesn't move. So, what are the challenges for customers, you know, to get to that modern cloud is as you've just described it as it skillsets, is it the ability to leverage things like containers? What's your View there? >> So, I mean, from what we've seen around the industry especially around financial services where I spend most of my time. We see that the first thing that we see is management, right. Now, because you have all these clouds, you know, all these applications. You have a massive array of connections, of interconnections. You also have massive array of integrations portability and resource allocation as well. And then orchestrating all those different moving pieces things like storage networks. Those are really difficult to manage, right? So, management is the first challenge. The second one is workload placement. Where do you place this cloud? 
How do you place these cloud native operations? Do you, what do you keep on site on prem and what do you put in the cloud? That is the other challenge. The major one, the third one is security. Security now becomes the key challenge and concern for most customers. And we're going to talk about how to address that. >> Yeah, we're definitely going to dig into that. Let's bring Ajay into the conversation. Ajay, you know, you and I have talked about this in the past. One of the big problems that virtually every company face is data fragmentation. Talk a little bit about how io/tahoe, unifies data across both traditional systems, legacy systems and it connects to these modern IT environments. >> Yeah, sure Dave. I mean, a Fadzi just nailed it there. It used to be about data, the volume of data and the different types of data, but as applications become more connected and interconnected the location of that data really matters. How we serve that data up to those apps. So, working with Red Hat and our partnership with Red Hat. Being able to inject our data discovery machine learning into these multiple different locations. whether it be an AWS or an IBM cloud or a GCP or on prem. Being able to automate that discovery and pulling that single view of where is all my data, then allows the CIO to manage cost. They can do things like, one, I keep the data where it is, on premise or in my Oracle cloud or in my IBM cloud and connect the application that needs to feed off that data. And the way in which we do that is machine learning that learns over time as it recognizes different types of data, applies policies to classify that data and brings it all together with automation. >> Right, and one of the big themes that we've talked about this on earlier episodes is really simplification, really abstracting a lot of that heavy lifting away. So, we can focus on things Ajay, as you just mentioned. I mean, Fadzi, one of the big challenges that of course we all talk about is governance across these disparate data sets. I'm curious as your thoughts how does Red Hat really think about helping customers adhere to corporate edicts and compliance regulations? Which of course are particularly acute within financial services. >> Oh yeah, yes. So, for banks and payment providers like you've just mentioned there. Insurers and many other financial services firms, you know they have to adhere to a standard such as say a PCI DSS. And in Europe you've got the GDPR, which requires stringent tracking, reporting, documentation and, you know for them to, to remain in compliance. And the way we recommend our customers to address these challenges is by having an automation strategy, right. And that type of strategy can help you to improve the security on compliance of of your organization and reduce the risk out of the business, right. And we help organizations build security and compliance from the start with our consulting services, residencies. We also offer courses that help customers to understand how to address some of these challenges. And there's also, we help organizations build security into their applications with our open source middleware offerings and even using a platform like OpenShift, because it allows you to run legacy applications and also containerized applications in a unified platform. Right, and also that provides you with, you know with the automation and the tooling that you need to continuously monitor, manage and automate the systems for security and compliance purposes. 
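To make the idea of automated discovery and classification concrete, here is a deliberately simplified, rule-based sketch of the pattern Ajay describes; io-tahoe's actual discovery uses machine learning rather than a few regexes, and the patterns, labels and sample data below are assumptions for illustration only.

```python
# Hypothetical sketch: sample values from columns in different systems, tag likely
# sensitive data, and return labels that governance policies (GDPR, PCI DSS) can key
# off, wherever the data physically lives. Patterns and labels are illustrative only.
import re

CLASSIFIERS = {
    "email":       re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "card_number": re.compile(r"^\d{13,19}$"),
    "phone":       re.compile(r"^\+?[\d\s\-()]{7,15}$"),
}

def classify_column(sample_values, threshold=0.8):
    """Label a column by the classifier matching most of its sampled values."""
    cleaned = [v.strip() for v in sample_values if v and v.strip()]
    if not cleaned:
        return "unclassified"
    for label, pattern in CLASSIFIERS.items():
        hits = sum(1 for v in cleaned if pattern.match(v))
        if hits / len(cleaned) >= threshold:
            return label
    return "unclassified"

# Columns sampled from two different systems end up with the same tags, so the same
# access and compliance policy can be automated across both.
columns = {
    "crm.contact_email": ["alice@example.com", "bob@example.org"],
    "billing.pan":       ["4111111111111111", "5500005555555559"],
}
for name, samples in columns.items():
    print(name, "->", classify_column(samples))
```

In practice the interesting part is what happens downstream: consistent labels are what let the same policy be applied automatically across AWS, IBM Cloud, GCP and on prem.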
>> Ajay, anything, any color you could add to this conversation? >> Yeah, I'm pleased Fadzi brought up OpenShift. I mean we're using OpenShift to be able to take that application of security controls to the data level and it's all about context. So, understanding what data is there, being able to assess it to say, who should have access to it, which application permission should be applied to it. That's a great combination of Red Hat and io/tahoe. >> Fadzi, what about multi-cloud? Doesn't that complicate the situation even further? Maybe you could talk about some of the best practices to apply automation across not only hybrid cloud, but multi-cloud as well. >> Yeah, sure, yeah. So, the right automation solution, you know, can be the difference between, you know, cultivating an automated enterprise or automation chaos. And some of the recommendations we give our clients is to look for an automation platform that can offer a few things. The first thing is complete support. So, that means have an automation solution that provides, you know, that promotes IT availability and reliability with your platform so that you can provide enterprise grade support, including security and testing integration and clear roadmaps. The second thing is vendor interoperability, in that you are going to be integrating multiple clouds. So, you're going to need a solution that can connect to multiple clouds seamlessly, right? And with that comes the challenge of maintainability. So, you're going to need to look into an automation solution that is easy to learn or has an easy learning curve. And then, the fourth idea that we tell our customers is scalability. In the hybrid cloud space, scale is the big, big deal here. And you need to deploy an automation solution that can span across the whole enterprise in a consistent manner, right. And then also that allows you finally to integrate the multiple data centers that you have. >> So, Ajay, I mean, this is a complicated situation. If a customer has to make sure things work on AWS or Azure or Google, they're going to spend all their time doing that. What can you add to really just simplify that multi-cloud and hybrid cloud equation? >> Yeah, I can give a few customer examples here. One being a manufacturer that we've worked with to drive that simplification. And the real bonus for them has been a reduction in cost. We worked with them late last year to bring the cost spend down by $10 million in 2021. So, they could hit that reduced budget. And, what we brought to that was the ability to deploy using OpenShift templates into their different environments, whether it was on premise or in, as you mentioned, AWS. They had GCP as well for their marketing team and across those different platforms, being able to use a template, use prebuilt scripts to get up and running and catalog and discover that data within minutes. It takes away the legacy of having teams of people having to jump on workshop calls. And I know we're all on a lot of Teams and Zoom calls in these current times. They're just simply using up hours of the day to manually perform all of this. So, yeah, working with Red Hat, applying machine learning into those templates, those little recipes that we can put that automation to work regardless of which location the data's in allows us to pull that unified view together. >> Great, thank you. Fadzi, I want to come back to you. So, in the early days of cloud you were in the Big Apple, you know financial services really well.
Cloud was like an evil word within financial services, and obviously that's changed, it's evolved. We talk about how the pandemic has even accelerated that. And when you really dug into it, when you talked to customers about their experiences with security in the cloud, it was not that it wasn't good, it was great, whatever, but it was different. And there's always this issue of skills, lack of skills and multiple tool sets; teams are really overburdened. But the cloud requires, you know, new thinking. You've got the shared responsibility model. You've got to obviously have specific corporate, you know, requirements and compliance. So, this is even more complicated when you introduce multiple clouds. So, what are the differences that you can share from your experiences running on a sort of either on prem or on a mono cloud or, you know, versus across clouds? What do you suggest there? >> Sure, you know, because of these complexities that you have explained here, mixed configurations and inadequate change control are the top security threats. So, human error is what we want to avoid, because as you know, as your clouds grow with complexity then you put humans in the mix. Then the rate of errors is going to increase and that is going to expose you to security threats. So, this is where automation comes in, because automation will streamline and increase the consistency of your infrastructure management, also application development and even security operations, to improve your protection, compliance and change control. So, you want to consistently configure resources according to, you know, pre-approved policies and you want to proactively maintain them in a repeatable fashion over the whole life cycle. And then, you also want to rapidly identify systems that require patches and reconfiguration, and automate that process of patching and reconfiguring so that you don't have humans doing this type of thing. And you want to be able to easily apply patches and change system settings according to a pre-defined baseline like I explained before, you know, with the pre-approved policies. And also you want ease of auditing and troubleshooting, right. And from a Red Hat perspective we provide tools that enable you to do this. We have, for example, a tool called Ansible that enables you to automate data center operations and security and also deployment of applications. And also OpenShift itself, it automates most of these things and abstracts the human beings from putting their fingers in and, you know, potentially introducing errors, right. Now, looking into the new world of multiple clouds and so forth, the differences that we're seeing here versus running a single cloud or on prem are in three main areas, which is control, security and compliance, right. Control here, it means if you're on premise or you have one cloud, you know, in most cases you have control over your data and your applications, especially if you're on prem. However, if you're in the public cloud, there is a difference in that the ownership is still yours, but your resources are running on somebody else's, the public cloud's, AWS and so forth, infrastructure. So, people that are going to do this need to really, especially banks and governments, be aware of the regulatory constraints of running those applications in the public cloud. And we also help customers rationalize some of these choices.
And also on security, you will see that if you're running on premises or in a single cloud, you have more control, especially if you're on prem; you can control the sensitive information that you have. However, in the cloud, that's a different situation, especially for personal information of employees and things like that. You need to be really careful with that, and again, we help you rationalize some of those choices. And then the last one is compliance. As well, you see that if you're running on prem or on a single cloud, regulations come into play again, right? If you're running on prem, you have control over that; you can document everything, and you have access to everything that you need. But if you're going to go to the public cloud, again, you need to think about that. We have automation and we have standards that can help you, you know, address some of these challenges. >> So, those are really strong insights, Fadzi. I mean, first of all, Ansible has a lot of market momentum; Red Hat's done a really good job with that acquisition. Your point about repeatability is critical, because you can't scale otherwise. And then that idea you're putting forth about control, security, and compliance is so true. I called out the shared responsibility model, and there was a lot of misunderstanding in the early days of cloud. I mean, yeah, maybe AWS is going to physically secure, you know, the S3 infrastructure, but what's in the bucket is on you, and we saw so many misconfigurations early on. And so it's key to have partners that really understand this stuff and can share the experiences of other clients. So, this all sounds great. Ajay, you've got a sharp financial background. What about the economics? You know, our survey data shows that security is at the top of the spending priority list, but budgets are stretched thin, especially when you think about the work-from-home pivot and all the holes that they had to fill there, whether it was laptops, you know, new security models, et cetera. So, how do organizations pay for this? What does the business case look like, in terms of maybe reducing infrastructure costs so I can pay it forward, or is there a risk reduction angle? What can you share there? >> Yeah, I mean, the perspective I'd like to give here is not seeing multi-cloud as multiple copies of an application or data. When I think back 20 years, a lot of the work in financial services I was looking at was managing copies of data that were feeding different pipelines, different applications. Now, a lot of the work that we're doing at io/tahoe is reducing the number of copies of that data, so that if I've got a product lifecycle management set of data, if I'm a manufacturer, I'm just going to keep that at one location. But across my different clouds, I'm going to have best-of-breed applications, developed in-house, by third parties, and in collaboration with my supply chain, connecting securely to that single version of the truth. What I'm not going to do is copy that data. So, a lot of what we're seeing now is that interconnectivity using applications built on Kubernetes that are decoupled from the data source. That allows us to reduce those copies of data, and with that you're gaining a security capability and resilience, because you're not leaving yourself open with those multiple copies of data, and with those copies come a cost of storage and a cost of compute. So, what we're saying is: use multi-cloud to leverage the best of what each cloud platform has to offer.
And that goes all the way to Snowflake and Heroku and cloud-managed databases too. >> Well, and the people cost as well. When you think about it, yes, there's the copy creep, but then, you know, when something goes wrong, a human has to come in and figure it out. You know, you brought up Snowflake; I get this vision of the data cloud, which is, you know, data. I think we're going to be rethinking, Ajay, data architectures in the coming decade, where data stays where it belongs, it's distributed, and you're providing access. Like you said, you're separating the data from the applications, and applications, as we talked about with Fadzi, become much more portable. So, really, the last 10 years will be different than the next 10 years, Ajay. >> Definitely. I think the people cost reduction is huge. Gone are the days where you needed to have a dozen people governing and managing the policies applied to data. A lot of that repetitive work, those tasks, can be in part automated. We've seen examples in insurance where teams of 15 people working in the back office, trying to apply security controls and compliance, are reduced down to just a couple of people who are looking at the exceptions that don't fit. And that's really important, because maybe two years ago the emphasis was on regulatory compliance of data, with policies such as GDPR and CCPA. Last year, it was very much the economic effect: reduced head counts and enterprises running lean, looking to reduce that cost. This year, we can see that already some of the more proactive companies are looking at initiatives such as net zero emissions: how they use data to understand how they can have a better social impact, and using data to drive that across all of their operations and supply chain. So, where those regulatory compliance issues might have been external, we see similar patterns emerging for internal initiatives that benefit the environment, social impact, and of course cost. >> Great perspectives. Jeff Hammerbacher once famously said the best minds of my generation are trying to get people to click on ads, and Ajay, those examples that you just gave of, you know, social good and moving things forward are really critical. And I think that's where data is going to have the biggest societal impact. Okay guys, great conversation. Thanks so much for coming on the program. Really appreciate your time. >> Thank you. >> Thank you so much, Dave. >> Keep it right there for more insight and conversation around creating a resilient digital business model. You're watching theCUBE. (soft music)
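For readers who want to see the multi-cloud template pattern Ajay describes above in code form, here is a minimal sketch that pushes one pre-approved configuration to several clusters registered as kubeconfig contexts, using the open source Kubernetes Python client. This is not io/tahoe's or Red Hat's actual tooling; the context names, namespace, and ConfigMap contents are hypothetical placeholders.

```python
# Minimal sketch: apply one pre-approved template to every registered cluster.
# Assumes a kubeconfig with a context per environment; all names are hypothetical.
from kubernetes import client, config
from kubernetes.client.rest import ApiException

CONTEXTS = ["onprem-openshift", "aws-openshift", "gcp-openshift"]  # hypothetical

# One "pre-approved" resource definition, reused unchanged in every environment.
template = client.V1ConfigMap(
    metadata=client.V1ObjectMeta(name="data-discovery-config", namespace="data-ops"),
    data={"discovery.enabled": "true", "classification.policy": "pii-default"},
)

for ctx in CONTEXTS:
    # Reuse the same kubeconfig file, switching the active context per cloud.
    api = client.CoreV1Api(api_client=config.new_client_from_config(context=ctx))
    try:
        api.create_namespaced_config_map(namespace="data-ops", body=template)
    except ApiException as err:
        if err.status == 409:  # already exists: converge it to the approved state
            api.replace_namespaced_config_map(
                name="data-discovery-config", namespace="data-ops", body=template
            )
        else:
            raise
    print(f"{ctx}: data-discovery-config applied")
```

The design point is the one made in the conversation: the template is defined once and reapplied everywhere, so each environment converges on the same approved state instead of accumulating hand-edited copies.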
SUMMARY :
Dave Vellante talks with Ajay Vohora, CEO of io/tahoe, and Fadzi Ushewokunze, global principal architect for financial services at Red Hat, about enterprise digital resilience on hybrid and multi-cloud. They cover what a modern hybrid cloud architecture looks like (Linux everywhere, cloud-native applications, Kubernetes and OpenShift), the challenges of management, workload placement, and security, how io/tahoe's machine-learning-driven data discovery unifies data across on-prem and public clouds, automation best practices for multi-cloud (support, interoperability, maintainability, scalability), the role of Ansible and OpenShift in compliance with standards such as PCI DSS and GDPR, a customer example that cut spend by $10 million, the case for reducing copies of data rather than replicating it across clouds, and the economics of automation, from reduced headcount to emerging social-impact initiatives.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Fadzi | PERSON | 0.99+ |
Jeff Hammerbacher | PERSON | 0.99+ |
Ajay Vohora | PERSON | 0.99+ |
Dave | PERSON | 0.99+ |
David | PERSON | 0.99+ |
Europe | LOCATION | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Last year | DATE | 0.99+ |
Ajay | PERSON | 0.99+ |
Red Hat | ORGANIZATION | 0.99+ |
Fadzi Ushewokunze | PERSON | 0.99+ |
15 people | QUANTITY | 0.99+ |
2021 | DATE | 0.99+ |
This year | DATE | 0.99+ |
ORGANIZATION | 0.99+ | |
One | QUANTITY | 0.99+ |
GDPR | TITLE | 0.99+ |
$10 million | QUANTITY | 0.99+ |
fourth idea | QUANTITY | 0.99+ |
second thing | QUANTITY | 0.99+ |
OpenShift | TITLE | 0.99+ |
Ansible | ORGANIZATION | 0.99+ |
Linux | TITLE | 0.98+ |
two years ago | DATE | 0.98+ |
single | QUANTITY | 0.98+ |
third one | QUANTITY | 0.98+ |
io | ORGANIZATION | 0.98+ |
second one | QUANTITY | 0.98+ |
first challenge | QUANTITY | 0.98+ |
first thing | QUANTITY | 0.98+ |
EWS | ORGANIZATION | 0.97+ |
both | QUANTITY | 0.97+ |
next 10 years ago | DATE | 0.97+ |
Today | DATE | 0.97+ |
one | QUANTITY | 0.96+ |
one cloud | QUANTITY | 0.95+ |
single cloud | QUANTITY | 0.95+ |
late last year | DATE | 0.94+ |
pandemic | EVENT | 0.93+ |
each cloud platform | QUANTITY | 0.93+ |
Red Hat OpenShift | TITLE | 0.91+ |
a minute | QUANTITY | 0.91+ |
one location | QUANTITY | 0.91+ |
theCUBE | ORGANIZATION | 0.89+ |
Kubernetes | TITLE | 0.88+ |
io/tahoe | ORGANIZATION | 0.87+ |
three main areas | QUANTITY | 0.87+ |
Ansible | TITLE | 0.86+ |
CCPA | TITLE | 0.85+ |
zero emissions | QUANTITY | 0.83+ |
tahoe | ORGANIZATION | 0.81+ |
IBM | ORGANIZATION | 0.81+ |
a dozen people | QUANTITY | 0.79+ |
Snowflake | TITLE | 0.78+ |
Io Tahoe | PERSON | 0.75+ |
Azure | ORGANIZATION | 0.75+ |
last 10 years | DATE | 0.74+ |
20 years | QUANTITY | 0.74+ |
IBM cloud | ORGANIZATION | 0.72+ |
single version | QUANTITY | 0.71+ |
Red Hat | TITLE | 0.71+ |
S3 | COMMERCIAL_ITEM | 0.71+ |
Breaking Analysis: Google's Antitrust Play Should be to get its Head out of its Ads
>> From the CUBE studios in Palo Alto in Boston, bringing you data-driven insights from the CUBE in ETR. This is breaking analysis with Dave Vellante. >> Earlier these week, the U S department of justice, along with attorneys general from 11 States filed a long expected antitrust lawsuit, accusing Google of being a monopoly gatekeeper for the internet. The suit draws on section two of the Sherman antitrust act, which makes it illegal to monopolize trade or commerce. Of course, Google is going to fight the lawsuit, but in our view, the company has to make bigger moves to diversify its business and the answer we think lies in the cloud and at the edge. Hello everyone. This is Dave Vellante and welcome to this week's Wiki Bond Cube insights powered by ETR. In this Breaking Analysis, we want to do two things. First we're going to review a little bit of history, according to Dave Vollante of the monopolistic power in the computer industry. And then next, we're going to look into the latest ETR data. And we're going to make the case that Google's response to the DOJ suit should be to double or triple its focus on cloud and edge computing, which we think is a multi-trillion dollar opportunity. So let's start by looking at the history of monopolies in technology. We start with IBM. In 1969 the U S government filed an antitrust lawsuit against Big Blue. At the height of its power. IBM generated about 50% of the revenue and two thirds of the profits for the entire computer industry, think about that. IBM has monopoly on a relative basis, far exceeded that of the virtual Wintel monopoly that defined the 1990s. IBM had 90% of the mainframe market and controlled the protocols to a highly vertically integrated mainframe stack, comprising semiconductors, operating systems, tools, and compatible peripherals like terminal storage and printers. Now the government's lawsuit dragged on for 13 years before it was withdrawn in 1982, IBM at one point had 200 lawyers on the case and it really took a toll on IBM and to placate the government during this time and someone after IBM made concessions such as allowing mainframe plug compatible competitors to access its code, limiting the bundling of application software in fear of more government pressure. Now the biggest mistake IBM made when it came out of antitrust was holding on to its mainframe past. And we saw this in the way it tried to recover from the mistake of handing its monopoly over to Microsoft and Intel. The virtual monopoly. What it did was you may not remember this, but it had OS/2 and Windows and it said to Microsoft, we'll keep OS/2 you take Windows. And the mistake IBM was making with sticking to the PC could be vertically integrated, like the main frame. Now let's fast forward to Microsoft. Microsoft monopoly power was earned in the 1980s and carried into the 1990s. And in 1998 the DOJ filed the lawsuit against Microsoft alleging that the company was illegally thwarting competition, which I argued at the time was the case. Now, ironically, this is the same year that Google was started in a garage. And I'll come back to that in a minute. Now, in the early days of the PC, Microsoft they were not a dominant player in desktop software, you had Lotus 1-2-3, WordPerfect. You had this company called Harvard Presentation Graphics. These were discreet products that competed very effectively in the market. Now in 1987, Microsoft paid $14 million for PowerPoint. 
And then in 1990 it launched Office, which bundled spreadsheets, word processing, and presentations into a single suite, and it was priced far more attractively than the sum of the alternative point products. Now, in 1995, Microsoft launched Internet Explorer and began bundling its browser into Windows for free. Windows had a 90% market share. Netscape was the browser leader and a high-flying tech company at the time. And the company's management pooh-poohed Microsoft's bundling of IE, saying they really weren't concerned because they were moving up the stack into business software. They later changed that position after realizing the damage that Microsoft's bundling would do to their business, but it was too late. So in similar moves of ineptness, Lotus refused to support Windows at its launch, and instead it wrote software to support the (indistinct), a minicomputer that you probably have never even heard of. Novell was a leader in networking software at the time. Anyone remember NetWare? So they responded to Microsoft's move to bundle network services into its operating system by going on a disastrous buying spree: they acquired WordPerfect, Quattro Pro, which was a spreadsheet, and a Unix OS to try to compete with Microsoft, but Microsoft turned up the volume and killed them. Now, the difference between Microsoft and IBM is that Microsoft didn't build PC hardware; rather, it partnered with Intel to create a virtual monopoly. The similarity between IBM and Microsoft, however, was that Microsoft fought the DOJ hard, of course, but it made mistakes similar to IBM's by hugging on to its PC software legacy, until the company finally pivoted to the cloud under the leadership of Satya Nadella. That brings us to Google. Google has a 90% share of the internet search market. There's that magic number again. Now, IBM couldn't argue that consumers weren't hurt by its tactics, because they were; IBM was gouging mainframe customers on pricing because it could. Microsoft, on the other hand, could argue that consumers were actually benefiting from lower prices. Google's attorneys are doing what often happens in these cases. First, they're arguing that the government's case is deeply flawed. Second, they're saying the government's actions will cause higher prices because they'll have to raise prices on mobile software and hardware. Hmm, sounds like a little bit of a threat. And of course, Google is making the case that many of its services are free. Now, what's different from Microsoft is that Microsoft was bundling IE, a product which was largely considered to be crap when it first came out; it was inferior. But because of the convenience, most users didn't bother switching. Google, on the other hand, has a far superior search engine and earned its rightful place at the top by having a far better product than Yahoo or Excite or Infoseek or even AltaVista. They all wanted to build portals versus having a clean user experience with some non-intrusive ads on the side. Hmm, boy, has that part changed. Regardless, what's similar in this case, as in the case with Microsoft, is that the DOJ is arguing that Google and Apple are teaming up with each other to dominate the market and create a monopoly. Estimates are that Google pays Apple between eight and $11 billion annually to have its search engine embedded, like a tick, into Safari and Siri. That's about one third of Google's profits going to Apple.
And it's obviously worth it, because according to the government's lawsuit, Apple-originated search accounts for 50% of Google's search volume. That's incredible. Now, does the government have a case here? I don't know. I'm not qualified to give a firm opinion on this and I haven't done enough research yet, but I will say this: even in the case of IBM, where the DOJ eventually dropped the lawsuit, if the U.S. government wants to get you, they usually take more than a pound of flesh. But the DOJ did not suggest any remedies, and the Sherman Act is open to wide interpretation, so we'll see. What I am suggesting is that Google should not hang on too tightly to its search and advertising past. Yes, Google gives us amazing free services, but it has every incentive to appropriate our data. And there are innovators out there right now trying to develop answers to that problem, where the use of blockchain and other technologies can give power back to us users. So if I'm arguing that Google shouldn't, like the other great tech monopolies, hang its hat too tightly on the past, what should Google do? Well, the answer is obvious, isn't it? It's cloud and edge computing. Now, let me first say that Google understandably promotes G Suite quite heavily as part of its cloud computing story, I get that. But it's time to move on and aggressively push into the areas that matter in cloud: core infrastructure, database, machine intelligence, containers, and of course the edge. Not to say that Google isn't doing this, but these are the areas of greatest growth potential that they should focus on. And the ETR data shows it. Let me start with one of our favorite graphics, which shows the breakdown of survey respondents used to derive net score. Net score, remember, is ETR's quarterly measurement of spending velocity, and here we show the breakdown for Google Cloud. The lime green is new adoptions. The forest green is the percentage of customers increasing spending by more than 5%. The gray is flat, and the pinkish is decreased spending by 6% or more. And the bright red is we're replacing or swapping out the platform. You subtract the reds from the greens and you get a net score of 43%, which is not off the charts, but it's pretty good, and it compares quite favorably to most companies, but not so favorably with AWS, which is at 51%, and Microsoft, which is at 49%. Both AWS and Microsoft's red scores are in the single digits, whereas Google's is at 10%. Look, all three are down since January, thanks to COVID, but AWS and Microsoft are much larger than Google, and we'd like to see stronger across-the-board scores from Google. But there's good news in the numbers for Google. Take a look at this chart. It's a breakdown of Google's net scores over three survey snapshots. Now, we skip January in this view, and we do that to provide a year-over-year context for October. But look at the all-important database category. We've been watching this very closely, particularly with the Snowflake momentum, because BigQuery generally is considered the other true cloud-native database, and we have a lot of respect for what Google is doing in this area. Look at the areas of strength highlighted in green. You've got machine intelligence, where Google is a leader in AI; you've got containers. Kubernetes was an open source gift to the industry and a linchpin of Google's cloud and multi-cloud strategy. Google Cloud is strong overall.
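To make the net score arithmetic above concrete, here is a minimal sketch in Python. The category percentages are hypothetical placeholders chosen only so the example lands on the 43% figure cited for Google Cloud; they are not actual ETR survey values, and the function simply encodes the subtract-the-reds-from-the-greens description.

```python
# Minimal sketch of an ETR-style net score as described above.
# The breakdown values below are hypothetical, not actual survey data.

def net_score(new_adoptions, spending_up, flat, spending_down, replacing):
    # Greens (new adoptions + spending increases) minus reds (decreases + replacements);
    # the flat share is tracked for completeness but does not affect the score.
    return (new_adoptions + spending_up) - (spending_down + replacing)

# Hypothetical breakdown that sums to 100% and yields the cited 43%.
example = {"new_adoptions": 13, "spending_up": 40, "flat": 37,
           "spending_down": 6, "replacing": 4}
print(net_score(**example))  # -> 43
```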
We were surprised to see some deceleration in Google Cloud Functions at a 51% net score, to be honest with you, because if you look at AWS Lambda and Microsoft Azure Functions, they're showing net scores in the mid to high 60s. But that's still elevated for Google. Now, I'm not that worried about the steep declines in Apigee and Looker, because after an acquisition things kind of get spread out around the ETR taxonomy, so don't be too concerned about that. But as I said earlier, G Suite may just not be that compelling relative to the opportunity in other areas. Now, I won't show the data, but Google Cloud is showing good momentum across almost all industries and sectors, with the exception of consulting and small business, which is understandable; but there's notable deceleration in healthcare, which is a bit of a concern. Now I want to share some customer anecdotes about Google. These comments come from an ETR VENN roundtable. The first comment comes from an architect who says that "it's an advantage that Google is not entrenched in the enterprise." Hmm, I'm not sure I agree with that, but anyway, I do take stock in what this person is saying about Microsoft trying to lure people away from AWS. And this person is right that Google essentially has exposed its internal cloud to the world and has a ways to go, which is why I don't agree with the first statement. I think Google still has to figure out the enterprise. Now, the second comment here underscores a point that we made earlier about BigQuery: customers really like the out-of-the-box machine learning capabilities, it's quite compelling. Okay, let's look at some of the data that we shared previously. We'll update this chart once the companies all report earnings, but here's our most recent take on the big three cloud vendors' market performance. The key point here is that our data and the ETR data reflect Google's commentary in its earnings statements, and that GCP is growing much faster than Google's overall cloud business, which includes things that are not apples to apples with AWS; the same thing is true with Azure. Remember, AWS is the only company that provides clear data on its cloud business, whereas the others will make comments but not share the data explicitly. So these are estimates based on those comments, and we also use, as I say, the ETR survey data and our own intelligence. Now, as one of the practitioners said, Google has a long way to go. It's about an eighth of the size of AWS and about a fifth of the size of Azure. And although it's growing faster at this size, we feel that its growth should be even higher, but COVID is clearly a factor here, so we have to take that into consideration. Now I want to close by coming back to antitrust. Google spends a lot on R&D. These are quick estimates, but let me give you some context. Google shells out about $26 billion annually on research and development. That's about 16% of revenue. Apple spends less, about $16 billion, which is about 6% of revenue; Amazon, $23 billion, about 8% of the top line; Microsoft, $19 billion, or 13% of revenue; and Facebook, $14 billion, or 20% of revenue, wow. So Google for sure spends on innovation. And I'm not even including CapEx in any of these numbers, and the hyperscalers, as you know, spend tons on CapEx building data centers. So I'm not saying Google is cheaping out; they're not. And they've got plenty of cash on their balance sheet. They've got around $120 billion.
So I can't criticize their roughly $9 billion in stock buybacks the way I often point fingers at what I consider IBM's overly Wall Street-friendly use of cash. But I will say this, and it was Jeff Hammerbacher, who I spoke with on theCUBE in the early part of last decade at Hadoop World, who said "the best minds of my generation are spending their time trying to figure out how to get people to click on ads." And frankly, that's where much of Google's R&D budget goes. And again, I'm not saying Google doesn't spend on cloud computing. It does, but I'm going to make a prediction. The post-cookie apocalypse is coming soon; it may already be here. iOS 14 makes you opt in before apps can find out everything about you. This is why it's such a threat to Google. The days when Google was able to be the keeper of all of our data, and to house it and to do whatever it likes with that data, ended with GDPR, and that was just the beginning of the end. This decade is going to see massive changes in public policy that will directly affect Google and other consumer-facing technology companies. So my premise is that Google needs to step up its game in enterprise cloud and the edge much more than it's doing today. And I like what Thomas Kurian is doing, but Google's undervalued relative to some of the other big tech names, and I think it should tell Wall Street that our future is in enterprise cloud and edge computing, and we're going to take a hit to our profitability and go big in those areas. And I would suggest a few things. First, ramp up R&D spending and acquisitions even more. Go on a mission to create a cloud-native fabric across on-prem, the edge, and multicloud. Yes, I know this is your strategy, but step it up even more and forget satisfying investors. You're getting dinged in the market anyway. So now's the time to moon Wall Street and attack the opportunity, unless you don't see it, but it's staring you right in the face. Second, get way more cozy with the enterprise players that are scared to death of the cloud generally, and afraid of AWS in particular. Spend the cash and go way, way deeper with the big tech players who have built the past: IBM, Dell, HPE, Cisco, Oracle, SAP, and all the others. Those companies have the go-to-market chops to help you win the day in enterprise cloud. Now, I know you partner with these companies already, but partner deeper. Identify game-changing innovations that you can co-create with these companies and fund with your cash hoard. I'm essentially saying, do what you do with Apple, but instead of sucking up all our data and getting us to click on ads, solve really deep problems in the enterprise and the edge. It's all about actually building an on-prem, to cloud, across cloud, to the edge fabric and really making that a unified experience. And there's a data angle too, which I'll talk about now. The data collection methods that you've used on consumers are incredibly powerful if applied responsibly and correctly to IoT and edge computing. And I don't mean to trivialize the complexity at the edge. There really isn't one edge; it's telcos and factories and banks and cars. And I know you're in all these places, Google, because of Android, but there's a new wave of data coming from machines and cars, and it's going to dwarf people's clicks. And believe me, Tesla wants to own its own data, and Google needs to put forth a strategy that's a win-win. And so far you haven't done that, because your head is in advertising.
Get your head out of your ads and cut partners in on the deal. Next, double down on your open source commitment. Kubernetes showed the power that you have in the industry. Ecosystems are going to be the linchpin of innovation over the next decade and will transcend products and platforms. Use your money, your technology, and your position in the marketplace to create the next generation of technology, leveraging the power of the ecosystem. Now, I know Google is going to say, we agree, this is exactly what we're doing, but I'm skeptical. The way I see it, either the cloud remains a tiny little piece of your business, or you do what Satya Nadella did and completely pivot to the new opportunity: make cloud and the edge your mission, bite the bullet with Wall Street, and go dominate a multi-trillion dollar industry. Okay, well there you have it. Remember, all these episodes are available as podcasts, so please subscribe wherever you listen. I publish weekly on Wikibon.com and SiliconANGLE.com, and I post on LinkedIn each week as well. So please comment, or DM me @DVellante, or you can email me at David.Vellante@SiliconANGLE.com. And don't forget to check out ETR.plus, that's where all the survey action is. This is Dave Vellante for CUBE Insights powered by ETR. Thanks for watching everybody, be well, and we'll see you next time. (upbeat instrumental)
SUMMARY :
Dave Vellante reviews the history of antitrust actions against IBM and Microsoft, looks at the DOJ's new suit against Google, and uses ETR survey data to argue that Google's best response is to double or triple its focus on enterprise cloud and edge computing rather than clinging to its search and advertising past.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Microsoft | ORGANIZATION | 0.99+ |
Dave Vollante | PERSON | 0.99+ |
Dell | ORGANIZATION | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Jeff Hammerbacher | PERSON | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
Cisco | ORGANIZATION | 0.99+ |
Apple | ORGANIZATION | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
ORGANIZATION | 0.99+ | |
1982 | DATE | 0.99+ |
90% | QUANTITY | 0.99+ |
1998 | DATE | 0.99+ |
1995 | DATE | 0.99+ |
1987 | DATE | 0.99+ |
Telcos | ORGANIZATION | 0.99+ |
HPE | ORGANIZATION | 0.99+ |
6% | QUANTITY | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
OS/2 | TITLE | 0.99+ |
Satya Nadella | PERSON | 0.99+ |
1990s | DATE | 0.99+ |
Tesla | ORGANIZATION | 0.99+ |
120 billion | QUANTITY | 0.99+ |
200 lawyers | QUANTITY | 0.99+ |
Siri | TITLE | 0.99+ |
Bob Russell, CTA Group | CUBE Conversation, June 2020
>> Narrator: From theCUBE studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE Conversation. >> Everyone, welcome to this special CUBE Conversation here in theCUBE's Palo Alto studios. I'm John Furrier, your host, with a great story here to tell with Bob Russell, the CEO of the CTA Group, also known as the Community Technology Alliance. It's a great story, very relevant in this time, and it involves data and technology for good. So, Bob, thanks for spending the time to join me today. Thanks for remote dialing in, or internetting in, thank you. >> My pleasure, great to be with you. >> You guys have a really great mission with the Community Technology Alliance, also known as the CTA Group, which you guys go by. Take a minute to explain the firm and what you guys do, because I think this is a high-impact story for this community in general, but now more than ever it's a great story. Can you take a minute to explain? >> Thank you. We're a San Jose based nonprofit, and we were founded in 1991 to provide the technology needed to support the work to end homelessness in a number of California communities and counties, primarily by providing data collection and reporting tools for agencies that were receiving federal funding to house the homeless. Several years ago, as we were looking at the data, we realized that we needed to expand our focus to not only include the homeless, but to include what's called homeless prevention. And homeless prevention is providing services to those who are not homeless but who are at risk of becoming homeless, or those that are living in poverty and do not have enough money to pay the mortgage or pay their rent, and so they too are at risk of becoming homeless. Because what the data is showing is that once you become homeless, it can be difficult, it can be time consuming, and it can take a long time for you to secure new housing. So if you can help people who are on the cusp of becoming homeless, that's a wonderful thing. Keeping people from becoming homeless in the first place is one of the most effective tools in fighting homelessness in the Bay Area and throughout the United States. That expanded focus meant we really needed to rethink how best to leverage technology in order to help agencies and communities, both in homelessness and homeless prevention. And so we focused on three different components, or three tools. The first one was creating a data integration tool, so that agencies that are using multiple systems can integrate their data into a single source of truth, and they can quickly communicate and exchange data with one another in order to identify how best to help people in need in their communities. The second thing that we did was we created a mobile app, so that you could collect data out of your closed or proprietary system and upload that data later to your system or to a central data warehouse. And then also, once we pulled your data in from multiple data systems and created a single source of truth, you could actually view that unified data. And the third tool we developed was a reporting and analytics tool, so that you could quickly visualize your data, look at overall trends, and determine what measures are most effective in helping people to remain housed, or to help people who are homeless to secure housing as quickly as possible. So that's our story in a nutshell, John.
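As a rough illustration of the "single source of truth" integration Bob describes, here is a hypothetical sketch. The agency feeds, field names, and matching logic are invented for this example and do not reflect CTA's actual tools or data model; a real integration would also need identity resolution, consent, and privacy controls.

```python
# Hypothetical sketch: merging records from two siloed agency systems
# into one unified view per client. All fields and records are invented.
from collections import defaultdict

shelter_system = [
    {"client_id": "A-102", "service": "emergency shelter", "date": "2020-05-01"},
]
food_bank_system = [
    {"member": "A-102", "service": "food assistance", "date": "2020-05-03"},
]

def unify(shelter_records, food_records):
    # Key every record on a shared client identifier so each client
    # ends up with a single combined service history across agencies.
    unified = defaultdict(list)
    for r in shelter_records:
        unified[r["client_id"]].append({"source": "shelter", **r})
    for r in food_records:
        unified[r["member"]].append({"source": "food bank", **r})
    return dict(unified)

print(unify(shelter_system, food_bank_system))
```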
>> Yeah, one of the famous CUBE alumni, Jeff Hammerbacher, founder of Cloudera, once said on theCUBE, this is 10 years ago, and he came from Facebook, that our bright minds in the industry are working on data science so that people click on an ad. And that really became a rallying point in the computer science industry. This is really a data-driven strategy you guys are taking; it's proactive, not reactive, which still has its own challenges. So, you know, using data for good, there's some reality there. It's like collective intelligence, or predictive analytics, or a recommendation engine for services to be delivered. So love it, love this story, I think it's super important. It's not going to go away, it's only going to get stronger and better. But I've got to ask you, with that, what are some of the challenges with the current environment for social services? Because you mentioned legacy, legacy systems. Well, there's legacy process too. I can only imagine the challenges. What are some of those challenges in the current environment? >> Yes, yeah, there are many challenges, but I'd like to focus in on two. The first is agencies aren't networked; their systems are not networked. And so agency A cannot exchange and communicate with agency B. And so what happens in most communities is that if someone's in need, whether it's an individual or a family, odds are they're going to multiple agencies to secure all the different services that they need. And because agencies are not networked, it can be very difficult to secure services. If you're in need, you can end up spending a lot of time going from agency to agency, asking what's available and seeing if you're eligible for services. So one of the challenges that we were asked to overcome, by, you know, talking to various agencies and communities, is can you allow us to continue to use our current systems, but can you figure out a way for our systems to communicate and exchange critical data with one another. And the second reason, or challenge, is tied to the first. Most agencies have multiple funding sources in order to provide the services that they provide. And many of those funding sources will say to an agency, in exchange for us giving you funding, you must use this system to collect data and to report out. And so what happens is a single agency can have multiple data systems that simply cannot communicate with one another. And so this creates inefficiencies, and this means that resources that would be going to a client, a family, an individual, have to be redirected to doing multiple data entry and administering multiple systems. And so before we built any of our tools, we spent a good chunk of time talking to these various stakeholders in the homeless and poverty arena, asking, what are your primary pain points? These were the two that stood out for us, in how we could use technology to help these agencies get a more unified view of what's going on in their community and what works. >> How have any of the systematic changes affected you? Because the networking piece is huge. When we see this play out in data-driven businesses, the obvious ones are cybersecurity, where the more data the better, because you've got machine learning and a lot of things there. The other problem I want to get your thoughts on is the idea of not just not being networked, but the data silos. So the data silos are out there, and sometimes they're not talking to each other, even if they are connected.
>> So if you're homeless or at risk of becoming homeless, odds are you're going to need multiple services to help you. It's very rare that an agency has all the services that you need, so you end up being helped by multiple agencies. Each one of those agencies ends up being a data silo. And so you do not get a complete picture in your community of the various services that you are providing this client, and which services are most rapidly helping that client move either into housing or into self-sufficiency. So agencies are very much aware that they have data silos out there, but they simply do not have the expertise or the time or the resources to manually take all of that data and try to come up with a single spreadsheet that tells them everything. >> On the role of data, you mentioned the users, you mentioned an app. Can you share some anecdotal examples of where it's working, and the challenges and opportunities you guys are doubling down on? Because this is a really important point. If you look at our society at large today, the ability to deliver services, whether it's education, homelessness, or poverty, it's all kind of interconnected; it all has almost the same systematic, functional role, right? You've got to identify services and needs, match them to funding and/or people, and move in real time, or as contextually relevant as possible. If you do that right, you're on the front end, not the back end, of reacting to it. Can you give some examples? >> Yeah, I'm thinking of a young woman. I mean, for me, this has been a powerful story for our organization in helping us to understand the human impact that data silos can have. In one of our communities there was a young woman, recently divorced, with a young son who became sick. And so she went to the hospital to secure treatment for her child, and the clinic was able to help her. But when she asked, are there agencies out there, are there services out there that can help me with financial assistance, can help me with getting food and finding stable housing, they told her, no, we can't help you, we're a clinic, but we can point you to a shelter. Well, by the time she got to that shelter, they were full for the night. So she had no place for her and her son to stay. And so what happens is she ended up spending the night out on the street. And then she spent the next week looking for, you know, a food bank so she could get food, going to various agencies to find out, you know, do you have any available housing, do you have any financial assistance, and she was coming up against one obstacle after another. So if you're homeless and you don't have a car, and, you know, think about anyone in the Bay Area, how difficult it is to get around if you don't have a vehicle or someone who can provide you with transportation. Her life changed when she ended up at a homeless encampment, and what's called an outreach worker went to that encampment with our tools, with our mobile app. And this outreach worker met up with this young woman and said, how can I help you? And this woman explained, look, I need a place to stay for the night, I need food for my child, can you help me? And what she did was she took her tablet, opened up our mobile app, and found that, yes, there is a nearby shelter that has space available.
Let me get you into that shelter as soon as possible. She also alerted the case managers at that shelter: this is what the woman needs, can you provide that assistance to her as soon as we get her to the shelter? And so what happened was, instead of wandering around the community trying to find help, because of this timely encounter between this young woman and this outreach worker, the outreach worker was able to get this woman and her child into a temporary shelter, an emergency shelter, for the night, and then over time helped her secure her own apartment with financial assistance and also the other services that she needed. And for me, that is the essence of what we're trying to do here: simply remove the barriers. What happened here was this woman was able to quickly determine, through the help of an agency, what's currently available, and then connect to the appropriate agencies to get the services that she needed. And so I have told this story many times and it still gets me. This is the beauty of technology. This is how you can leverage technology and help someone in need. For me, it's just amazing what you can do with the right technology. >> It's such a powerful story, because it not only illustrates the personal needs that were met, but it also illustrates the scale: how data and the contextually relevant need at that time, having the right thing happen at the right time, when it needs to happen, can scale. So it's not a one-off. This is how technology can work. So I think this is a great indicator of things to come, and I think this is going to be playing out more, and that is the role of data and people. This has become a fundamental dynamic; it's not just about machines anymore. It's the human and the data interaction. This is becoming a huge thing. Can you share your thoughts on the role of people? Because audiences want to get involved, and you're seeing a much more mission-driven culture evolving quickly. People want to have an impact. >> Right. Oh, yeah, data plays a fundamental role. What helps me to understand just how fundamental that role is, is that what data does is create a narrative on the past and current experiences of people in need. In other words, data tells a story. And whether that person is homeless, or at risk of becoming homeless, or living in poverty, that narrative becomes a powerful tool for agencies. And when you take that narrative, because you've been able to harness technology to create it, what you can do with that narrative is coordinate available services to those in need. And as you know from the story of this young woman, you can also greatly reduce the wait time between when someone says, I have this need, and when you connect them with that available service. That narrative also helps you to improve your programs and services. You can look at what's working and what's not working, and make the necessary changes so that you end up helping more people. It improves access to programs and services. Instead of someone going by bus, or however, trying to get from one end of town to the other, imagine if you could go to a public library, for example, and as a person in need, log in, tell your story, enter your data, and say, help me find the services that I need. >> Yeah. >> The other thing is that it reduces inefficiencies.
Many agencies are spending a considerable amount of time on duplicate data entry in order to make sure that they're collecting the data in all the different systems that they need. And then I think another key way that data plays a fundamental role is that you can take your data as an agency, as a community, and you can tell your story to policy leaders and to funders and say, look, here is how you can support us in order to provide effective homeless and poverty alleviation solutions. So again, the idea that-- >> Yeah, that's a key point right there. I mean, the key point is, you look at people, process, technology, which is the overused cliche of digital transformation, very relevant by the way. The process piece is kind of taking the same track as you saw internet technologies change marketing and advertising: performance based, show me the clicks. If you think about what you just said, that's really what's going on here. You can actually have performance-based programs with specific deliverables. If I can do this, would you do more? And the answer is you can measure it with data. This is really the magic of this. It's a new way of doing things. And again, this is not going to go away. And I think stakeholders can hold people's feet to the fire for performance-based results, because the data is there if you strive to do a good mission. If the systems are in place, you can measure it. >> Thanks for that question, John. Three (background noise drowns out other sounds) come to mind. First, many organizations now financially match the donations that their employees make to nonprofits. So I would say check with your HR department and see if they have a matching program. And if they do, what happens then is that for every dollar that you give to that agency, your organization, the company that you work for, will match it, and so your money will go further. These same corporate social responsibility programs not only will match your donations, but the other thing that they will do is sometimes arrange opportunities to volunteer at various nonprofits. And so you can also check with your organization to see if they do that. A second possibility is that you connect with groups such as the Full Circle Fund. There are other groups out there, but I'm most familiar with the Full Circle Fund. It is a San Francisco based nonprofit that leverages your time and your resources and intellectual capital to help out with nonprofits throughout the Bay Area. So whether you're looking to volunteer coding or development skills, or you're looking for some way to find out what's going on in the Bay Area community and how you can help, Full Circle Fund would be a great resource. And again, there are other nonprofits like them out there as well. The third thing is, if you know of an agency in your area, a Goodwill, a United Way, a Habitat for Humanity, give them a call or check on their website to see what volunteer positions they have available or what they're looking for. And if it looks like a good match, give them a call and have that conversation. Those are three things that immediately came to mind for me, John, about if you wanted to help out, how could you? >> Well, certainly it's an important mission. I really appreciate, Bob, what you and your team are doing, Bob Russell, the CEO of the CTA Group, also known as the Community Technology Alliance.
Really putting technology into practice to help services get to the folks that matter: the homeless and the folks in poverty, or on the edge of poverty. It really is an example of how you can solve some of these systematic problems with a performance-based approach. If you follow the data, follow the money, follow the services, it all can work in real time. And that's a good example. So thank you so much for what you do, and great mission. Thank you for your time. >> Thank you, and thank you for having me. >> Okay, this is theCUBE. I'm John Furrier, covering all the stories here while we're still programming in theCUBE studios with our quarantine crew. Bob Russell, the CEO of the CTA Group, out with a great story. Check it out and get involved. I'm John Furrier, theCUBE. Thanks for watching. (bright upbeat music)
SUMMARY :
John Furrier talks with Bob Russell of the Community Technology Alliance about integrating data from siloed agency systems into a single source of truth, and how mobile data collection and analytics help connect homeless and at-risk individuals to shelter, food, and financial assistance more quickly.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Jeff Hammerbacher | PERSON | 0.99+ |
Bob Russell | PERSON | 0.99+ |
Bob | PERSON | 0.99+ |
two | QUANTITY | 0.99+ |
June 2020 | DATE | 0.99+ |
John Furrier | PERSON | 0.99+ |
1991 | DATE | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
California | LOCATION | 0.99+ |
John | PERSON | 0.99+ |
United States | LOCATION | 0.99+ |
Community Technology Alliance | ORGANIZATION | 0.99+ |
First | QUANTITY | 0.99+ |
three tools | QUANTITY | 0.99+ |
Bay Area | LOCATION | 0.99+ |
San Jose | LOCATION | 0.99+ |
CTA Group | ORGANIZATION | 0.99+ |
third tool | QUANTITY | 0.99+ |
Three | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
next week | DATE | 0.99+ |
Cloudera | ORGANIZATION | 0.99+ |
ORGANIZATION | 0.99+ | |
San Francisco | LOCATION | 0.99+ |
Boston | LOCATION | 0.99+ |
second thing | QUANTITY | 0.99+ |
CTA | ORGANIZATION | 0.98+ |
CUBE | ORGANIZATION | 0.98+ |
first | QUANTITY | 0.98+ |
today | DATE | 0.98+ |
second possibility | QUANTITY | 0.98+ |
second reason | QUANTITY | 0.98+ |
both | QUANTITY | 0.97+ |
third thing | QUANTITY | 0.97+ |
CTA group | ORGANIZATION | 0.97+ |
single source | QUANTITY | 0.97+ |
10 years ago | DATE | 0.96+ |
single agency | QUANTITY | 0.96+ |
Each one | QUANTITY | 0.95+ |
Full Circle Fund | ORGANIZATION | 0.95+ |
each one | QUANTITY | 0.95+ |
first one | QUANTITY | 0.93+ |
first place | QUANTITY | 0.92+ |
single spreadsheet | QUANTITY | 0.91+ |
three things | QUANTITY | 0.9+ |
Several years ago | DATE | 0.89+ |
three different components | QUANTITY | 0.88+ |
CUBES | ORGANIZATION | 0.87+ |
CUBE Conversation | EVENT | 0.83+ |
Social Responsibility | OTHER | 0.75+ |
one obstacle | QUANTITY | 0.74+ |
Bay | ORGANIZATION | 0.59+ |
tools | QUANTITY | 0.58+ |
most | QUANTITY | 0.43+ |
Conversation | EVENT | 0.41+ |
Dave Levy, AWS | AWS Imagine Nonprofit 2019
(stirring music) >> Announcer: From Seattle, Washington, it's theCUBE, covering AWS IMAGINE Nonprofit. Brought to you by Amazon Web Services. >> Hey, welcome back everybody. Jeff Frick here with theCUBE. We're in downtown Seattle, Washington, actually right on the waterfront. It has been a spectacular visit here for the last couple of days. And we're back in Seattle for AWS IMAGINE. We were here a couple weeks ago for AWS IMAGINE Education. This is a different version of the conference, really focused around government and nonprofits, and we're really excited to kick off our day with the guy coming right off the keynote who's running this. He's Dave Levy, the vice president for U.S. Government and Nonprofit for AWS. Dave, great to see you, and congrats on the keynote. >> Thank you, thanks for having me, too. We're really excited. >> Absolutely. So as you're talking about mission and purpose, and as I'm doing my homework on some of the topics we're going to cover today, these are big problems. I couldn't help but think of a famous quote from Jeff Hammerbacher from years ago, who said, "The greatest minds of my generation are thinking about how to make us click ads." And I'm so happy and refreshed to be here with you and your team, working on much bigger problems. >> Yeah, well thank you. We're very excited, we're thrilled with all the customers here, all the nonprofits, all the nongovernmental organizations, all of our partners. It's just very exciting, and there are a lot of big challenges out there, and we're happy to be a part of it. >> So it's our first time here, but you guys have been doing this show, I believe this is the fourth year. >> It's the fourth year, yeah. >> Give us a little background on the nonprofit sector at AWS. How did you get involved, you know, what's your mission, and some of the numbers behind it. >> Well, it's one of the most exciting parts of our business in the worldwide public sector. We have tens of thousands of customers in the nonprofit sector, and they are doing all sorts of wonderful things in terms of their mission. And we're trying to help them deliver on their mission with our technology. So you see everything from hosting websites, to doing back office functions in the cloud, to running research and donor platforms, and so it's just a very exciting time, I think. Nonprofit missions are accelerating, and we're helping them do that. >> Yeah, it's quite a different mission than selling books, or selling services, or selling infrastructure, when you have this real focus. The impact of some of these organizations is huge. We're going to talk to someone working on human trafficking; 25,000,000 people are affected by this problem. So these are really big problems that you guys are helping out with. >> They're huge problems, and at Amazon, we really identify with missionaries. We want our partners and our customers to be empowered to deliver on their mission. We feel like we're missionaries and we're builders at Amazon, so this is a really good fit for us, to work with nonprofits all over the world. >> And how did you get involved? We were here a couple weeks ago, talked to Andrew Ko. He runs EDU; he'd grown up in tech, and then one of his kids had an issue that drove him into education. What's your mission story? >> Well, on a personal level, I'm just passionate about this space. There's so much opportunity. It's everything from solving challenges around heart disease, to research for cancer, patient care, to human trafficking.
So all of those things resonate. It touches all of our lives, and I'm thrilled to be able to contribute, and I've got a fantastic team, and we've got amazing customers. >> Right, it's great. I did a little homework on you; you're a pretty interesting guy too. You referenced something that I thought was really powerful in an interview somebody did with you. You talked about practice. Practice, practice, practice, as a person. And you invoked Amara's Law, which I had never heard applied to a person, which is that we tend to overestimate what we can do in the short term, but we underestimate what we can do in the long term. And as these people are focused on these giant missions, the long-term impacts can be gargantuan. >> Yeah, I think so. Like you said, we're tackling some huge problems out there. Huge, difficult problems. Migrations, diseases. And, you know, it takes a while to get these things done. And when you look back on a ten-year horizon, you can really accomplish a lot. So we like to set big, bold, audacious goals at Amazon. We like to think big, and we want to encourage our customers to think big along with us. And we'll support them to go on this journey. It may take some time, but I'm confident we can solve a lot of the big problems out there. >> But it's funny, there's a lot of stuff in social now where a lot of people don't think big enough. And you were very specific in your keynote. You had three really significant challenges: go from big ideas to impact, learn and be curious, and dive deep. Because like you said, these are not simple problems. These aren't just going to go away. You really need to spend the time to get into it. And I think what's cool about Amazon, with your fanatical customer focus, is that applying that type of framework, that type of go-to-market approach, to the nonprofit area really gives you a unique point of view. >> I hope so. And we're doing a lot of really cool things here at the conference. We've got a Working Backwards session. One of the things about working backwards that's really interesting is that the customer's at the center of it. And it all starts with the customer. I can't tell you how many times I've been in a meeting at Amazon where somebody has said, wait a second, this is what we heard these customers say, this is what we heard about their mission. And it's all about what customers want. So we're really excited that our customers here and our nonprofits here are going to be going through some of those sessions, and hopefully we can provide a little innovation engine for them by applying Amazon processes to it. >> For the people that aren't familiar, working backwards, if I'm hearing you right, is the Amazon practice where you actually write the press release for when you're finished, and then work backwards, so you stay focused on those really core objectives. >> Yeah, that's right. You start with your end state in mind and work backwards from there. And it starts with a press release. And certainly those are fun to write, because you want to know what you're going to be delivering and how you're going to be delivering it, and frankly how your customers and your stakeholders will be responding. So it's a really great exercise, it helps you focus on the mission, and it sets up the stage for delivery in the future. >> It's funny, I think one of the greatest and simplest examples of that is the Amazon Go store.
And I've heard lots of stories, I've been in it now a couple of times up here and in San Francisco, and the story that I've heard, maybe you know if it's true or not, is that when they tried to implement it at first, they had a lot more departments. And unfortunately it introduced lines, not necessarily at checkout, but at other places in the store. And with that single focused mission of no lines, they cut back the SKUs, cut back the selection, and so when I went into it in San Francisco the other day, it gave me my little time in the store, like the Google search results. It was, I think, a minute and 19 seconds to go in, grab a quick lunch, and then get back on my way. So really laser-focused on a specific objective. >> Yeah, and that's the point of the working backwards process. It's all about what customers want, and you can refine that and continue to refine that, and you get feedback, and you're able to answer those questions and solve those difficult problems. >> That's great. Well, Dave, thanks for inviting us here for the first time again. Congrats on the keynote, and we look forward to a bunch of really important work that your customers and your team are working on, and to learning more about those stories. >> Thanks, we're thrilled. Very thrilled. >> All right. He's Dave, I'm Jeff. You're watching theCUBE. We're in Seattle at AWS IMAGINE Nonprofit. Thanks for watching, we'll see you next time. (light electronic music)
SUMMARY :
Jeff Frick talks with Dave Levy, AWS vice president for U.S. Government and Nonprofit, at AWS IMAGINE Nonprofit about how nonprofits use AWS to deliver on their missions, Amazon's Working Backwards process, and the value of setting big, long-term goals.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave | PERSON | 0.99+ |
Dave Levy | PERSON | 0.99+ |
Andrew Ko | PERSON | 0.99+ |
Jeff Frick | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Seattle | LOCATION | 0.99+ |
Jeff Hammerbacher | PERSON | 0.99+ |
Amazon Web Services | ORGANIZATION | 0.99+ |
Jeff | PERSON | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
San Francisco | LOCATION | 0.99+ |
fourth year | QUANTITY | 0.99+ |
ten year | QUANTITY | 0.99+ |
Seattle, Washington | LOCATION | 0.99+ |
first time | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
three | QUANTITY | 0.98+ |
25,000,000 people | QUANTITY | 0.98+ |
2019 | DATE | 0.98+ |
EDU | ORGANIZATION | 0.98+ |
AWS IMAGINE | ORGANIZATION | 0.97+ |
U.S. Government | ORGANIZATION | 0.97+ |
a minute | QUANTITY | 0.96+ |
single | QUANTITY | 0.95+ |
One | QUANTITY | 0.94+ |
today | DATE | 0.94+ |
couple weeks ago | DATE | 0.91+ |
19 | QUANTITY | 0.88+ |
ORGANIZATION | 0.88+ | |
tens of thousands of customers | QUANTITY | 0.86+ |
Amazon Go | ORGANIZATION | 0.85+ |
years ago | DATE | 0.73+ |
significant challenges | QUANTITY | 0.71+ |
Nonprofit | ORGANIZATION | 0.7+ |
theCUBE | ORGANIZATION | 0.7+ |
first | QUANTITY | 0.64+ |
couple times | QUANTITY | 0.61+ |
days | DATE | 0.56+ |
Amara | TITLE | 0.51+ |
last couple | DATE | 0.49+ |
second | QUANTITY | 0.4+ |
Imagine Nonprofit | TITLE | 0.4+ |
IMAGINE | TITLE | 0.32+ |
Kickoff | theCUBE NYC 2018
>> Live from New York, it's theCUBE, covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. (techy music) >> Hello, everyone, welcome to this CUBE special presentation here in New York City for CUBENYC. I'm John Furrier with Dave Vellante. This is our ninth year covering the big data industry, starting with Hadoop World and evolving over the years. This is our ninth year, Dave. We've been covering Hadoop World, Hadoop Summit, Strata Conference, Strata Hadoop. Now it's called Strata Data; I don't know what O'Reilly's going to call it next. As you all know, theCUBE has been present at the creation of the Hadoop big data ecosystem. We're here for our ninth year, and certainly a lot's changed. AI's the center of the conversation, and certainly we've seen some horses come in, some haven't come in, and trends have emerged, some gone away. Your thoughts? Nine years covering big data. >> Well, John, I remember fondly, vividly, the call that I got. I was in Dallas at a Storage Networking World show and you called and said, "Hey, we're doing Hadoop World, get over there," and of course, Hadoop, big data, was the new, hot thing. I told everybody, "I'm leaving." Most of the people said, "What's Hadoop?" Right, so we came, we started covering. It was people like Jeff Hammerbacher, Amr Awadallah, Doug Cutting, who invented Hadoop, Mike Olson, you know, head of Cloudera at the time, and people like Abhi Mehta, who at the time was at B of A, and some of the things we learned then were profound-- >> Yeah. >> As much as Hadoop is sort of on the back burner now and people really aren't talking about it, some of the things that are profound about Hadoop, really, were the idea, the notion of bringing five megabytes of code to a petabyte of data, for example, or the notion of no schema on write. You know, put it into the database and then figure it out. >> Unstructured data. >> Right. >> Object storage. >> And so, that created a state of innovation, of funding. We were talking last night about how, many, many years ago at this event, this time of the year, concurrent with Strata, you would have VCs all over the place. There really aren't a lot of VCs here this year, not a lot of VC parties-- >> Mm-hm. >> As there used to be, so that has somewhat waned, but some of the things that we talked about back then, we said that big money in big data is going to be made by the practitioners, not by the vendors, and that's proved true. I mean... >> Yeah. >> The big three Hadoop distro vendors, Cloudera, Hortonworks, and MapR, you know, Cloudera's $2.5 billion valuation, you know, not bad, but it's not a $30, $40 billion value company. The other thing we said is there will be no Red Hat of big data. You said, "Well, the only Red Hat of big data might be Red Hat," and so, (chuckles) that's basically proved true. >> Yeah. >> And so, I think if we look back, we always talked about Hadoop and big data being a reduction, the ROI was a reduction on investment. >> Yeah. >> It was a way to have a cheaper data warehouse, and that's essentially-- >> Well, what did we get right and wrong? I mean, let's look at some of the trends. I mean, first of all, I think we got pretty much everything right, as you know. We tend to make the calls pretty accurately with theCUBE. We've got a lot of data, we have the analytics in our own system, plus we have the research team digging in, so you know, we pretty much do a good job.
I think one thing that we predicted was that Hadoop certainly would change the game, and it did. We also predicted that there wouldn't be a Red Hat for Hadoop; that was a prediction. The other prediction was that Hadoop won't kill data warehouses, and it didn't, and then data lakes came along. You know my position on data lakes. >> Yeah. >> I've always hated the term. I always liked data ocean, because I think it conveys much more fluidity of the data, so I think we got that one right, and data lakes still don't look like they're going to pan out well. I mean, for most people that deploy data lakes, it's really either not a core thing or it's part of something else, and it's turning into a data swamp, so I think the data lake piece is not panning out the way people thought it would. I think one thing we did get right, also, is that data would be the center of the value proposition, and it continues and remains to be, and I think we're seeing that now, and we said data's the development kit back in 2010 when we said data's going to be part of programming. >> Some of the other things from our early data: we went out and we talked to a lot of practitioners, who were hard to find in the early days. They were just a select few, I mean, other than inside of Google and Yahoo! But what they told us is that things like SQL and the enterprise data warehouse were key components of their big data strategy, so to your point, you know, it wasn't going to kill the EDW, but it was going to surround it. The other thing we called was cloud. Four years ago our data showed clearly that much of this work, the modeling, the big data wrangling, et cetera, was being done in the cloud, and Cloudera, Hortonworks, and MapR, none of them at the time really had a cloud strategy. Today that's all they're talking about, cloud and hybrid cloud. >> Well, it's interesting, I think it was like four years ago, I think, Dave, when we actually were riffing on the notion of, you know, Cloudera's name. It's called Cloudera, you know. If you spell it out, in Cloudera we're in a cloud era, and I think we were very aggressive at that point. I think Amr Awadallah even made a comment on Twitter. He was like, "I don't understand where you guys are coming from." We were actually saying at the time that Cloudera should leverage more cloud, and they didn't. They stayed on their IPO track, and they had to, because they had everything bet on Impala and the data model that they had being the business model, and then they went public, but I think clearly cloud is now part of Cloudera's story, and I think that's a good call, and it's not too late for them. It never was too late, but you know, Cloudera has executed. I mean, if you look at what's happened with Cloudera, they were the only game in town. When we started theCUBE we were in their office, as most people know in this industry; we were there with Cloudera when they had like 17 employees. I thought Cloudera was going to run the table, but then what happened was Hortonworks came out of Yahoo! That, I think, changed the game, and that competitive battle between Hortonworks and Cloudera, in my opinion, changed the industry, because if Hortonworks did not come out of Yahoo! Cloudera would've had an uncontested run. I think the landscape of the ecosystem would look completely different had Hortonworks not competed, because you think about it, Dave, they had that competitive battle for years.
The Hortonworks-Cloudera battle, I think it changed the industry. I think it could've been a different outcome. If Hortonworks wasn't there, I think Cloudera probably would've taken Hadoop and made it so much more, and I think they would've gotten more done. >> Yeah, and I think the other point we have to make here is complexity really hurt the Hadoop ecosystem, and it was just bespoke, new projects coming out all the time, and you had Cloudera, Hortonworks, and maybe to a lesser extent MapR doing a lot of the heavy lifting, particularly, you know, Hortonworks and Cloudera. They had to invest a lot of their R&D in making these systems work and integrating them, and you know, complexity just really broke the back of the Hadoop ecosystem, and so then Spark came in, everybody said, "Oh, Spark's going to basically replace Hadoop." You know, yes and no; the people who got Hadoop right, you know, embraced it and they still use it. Spark definitely simplified things, but now the conversation has turned to AI, John. So, I've got to ask you, I'm going to use your line on you in kind of the ask-me-anything segment here. AI, is it same wine, new bottle, or is it really substantively different, in your opinion? >> I think it's substantively different. I don't think it's the same wine in a new bottle. I'll tell you... Well, it's kind of like the bad wine... (laughs) is going to be blended in with the good wine, which is now AI. If you look at this industry, the big data industry, look at what O'Reilly did with this conference. I think O'Reilly really has not done a good job with the conference around big data. I think they blew it, I think that they made it a, you know, monetization, closed system, when the big data business could've been all about AI in a much deeper way. I think AI is subordinate to cloud, and you mentioned cloud earlier. If you look at all the action within the AI segment, Diane Greene talking about it at Google Next, Amazon, AI is a software layer substrate that will be underpinned by the cloud. Cloud will drive more action, you need more compute, that drives more data, more data drives the machine learning, machine learning drives the AI, so I think AI is always going to be dependent upon clouds or some sort of high-compute resource base, and all the cloud analytics are feeding into these AI models, so I think cloud takes over AI, no doubt, and I think this whole ecosystem of big data gets subsumed under either an AWS, VMworld, Google, or Microsoft cloud show, and then also I think specialization around data science is going to go off on its own. So, I think you're going to see the breakup of the big data industry as we know it today. Strata Hadoop, Strata Data Conference, that thing's going to crumble into multiple, fractured ecosystems. >> It's already starting to be forked. I think the other thing I want to say about Hadoop is that it actually brought such great awareness to the notion of data: putting data at the core of your company, data and data value, the ability to understand how data at least contributes to the monetization of your company. AI would not be possible without the data. Right, and we've talked about this before. You call it the innovation sandwich. The innovation sandwich, last decade, last three decades, has been Moore's Law.
>> Yeah, and I think data is everywhere, so this idea of being a categorical industry segment is a little bit off, I mean, although I know data warehouse is kind of its own category and you're seeing that, but I don't think it's like a Magic Quadrant anymore. Every quadrant has data. >> Mm-hm. >> So, I think data's fundamental, and I think that's why it's going to become a layer within a control plane of either cloud or some other system, I think. I think that's pretty clear, there's no, like, one. You can't buy big data, you can't buy AI. I think you can have AI, you know, things like TensorFlow, but it's going to be a completely... Every layer of the stack is going to be impacted by AI and data. >> And I think the big players are going to infuse their applications and their databases with machine intelligence. You're going to see this, you're certainly, you know, seeing it with IBM, the sort of Watson heavy lift. Clearly Google, Amazon, you know, Facebook, Alibaba, and Microsoft, they're infusing AI throughout their entire set of cloud services and applications and infrastructure, and I think that's good news for the practitioners. People aren't... Most companies aren't going to build their own AI, they're going to buy AI, and that's how they close the gap between the sort of data haves and the data have-nots, and again, I want to emphasize that the fundamental difference, to me anyway, is having data at the core. If you look at the top five companies in terms of market value, US companies, Facebook maybe not so much anymore because of the fake news, though Facebook will be back with its two billion users, but Apple, Google, Facebook, Amazon, who am I... And Microsoft, those five have put data at the core and they're the most valuable companies in the stock market from a market cap standpoint, why? Because it's a recognition that that intangible value of the data is actually quite valuable, and even though banks and financial institutions are data companies, their data lives in silos. So, these five have put data at the center, surrounded it with human expertise, as opposed to having humans at the center and having data all over the place. So, how do they, how do these companies close the gap? How do the companies in the flyover states close the gap? The way they close the gap, in my view, is they buy technologies that have AI infused in them, and I think the last thing I'll say is I see cloud as the substrate, and AI, and blockchain and other services, as the automation layer on top of it. I think that's going to be the big tailwind for innovation over the next decade. >> Yeah, and obviously the theme of machine learning drives a lot of the conversations here, and that's essentially never going to go away. Machine learning is the core of AI, and I would argue that AI truly doesn't even exist yet. It's machine learning really driving the value, but to put a validation on the fact that cloud is going to be driving the AI business, some of the terms in popular conversations we're hearing here in New York around this event and topic, CUBENYC and Strata Conference, are Kubernetes and blockchain, and you know, these automation, AI operation kind of conversations. That's an IT conversation, (chuckles) so you know, that's interesting. You've got IT, really, with storage.
You've got to store the data, so you can't not talk about workloads and how the data moves with workloads, so you're starting to see data and workloads kind of be tossed in the same conversation, and that's a cloud conversation. That is all about multi-cloud. That's why you're seeing Kubernetes, a term I never thought I would be saying at a big data show, but Kubernetes is going to be key for moving workloads around, of which there's data involved. (chuckles) Instrumenting the workloads, data inside the workloads, data driving data. This is where AI and machine learning's going to play, so again, cloud subsumes AI, that's the story, and I think that's going to be the big trend. >> Well, and I think you're right. I mean, that's why you're hearing the messaging of hybrid cloud from the big distro vendors, and the other thing is you're hearing from a lot of the NoSQL database guys, they're bringing ACID compliance, they're bringing enterprise-grade capability, so you're seeing the world is hybrid. You're seeing those two worlds come together, so... >> Those worlds are converging, and the playing field is getting leveled out there. It's all about enterprise, B2B, AI, cloud, and data. That's theCUBE bringing you the data here. New York City, CUBENYC, that's the hashtag. Stay with us for more coverage live in New York after this short break. (techy music)
SENTIMENT ANALYSIS:
ENTITIES
Entity | Category | Confidence |
---|---|---|
Apple | ORGANIZATION | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Diane Greene | PERSON | 0.99+ |
John | PERSON | 0.99+ |
Alibaba | ORGANIZATION | 0.99+ |
Dave | PERSON | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
Jeff Hammerbacher | PERSON | 0.99+ |
$30 | QUANTITY | 0.99+ |
New York | LOCATION | 0.99+ |
2010 | DATE | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Doug Cutting | PERSON | 0.99+ |
Mike Olson | PERSON | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
Dallas | LOCATION | 0.99+ |
O'Reilly | ORGANIZATION | 0.99+ |
Yahoo | ORGANIZATION | 0.99+ |
Cloudera | ORGANIZATION | 0.99+ |
five | QUANTITY | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Abi Mehda | PERSON | 0.99+ |
John Furrier | PERSON | 0.99+ |
New York City | LOCATION | 0.99+ |
$2.5 billion | QUANTITY | 0.99+ |
SiliconANGLE Media | ORGANIZATION | 0.99+ |
MapR | ORGANIZATION | 0.99+ |
Amr Awadallah | PERSON | 0.99+ |
$40 billion | QUANTITY | 0.99+ |
17 employees | QUANTITY | 0.99+ |
VMworld | ORGANIZATION | 0.99+ |
Today | DATE | 0.99+ |
Impala | ORGANIZATION | 0.99+ |
Nine years | QUANTITY | 0.99+ |
four years ago | DATE | 0.98+ |
last night | DATE | 0.98+ |
last decade | DATE | 0.98+ |
Strata Data Conference | EVENT | 0.98+ |
Strata Conference | EVENT | 0.98+ |
Hadoop Summit | EVENT | 0.98+ |
ninth year | QUANTITY | 0.98+ |
Four years ago | DATE | 0.98+ |
two worlds | QUANTITY | 0.97+ |
five companies | QUANTITY | 0.97+ |
today | DATE | 0.97+ |
Strata Hadoop | EVENT | 0.97+ |
Hadoop World | EVENT | 0.96+ |
CUBE | ORGANIZATION | 0.96+ |
Google Next | ORGANIZATION | 0.95+ |
this year | DATE | 0.95+ |
Spark | ORGANIZATION | 0.95+ |
US | LOCATION | 0.94+ |
CUBENYC | EVENT | 0.94+ |
Strata O'Reilly | ORGANIZATION | 0.93+ |
next decade | DATE | 0.93+ |
Bob Rogers, Intel, Julie Cordua, Thorn | AWS re:Invent
>> Narrator: Live from Las Vegas, it's theCUBE, covering AWS re:Invent 2017, presented by AWS, Intel, and our ecosystem of partners. >> Hello everyone, welcome to a special CUBE presentation here, live in Las Vegas for Amazon Web Services' AWS re:Invent 2017. This is theCUBE's fifth year here. We've been watching the progression. I'm John Furrier with Justin here as my co-host. Our two next guests are Bob Rogers, the chief data scientist at Intel, and Julie Cordua, who's the CEO of Thorn. Great guests, showing some AI for good. Intel, obviously, good citizen and great technology partner. Welcome to theCUBE. >> Thank you, thanks for having us! >> So, I saw the talk you gave at the Public Sector Breakfast this morning here at re:Invent. Packed house, fire marshal was kicking people out. Really inspirational story. Intel, we've talked at South by Southwest. You guys are really doing a lot of AI for good. That's the theme here. You guys are doing incredible work. >> Julie: Thank you. >> Tell your story real quick. >> Yeah, so Thorn is a nonprofit, we started about five years ago, and we are just specifically dedicated to building new technologies to defend children from sexual abuse. We were seeing that, as, you know, new technologies emerge, there's new innovation out there, how child sexual abuse was presenting itself was changing dramatically. So, everything from child sex trafficking online, to the spread of child sexual abuse material, livestreaming abuse, and there wasn't a concentrated effort to put the best and brightest minds and technology together to be a part of the solution, and so that's what we do. We build products to stop child abuse. >> John: So you're a nonprofit? >> Julie: Yep! >> And you're in that public sector, but you guys have made great progress. What's the story behind it? How did you get to do such effective work in such a short period of time as a nonprofit? >> Well, I think there's a couple things to that. One is, well, we learned a lot really quickly, so what we're doing today is not what we thought we would do five years ago. We thought we were gonna talk to big companies, and push them to do more, and then we realized that we actually needed to be a hub. We needed to build our own engineering teams, we needed to build product, and then bring in these companies to help us, and to add to that, but there had to be some there there, and so we actually have evolved. We're a nonprofit, but we are a product company. We have two products used in 23 countries around the world, stopping abuse every day. And I think the other thing we learned is that we really have to break down silos. So, we didn't, in a lot of our development, we didn't go the normal route of saying, okay, well this is a law enforcement job, so we're gonna go bid on a big government RFP. We just went and built a tool and gave it to a bunch of police officers and they said, "Wow, this works really well, we're gonna keep using it." And it kinda spread like wildfire. >> And it's making a difference. It's really been a great inspirational story. Check out Thorn, amazing work, real use case, in my mind, a testimonial for how fast you can accelerate. Congratulations. Bob, I wanna get your take on this because it's a data problem that, actually, technology's being applied to, a problem that people have been trying to crack the code on for a long time.
>> Yeah, well, it's interesting, 'cause the context is that we're really in this era of AI explosion, and AI is really computer systems that can do things that only humans could do 10 years ago. That's kind of my basic way of thinking about it, so the problem of being able to recognize when you're looking at two images of the same child, which is the piece that we solved for Thorn, actually, you know, is a great example of using the current AI capabilities. You start with the problem of, if I show an algorithm two different images of the same child, can it recognize that they're the same? And you basically customize your training to create a very specific capability. Not a basic image recognition or facial recognition, but a very specific capability that's been trained with specific examples. I was gonna say something about what Julie was describing about their model. Their model to create that there there has been incredible because it allows them to really focus our energy on the right problems. We have lots of technology, we have lots of different ways of doing AI and machine learning, but when we get a focus on this is the data, this is the exact problem we need to solve, and this is the way it needs to work for law enforcement, for the National Center for Missing and Exploited Children. It has really just turned the knob up to 11, so to speak. >> I mean, this is an example where, I mean, we always talk about how tech transformation can make things go faster. It's such an obvious problem. I mean, it's almost like everyone kinda looks away because it's too hard. So, I wanna ask you, how do people make this happen for other areas, for good? So, for instance, you know, what were the bottlenecks before? What solved the problem, because, I mean, you could really make a difference here. You guys are. >> Well, I think there's a couple things. I think you hit on one, which is this is a problem people turn away from. It's really hard to look at. And the other thing is there's not a lot of money to be made in using advanced technology to find missing and exploited children, right? So, it did require the development of a nonprofit that said, "We're gonna do this, and we're gonna fundraise to get it done." But it also required us to look at it from a technology angle, right? I think a lot of times people look at social issues from the impact angle, which we do, but we said, "What if we looked at it from a different perspective? How can technology disrupt in this area?" And then we made that the core of what we do, and we partnered with all the other amazing organizations that are doing the other work. And I think, then, what Bob said was that we created a hub that other experts could plug into, and I think, in any other issue area that you're working on, you can't just talk about it and convene people. You actually have to build, and when you build, you create a platform that others can add to, and I think that is one of the core reasons why we have seen so much progress, is we started out convening and really realized that wasn't gonna last very long, and then we built, and once we started building, we scaled. >> So, you got in the market quickly with something. >> Yeah. >> So, one of the issues with any sort of criminal enterprise is it tends to end up in a bit of an arms race, so you've built this great technology but then you've gotta keep one step ahead of the bad guys. So, how are you actually doing that?
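Bob's framing above, show the model two images and ask whether they're the same child, maps onto a standard similarity-learning recipe: embed each image and compare the embeddings. What follows is only a rough, hypothetical sketch of that recipe, not Thorn's or Intel's system; a generic pretrained backbone stands in for the custom-trained model Bob describes, and the file names and the 0.85 cutoff are placeholders.

```python
# Illustrative only: compare two images by cosine similarity of embeddings
# from a generic pretrained CNN. A production system of the kind Bob describes
# would fine-tune the embedding on labeled same-child / different-child pairs
# (e.g. with a triplet or contrastive loss) instead of using ImageNet features.
import torch
import torch.nn.functional as F
from torchvision import models, transforms
from PIL import Image

backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()   # keep the 2048-dim feature vector
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def embed(path: str) -> torch.Tensor:
    """Return an L2-normalized embedding for one image file."""
    img = Image.open(path).convert("RGB")
    with torch.no_grad():
        feats = backbone(preprocess(img).unsqueeze(0))
    return F.normalize(feats, dim=1)

def same_child_score(path_a: str, path_b: str) -> float:
    """Cosine similarity in [-1, 1]; higher means more likely the same person."""
    return F.cosine_similarity(embed(path_a), embed(path_b)).item()

# Hypothetical files and cutoff; the threshold has to be calibrated on labeled
# example pairs, which is exactly the custom training step Bob describes.
if same_child_score("image_a.jpg", "image_b.jpg") > 0.85:
    print("Likely the same child; route for human review")
```

The payoff Bob describes comes from the training step this sketch skips: fine-tuning the embedding on labeled same-child and different-child pairs so the similarity score separates them far better than off-the-shelf features can.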
How are you continuing to invest in this and develop it to make sure that you're always one step ahead? >> So, I can address that on a couple of levels. One is, you know, working with Thorn, and I lead a program at Intel called the Safer Children Program, where we work with Thorn and also the National Center for Missing and Exploited Children. Those conversations bring in all of the tech giants, and there's a little bit of sibling rivalry. We're all trying to throw in our best tech. So, I think we all wanna do as well as we can for these partnerships. The other thing is, just in very tactical terms, working with Thorn, we've actually, Thorn and with Microsoft, we've created a capability to crowdsource more data to help improve the accuracy of these deep learning algorithms. So, by getting critical mass around this problem, we've actually now created enough visibility that we're getting more and more data. And as you said earlier, it's a data problem, so if you have enough data, you can actually create the models with the accuracy and the capability that you need. So, it starts to feed on itself. >> Julie talked about the business logic, how she attacked that. That's really, 'cause I think one thing notable, good use case, but from a tech perspective, how does the cloud fit in with Intel specifically? Because it really, the cloud is an enabler too. >> Bob: Yeah, absolutely. >> How's that all working with Intel? And you're forging into whole new territory here, it's awesome, but the cloud. >> Right, so, for us, the cloud is an incredible way for us to make our compute capability available to anyone who needs to do computing, especially in this data-driven algorithm era where more and more machine learning, more and more AI, more and more data-driven problems are coming to the fore. Doing that work on the cloud and being able to scale your work according to how much data is coming in at any time makes the cloud a really natural place for us. And of course, Intel's hardware is a core component of pretty much all the cloud that you could connect to. >> And the compute that you guys provide, and Amazon adds to it, their cloud is impressive. Now, I'd like to know what you guys are gonna be talking about in your session. You have a session here at re:Invent. What's the title of the session, what's the agenda, is it the same stuff here, what's gonna be talked about? >> So, we're talking about life-changing AI applications, and in specific we're gonna talk about, at the end Julie will talk about what Thorn has done with the child-finder and the AI that we and Microsoft built for them. We'll also, I'll start out by talking about Intel's role broadly in the computing and AI space. Intel really looks to take all of its different hardware, and networking, and memory assets, and make it possible for anybody to do the kinds of artificial intelligence or machine learning they need to do. And then in the middle, there's a really cool deployment-on-AWS sandwich that (something) will talk about, how they've taken the models and really dialed them up in terms of how fast you can go through this data, so that we can go through millions and millions of images in our searches, and come back with results really, really fast. So, it's a great sort of three-piece story about the conception of AI, the deployment at scale and with high performance, and then how Thorn is really taking that and creating a human impact around it.
>> So, Bob, I asked you the Intel question because no one calls up Intel and says, "Hey, give me some AI for good." I mean, I wish that would be the case. >> Well, they do now. >> If they do, well, share your strategy, because cloud makes sense. I could see how you could provision easily, get in there, really empowering people to do stuff that they're passionate about and that's relevant. But how do you guys play in all of this? 'Cause I know you supply stuff to the cloud guys. Is this a formal program you're doing at Intel? Is this a one-off? >> Yeah, so Safer Children is a formal program. It started with two other folks, Lisa Davis and Lisa Theinai, going to the VP of the entire data center group and saying, "There is an opportunity to make a big impact with Intel technology, and we'd like to do this." And it started literally because Intel does actually want to do good work for humankind, and frankly, the fact that these people are using our technology and other technology to hurt children, it steams our dumplings, frankly. So, it started with that. >> You've been a team player with Amazon and everyone else. >> Exactly, so then, once we've been able to show that we can actually create technology and provide infrastructure to solve these problems, it starts to become a self-fulfilling prophecy where people are saying, "Hey, we've got this interesting adjacent problem that this kind of technology could solve. Is there an opportunity to work together and solve that?" And that fits into our bigger, you know, people ask me all the time, "Why does Intel have a chief data scientist?" We're a hardware company, right? The answer is-- >> That processes a lot of data! >> Yes, that processes a lot of data. Literally, we need to help people know how to get value from their data. So, if people are successful with their analytics and their AI, guess what, they're gonna invest in their infrastructure, and it sort of lifts Intel's boat across the board. >> You guys have always been a great citizen, and great technology provider, and hats off to Intel. Julie, tell a story about an example people can get a feel for, some of the impact, because I saw you on stage this morning with Theresa Carlson, and we've been tracking her efforts in the public sector, which have been amazing, and Intel's been part of that too, congratulations. But you were kind of emotional, and you got a lot of applause. What's some of the impact? Tell a story of how important this really is, and your work at Thorn. >> Yeah, well, I mean, one of the areas we work in is trying to identify children who are being sold online in the US. A lot of people, first of all, think that's happening somewhere else. No, that's here in this country. A lot of these kids are coming out of foster care, or are runaways, and they get convinced by a pimp or a trafficker to be sold into prostitution, basically. So, we have 150,000 escort ads posted every single day in this country, and somewhere in there are children, and it's really difficult to look through that with your eye, and determine what's a child. So, we built a tool called Spotlight that basically reads and analyzes every ad as it comes in, and we layer on smart algorithms to say to an officer, "Hey, this is an ad you need to pay attention to. It looks like this could be a child." And we've had over 6,000 children identified over the last year. >> John: That's amazing.
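Julie describes Spotlight only at the level of reading every ad as it comes in and layering on smart algorithms, so the following is just a generic sketch of that pattern, not Thorn's actual implementation: score incoming ad text with a supervised classifier and surface the riskiest items for an officer. The training examples and the 0.8 threshold are placeholders.

```python
# Generic triage sketch, not Spotlight: score incoming ad text with a trained
# classifier and return only the highest-risk items, ranked for review.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled history: 1 = previously confirmed high-risk ad, 0 = not.
train_texts = ["example ad text one", "example ad text two", "example ad text three"]
train_labels = [1, 0, 0]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(train_texts, train_labels)

def triage(incoming_ads, threshold=0.8):
    """Return (ad, risk_score) pairs an analyst should look at first."""
    scores = model.predict_proba(incoming_ads)[:, 1]
    flagged = [(ad, float(s)) for ad, s in zip(incoming_ads, scores) if s >= threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)

for ad, score in triage(["new ad text to score"]):
    print(f"review: score={score:.2f}  text={ad[:80]}")
```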
>> You know, it happens in a situation where, you know, you have online it says, you know, this girl's 18, and it's actually a 15-year-old girl who met a man who said he was 17, he was actually 30, had already been convicted of sex trafficking, and within 48 hours of meeting this girl, he had her up online for sale. So, that sounds like a unique incident. It is not unique, it happens every single day in almost every city and town across this country. And the work we're doing is to find those kids faster, and stop that trauma. >> Well, I just wanna say congratulations. That's great work. We had a CUBE alumni, founder of Cloudera, Jeff Hammerbacher, good friend of theCUBE. He had a famous quote that he said on theCUBE, then said on the Charlie Rose Show, "The best minds of our generation are thinking about how to make people click ads. That sucks." This is an example where you can really put the best minds on some of the really important things. >> Yeah, we love Jeff. I read that quote all the time. >> It's really a most important quote. Well, thanks so much. Congratulations, great inspiration, great story. Bob, thanks for coming on, appreciate it. CUBE live coverage here at AWS re:Invent 2017, kicking off day one of three days of wall-to-wall coverage here, live in Las Vegas. We'll be right back with more after this short break.
SENTIMENT ANALYSIS:
ENTITIES
Entity | Category | Confidence |
---|---|---|
Lisa Davis | PERSON | 0.99+ |
Lisa Theinai | PERSON | 0.99+ |
Julie | PERSON | 0.99+ |
Bob Rogers | PERSON | 0.99+ |
Theresa Carlson | PERSON | 0.99+ |
Jeff Hammerbacher | PERSON | 0.99+ |
Jeff | PERSON | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
John | PERSON | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Bob | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Julie Cordua | PERSON | 0.99+ |
John Furrier | PERSON | 0.99+ |
Intel | ORGANIZATION | 0.99+ |
millions | QUANTITY | 0.99+ |
Cloudera | ORGANIZATION | 0.99+ |
Justin | PERSON | 0.99+ |
Las Vegas | LOCATION | 0.99+ |
two images | QUANTITY | 0.99+ |
US | LOCATION | 0.99+ |
National Center for Missing and Exploited Children | ORGANIZATION | 0.99+ |
CUBE | ORGANIZATION | 0.99+ |
150,000 escort ads | QUANTITY | 0.99+ |
23 countries | QUANTITY | 0.99+ |
three days | QUANTITY | 0.99+ |
two products | QUANTITY | 0.99+ |
18 | QUANTITY | 0.99+ |
30 | QUANTITY | 0.99+ |
Thorn | ORGANIZATION | 0.99+ |
17 | QUANTITY | 0.99+ |
fifth year | QUANTITY | 0.99+ |
two different images | QUANTITY | 0.99+ |
theCUBE | ORGANIZATION | 0.99+ |
15-year-old | QUANTITY | 0.99+ |
One | QUANTITY | 0.98+ |
Thorn | PERSON | 0.98+ |
one | QUANTITY | 0.97+ |
48 hours | QUANTITY | 0.97+ |
five years ago | DATE | 0.97+ |
three piece | QUANTITY | 0.97+ |
over 6,000 children | QUANTITY | 0.97+ |
Amazon Web Service | ORGANIZATION | 0.97+ |
10 years ago | DATE | 0.97+ |
Charlie Rose Show | TITLE | 0.96+ |
South by Southwest | ORGANIZATION | 0.96+ |
two next guests | QUANTITY | 0.95+ |
last year | DATE | 0.94+ |
two other folks | QUANTITY | 0.94+ |
today | DATE | 0.94+ |
Spotlight | TITLE | 0.93+ |
day one | QUANTITY | 0.93+ |
one step | QUANTITY | 0.92+ |
Data Science: Present and Future | IBM Data Science For All
>> Announcer: Live from New York City, it's theCUBE, covering IBM Data Science for All. Brought to you by IBM. (light digital music) >> Welcome back to Data Science for All. It's a whole new game. And it is a whole new game. >> Dave Vellante, John Walls here. We've got quite a distinguished panel. So it is a new game-- >> Well we're in the game, I'm just happy to be-- (both laugh) Have a swing at the pitch. >> Well let's see what we have here. Five distinguished members of our panel. It'll take me a minute to get through the introductions, but believe me they're worth it. Jennifer Shin joins us. Jennifer's the founder of 8 Path Solutions, the director of data science at Comcast and part of the faculty at UC Berkeley and NYU. Jennifer, nice to have you with us, we appreciate the time. Joe McKendrick, an analyst and contributor to Forbes and ZDNet, Joe, thank you for being here as well. Another ZDNetter next to him, Dion Hinchcliffe, who is a vice president and principal analyst at Constellation Research and also contributes to ZDNet. Good to see you, sir. To the back row, but that doesn't mean anything about the quality of the participation here. Bob Hayes with a killer Batman shirt on by the way, which we'll get him to explain in just a little bit. He runs Business Over Broadway. And Joe Caserta, who is the founder of Caserta Concepts. Welcome to all of you. Thanks for taking the time to be with us. Jennifer, let me just begin with you. Obviously as a practitioner you're very involved in the industry, and you're on the academic side as well. We mentioned Berkeley, NYU, steep experience. So I want you to kind of draw on your footing in both worlds and tell me about data science. I mean where do we stand now from those two perspectives? How have we evolved to where we are? And how would you describe, I guess, the state of data science? >> Yeah so I think that's a really interesting question. There's a lot of changes happening. In part because data science has now become much more established, both on the academic side as well as in industry. So now you see some of the bigger problems coming out. People have managed to have data pipelines set up. But now there are these questions about models and accuracy and data integration. So the really cool stuff from the data science standpoint. We get to get really into the details of the data. And I think on the academic side you now see undergraduate programs, not just graduate programs, but undergraduate programs being involved. UC Berkeley just did a big initiative that they're going to offer data science to undergrads. So that's huge news for the university. So I think there's a lot of interest from the academic side to continue data science as a major, as a field. But I think in industry one of the difficulties you're now having is businesses are now asking that question of ROI, right? What do I actually get in return in the initial years? So I think there's a lot of work to be done and just a lot of opportunity. It's great because people now understand data science better, but I think data scientists have to really think about that seriously and take it seriously and really think about how am I actually getting a return, or adding value to the business? >> And there's a lot to be said, is there not, just in terms of increasing the workforce, the acumen, the training that's required now. It's still a relatively new discipline. So is there a shortage issue? Or is there just a great need? Is the opportunity there? I mean how would you look at that?
Well I always think there's opportunity to be smart. If you can be smarter, you know it's always better. It gives you advantages in the workplace, it gets you an advantage in academia. The question is, can you actually do the work? The work's really hard, right? You have to learn all these different disciplines, you have to be able to technically understand data. Then you have to understand it conceptually. You have to be able to model with it, you have to be able to explain it. There's a lot of aspects that you're not going to pick up overnight. So I think part of it is endurance. Like are people going to feel motivated enough and dedicate enough time to it to get very good at that skill set. And also of course, you know in terms of industry, will there be enough interest in the long term that there will be a financial motivation for people to keep staying in the field, right? So I think it's definitely a lot of opportunity. But that's always been there. Like I tell people I think of myself as a scientist and data science happens to be my day job. That's just the job title. But if you are a scientist and you work with data you'll always want to work with data. I think that's just an inherent need. It's kind of a compulsion, you just kind of can't help yourself, but dig a little bit deeper, ask the questions, you can't not think about it. So I think that will always exist. Whether or not it's an industry job in the way that we see it today, and like five years from now, or 10 years from now, I think that's something that's up for debate. >> So all of you have watched the evolution of data and how it affects organizations for a number of years now. If you go back to the days when the data warehouse was king, we had a lot of promises about 360 degree views of the customer and how we were going to be more anticipatory and more responsive. In many ways the decision support systems and the data warehousing world didn't live up to those promises. They solved other problems for sure. And so everybody was looking for big data to solve those problems. And they've begun to attack many of them. We talked earlier on theCUBE today about fraud detection, it's gotten much, much better. Certainly retargeting of advertising has gotten better. But I wonder if you could comment, you know maybe start with Joe, as to the effect that data and data science have had on organizations in terms of fulfilling that vision of a 360 degree view of customers and anticipating customer needs. >> So. Data warehousing, I wouldn't say failed. But I think it was unfinished in order to achieve what we need done today. At the time I think it did a pretty good job. I think it was the only place where we were able to collect data from all these different systems, have it in a single place for analytics. The big difference, I think, between data warehousing and data science is data warehouses were primarily made for consumption by human beings. To be able to have people look through some tool and be able to analyze data manually. That really doesn't work anymore, there's just too much data to do that. So that's why we need to build a science around it so that we can actually have machines actually doing the analytics for us. And I think that's the biggest stride in the evolution over the past couple of years, that now we're actually able to do that, right?
It used to be very, you know, you go back to when data warehouses started, you had to be a deep technologist in order to be able to collect the data, write the programs to clean the data. But now your average casual IT person can do that. Right now I think we're back in data science where you have to be a fairly sophisticated programmer, analyst, scientist, statistician, engineer, in order to do what we need to do, in order to make machines actually understand the data. But I think part of the evolution, we're just at the forefront. We're going to see over the next, not even years, within the next year I think a lot of new innovation where the average person within business and definitely the average person within IT will be able to say, "What are my sales going to be next year?" as easily as it is to say, "What were my sales last year?" Where now it's a big deal. Right now in order to do that you have to build some algorithms, you have to be a specialist in predictive analytics. And I think, you know, as the tools mature, as people's use of data matures, and as the technology ecosystem for data matures, it's going to be easier and more accessible. >> So it's still too hard. (laughs) That's something-- >> Joe C.: Today it is, yes. >> You've written about and talked about. >> Yeah no question about it. We see this citizen data scientist. You know we talked about the democratization of data science, but the way we talked about analytics and warehousing and all the tools we had before, they generated a lot of insights and views on the information, but they didn't really give us the science part. And that's, I think, what's missing: the forming of the hypothesis, the closing of the loop. We now have use of this data, but are we changing, are we thinking about it strategically? Are we learning from it and then feeding that back into the process? I think that's the big difference between data science and the analytics side. But, you know, just like Google made search available to everyone, not just people who had highly specialized indexers or crawlers, now we can have tools that make these capabilities available to anyone. You know, going back to what Joe said, I think the key thing is we now have tools that can look at all the data and ask all the questions. 'Cause we can't possibly do it all ourselves. Our organizations are increasingly awash in data, which is the life blood of our organizations, but we're not using it, you know, this whole concept of dark data. And so I think the concept, or the promise of opening these tools up for everyone to be able to access those insights and activate them, I think that, you know, that's where it's headed. >> This is kind of where the T shirt comes in, right? So Bob if you would, so you've got this Batman shirt on. We talked a little bit about it earlier, but it plays right into what Dion's talking about. About tools and, I don't want to spoil it, but you go ahead (laughs) and tell me about it. >> Right, so. Batman is a superhero, but he doesn't have any supernatural powers, right? He can't fly on his own, he can't become invisible on his own. But the thing is he has the utility belt and he has these tools he can use to help him solve problems. For example he has the bat ring when he's confronted with a building that he wants to get over, right? So he pulls it out and uses that. So as data professionals we have all these tools now that these vendors are making. We have IBM SPSS, we have Data Science Experience.
IBM Watson, that these data pros can now use as part of their utility belt and solve problems that they're confronted with. So if you're ever confronted with, like, a churn problem and you have somebody who has access to that data, they can put that into IBM Watson, ask a question, and it'll tell you what's the key driver of churn. So it's not that you have to be superhuman to be a data scientist, but these tools will help you solve certain problems and help your business go forward. >> Joe McKendrick, do you have a comment? >> Does that make the Batmobile the Watson? (everyone laughs) Analogy? >> I was just going to add that, you know, all of the billionaires in the world today, and none of them has decided to become Batman yet. It's very disappointing. >> Yeah. (Joe laughs) >> Go ahead Joe. >> And I just want to add some thoughts to our discussion about what happened with data warehousing. I think it's important to point out as well that data warehousing, as it existed, was fairly successful, but for larger companies. Data warehousing was a very expensive proposition, and it remains an expensive proposition. Something that's in the domain of the Fortune 500. But today's economy is based on a very entrepreneurial model. The Fortune 500s are out there of course, but it's ever shifting. But you have a lot of smaller companies, a lot of people with start ups. You have people within divisions of larger companies that want to innovate and not be tied to the corporate balance sheet. They want to be able to go through, they want to innovate and experiment without having to go through finance and the finance department. So there's all these open source tools available. There's cloud resources as well as open source tools. Hadoop of course being a prime example where you can work with the data and experiment with the data and practice data science at a very low cost. >> Dion mentioned the C word, citizen data scientist, last year at the panel. We had a conversation about that. And the data scientists on the panel generally were like, "Stop." Okay, we're not all of a sudden going to turn everybody into data scientists, however, what we want to do is get people thinking about data, more focused on data, becoming a data driven organization. I mean as a data scientist I wonder if you could comment on that. >> Well I think the other side of that is, you know, there are also many people who maybe didn't, you know, follow through with science, 'cause it's also expensive. A PhD takes a lot of time. And you know if you don't get funding it's a lot of money. And for very little security if you think about how hard it is to get a teaching job that's going to give you enough of a payoff to pay that back. Right, the time that you took off, the investment that you made. So I think the other side of that is by making data more accessible, you allow people who could have been great in science an opportunity to be great data scientists. And so I think for me the idea of citizen data scientist, that's where the opportunity is. I think in terms of democratizing data and making it available for everyone, I feel as though it's something similar to the way we didn't really know what KPIs were, maybe 20 years ago. People didn't use them as readily, didn't teach them in schools. I think maybe 10, 20 years from now, some of the things that we're building today from data science, hopefully more people will understand how to use these tools.
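Circling back to Bob's churn example at the top of this exchange: he frames it as a Watson question, but the underlying analysis, which inputs move churn the most, can be sketched with generic open-source tools as well. The code below is only that kind of illustration, not the Watson workflow he describes; the file name and column names are hypothetical and the features are assumed to be numeric.

```python
# Illustration only (not Watson): fit a simple churn model and rank which
# inputs move the prediction the most, via permutation importance.
import pandas as pd
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("churn.csv")                  # hypothetical extract
X = df.drop(columns=["churned"])               # numeric feature columns assumed
y = df["churned"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# How much does held-out accuracy drop when each column is shuffled?
result = permutation_importance(model, X_test, y_test, n_repeats=20, random_state=0)
ranked = sorted(zip(X.columns, result.importances_mean), key=lambda t: t[1], reverse=True)
for name, importance in ranked:
    print(f"{name:25s} {importance:+.3f}")
```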
They'll have a better understanding of working with data and what that means, and just data literacy right? Just being able to use these tools and be able to understand what data's saying and actually what it's not saying. Which is the thing that most people don't think about. But you can also say that data doesn't say anything. There's a lot of noise in it. There's too much noise to be able to say that there is a result. So I think that's the other side of it. So yeah I guess in terms for me, in terms of data a serious data scientist, I think it's a great idea to have that, right? But at the same time of course everyone kind of emphasized you don't want everyone out there going, "I can be a data scientist without education, "without statistics, without math," without understanding of how to implement the process. I've seen a lot of companies implement the same sort of process from 10, 20 years ago just on Hadoop instead of SQL. Right and it's very inefficient. And the only difference is that you can build more tables wrong than they could before. (everyone laughs) Which is I guess >> For less. it's an accomplishment and for less, it's cheaper, yeah. >> It is cheaper. >> Otherwise we're like I'm not a data scientist but I did stay at a Holiday Inn Express last night, right? >> Yeah. (panelists laugh) And there's like a little bit of pride that like they used 2,000, you know they used 2,000 computers to do it. Like a little bit of pride about that, but you know of course maybe not a great way to go. I think 20 years we couldn't do that, right? One computer was already an accomplishment to have that resource. So I think you have to think about the fact that if you're doing it wrong, you're going to just make that mistake bigger, which his also the other side of working with data. >> Sure, Bob. >> Yeah I have a comment about that. I've never liked the term citizen data scientist or citizen scientist. I get the point of it and I think employees within companies can help in the data analytics problem by maybe being a data collector or something. I mean I would never have just somebody become a scientist based on a few classes here she takes. It's like saying like, "Oh I'm going to be a citizen lawyer." And so you come to me with your legal problems, or a citizen surgeon. Like you need training to be good at something. You can't just be good at something just 'cause you want to be. >> John: Joe you wanted to say something too on that. >> Since we're in New York City I'd like to use the analogy of a real scientist versus a data scientist. So real scientist requires tools, right? And the tools are not new, like microscopes and a laboratory and a clean room. And these tools have evolved over years and years, and since we're in New York we could walk within a 10 block radius and buy any of those tools. It doesn't make us a scientist because we use those tools. I think with data, you know making, making the tools evolve and become easier to use, you know like Bob was saying, it doesn't make you a better data scientist, it just makes the data more accessible. You know we can go buy a microscope, we can go buy Hadoop, we can buy any kind of tool in a data ecosystem, but it doesn't really make you a scientist. I'm very involved in the NYU data science program and the Columbia data science program, like these kids are brilliant. You know these kids are not someone who is, you know just trying to run a day to day job, you know in corporate America. 
I think the people who are running the day to day job in corporate America are going to be the recipients of data science. Just like people who take drugs, right? As a result of a smart data scientist coming up with a formula that can help people, I think we're going to make it easier to distribute the data that can help people with all the new tools. But it doesn't really make it, you know the access to the data and tools available doesn't really make you a better data scientist. Without, like Bob was saying, without better training and education. >> So how-- I'm sorry, how do you then, if it's not for everybody, but yet I'm the user at the end of the day at my company and I've got these reams of data before me, how do you make it make better sense to me then? So that's where machine learning comes in or artificial intelligence and all this stuff. So how at the end of the day, Dion? How do you make it relevant and usable, actionable to somebody who might not be as practiced as you would like? >> I agree with Joe that many of us will be the recipients of data science. Just like you had to be a computer science at one point to develop programs for a computer, now we can get the programs. You don't need to be a computer scientist to get a lot of value out of our IT systems. The same thing's going to happen with data science. There's far more demand for data science than there ever could be produced by, you know having an ivory tower filled with data scientists. Which we need those guys, too, don't get me wrong. But we need to have, productize it and make it available in packages such that it can be consumed. The outputs and even some of the inputs can be provided by mere mortals, whether that's machine learning or artificial intelligence or bots that go off and run the hypotheses and select the algorithms maybe with some human help. We have to productize it. This is a constant of data scientist of service, which is becoming a thing now. It's, "I need this, I need this capability at scale. "I need it fast and I need it cheap." The commoditization of data science is going to happen. >> That goes back to what I was saying about, the recipient also of data science is also machines, right? Because I think the other thing that's happening now in the evolution of data is that, you know the data is, it's so tightly coupled. Back when you were talking about data warehousing you have all the business transactions then you take the data out of those systems, you put them in a warehouse for analysis, right? Maybe they'll make a decision to change that system at some point. Now the analytics platform and the business application is very tightly coupled. They become dependent upon one another. So you know people who are using the applications are now be able to take advantage of the insights of data analytics and data science, just through the app. Which never really existed before. >> I have one comment on that. You were talking about how do you get the end user more involved, well like we said earlier data science is not easy, right? As an end user, I encourage you to take a stats course, just a basic stats course, understanding what a mean is, variability, regression analysis, just basic stuff. So you as an end user can get more, or glean more insight from the reports that you're given, right? If you go to France and don't know French, then people can speak really slowly to you in French, you're not going to get it. 
You need to understand the language of data to get value from the technology we have available to us. >> Incidentally French is one of the languages that you have the option of learning if you're a mathematicians. So math PhDs are required to learn a second language. France being the country of algebra, that's one of the languages you could actually learn. Anyway tangent. But going back to the point. So statistics courses, definitely encourage it. I teach statistics. And one of the things that I'm finding as I go through the process of teaching it I'm actually bringing in my experience. And by bringing in my experience I'm actually kind of making the students think about the data differently. So the other thing people don't think about is the fact that like statisticians typically were expected to do, you know, just basic sort of tasks. In a sense that they're knowledge is specialized, right? But the day to day operations was they ran some data, you know they ran a test on some data, looked at the results, interpret the results based on what they were taught in school. They didn't develop that model a lot of times they just understand what the tests were saying, especially in the medical field. So when you when think about things like, we have words like population, census. Which is when you take data from every single, you have every single data point versus a sample, which is a subset. It's a very different story now that we're collecting faster than it used to be. It used to be the idea that you could collect information from everyone. Like it happens once every 10 years, we built that in. But nowadays you know, you know here about Facebook, for instance, I think they claimed earlier this year that their data was more accurate than the census data. So now there are these claims being made about which data source is more accurate. And I think the other side of this is now statisticians are expected to know data in a different way than they were before. So it's not just changing as a field in data science, but I think the sciences that are using data are also changing their fields as well. >> Dave: So is sampling dead? >> Well no, because-- >> Should it be? (laughs) >> Well if you're sampling wrong, yes. That's really the question. >> Okay. You know it's been said that the data doesn't lie, people do. Organizations are very political. Oftentimes you know, lies, damned lies and statistics, Benjamin Israeli. Are you seeing a change in the way in which organizations are using data in the context of the politics. So, some strong P&L manager say gets data and crafts it in a way that he or she can advance their agenda. Or they'll maybe attack a data set that is, probably should drive them in a different direction, but might be antithetical to their agenda. Are you seeing data, you know we talked about democratizing data, are you seeing that reduce the politics inside of organizations? >> So you know we've always used data to tell stories at the top level of an organization that's what it's all about. And I still see very much that no matter how much data science or, the access to the truth through looking at the numbers that story telling is still the political filter through which all that data still passes, right? But it's the advent of things like Block Chain, more and more corporate records and corporate information is going to end up in these open and shared repositories where there is not alternate truth. It'll come back to whoever tells the best stories at the end of the day. 
So I still see the organizations are very political. We are seeing now more open data though. Open data initiatives are a big thing, both in government and in the private sector. It is having an effect, but it's slow and steady. So that's what I see. >> Um, um, go ahead. >> I was just going to say as well. Ultimately I think data driven decision making is a great thing. And it's especially useful at the lower tiers of the organization where you have the routine day to day's decisions that could be automated through machine learning and deep learning. The algorithms can be improved on a constant basis. On the upper levels, you know that's why you pay executives the big bucks in the upper levels to make the strategic decisions. And data can help them, but ultimately, data, IT, technology alone will not create new markets, it will not drive new businesses, it's up to human beings to do that. The technology is the tool to help them make those decisions. But creating businesses, growing businesses, is very much a human activity. And that's something I don't see ever getting replaced. Technology might replace many other parts of the organization, but not that part. >> I tend to be a foolish optimist when it comes to this stuff. >> You do. (laughs) >> I do believe that data will make the world better. I do believe that data doesn't lie people lie. You know I think as we start, I'm already seeing trends in industries, all different industries where, you know conventional wisdom is starting to get trumped by analytics. You know I think it's still up to the human being today to ignore the facts and go with what they think in their gut and sometimes they win, sometimes they lose. But generally if they lose the data will tell them that they should have gone the other way. I think as we start relying more on data and trusting data through artificial intelligence, as we start making our lives a little bit easier, as we start using smart cars for safety, before replacement of humans. AS we start, you know, using data really and analytics and data science really as the bumpers, instead of the vehicle, eventually we're going to start to trust it as the vehicle itself. And then it's going to make lying a little bit harder. >> Okay, so great, excellent. Optimism, I love it. (John laughs) So I'm going to play devil's advocate here a little bit. There's a couple elephant in the room topics that I want to, to explore a little bit. >> Here it comes. >> There was an article today in Wired. And it was called, Why AI is Still Waiting for It's Ethics Transplant. And, I will just read a little segment from there. It says, new ethical frameworks for AI need to move beyond individual responsibility to hold powerful industrial, government and military interests accountable as they design and employ AI. When tech giants build AI products, too often user consent, privacy and transparency are overlooked in favor of frictionless functionality that supports profit driven business models based on aggregate data profiles. This is from Kate Crawford and Meredith Whittaker who founded AI Now. And they're calling for sort of, almost clinical trials on AI, if I could use that analogy. Before you go to market you've got to test the human impact, the social impact. Thoughts. >> And also have the ability for a human to intervene at some point in the process. This goes way back. Is everybody familiar with the name Stanislav Petrov? 
He's the Soviet officer who back in 1983, it was in the control room, I guess somewhere outside of Moscow in the control room, which detected a nuclear missile attack against the Soviet Union coming out of the United States. Ordinarily I think if this was an entirely AI driven process we wouldn't be sitting here right now talking about it. But this gentlemen looked at what was going on on the screen and, I'm sure he's accountable to his authorities in the Soviet Union. He probably got in a lot of trouble for this, but he decided to ignore the signals, ignore the data coming out of, from the Soviet satellites. And as it turned out, of course he was right. The Soviet satellites were seeing glints of the sun and they were interpreting those glints as missile launches. And I think that's a great example why, you know every situation of course doesn't mean the end of the world, (laughs) it was in this case. But it's a great example why there needs to be a human component, a human ability for human intervention at some point in the process. >> So other thoughts. I mean organizations are driving AI hard for profit. Best minds of our generation are trying to figure out how to get people to click on ads. Jeff Hammerbacher is famous for saying it. >> You can use data for a lot of things, data analytics, you can solve, you can cure cancer. You can make customers click on more ads. It depends on what you're goal is. But, there are ethical considerations we need to think about. When we have data that will have a racial bias against blacks and have them have higher prison sentences or so forth or worse credit scores, so forth. That has an impact on a broad group of people. And as a society we need to address that. And as scientists we need to consider how are we going to fix that problem? Cathy O'Neil in her book, Weapons of Math Destruction, excellent book, I highly recommend that your listeners read that book. And she talks about these issues about if AI, if algorithms have a widespread impact, if they adversely impact protected group. And I forget the last criteria, but like we need to really think about these things as a people, as a country. >> So always think the idea of ethics is interesting. So I had this conversation come up a lot of times when I talk to data scientists. I think as a concept, right as an idea, yes you want things to be ethical. The question I always pose to them is, "Well in the business setting "how are you actually going to do this?" 'Cause I find the most difficult thing working as a data scientist, is to be able to make the day to day decision of when someone says, "I don't like that number," how do you actually get around that. If that's the right data to be showing someone or if that's accurate. And say the business decides, "Well we don't like that number." Many people feel pressured to then change the data, change, or change what the data shows. So I think being able to educate people to be able to find ways to say what the data is saying, but not going past some line where it's a lie, where it's unethical. 'Cause you can also say what data doesn't say. You don't always have to say what the data does say. You can leave it as, "Here's what we do know, "but here's what we don't know." There's a don't know part that many people will omit when they talk about data. So I think, you know especially when it comes to things like AI it's tricky, right? Because I always tell people I don't know everyone thinks AI's going to be so amazing. 
I started in the industry by fixing problems with computers that people didn't realize computers had. For instance, when you have a system there are a lot of bugs; we all have bug reports that we've probably submitted. I mean really it's nowhere near the point where it's going to start dominating our lives and taking over all the jobs. Because frankly it's not that advanced. It's still run by people, still fixed by people, still managed by people. I think with ethics, you know, a lot of it has to do with the regulations, what the laws say. That's really going to be what's involved in terms of what people are willing to do. A lot of businesses, they want to make money. If there are no rules that say they can't do certain things to make money, then there's no restriction. I think the other thing to think about is we as consumers, like every day in our lives, we shouldn't separate the idea of data as a business, the way we think of it as business people, from our day-to-day consumer lives. Meaning, yes, I work with data. Incidentally, I also always opt out with my credit card, you know, when they send you that information, they make you actually mail them, like old-school mail, snail mail, a document that says, okay, I don't want to be part of this data collection process. Which I always do. It's a little bit more work, but I go through that step of doing that. Now if more people did that, perhaps companies would feel more incentivized to pay you for your data, or give you more control of your data. Or at least, you know, if a company's going to collect information, I'd want there to be certain processes in place to ensure that it doesn't just get sold, right? For instance, if a startup gets acquired, what happens with that data they have on you? You agreed to give it to the startup. But I mean what are the rules on that? So I think we have to really think about the ethics from not just, you know, someone who's going to implement something, but as consumers, what control we have over our own data. 'Cause that's going to directly impact what businesses can do with our data. >> You know you mentioned data collection. So slightly on that subject. All these great new capabilities we have coming. We talked about what's going to happen with media in the future and what 5G technology's going to do to mobile and these great bandwidth opportunities. The internet of things and the internet of everywhere. And all these great inputs, right? Do we have an arms race, like are we keeping up with the capabilities to make sense of all the new data that's going to be coming in? And how do those things square up in this? Because the potential is fantastic, right? But are we keeping up with the ability to make it make sense and to put it to use, Joe? >> So I think data ingestion and data integration are probably among the biggest challenges, especially as the world is starting to become more dependent on data. I think, you know, because we're dependent on numbers we've come up with GAAP, generally accepted accounting principles, which can be audited and proven true or false. I think in our lifetime we will see something similar to that, where we have formal checks and balances on the data that we use, which can be audited. Getting back to, you know, what Dave was saying earlier, I personally would rather trust a machine that was programmed to do the right thing than trust a politician or some leader that may have their own agenda. And I think the other thing about machines is that they are auditable.
You know, you can look at the code and see exactly what it's doing and how it's doing it. Human beings, not so much. So I think getting to the truth, even if the truth isn't the answer that we want, is a positive thing. It's something that we can't do today, but once we start relying on machines to do it, we'll be able to get there. >> Yeah, I was just going to add that we live in exponential times. And the challenge is that the way that we're structured traditionally as organizations is not allowing us to absorb advances exponentially; it's linear at best. Everyone talks about change management and how are we going to do digital transformation. Evidence shows that technology's forcing the leaders and the laggards apart. There are a few leading organizations that are eating the world and they seem to be somehow rolling out new things. I don't know how Amazon rolls out all this stuff. There's all this artificial intelligence and the IOT devices, Alexa, natural language processing, and that's just a fraction, it's just the tip of what they're releasing. So it just shows that there are some organizations that have found the path. Most of the Fortune 500 from the year 2000 are gone already, right? The disruption is happening. And so we are trying, we have to find some way to adopt these new capabilities and deploy them effectively, or the writing is on the wall. I spent a lot of time exploring this topic; how are we going to get there, and "all of us have a lot of hard work" is the short answer. >> I read that there's going to be more data, or it was predicted, more data created in this year than in the past, I think it was 5,000 years. >> Forever. (laughs) >> And then to mix in the statistics, we're currently analyzing less than 1% of the data. So taking those numbers and hearing what you're all saying, it's like we're not keeping up; it seems like it's not even linear. I mean that gap is just going to grow and grow and grow. How do we close that? >> There's a guy out there named Chris Dancy, he's known as the human cyborg. He has 700 sensors all over his body. And his theory is that data's not new, having access to the data is new. You know, we've always had blood pressure, we've always had a sugar level. But we were never able to actually capture it in real time before. So now that we can capture and harness it, now we can be smarter about it. So I think that being able to use this information is really incredible, like this is something that over our lifetime we've never had and now we can do it. Hence the big explosion in data. But I think how we use it and have it governed is the challenge right now. It's kind of cowboys and Indians out there right now. And without proper governance and without rigorous regulation I think we are going to have some bumps in the road along the way. >> The data is the new oil; the question is how are we actually going to operationalize around it? >> Or find it. Go ahead. >> I will say the other side of it is, so if you think about information, we always have the same amount of information, right? What we choose to record, however, is a different story. Now if you wanted to know things about the Olympics, but you decide to collect information every day for years instead of just the Olympic year, yes, you have a lot of data, but did you need all of that data? For that question about the Olympics, you don't need to collect data during years when there are no Olympics, right? Unless of course you're comparing it relative to those years.
But I think that's another thing to think about. Just 'cause you collect more data does not mean that data will produce more statistically significant results; it does not mean it'll improve your model. You can be collecting data about your shoe size trying to get information about your hair. I mean it really does depend on what you're trying to measure, what your goals are, and what the data's going to be used for. If you don't factor the real-world context into it, then yeah, you can collect data, you know, an infinite amount of data, but you'll never process it. Because you have no question to ask; you're not looking to model anything. There is no universal truth about everything, that just doesn't exist out there. >> I think she's spot on. It comes down to what kind of questions are you trying to ask of your data? You can have one given database that has 100 variables in it, right? And you can ask it five different questions, all valid questions, and that data may have the variables that'll tell you what's the best predictor of churn, what's the best predictor of cancer treatment outcome. And if you can ask the right question of the data you have, then that'll give you some insight. Just data for data's sake, that's just hype. We have a lot of data but it may not lead to anything if we don't ask it the right questions. >> Joe. >> I agree but I just want to add one thing. This is where the science in data science comes in. Scientists often will look at data that's already been in existence for years, weather forecasts, weather data, climate change data for example, going back to data charts and so forth from centuries ago if that data is available. And they reformat it, they reconfigure it, they get new uses out of it. And the potential I see with the data we're collecting is it may not be of use to us today, because we haven't thought of ways to use it, but maybe 10, 20, even 100 years from now someone's going to think of a way to leverage the data, to look at it in new ways and to come up with new ideas. That's just my thought on the science aspect. >> Knowing what you know about data science, why did Facebook miss Russia and the fake news trend? They came out and admitted it. You know, "we missed it," why? Could they have, is it because they were focused elsewhere? Could they have solved that problem? (crosstalk) >> It's what you said, which is are you asking the right questions, and if you're not looking for that problem in exactly the way that it occurred you might not be able to find it. >> I thought the ads were paid in rubles. Shouldn't that be your first clue (panelists laugh) that something's amiss? >> You know, red flag, so to speak. >> Yes. >> I mean Bitcoin, maybe it could have hidden it. >> Bob: Right, exactly. >> I would think too that what happened last year was actually the end of an age of optimism. I'll bring up the Soviet Union again, (chuckles). It collapsed back in 1990, 1991, and Russia was reborn then. And I think there was a general feeling of optimism in the '90s through the 2000s that Russia was now being well integrated into the world economy, as other nations all over the globe, all continents, were being integrated into the global economy thanks to technology. And technology is lifting entire continents out of poverty and ensuring more connectedness for people. Across Africa, India, Asia, we're seeing those economies, they're very different countries than 20 years ago, and that extended into Russia as well. Russia is part of the global economy.
We're able to communicate as a global network. I think as a result we kind of overlooked the dark side that occurred. >> John: Joe? >> Again, the foolish optimist here. But I think that... the question shouldn't be how did we miss it? It's do we have the ability now to catch it? And I think without data science, without machine learning, without being able to train machines to look for patterns that involve corruption or result in corruption, I think we'd be out of luck. But now we have those tools. And now hopefully, optimistically, by the next election we'll be able to detect these things before they become public. >> It's a loaded question, because my premise was Facebook had the ability and the tools and the knowledge and the data science expertise if in fact they wanted to solve that problem, but they were focused on other problems, which is how do I get people to click on ads? >> Right, they had the ability to train the machines, but they were giving the machines the wrong training. >> Looking under the wrong rock. >> (laughs) That's right. >> It is easy to play armchair quarterback. Another topic I wanted to ask the panel about is IBM Watson. You guys spend time in the Valley, I spend time in the Valley. People in the Valley poo-poo Watson. Ah, Google, Facebook, Amazon, they've got the best AI. Watson... and some of that's fair criticism. Watson's a heavy lift, very services-oriented, you've just got to apply it in a very focused way. At the same time Google's trying to get you to click on ads, as is Facebook; Amazon's trying to get you to buy stuff. IBM's trying to solve cancer. Your thoughts on that sort of juxtaposition of the different AI suppliers, and there may be others. Oh, nobody wants to touch this one, come on. I told you, elephant-in-the-room questions. >> Well, I mean you're looking at two different, very different types of organizations. One of which has really spent decades applying technology to business, and these other companies are ones that are primarily into the consumer, right? When we talk about things like IBM Watson you're looking at a very different type of solution. You used to be able to buy IT and once you installed it you pretty much could get it to work and store your records or, you know, do whatever it is you needed it to do. But a tool like Watson actually tries to learn your business. And it needs to spend time doing that, watching the data and having its models tuned. And so you don't get the results right away. And I think that's been kind of the challenge that organizations like IBM have had. Like, this is a different type of technology solution, one that has to actually learn first before it can provide value. And so I think, you know, you have organizations like IBM that are much better at applying technology to business, and then they have the further hurdle of having to try to apply these tools that work in very different ways. There's education too on the side of the buyer. >> I'd have to say that, you know, I think there are plenty of businesses out there also trying to solve very significant, meaningful problems. You know, with Microsoft AI and Google AI and IBM Watson, I think it's not really the tool that matters, like we were saying earlier. A fool with a tool is still a fool, regardless of who the manufacturer of that tool is. And I think, you know, having a thoughtful, intelligent, trained, educated data scientist using any of these tools can be equally effective.
>> So do you not see core AI competence, and I left out Microsoft, as a strategic advantage for these companies? Is it going to be so ubiquitous and available that virtually anybody can apply it? Or is all the investment in R&D and AI going to pay off for these guys? >> Yeah, so I think there are different levels of AI, right? So there's AI where you can actually improve the model. I remember when Watson was kind of first out, I was invited by IBM to a private sort of presentation. And my question was, "Okay, so when do I get "to access the corpus?" The corpus being sort of the foundation of NLP, which is natural language processing. So it's what you use as almost like a dictionary, like how you're actually going to measure things or line things up. And they said, "Oh you can't." "What do you mean I can't?" It's like, "We do that." "So you're telling me as a data scientist "you're expecting me to rely on the fact "that you did it better than me and I should rely on that." I think over the years after that IBM started opening it up and offering different ways of being able to access the corpus and work with that data. But I remember at the first Watson hackathon there were only two corpora available; it was either travel or medicine. There was no other foundational data available. So I think one of the difficulties was, you know, IBM being a little bit more on the forefront of it, they kind of had that burden of having to develop these systems and learning kind of the hard way that if you don't have the right models and you don't have the right data and you don't have the right access, that's going to be a huge limiter. I think with things like medical information, that's an extremely difficult data set to start with. Partly because, you know, anything that you do find or don't find, the impact is significant. If I'm looking at things like what people clicked on, the impact of using that data wrong is minimal. You might lose some money. If you do that with healthcare data, if you do that with medical data, people may die; this is a much more difficult data set to start with. So I think from a scientific standpoint it's great to have any information about a new technology, new process. That's the nice thing, that IBM's obviously invested in it and collected information. I think the difficulty there, though, is just 'cause you have it you can't solve everything. And I feel like, as someone who works in technology, I think in general when you appeal to developers you try not to market. And with Watson it's very heavily marketed, which tends to turn off people who are more from the technical side. Because I think they don't like it when it's gimmicky, in part because they do the opposite of that. They're always trying to build up the technical components of it. They don't like it when you're trying to convince them that you're selling them something when you could just give them the specs and let them look at it. So it could be something as simple as communication. But I do think it is valuable to have had a company lead on the forefront of that and try to do it, so we can actually learn from what IBM has learned from this process. >> But you're an optimist. (John laughs) All right, good. >> Just one more thought. >> Joe, go ahead first. >> Joe: I want to see how Alexa or Siri do on Jeopardy. (panelists laugh) >> All right. Going to go around for a final thought, give you a second. Let's just think about, like, your 12-month crystal ball.
In terms of either challenges that need to be met in the near term or opportunities you think will be realized, a 12-, 18-month horizon. Bob, you've got the microphone headed your way, so I'll let you lead off and let's just go around. >> I think a big challenge for business, for society, is getting people educated on data and analytics. There's a study that was just released, I think last month, by ServiceNow, I think, or some vendor, or Qlik. They found that only 17% of the employees in Europe have the ability to use data in their job. Think about that. >> 17. >> 17. Less than 20%. So these people don't have the ability to understand or use data intelligently to improve their work performance. That says a lot about the state we're in today. And that's Europe. It's probably a lot worse in the United States. So that's a big challenge, I think: to educate the masses. >> John: Joe. >> I think we probably have a better chance of improving technology over training people. I think using data needs to be iPhone-easy. And I think, you know, that means that a lot of innovation is in the years to come. I do think that a keyboard is going to be a thing of the past for the average user. We are going to start using voice a lot more. I think augmented reality is going to become a real reality. Where we can hold our phone in front of an object and it will have an overlay of prices and where it's available. If it's a person, I think that we will see, within an organization, holding a camera up to someone and being able to see what their salary is, what sales they did last year, some key performance indicators. I hope that we are beyond the days of everyone around the world walking around like this, and we start actually becoming more social as human beings through augmented reality. I think it has to happen. I think we're going through kind of foolish times at the moment in order to get to the greater good. And I think the greater good is using technology in a very, very smart way. Which means that you shouldn't have to be... sorry to contradict, but maybe it's good to counterpoint: I don't think you need to have a PhD in SQL to use data. Like, I think that's 1990. I think as we evolve it's going to become easier for the average person. Which means people like the brain trust here need to get smarter and start innovating. I think the innovation around data is really at the tip of the iceberg; we're going to see a lot more of it in the years to come. >> Dion, why don't you go ahead, then we'll come down the line here. >> Yeah, so I think over that time frame two things are likely to happen. One is somebody's going to crack the consumerization of machine learning and AI, such that it really is available to the masses and we can do much more advanced things than we could before. We see that industries tend to reach an inflection point and then there's an explosion. No one's quite cracked the code on how to really bring this to everyone, but somebody will. And that could happen in that time frame. And then the other thing that I think almost has to happen is that the forces for openness, open data, data sharing, open data initiatives, things like blockchain, are going to run headlong into data protection, data privacy, customer privacy laws and regulations that have to come down and protect us. Because the industry's not doing it, the government is stepping in, and it's going to re-silo a lot of our data.
It's going to make it recede and make it less accessible, making data science harder for a lot of the most meaningful types of activities. Patient data, for example, is already all locked down. We could do so much more with it, but health startups are really constrained about what they can do, 'cause they can't access the data. We can't even access our own health care records, right? So I think that's the challenge: we have to have that battle next to be able to go and take the next step. >> Well I see, with the growth of data, a lot of it's coming through IOT, the internet of things. I think that's a big source. And we're going to see a lot of innovation. New types of Ubers or Airbnbs. Uber's so 2013 though, right? We're going to see new companies with new ideas, new innovations; they're going to be looking at the ways this data can be leveraged, all this big data, or data coming in from the IOT. You know, there are some examples out there. There's a company, for example, that is outfitting tools, putting sensors in the tools. Industrial sites can therefore track where the tools are at any given time. This is an expensive, time-consuming process, constantly losing tools, trying to locate tools, assessing whether the tool's being applied to the production line or the right tool is at the right torque and so forth. With the sensors implanted in these tools, it's now possible to be more efficient. And there are going to be innovations like that. Maybe small startup-type things or smaller innovations. We're going to see a lot of new ideas and new types of approaches to handling all this data. There are going to be new business ideas. The next Uber, we may be hearing about it a year from now, whatever that may be. And that Uber is going to be applying data, probably IOT-type data, in some new, innovative way. >> Jennifer, final word. >> Yeah, so I think with data, you know, it's interesting, right; for one thing, I think one of the things that's made data more available, and just made people open to the idea, has been startups. But what's interesting about this is a lot of startups have been acquired. And a lot of people at startups that got acquired, these people now work at bigger corporations. Which is not the way it was maybe 10 years ago; data wasn't available and open, companies kept it very proprietary, you had to sign NDAs. It was only within the last 10 years that open source and all of those initiatives became much more popular, much more open, an acceptable sort of way to look at data. I think that what I'm kind of interested in seeing is what people do within the corporate environment. Right, 'cause they have resources. They have funding that startups don't have. And they have backing, right? Presumably if you're acquired you went in at a higher title in the corporate structure, whereas if you had started there you probably wouldn't be at that title at that point. So I think you have an opportunity where people who have done innovative things and have proven that they can build really cool stuff can now be in that corporate environment. I think part of it's going to be whether or not they can really adjust to sort of the corporate, you know, the corporate landscape, the politics of it or the bureaucracy. I think every organization has that. Being able to navigate that is a difficult thing, in part 'cause it's a human skill set, it's a people skill, it's a soft skill. It's not the same thing as just being able to code something and sell it.
So, you know, it's going to really come down to people. I think if people can figure out, for instance, what people want to buy, what people think, in general that's where the money comes from. You know, you make money 'cause someone gave you money. So if you can find a way to look at data, or even look at technology, and understand what people are doing, aren't doing, what they're happy about, unhappy about, there's always opportunity in collecting the data in that way and being able to leverage that. So you build cooler things, and offer things that haven't been thought of yet. So it's a very interesting time, I think, with the corporate resources available, if you can do that. You know, who knows what we'll have in like a year. >> I'll add one. >> Please. >> The majority of companies in the S&P 500 have a market cap that's greater than their revenue. The reason is 'cause they have IP related to data that's of value. But most of those companies, most companies, the vast majority of companies, don't have any way to measure the value of that data. There's no GAAP accounting standard. So they don't understand the value contribution of their data in terms of how it helps them monetize. Not the data itself necessarily, but how it contributes to the monetization of the company. And I think that's a big gap. If you don't understand the value of the data, that means you don't understand how to refine it, if data is the new oil, and how to protect it and secure it. So that to me is a big gap that needs to get closed before we can actually say we live in a data-driven world. >> So you're saying I've got an asset, I don't know if it's worth this or this. And they're missing that great opportunity. >> So they devolve to what they know best. >> Great discussion. Really, really enjoyed it; the time has flown by. Joe, if you get that augmented reality thing to work on the salary, point it toward that guy, not this guy, okay? (everyone laughs) It's much more impressive if you point it over there. But Joe, thank you; Dion, Joe, and Jennifer, and Batman, we appreciate it. And Bob Hayes, thanks for being with us. >> Thanks, you guys. >> Really enjoyed >> Great stuff. >> the conversation. >> And a reminder, coming up at the top of the hour, six o'clock Eastern time, IBMgo.com featuring the live keynote, which is being set up just about 50 feet from us right now. Nate Silver is one of the headliners there, John Thomas as well, or rather Rob Thomas. John Thomas we had on earlier on theCUBE. But a panel discussion as well coming up at six o'clock on IBMgo.com, six to 7:15. Be sure to join that live stream. That's it from theCUBE. We certainly appreciate the time. Glad to have you along here in New York. And until the next time, take care. (bright digital music)
Tricia Wang, Sudden Compass | IBM Data Science For All
>> Narrator: Live from New York City, it's theCUBE, covering IBM Data Science For All, brought to you by IBM. >> Welcome back here on theCUBE. We are live in New York continuing our coverage here for Data Science for All, where all things happen. Big things are happening. In fact, there's a huge event tonight I'm going to tell you about a little bit later on, but Tricia Wang, who is our next guest, is a part of that panel discussion that you'll want to tune in for live on ibmgo.com. 6 o'clock, but more on that a little bit later on. Along with Dave Vellante, John Walls here, and Tricia Wang now joins us. A first ever for us. How are you doing? >> Good. >> A global tech ethnographer. >> You said it correctly, yay! >> I learned a long time ago, when you're not sure, slow down. >> A plus already. >> Slow down and breathe. >> Slow down. >> You did a good job. Want to do it one more time? >> A global tech ethnographer. >> Tricia: Good job. >> Studying ethnography and putting ethnography into practice. How about that? >> Really great. >> That's taking on the challenge stretch. >> Now say it 10 times faster in a row. >> How about when we're done? Also co-founder of Sudden Compass. So first off, let's tell our viewers a little bit about Sudden Compass. Then I want to get into the ethnography and how that relates to tech. So let's go first off about Sudden Compass and the origins there. >> So Sudden Compass, we're a consulting firm based in New York City, and we help our partners embrace and understand the complexity of their customers. So wherever there's data and wherever there are people, we are there to help them make sure that they can understand their customers at the end of the day. And customers are really the most unpredictable, the most unknown, and the most difficult-to-quantify thing for any business. We see a lot of our partners really investing in big data and data science tools, and they're hiring the most amazing data scientists, but we saw them still struggling to make the right decisions; they still weren't getting their ROI, and they certainly weren't growing their customer base. And what we are helping them do is to say, "Look, you can't just rely only on data science. "You can't put it all into only the tool. "You have to think about how to operationalize that "and build a culture around it "and get the right skillsets in place, "and incorporate what we call the thick data, "which is the stuff that's very difficult to quantify, "the unknown, "and then you can figure out "how to best mathematically scale your data models "when it's actually based on real human behavior, "which is what the practice of ethnography is there to help, "is to help you understand what do humans actually do, "what is unquantifiable. "And then once you find out those unquantifiable bits "you then have the art and science of figuring out "how do you scale it into a data model." >> Yeah, see, that's what I find fascinating about this, is that you've got hard and fast, right, data, objective, black and white, very clear, and then you've got people, you know? We all react differently. We have different influences, and different biases, and prejudices, and all that stuff, aptitudes. So you are meshing this art and science. >> Tricia: Absolutely. >> And what is that telling you then about how best to advise your clients and how to use data (mumbles)? >> Well, we tell our clients that because people are, there are biases, and people are not objective and there's emotions, that all ends up in the data set.
To think that your data set, your quantitative data set, is free of biases and has somehow been scrubbed of emotion is a total fallacy, and it's something that needs to be corrected, because that means decision makers are making decisions based off of numbers thinking that they're objective when in fact they contain all the biases of the very complexity of the humans that they're serving. So, there is an art and science of making sure that when you capture that complexity ... We're saying, "Don't scrub it away." Traditional marketing wants to say, "Put your customers in boxes. "Put them in segments. "Use demographic variables like education, income. "Then you can just put everyone in a box, "figure out where you want to target, "figure out the right channels, "and you buy against that and you reach them." That's not how it works anymore. Customers now are moving faster than corporations. The new networked customer of today has multiple identities and is better understood in relationship to other people. And we're not saying get rid of the data science. We're saying absolutely have it. You need to have scale. What is thick data going to offer you? Not scale, but it will offer you depth. So, that's why you need to combine both to be able to make effective decisions. >> So, I presume you work with a lot of big consumer brands. Is that a safe assumption? >> Absolutely. >> Okay. So, we work with a lot of big tech brands, like IBM and others, and they tend to move at the speed of the CIO, which tends to be really slow and really risk-averse, and they're afraid to over-rotate and get out over their skis. What do you tell folks like that? Is that a mistake, being so cautious in this digital age? >> Well, I think the new CIO is on the cutting edge. I was just at Constellation Research Annual Conference in Half Moon Bay at-- >> Our friend Ray Wang. >> Yeah, Ray Wang. And I just spoke about this at their Constellation Connected Enterprise, where they had the most, I would have to say the most amazing forward-thinking collection of CIOs, CTOs, CDOs all in one room. And the conversation there was like, "We cannot afford to be slow anymore. "We have to be on the edge "of helping our companies push the ground." So, investing in tools is not enough. It is no longer enough to be the buyer, and to just have a relationship with your vendor and assume that they will help you deliver all the understanding. So, CIOs and CTOs need to ensure that their teams are diverse, multi-functional, and that they're totally integrated and embedded into the business. And I don't mean just involve a business analyst as if that's cutting edge. I'm saying, "No, you need to make sure that every team "has qualitative people, "and that they're embedded and working closely together." The problem is we don't teach these skills. We're not graduating data scientists or ethnographers who even want to talk to each other. In fact, each side thinks the other side is useless. We're saying, "No, "we need to be able to have these skills "being taught within companies." And you don't need to hire a PhD data scientist or a PhD ethnographer. What we're saying is that these skills can be taught. We need to teach people to be data literate. You've hired the right experts, you have bought the right tools, but we now need to make sure that we're creating data literacy among decision makers so that we can turn this data into insights and then into action. >> Let's peel that a little bit.
Data literate, you're talking about creativity, visualization, combining different perspectives? Where should the educational focus be? >> The educational focus should be, one, on storytelling. Right now, you cannot just be assuming that you can have a decision maker make a decision based on a number or some long PowerPoint report. We have to teach people how to tell compelling stories with data. And when I say data, I'm talking about it needs the human component and it needs the numbers. And so one of the things that I saw, and this is really close to my heart, was when I was at Nokia, and I remember I spent a decade understanding China. I really understood China. And when I finally had the insight, where I was like, "Look, after spending 10 years there, "following 100 to 200 families around, "I had the insight back in 2009 that look, "your company is about to go out of business because "people don't want to buy your feature phones anymore. "They're going to want to buy smartphones." But I only had qualitative data, and I needed to work alongside the business analysts and the data scientists. I needed access to their data sets, but I needed us to play together and to be on a team together so that I could scale my insights into quantitative models. And the problem was that, your question is, "What does that look like?" That looks like sitting on a team, having a mandate to say, "You have to play together, "and be able to tell an effective story "to the management and to leadership." But back then they were saying, "No, "we don't even consider your data set "to be worthwhile to even look at." >> We love our candy bar phone, right? It's a killer. >> Tricia: And we love our numbers. We love our surveys that tell us-- >> Market share was great. >> Market share is great. We've done all of the analysis. >> Forget the Razr. >> Exactly. I'm like, "Look, of course your market share was great, "because your surveys were optimized "for your existing business model." So, big data is great if you want to optimize your supply chain, or in systems that are very contained and quantifiable; that's more or less fine. You can get optimization. You can get that one to two to five percent. But if you really want to grow your company and you want to ensure its longevity, you cannot just rely on your quantitative data to tell you how to do that. You actually need thick data for discovery, because you need to find the unknown. >> One of the things you talk about, your passion, is to understand how human perspectives shape the technology we build and how we use it. >> Tricia: Yes, you're speaking my language. >> Okay, so when you think about the development of the iPhone, it wasn't a bunch of surveys that led Steve Jobs to develop the iPhone. I guess the question is does technology lead and shape human perspectives or do human perspectives shape technology? >> Well, it's a dialectical relationship. It's like does a hamburger ... Does the bun shape the burger or does the burger shape the bun? You would never think of asking someone who loves a hamburger that question, because they both shape each other. >> Okay. (laughing) >> So, it's symbiosis here, totally symbiotic. >> Surprise answer. You weren't expecting that. >> No, but it is kind of ... Okay, so you're saying it's not a chicken and egg, it's both. >> Absolutely. And the best companies are attuned to both. The best companies know that.
The most powerful companies of the 21st century are obsessed with their customers, and they're going to do a great job at leveraging human models to be scaled into data models, and that gap is going to be very, very narrow. With big data, we're going to see more AI or ML disasters when their data models are really far from their actual human models. That's how we get disasters like Tesco or Target, or even when Google misidentified black people as gorillas. It's because their model of their data was so far from the understanding of humans. And the best companies of the future are going to know how to close that gap, and that means they will have the thick data and big data closely integrated. >> Who's doing that today? It seems like there are no ethics in AI. People are aggressively pursuing AI for profit and not really thinking about the human impacts and the societal impacts. >> Let's look at IBM. They're doing it. I would say that some of the most innovative projects are happening at IBM with Watson, where people are using AI to solve meaningful social problems. I don't think that has to be-- >> Like IBM For Social Good. >> Exactly, but it's also, it's not just experimental. I think IBM is doing really great stuff using Watson to understand, identify skin cancer, or looking at the ways that people are using AI to understand eye diseases, things that you can do at scale. But also businesses are also figuring out how to use AI for actually doing better things. I think some of the most interesting ... We're going to see more examples of people using AI for solving meaningful social problems and making a profit at the same time. I think one really great example is WorkIt; they're using AI. They're actually working with Watson. Watson is who they hired to create their engine, where union workers can ask questions of Watson that they may not want to ask or that may be too costly to ask. So you can be like, "If I want to take one day off, "will this affect my contract or my job?" That's a very meaningful social problem that unions are now working with, and I think that's a really great example of how Watson is really pushing the edge to solve meaningful social problems at the same time. >> I worry sometimes that that's like the little device that you put in your car for the insurance company to see how you drive. >> How do you brake? How do you drive? >> Do people trust feeding that data to Watson because they're afraid Big Brother is watching? >> That's why we always have to have human intelligence working with machine intelligence. This idea of AI versus humans is a false binary, and I don't even know why we're engaging in those kinds of questions. We're not, clearly, but there are people who are talking about it as if it's one or the other, and I find it to be a total waste of time. It's like, clearly the best AI systems will be integrated with human intelligence, and we need the human training the data with machine learning systems. >> Alright, I'll play the yeah-but. >> You're going to play the what? >> Yeah but! >> Yeah but! (crosstalk) >> That machines are replacing humans in cognitive functions. You walk into an airport and there are kiosks. People are losing jobs. >> Right, no, that's real. >> So okay, so that's real. >> That is real. >> You agree with that. >> Job loss is real and job replacement is real. >> And I presume you agree that education is at least a part of the answer, and training people differently than-- >> Tricia: Absolutely.
>> Just straight reading, writing, and arithmetic, but thoughts on that. >> Well, what I mean is that, yes, AI is replacing jobs, but the fact that we're treating AI as some kind of rogue machine that is operating on its own without human guidance, that's not happening, and that's not happening right now, and that's not happening in application. And what is more meaningful to talk about is how do we make sure that humans are more involved with the machines, that we always have a human in the loop, and that they're always making sure that they're training in a way where it's bringing up these ethical questions that are very important, that you just raised. >> Right, well, and of course a lot of AI people would say it's about prediction and then automation. So think about some of the brands that you serve, consult with; don't they want the machines to make certain decisions for them so that they can affect an outcome? >> I think that people want machines to surface things that are very difficult for humans to do. So if a machine can efficiently surface, "here is a pattern that's going on," then that is very helpful. I think we have companies that are saying, "We can automate your decisions," but when you actually look at what they can automate, it's in very contained, quantifiable systems. It's around systems like their supply chain or logistics. But you really do not want your machine automating any decision when it really affects people, in particular your customers. >> Okay, so maybe changing the air pressure somewhere on a widget, that's fine, but not-- >> Right, but you still need someone checking that, because will that air pressure create some unintended consequences later on? There's always some kind of human oversight. >> So I was looking at your website, and I always look for, I'm intrigued by interesting, curious thoughts. >> Tricia: Okay, I have a crazy website. >> No, it's very good, but back in your favorite quotes, "Rather have a question I can't answer "than an answer I can't question." So, how do you bring that kind of no-fear-of-failure attitude to the boardroom, to people who have to make big leaps and big decisions and enter this digital transformative world? >> I think that a lot of companies are so fearful of what's going to happen next, and that fear can oftentimes corner them into asking small questions and acting small, where they're just asking how do we optimize something? That's really essentially what they're asking. "How do we optimize X? "How do we optimize this business?" What they're not really asking are the hard questions, the right questions, the discovery-level questions that are very difficult to answer, that no big data set can answer. And those are questions ... The questions about the unknown are the most difficult, but that's where you're going to get growth, because when something is unknown that means you either have not quantified it yet or you haven't found the relationship yet in your data set, and that's your competitive advantage. And that's where the boardroom really needs to set the mandate to say, "Look, I don't want you guys only answering "downstream, company-centric questions like, "'How do we optimize XYZ?'" which is still important to answer. We're saying you absolutely need to pay attention to that, but you also need to ask upstream, very customer-centric questions. And that's very difficult, because all day you're operating inside a company.
You have to then step outside of your shoes and leave the building and see the world from a customer's perspective, or even from a non-existing customer's perspective, which is even more difficult. >> The whole "know your customer" meme has taken off in a big way right now, but I do feel like the pendulum is swinging. Well, I'm sanguine toward AI. It seems to me that ... It used to be that brands had all the power. They had all the knowledge, they knew the pricing, and the consumers knew nothing. The Internet changed all that. I feel like digital transformation and all this AI is an attempt to create that asymmetry again, back in favor of the brand. I see people getting very aggressive toward, certainly you see this with Amazon; Amazon I think knows more about me than I know about myself. Should we be concerned about that, and who protects the consumer, or do maybe the benefits outweigh the risks there? >> I think that's such an important question you're asking, and it's totally important. A really great TED talk just went up by Zeynep Tufekci, where she talks about how the most brilliant data scientists, the most brilliant minds of our day, are working on ad tech platforms that are now being created to essentially do what Kenyatta Cheese calls advertising terrorism, which is that all of this data is being collected so that advertisers have this information about us that could be used to create the future forms of surveillance. And that's why we need organizations to ask the kind of questions that you did. So two organizations that I think are doing a really great job to look at: one is Data & Society. The founder is Danah Boyd. Based in New York City. This is where I'm an affiliate. And they have all these programs that really look at digital privacy, identity, ramifications of all these things we're looking at with AI systems. Really great set of researchers. And then Vint Cerf (mumbles) co-founded People-Centered Internet. And I think this is another organization that we really should be looking at; it's based on the West Coast, where they're also asking similar questions, like instead of just looking at the Internet as a one-to-one model, what is the Internet doing for communities, and how do we make sure we leverage the role of communities to protect what the original founders of the Internet created? >> Right, Danah Boyd, CUBE alum. Shout out to Jeff Hammerbacher, founder of Cloudera, the originator of the "greatest minds of my generation are trying to get people to click on ads" line. He quit Cloudera and now is working at Mount Sinai as an MD, amazing, trying to solve cancer. >> John: A lot of CUBE alums out there. >> Yeah. >> And now we have another one. >> Woo-hoo! >> Tricia, thank you for being with us. >> You're welcome. >> Fascinating stuff. >> Thanks for being on. >> It really is. >> Great questions. >> Nice to really just change the lens a little bit, look through it a different way. Tricia, by the way, is part of a panel tonight with Michael Li and Nir Kaldero, who we had earlier on theCUBE, 6 o'clock to 7:15 live on ibmgo.com. Nate Silver also joining the conversation, so be sure to tune in for that live tonight at 6 o'clock. Back with more of theCUBE though right after this. (techno music)
Austin Miller, Oracle Marketing Cloud - Oracle Modern Customer Experience #ModernCX - #theCUBE
>> Narrator: Live from Las Vegas, it's theCUBE, covering Oracle Modern Customer Experience 2017, brought to you by Oracle. (bright, lively music) >> Hello and welcome back to theCUBE's coverage of Oracle's Modern Customer Experience conference here at the Mandalay Bay in Las Vegas. I'm John Furrier with SiliconANGLE, theCUBE, with my co-host this week, Peter Burris, head of research at Wikibon.com, part of SiliconANGLE Media, and our next guest is Austin Miller, Product Marketing Director for Oracle Marketing Cloud. Welcome to theCUBE conversation. >> Thank you very much for having me. >> This coveted post-launch spot. >> Yeah, we have a lunch coma kicking in, but no, seriously, you have a really tough job because you're seeing the growth of the platform play, right, a really robust horizontal platform, and you got here through some really smart acquisitions, handled well and integrated; we covered that last year. You guys are seeing some nice tailwinds with some momentum, certainly around the expectations of what the customers want. >> Yeah, I think that one of the best things, when we start thinking about, to your point, product integration, it's also the way that we are talking to our customers about how they can use the products together. It's not really enough just to have maybe one talk to another, but unless we prove out the use cases, you don't get the utilization, and I think this year what we've really seen is getting those use cases to actually start getting some traction in the field. >> So this integrated marketing idea seems to be the reality that everyone wants. >> Where are we on that progress bar? Because this seems to be pretty much unanimous with customers; the question is how to get there, the journey, and the heroes that are going to drive it, the theme of the conference. But the reality is this digital transformation is forcing business change. >> Austin: Absolutely. >> And marketing is part of that digital fabric. >> I think that one of the most interesting things about this is, if you look at kind of the history of when the stacks started becoming actually part of the story, it was at a point where we didn't really necessarily even have the capabilities to do it. As a result, many marketers who thought they were maybe buying into a stack approach got a little bit burned. I think now we are actually at that place where that value is not only something that they can see inherently and say "oh, I'd like all these applications to talk together," but it's actually feasible, it's something that they're going to be able to use and can be optimistic about, frankly. >> Where are they getting burned, you mentioned that, from buying into a full stack of software for a point solution, is that kind of what you meant? >> No, I think that in the marketing realm, when you're talking to marketers, it is very easy to think about all the horrible things that they have to deal with on a daily basis, all these problems. And the reality is that oftentimes you've had to have this conversation with them that says, you know, there are not going to be easy answers to hard problems. There are usually hard answers to hard problems.
We can help alleviate some of that friction, especially when we start talking about data silos or things about interoperability, so being able to not just have integration, but pre-built function within these particular platforms, but realistically, it just wasn't something that we necessarily in the market in general were able to deliver on until somewhat recently. >> So, I am very happy that I heard you use the word "use cases," especially at a launch, because that's been one of the biggest challenges of both marketing technology when we think about big data, there's been such a focus on the technology, getting the technology right, and then the use cases and how it changed the way the business or the function did things, kind of either did or didn't happen. Talk about how a focus in use case is actually getting people to emphasize the outcomes, and how Oracle is helping people then turn that into technology decisions. >> This may sound almost counterintuitive, but in reality the way that use cases we see helping us the most is that it really helps spur about the organizational changes that we need in order to actually have some of this happen, 'cause it's very easy to say, "we have all this technology marketer and you should be using it all," but if you don't actually prove it out and how that's going to impact let's say the way that they're creating their marketing messages, on even a kind of not exciting basis, like how are you creating your emails, how are you creating your mobile messaging, how are you doing your website, and then start talking about those in actual use cases, it's very hard for people to organize their organizations around this kind of transformation. They need something tangible to hold onto. >> And the old way with putting things in buckets, >> Austin: Exactly. >> Right, so so hey we got one covered, move on to the next one ... >> Peter: Or by channels even. We got an email solution, or we got a web solution and as the customer moves amongst these different mechanisms, or engages differently with these mechanisms, the data then becomes, we've talked a lot about this, becomes the integration point, and that as you said affects a significant change on how folks think about organizing, but what do you think are going to be some of the big use cases if people are going to be ... you're providing advice and counsel to folks on the 2017. >> Yeah, so I think that talking about marketing-specific use cases is really important, especially when we start thinking about how am I using my first-party data that I may have within a particular channel. And I'm using that to contextually change the way I'm communicating to somebody on another channel. But if we kind of take that theme, and we think about let's not just expand it to marketing but let's really talk about customer experience, because as a customer, I go in-store, I go on email, I go on your mobile app, I don't view those as different things. That's just my experience with your brand. And even as we start getting to maybe some of the service things, am I calling a call center? The way that we're really thinking about marketing is not only bringing all this information across our traditional marketing channels, but how are we helping marketers drive organizational change beyond the traditional bounds of even their own marketing department into service, into sales, into on-store, because in reality that's where kind of the next step is. It's not just about, to your point, promotional emails. 
It's about how are we bringing this experience across the full spectrum. >> So it's really how is first-person data going to drive the role of marketer differently, the tasks of marketing as a consequence, and therefore how we institutionalize that work. >> Absolutely, and I think that you can see this in the investments that we've made in the ODC, Oracle Data Cloud. It's first step, let's start thinking about how we can start moving around on first-party data, that'll be a nice starting point, but then afterwards, how are we taking third-party data let's say from offline purchases, starting to incorporate that and that store's third-party data, 'cause then we really start getting to that simultaneously good experience or at least consistent experience across digital, across in-store, we start piecing together, but we really need to start at that baseline. >> A lot of people have been talking about the convergence of adtech and martech for years, and we had a CUBE alumni on our CUBE many years ago, when the Big Data movement started to happen, and he was a visionary, revolutionary kind of guy, Jeff Hammerbacher, the founder of Cloudera, who's now doing some pioneering work in New York City around science. He's since left Cloudera. But he said on theCUBE what really bothered him was some of the brightest minds in the industry were working on using data and put an ad in the right place. And he was being kind of critical of, use it for cooler things, but we look at what's happening on martech side, when you have customer experience, that same kind of principle of predictive thinking around how to use an asset can be applied to the customer journey, so now you bring up the question of A.I. If you broaden the scope of adtech and martech to say all things consumer, in any context, at any given time, you got to have an A.I. or machine learning approach to put the right thing at the right place at the right time that benefits the user >> Austin: It's not scalable. That's the reality of it. To you point, if you're going to start thinking about this across all these different channels, including advertising as well, the idea of being able to do these on a one-off basis, from a manual perspective, it's completely untenable, you're completely correct, but to that point, where you're talking about the best minds in the industry maybe dedicated to figuring out, "if I put a little target here, am I going to get somebody to click on that ad one time, or how am I placing it," that is very much the way that we were at the very beginning parts of marketing technology, where it was bash and blast messaging, how can we just kind of get the clicks and the engagement, and how do we send out >> John: spray and pray >> Exactly. And now I think that we are getting to a much more nuanced understanding of the way that we advertise because it's much more reliant on context, it's not just how can I get my stuff in front of somebody's eyeballs, it's how am I placing it when they're actually showing some sort of intention for maybe the products I already have. >> Adaptive intelligence is interesting to me because what that speaks to is, one, being adapted to a real time, not batch, spray and pray and the old methodology of database-driven things, no offense to the main database cache at Oracle, but it's a system of record, but now new systems of data are available, and that seems to be the key message here, that the customer experience is changing, multiple channels, that's omnichannel, there needs to be ... 
everyone's looking for the silver bullet. They think it's A.I., augmented intelligence or artificial intelligence. How do you see that product roadmap looking, because you're going to need to automate, you're going to need to use software differently to handle literally real time. >> Completely. I think that this is a really important distinction about the way that we view A.I. and how it factors into marketing technology and the way that I think a lot of people in the industry do. I think that once again this theme of there aren't easy answers to hard problems, it is very pleasant to think that I'm just going to have one product that's going to solve everything, from when I should send my next email, to if there's clean water in this particular area in a third-world country, and that's just something that maybe sounds nice, but it's not necessarily something that's actually tangible. The way that we view A.I. is it's something that's going to be embedded and actually built into each of these different functions so that we can do the mission-critical things on the actual practical level, and kind of make it real for marketers, make it something that's isn't just "oh, buy this and it will solve all your problems." >> So I'm going to ask you the question, the old adage, "Use the right tool for the right job, and if you're a hammer everything looks like a nail." A lot of people use email marketing that way, they're using it for notifications when in reality that's not the expectation of the consumer, some are building in a notification engine separate from email. All that stuff's kind of under the covers, in the weeds, but the bigger question to you is, I want to get your insight on this because you're talking to customers all the time, is as customers as you said need to change organizationally, they're essentially operationalizing this modern era of CX, customer experience, so it's a platform-based concept which pretty much everyone agrees on, but we're in the early innings of operationalizing this >> Austin: Oh yeah. >> So how do you see that evolving and what do you want customers to do to be set up properly if they're coming in for the first inning of their journey, or even if they're midstream with legacy stuff? >> I think that that's a really good perspective, because you don't want to necessarily force people to go through excruciating organizational change in preparation if we're in maybe the first inning, but it is really just about setting up the organization to adjust as realistically we get into the middle innings and into the later innings. And really the kind of beginning foundation of this is understanding that these arbitrary almost like tribal distinctions between who owns what channel, who's the email marketer, or who's the mobile person, they need to be broken down, and start thinking about things instead of these promotional blasts to your point, or even maybe reactionary notifications. How is this contributing to the number of times your brand is touching me in a day, or the way that I'm actually communicating, so I think that it's an interesting kind of perspective of how we were organizationally set up for that, but the short answer is that A.I. is going to fundamentally change the way that marketers are operating. It's not going to fundamentally change maybe everything that they're doing or it's not going to be replacing it. It's going to be a complementary role that they need to be ready to adjust to. >> So you are, you're in product, product management. 
>> Austin: Product marketing >> Product marketing. So you are at that interface between product and marketing, both moving more towards agile. How are you starting to use data differently and how would you advise folks like you in other businesses not selling software that might not have the same digital component today but might have a comparable digital component in the future, what would you tell them to do differently? >> So, I think that the first step is to actually have an honest assessment of what we have and what we don't have. I think that there's a lot of people who like to kind of close their eyes or maybe plug their ears and just sort of continue down the path of least resistance. >> Peter: Give me ... >> Oh, an honest assessment of what kind of data we do have today, what kind of data we might actually need, and then most importantly, is that actually feasible data to get. Because you can't >> you can wish it but you can't get it >> You can wave a magic wand and say these are the numbers that I need on this particular maybe interest level of these particular ... >> John: The fatal flaw is hoping that you're going to get data that you never get, or is ungettable. >> Or, this is really something that I think a lot, would resonate more with marketers is that we have now set up all these different points of interaction that are firehoses of data spraying it at me, I may be able to retroactively look at it and maybe garner some kind of insight, but there's just no real way for me to take that and make it actionable right away. It is a complete mess of data in a lot of these organizations. >> And that's where A.I. comes in. >> Austin: Absolutely. It's able to automate that, reaction ... >> Peter: Triage at a bare minimum. >> Correct >> So the first starts with data. What would be the second thing? >> So it's data, presume that you're going to need help on the triage and organizing that data. Is there a third thing? >> I would say that you're going down the right path with the steps there, but once again, we're all talking about these concepts that do require a great deal of specialization and a lot of actual understanding of the way we're dealing with data. So honest assessment is definitely that first part, but then do I have the actual people that I need in order to actually take action on this? Because it is a specialized kind of role that really hasn't traditionally been within marketing organizations. >> I know you guys have a big account-based, focus-account-based marketing, you know, doing all kinds of things, but I'm a person, I'm not a company, so that's a database saying "hey, what company do you work for?" And all the people who work for that company and their target list. I'm a person. I'm walking around, I've got a wearable, I might be doing a retail transaction, so the persona base seems to be the rage and seems to be the center and we heard from Mark Hurd's keynote, that's obviously his perspective and others as well so it's not like a secret, but how do you take it to the next level? An account base could help there too, but you need to organize around the person, and that seems to open up the identity question of okay, how do I know it's John? 
>> I think that goes beyond just personal taste, but into what does this person actually do at this company, because I can go in and give a headspinning presentation to maybe a C-level executive and say, "look at all this crazy stuff you can do," and meanwhile the guy who might be making the buying decision at the end of the table's looking at that and being like, "there's no way we can do that, we don't have the personnel to do that, there's no chance," and you have already dissension from the innards of the actual people who are making the buying decisions. The vision can't be so big that it resonates with no one. And you need to understand on a persona level what is actually resonated with them. 'Cause feasibility is a very important thing to our end user, and we need to actually incorporate that into our messaging, so it's not just so pie-in-the-sky visioning. >> I did a piece of research, sorry John, I did a piece of research a number of years ago that looked at the impact of selling mainly to the CIO. And if you sell successfully to the CIO, you can probably guarantee nine months additional time before the sale closes. >> Austin: Yeah. Because the CIO says "this is a great idea," and then everybody in the organization who's now responsible for doing it says "hold on, don't put this in my KPIs while I take a look at it and what it really means and blah blah blah. Don't make me responsible for this stuff." You just added nine months. >> Absolutely. I even have a very minute example for something that we rolled out. This was a great learning opportunity. Because we rolled out a feature called multi-variant testing. It's not important what exactly it is for the purposes of this, but basically it's the idea of you can take one email and eight versions of it, test it, and then send out the best one. Sounds great, right? I'm an executive, I'm like boy, I'm going to get every last ounce of revenue from my emails, I'm only going to send out the best content. If you don't pitch that right, the end user, all they hear is wait, the thing that I do one of, I have to create eight of now? Am I going to get to see my kids ever again? That's just the way you have to adjust ... >> And seven of 'em are going to be thrown away. I'm going to be called a failure. >> Exactly. So it's just not something that you can take for granted because marketers have a variety of different roles and a variety of firm responsibilities. >> And compound that with everything's going digital. >> Exactly. >> So (mumbles) Austin, great to have you on theCUBE. Spend the last minute though, I'd like you just to share for the last minute, what's the most important thing happening here at #ModernCX besides the simplicity of the messaging of modern era of customer expectations, experiences, all that's really awesome, but what should people know about that aren't here, watching. >> I'd just say that the one thing that at least resonates most with me, and this is once again coming from a product and sort of edging on marketing, is that the things that we've been talking about with not only A.I. but even just simple things like having systems that are communicating to each other, they're actually real and we're seeing that as real. You can actually see them working together in products and serving up experiences to customers that we're even doing now as part of the sales process and saying "hey, this is how you would actually do this," as opposed to just "here's our Chinese menu of different options. 
Pick what you want and then we can just kind of serve it up." Because I think that there's something that's very heartening to maybe marketers who have a little bit of, I don't know, doubt about whether or not this is real. It is real, it's here today, and we're able to execute on it. >> And that's the integration of a multi-product and technology solution. >> I would almost say that it's slightly different from that though, in terms of, it's not just integration of these pieces, it's integration that's pre-built, so we actually have it pre-built together and then we also have these tremendous, new, innovative features and functionality that are coming with those integrations. It's not just portability, it's actual use cases. >> Would you say that it's as real as the data? >> It's as real as the data. I think that that's ... >> If you have the data, then you can do what you need to do. >> That's a very, a very good point. >> Austin Miller, Product Marketing Director at Oracle Marketing Cloud. Thanks for sharing the data here on theCUBE where we're agile, agile marketing is the focus. I'm John Furrier, Peter Burris. More coverage from day one at Mandalay Bay for Oracle Modern Customer Experience show. We'll be right back with more after this short break. (bright, lively music)
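As a rough illustration of the multivariate ("multi-variant") email testing Miller describes above, one email, eight versions, test them on a slice of the list, then roll out the winner, here is a minimal sketch of the selection logic. The variant names, test-wave size, and simulated click-through rates are invented purely for illustration and are not Oracle Marketing Cloud APIs.

```python
import random

VARIANTS = [f"email_variant_{i}" for i in range(1, 9)]  # eight versions of one email

def send_test_wave(variant, test_size):
    # Stand-in for a real send to a small test audience; here we just
    # simulate the measured click-through rate for that variant.
    return random.uniform(0.01, 0.08)

def pick_winner(test_size_per_variant=10_000):
    # Send each variant to a test slice, measure, and keep the best performer.
    results = {v: send_test_wave(v, test_size_per_variant) for v in VARIANTS}
    return max(results, key=results.get)

winner = pick_winner()
print(f"rolling out {winner} to the rest of the list")
```

In practice the measurement step is the hard part (enough sends per variant to be statistically meaningful), which is exactly the workload concern Miller raises about asking one marketer to produce eight of everything.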
Day 1 Wrap - DataWorks Summit Europe 2017 - #DWS17 - #theCUBE
(Rhythm music) >> Narrator: Live from Munich, Germany, it's The Cube. Coverage, DataWorks Summit Europe 2017. Brought to you by Hortonworks. >> Okay, welcome back everyone. We are live in Munich, Germany for DataWorks 2017, formerly known as Hadoop Summit. This is The Cube special coverage of the Big Data world. I'm John Furrier with my co-host Dave Vellante. Two days of live coverage, day one wrapping up. Now, Dave, we're just kind of reviewing the scene here. First of all, Europe is a different vibe. But the game is still the same. It's about Big Data evolving from Hadoop to full open source penetration. Hortonworks is now public in the markets, Cloudera is now filing an S-1, and there's Neosoft, Talend, a variety of the other public companies, Alteryx. Hadoop is not dead, it's not dying. It certainly is going to have a position in the industry, but the Big Data conversation is front and center. And one thing that's striking to me is that in Europe, more than in North America, IoT is more centrally themed in this event. Europe is on the Internet of Things because of the manufacturing, the smart cities. So there's a lot of IoT happening here, and I think this is a big discovery. Certainly, the Hortonworks event is much more of a community event than Strata Hadoop, which is much more about making money and modernization. This show's got a lot more engagement, with real conversations and developer sessions. Very engaging audience. Well, yeah, it's Europe. So you've got a little bit different, smaller show than North America, but to me, IoT, the Internet of Things, is bringing together the cloud world with Big Data. That's the forcing function. And real-time data is the center of the action. I think it's going to be a continuing theme as we move forward.

>> So, in 2010 John, it was all about 'What is Hadoop?' The middle part of that decade was all about Hadoop's got to go into the enterprise. It's gone mainstream into the enterprise, and now it's sort of 'what's next?' Same wine, new bottle. But I will say this, Hadoop, as you pointed out, is not dead. And I liken it to the early web. Web 1.0, it was profound. It was a new paradigm. The profundity of Hadoop was that you could ship five megabytes of code to a petabyte of data. And that was the new model, and that's spawned, that's catalyzed the Big Data movement. That is with us now and it's entrenched, and now you're seeing layers of innovation on top of that.

>> Yeah, and I would just reiterate and reinforce that point by saying that Cloudera, the founders of this industry if you will with Hadoop, the first company to be commercially funded to do it, and Hortonworks, which came in after the fact out of Yahoo, came out of a web-scale world. So you have the cloud native DevOps culture: Amr Awadallah at Yahoo, Mike Olson, Jeff Hammerbacher, Christophe Bisciglia. These guys were hardcore large-scale data guys. Again, this is the continuation of the evolution, and I think nothing has changed in that regard, because those pioneers have set the stage for the commercialization, and now the conversation around operationalizing this cloud is big. And having Alan Nance, a practitioner, rock star, talking about radical deployments that can drop a billion dollars of cost savings to the bottom line. These are the kinds of conversations we're going to see more of. This is going to change the game from, you know, "Hey, I'm the CFO buyer" or "CIO doing IT", to an operational CEO, chief operating officer level conversation. That operational model of cloud is now coming into view the way ERP did in software, those kinds of megatrends, this is happening right now.

>> As we talk about the open ecosystem, the people who are going to make the real money on Big Data are the practitioners, those people applying it. We talked about Alan Nance's example of billion-dollar, half-a-billion-dollar cost-savings and revenue opportunities, that's where the money's being made. It's not being made, yet anyway, with these public companies. You're seeing it with Splunk, Tableau, now Cloudera, Hortonworks, MapR. Is MapR even here? >> Haven't seen 'em. >> No, I haven't seen MapR, they used to have a pretty prominent display at the show. >> You brought up a point I want to get back to. This relates to those guys, which is profitless prosperity. >> Yeah. >> A term used for open source. I think there's a trend happening and I can't put a finger on it, but I can kind of feel it. That is, the ecosystems of open source are now going to a dimension where they're not yet valued in the classic sense. Most people that build platforms value ecosystems, that's where developers came from. Developer ecosystems fuel open source. But if you look at enterprise, at transformations over the decades, you'd see the successful companies have ecosystems of channel partners, ecosystems of indirect sales if you will. We're seeing the formation, at least I can start seeing the formation, of an indirect engine of value creation, vis-à-vis this organic developer community where the people are building businesses and companies. Shaun Connolly pointed to fintech as an example, where these startups became financial services businesses that became fintech suppliers to the banks. They're not in the banking business per se, but they're becoming as important as banks 'cuz they're the providers in fintech, fintech being financial tech. So you're starting to see this ecosystem of not "channel partners" who resell my equipment or software in the classic sense as we know them, as they're called channel partners. But if this continues to develop, the thousand flowers blooming strategy, you could argue that Hortonworks is undervalued as a company because they're not realizing those gains yet, or those gains can't be measured. So if you're an MBA or an investment banker, you've got to be looking at the market saying, "Wow, is there a net present value to an ecosystem?" It begs the question, Dave. >> Dave: It's a great question John. >> This is wealth creation. A rising tide floats all boats, and in that rising tide there is an ecosystem value number. No one has their hands on that, no one's talked about that. That is the upshot in my mind, the silver lining to what some are saying is the consolidation of Hadoop. Some are saying Cloudera is going to get a huge haircut off their four point one billion dollar value. >> Dave: I think that's inevitable. >> Some say they may lose two to three billion in value in the IPO, post IPO, which would put them in line with Hortonworks based on the numbers. You know, is that good or bad? I don't think it's bad, because the value shifts to the ecosystem. Cloudera and Hortonworks both play in open source, so you can be glass half-empty on one hand, on the haircut upcoming for Cloudera, to saying, "No, the glass is half-full because it's a haircut in the short term maybe," if that happens. I mean, some said Pure Storage was going to get a haircut, they never really did, Dave.

>> Well, and I think that is a great point. Personally, I think, I've been sort of racking my brain, will this Big Data hype be realized? Like the internet. You remember the internet hyped up, then it crashed; no one wanted to own any of these companies. But it actually lived up to the hype. It actually exceeded the hype. >> You can get pet food online now, it's called Amazon. [Co-Hosts Chuckle Together] All the e-commerce played out. >> Right, e-commerce played out. But I think you're right. But everybody was expecting a similar type of cycle. "Oh, this will replace that." And that's not what's going to happen. What's going to happen is the ecosystem is going to create a flywheel effect, is really what you're saying. >> Jeff: Yes. >> And there will be huge valuations that emerge out of this. But today, the guys that we know and love, the Hortonworks, the Clouderas, et cetera, aren't really on the winners list, I mean some of their founders maybe are. But who are the winners? Maybe the customers, because they saw a big drop in cost. Apache's a big winner here. Wouldn't ya say? >> Yeah. >> Apache's looking pretty good, the Apache Foundation. I would say AWS is a pretty big winner. They're drafting off of this. How about Microsoft and IBM? I mean, I feel in a way IBM has sort of co-opted this Big Data meme, and said, "okay, cognitive." And layered all of its stuff on top of it. Bought The Weather Company, repositioned the company, now it hasn't translated into growth, but it certainly has profitability implications. >> IBM plays well here, I'll tell you why. They're very big in open source, so that's positive. Two, they have a huge track record and staff dealing with professional services in the enterprise. So if transformation is the journey conversation, IBM's right there. You can't ignore IBM on this one. Now, the stack might be different, but again, beauty is in the eye of the beholder, because it depends on what workloads you have. IBM is not going to leave you high and dry, 'cuz they have what you really need for what they can do with their customers. Where people are going to get blindsided, in my opinion, the IBMs and Oracles of the world, and even Microsoft, is what Alan Nance was talking about: the radical transformation around the operating model is going to force people to figure out when to start cannibalizing their own stacks. That's going to be a telltale sign for winners and losers in the big game. Because if IBM can shift quickly and co-opt the megatrends, make it their own, get out in front of that next wave as Pat Gelsinger would say, they could surf that wave and then tweak, and then get out in front. If they don't get behind that next wave, they're driftwood. It really is all about where you are in the spectrum, and analytics is one of those things in data where you've got to have a cohesive horizontal strategy. You got to be horizontally scalable with data. You got to make data freely available. You have to have an abstraction layer of software that will allow free movement of data across systems. That's the number one thing that comes out of seeing the Hortonworks data platform, for instance. Shaun Connolly called it 'connective tissue'. Cloudera is the same thing, they have to start figuring out ways to be better at the data across the horizontal view. Cloudera, like IBM, has an opportunity as well to get out in front of the next wave. I think you can see that with AI and machine learning, clearly they're going to go after that.

>> Just to finish off on the winners and losers, I mean, the other winner is the systems integrators who service these companies. But I like what you said about cannibalizing stacks as an indicator of what's happening. So let's talk about that. Oracle clearly is cannibalizing its stack, saying, "okay, we're going to move the red stack to the cloud, go." Microsoft has made that decision to do that. IBM? To a large degree is cannibalizing its stack. HP sold off its stack, said, "we don't want to cannibalize our stack, we want to sell and try to retool." >> So, your question, your point? >> So, haven't they already begun to do that, the big legacy companies? >> They're doing their tweaking the collet and mog, as an example. At Oracle Open World and IBM InterConnect, all the shows we cover, except for Amazon, 'cuz they're pure cloud, all are taking the unique differentiation approach to their own stuff. IBM is putting stuff that's related to IBM in their cloud. Oracle differentiates on their stack; for instance, I have no problem with Oracle because they have a huge database business. And you're high as a kite if you think Oracle's going to lose that database business when data is the number one asset in the world. What Oracle's doing, which I think is quite brilliant on Oracle's part, is saying, "hey, if you want to run on premise with hardware, we got Sun, and oh by the way, our database is the fastest on our stuff." Check. Win. "Oh, you want to move to the cloud? Come to the Oracle cloud, our database runs the fastest in our cloud," which is their stuff in the cloud. So if you're an Oracle customer you just can't lose there. So they created an inimitability around their own database. So does that mean they're going to win the new database war? Maybe not, but they can coexist as a system of record, so that's a win. Microsoft Office 365, tightly coupling that with Azure, is a brilliant move. Why wouldn't they do that? They're going to migrate their customer base to their own clouds. Oracle and Microsoft are going to migrate their customers to their own cloud. Differentiate and give their customers a gateway to the cloud. VMware is partnering with Amazon. Brilliant move, and they just sold vCloud Air, which we reported at SiliconANGLE last night, to a French company recently, so vCloud Air is gone. Now that puts VMware clearly in bed with Amazon Web Services. Great move for VMware, benefit to AWS, that's a differentiation for VMware. >> Dave: Somebody bought vCloud Air? >> I think you missed that last night 'cuz you were traveling. >> Chuckling: That's tongue-in-cheek, I mean what did they get for vCloud Air? >> OVH bought them, French company. >> More de-levering by Michael. >> Well, they're inter-clouding right? I mean de-leveraging the focus, right? So OVH, French company, has a very much coexistence... >> What'd they pay? >> ... strategy. It's undisclosed. >> Yeah, well why? 'Cuz it wasn't a big number. That's my point. >> Back to the other cloud players, Google. I think Google's differentiating on their technology. Great move, smart move. They just got to get, as someone who's been following them, and you know, you and I both love an enterprise experience, they got to speak the enterprise language and execute the language. Not through 19-year-olds and interns or recent smart college grads, and say, "we're instantly enterprise." There are diseconomies of scale for trying to ramp up and trying to be too heavy on the enterprise. Amazon's got the same problem, you can't hire sales guys fast enough, and oh by the way, find me a sales guy that has ten, 15 years of executive selling experience in complex strategic sales, like the enterprise, where you now have stakeholders that are in multiple roles and changing roles, as Alan Nance pointed out. So the enterprise game is very difficult. >> Yup. >> Very, very difficult.

>> Well, I think these Hadoop startups are seeing that. None of them are making money. Shaun Connolly basically said, "hey, it used to be growth, they would pay for growth, but now they're punishing you if you don't have growth plus profitability." By the way, that's not all totally true. Amazon makes no money, unless stock prices go through the roof. >> There is no self-service, there is no self-service business model for digital transformation for enterprise customers today. It doesn't exist. The value proposition doesn't resonate with customers. It works well for shadow IT, and if you want to roll out G Suite in some pockets of your organization, but an AdSense sales force doesn't work in the enterprise. Everyone's finding that out right now because they're basically transforming their enterprise. >> I think Google's going to solve their problem. I think Google has to solve their problem 'cuz... >> I think they will, but to me it's, buy a company, there's a zillion companies out there they could buy tomorrow that are private, that have like 300 salespeople that are senior people. Pay the bucks, buy a sales force, roll your stuff out and start speaking the language. I think Diane Greene gets this. So, I think, I expect to see Google... >> Dave: Totally. >> do some things in that area. >> And I think, to your point, I've always said the rich get richer. The traditional legacy companies, they're holding serve in this. They waited, they waited, they waited, and they said, "okay, now we're going to go put our chips on the table." Oracle made its bets. IBM made its bets. HP, not really, betting on hardware. Okay. Fine. Cisco, Microsoft, they're all making their bets. >> It's all about bets on technology and profitability. This is what I'm looking at right now, Dave. We talked about it on our intro. Shaun Connolly, who's in charge of strategy at Hortonworks, clarified that clearly revenue while losing money is not going to solve the credibility problem. Profitability matters. This comes back to the point we've made on The Cube multiple years ago and even just as recently as last year, that the world's flipping back down to credibility. Customers in the enterprise want to see credibility and track record. And they're going to evaluate the suppliers based upon key fundamentals in their business. Can they make money? Can they deliver SLAs? These are going to be key requirements, not the shiny new toy from Silicon Valley, or the cool machine learning algorithm. It has to apply to their product, their value, and they're going to look to companies on the scoreboard and say, "are you profitable?" as a proxy for relevance. >> Well, I want to wrap it up, but I do want to say, we've been kind of critical of some of the Hadoop players, Cloudera and Hortonworks specifically. But I want to give them props, 'cuz you remember well John, when the legacy enterprise guys started coming into the Hadoop market they all had the same messaging, "we're going to make Hadoop enterprise ready." You remember that well, and I have to say that Hortonworks, Cloudera, I would say MapR as well, and the ecosystem, have done a pretty good job of making Hadoop and Big Data enterprise ready. They were already working on it very hard, I think they took it seriously, and I think that that's why they are in the mix and they are growing as they are. Shaun Connolly talked about them being operating cashflow positive. Eking out some plus cash. On the next earnings call, pressure's on. But we want to see, you know, rocket ships. >> I think they've done a good job, I mean, I don't think anyone's been asleep at the switch. At all, enterprise ready. The question's always been, "can they get there fast enough?" I think everyone's recognized that cost of ownership's down. We still solicit on the OpenStack ecosystem, and that they move right from the valley properties. So we'll keep an eye on it, tomorrow we'll be checking in. We got a great day tomorrow. Live coverage here in Munich, Germany for DataWorks 2017. More coverage tomorrow, stay with us. I'm John Furrier with Dave Vellante. Be right back with more tomorrow, day two. Keep following us.
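As a side note on the "is there a net present value to an ecosystem?" question raised in this segment, net present value is just future cash flows discounted back to today; the hard part the hosts point at is that nobody has credible cash-flow estimates for an open source ecosystem. A minimal sketch of the arithmetic, with invented numbers:

```python
def npv(rate, cash_flows):
    # cash_flows[0] is today (year 0), cash_flows[1] is one year out, and so on.
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

# An ecosystem that costs 50 (in whatever units) to seed today and is expected
# to return 20, 30 and 40 over the next three years, discounted at 10%:
print(round(npv(0.10, [-50, 20, 30, 40]), 1))  # ~23.0
```

The formula is standard; the invented part is every input, which is exactly the valuation problem being discussed in the conversation.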
Chandra Mukhyala, IBM - DataWorks Summit Europe 2017 - #DW17 - #theCUBE
>> Narrator: theCUBE covering, DataWorks Summit Europe 2017. Brought to you by Hortonworks. >> Welcome back to the DataWorks Summit in Munich everybody. This is The Cube, the leader in live tech coverage. Chandra Mukhyala is here. He's the offering manager for IBM Storage. Chandra, good to see you. It always comes back to storage. >> It does, it's the foundation. We're here at a Data Show, and you got to put the data somewhere. How's the show going? What are you guys doing here? >> The show's going good. We have lots of participation. I didn't expect this big a crowd, but there is good crowd. Storage, people don't look at it as the most sexy thing but I still see a lot of people coming and asking. "What do you have to do with Hadoop?" kind of questions which is exactly the kind of question I expect. So, going good, we're able to-- >> It's interesting, in the early days of Hadoop and big data, I remember we interviewed, John and I interviewed Jeff Hammerbacher, founder of Cloudera and he was at Facebook and he said, "My whole goal at Facebook "when we're working with Hadoop was to "eliminate the storage container "and the expensive storage container." They succeeded, but now you see guys like you coming in and saying, "Hey, we have better storage." Why does the world need anything different than HDFS? >> This has been happening for the last two decades, right? In storage, every few years a startup comes, they address one problem very well. They address one problem and create a whole storage solution around that. Everybody understands the benefit of it and that becomes part of the main storage. When I say main storage, because these new point solutions address one problem but what about all the rest of the features storage has been developing for decades. Same thing happened with other solutions, for example, deduplication. Very popular, right at one point, dedupe appliances. Nowadays, every storage solution has dedupe in. I think same thing with HDFS right? HDFS's purpose is built for Hadoop. It solves that problem in terms of giving local access storage, scalable storage, big plural storage. But, it's missing out many things you know. One of the biggest problems they have with HDFS is it's siloed storage, meaning that data is only available, the data in HDFS is only for Hadoop. You can't, what about the rest of the applications in the organizations, who may need it through traditional protocols like NFS, or SMB or they maybe need it through new applications like S3 interfaces or Swift interfaces. So, you don't want that siloed storage. That's one of the biggest problems we have. >> So, you're putting forth a vision of some kind horizontal infrastructure that can be leveraged across your application portfolio... >> Chandra: Yes. >> How common is that? And what's the value of that? >> It's not really common, that's one of the stories, messages we're trying to get out. And I've been talking to data scientists in the last one year, a lot of them. One of the first things they do when they are implementing a Hadoop project is, they have to copy a lot data into HDFS Because before they could enter it just as HDFS they can't on any set. That copy process takes days. >> Dave: That's a big move, yeah. >> It's not only wasting time from a data scientist, but it also makes the data stale. I tell them you don't have to do that if your data was on something like IBM Spectrum Scale. You can run Hadoop straight off that, why do you even have to copy into HDFS. 
You can use the same existing applications map, and just applications with zero change to it and pour in them at Spectrum Scale it can still use the HSFS API. You don't have to copy that. And every data scientists I talk to is like, "Really?" "I don't know how to do this, I'm wasting time?" Yes. So, it's not very well known that, you know, most people think that there's only one way to do Hadoop applications, in sometimes HDFS. You don't have to. And advantages there is, one, you don't have to copy, you can share the data with the rest of the applications but its no more stale data. But also, one other big difference between the HDFS type of storage versus shared storages. In the shared, which is what HDFS is, the various scale is by adding new nodes, which adds both compute and storage. What if our applications, which don't necessarily need need more compute, all they need is more throughput. You're wasting computer resources, right? So there are certain applications where a share nothing is a better architecture. Now the solution which IBM has, will allow you to deploy it in either way. Share nothing or shared storage but that's one of the main reasons, people want to, data scientists especially, want to look at these alternative solutions for storage. >> So when I go back to my Hammerbacher example, it worked for a Facebook of the early days because they didn't have a bunch of legacy data hanging around, they could start with, pretty much, a blank piece of paper. >> Yes. >> Re-architect, plus they had such scale, they probably said, "Okay, we don't want to go to EMC "and NetApp or IBM, or whomever and buy storage, "we want to use commodity components." Not every enterprise can do that, is what you're saying. >> Yes, exactly. It's probably okay for somebody like a very large search engine, when all they're doing is analytics, nothing else. But if you to any large commercial enterprise, they have lots of, the whole point around analytics is they want to pool all of the data and look at that. So, find the correlations, right? It's not about analyzing one small, one dataset from one business function. It's about pooling everything together and see what insights can I get out of it. So that's one of the reasons it's very important to have support to access the data for your legacy enterprise applications, too, right? Yeah, so NFS and SMB are pretty important, so are S3 and Swift, but also for these analytics applications, one of the advantage of IBM Solution here is we provide local access for file system. Not necessarily through mass protocols like an access, we do that, but we also have PO SIX access to have data local access to the file system. With that, HDFS you have to first copy the file into HDFS, you had to bring it back to do anything with that. All those copy operations go away. And this is important, again in enterprise, not just for data sharing but also to get local access. >> You're saying your system is Hadoop ready. >> Chandra: It is. >> Okay. And then, the other thing you hear a lot from IT practitioners anyway, not so much from from the line of businesses, that when people spin up these Hadoop projects, big data projects, they go outside of the edicts of the organization in terms of governance and compliance, and often, security. How do you solve, do you solve that problem? >> Yeah, that's one of the reason to consider again, the enterprise storage, right? 
It's not just because you have, you're able to share the data with rest of applications, but also the whole bunch of data management features, including data governance features. You can talk about encryption there, you can talk about auditing there, you can talk about features like WAN, right, WAN, so data is, especially archival data, once you write you can't modify that. There are a whole bunch of features around data retention, data governance, those are all part of the data management stack we have. You get that for free. You not only get universal access, unified access, but you also get data governance. >> So is this one of the situations where, on the face of it, when you look at the CapEx, you say, "Oh, wow, I cause use commodity components, save a bunch of money." You know, you remember the client server days. "Oh, wow, cheap, cheap, cheep, "microprocessor based solution," and then all the sudden, people realize we have to manage this. Have we seen a similar sort of trend with Hadoop, with the ability to or the complexity of managing all of this infrastructure? It's so high than it actually drives costs up. >> Actually there are two parts to it, right? There is actually value in utilizing commodity hardware, industry standards. That does reduce your costs right? If you can just buy a standard XL6 server we can, a storage server and utilize that, why not. That is kind of just because. But the real value in any kind of a storage data manage solution is in the software stack. Now you can reduce CapEx by using industry standards. It's a good thing to do and we should, and we support that but in the end, the data management is there in the software stack. What I'm saying is HDFS is solving one problem by dismissing the whole data management problems, which we just touched on. And that all comes in software which goes down under service. >> Well, and you know, it's funny, I've been saying for years, that if you peel back the onion on any storage device, the vast majority anyway, they're all based on standard components. It's the software that you're paying for. So it's sort of artificial in that a company like IBM will say, "Okay, we've got all this value in here, "but it's on top of commodity components, "we're going to charge for the value." >> Right. >> And so if you strip that out, sure, you do it yourself. >> Yeah, exactly. And it's all standard service. It's been like that always. Now one difference is ten years ago people used propriety array controllers. Now all of the functionalities coming into software-- >> ASICs, >> Recording. >> Yeah, 3PAR still has an ASIC, but most don't. >> Right, that's funny, they only come in like.. Almost everybody has some kind of a software-based recording and they're able to utilize sharing server. Now the reason advantage in appliance more over, because, yes it can run on industry's standard, but this is storage, this is where, that's a foundation of all of your inter sectors. And you want RAS, or you want reliability and availability. The only way to get that is a fully integrated, tight solution, where you're doing a lot of testing on the software and the hardware. Yes, it's supposed to work, but what really happens when it fails, how does the sub react. And that's where I think there is still a value for integrated systems. If you're a large customer, you have a lot of storage saving, source of the administrators and they know to build solutions and validate it. Yes, software based storage is the right answer for you. 
And you're the offering manager for Spectrum Scale, which is the file offering, right, that's right? >> Yes, right yes. >> And it includes object as well, or-- >> Spectrum Sale is a file and object storage pack. It supports both file and protocols. It also supports object protocols. The thing about object storage is it means different things to different people. To some people, it's the object interface. >> Yeah, to me it means get put. >> Yeah, that's what the definition is, then it is objectivity. But the fact is that everybody's supposed to stay in now. But to some of the people, it's not about the protocol, because they're going to still access by finding those protocols, but to them, it's about the object store, which means it's a flat name space and there's no hierarchical name structure, and you can get into billions of finites without having any scalable issues. That's an object store. But to some other people it's neither of those, it's about a range of coding which object storage, so it's cheap storage. It allows you to run on storage and service, and you get cheap storage. So it's three different things. So if you're talking about protocols yes, but their skill is by their definition is object storage, also. >> So in thinking about, well let's start with Spectrum Scale generally. But specifically, your angle in big data and Hadoop, and we talked about that a little bit, but what are you guys doing here, what are you showing, what's your partership with Hortonworks. Maybe talk about that a little bit. >> So we've been supporting this, what we call as Hadoop connector on Spectrum Scale for almost a year now, which is allowing our existing Spectrum Scale customers to run Hadoop straight on it. But if you look at the Hadoop distributions, there are two or three major ones, right? Cloudera, Hortonworks, maybe MapArt. One of the first questions we get is, we tell our customers you can run Hadoop on this. "Oh, is this supported by my distribution?" So that has been a problem. So what we announced is, we found a partnership with Hortonworks, so now Hortonwords is certifying IBM Spectrum Scale. It's not new code changes, it's not new features, but it's a validation and a stamp from Hortonworks, that's in the process. The result of is, Hortonworks certified reference architecture, which is what we announced. We announced it about a month ago. We should be publishing that soon. Now customers can have more confidence in the joint solutions. It's not just IBM saying that it's Hadoop ready, but it's Hortonworks backing that up. >> Okay, and your scope, correct me if I'm wrong, is sort of on prem and hybrid, >> Chandra: Yes. >> Not cloud services. That's kind of you might sell your technology internally, but-- >> Correct so IBM storage is primarily focused on on prem storage. We do have a separate cloud division, but almost every IBM storage production, especially Spectrum Scale, is what I can speak of, we treat them as hybrid cloud storage. What we mean that is we have built in capabilities, we have feature. Most of our products call transfer in cloud tiering, it allows you to set a policy on when data should be automatically tiered to the cloud. Everybody wants public, everybody wants on prem. Obviously there are pros and cons of on primary storage, versus off primary storage, but basially, it boils down to, if you want performance and security, you want to be on premises. 
But there's always some data which is better off in the cloud, and we try to automate that with our feature called transparent cloud tiering. You set a policy based on age, based on the type of data, based on ownership. The system will automatically tier the data to the cloud, and when a user accesses that data, it comes back automatically, too. It's all transparent to the end user. So yes, we're an on-premises storage business, but our solutions are hybrid cloud storage. >> So, as somebody who knows the file business pretty well, let's talk about the file business and sort of where it's headed. There are some mega trends and dislocations. There's obviously software defined; you guys made a big investment in software defined a year and a half, two years ago. There's cloud; Amazon with S3 sort of shook up the world. I mean, at first it was sort of small, but now it's really catching on. Object obviously fits in there. What do you see as the future of file? >> That's a great question. When it comes to data layout, there's really block, file, or object. Software defined and cloud are various ways of consuming storage. If you're a large shop, you would probably prefer a software-based solution so you can run it on your existing servers. Which way an organization prefers to consume it depends on its preferences: depending on how concerned they are about security and what their performance needs are, they will prefer to run some of the applications in the cloud. Those are different ways of consuming storage. But coming back to file and object, right? Object is perfect if you are not going to modify the data. You're done writing that data, and you're not going to change it. It just belongs in an object store, right? It's more scalable storage, and I say scalable because file systems are hierarchical in nature. Because it's a file system tree, you have to traverse the various subtrees, and beyond a few million entries, it slows you down. But file systems have a strength. When you want to modify the file, any application which is going to edit the file, which is going to modify the file, that application belongs on file storage, not on object. But let's say you are dealing with medical images. You're not going to modify an x-ray once it's done. That's better suited to object storage. So file storage will always have a place. Take video editing, all these videos people are producing; that belongs on file storage, not on object. If you care about file modifications and file performance, file is your answer, but if you're done and you just want to archive it, you want scalable storage, billions of objects, then object is the answer. Now, either of these can be software-based storage or it could be an appliance. That's again an organization's preference: if you want an integrated, robust, ready-made solution, then an appliance is the answer. "Ah, no, I'm a large organization, I have a lot of storage administrators," and they can build something on their own, then software-based is the answer. Having both models gives you a choice. >> What brought you to IBM? You used to be at NetApp. IBM's buying The Weather Company. Dell's buying EMC. What attracted you to IBM? >> Storage is the foundation which we have, but it's really about data, and it's really about making sense of it, right? Everybody is saying data is the new oil, right? And IBM is probably the only company I can think of which has the tools and the AI to make sense of all this.
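Stepping back to the transparent cloud tiering Chandra describes at the top of this exchange, a policy by age, data type, or ownership, with automatic recall on access, here is a rough Python sketch of the idea. The threshold, paths, and recall helper are made up for illustration; this is not the product's actual policy engine or syntax.

```python
import os
import shutil
import time

AGE_THRESHOLD_DAYS = 90                    # hypothetical policy: anything untouched for 90 days goes cold
CLOUD_TIER = "/mnt/cloud-gateway/archive"  # stand-in mount point for the cloud tier


def tier_cold_files(hot_dir: str) -> None:
    """One pass of an age-based policy: move cold files to the cloud tier."""
    now = time.time()
    for name in os.listdir(hot_dir):
        path = os.path.join(hot_dir, name)
        if not os.path.isfile(path):
            continue
        age_days = (now - os.path.getatime(path)) / 86400
        if age_days > AGE_THRESHOLD_DAYS:
            shutil.move(path, os.path.join(CLOUD_TIER, name))


def recall(name: str, hot_dir: str) -> str:
    """On access, bring a tiered file back so the user never notices where it lived."""
    cold_path = os.path.join(CLOUD_TIER, name)
    hot_path = os.path.join(hot_dir, name)
    if not os.path.exists(hot_path) and os.path.exists(cold_path):
        shutil.move(cold_path, hot_path)
    return hot_path
```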
NetApp was great in the early 2000s, but even as a storage foundation they have issues with scale-out, true scale-out, not just a single namespace. EMC is purely a storage company. In the future it's all about, well, the reason we are here at this conference: analyzing the data. What tools do you have to make sense of it? And that's where machine learning and then deep learning come in. Watson is very well known for that. IBM has the AI, and it has real research going on behind that, and I think storage makes more sense in that context. And also, IBM is doing the right thing by investing almost a billion dollars in software-defined storage. They are one of the first companies who did not hesitate to take the software from their integrated systems, for example XIV, and make it available as software only. We did the same thing with Storwize: we took the software off it and made it available as Spectrum Virtualize. We did not hesitate at all to take that same software and offer it standalone. Some other vendors say, "I can't do that, I'm going to lose all my margins." We didn't hesitate. We made it available as software, because we believe that's an important need for our customers. >> So the vision of the company, cognitive, the halo effect of that business, that's the future and it's going to bring a lot of storage action, is sort of the premise there. >> Chandra: Yes. >> Excellent. Well, Chandra, thanks very much for coming to theCUBE. It was great to have you, and good luck with attacking the big data world. >> Thank you, thanks for having me. >> You're welcome. Keep it right there, everybody. We'll be back with our next guest. We're live from Munich. This is DataWorks 2017. We'll be right back. (techno music)
Kickoff - IBM Machine Learning Launch - #IBMML - #theCUBE
>> Narrator: Live from New York, it's The Cube, covering the IBM Machine Learning Launch Event, brought to you by IBM. Here are your hosts, Dave Vellante and Stu Miniman. >> Good morning everybody, welcome to the Waldorf Astoria. Stu Miniman and I are here in New York City, the Big Apple, for IBM's Machine Learning Event, #IBMML. We're fresh off Spark Summit, Stu, where we had The Cube, and this by the way is The Cube, the worldwide leader in live tech coverage. We were at Spark Summit last week, George Gilbert and I, watching the evolution of so-called big data. Let me frame, Stu, where we're at and bring you into the conversation. The early days of big data were all about offloading the data warehouse and reducing the cost of the data warehouse. I often joke that the ROI of big data is reduction on investment, right? There are these big, expensive data warehouses, and it was quite successful in that regard. What then happened is we started to throw all this data into the data lake. People would joke it became a data swamp, and you had a lot of tooling to try to clean the data lake, and a lot of transforming and loading, and the ETL vendors started to participate there in a bigger way. Then you saw the extension of these data pipelines to try to do more with that data. The Cloud guys have now entered in a big way. We're now entering the Cognitive Era, as IBM likes to refer to it. Others talk about AI and machine learning and deep learning, and that's really the big topic here today. What we can tell you, now that the news has gone out at 9:00am this morning, is that IBM is bringing machine learning to its mainframe, the z mainframe. Two years ago, Stu, IBM announced the z13, which was really designed to bring analytic and transaction processing together on a single platform. Clearly IBM is extending the useful life of the mainframe by bringing things like Spark, certainly what it did with Linux, and now machine learning to z. I want to talk about Cloud, the importance of Cloud, and how that has really taken over the world of big data. Virtually every customer you talk to now is doing work on the Cloud. It's interesting to see now IBM unlocking its transaction base, its mission-critical data, to this machine learning world. What are you seeing around Cloud and big data? >> We've been digging into this big data space since before it was called big data. One of the early things that really got me interested and excited about it is, from the infrastructure standpoint, storage has always been one of those costs that we had to have, and with the massive amounts of data, the digital explosion we talked about, keeping and managing all that information was a huge challenge. Big data was really that bit flip. How do we take all that information and make it an opportunity? How do we get new revenue streams? Dave, IBM has been at the center of this, looking at the higher-level pieces of not just storing data, but leveraging it. Obviously huge in analytics, lots of focus on everything from Hadoop and Spark and newer technologies, but digging into how they can leverage up the stack, which is where IBM has done a lot of acquisitions in that space, and leveraging that, and wants to make sure that they have a strong position both in cloud, which was renamed:
SoftLayer is now IBM Bluemix, with a lot of services including a machine learning service that leverages the Watson technology, and of course on prem they've got the z and the Power solutions that you and I have covered for many years at the IBM Edge show. >> Machine learning obviously heavily leverages models. We saw in the early days of big data that the data scientists would build models, and machine learning allows those models to be perfected over time. So there's this continuous process. We're familiar with the world of batch, and then the minicomputer brought in the world of interactive, so we're familiar with those types of workloads. Now we're talking about a new emergent workload, which is continuous: continuous apps where you're streaming data in, which is what Spark is all about. The models that data scientists are building can constantly be improved. The key is automation, right? Being able to automate that whole process, and being able to collaborate between the data scientists, the data quality engineers, even the application developers, that's something that IBM really tried to address in its last big announcement in this area, which was in October of last year: the Watson Data Platform, what they called at the time DataWorks. So really trying to bring together those different personas in a way that they can collaborate together and improve models on a continuous basis. The use cases that you often hear in big data, and certainly initially in machine learning, are things like fraud detection. Obviously ad serving has been a big data application for quite some time. In financial services, identifying good targets, identifying risk. What I'm seeing, Stu, is that the phase we're in now of this so-called big data and analytics world, now bringing in machine learning and deep learning, is to really improve on some of those use cases. For example, fraud detection has gotten much, much better. Ten years ago, let's say, it took many, many months, if you ever detected fraud at all. Now you get it in seconds, or sometimes minutes, but you also get a lot of false positives. Oops, sorry, the transaction didn't go through. Did you do this transaction? Yes, I did. Oh, sorry, you're going to have to redo it because it didn't go through. It's very frustrating for a lot of users. That will get better and better and better. We've all experienced retargeting from ads, and we know how crappy they are. That will continue to get better. The big question that people have, and it goes back to Jeff Hammerbacher, the best minds of my generation are trying to get people to click on ads, is: when will we see big data really start to affect our lives in different ways, like patient outcomes? We're going to hear some of that today from folks in health care and pharma. Again, these are the things that people are waiting for. The other piece is, of course, IT. What are you seeing, in terms of IT, in the whole data flow? >> Yes, a big question we have, Dave, is where's the data? And therefore, where does it make sense to be able to do that processing? In big data we talked about how you've got massive amounts of data, so can we move the processing to that data? With IoT, as our CTO was discussing just the other day, there's going to be massive amounts of data at the edge, and I don't have the time or the bandwidth or necessarily the need to pull that back to some kind of central repository. I want to be able to work on it there. Therefore there's going to be a lot of data worked on at the edge.
Peter Levine did a whole video talking about how, "Oh, Public Cloud is dead, it's all going to the edge." That's a little bit hyperbolic as a statement; we understand that there are plenty of use cases for both the public cloud and the edge. In fact we see Google pushing hard on machine learning with TensorFlow; it's one of those machine learning frameworks out there that we expect a lot of people to be working on. Amazon is putting effort into the MXNet framework, which is once again an open-source effort. One of the things I'm looking at in this space, and I think IBM can provide some leadership here, is which frameworks are going to become popular across multiple scenarios. How many winners can there be for these frameworks? We already have multiple programming languages, multiple clouds. How much of it is just API compatibility? How much work is there, and where are the repositories of data going to be, and where does it make sense to do that predictive analytics, that advanced processing? >> You bring up a good point. Last year, last October, at our BigData event, we had a special segment with a data scientist panel. It was great. We had some rockstar data scientists on there, like Dez Blanchfield and Joe Caserta, and a number of others. They echoed what you always hear when you talk to data scientists: "We spend 80% of our time messing with the data, trying to clean the data, figuring out the data quality, and precious little time on the models, proving the models, and actually getting outcomes from those models." So things like Spark have simplified that whole process and unified a lot of the tooling around so-called big data. We're seeing Spark adoption increase. George Gilbert, in part one and part two of the big data forecast from Wikibon last week, showed that we're still not on the steep part of the S-curve in terms of Spark adoption. Generically, we're talking about streaming as well, included in that forecast, and it forecasts that increasingly those applications are going to become more and more important. It brings you back to what IBM's trying to do, which is bring machine learning to this critical transaction data. Again, to me, it's an extension of the vision that they put forth two years ago, bringing analytic and transaction data together, actually processing within that Private Cloud complex, which is what essentially this mainframe is. It's the original Private Cloud, right? You were saying off-camera, it's the original converged infrastructure.
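Dave's earlier point, that the models data scientists build keep getting better as labeled transactions flow back in, with fraud detection as the canonical example, can be sketched with an incrementally trained classifier. This is only an illustration under assumed feature vectors and labels, not how any of the products discussed here implement it.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Incremental (online) learner: logistic regression fit one mini-batch at a time.
# Older scikit-learn versions spell this loss "log" instead of "log_loss".
model = SGDClassifier(loss="log_loss")
classes = np.array([0, 1])  # 0 = legitimate, 1 = fraud


def update_model(X_batch, y_batch):
    """Fold the latest batch of labeled transactions into the live model."""
    model.partial_fit(X_batch, y_batch, classes=classes)


def fraud_score(features):
    """Probability that a single incoming transaction is fraudulent."""
    return model.predict_proba([features])[0, 1]


# Hypothetical usage: ten numeric features per transaction, random data as a stand-in.
update_model(np.random.rand(512, 10), np.random.randint(0, 2, 512))
print(fraud_score(np.random.rand(10)))
```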
>> Dave, another thing, it reminds me of two years ago you and I did an event with the MIT Sloan school on The Second Machine Age with Andy McAfee and Erik Brynjolfsson talking about as machines can help with some of these analytics, some of this advanced technology, what happens to the people? Talk about health care, it's doctors plus machines most of the time. As these two professors say, it's racing with the machines. What is the impact on people? What's the impact on jobs? And productivity going forward, really interesting hot space. They talk about everything from autonomous vehicles, advanced health care and the like. This is right at the core of where the next generation of the economy and jobs are going to go. >> It's a great point, and no doubt that's going to come up today and some of our segments will explore that. Keep it right there, everybody. We'll be here all day covering this announcement, talking to practitioners, talking to IBM executives and thought leaders and sharing some of the major trends that are going on in machine learning, the specifics of this announcement. Keep it right there, everybody. This is The Cube. We're live from the Waldorf Astoria. We'll be right back.
Wikibon Big Data Market Update Pt. 1 - Spark Summit East 2017 - #sparksummit - #theCUBE
>> [Announcer] Live from Boston, Massachusetts, this is theCUBE, covering Spark Summit East 2017, brought to you by Databricks. Now, here are your hosts, Dave Vellante and George Gilbert. >> We're back, welcome to Boston, everybody. This is a special presentation that George Gilbert and I are going to provide to you now. SiliconANGLE Media is the umbrella brand of our company, and we've got three sub-brands. One of them is Wikibon, the research organization that George works in, and then of course we have theCUBE, and then SiliconANGLE, which is the tech publication, and then we extensively, as you may know, use CrowdChat and other social data. But we want to drill down now on the Wikibon research side of things. Wikibon was the first research company ever to do a big data forecast. Many years ago, our friend Jeff Kelly produced that for several years; we opensourced it, and I think it really helped the industry a lot, sort of framing the big data opportunity, and then George last year did the first Spark forecast, really Spark adoption. So what we want to do now is talk about some of the trends in the marketplace. This is going to be done in two parts: today is part one, where we're really going to talk about the overall market trends and the market conditions, and then we're going to go to part two tomorrow, where you're going to release some of the numbers, right? And we'll share some of the numbers today. So, we're going to start on the first slide here, we're going to share with you some slides, the Wikibon forecast review, and George, I'm going to ask you to talk about where we are at with big data apps. Everybody's saying it's peaked, big data's now going mainstream, so where are we at with big data apps? >> [George] Okay, so, I want to quote, just to provide context, the former CTO of VMware, Steve Herrod. He said, "In the end, it wasn't big data, it was big analytics." And what's interesting is that when we start thinking about it, there have traditionally been two classes of workloads: one batch, and in the context of analytics, that means running reports in the background, doing offline business intelligence, but then there was also the interactive-type work. What's emerging is something that's continuously happening, and it doesn't mean that all apps are going to be always on, it just means that all apps will have a batch component, an interactive component, like with the user, and then a streaming, or continuous, component. >> [Dave] So it's a new type of workload. >> Yes. >> Okay. Anything else you want to point out here? >> Yeah, what's worth mentioning is, it's not like it's going to burst fully formed out of the clouds and become sort of a new standard; there are two things that have to happen. The technology has to mature, so right now you have some pretty tough trade-offs between integration, which provides simplicity, and choice and optimization, which gives you fragmentation, and then skillset, and both of those need to develop. >> [Dave] Alright, we're going to talk about both of those a little bit later in this segment. Let's go to the next slide, which really talks to some of the high-level forecast that we released last year, so these are last year's numbers, correct? >> Yes, yes. >> [Dave] Okay, so, what's changed?
You've got the ogive curve, which is sort of the streaming penetration, Spark and streaming, that's what was in last year's forecast, and this is now reflective of continuous; you'll be updating that, so how is this changing, what do you want us to know here? >> [George] Okay, so the key takeaways here are, first, we took three application patterns, the first being the data lake, which is sort of the original canonical repository of all your data. That never goes away, but on top of it, you layer what we were calling last year systems of engagement, which is where you've got the interactive machine learning component helping to anticipate and influence a user's decision, and then on top of that, which was the aqua color, was the self-tuning systems, which is probably more IIoT stuff, where you've got a whole ecosystem of devices and intelligence in the cloud and at the edge, and you don't necessarily need a human in the loop. But these now, when you look at them, you can break them down as having three types of workloads: the batch, the interactive, and the continuous. >> Okay, and that is sort of a new workload here, and this is a real big theme of your research now. We all remember, no, we don't all remember, I remember punch cards, that's the ultimate batch, and then of course the terminals were interactive, and you think of that as closer to real time, but now there's this notion of continuous. If you go to the next slide, Patrick, we can take a look at how workloads are changing, so George, take us through that dynamic. >> [George] Okay so, to understand where we're going, sometimes it helps to look at where we've come from. The traditional workloads, if we talk about applications, were divided into, now, we talked about sort of batch versus interactive, but they were also divided into online transaction processing, operational applications, systems of record, and then there was the analytic side, which was reporting on it, but this was sort of backward-looking reporting, and we began to see some convergence between the two with web and mobile apps, where a user was interacting with the analytics that informed an interaction they might have. That's looking backwards, and we're going to take a quick look at some of the new technologies that augmented those older application patterns. Then we're going to go look at the emergent workloads and what they look like. >> Okay so, let's have a quick conversation about this before we go on to the next segment. Hadoop obviously was batch. It really was a way, as we've talked about today and on many other days on theCUBE, to reduce the expense of doing data warehousing and business intelligence. I remember we were interviewing Jeff Hammerbacher, and he said, "When I was at Facebook, my mission was to break the dependency on the container, the storage container." So he really wanted to, needed to reduce costs, and he saw that infrastructure needed to change. So if you look at the next slide, which really sort of talks to Hadoop doing batch and traditional BI, take us through that, and then we'll sort of evolve to the future. >> Okay, so this is an example of traditional workloads, batch business intelligence, because Hadoop has not really gotten to the maturity point where you can really do interactive business intelligence. It's going to take a little more work.
But here, you've basically put into a repository more data than you could possibly ever fit in a data warehouse, and the key is, this environment was very fragmented, there were many different engines involved, so there was high developer complexity and high operational complexity, and we're getting to the point where we can do somewhat better on the integration, and we're getting to the point where we might be able to do interactive business intelligence and start doing a little bit of advanced analytics like machine learning. >> Okay. Let's talk a little bit about why we're here; we're here 'cause it's Spark Summit, and Spark was designed to simplify big data, simplify a lot of the complexity in Hadoop. So on the next slide, you've got this red line of Spark; what is Spark's role, what does that red line represent? >> Okay, so the key takeaway from this slide is a couple of things. One, it's interesting, when you listen to Matei Zaharia, who is the creator of Spark, he said, "I built this to be a better MapReduce than MapReduce," which was the old crufty heart of Hadoop. And of course, they've stretched it far beyond their original intentions, but it's not the panacea yet, and if you put it in the context of a data lake, it can help you with what a data engineer does with exploring and munging the data, and what a data scientist might do in terms of processing the data and getting it ready for more advanced analytics, but it doesn't give you an end-to-end solution, not even within the data lake. The point of explaining this is important, because we want to explain how, even in the newer workloads, Spark isn't yet mature enough to handle the end-to-end integration, and by making that point, we'll show where it still needs more work, and where you have to substitute other products. >> Okay, so let's have a quick discussion about those workloads. Workloads really kind of drive everything, a lot of decisions for organizations, where to put things, how to protect data, where the value is, so in this next slide you're juxtaposing traditional workloads with emerging workloads, so let's talk about these new continuous apps. >> Okay, so, this tees it up well, 'cause we focused on the traditional workloads. The emerging ones are where data is always coming in. You could take a big flow of data and sort of chop it up and bucket it, and turn it into a batch process, but now we have the capability to keep processing it, and you want answers from it very near real time, so you don't want to stop it from flowing. The first one that took off like this was collecting telemetry about the operation and performance of your apps and your infrastructure, and Splunk sort of conquered that workload first. And then the second one, the one that everyone's talking about now, is sort of Internet of Things, but more accurately the Industrial Internet of Things, and that stream of data is, again, something you'll want to analyze and act on with as little delay as possible. The third one is interesting, asynchronous microservices. This is difficult, because it doesn't necessarily require a lot of new technology so much as a new skillset for developers, and that's going to mean it takes off fairly slowly. Maybe new developers coming out of school will adopt it whole cloth, but this is where you don't rely on a big central database; this is where you break things into little pieces, and each piece manages itself.
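As a rough sketch of the continuous, telemetry-style workload George describes, data always arriving and being acted on with as little delay as possible, here is a minimal Spark Structured Streaming job in Python. The Kafka broker, topic, and alert threshold are assumptions made up for the example, not anything from the discussion.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("continuous-telemetry").getOrCreate()

# Ingest: an unbounded stream of JSON telemetry events from a hypothetical Kafka topic.
# (The Kafka source also needs Spark's Kafka connector package on the classpath.)
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "device-telemetry")
          .load()
          .selectExpr("CAST(value AS STRING) AS json"))

# Process: pull out the fields we care about and flag overheating devices.
parsed = events.select(
    F.get_json_object("json", "$.device_id").alias("device_id"),
    F.get_json_object("json", "$.temp_c").cast("double").alias("temp_c"),
)
alerts = parsed.where(F.col("temp_c") > 90.0)

# Serve (stand-in): a real pipeline would write to a fast store; console output keeps the sketch simple.
query = (alerts.writeStream
         .format("console")
         .outputMode("append")
         .start())
query.awaitTermination()
```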
>> So you're saying the components of these arrows that you're showing, ingest, explore, process, serve, these are all sort of discrete elements of the data flow that you have to then integrate as a customer? >> [George] Yes, frankly, these are all steps that could be an end-to-end integrated process, but it's not yet mature enough really to do it end-to-end. For example, we don't even have a data store that can go all the way from ingest to serve, and by ingest, I mean taking the millions, potentially millions or more, events per second coming in from your Internet of Things devices; the explore step would be in that same data store, letting you visualize what's there, with the process step doing the analysis, and serving then is, from that same data store, letting your industrial devices or your business intelligence workloads get real-time updates. For this to work as one whole, we need a data store, for example, that can go from end to end, in addition to compute and analytic capabilities that go end to end. The point of this is, for continuous workloads, we do want to get to this integrated point somehow, sometime, but we're not there yet. >> Okay, let's go deeper, and take a look at the next slide. You've got this data feedback loop, and you've got this prediction on top of it, so what does all that mean? Let's double-click on that. >> Okay, so now we're unpacking the slide we just looked at, in that we're unpacking it into two different elements: one is what you're doing when you're running the system, and the next one will be what you're doing when you're designing it. And so for this one, what you're doing when you're running the system, I've grayed out where the data's coming from and where it's going to, just to focus on how we're operating on the data, and again, to repeat, the green part, which is storage, we don't have an end-to-end integrated store that could cost-effectively, scalably handle this whole chain of steps. But what we do have is that in the runtime, you're going to ingest the data, you're going to process it and make it ready for prediction, then there's a step that's called devops for data science. We know devops for developers, but devops for data science, as we're going to see, actually unpacks a whole 'nother level of complexity. This devops for data science is where you get the prediction of, okay, so, if this turbine is vibrating and has a heat spike, it means shut it down because something's going to fail. That's the prediction component, and the serve part then takes that prediction and makes sure that that device gets it fast. >> So you're putting that capability in the hands of the data science component so they can effect that outcome virtually instantaneously? >> Yes, but in this case, the data scientist will have done that at design time. We're still at run time, so once the data scientist has built that model, here it's the engineer who's keeping it running. >> Yeah, but it's designed into the process, that's the devops analogy. Okay great, well let's go to that sort of next piece, which is design, so how does this all affect design, what are the implications there? >> So now, before we had ingest, process, then prediction with devops for data science, and then serving; now when you're at design time, you ingest the data, and there's a whole unpacking of steps, which requires a handful, or two fistfuls, of tools right now to make it work.
This is to acquire the data, explore it, prepare it, model it, assess it, distribute it; all those things are today handled by a collection of tools that you have to stitch together, and then you have the process step, which could typically be done in Spark, where you do the analysis, and then serving it; Spark isn't ready to serve, so that's typically a high-speed database, one that either has tons of data for history, or gets very, very fast updates, like a Redis that's almost like a cache. So the point of this is, we can't yet take Spark as gospel from end to end. >> Okay so, there's a lot of complexity here. >> [George] Right, that's the trade-off. >> So let's take a look at the next slide, which talks to where that complexity comes from. Let's look at it first from the developer side, and then we'll look at the admin, so on the next slide, we're looking at the complexity from the dev perspective; explain the axes here. >> Okay, okay. So, there are two axes. If you look at the x-axis at the bottom, there's ingest, explore, process, serve. Those were the steps at a high level that we said a developer has to master, and it's going to be in separate products, because we don't have the maturity today. Then on the y-axis, we have some, but not all, this is not an exhaustive list, of all the different things a developer has to deal with for each product, so the complexity is multiplying all the steps on the y-axis, data model, addressing, programming model, persistence, all the stuff that's on the y-axis, by all the products he needs on the x-axis. It's a mess, which is why it's very, very hard to build these types of systems today. >> Well, and why everybody's pushing on this whole unified integration, that was a major thing that we heard throughout the day today. What about from the admin's side? Let's take a look at the next slide, which is our last slide, in terms of the operational complexity; take us through that. >> [George] Okay, so, the admin is when the system's running, and reading out the complexity, or inferring the complexity, follows the same process. On the y-axis, there's a separate set of tasks. These are admin-related: governance, scheduling and orchestration, high availability, all the different types of security, resource isolation. Each of these is done differently for each product, and the products are on the x-axis, ingest, explore, process, serve, so when you multiply those out, and again, this isn't exhaustive, you get, again, essentially a mess of complexity. >> Okay, so we got the message: if you're a practitioner of these so-called big data technologies, you're going to be dealing with more complexity, despite the industry's pace of trying to address that, and you're seeing new projects pop up, but nonetheless, it feels like the complexity curve is growing faster than customers' ability to absorb that complexity. Okay, well, is there hope? >> Yes. But here's where we've had this conundrum. The Apache opensource community has been the most amazing source of innovation I think we've ever seen in the industry, but the problem is, going back to the amazing book, The Cathedral and the Bazaar, about opensource innovation versus top-down, the cathedral has this central architecture that makes everything fit together harmoniously and beautifully, with simplicity. But the bazaar is so much faster, 'cause it's sort of this free market of innovation.
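For the serve step George mentions above, a high-speed store such as Redis sitting between the models and the devices or dashboards that need answers fast, here is a minimal sketch using the redis-py client; the host, key scheme, and expiry are assumptions for illustration only.

```python
import redis

# Hypothetical serving cache reachable by devices and BI dashboards.
r = redis.Redis(host="serving-cache", port=6379)


def publish_prediction(device_id: str, failure_prob: float) -> None:
    """Write the latest model output where consumers can read it with very fast lookups."""
    r.set(f"prediction:{device_id}", failure_prob, ex=60)  # expire after 60s so stale scores age out


def read_prediction(device_id: str):
    val = r.get(f"prediction:{device_id}")
    return float(val) if val is not None else None


# Hypothetical usage for the turbine example discussed above.
publish_prediction("turbine-42", 0.87)
print(read_prediction("turbine-42"))
```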
The Apache ecosystem is the bazaar, and the burden is on the developer and the administrator to make it all work together, and it was most appropriate for the big internet companies that had the skills to do that. Now, the companies that are distributing these Apache opensource components are doing a Herculean job of putting them together, but they weren't designed to fit together. On the other hand, you've got the cloud service providers, who are building, to some extent, services that have standard APIs that might've been supported by some of the Apache products, but they have proprietary implementations, so you have lock-in, but they have more of the cathedral-type architecture that-- >> And they're delivering them as their own services, even though actually many of those data services are discrete APIs that, as you point out, are proprietary. Okay, so, very useful, George, thank you. If you have questions on this presentation, you can hit Wikibon.com and fire off a question to us, and we'll make sure it gets to George and gets answered. This is part one; in part two tomorrow we're going to dig into some of the numbers, right? So if you care about where the trends are, what the numbers look like, what the market size looks like, we'll be sharing that with you tomorrow. All this stuff, of course, will be available on demand, and we'll be doing CrowdChats on this. George, excellent job, thank you very much for taking us through this. Thanks for watching today, it is a wrap of day one, Spark Summit East. We'll be back live tomorrow from Boston. This is theCUBE, so check out siliconangle.com for a review of all the action today, all the news, check out Wikibon.com for all the research, and siliconangle.tv is where we house all these videos, so check that out. We start again tomorrow at 11 o'clock East Coast time, right after the keynotes. This is theCUBE, we're at Spark Summit, #SparkSummit, we're out, see you tomorrow. (electronic music jingle)