Breaking Analysis: As the tech tide recedes, all sectors feel the pinch


 

>> From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is "Breaking Analysis" with Dave Vellante.

>> Virtually all tech companies have expressed caution in their respective earnings calls, and why not? I know you're sick of talking about the macroeconomic environment, but it's full of uncertainties, and there's no upside to providing aggressive guidance when sellers are in control. They punish even the slightest miss. Moreover, the spending data confirms the softening market across the board, so it's becoming expected that CFOs will guide cautiously. But companies facing execution challenges can't hide behind the macro, which is why it's important to understand which firms are best positioned to maintain momentum through the headwinds and come out the other side stronger. Hello, and welcome to this week's Wikibon CUBE Insights powered by ETR. In this "Breaking Analysis," we'll do three things. First, we're going to share a high-level view of the spending pinch that almost all sectors are experiencing. Second, we're going to highlight some of the companies that continue to show notably strong momentum and relatively high spending velocity on their platforms, albeit less robust than last year. And third, we're going to give you a peek at how one senior technology leader in the financial sector sees the competitive dynamic between AWS, Snowflake, and Databricks. So I landed on the red eye this morning, opened my eyes, and then opened my email to see this: my Barron's Daily had a headline telling me how bad things are and why they could get worse. The S&P Thursday hit a new closing low for the year. The safe haven of bonds is sucking wind. The market hasn't seemed to find a floor. Central banks are raising rates. Inflation is still high, but the job market remains strong. 
Oh, and not to mention that US debt service is headed toward a trillion dollars per year, the geopolitical situation is pretty tense, and Europe seems to be really struggling. Yeah, so the Santa Claus rally is looking pretty precarious, especially if there's a liquidity crunch coming. I guess that's why they call Barron's Barron's. Last week, we showed you this graphic ahead of the UiPath event. For months, the big four sectors, cloud, containers, AI, and RPA, have shown spending momentum above the rest. Now, this chart shows net score, or spending velocity, on specific sectors, and these four had consistently trended above the 40% red line for two years, until this past ETR survey. ML/AI and RPA have decelerated, as shown by the squiggly lines, and our premise was that they are more discretionary than the other sectors. The big four is now the big two: cloud and containers. But the reality is almost every sector in the ETR taxonomy is down, as shown here. This chart shows the sectors that have decreased in a meaningful way. Almost all sectors are now below the trend line, and only cloud and containers, as we showed earlier, are above the magic 40% mark. Container platforms and container orchestration are those gray dots. And no sector has shown a significant increase in spending velocity relative to the October 2021 survey. In addition to ML/AI and RPA, information security (yes, security), virtualization, video conferencing, outsourced IT, and syndicated research have decelerated. Syndicated research, yeah, the Gartner, IDC, Forrester reports. These stand out as seemingly the most discretionary, although we would argue that security is less discretionary. What you're seeing is a share shift, as we've previously reported, toward modern platforms and away from point tools. But the point is, no sector is immune from the macroeconomic environment. 
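The net score, or spending velocity, metric referenced throughout can be sketched in a few lines. This is a hedged illustration of the commonly described ETR methodology (share of respondents adopting or increasing spend, minus the share decreasing or replacing); the category names and sample data here are illustrative, not ETR's actual survey instrument.

```python
# Illustrative sketch of an ETR-style "net score" (spending velocity).
# Assumption: respondents fall into five buckets -- adopting new, increasing
# spend, flat, decreasing spend, or replacing/retiring the platform.

def net_score(responses):
    """responses: list of strings from
    {'adopt', 'increase', 'flat', 'decrease', 'replace'}.
    Returns net score as a percentage of respondents."""
    n = len(responses)
    positive = sum(r in ('adopt', 'increase') for r in responses)
    negative = sum(r in ('decrease', 'replace') for r in responses)
    return 100.0 * (positive - negative) / n

# A platform with 10% adopting, 50% increasing, 30% flat, and 10% decreasing
# lands at a net score of 50 -- above the "elevated" 40% red line.
sample = ['adopt'] * 10 + ['increase'] * 50 + ['flat'] * 30 + ['decrease'] * 10
print(net_score(sample))  # 50.0
```

The point of the metric is that "flat" spenders dilute but never subtract, which is why a sector can stay positive overall while still sliding below the 40% line.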
Although remember, as we reported last week, we're still expecting 5% to 6% IT spending growth this year relative to 2021, but it's a dynamic environment. So let's now take a look at some of the key players and see how they're performing on a relative basis. This chart shows net score, or spending momentum, on the y-axis and the pervasiveness of the vendor within the ETR survey on the x-axis, measured as the percentage of respondents citing the vendor in use. As usual, Microsoft and AWS stand out because they are both pervasive on the x-axis and highly elevated on the vertical axis. For two companies of this size to demonstrate and maintain net scores above the 40% mark is extremely impressive, although AWS is now showing much higher on the vertical scale relative to Microsoft, which is a new trend. Normally, we see Microsoft dominating on both dimensions. Salesforce is impressive as well because it's so large, but it's below those two on the vertical axis. Now, Google is meaningfully large, but relative to the other big public clouds, AWS and Azure, we see this as disappointing. John Blackledge of Cowen went on CNBC this past week and said that GCP, by his estimates, is 75% of Google Cloud's reported revenue and is now only five years behind AWS and Azure. Now, our models say, "No way." Google Cloud Platform, by our estimate, is running at about $3 billion per quarter, or more like 60% of Google's reported overall cloud revenue. You have to go back to 2016 to find AWS running at that level, and 2018 for Azure. So we would estimate that GCP is six years behind AWS and four years behind Azure from a revenue performance standpoint. Now, tech-wise, you can make a stronger case for Google. They have really strong tech. But revenue is, in our view, a really good indicator. Now, we circle ServiceNow here because they have become a generational company and impressively remain above the 40% line. 
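The arithmetic behind the two GCP estimates above is worth making explicit. This is just a sanity check on the numbers stated in the text (our $3B-per-quarter, 60%-of-reported estimate versus Cowen's 75% share); all figures are estimates from the discussion, not reported financials.

```python
# Sanity-checking the GCP revenue estimates discussed above (illustrative).
# If GCP runs at ~$3B per quarter and that is ~60% of Google's reported
# overall cloud revenue, the implied reported number is ~$5B per quarter.
gcp_quarterly = 3.0            # $B per quarter, our estimate
gcp_share_of_reported = 0.60   # our estimated GCP share of reported cloud revenue

reported_cloud = gcp_quarterly / gcp_share_of_reported
print(round(reported_cloud, 1))  # 5.0 ($B per quarter implied)

# Cowen's 75% share applied to that same reported figure implies a larger GCP,
# which is the crux of the disagreement:
print(round(reported_cloud * 0.75, 2))  # 3.75 ($B per quarter under the 75% view)
```

The gap between $3.0B and $3.75B per quarter is what drives the difference between "five years behind" and "six years behind" when mapped onto AWS's and Azure's historical run rates.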
We were at CrowdStrike with theCUBE two weeks ago, and we saw firsthand what we see as another generational company in the making. And you can see the company's spending momentum is quite impressive. Now, HashiCorp and Snowflake have surpassed Kubernetes to claim the top net score spots. We know Kubernetes isn't a company, but ETR tracks it as though it were, just for context. And we've highlighted Databricks as well, showing momentum, but it doesn't have the market presence of Snowflake. And there are a number of other players in the green: Pure Storage, Workday, Elastic, JFrog, Datadog, Palo Alto, Zscaler, CyberArk, Fortinet. Those last ones are in security, but again, they're all off their recent highs of 2021 and early 2022. Now, speaking of AWS, Snowflake, and Databricks, our colleague Eric Bradley of ETR recently held an in-depth interview with a senior executive at a large financial institution to dig into the analytics space, and there were some interesting takeaways that we'd like to share. The first is a discussion about whether or not AWS can usurp Snowflake as the top dog in analytics. I'll let you read this at your leisure, but I'll pull out some call-outs, as indicated by the red lines. This individual's take was quite interesting. Note the comment that, quote, this is my area of expertise. This person cited AWS's numerous databases as problematic, but Redshift was cited as the closest competitor to Snowflake. This individual also called out Snowflake's current cross-cloud advantage, what we sometimes call supercloud, as well as the value-add in its marketplace as a differentiator. But the point that this person was actually making is that cloud vendors make a lot of money from Snowflake. AWS, for example, sees Snowflake as much more of a partner than a competitor. And as we've reported, Snowflake drives a lot of EC2 and storage revenue for AWS. 
Now, this doesn't mean AWS does not have a strong marketplace. It does, probably the best in the business. But the point is that Snowflake's marketplace is exclusively focused on data, and the company's challenge, or opportunity, is to build up that ecosystem, to continue to add partners and create network effects that allow it to create a long-term sustainable moat, while at the same time staying ahead of the competition with innovation. Now, the other comment that caught our attention was on Snowflake's differentiators. This individual cited three areas. One, the well-known separation of compute and storage, which, of course, AWS has sort of replicated, though maybe not as elegantly, in the sense that you can reduce the compute load with Redshift, but unlike Snowflake, you can't shut it down. Two, Snowflake's data sharing capability, which is becoming quite well-known and a key part of its value proposition. And three, its marketplace, which again is a key opportunity for Snowflake to build out its ecosystem, close feature gaps that it's not necessarily going to fill on its own, and, really importantly, create governed and secure data sharing experiences for anyone on the data cloud, or across clouds. Now, the last thing this individual addressed in the ETR interview that we'll share is how Databricks and Snowflake are attacking a similar problem, i.e. simplifying data, data sharing, and getting more value from data. The key message here is that there's overlap between these two platforms, but Databricks appeals to a more techy crowd. When you're working with Databricks, you open a notebook; you're more likely to be a data scientist. Whereas with Snowflake, you're more likely to be aligned with the lines of business, sometimes with an industry emphasis. We've talked about this quite often on "Breaking Analysis." 
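The first differentiator above, separating compute from storage so that compute can be shut down entirely, can be made concrete with a toy model. This is a hedged conceptual sketch, not Snowflake's actual API or billing model: the class, method names, and credit figures are all invented for illustration.

```python
# Conceptual sketch (not a real vendor API): why separating compute from
# storage matters. Storage persists and is billed independently; a virtual
# warehouse (compute) can be suspended so compute billing stops entirely --
# the contrast drawn above with clusters that can shrink but never shut down.

class Warehouse:
    """Toy model of an independently billable compute cluster."""
    def __init__(self, credits_per_hour):
        self.credits_per_hour = credits_per_hour
        self.running = False

    def resume(self):
        self.running = True

    def suspend(self):
        # Compute cost drops to zero; the stored data is untouched.
        self.running = False

    def cost(self, hours):
        return self.credits_per_hour * hours if self.running else 0.0

wh = Warehouse(credits_per_hour=4)
wh.resume()
print(wh.cost(2))   # 8 -- credits burned while running
wh.suspend()
print(wh.cost(2))   # 0.0 -- no compute spend while idle; storage billed separately
```

In the architecture criticized above, the equivalent of `suspend()` does not exist: you can resize the cluster down, but some compute is always running and always billing.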
Snowflake is moving into the data science arena from its data warehouse strength, and Databricks is moving into analytics and the world of SQL from its AI/ML position of strength. Both companies are doing well, although Snowflake was able to get to the public markets with an IPO and Databricks has not. Now, even though Snowflake is on the quarterly shot clock, as we saw earlier, it has a larger presence in the market. That's at least partly due to the tailwind of an IPO and, of course, a stronger go-to-market posture. Okay, so we wanted to share some of that with you, and I realize it's a bit of a tangent, but it's good stuff from a qualitative practitioner perspective. All right, let's close with some final thoughts and look forward a little bit. Things in the short term are really hard to predict. We've seen these oversold rallies peter out for the last couple of months because the world is such a mess right now, and it's really difficult to reconcile these countervailing trends. Nothing seems to be working from a public policy perspective. Now, we know tech spending is softening, but let's not forget, it's still 5% to 6% growth, at or above historical norms, though there's no question the trend line is down. That said, there are certain growth companies, several mentioned in this episode, that are modern and vying to be generational platforms. They're well-positioned and financially sound: disciplined, with strong cash positions and inherent profitability. What I mean by that is they could dial down growth if they wanted to and dial up EBIT. But being a growth company today is not what it was a year ago. Because of rising rates, the discounted cash flows are just less attractive. So earnings estimates, along with revenue multiples on these growth companies, are reverting toward the mean. However, companies like Snowflake and CrowdStrike, and some others, are still able to command a relative premium because of their execution and continued momentum. 
Others, as we reported last week, like UiPath for example, despite really strong momentum and customer spending, have had execution challenges. Okta is another example of a company with strong spending momentum, but it is absorbing Auth0, for example, and as a result, it's getting hit harder from a valuation standpoint. The bottom line is sellers are still firmly in control, the bulls have been humbled, and the traders aren't buying growth tech, or much tech at all, right now. But long-term investors are looking for entry points because these generational companies are going to be worth significantly more five to 10 years down the line. Okay, that's it for today. Thanks for watching this "Breaking Analysis" episode. Thanks to Alex Myerson and Ken Schiffman on production; Alex manages our podcast as well. Kristen Martin and Cheryl Knight help get the word out on social media and in our newsletters. And Rob Hof is our editor-in-chief over at SiliconANGLE, who does some wonderful editing for us, so thank you. Thank you all. Remember that all these episodes are available as podcasts wherever you listen. All you do is search "Breaking Analysis" podcast. I publish each week on wikibon.com and siliconangle.com, and you can email me at david.vellante@siliconangle.com, DM me @dvellante, or comment on my LinkedIn posts. And please check out etr.ai for the very best survey data in the enterprise tech business. This is Dave Vellante for theCUBE Insights, powered by ETR. Thanks for watching, and we'll see you next time on "Breaking Analysis." (gentle music)

Published Date : Oct 2 2022


Breaking Analysis: Snowflake Summit 2022...All About Apps & Monetization


 

>> From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is "Breaking Analysis" with Dave Vellante.

>> Snowflake Summit 2022 underscored that the ecosystem excitement which was once forming around Hadoop is being reborn, escalated, and coalesced around Snowflake's data cloud. What was once seen as a simpler cloud data warehouse, plus good marketing around the data cloud, is evolving rapidly with new workloads, a vertical industry focus, data applications, monetization, and more. The question is, will the promise of data be fulfilled this time around, or is it the same wine in a new bottle? Hello, and welcome to this week's Wikibon CUBE Insights powered by ETR. In this "Breaking Analysis," we'll talk about the event, the announcements Snowflake made that are of greatest interest, the major themes of the show, what was hype and what was real, the competition, and some concerns that remain in many parts of the ecosystem and pockets of customers. First, let's look at the overall event. It was held at Caesars Forum. Not my favorite venue, but I'll tell you, it was packed. Fire marshal full, as we sometimes say. Nearly 10,000 people attended the event. Here's Snowflake's CMO Denise Persson on theCUBE describing how this event has evolved.

>> Yeah, two, three years ago, we were about 1,800 people at a Hilton in San Francisco. We had about 40 partners attending. This week we're close to 10,000 attendees here, almost 10,000 people online as well, and over 200 partners here on the show floor.

>> Now, those numbers from 2019 remind me of the early days of Hadoop World, which was put on by Cloudera, but then Cloudera handed off the event to O'Reilly, as this article that we've inserted would tell you, if you bring back that slide. The headline almost got it right: Hadoop World was a failure, but it didn't have to be. 
Snowflake has filled the void created by O'Reilly when it first killed Hadoop World and the name, and then killed Strata. Now, ironically, the momentum and excitement from Hadoop's early days probably could have stayed with Cloudera, but the beginning of the end was when they gave the conference over to O'Reilly. We can't imagine Frank Slootman handing the keys to the kingdom to a third party. Serious business was done at this event. I'm talking substantive deals. Salespeople from the host, the sponsors, and the ecosystems that support these events love physical. They really don't like virtual, because physical, belly to belly, means relationship building, pipeline, and deals. And that was blatantly obvious at this show. In fairness, that's true of all theCUBE events we've done this year, but this one was more vibrant because of its attendance and the action in the ecosystem. An ecosystem is a hallmark of a cloud company, and that's what Snowflake is. We asked Frank Slootman on theCUBE, was this ecosystem evolution by design, or did Snowflake just kind of stumble into it? Here's what he said.

>> Well, when you're a data cloud, you have data, and people want to do things with that data. They don't want to just run data operations, populate dashboards, run reports. Pretty soon they want to build applications, and after they build applications, they want to build businesses on them. So it goes on and on and on. It drives your development to enable more and more functionality on that data cloud. It didn't start out that way, you know; we were very, very much focused on data operations. Then it becomes application development, and then it becomes, hey, we're developing whole businesses on this platform. So it's similar to what happened to Facebook in many ways.

>> So it sounds like it was maybe a little bit of both. 
The Facebook analogy is interesting because Facebook is a walled garden, as is Snowflake, but when you come into that garden, you have assurances that things are going to work in a very specific way, because a set of standards and protocols is being enforced by a steward, i.e. Snowflake. This means things run better inside of Snowflake than if you try to do all the integration yourself. Now, maybe over time an open-source version of that will come out, but if you wait for that, you're going to be left behind. That said, Snowflake has made moves to make its platform more accommodating to open-source tooling in many of its announcements this week. Now, I'm not going to do a deep dive on the announcements. Matt Sulkins from Monte Carlo wrote a decent summary of the keynotes, and a number of analysts like Sanjeev Mohan, Tony Baer, and others are posting deeper analysis on these innovations, so we'll point to those. I'll say a few things, though. Unistore extends the types of data that can live in the Snowflake data cloud. It's enabled by a new feature called hybrid tables, a new table type in Snowflake. One of the big knocks against Snowflake was that it couldn't handle transaction data. Several database companies are creating this notion of a hybrid, where both analytic and transactional workloads can live in the same data store. Oracle's doing this, for example, with MySQL HeatWave, and there are many others. We saw Mongo earlier this month add an analytics capability to its transaction system. Mongo also added SQL, which was kind of interesting. Here's what Constellation Research analyst Doug Henschen said about Snowflake's moves into transaction data. Play the clip.

>> Well, with Unistore, they're reaching out and trying to bring transactional data in. 
Hey, don't limit this to analytical information. And there are other ways to do that, like CDC and streaming, but they're very closely tying that, again, to the marketplace, with the idea of: bring your data over here and you can monetize it. Don't just leave it in that transactional database. So it's another reach to a broader play across the big community that they're building.

>> And you're also seeing Snowflake expand its workload types in its unique way, through Snowpark and its Streamlit acquisition, enabling Python so that native apps can be built in the data cloud and benefit from all the structure and features that Snowflake has built in. Hence the Facebook analogy, or maybe the App Store, the Apple App Store, as I've proposed as well. Python support also widens the aperture for machine intelligence workloads. We asked Snowflake senior VP of product Christian Kleinerman which announcements he thought were the most impactful, and despite the who's-your-favorite-child nature of the question, he did answer. Here's what he said.

>> I think the native applications is the one that looks like, eh, I don't know about it on the surface, but it has the biggest potential to change everything. It can create an entire ecosystem of solutions, within a company or across companies, such that I don't know that we know what's possible.

>> Snowflake also announced support for Apache Iceberg, which is a new open table format standard that's emerging. So you're seeing Snowflake respond to these concerns about its lack of openness, and they're building optionality into their cloud. 
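As a rough illustration of why open table formats matter here: formats like Iceberg, much like Snowflake's own micro-partitions, keep min/max statistics in metadata so an engine can skip whole files without reading them. This is a hedged toy sketch of that pruning idea; the data layout and function are invented for illustration, not any engine's actual implementation.

```python
# Toy sketch of metadata-based partition pruning, the optimizer benefit that
# open table formats (and micro-partitioned tables) enable: per-file min/max
# statistics let a query skip data files whose value range cannot match.

def prune(partitions, column, lo, hi):
    """partitions: list of dicts with per-column {'min': x, 'max': y} stats.
    Returns only the partitions whose value range can overlap [lo, hi]."""
    return [p for p in partitions
            if p['stats'][column]['max'] >= lo and p['stats'][column]['min'] <= hi]

parts = [
    {'file': 'a.parquet', 'stats': {'ts': {'min': 0,   'max': 99}}},
    {'file': 'b.parquet', 'stats': {'ts': {'min': 100, 'max': 199}}},
    {'file': 'c.parquet', 'stats': {'ts': {'min': 200, 'max': 299}}},
]
# A filter like "ts BETWEEN 120 AND 180" only needs one of the three files:
print([p['file'] for p in prune(parts, 'ts', 120, 180)])  # ['b.parquet']
```

The practical upshot, per the discussion above, is that data can stay in open files on S3 while still getting optimizer-style skipping, because the pruning decision needs only the metadata.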
They also showed some cost optimization tools, both from Snowflake itself and from the ecosystem, notably Capital One, which launched a software business on top of Snowflake focused on optimizing cost, with data management capabilities eventually rolling out. And there were all kinds of features that Snowflake announced at the show around governance, cross-cloud (what we call supercloud), and a new security workload, and they reemphasized their ability to read non-native, on-prem data into Snowflake through partnerships with Dell and Pure, and a lot more. Let's hear from some of the analysts that came on theCUBE this week at Snowflake Summit to see what they said about the announcements and their takeaways from the event. This is Dave Menninger, Sanjeev Mohan, and Tony Baer; roll the clip.

>> Our research shows that the majority of organizations, the majority of people, do not have access to analytics, and a couple of the things they've announced, I think, address or help to address those issues very directly. So Snowpark and support for Python and other languages is a way for organizations to embed analytics into different business processes, and I think that'll be really beneficial in trying to get analytics into more people's hands. And I also think that the native applications as part of the marketplace are another way to get applications into people's hands, rather than just analytical tools, because most people in the organization are not analysts. They're doing some line-of-business function. They're HR managers, they're marketing people, they're salespeople, they're finance people, right? They're not sitting there mucking around in the data; they're doing a job, and they need analytics in that job.

>> Primarily, I think it is to counteract this whole notion that once you move data into Snowflake, it's a proprietary format. 
So I think that's how it started, but it's hugely beneficial to the customers, to the users, because now, if you have a large amount of data in Parquet files, you can leave it on S3, but then, using the Apache Iceberg table format in Snowflake, you get all the benefits of Snowflake's optimizer. So, for example, you get the micro-partitioning, you get the metadata, and in a single query, you can join, you can do a select from a Snowflake table unioned with a select from an Iceberg table, and you can use stored procedures and user-defined functions. So I think what they've done is extremely interesting. Iceberg by itself still does not have multi-table transactional capabilities. So if I'm running a workload, I might be touching 10 different tables; if I use Apache Iceberg in a raw format, it doesn't have that, but Snowflake does. So the way I see it is Snowflake is adding more and more capabilities right into the database. For example, they've gone ahead and added security and privacy, so you can now create policies and even do cell-level masking, dynamic masking. But most organizations have more than Snowflake, so what we are starting to see all around here is a whole series of data catalog companies, a bunch of companies doing dynamic data masking, security and governance, and data observability, which is not a space Snowflake has gone into. So there's a whole ecosystem of companies that is mushrooming. Although, you know, they're using the native capabilities of Snowflake, they sit at a level higher. So if you have a data lake and a cloud data warehouse, and you have other relational databases, you can run these cross-platform capabilities in that layer. So in that way, you know, Snowflake's done a great job of enabling that ecosystem.

>> I think it's like the last mile, essentially. 
In other words, it's like, okay, you have folks that are basically very comfortable with Tableau, but you do have developers who don't want to have to shell out to a separate tool. And so this is where Snowflake is essentially working to address that constituency. To Sanjeev's point, and I think part of what plays into this, what makes this different from the Hadoop era, is the fact that a lot of vendors are taking it very seriously to make all these capabilities native. Now, obviously Snowflake acquired Streamlit, so we can expect that the Streamlit capabilities are going to be native.

>> I want to share a little bit about the higher-level thinking at Snowflake. Here's a chart from Frank Slootman's keynote. It's his version of the modern data stack, if you will. Now, Snowflake, of course, was built on the public cloud. If there were no AWS, there would be no Snowflake. Now, they're all about bringing in data, and live data, and expanding the types of data, including structured (we just heard about that), unstructured, and geospatial, and the list is going to continue on and on. Eventually, I think it's going to bleed into the edge, if we can figure out what to do with that edge data. Executing on new workloads is a big deal. They started with data sharing, they recently added security, and they've essentially created a PaaS layer. We call it a SuperPaaS layer, if you will, to attract application developers. Snowflake has a developer-focused event coming up in November, and they've extended the marketplace with 1,300 native app listings. And at the top, that's the holy grail: monetization. We always talk about building data products, and we saw a lot of that at this event, very, very impressive and unique. Now, here's the thing. There's a lot of talk in the press, on Wall Street, and in the broader community about consumption-based pricing, concerns over Snowflake's visibility and its forecast, and how analytics may be discretionary. 
But if you're a company building apps in Snowflake and monetizing, as Capital One intends to do, and you're now selling in the marketplace, that is not discretionary, unless of course your costs are greater than your revenue for that service, in which case it's going to fail anyway. But the point is, we're entering a new era where data apps and data products are beginning to be built, and Snowflake is attempting to make the data cloud the de facto place to build them. In our view, they're well ahead in that journey. Okay, let's talk about some of the bigger themes we heard at the event. Bringing apps to the data instead of moving the data to the apps: this was a constant refrain, and one that certainly makes sense from a physics point of view. With a single source of data that is discoverable, sharable, and governed, and with increasingly robust ecosystem options, the data doesn't have to be moved. Sometimes it may have to be moved if you're going across regions, but that's unique, and a differentiator for Snowflake in our view. I mean, I've yet to see a data ecosystem that is as rich and growing as fast as the Snowflake ecosystem. Monetization, we talked about that. Industry clouds: financial services, healthcare, retail, and media were all front and center at the event. My understanding is that Frank Slootman was a major force behind this shift, this development and go-to-market focus on verticals. It's really an attempt, and he talked about this in his keynote, to align with the customer mission, ultimately to align with their objectives, which, not surprisingly, increasingly mean monetizing with data as a differentiating ingredient. We heard a ton about data mesh; there were numerous presentations on the topic. 
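The monetization argument above, that a data app sold in the marketplace is not discretionary spend as long as the revenue it earns exceeds the platform credits it burns, reduces to a simple margin check. This is a hedged illustration; the function, credit price, and figures are all invented for the sketch, not any vendor's actual pricing.

```python
# Illustrative sketch of the consumption-pricing point above: a data app
# built on a consumption-billed platform is viable only while the revenue
# it generates exceeds the credits it consumes. All numbers are made up.

def app_margin(monthly_revenue, credits_consumed, price_per_credit):
    """Monthly margin of a marketplace data app: revenue minus platform cost."""
    platform_cost = credits_consumed * price_per_credit
    return monthly_revenue - platform_cost

# e.g. $50k/month in app sales against 10,000 credits at $3 per credit:
print(app_margin(50_000, 10_000, 3.0))  # 20000.0 -- positive, so the spend
                                        # underneath it is not "discretionary"
```

Which is the crux of the argument: once someone else's revenue depends on your platform consumption, that consumption behaves like cost of goods sold rather than a discretionary analytics budget.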
And I'll say this: if you map the seven pillars Snowflake talks about, Benoit Dageville covered them in his keynote, into Zhamak Dehghani's data mesh framework and its four principles, they align better than most of the data mesh washing that I've seen. The seven pillars he talked about in his keynote are all data, all workloads, global architecture, self-managed, programmable, marketplace, and governance. All data, well, maybe with hybrid tables that becomes more of a reality. Global architecture means the data is globally distributed; it's not necessarily physically in one place. Self-managed is key; self-service infrastructure is one of Zhamak's four principles. And then inherent governance: Zhamak talks about computational, what I'll call automated, governance built in. And with all the talk about monetization, that aligns with the second principle, which is data as product. So it's not a pure hit, and to its credit, by the way, Snowflake doesn't use data mesh in its messaging anymore. But its customers do: Geico, JPMC, and a number of other customers and partners are using the term, and using it pretty closely to the concepts put forth by Zhamak Dehghani. But back to the point: Snowflake is essentially building a proprietary system that substantially addresses some, if not many, of the goals of data mesh. Okay, back to the list. Supercloud, that's our term. We saw lots of examples of clouds on top of clouds that are architected to span multiple clouds, not just run on individual clouds as separate services, and this includes Snowflake's data cloud itself, but also a number of ecosystem partners that are headed in a very similar direction. Snowflake still talks about data sharing, but now it uses the term collaboration in its high-level messaging, which is, I think, smart. Data sharing is kind of a geeky term. 
And also, this is an attempt by Snowflake to differentiate itself from everyone else that's saying, hey, we do data sharing too. And finally, Snowflake doesn't say data marketplace anymore; it's now marketplace, accounting for its application market. Okay, let's take a quick look at the competitive landscape via this ETR X-Y graph. The vertical axis measures net score, or spending momentum, and the x-axis is penetration, pervasiveness in the data center; that's what ETR calls overlap. Snowflake continues to lead on the vertical axis. They guided conservatively last quarter, remember, so even though that lofty height is well down from its earlier levels, I wouldn't be surprised if it ticks down again a bit in the July survey, which will be in the field shortly. Databricks is a key competitor, obviously, with strong spending momentum, as you can see. We didn't draw it here, but we usually draw that red line at 40%; anything above that is considered elevated, and you can see Databricks is quite elevated. But it doesn't have the market presence of Snowflake. It didn't get to IPO during the bubble, and it doesn't have nearly as deep and capable a go-to-market machine. Now, they're getting better, and they're getting some attention in the market nonetheless. But as a private company, you just naturally have less awareness than Snowflake does. Some analysts, Tony Baer in particular, believe Mongo and Snowflake are on a bit of a collision course long term. I actually can see his point. You know, I mean, they're both platforms, they're both about data. It's a long way off, but you can see them on a similar path. They talk about similar aspirations and visions, and even though they're in quite different markets today, they're definitely participating in a similar TAM. The cloud players are definitely the biggest partners and probably the biggest competitors to Snowflake. And then there's always Oracle. 
Oracle doesn't have the spending velocity of the others, but it's got strong market presence, it owns a cloud, it knows a thing or two about data, and it definitely is a go-to-market machine. Okay, we're going to end on some of the things that we heard in the ecosystem. Because look, we've heard before how a particular technology, enterprise data warehouses, data hubs, MDM, data lakes, Hadoop, et cetera, was going to solve all of our data problems, and of course they didn't. And in fact, sometimes they created more problems, which let vendors push more incremental technology to solve the problems that they created, like tools and platforms to clean up the no-schema-on-write nature of data lakes, or data swamps. But here are some of the things that I heard firsthand from some customers and partners. First, they said to me that they're sometimes having a hard time keeping up with the pace of Snowflake. It reminds me of AWS in the 2014-2015 timeframe. You remember that firehose of announcements, which causes increased complexity for customers and partners. I talked to several customers that said, well, this is all well and good, but I still need skilled people to understand all these tools that I'm integrating in the ecosystem: the catalogs, the machine learning, the observability. A number of customers said, I just can't use one governance tool; I need multiple governance tools and a lot of other technologies as well, and they're concerned that that's going to drive up their cost and their complexity. I heard other concerns from the ecosystem that it used to be clear where they could add value, you know, back when Snowflake was just a better data warehouse. But to point number one, they're either concerned that they'll be left behind or they're concerned that they'll be subsumed. Look, just like we tell AWS customers and partners, you've got to move fast, you've got to keep innovating. If you don't, you're going to be left behind.
If you're a customer, you'll fall behind your competitors, and if you're a partner, somebody else will get there first, or AWS will solve the problem for you. Okay, and there were a number of skeptical practitioners, really thoughtful and experienced data pros, who suggested that they've seen this movie before; hence the same wine, new bottle. Well, this time around I certainly hope not, given all the energy and investment that is going into this ecosystem. And the fact is, Snowflake is unquestionably making it easier to put data to work. It built on AWS so you didn't have to worry about provisioning compute, storage, and networking, or about scaling. Snowflake is optimizing its platform to take advantage of things like Graviton so you don't have to, and it's doing some of its own optimization tooling. The ecosystem is building optimization tools too, so that's all good. And our firm belief is that the less expensive it is, the more data will get brought into the data cloud. And Snowflake is building a data platform on which its ecosystem can build and run data applications, aka data products, without having to worry about all the hard work that needs to get done to make data discoverable, shareable, and governed. And unlike the last 10 years, you don't have to be a zookeeper and integrate all the animals in the Hadoop zoo. Okay, that's it for today. Thanks for watching, and thanks to my colleague Stephanie Chan, who helps research Breaking Analysis topics. Sometimes Alex Myerson is on production, and he manages the podcasts. Kristin Martin and Cheryl Knight help get the word out on social and in our newsletters, Rob Hof is our editor in chief over at SiliconANGLE, and Hailey does some wonderful editing. Thanks to all. Remember, all these episodes are available as podcasts wherever you listen. All you've got to do is search Breaking Analysis Podcast.
I publish each week on wikibon.com and siliconangle.com, and you can email me at David.Vellante@siliconangle.com or DM me @DVellante. If you've got something interesting, I'll respond; if you don't, I'm sorry, I won't. Or comment on my LinkedIn posts. Please check out etr.ai for the best survey data in the enterprise tech business. This is Dave Vellante for theCUBE Insights powered by ETR. Thanks for watching, and we'll see you next time. (upbeat music)

Published Date : Jun 18 2022


Why Oracle’s Stock is Surging to an All-Time High


 

>> From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is Breaking Analysis with Dave Vellante. >> On Friday, December 10th, Oracle announced a strong earnings beat and raise, on the strength of its license business and slightly better than expected cloud performance. The stock was up sharply on the day and closed up nearly 16%, surpassing $280 billion in market value. Oracle's success is due largely to its execution of a highly differentiated strategy that has evolved over the past decade or more: deeply integrating its hardware and software, heavily investing in next-generation cloud, creating a homogeneous experience across its application portfolio, and becoming the number one platform for the world's most mission-critical applications. Now, while investors piled into the stock, skeptics will point to the beat being weighted toward license revenue, and will likely keep one finger on the sell button until they're convinced Oracle's cloud momentum is more consistent and predictable. Hello and welcome to this week's Wikibon CUBE Insights powered by ETR. In this Breaking Analysis, we'll review Oracle's most recent quarter and pull in some ETR survey data to frame the company's cloud business, the momentum of Fusion ERP, where the company is winning, and some gaps and opportunities that we see. The numbers this quarter were strong, particularly top-line growth. Here are a few highlights. Oracle's revenues grew 6% year on year in constant currency, surpassing $10 billion for the quarter. Oracle's non-GAAP operating margins were an impressive 47%. Safra Catz has always said cloud is a more profitable business, and it's really starting to show in the income statement.
Operating cash and free cash flow were 10.3 billion and 7.1 billion respectively for the past four quarters, and would have been higher if not for charges largely related to litigation expenses tied to the hiring of Mark Hurd, which the company said would not repeat in future quarters. And you can see in this chart how Oracle breaks down its business, which is kind of a mishmash of items lumped into the so-called cloud. The largest piece of the revenue pie is cloud services and license support. In reading the 10-Ks, you'll find statements like the following: license support revenues are our largest revenue stream and include product upgrades, maintenance releases and patches, as well as technical support assistance. And statements like the following: cloud and license revenues include the sale of cloud services, cloud licenses, and on-premises licenses, which typically represent perpetual software licenses purchased by customers for use in both cloud and on-premises IT environments. And cloud license and on-premises license revenues primarily represent amounts earned from granting customers perpetual licenses to use our database, middleware, application, and industry-specific products, which our customers use in cloud-based, on-premises, and other IT environments. So you tell me: is that cloud? I don't know. In the early days of Oracle cloud, the company used to break out IaaS, PaaS, and SaaS revenue separately, but it changed its mind, which really makes it difficult to determine what's happening in true cloud. Look, I have no problem including same-same hardware, software, control plane, et cetera, the hybrid piece, if it's on-prem in a true hybrid environment like Exadata Cloud@Customer or AWS Outposts. But you have to question what's really cloud in these numbers.
And Larry, in the earnings call, mentioned that Salesforce licenses the Oracle database to run its cloud, and Oracle doesn't count that in its cloud number; rather, it counts it in license revenue, but as you can see, it buries that into a line item that starts with the word cloud. So I guess I would say that Oracle's reporting is maybe somewhat better than IBM's cloud reporting, which is the worst, but I can't really say what is and isn't cloud in these numbers. Nonetheless, Oracle is getting it done for investors. Here's a chart comparing the five-year performance of Oracle to some of its legacy peers. We excluded Microsoft because it skews the numbers; Microsoft would really crush all these names, including Oracle. But look at Oracle. It's wedged in between the performance of the NASDAQ and the S&P 500. It's up over 160% in that five-year timeframe, well ahead of SAP, which is up 59% in that time, and way ahead of the dismal -22% performance of IBM. It's a shame: the tech tide is rising and lifting all boats, but IBM has unfortunately not been able to capitalize. That's a story for another day. As a market watcher, you can't help but love Larry Ellison. I only met him once, at an IDC conference in Paris where I got to interview Scott McNealy, Sun's CEO at the time. Ellison is great for analysts because he's not afraid to talk about the competition. He'll brag, he'll insult, he'll explain, and he'll pitch his stories. Now, on the earnings call last night, he went off, educating the analyst community on the upside in the Fusion ERP business, making the case that because only a thousand of the 7,500 legacy on-prem ERP customers from Oracle, JD Edwards, and PeopleSoft have moved to Oracle's Fusion cloud ERP, Oracle's cloud ERP business will surpass $20 billion in five years. In fact, he said it's going to be bigger than that. He slammed the hybrid cloud washing.
You can see one of the quotes here in this chart: when companies have customers running in the cloud and they claim whatever they have on premises is hybrid, he called that ridiculous. I would agree. And then he took an opportunity to slam the hyperscale cloud vendors, citing a telco customer that said Oracle's cloud never goes down. And of course, he chose the same week that AWS had a major outage. And so to these points, I would say that Oracle really was the first tech company to announce a true hybrid cloud strategy, where you have an entirely identical experience on-prem and in the cloud. This was announced with Cloud@Customer, two years before AWS announced Outposts. Now, it probably took Oracle two years to get it working as advertised, but they were first. And to the second point, this is where Oracle differentiates itself. Oracle is number one for mission-critical applications; no other vendor can really come close to Oracle in this regard. And I would say that Oracle's recent quarterly performance is, to a large extent, due to this differentiated approach. Over the past 10 years, we've talked to literally hundreds and hundreds of Oracle customers. And while they may not always like Oracle's tactics and licensing policies in their contracting, they will tell you that the business cases for investing and staying with Oracle are very strong. And yes, a big part of that is lock-in, but R&D investment, innovation, and a keen sense of market direction are just as important to these customers. When your chairman and founder is a technologist and also the CTO, and has the cash on hand to invest, the result is a highly competitive story. Now, that's not to say Oracle is without its challenges. Those who follow this program know that when it comes to ETR survey data, the story is not always pretty for Oracle. So let's take a look.
This chart shows the breakdown of ETR's net score methodology. Net score measures spending momentum, and here's how it works. Each quarter, ETR asks customers: are you adding the platform new? That's the lime green. Are you increasing spend by 6% or more? That's the forest green. Is your spending flat, plus or minus 5%? That's the gray. Is your spending declining by 6% or more? That's the pinkish. Or are you leaving the platform? That's the bright red, retiring. You subtract the reds from the greens, and that yields a net score, which in Oracle's overall case is an uninspiring -4%. This is one of the anomalies in the ETR dataset: the net score doesn't track absolute spending levels, the actual dollars. Remember, as the leader in mission-critical workloads, Oracle commands a premium price. And so what happens here is the gray is still spending a large amount of money, enough to offset the declines, and the greens are spending more than they would on other platforms because Oracle can command higher prices. And so that's how Oracle is able to grow its overall revenue by 6%, for example, even though the ETR methodology doesn't capture that trend. So you have to dig into the data a bit deeper. We're not going to go too deep today, but let's take a look at how some of Oracle's businesses are performing relative to its competitors. This is a popular view that we like to share. It shows net score, or spending momentum, on the vertical axis, and market share on the x-axis. Market share is a measure of pervasiveness in the survey; think of it as mention share. And we've circled Oracle overall; Oracle on-prem, which is declining on the vertical axis; and Oracle Fusion and NetSuite, which are much higher than Oracle overall, and in the case of Fusion, much closer to that magic red horizontal line at 40%. Remember, anything above that line we consider to be elevated.
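As a sanity check on that arithmetic, here is a minimal sketch of the Net Score calculation as just described: the greens (new adoptions plus spend increases) minus the reds (spend decreases plus churn), with the flat gray excluded. The response-share mix below is hypothetical, not an actual ETR result; it is chosen only so the numbers net out to the -4% figure discussed above.

```python
def net_score(adopting, increasing, flat, decreasing, churning):
    """ETR-style Net Score: (% adding new + % increasing spend by 6%+)
    minus (% decreasing spend by 6%+ + % retiring the platform).
    Flat spenders (+/- 5%) count toward the total but not the score."""
    shares = (adopting, increasing, flat, decreasing, churning)
    assert abs(sum(shares) - 100.0) < 1e-9, "response shares should sum to 100%"
    return (adopting + increasing) - (decreasing + churning)

# Hypothetical mix where the reds outweigh the greens, netting out to -4%.
print(net_score(adopting=7, increasing=20, flat=42, decreasing=19, churning=12))  # -4
```

Note what the function cannot see, which is exactly the anomaly described above: a large gray cohort spending premium dollars contributes nothing to the score, so a vendor can post a negative Net Score while still growing revenue.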
Now we've added SAP overall, which has momentum comparable to Fusion in the survey using this methodology, and IBM, which is in between Fusion and Oracle overall on the y-axis. Oracle, as you can see on the horizontal axis, has a larger presence than any of these firms that are below the 40% line. Now, above that 40% line you see companies with a smaller presence in the survey, like Workday; salesforce.com, still a pretty big presence; Google Cloud also; and Snowflake, a smaller presence but a much, much higher net score than anybody else on this chart. And AWS and Microsoft overall, both with a strong presence and impressive momentum, especially for their respective sizes. Now, the view that we just showed you excluded, on purpose, Oracle's specific cloud offering. So let's now take a look at that relative to other cloud providers. This chart shows the same XY view, but it cuts the data by cloud only. And you can see Oracle, while still well below the 40% line, has a net score of +15, compared to the -4 overall that we showed you earlier. So here we see two key points. One, despite the convoluted reporting that we talked about earlier, the ETR data supports that Oracle's cloud business has significantly more momentum than Oracle's overall average. And two, while Oracle is smaller and doesn't have the growth of the hyperscale giants, its cloud is performing noticeably better than IBM's within the ETR survey data. Now, a key point Ellison emphasized on the earnings call was the importance of ERP and the work that Oracle has done in this space. Oracle lives by the notion of a cloud-first mentality: it builds stuff for the cloud and then brings it on-prem. And it's been attracting new customers, according to the company. He said Oracle has 8,500 Fusion ERP customers and 28,000 NetSuite customers in the cloud. And unlike Microsoft, it hasn't migrated its on-prem install base to the cloud yet, meaning these are largely new customers.
Now, this chart isolates Fusion and NetSuite within a sector ETR calls GPP: the giant public and private companies. This is a bellwether of spending in the ETR dataset; ETR has gone back and shown that it correlates to performance. So think large public companies, the biggest ones, and also big privates like Mars, Cargill, or Fidelity. The chart shows the net score breakdown over time for Fusion and NetSuite going back to 2019, and you can see a big uptick, as shown in the blue line, from the October 2020 survey. So Oracle has done a good job building, and now marketing, its cloud ERP to these important customers. Now, the last thing we want to show you is Oracle's performance within industry sectors. On the earnings call, Oracle said that it had very strong momentum for Fusion in financial services and healthcare. And this chart shows the net score for Fusion across each industry sector that ETR tracks, for three survey points: October 2020, the gray bars; July 2021, the blue bars; and October 2021, the yellow bars. So look, it confirms Oracle's assertions across the board that they're seeing Fusion perform very well, including in the two verticals that were called out, healthcare and banking slash financial services. Now, the big question is: where does Oracle go from here? Oracle has had a history of looking like it's going to break out, only to hit some bumps in the road. And so investors are likely going to remain a bit cautious and take profits off the table along the way. But since the Barron's article declaring Oracle a cloud giant came out, and we reported on that earlier this year in February, the stock is of course up more than 50%.
Sixteen of those points were from Friday's move upward, but still, Oracle's highly differentiated strategy of integrating hardware and software together, investing in a modern cloud platform, and selectively offering services that cater to the hardcore mission-critical buyer has served the company, its customers, and investors well. From a cloud standpoint, we'd like to see Oracle be more inclusive and aggressively expand its marketplace and its ecosystem. This would provide both greater optionality for customers and further establish Oracle as a major cloud player. Indeed, one of the hallmarks of both AWS and Azure is the momentum being created by their respective ecosystems. As well, we'd like to see clearer confirmation that Oracle's performance is being driven by its investments in technology, i.e. cloud, same-same hybrid, and industry features, these modern investments, versus legacy license cycles. We are generally encouraged, and are reminded of years ago when Sam Palmisano was retiring and leaving as the CEO of IBM. At the time, HP, ironically under the direction of Mark Hurd, was the hot company, and Palmisano was asked, "do you worry about HP?" And he said, in effect, "I don't worry about HP. I worry about Oracle, because Oracle invests in R&D." And that statement has proven prescient. What do you think? Has Oracle hit the next inflection point? Let me know. Don't forget, these episodes are all available as podcasts wherever you listen; all you've got to do is search Breaking Analysis podcast. Check out ETR's website at etr.plus. We also publish a full report every week on wikibon.com and siliconangle.com. You can get in touch with me on email at David.vellante@siliconangle.com, you can DM me @dvellante on Twitter, or comment on our LinkedIn posts. This is Dave Vellante for theCUBE Insights powered by ETR. Have a great week everybody. Stay safe, be well, and we'll see you next time. (upbeat music)

Published Date : Dec 10 2021


Insights for All


 

>>Yeah. >>Welcome back for our last session of the day how to deliver career making business outcomes with Search and AI. So we're very lucky to be hearing from Canada. Canadian Tire, one of Canada's largest and most successful retailers, have been powered 4.5 1000 employees to maximize the value of data with self service insights. So today we're joining us. We have Yarrow Baturin, who is the manager of Merch analytics and planning to support at Canadian Tire and then also Andrea Frisk, who is the engagement manager manager for thoughts. What s O U R Andrea? Thanks so much for being here. And with >>that, >>I'll pass the mic to you guys. >>Thank you for having us. Um, already, I I think I'll start with an introduction off who I am, what I do. A Canadian entire on what Canadian pair is all about. So, as a manager of Merch analytics at Canadian Tire, I support merchant organization with reporting tools, and then be I platform to enable decision making on a day to day basis. What is? Canadian Tire's Canadian tire is one of the largest retailers in Canada. Um, serving Canadians with a number of lines of business spanning automotive fixing, living, playing and SNG departments. We have a number of banners, including sport check Marks Party City Phl that covers more than 1700 locations. So as an organization, we've got vast variety of different data, whether it's product or loyalty. Now, as the time goes on, the number of asks the number off data points. The complexity of the analysis has been increasing on banned traditional tools. Analytical tools such as Excel Microsoft Access do find job but start hitting their limitations. So we started on the journey of exploring what other B I platforms would be suitable for our needs. And the criteria that we thought about as we started on that journey is to make sure that we enable customization as well as the McCarthy ization of data. What does that mean? 
That means we wanted to ensure that each one of the end users have ability to create their own versions off the report while having consistency from the data standpoint, we also wanted Thio ensure that they're able to create there at hawks search queries and draw insights based on the desired business needs. As each one of our lines of business as each one of our departments is quite unique in their nature. And this is where thoughts about comes into play. Um, you checked off all the boxes? Um, as current customers, as potential customers, you will discover that this is the tool that allows that at hawks search ability within a matter of seconds and ability to visualize the information and create those curated pin boards for each one of the business units, depending on what the needs are. And now where? I guess well, Andrea will talk a little bit more about how we gained adoption, but the usage was like and how we, uh, implemented the tool successfully in the organization. >>Okay, so I actually used to work for Canadian tire on DSO. During that time, I helped Thio build training and engaging users to sort of really kick start our use cases. Andi, the ongoing process of adopting thought spot through Canadian Tire s 01 of the sort of reasons that we moved into using thought spot was there was a need Thio evolve, um, in order to see the wealth of data that we had coming in. So the existing reporting again. And this is this sort of standard thoughts bought fix is, um, it brings the data toe. Everyone on git makes it more accessible, so you get more out of your data. So we want to provide users with the ability to customize what they could see and personalized three information so that they could get their specific business requirements out of the data rather than relying on the weekly monthly quarterly reporting. That was all usually fairly generic eso without the ability to deep dive in. 
So this gave the users the agility thio optimize their campaigns, optimize product murder, urgency where products are or where there's maybe supply chain gaps. Andi just really bring this out for trillions of rose to become accessible. Thio the Canadian tire. That's what user base think. That's the slide. >>That's the slight, Um So as Andrea talked about the business use of the particular tool, let's talk a little bit about how we set it up and a wonderful journey of how it's evolved. So we first implemented 5.3 version of that spot on the Falcon server on we've been adding horsepower to it over time. Now mhm. What I want to stress is the importance off the very first, Data said. That goes into the tool toe. Actually engage the users and to gain the adoption and to make sure there is no argument whether the tool is accurate or not. So what we've started with is a key p I marked layer with all the major metrics that we have and all the available permutations and combinations off the dimensions, whether it's a calendar dimension, proud of dimension or, let's say, customer attribute now, as we started with that data set, we wanted to make sure that we're we have the ability to add and the dimensions right. So now, as we're implementing the tool, we're starting to add in more dimension tables to satisfy the needs off our clients if you want to call it that way as they want to evolve their analytics. So we started adding in some of the store attributes we started adding in some of the product attributes on when I refer to a product attributes, let's say, uh, it involves costs and involves prices involved in some of the strategic internal pieces that we're thinking about now as the comprehensive mark contains right now, in our instance, close to five billion records. This is where it becomes the one source of truth for people declaring information against right so as they go in, we also wanted to make sure when they Corey thought spot there, we're really Onley. 
According one source of data. One source of truth. It became apparent over time, obviously, that more metrics are needed. They might not be all set up in that particular mark. And that's when we went on the journey off implementing some of the new worksheets or some of the new data sets particularly focused on the four looking pieces. And uh, that's where it becomes important to say This is how you gain the interest and keep the interests of the public right. So you're not just implementing a number off data sets all at once and then letting the users be you're implementing pieces and stages. You're keeping the interest thio, the tool relevant. You're keeping, um, the needs of the public in mind. Now, as you can imagine on the Falcon server piece, um, adding in the horsepower capacity might become challenging the mawr. Billions of Rosie erratic eso were actually in the middle of transitioning our environment to azure in snowflake so that we can connect it. Thio embrace capability of thoughts cloud. And that's where I'm looking forward to that in 2021 I truly believe this will enable us Thio increase the speed off adoption Increase the speed of getting insights out of the tool and scale with regards Thio new data sets that we're thinking about implementing as we're continuing our thoughts about journey >>Okay, so how we drove adoption Thio 4500 plus users eso When we first started Thio approach our use case with the merchants within Canadian Tire We had meetings with these users with who are used place is gonna be with and sort of found out. What are they searching for, Where they typically looking at what existing reports are available for them. 
Andi kind of sought out to like, What are those things where you're pulling this on your own or someone else's pulling this data because it's not accessible yet And we really use that as our foundation to determine one what data we needed to initially bring into the system but also to sort of create those launchpad pin boards that had the base information that the users we're gonna need so that we could twofold, make it easy for them, toe adopt into the tool and also quickly start Thio, deactivate or discontinue those reports. And just like these air now only available in thought spot because with the sort of formatting within thought spot around dates, it's really easy to make this year's report last year report etcetera. Just have everything roll over every month or a recorder s. So that was kind of some of the pre work foundation when we originally did it. But really, it's been a lot of training, a lot of training. So we conducted ah, lot of in person training, obviously pre co vid eso. We've started to train the group that we targeted, which was the merchants and all of the like, surrounding support groups. Eso we had planners going in and training as well, so that everyone who was really closely connected to the merchants I had an idea of what thoughts about what was and how to use it and where the reports were, and so we just sort of rolled it out that way, and then it started to fly like wildfire. Eso the merchants start to engage with supply chain to have conversations, or the merchants were engaging with the vendors to sort of have negotiations about pricing. 
And they're creating these reports and getting the access to the information so quickly, and they're sharing it out that we had other groups just coming to us asking, How do I get into thoughts about how can I get in on DSO on top of those groups, we also sought out other heavy analytics groups such a supply chain where we felt like they could have the same benefits if they on boarded into thought spot with their data as well on Ben. Just continuing to evolve the training roll out. Um, you know, we continued to engage with the users, >>so >>we had a newsletter briefly Thio, sort of just keep informing users of the new data coming in or when we actually upgraded our system. So the here are the new features that you'll start seeing. We did virtual trainings and maintaining an F A Q document with the incoming questions from the users, and then eventually evolved into a self guided learning so that users that were coming to a group, or maybe we've already done a full rollout could come in and have the opportunity to learn how to use thought spot, have examples that were relevant to the business and really get started. Eso then each use case sort of after our initial started to build into a formula of the things that we needed to have. So you need to understand it. Having SMEs ready and having the database Onda worksheets built out sort of became the step by step path to drive adoption. Um, from an implementation timeline, I think they're saying, Took about two months and about half of that waas Kenny entire figuring out how figuring out our security, how to get the data in on, Do we need the time to set up the environment and get on Falcon? So then, after that initial two months, then each use case that we come through. Generally, we've got users trained and SMEs set up within about 2 to 3 weeks after the data is ingested. 
Obviously, once Snowflake is set up and the data starts to feed in, then you're really just looking at the two to three weeks, because the data is easily connected in. >> Um, all right, let's talk about some of the use cases. So we started with what data we've implemented, and Andrea touched upon what user training looked like and what the curation piece was. Now let's talk a little bit about use cases and how we actually leverage ThoughtSpot to gather the insights. So the very first one is ultimately the benefit of the tool to the entire organization: real-time insights. To reiterate what Andrea said, we first implemented the tool with our buyers. They're the nucleus of any retail organization, as they work with everybody within the company, and it's the buyers' responsibility to ensure both the procurement and the sales channels stay afloat at the end of the day, right? So they need information on a regular basis. They need it fast, they need it timely, and they need it in a fashion that they choose to digest it in, right? Not every business is the same. Not every individual is the same. They consume, digest, and analyze information differently. And that's what ThoughtSpot allows you to do, whether it's the search, whether it's a customized pinboard. Now, supply chain, an unexpected one. As Andrea mentioned, I work a lot with supply chain. What is the goal of supply chain? To receive product and to be able to ship that product to the stores. Now, as our organization has been growing and is doing extremely well, and we've actually published Q3 results recently, the aspect of prioritization at the DC level becomes very important. And what drives some of that prioritization is the analysis around what the upcoming sales would be for specific products, for specific categories. And that's where, again, ThoughtSpot comes in.
It is one of the tools that we've utilized recently to set our prioritization logic from both an inbound and an outbound standpoint, right? Because it gives you the most recent results, it gives you the most granular results, depending on the business problem that you're trying to tackle. Now let's chat a little bit about the COVID-19 response, because this one is an extremely interesting case. As the pandemic hit back in March, as you can imagine, everyday life at Canadian Tire became, as our executives referred to it, business unusual. Under business unusual, the speed and the intensity of the insights and the analytics has grown exponentially. And the speed and the intensity of the insights is driven by the fact that we were trying to ensure that we have the right selection of products for our Canadian customers, because ultimately the bread and butter of all retailers is the customers, right? So ThoughtSpot allowed us to see early trends in both sales and inventory patterns: whether we were stocking out of some of the products in specific stores or provinces, whether we saw an uplift in different lines of business depending on the regionality, right? As the pandemic hit, for example, gyms closed, restaurants closed. And as Canadian Tire carries a wide variety of different lines of business, we actually offer a wide selection of exercise equipment and accessories, cycling products, as well as kitchen appliances and kitchen accessories, right? So all of those items started growing exponentially, and in certain areas more than others. And this is where ThoughtSpot comes into play. A typical analysis of what the regionality of the sales has been over the last couple of days, which is a lifetime in pandemic terms, could have taken days or weeks for analysts to ultimately cobble together in an Excel spreadsheet.
Meanwhile, it can take a couple of seconds for one to query in ThoughtSpot and set up a pinboard that can be shared with a wide variety of individuals, rather than forwarding that one Excel spreadsheet that gets manipulated every single time, and then you don't get the right insight. So from the merch, supply chain, and COVID response aspects of things, ThoughtSpot has been one of those blessings, and one of those amazing tools to utilize to improve the speed of insights, improve the speed of analytics, and improve the speed of decision making that's ultimately impacting the end consumer at the store level. So Andrea talked about the 4,500 users that we have. That number is cool, but what I've recently liked to focus on, uh, Andrea and I are laughing, because I think the last time we spoke at a larger forum with the ThoughtSpot community, I think we had only 500 users. That was in the beginning >> of the year, in February. We were aiming to have, like, 1,000. >> Exactly. So mission accomplished. So we've got 4,500 employees now. Everybody asks me, yeah, that's a big number, but how many times do people actually log in on a weekly or daily basis? I'm more interested in that statistic. So lately, we've had more than 400 users on a weekly basis. What's been cool lately is the exponential growth of ad hoc queries. So throughout October, we reached 75,000 ad hoc queries in our system, and about 13,000 pinboard views. So why is that significant? We started off, I would say, in January of 2020, which Andrea referred to, with about 40 to 45,000 ad hoc queries a month. So again, that was cool, but at the end of the day, we were able to double that amount as more people migrate to ad hoc searches from pinboard views. And that's a tremendous phenomenon, because that's what ThoughtSpot is all about. So I touched upon a little bit about exercise and cycling.
So these are our quarterly results for Q2, which showed tremendous growth that we did not plan for, and that we were able to achieve with, ultimately, the individuals who work throughout the organization, whether it's the merch organization or the supply chain side of the business, coming together and utilizing a BI platform, tools such as ThoughtSpot; we can see triple-digit growth results. So what's next for us? Users, ad hoc searches, that's fantastic. I would still like to get to more than 1,200 people using it on a weekly basis. The cool number to me is if all of our lifetime users were getting into the tool on a weekly basis; that would be cool. And what's proven to be true is that ultimately, the only way to achieve it is to keep surprising and delighting them, and you're surprising and delighting them with the functionality of the tool and with more of the relevant content, ultimately adding in more data. That is, again, possible through ETLs, and it's possible through pulling that information manually, but it's expensive. Expensive not in the sense of monetary value, but expensive in terms of time and all of those aspects of things. So what I'm looking forward to is migrating our platform to Azure and Snowflake, and being able to scale our insights accordingly. Adding more data, adding more insights, more individual worksheets and data sets for people to query against, helps each one of the individuals learn and get some of the insights. It helps my team in particular be more well versed in the data that we have existing throughout the organization. And now Andrea can touch upon how we scale it further, and how each one of the individuals can become better with this wonderful tool. >> Yeah, so as Yaro mentioned, the ad hoc searches going up, it's sort of a little internal victory, because our starting platform had really been to build the pinboards to replicate what the users were already expecting.
So that was sort of how we easily got people in, and then we just cut off the tap to whatever the previous report was. So it gave them a way to get into the tool and understand the information. So now that they're using ad hoc search, it really means they understand the tool; they have the data literacy to access the information and use it how they need. So it's a really cool piece that we worked on for Canadian Tire, a very report-oriented and report-heavy organization. So it was a good starting platform, and seeing those ad hoc searches go up is great. One of the ways that we sort of scaled out of our initial group, and I kind of mentioned this earlier, I sort of stepped on my own toes here, is that once it was a proven success with the merchants and it started to spread through word of mouth, we sought out the analyst teams. We really just kept sort of driving the insights, finding the data, and learning more about the pieces of the business. As much as Yaro would like to think he knows everything about everything, he only knows what he knows. So you have to continue to cultivate the internal champions to really keep growing the adoption. So find those SMEs that are excited about the possibility of using ThoughtSpot and what they can do with it. You need to find those people, because they're the ones who are going to be excited to have this rapid access to the information, and also to just be able to quickly spend less time telling a user how to access it in ThoughtSpot than they would running the report. Because, as Yaro mentioned, we basically hit a curiosity tax, right? You didn't want to search for things, or you didn't want to ask questions of the data, because it was so cumbersome; it took too much time to get the data, and if you didn't know exactly what you were looking for, it was worse. So, you know, you wouldn't run a query and be like, oh, that's interesting.
Let me now run another query of all that information to get more data. It's just not time-effective or resource-effective at that point. So scaling the adoption is really about cultivating those people who are really into it as well. From a personal development perspective, sort of as a user, I mean, who doesn't like being the smartest person in the room? And ThoughtSpot sort of provides that possibility. And it makes it easier for you to get recognized for delivering results and valuable insights, and for sort of driving the business forward. So, you know, be that all-star, be the trailblazer with all the answers, and then you can just sort of find what you really like: helping the organization realize the power of ThoughtSpot, and maybe make it into a career. >> Amazing. I love that you've joined us, Andrea. Such an amazing career trajectory, no bias at all on my side. So, heaps of great information there. Thank you both so much for sharing your story on driving such amazing adoption, and the impact that you've been able to make at your organization through that. We've got a couple of minutes remaining, so just enough time for questions. So Andrea, our first question is for you. From your experience, what is one thing you would recommend to new ThoughtSpot users? >> Um, yeah, I would say be curious and creative. There's one phrase that we used a lot in training, which was just mess around in the tool. It sort of became a catchphrase, but it is really true: just try and use it, you can't break it. So just play around, try it. The only limitation on what you're gonna find is your own creativity. And the last thing I would say is, don't get trapped by trying to replicate things exactly as they were. "This is how we've always done it" isn't necessarily the best move, and isn't necessarily gonna find new insights, right? So the change forces you to look at things from a different perspective and
find new value in the data. >> Yeah, absolutely. Sage advice there. And another one here for Yaro. So I guess our theme for Beyond this year is analytics meets cloud, open for everyone. So, in your experience, what does that mean for you? >> Wonderful question. Yeah, listen, Angela. Okay, so to me, in short, it means scale, and it means turning, sorry, a no into a yes. Now I'm gonna elaborate; Andrea is laughing at me a little bit. That's right, I can talk fancy too. Okay. So, scale. From the scale perspective, cloud, as I touched upon throughout our conversation and our presentation, enables your ability to store more data and have access to more data, without necessarily employing a number of ETL developers and going through a number of security aspects with different data sources. Now, turning a no into a yes, what does that mean? With more data and more scalability, the analytics possibilities become infinite. Throughout my career at Canadian Tire and other organizations, if you don't necessarily have access to data, or you do not have the necessary granularity, you always tell individuals, no, it's not possible, I'm not able to deliver that result. And quite often that becomes the norm; saying no becomes the norm. And I think what we're all striving towards here on this call, as part of the conference, is turning that no into a yes, and then making a yes the new standard, the new norm, as we have more access to the data, more access to the insights. So that would be my answer. >> Love it. Amazing. Well, that kind of brings an end to this session. So thank you, everyone, for joining us today, and that wraps up this stream. Don't miss the upcoming product roadmap. We'll be sticking around to speak to some of the speakers you heard earlier today, and in the Meet the Experts round table, you can absolutely continue the conversation with this live Q&A. So you've got an opportunity here to ask the questions that maybe keep you up at night, perhaps. So stay tuned for the Meet the Experts: Secrets to Scaling Analytics Adoption session after the product roadmap session. Thanks, everyone. And thank you again for joining us. Guys, appreciate it. >> Thank you. Thanks. Thanks.

Published Date : Dec 10 2020



IBM Flash System 9100 Digital Launch


 

(bright music) >> Hi, I'm Peter Burris, and welcome to another special digital community event, brought to you by theCUBE and Wikibon. We've got a great session planned for the next hour or so. Specifically, we're gonna talk about the journey to the data-driven multi-cloud. Sponsored by IBM, with a lot of great thought leadership content from IBM guests. Now, what we'll do is, we'll introduce some of these topics, we'll have these conversations, and at the end, this is gonna be an opportunity for you to participate, as a community, in a crowd chat, so that you can ask questions, voice your opinions, hear what others have to say about this crucial issue. Now why is this so important? Well Wikibon believes very strongly that one of the seminal features of the transition to digital business, driving new-type AI classes of applications, et cetera, is the ability of using flash-based storage systems and related software, to do a better job of delivering data to more complex, richer applications, faster, and that's catalyzing a lot of the transformation that we're talking about. So let me introduce our first guest. Eric Herzog is the CMO and VP Worldwide Storage Channels at IBM. Eric, thanks for coming on theCUBE. >> Great, well thank you Peter. We love coming to theCUBE, and most importantly, it's what you guys can do to help educate all the end-users and the resellers that sell to them, and that's very, very valuable and we've had good feedback from clients and partners, that, hey, we heard you guys on theCUBE, and very interesting, so I really appreciate all the work you guys do. >> Oh, thank you very much. We've got a lot of great things to talk about today. First, and I want to start it off, kick off the proceedings for the next hour or so by addressing the most important issue here. Data-driven. 
Now Wikibon believes that digital transformation means something, it's the process by which a business treats data as an asset, and re-institutionalizes its work and changes the way it engages with customers, et cetera. But this notion of data-driven is especially important because it elevates the role that storage is gonna play within an organization. Sometimes I think maybe we shouldn't even call it storage. Talk to us a little bit about data-driven and how that concept is driving some of the concepts in innovation that are represented in this and future IBM products. >> Sure. So I think the first thing, it is all about the data, and it doesn't matter whether you're a small company, like Herzog's Bar and Grill, or the largest Fortune 500 in the world. The bottom line is, your most valuable asset is you data, whether that's customer data, supply chain data, partner data that comes to you, that you use, services data, the data you guys sell, right? You're an analysis firm, so you've got data, and you use that data to create you analysis, and then you use that as a product. So, data is the most critical asset. At the same time, data always goes onto storage. So if that foundation of storage is not resilient, is not available, is not performant, then either A, it's totally unavailable, right, you can't get to the customer data. B, there's a problem with the data, okay, so you're doing supply chain and if the storage corrupts the data, then guess what? You can't send out the T-shirts to the right retail location, or have it available online if you're an online retailer. >> Or you sent 200,000 instead of 20, and you get stuck with the bill. >> Right, exactly. So data is that incredible asset and then underneath, think of storage as the foundation of a building. Data is your building, okay, and all the various aspects of that data, customer data, your data, internal data, everything you're doing, that's the building. 
If the foundation of the building isn't rock solid, the building falls down, whether your building is big or small. And that's what storage does, and then storage can also optimize the building above it. So think of it as more than just the foundation, but a foundation, if you will, that almost has like a tree, with things that come up from the bottom, and you have that beautiful image, and storage can help you out. For example, metadata. Metadata, which is data about data, could be used by analytics packages, and well, guess what? That metadata could be exposed by the storage. So that's why data-driven is so important from an end-user perspective, and why storage is that foundation underneath a data-driven enterprise. >> Now we've seen a lot of folks talk about how cloud is the centerpiece of thinking about infrastructure. You're suggesting that data is the centerpiece of infrastructure, and cloud is gonna be an implementation decision: where do I put the workloads, the costs, all the other elements associated with it. But it suggests ultimately that data is not gonna end up in one place. We have to think about data as being where it needs to be to perform the work. That suggests multi-cloud, multi-premise. Talk to us a little bit about the role that storage and multi-cloud play together. >> So let's take multi-cloud first and peel that away. So multi-cloud, we see a couple of different things. So first of all, certain companies don't want to use a public cloud, whether it's a security issue, or the fact that some people have found out that public cloud providers, no matter who the vendor is, are sort of a razor and razor blades: very cheap to put the storage out there, but if you want certain SLAs, guess what? The cloud vendors charge more. And if you move data around a lot, in and out as you were describing, because it's really that valuable, guess what? You get charged for the ingress and egress by the cloud provider. So it's almost the razor and the razor blades.
So A, there's a cost factor in public only. B, you've got people that have security issues. C, what we've seen is, in many cases, hybrid. So certain datasets go out to the cloud and other datasets stay on the premises. So you've got that aspect of multi, which is public, private or hybrid. The second aspect, which is very common in bigger companies that are either divisionalized or large geographically, is literally the usage, in a hybrid or a public cloud environment, of multiple cloud vendors. So for example, in several countries the data has to physically stay within the confines of that country. So if you're a big enterprise and you've got offices in 200 different, well not 200, but 100 different countries, and in 20 of 'em you have to keep the data in that country by law, and your cloud provider doesn't have a data center there, you need to use a different cloud provider. So you've got that. And I would also argue that the cloud is not new anymore. The internet is the original cloud. So it's really old.
So the key thing that we do at IBM, is make sure that whichever model you take, public, private or hybrid, or multiple public clouds, or multiple public cloud providers, using a hybrid configuration, that we can support that. So things like our transparent cloud tiering, we've also recently created some solution blueprints for multi-clouds. So these things allow you to simply and easily deploy. Storage has to be viewed as transparent to a cloud. You've gotta be able to move the data back and forth, whether that be backing the data up, or archiving the data, or secondary data usage, or whatever that may be. And so storage really is, gotta be multi-cloud and we've been doing those solutions already and in fact, but honestly for the software side of the IBM portfolio for storage, we have hundreds of cloud providers mid, big and small, that use our storage software to offer backup as a service or storage as a service, and we're again the software foundation underneath what an end-user would buy as a service from those cloud providers. >> So I want to pick up on a word you used, simplicity. So, you and I are old infrastructure hacks and for many years I used to tell my management, infrastructure must do no harm. That's the best way to think about infrastructure. Simplicity is the new value proposition, complexity remains the killer. Talk to us a little bit about the role that simplicity in packaging and service delivery and everything else is again, shaping the way you guys, IBM, think about what products, what systems and when. >> So I think there's a couple of things. First of all, it's all about the right tool for the right job. So you don't want to over-sell and sell a big, giant piece of high-end all-flash array, for example, to a small company. They're not gonna buy that. So we have created a portfolio of which our FlashSystem 9100 is our newest product, but we've got a whole set of portfolios from the entry space to the mid range to the high end. 
We also have stuff that's tuned for applications. So for example, our Elastic Storage Server, which comes in an all-flash configuration, is ideal for big data analytics workloads. Our DS8000 family of flash is ideal for mainframe attach, and in fact close to 65% of all mainframe-attached storage is from IBM. But you have the right tool for the right job, so that's item number one. The second thing you want to do is make it easier and easier to use, whether that be configuring the physical entity itself, so how do you cable it, how do you rack and stack it, and making sure that it easily integrates into whatever else they're putting together in their data center, be it a cloud data center or a traditional on-premises data center, it doesn't matter. The third thing is all about the software. So how do you have software that makes the array easier and easier to use, and is heavily automated based on AI? So the old automation way, and we've both been in that era, was you set policies. Policy-based management, when it came out 10 years ago, was a transformational event. Now it's all about using AI in your infrastructure. Not only does your storage need to be right to enable AI at the server workload level, but we're saying, we've actually deployed AI inside of our storage, making it easier for the storage manager or the IT manager, and in some cases even the app owner, to configure the storage, 'cause it's automated. >> Going back to that notion that the storage knows something about the metadata, too. >> Right, exactly, exactly. So the last thing is our multi-cloud blueprints. So in those cases, what we've done is create these multi-cloud blueprints. For example, disaster recovery and business continuity using a public cloud. Or secondary data use in a public cloud. How do you go ahead and take a snapshot, a replica or a backup, and use it for dev-ops or test or analytics?
And by the way, our Spectrum copy data management software allows you, but you need a blueprint so that it's easy for the end user, or for those end users who buy through our partners, our partners then have this recipe book, these blueprints, you put them together, use the software that happens to come embedded in our new FlashSystem 9100 and then they use that and create all these various different recipes. Almost, I hate to say it, like a baker would do. They use some base ingredients in baking but you can make cookies, candies, all kinds of stuff, like a donut is essentially a baked good that's fried. So all these things use the same base ingredients and that software that comes with the FlashSystem 9100, are those base ingredients, reformulated in different models to give all these multi-cloud blueprints. >> And we've gotta learn more about vegetables so we can talk about salad in that metaphor, (Eric laughing) you and I. Eric once again. >> Great, thank you. >> Thank you so much for joining us here on the CUBE. >> Great, thank you. >> Alright, so let's hear this come to life in the form of a product video from IBM on the FlashSystem 9100. >> Some things change so quickly, it's impossible to track with the naked eye. The speed of change in your business can be just as sudden and requires the ability to rapidly analyze the details of your data. The new, IBM FlashSystem 9100, accelerates your ability to obtain real-time value from that information, and rapidly evolve to a multi-cloud infrastructure, fueled by NVMe technology. In one powerful platform. IBM FlashSystem 9100, combines the performance, of IBM FlashCore technology. The efficiency of IBM Spectrum Virtualize. 
The IBM software solutions, to speed your multi-cloud deployments, reduce overall costs, plan for performance and capacity, and simplify support using cloud-based IBM storage insights to provide AI-powered predictive analytics, and simplify data protection with a storage solution that's flexible, modern, and agile. It's time to re-think your data infrastructure. (upbeat music) >> Great to hear about the IBM FlashSystem 9100 but let's get some more details. To help us with that, we've got Bina Hallman who's the Vice President Offering Management at IBM Storage. Bina, welcome to theCUBE. >> Well, thanks for having me. It's an exciting even, we're looking forward to it. >> So Bina, I want to build on some of the stuff that we talked to Eric about. Eric did a good job of articulating the overall customer challenge. As IBM conceives how it's going to approach customers and help them solve these challenges, let's talk about some of the core values that IBM brings to bear. What would you say would be one of the, say three, what are the three things that IBM really focuses on, as it thinks about its core values to approach these challenges? >> Sure, sure. It's really around helping the client, providing a simple one-stop shopping approach, ensuring that we're doing all the right things to bring the capabilities together so that clients don't have to take different component technologies and put them together themselves. They can focus on providing business value. And it's really around, delivering the economic benefits around CapEx and OpEx, delivering a set of capabilities that help them move on their journey to a data-driven, multi-cloud. Make it easier and make it simpler. >> So, making sure that it's one place they can go where they can get the solution. But IBM has a long history of engineering. Are you doing anything special in terms of pre-testing, pre-packaging some of these things to make it easier? 
>> Yeah, we over the years have worked with many of our clients around the world and helping them achieve their vision and their strategy around multi-cloud, and in that journey and those set of experiences, we've identified some key solutions that really do make it easier. And so we're leveraging the breadth of IBM, the power of IBM, making those investment to deliver a set of solutions that are pre-tested, they are supported at the solutions level. Really focusing on delivering and underpinning the solutions with blueprints. Step-by-step documentation, and as clients deploy these solutions, they run into challenges, having IBM support to assist. Really bringing it all together. This notion of a multi-cloud architecture, around delivering modern infrastructure capabilities, NVMe acceleration, but also some of our really core differentiation that we deliver through FlashCore data reduction capabilities, along with things like modern data protection. That segment is changing and we really want to enable clients, their IT, and their line of business to really free them up and focus on a business value, versus putting these components together. So it's really around taking those complex things and make them easier for clients. Get improved RPO, RTO, get improved performance, get improved costs, but also flexibility and agility are very critical. >> That sounds like therefore, I mean the history of storage has been trade-offs that you, this can only go that fast, and that tape can only go that fast but now when we start thinking about flash, NVMe, the trade-offs are not as acute as they used to be. Is IBM's engineering chops capable of pointing how you can in fact have almost all of this at one time? >> Oh absolutely. The breadth and the capabilities in our R and D and the research capabilities, also our experiences that I talked about, engagements, putting all of that together to deliver some key solutions and capabilities. 
Like, look, everybody needs backup and archive. Backup to recover your data in case a disaster occurs, archive for long-term retention. That data management, the data protection segment, is going through a transformation. New emerging capabilities, new ways to do backup. And what we're doing is pulling all of that together, with things that we introduced, for example, our Protect Plus in the fourth quarter, along with this FS 9100 and the cloud capabilities, to deliver a solution around data protection and data reuse, so that you have a modern backup approach for both virtual and physical environments that is really based on things like snapshots and mountable copies, so you're not using that traditional approach of recovering your copy from a backup by bringing it back. Instead, all you're doing is mounting one of those copies and instantly getting your application back and running for operational recovery. >> So to summarize some of those values: one-stop, pre-tested, advanced technologies, smartly engineered. You guys did something interesting on July 10th. Why don't you talk about how those values, and the understanding of the problem, manifested so fast in a pretty exciting set of new products that you introduced on July 10th. >> Absolutely. On July 10th we not only introduced our flagship FlashSystem, the FS 9100, which delivers some amazing client value around the economic benefits of CapEx and OpEx reduction, but also seamless data mobility, data reuse, and security. All the things that are important for a client on their cloud journey. In addition to that, we infused that offering with AI-based predictive analytics, and of course that performance and NVMe acceleration is really key. But in addition to doing that, we've also introduced some very exciting solutions. Really three key solutions: one around data protection and data reuse, to enable clients to get that agility, and second is around business continuity and data reuse.
To be able to really reduce the expense of having business continuity in today's environment. It's a high-risk environment, and disruptions are inevitable, but being prepared to mitigate some of those risks and maintain operational continuity is important, by doing things like leveraging the public cloud for your DR capabilities. That's very important, so we introduced a solution around that. And the third is around private cloud. Taking your IBM storage, your FS 9100, along with the heterogeneous environment you have, and making it cloud-ready. Getting the cloud efficiencies. Making it so you can use those environments to create things like native cloud applications that are portable from on-prem into the cloud. So those are some of the key ways that we brought this together to really deliver on client value. >> So could you give us just one quick use case of your clients that are applying these technologies to solve their problems? >> Yeah, so let me use the first one that I talked about, the data protection and data reuse. So to be able to take your on-premise environment, really apply an abstraction layer, set up catalogs, set up SLAs and access control, but then be able to step away and manage that storage all through APIs. We have a lot of clients that are doing that, and then taking that, making the snapshots, using those copies for things like disaster recovery or secondary use cases like analytics and dev-ops.
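The recovery-by-mounting pattern described above, taking snapshots, cataloging them, and mounting a copy instead of restoring it, can be sketched as a toy in-memory model. The class and method names here are hypothetical illustrations of the workflow, not an actual IBM API.

```python
import time

class SnapshotCatalog:
    """Toy in-memory snapshot catalog (hypothetical names, illustrative only)."""

    def __init__(self):
        self._snapshots = {}  # app name -> list of (timestamp, point-in-time copy)

    def take_snapshot(self, app, data):
        # Record a point-in-time copy of the application's data.
        self._snapshots.setdefault(app, []).append((time.time(), dict(data)))

    def mount_latest(self, app):
        # Operational recovery: mount the newest copy in place.
        # There is no bulk copy-back from a backup image.
        _, data = self._snapshots[app][-1]
        return data

catalog = SnapshotCatalog()
catalog.take_snapshot("orders-db", {"rows": 1000})
catalog.take_snapshot("orders-db", {"rows": 1250})
recovered = catalog.mount_latest("orders-db")
print(recovered["rows"])  # -> 1250
```

The same mounted copies can also feed secondary use cases such as dev-ops test data or analytics, without touching production.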
You know, dev-ops is a really important use case and our clients are really leveraging some of these capabilities for it because you want to make sure that, as application developers are developing their applications, they're working with the latest data and making sure that the testing they're doing is meaningful in finding the maximum number of defects so you get the highest quality of code coming out of them and being able to do that, in a self-service driven way so that they're not having to slow down their innovation. We have clients leveraging our capabilities for those kinds of use cases. >> It's great to hear about the FlashSystem 9100 but let's hear what customers have to say about it. Not too long ago, IBM convened a customer panel to discuss many aspects of this announcement. So let's hear what some of the customers had to say about the FlashSystem 9100. >> Now Owen, you've used just about every flash system that IBM has made. Tell us, what excites you about this announcement of our new FlashSystem 9100. >> Well, let's start with the hardware. The fact that they took the big modules from the older systems, and collapsed that down to a two-and-a-half inch form-factor NVMe drive is mind-blowing. And to do it with the full speed compression as well. When the compression was first announced, for the last FlashSystem 900, I didn't think it was possible. We tested it, I was proven wrong. (laughing) It's entirely possible. And to do that on a small form-factor NVMe drive is just astounding. Now to layer on the full software stack, get all those features, and the possibilities for your business, and what we can do, and leverage those systems and technologies, and take the snapshots in the replication and the insights into what our system's doing, it is really mind-blowing what's coming out today and I cannot wait to just kick those tires. There's more. 
So with that real-world compression ratio, that we can validate on the new 900, and it's the same in this new system, which is astounding, but we can get more, and just the amount of storage you get in this really small footprint. Like, two rack units is nothing. Half our servers are two rack units, which is absolutely astounding, to get that much data in such a small package. Like, 460 terabytes is phenomenal, with all these features. The full solution is amazing, but what else can we do with it? And especially as they've said, if it's at a comparable price to what we've bought before, and we're getting the full solution with the software, the hardware, the extremely small form-factor, what else can you do? What workloads can you pull forward? So where our backup systems weren't on the super fast storage like our production systems are, now we can pull those forward and they can give the same performance as production to run the back-end of the company, which I can't wait to test. >> It's great to hear from customers. The centerpiece of the Wikibon community. But let's also get the analyst's perspective. Let's hear from Eric Burgener, who's the Research Vice President for Storage at IDC. >> Thanks very much Peter, good to be back. >> So we've heard a lot from a number of folks today about some of the changes that are happening in the industry and I want to amplify some things and get the analyst's perspective. So Wikibon, as a fellow analyst, Wikibon believes pretty strongly that the emergence of flash-based storage systems is one of the catalyst technologies that's driving a lot of the changes. If only because old storage technologies are focused on persisting data. Disk, slow, but at least it was there. Flash systems allow a bit flip, they allow you to think about delivering data to anywhere in your organization. Different applications, without a lot of complexity, but it's gotta be more than that.
What else is crucial, to making sure that these systems in fact are enabling the types of applications that customers are trying to deliver today. >> Yeah, so actually there's an emerging technology that provides the perfect answer to that, which is NVMe. If you look at most of the all-flash systems that have shipped so far, they've been based around SCSI. SCSI was a protocol designed for hard disk drives, not flash, even though you can use it with flash. NVMe is specifically designed for flash and that's really gonna open up the ability to get the full value of the performance, the capacity utilization, and the efficiencies, that all-flash arrays can bring to the market. And in this era of big data, more than ever, we need to unlock that performance capability. >> So as we think about the big data, AI, that's gonna have a significant impact overall in the market and how a lot of different vendors are jockeying for position. When IDC looks at the impact of flash, NVMe, and the reemergence of some traditional big vendors, how do you think the market landscape's gonna be changing over the next few years? >> Yeah, how this market has developed, really the NVMe-based all-flash arrays are gonna be a carve-out from the primary storage market which are SCSI-based AFAs today. So we're gonna see that start to grow over time, it's just emerging. We had startups begin to ship NVMe-based arrays back in 2016. This year we've actually got several of the majors who've got products based around their flagship platforms that are optimized for NVMe. So very quickly we're gonna move to a situation where we've got a number of options from both startups and major players available, with the NVMe technology as the core. >> And as you think about NVMe, at the core, it also means that we can do more with software, closer to the data. So that's gotta be another feature of how the market's gonna evolve over the next couple of years, wouldn't you say? >> Yeah, absolutely. 
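One way to make the SCSI-versus-NVMe point above concrete is command parallelism. The figures below are the commonly cited protocol-level limits (a single SCSI queue of roughly 256 commands versus the NVMe spec's maximum of 64K queues with 64K commands each); real devices expose far fewer queues, so treat this as an illustration of headroom, not a benchmark.

```python
# Commonly cited protocol-level limits (illustrative, not device specs).
scsi_queues, scsi_depth = 1, 256          # legacy SAS/SCSI: one command queue
nvme_queues, nvme_depth = 65_535, 65_536  # NVMe spec maximums

scsi_outstanding = scsi_queues * scsi_depth
nvme_outstanding = nvme_queues * nvme_depth

# NVMe allows orders of magnitude more commands in flight.
print(nvme_outstanding // scsi_outstanding)  # -> 16776960
```

That parallelism is what lets flash media, which can service many operations concurrently, escape the single-queue bottleneck a protocol designed for hard disk drives imposes.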
A lot of the data services that generate latencies, like in-line data reduction, encryption and that type of thing, we can run those with less impact on the application side when we have much more performant storage on the back-end. But I have to mention one other thing. To really get all that NVMe performance all the way to the application side, you've gotta have an NVMe Over Fabric connection. So it's not enough to just have NVMe in the back-end array but you need that RDMA connection to the hosts and that's what NVMe Over Fabric provides for you. >> Great, so that's what's happening on the technology-product-vendor side, but ultimately the goal here is to enable enterprises to do something different. So what's gonna be the impact on the enterprise over the next few years? >> Yeah, so we believe that SCSI clearly will get replaced in the primary storage space, by NVMe over time. In fact, we've predicted that by 2021, we think that over 50% of all the external, primary storage revenue, will be generated by these end-to-end NVMe-based systems. So we see that transition happening over the course of the next two to three years. Probably by the end of this year, we'll have NVMe-based offerings, with NVMe Over Fabric front ends, available from six of the established storage providers, as well as a number of smaller startups. >> We've come a long way from the brown, spinning stuff, haven't we? >> (laughing) Absolutely. >> Alright, Eric Burgener, thank you very much. IDC Research Vice President, great once again to have you in theCUBE. >> Thanks Peter. >> Always great to get the analyst's perspective, but let's get back to the customer perspective. Again, from that same panel that we saw before, here's some highlights of what customers had to say about IBM's Spectrum family of software. (upbeat music) We love hearing those customer highlights but let's get into some of the overall storage trends and to do that we've asked Eric Herzog and Bina Hallman back to theCUBE. 
Eric, Bina, thanks again for coming back. So, what I want to do now is talk a little bit about some trends within the storage world and what the next few years are gonna mean, but Eric, I want to start with you. I was recently at IBM Think, and Ginni Rometty talked about the idea of putting smart to work. Now, I can tell you, that means something to me because of the whole notion of how data gets used, how work gets institutionalized around your data. What does storage do in that context? To put smart to work. >> Well I think there's a couple of things. First we've gotta realize that it's not about storage, it's about the data and the information that happens to sit on the storage. So you have to have storage that's always available, always resilient, is incredibly fast, and as I said earlier, transparently moves things in and out of the cloud, automatically, so that the user doesn't have to do it. The second thing that's critical is the integration of AI, artificial intelligence. Both into the storage solution itself, of what the storage does, how you do it, and how it plays with the data, but also if you're gonna do AI on a broad scale. For example, we're working with a customer right now and their AI configuration is 100 petabytes, leveraging our storage underneath the hood of that big, giant AI analytics workload. So that's why you have to think of AI both in the storage, to make the storage better and more productive with the data and the information that it has, but then also as the undercurrent for any AI solution that anyone wants to employ, big, medium or small. >> So Bina, I want to pick up on that because there are some advanced technologies being exploited within storage right now, to achieve what Eric's talking about, but there's gonna be a lot more. And there's gonna be more intensive application utilization of some of those technologies.
What are some of the technologies that are becoming increasingly important, from a storage standpoint, that people have to think about as they try to achieve their digital transformation objectives. >> That's right, I mean Peter, in addition to some of the basics around making sure your infrastructure is enabled to handle the SLAs and the level of performance that's required by these AI workloads, when you think about what Eric said, this data's gonna reside, it's gonna reside on-premise, it's gonna be behind a firewall, potentially in the cloud, or multiple public clouds. How do you manage that data? How do you get visibility to that data? And then be able to leverage that data for your analytics. And so data management is going to be very important but also, being able to understand what that data contains and be able to run the analytics and be able to do things like tagging the metadata and then doing some specialized analytics around that is going to be very important. The fabric to move that data, data portability from on-prem into the cloud, and back and forth, bidirectionally, is gonna be very important as you look into the future. >> And obviously things like IOT's gonna mean bigger, more, more available. So a lot of technologies, in a big picture, are gonna become more closely associated with storage. I like to say that, at some point in time we've gotta stop thinking about calling stuff storage because it's gonna be so central to the fabric of how data works within a business. But Eric, I want to come back to you and say, those are some of the big picture technologies but what are some of the little picture technologies? That none-the-less are really central to being able to build up this vision over the course of the next few years? >> Well a couple of things. 
One is the move to NVMe, so we've integrated NVMe into our FlashSystem 9100. We have fabric support; we already announced back in February, actually, fabric support for NVMe over an InfiniBand infrastructure with our FlashSystem 900, and we're extending that to all of the other interconnects from a fabric perspective for NVMe, whether that be Ethernet or whether that be Fibre Channel, and we put NVMe in the system. We also have integrated our custom flash modules: our FlashCore technology allows us to take raw flash and create, if you will, a custom SSD. Why does that matter? We can get better resiliency, we can get incredibly better performance, which is very tied in to your applications, workloads, and use cases, especially in a data-driven multi-cloud environment. It's critical that the flash is incredibly fast, and it really matters. And resilient, what do you do? You try to move it to the cloud and you lose your data. So if you don't have that resiliency and availability, that's a big issue. I think the third thing is what I call the cloud-ification of software. All of IBM's storage software is cloud-ified. We can move things simultaneously into the cloud. It's all automated. We can move data around all over the place. Not only our data, not only to our boxes, we could actually move other people's arrays' data around for them, and we can do it with our storage software. So it's really critical to have this cloud-ification. It's really cool to have this new technology, NVMe, from an end-to-end perspective for fabric and then inside the system, to get the right resiliency, the right availability, the right performance for your applications, workloads and use cases, and you've gotta make sure that everything is cloud-ified, portable, and mobile, and we've done that with the solutions that are wrapped into our FlashSystem 9100 that we launched a couple of weeks ago. >> So you are both thought leaders in the storage industry.
I think that's very clear, in the whole notion of storage technology, and you work with a lot of customers, you see a lot of use cases. So I want to ask you one quick question, to close here. And that is, if there was one thing that you would tell a storage leader, a CIO or someone who thinks about storage in a broad way, one mindset change that they have to make to start this journey and get it going so that it's gonna be successful, what would that one mindset change be? Bina, what do you think? >> You know, I think it's really around, there's a lot of capabilities out there. It's really around simplifying your environment and making sure that, as you're deploying these new solutions or new capabilities, you've really got a partnership with a vendor that's gonna help you make it easier. Take those complex tasks, make them easier, deliver those step-by-step instructions and documentation, and be right there when you need their assistance. So I think that's gonna be really important. >> So look at it from a portfolio perspective, where best of breed is still important, but it's gotta work together because it leverages itself. >> It's gotta work together, absolutely. >> Eric, what would you say? >> Well I think the key thing is, people think storage is storage. All storage is not the same, and one of the central tenets at IBM Storage is to make sure that we're integrated with the cloud. We can move data around transparently, easily, simply, Bina pointed out the simplicity. If you can't support the cloud, then you're really just a storage box, and that's not what IBM does. Over 40% of what we sell is actually storage software, and all that software works with all of our competitors' gear. And in fact our Spectrum Virtualize for Public Cloud, for example, can simultaneously have datasets sitting in a cloud instantiation and sitting on premises, and then we can use our copy data management to take advantage of that secondary copy.
That's all because we're so cloud-ified from a software perspective. So all storage is not the same, and you can't think of storage as "I need the cheapest storage." It's gotta be, how does it drive business value for my oceans of data? That's what matters most, and by the way, we're very cost-effective anyway, especially because of our custom flash modules, which allow us to have a real price advantage. >> You ain't doing business at a level of 100 petabytes if you're not cost effective. >> Right, so those are the things that we see as really critical: storage is not storage. Storage is about data and information. >> So let me summarize your point then, if I can, really quickly. In other words, we have to think about storage as the first step to great data management. >> Absolutely, absolutely Peter. >> Eric, Bina, great conversation. >> Thank you. >> So we've heard a lot of great thought leadership comments on the data-driven journey with multi-cloud and some great product announcements. But now, let's do the crowd chat. This is your opportunity to participate in these proceedings. It's the centerpiece of the digital community event. What questions do you have? What comments do you have? What answers might you provide to your peers? This is an opportunity for all of us collectively to engage and have those crucial conversations that are gonna allow you to, from a storage perspective, drive business value in your digital business transformations. So, let's get straight to the crowd chat. (bright music)

Published Date : Jul 25 2018



Action Item Quick Take | David Floyer | Flash and SSD, April 2018


 

>> Hi, I'm Peter Burris with another Wikibon Action Item Quick Take. David Floyer, you've been at the vanguard of talking about the role that flash, SSDs, and other technologies are going to have in the technology industry, predicting early on that flash was going to eclipse HDD, even though you got a lot of blowback along the lines of "SSDs are going to remain expensive and small." That's changed. What's going on? >> Well, I've got a prediction that we'll have petabyte drives, SSD drives, within five years. Let me tell you a little bit why. So there's this new type of SSD that's coming into town. It's the mega SSD, and Nimbus Data has just announced one. It's a hundred terabyte drive. It's very high density, obviously. It has fewer IOPS and less bandwidth than SSD. The access density is much better than HDD, but still obviously lower than high-performance SSD. Much, much lower space and power than either SSD or HDD in terms of environmentals. It's three and a half inch. That's compatible with HDD; it's obviously looking to go into the same slots. A hundred terabytes today, two hundred terabytes next, 10x that of the HAMR drives that are coming in from HDDs in 2019, 2020, and the delta will increase over time. It's still more expensive than HDD per bit, and it's not a direct replacement, but it has a much greater ability to integrate with data services and other things like that. So the prediction, then, is get ready for mega SSDs. They're going to carve out a space at the low end of SSDs and into the HDDs, and we're going to have one petabyte, or more, drives within five years. >> Big stuff from small things. David Floyer, thank you very much. And, once again, this has been a Wikibon Action Item Quick Take. (chill techno music)
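Floyer's petabyte-within-five-years prediction is easy to sanity-check with compound growth. The 18-month doubling period used here is an assumption for illustration, not a figure from the segment.

```python
def capacity_after(start_tb: float, years: float, doubling_years: float = 1.5) -> float:
    """Project drive capacity assuming it doubles every `doubling_years` years."""
    return start_tb * 2 ** (years / doubling_years)

# Starting from today's 100 TB mega SSD:
projected = capacity_after(100, 5)
print(round(projected))  # roughly 1,000 TB, i.e. about one petabyte
```

Under that assumed doubling rate, a 100 TB drive crosses the petabyte mark in about five years, which is consistent with the prediction.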

Published Date : Apr 6 2018



Data Science for All: It's a Whole New Game


 

>> There's a movement that's sweeping across businesses everywhere here in this country and around the world. And it's all about data. Today businesses are being inundated with data. To the tune of over two and a half million gigabytes that'll be generated in the next 60 seconds alone. What do you do with all that data? To extract insights you typically turn to a data scientist. But not necessarily anymore. At least not exclusively. Today the ability to extract value from data is becoming a shared mission. A team effort that spans the organization, extending far more widely than ever before. Today, data science is being democratized. >> Data Sciences for All: It's a Whole New Game. >> Welcome everyone, I'm Katie Linendoll. I'm a technology expert and writer, and I love reporting on all things tech. My fascination with tech started very young. I began coding when I was 12, received my networking certs by 18, and earned a degree in IT and new media from Rochester Institute of Technology. So as you can tell, technology has always been a true passion of mine. Having grown up in the digital age, I love having a career that keeps me at the forefront of science and technology innovations. I spend equal time in the field being hands-on as I do on my laptop conducting in-depth research. Whether I'm diving underwater with NASA astronauts, witnessing the new ways in which mobile technology can help rebuild the Philippines' economy in the wake of super typhoons, or sharing a first look at the newest iPhones on The Today Show yesterday, I'm always on the hunt for the latest and greatest tech stories. And that's what brought me here. I'll be your host for the next hour as we explore the new phenomenon that is taking businesses around the world by storm, as data science continues to become democratized and extend beyond the domain of the data scientist, and as we ask why there's also a mandate for all of us to become data literate now that data science for all drives our AI culture.
And we're going to be able to take to the streets and go behind the scenes as we uncover the factors that are fueling this phenomenon and giving rise to a movement that is reshaping how businesses leverage data. And putting organizations on the road to AI. So coming up, I'll be doing interviews with data scientists. We'll see real world demos and take a look at how IBM is changing the game with an open data science platform. We'll also be joined by legendary statistician Nate Silver, founder and editor-in-chief of FiveThirtyEight. Who will shed light on how a data driven mindset is changing everything from business to our culture. We also have a few people who are joining us in our studio, so thank you guys for joining us. Come on, I can do better than that, right? Live studio audience, the fun stuff. And for all of you during the program, I want to remind you to join that conversation on social media using the hashtag DSforAll, it's data science for all. Share your thoughts on what data science and AI means to you and your business. And, let's dive into a whole new game of data science. Now I'd like to welcome my co-host General Manager IBM Analytics, Rob Thomas. >> Hello, Katie. >> Come on guys. >> Yeah, seriously. >> No one's allowed to be quiet during this show, okay? >> Right. >> Or, I'll start calling people out. So Rob, thank you so much. I think you know this conversation, we're calling it a data explosion happening right now. And it's nothing new. And when you and I chatted about it. You've been talking about this for years. You have to ask, is this old news at this point? >> Yeah, I mean, well first of all, the data explosion is not coming, it's here. And everybody's in the middle of it right now. What is different is the economics have changed. And the scale and complexity of the data that organizations are having to deal with has changed. And to this day, 80% of the data in the world still sits behind corporate firewalls. So, that's becoming a problem. 
It's becoming unmanageable. IT struggles to manage it. The business can't get everything they need. Consumers can't consume it when they want. So we have a challenge here. >> It's challenging in this world of the unmanageable. Crazy complexity. If I'm sitting here as an IT manager of my business, I'm probably thinking to myself, this is incredibly frustrating. How in the world am I going to get control of all this data? And it's probably not just me thinking it. Many individuals here as well. >> Yeah, indeed. Everybody's thinking about how am I going to put data to work in my organization in a way I haven't done before. Look, you've got to have the right expertise, the right tools. The other thing that's happening in the market right now is clients are dealing with multi-cloud environments. So data behind the firewall in private cloud, multiple public clouds. And they have to find a way. How am I going to pull meaning out of this data? And that brings us to data science and AI. That's how you get there. >> I understand the data science part, but I think we're all starting to hear more about AI. And it's incredible how much this buzzword has taken off. How do businesses adapt to this AI growth, boom, and trend that's happening in the world right now? >> Well, let me define it this way. Data science is a discipline. And machine learning is one technique. And then AI puts machine learning into practice and applies it to the business. So this is really about getting your business where it needs to go. And to get to an AI future, you have to lay a data foundation today. I love the phrase, "there's no AI without IA." That means you're not going to get to AI unless you have the right information architecture to start with. >> Can you elaborate in terms of how businesses can really adopt AI and get started? >> Look, I think there's four things you have to do if you're serious about AI. One is you need a strategy for data acquisition.
Two is you need a modern data architecture. Three is you need pervasive automation. And four is you got to expand job roles in the organization. >> Data acquisition. First pillar in this you just discussed. Can we start there and explain why it's so critical in this process? >> Yeah, so let's think about how data acquisition has evolved through the years. 15 years ago, data acquisition was about how do I get data in and out of my ERP system? And that was pretty much solved. Then the mobile revolution happens. And suddenly you've got structured and non-structured data. More than you've ever dealt with. And now you get to where we are today. You're talking terabytes, petabytes of data. >> [Katie] Yottabytes, I heard that word the other day. >> I heard that too. >> Didn't even know what it meant. >> You know how many zeros that is? >> I thought we were in Star Wars. >> Yeah, I think it's a lot of zeroes. >> Yodabytes, it's new. >> So, it's becoming more and more complex in terms of how you acquire data. So that's the new data landscape that every client is dealing with. And if you don't have a strategy for how you acquire that and manage it, you're not going to get to that AI future. >> So a natural segue, if you are one of these businesses, how do you build for the data landscape? >> Yeah, so the question I always hear from customers is we need to evolve our data architecture to be ready for AI. And the way I think about that is it's really about moving from static data repositories to more of a fluid data layer. >> And we continue with the architecture. New data architecture is an interesting buzz word to hear. But it's also one of the four pillars. So if you could dive in there. >> Yeah, I mean it's a new twist on what I would call some core data science concepts. For example, you have to leverage tools with a modern, centralized data warehouse. But your data warehouse can't be stagnant to just what's right there. 
So you need a way to federate data across different environments. You need to be able to bring your analytics to the data because it's most efficient that way. And ultimately, it's about building an optimized data platform that is designed for data science and AI. Which means it has to be a lot more flexible than what clients have had in the past. >> All right. So we've laid out what you need for driving automation. But where does the machine learning kick in? >> Machine learning is what gives you the ability to automate tasks. And the way I think about machine learning, it's about predicting and automating. And this will really change the roles of data professionals and IT professionals. For example, a data scientist cannot possibly know every algorithm or every model that they could use. So we can automate the process of algorithm selection. Another example is things like automated data matching. Or metadata creation. Some of these things may not be exciting but they're hugely practical. And so when you think about the real use cases that are driving return on investment today, it's things like that. It's automating the mundane tasks. >> Let's go ahead and come back to something that you mentioned earlier because it's fascinating to be talking about this AI journey, but also significant is the new job roles. And what are those other participants in the analytics pipeline? >> Yeah I think we're just at the start of this idea of new job roles. We have data scientists. We have data engineers. Now you see machine learning engineers. Application developers. What's really happening is that data scientists are no longer allowed to work in their own silo. And so the new job roles are about how everybody has data first in their mind. And then they're using tools to automate data science, to automate building machine learning into applications. So roles are going to change dramatically in organizations.
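The automated algorithm selection Rob mentions can be sketched in a few lines. This is a minimal, illustrative version using scikit-learn, not the mechanism inside any IBM product: cross-validate a handful of candidate models and keep the best scorer. The dataset, the candidate list, and the function name are all assumptions for the example.

```python
# A minimal sketch of automated algorithm selection: cross-validate several
# candidate models on the same data and keep the best average scorer.
# (Illustrative only -- not the actual selection logic in any product.)
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def select_best_model(X, y, candidates):
    """Score each candidate with 5-fold cross-validation; return the winner."""
    scores = {name: cross_val_score(model, X, y, cv=5).mean()
              for name, model in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Synthetic classification problem standing in for real business data.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
best, scores = select_best_model(X, y, candidates)
print(best, round(scores[best], 3))
```

The point of the sketch is the loop, not the specific models: the data scientist supplies the candidates, and the mundane part, fitting and comparing them, is automated.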
>> I think that's confusing though because we have several organizations asking: is this a highly specialized role, just for data scientists? Or is it applicable to everybody across the board? >> Yeah, and that's the big question, right? Cause everybody's thinking how will this apply? Do I want this to be just a small set of people in the organization that will do this? But, our view is data science has to be for everybody. It's about bringing data science to everybody as a shared mission across the organization. Everybody in the company has to be data literate. And participate in this journey. >> So overall, group effort, has to be a common goal, and we all need to be data literate across the board. >> Absolutely. >> Done deal. But at the end of the day, it's kind of not an easy task. >> It's not. It's not easy but it's maybe not as big of a shift as you would think. Because you have to put data in the hands of people that can do something with it. So, it's very basic. Give access to data. Data's often locked up in a lot of organizations today. Give people the right tools. Embrace the idea of choice or diversity in terms of those tools. That gets you started on this path. >> It's interesting to hear you say essentially you need to train everyone though across the board when it comes to data literacy. And I think people that are coming into the workforce don't necessarily have a background or a degree in data science. So how do you manage? >> Yeah, so in many cases that's true. I will tell you some universities are doing amazing work here. One example: University of California, Berkeley. They offer a course for all majors. So no matter what you're majoring in, you have a course on foundations of data science. How do you bring data science to every role? So it's starting to happen. We at IBM provide data science courses through CognitiveClass.ai. It's for everybody. It's free.
And look, if you want to get your hands on code and just dive right in, you go to datascience.ibm.com. The key point is this though. It's more about attitude than it is aptitude. I think anybody can figure this out. But it's about the attitude to say we're putting data first and we're going to figure out how to make this real in our organization. >> I also have to give a shout out to my alma mater because I have heard that there is an offering in MS in data analytics. And they are always on the forefront of new technologies and new majors and on trend. And I've heard that the placement behind those jobs, people graduating with the MS is high. >> I'm sure it's very high. >> So go Tigers. All right, tangential. Let me get back to something else you touched on earlier because you mentioned that a number of customers ask you how in the world do I get started with AI? It's an overwhelming question. Where do you even begin? What do you tell them? >> Yeah, well things are moving really fast. But the good thing is most organizations I see, they're already on the path, even if they don't know it. They might have a BI practice in place. They've got data warehouses. They've got data lakes. Let me give you an example. AMC Networks. They produce a lot of the shows that I'm sure you watch Katie. >> [Katie] Yes, Breaking Bad, Walking Dead, any fans? >> [Rob] Yeah, we've got a few. >> [Katie] Well you taught me something I didn't even know. Because it's amazing how we have all these different industries, but yet media in itself is impacted too. And this is a good example. >> Absolutely. So, AMC Networks, think about it. They've got ads to place. They want to track viewer behavior. What do people like? What do they dislike? So they have to optimize every aspect of their business from marketing campaigns to promotions to scheduling to ads. 
And their goal was to transform data into business insights and really take the burden off of their IT team, which was heavily burdened by a huge increase in data. So their VP of BI took the approach of using machine learning to process large volumes of data. They used a platform that was designed for AI and data processing. It's the IBM analytics system: a data warehouse with data science tools built in. It has in-memory data processing. And just like that, they were ready for AI. And they're already seeing that impact in their business. >> Do you think a movement of that nature kind of presses other media conglomerates and organizations to say we need to be doing this too? >> I think it's inevitable that everybody, you're either going to be leading, or you'll be playing catch up. And so, as we talk to clients we think about how do you start down this path now, even if you have to iterate over time? Because otherwise you're going to wake up and you're going to be behind. >> One thing worth noting is we've talked about bringing analytics to the data. It's analytics to the data, not the other way around. >> Right. So, look. We as a practice, we say you want to bring analytics to where the data sits. Because it's a lot more efficient that way. It gets you better outcomes in terms of how you train models, and it's more efficient. Other organizations will say, "Hey move the data around." And everything becomes a big data movement exercise. But once an organization has started down this path, they're starting to get predictions, they want to do it where it's really easy. And that means analytics applied right where the data sits. >> And worth talking about the role of the data scientist in all of this. It's been called the hot job of the decade. And Harvard Business Review even dubbed it the sexiest job of the 21st century. >> Yes. >> I want to see this on the cover of Vogue.
Like I want to see the first data scientist. Female preferred, on the cover of Vogue. That would be amazing. >> Perhaps you can. >> People agree. So what changes for them? We talk about data science for all, but is it really data science for everyone? And how does it change everything? >> Well, I think of it this way. AI gives software superpowers. It really does. It changes the nature of software. And at the center of that is data scientists. So, a data scientist has a set of powers that they've never had before in any organization. And that's why it's a hot profession. Now, on one hand, this has been around for a while. We've had actuaries. We've had statisticians that have really transformed industries. But there are a few things that are new now. We have new tools. New languages. Broader recognition of this need. And while it's important to recognize this critical skill set, you can't just limit it to a few people. This is about scaling it across the organization. And truly making it accessible to all. >> So then do we need more data scientists? Or is this something you train, like you said, across the board? >> Well, I think you want to do a little bit of both. We want more. But, we can also train more and make the ones we have more productive. The way I think about it is there are kind of two markets here. And we call it clickers and coders. >> [Katie] I like that. That's good. >> So, let's talk about what that means. So clickers are basically somebody that wants to use tools. Create models visually. It's drag and drop. Something that's very intuitive. Those are the clickers. Nothing wrong with that. It's been valuable for years. There's a new crop of data scientists. They want to code. They want to build with the latest open source tools. They want to write in Python or R. These are the coders. And both approaches are viable. Both approaches are critical.
Organizations have to have a way to meet the needs of both of those types. And there's not a lot of things available today that do that. >> Well let's keep going on that. Because I hear you talking about the data scientists role and how it's critical to success, but with the new tools, data science and analytics skills can extend beyond the domain of just the data scientist. >> That's right. So look, we're unifying coders and clickers into a single platform, which we call IBM Data Science Experience. And as the demand for data science expertise grows, so does the need for these kind of tools. To bring them into the same environment. And my view is if you have the right platform, it enables the organization to collaborate. And suddenly you've changed the nature of data science from an individual sport to a team sport. >> So as somebody that, my background is in IT, the question is really is this an additional piece of what IT needs to do in 2017 and beyond? Or is it just another line item to the budget? >> So I'm afraid that some people might view it that way. As just another line item. But, I would challenge that and say data science is going to reinvent IT. It's going to change the nature of IT. And every organization needs to think about what are the skills that are critical? How do we engage a broader team to do this? Because once they get there, this is the chance to reinvent how they're performing IT. >> [Katie] Challenging or not? >> Look it's all a big challenge. Think about everything IT organizations have been through. Some of them were late to things like mobile, but then they caught up. Some were late to cloud, but then they caught up. I would just urge people, don't be late to data science. Use this as your chance to reinvent IT. Start with this notion of clickers and coders. This is a seminal moment. Much like mobile and cloud was. So don't be late. >> And I think it's critical because it could be so costly to wait. 
And Rob and I were even chatting earlier about how data analytics is moving into all different kinds of industries. And I can tell you I've even personally been affected by how important the analysis is, working in pediatric cancer for the last seven years. I personally deploy virtual reality headsets in pediatric cancer hospitals across the country. And it's great. And it's working phenomenally. And the kids are amazed. And the staff is amazed. But phase two of this project is putting little sensors in the hardware that gather breathing and heart rate data, to show that we have data. Proof that we can hand over to the hospitals to continue making this program a success. So just in-- >> That's a great example. >> An interesting example. >> Saving lives? >> Yes. >> That's also applying a lot of what we talked about. >> Exciting stuff in the world of data science. >> Yes. Look, I'd just add this is an existential moment for every organization. Because what you do in this area is probably going to define how competitive you are going forward. And think about if you don't do something. What if one of your competitors goes and creates an application that's more engaging with clients? So my recommendation is start small. Experiment. Learn. Iterate on projects. Define the business outcomes. Then scale up. It's very doable. But you've got to take the first step. >> First step always critical. And now we're going to get to the fun, hands-on part of our story. Because in just a moment we're going to take a closer look at what data science can deliver. And where organizations are trying to get to. All right. Thank you Rob and now we've been joined by Siva Anne who is going to help us navigate this demo. First, welcome Siva. Give him a big round of applause. Yeah. All right, Rob break down what we're going to be looking at. You take over this demo. >> All right. So this is going to be pretty interesting. So Siva is going to take us through.
So he's going to play the role of a financial adviser. Who wants to help better serve clients through recommendations. And I'm going to really illustrate three things. One is how do you federate data from multiple data sources? Inside the firewall, outside the firewall. How do you apply machine learning to predict and to automate? And then how do you move analytics closer to your data? So, what you're seeing here is a custom application for an investment firm. So, Siva, our financial adviser, welcome. So you can see at the top, we've got market data. We pulled that from an external source. And then we've got Siva's calendar in the middle. He's got clients on the right side. So page down, what else do you see down there Siva? >> [Siva] I can see the recent market news. And in here I can see that JP Morgan is calling for a US dollar rebound in the second half of the year. And, I have upcoming meeting with Leo Rakes. I can get-- >> [Rob] So let's go in there. Why don't you click on Leo Rakes. So, you're sitting at your desk, you're deciding how you're going to spend the day. You know you have a meeting with Leo. So you click on it. You immediately see, all right, so what do we know about him? We've got data governance implemented. So we know his age, we know his degree. We can see he's not that aggressive of a trader. Only six trades in the last few years. But then where it gets interesting is you go to the bottom. You start to see predicted industry affinity. Where did that come from? How do we have that? >> [Siva] So these green lines and red arrows here indicate the trending affinity of Leo Rakes for particular industry stocks. What we've done here is we've built machine learning models using customer's demographic data, his stock portfolios, and browsing behavior to build a model which can predict his affinity for a particular industry. >> [Rob] Interesting. So, I like to think of this, we call it celebrity experiences. 
So how do you treat every customer like they're a celebrity? So to some extent, we're reading his mind. Because without asking him, we know that he's going to have an affinity for auto stocks. So we go down. Now we look at his portfolio. You can see okay, he's got some different holdings. He's got Amazon, Google, Apple, and then he's got RACE, which is the ticker for Ferrari. You can see that's done incredibly well. And so, as a financial adviser, you look at this and you say, all right, we know he loves auto stocks. Ferrari's done very well. Let's create a hedge. Like what kind of security would interest him as a hedge against his position for Ferrari? Could we go figure that out? >> [Siva] Yes. Given I know that he's got an affinity for auto stocks, and I also see that Ferrari has got some tremendous gains, I want to lock in these gains by hedging. And I want to do that by picking an auto stock which has a negative correlation with Ferrari. >> [Rob] So this is where we get to the idea of in-database analytics. Cause you start clicking that and immediately we're getting instant answers of what's happening. So what did we find here? We're going to compare Ferrari and Honda. >> [Siva] I'm going to compare Ferrari with Honda. And what I see here instantly is that Honda has got a negative correlation with Ferrari, which makes it a perfect mix for his stock portfolio. Given he has an affinity for auto stocks and it correlates negatively with Ferrari. >> [Rob] These are very powerful tools in the hands of a financial adviser. You think about it. As a financial adviser, you wouldn't think about federating data, machine learning, pretty powerful. >> [Siva] Yes. So what we have seen here is that using the common SQL engine, we've been able to federate queries across multiple data sources. Db2 Warehouse in the cloud, IBM's Integrated Analytics System, and a Hortonworks-powered Hadoop platform for the new speeds.
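The hedge screen Siva walks through can be sketched in a few lines of pandas. This is an illustrative toy, not the in-database engine from the demo: the daily returns are synthetic, the HMC (Honda) series is deliberately constructed to move against the market factor, and `best_hedge` is a name invented for the example.

```python
# Toy version of the demo's hedge screen: compute pairwise correlations of
# daily returns and pick the candidate most negatively correlated with the
# held stock. Prices are synthetic; tickers are illustrative.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 250  # roughly one year of trading days
market = rng.normal(0, 0.01, n)  # common market factor
returns = pd.DataFrame({
    "RACE": market + rng.normal(0.001, 0.02, n),   # the held position (Ferrari)
    "HMC": -market + rng.normal(0, 0.02, n),       # built to move against it
    "GOOGL": market + rng.normal(0, 0.02, n),
    "AAPL": market + rng.normal(0, 0.02, n),
})

def best_hedge(returns, held):
    """Return the ticker with the lowest correlation to the held stock."""
    corr = returns.corr()[held].drop(held)
    return corr.idxmin(), corr

hedge, corr = best_hedge(returns, "RACE")
print(hedge, round(corr[hedge], 2))
```

In the demo the same idea runs in-database at scale, across millions of correlations; the logic, though, is just "find the most negative entry in a correlation matrix."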
We've been able to use machine learning to derive innovative insights about his stock affinities. And drive the machine learning into the appliance. Closer to where the data resides to deliver high-performance analytics. >> [Rob] At scale? >> [Siva] We're able to run millions of these correlations across stocks, currency, other factors. And even score hundreds of customers for their affinities on a daily basis. >> That's great. Siva, thank you for playing the role of financial adviser. So I just want to recap briefly. Cause this is really powerful technology that's really simple. So we federated, we aggregated multiple data sources from all over the web and internal systems. And public cloud systems. Machine learning models were built that predicted Leo's affinity for a certain industry. In this case, automotive. And then you see when you deploy analytics next to your data, even a financial adviser, just with the click of a button is getting instant answers so they can go be more productive in their next meeting. This whole idea of celebrity experiences for your customer, that's available for everybody, if you take advantage of these types of capabilities. Katie, I'll hand it back to you. >> Good stuff. Thank you Rob. Thank you Siva. Powerful demonstration on what we've been talking about all afternoon. And thank you again to Siva for helping us navigate. Should we give him one more round of applause? We're going to be back in just a moment to look at how we operationalize all of this data. But first, here's a message from me. If you're a part of a line of business, your main fear is disruption. You know data is the new gold that can create huge amounts of value. So does your competition. And they may be beating you to it. You're convinced there are new business models and revenue sources hidden in all the data. You just need to figure out how to leverage it. But with the scarcity of data scientists, you really can't rely solely on them.
You may need more people throughout the organization that have the ability to extract value from data. And as a data science leader or data scientist, you have a lot of the same concerns. You spend way too much time looking for, prepping, and interpreting data and waiting for models to train. You know you need to operationalize the work you do to provide business value faster. What you want is an easier way to do data prep. And rapidly build models that can be easily deployed, monitored, and automatically updated. So whether you're a data scientist, data science leader, or in a line of business, what's the solution? What'll it take to transform the way you work? That's what we're going to explore next. All right, now it's time to delve deeper into the nuts and bolts. The nitty-gritty of operationalizing data science and creating a data driven culture. How do you actually do that? Well that's what these experts are here to share with us. I'm joined by Nir Kaldero, who's head of data science at Galvanize, which is an education and training organization. Tricia Wang, who is co-founder of Sudden Compass, a consultancy that helps companies understand people with data. And last, but certainly not least, Michael Li, founder and CEO of Data Incubator, which is a data science training company. All right guys. Shall we get right to it? >> All right. >> So data explosion happening right now. And we are seeing it across the board. I just shared an example of how it's impacting my philanthropic work in pediatric cancer. But you guys each have so many unique roles in your business life. How are you seeing it just blow up in your fields? Nir, your thoughts? >> Yeah, for example at Galvanize we train many Fortune 500 companies. And just the demand from companies that want us to help them go through this digital transformation is mind-blowing. That's a data point by itself. >> Okay.
Well, what we're seeing is that data science, as a theme, is actually for everyone now. What's happening is that it's meeting non-technical people. But when non-technical people are implementing these tools, or coming at these tools without a baseline of data literacy, they're oftentimes using them in ways that distance themselves from the customer. Because they're implementing data science tools without a clear purpose, without a clear problem. And so what we do at Sudden Compass is that we work with companies to help them embrace and understand the complexity of their customers. Because oftentimes they are misusing data science to try and flatten their understanding of the customer. As if you can just do more traditional marketing. Where you're putting people into boxes. And I think the whole ROI of data is that you can now understand people's relationships at a much more complex level and at a greater scale than before. But we have to do this with basic data literacy. And this has to involve technical and non-technical people. >> Well, you can have all the data in the world, and I think it speaks to this: if you're not doing the right things with it, forget it. It means nothing. >> No absolutely. I mean, I think that when you look at the huge explosion in data, with it comes a huge explosion in data experts. Right, we call them data scientists, data analysts. And sometimes they're people who are very, very talented, like the people here. But sometimes you have people who are maybe rebranding themselves, right? Trying to move up their title one notch to try to attract that higher salary. And I think that that's one of the things that customers are coming to us for, right? They're saying, hey look, there are a lot of people that call themselves data scientists, but we can't really distinguish.
So, we run a fellowship where we help companies hire from a really talented group of folks, who are truly data scientists and who know all those kinds of really important data science tools. And we also help companies internally. Fortune 500 companies who are looking to grow that data science practice that they have. And we help clients like McKinsey, BCG, Bain, train up their customers, their clients, and their workers to be more data talented. And to build up those data science capabilities. >> And Nir, this is something you work with a lot. A lot of Fortune 500 companies. And when we were speaking earlier, you were saying many of these companies can be in a panic. >> Yeah. >> Explain that. >> Yeah, so you know, not all Fortune 500 companies are fully data driven. And we know that the winners in this fourth industrial revolution, which I like to call the machine intelligence revolution, will be companies who navigate and transform their organization to unlock the power of data science and machine learning. And the companies that are not like that, or don't utilize data science and predictive power well, will pretty much get shredded. So they are in a panic. >> Tricia, companies have to deal with data behind the firewall and in the new multi-cloud world. How do organizations start to become data driven right to the core? >> I think the most urgent question to become data driven that companies should be asking is how do I bring the complex reality that our customers are experiencing on the ground into a corporate office? Into the data models. So that question is critical because that's how you actually prevent any big data disasters. And that's how you leverage big data. Because when your data models are really far from your human models, that's when you're going to do things that are really far off, and it's not going to feel right. That's when Tesco had their terrible big data disaster that they're still recovering from.
And so that's why I think it's really important to understand that when you implement big data, you have to further embrace thick data. The qualitative, the emotional stuff, that is difficult to quantify. But then comes the difficult art and science that I think is the next level of data science. Which is that getting non technical and technical people together to ask how do we find those unknown nuggets of insights that are difficult to quantify? Then, how do we do the next step of figuring out how do you mathematically scale those insights into a data model? So that actually is reflective of human understanding? And then we can start making decisions at scale. But you have to have that first. >> That's absolutely right. And I think that when we think about what it means to be a data scientist, right? I always think about it in these sort of three pillars. You have the math side. You have to have that kind of stats, hardcore machine learning background. You have the programming side. You don't work with small amounts of data. You work with large amounts of data. You've got to be able to type the code to make those computers run. But then the last part is that human element. You have to understand the domain expertise. You have to understand what it is that I'm actually analyzing. What's the business proposition? And how are the clients, how are the users actually interacting with the system? That human element that you were talking about. And I think having somebody who understands all of those and not just in isolation, but is able to marry that understanding across those different topics, that's what makes a data scientist. >> But I find that we don't have people with those skill sets. And right now the way I see teams being set up inside companies is that they're creating these isolated data unicorns. These data scientists that have graduated from your programs, which are great. But, they don't involve the people who are the domain experts. 
They don't involve the designers, the consumer insight people, the salespeople. The people who spend time with the customers day in and day out. Somehow they're left out of the room. They're consulted, but they're not a stakeholder. >> Can I actually >> Yeah, yeah please. >> Can I actually give a quick example? So for example, we at Galvanize train the executives and the managers. And then the technical people, the data scientists and the analysts. But in order to actually see all of the ROI behind the data, you also have to have a creative, fluid conversation between non-technical and technical people. And this is a major trend now. And there's a major gap. And we need to increase awareness and create a new kind of environment where technical people talk seamlessly with non-technical ones. >> [Tricia] We call-- >> That's one of the things that we see a lot. Is one of the trends in-- >> A major trend. >> data science training is it's not just for the data science technical experts. It's not just for one type of person. So a lot of the training we do is sort of data engineers. People who are more on the software engineering side learning more about the stats and math. And then people who are sort of traditionally on the stat side learning more about the engineering. And then managers and people who are data analysts learning about both. >> Michael, I think you said something that was of interest too because I think we can look at IBM Watson as an example. And working in healthcare. The human component. Because oftentimes we talk about machine learning and AI and data, and you get worried that you still need that human component. Especially in the world of healthcare. And I think that's a very strong point when it comes to the data analysis side. Is there any particular example you can speak to of that?
>> So I think that there was this really excellent paper a while ago talking about all the neural net stuff, trained on textual data. So looking at sort of different corpuses. And they found that these models were highly, highly sexist. They would read these corpuses and it's not because neural nets themselves are sexist. It's because they're reading the things that we write. And it turns out that we write kind of sexist things. And they would sort of find all these patterns in there that were sort of latent, that had a lot of things that maybe we would cringe at if we saw them. And I think that's one of the really important aspects of the human element, right? It's being able to come in and sort of say like, okay, I know what the biases of the system are, I know what the biases of the tools are. I need to figure out how to use that to make the tools, make the world a better place. And like another area where this comes up all the time is lending, right? So the federal government has said, and we have a lot of clients in the financial services space, so they're constantly under these kinds of rules that say they can't engage in discriminatory lending practices based on a whole set of protected categories. Race, sex, gender, things like that. But, it's very easy when you train a model on credit scores to pick that up. And then to have a model that's inadvertently sexist or racist. And that's where you need the human element to come back in and say okay, look, the classic example would be zip code: you're using zip code as a variable. But when you look at it, zip code is actually highly correlated with race. And you can't do that. So you may, by sort of following the math and being a little naive about the problem, inadvertently introduce something really horrible into a model. And that's where you need a human element to step in and say, okay, hold on. Slow things down. This isn't the right way to go.
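The zip code problem Michael describes can be made concrete with a simple proxy check: before a feature goes into a lending model, measure how strongly each candidate feature tracks a protected attribute. This is a minimal sketch with synthetic data; the column names, the `flag_proxies` helper, and the 0.5 threshold are all assumptions for illustration, and a plain correlation is only the crudest of the fairness checks a real team would run.

```python
# Minimal proxy-variable check: flag any feature whose correlation with a
# protected attribute is suspiciously strong. Data is synthetic; a real
# audit would use richer dependence and disparate-impact measures.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000
protected = rng.integers(0, 2, n)  # protected attribute (e.g. race), 0/1
df = pd.DataFrame({
    "zip_code_segment": protected * 10 + rng.integers(0, 3, n),  # strong proxy
    "credit_score": rng.normal(680, 50, n),                      # independent
    "income": rng.normal(60_000, 15_000, n),                     # independent
})

def flag_proxies(features, protected, threshold=0.5):
    """Return features whose |correlation| with the protected attribute
    exceeds the threshold -- candidates to drop or audit by hand."""
    corr = features.corrwith(pd.Series(protected)).abs()
    return sorted(corr[corr > threshold].index)

print(flag_proxies(df, protected))
```

Here the check flags only the zip-code-derived column, which is exactly the "hold on, slow things down" moment where a human decides the feature cannot be used.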
>> And the people who have -- >> I feel like, I can feel her ready to respond. >> Yes, I'm ready. >> She's like let me have at it. >> And the people, here it is. And the people who are really great at providing that human intelligence are social scientists. We are trained to look for bias and to understand bias in data. Whether it's quantitative or qualitative. And I really think that we're going to have less of these kind of problems if we had more integrated teams. If it was a mandate from leadership to say no data science team should be without a social scientist, ethnographer, or qualitative researcher of some kind, to be able to help see these biases. >> The talent piece is actually the most crucial-- >> Yeah. >> one here. If you look at how to enable machine intelligence in an organization, there are three pillars that I have in my head, which are the culture, the talent and the technology infrastructure. And I believe and I saw in working very closely with the Fortune 100 and 200 companies that the talent piece is actually the most important, the most crucial, and the hardest to get. >> [Tricia] I totally agree. >> It's absolutely true. Yeah, no I mean I think that's sort of like how we came up with our business model. Companies were basically saying hey, I can't hire data scientists. And so we have a fellowship where we get 2,000 applicants each quarter. We take the top 2% and then we sort of train them up. And we work with hiring companies who then want to hire from that population. And so we're sort of helping them solve that problem. And the other half of it is really around training. Cause with a lot of industries, especially if you're sort of in a more regulated industry, there's a lot of nuances to what you're doing. And the fastest way to develop that data science or AI talent may not necessarily be to hire folks who are coming out of a PhD program.
It may be to take folks internally who have a lot of that domain knowledge that you have and get them trained up on those data science techniques. So we've had large insurance companies come to us and say hey look, we hire three or four folks from you a quarter. That doesn't move the needle for us. What we really need is take the thousand actuaries and statisticians that we have and get all of them trained up to become a data scientist and become data literate in this new open source world. >> [Katie] Go ahead. >> All right, ladies first. >> Go ahead. >> Are you sure? >> No please, fight first. >> Go ahead. >> Go ahead Nir. >> So this is actually a trend that we have been seeing in the past year or so that companies kind of like start to look at how to upskill and look for talent within the organization. So they can actually move them to become more literate and navigate 'em from analyst to data scientist. And from data scientist to machine learner. So this is actually a trend that is happening already for a year or so. >> Yeah, but I also find that after they've gone through that training in getting people skilled up in data science, the next problem that I get is executives coming to say we've invested in all of this. We're still not moving the needle. We've already invested in the right tools. We've gotten the right skills. We have enough scale of people who have these skills. Why are we not moving the needle? And what I explain to them is look, you're still making decisions in the same way. And you're still not involving enough of the non technical people. Especially from marketing, which is now, the CMO's are much more responsible for driving growth in their companies now. But often times it's so hard to change the old way of marketing, which is still like very segmentation. You know, demographic variable based, and we're trying to move people to say no, you have to understand the complexity of customers and not put them in boxes.
>> And I think underlying a lot of this discussion is this question of culture, right? >> Yes. >> Absolutely. >> How do you build a data driven culture? And I think that that culture question, one of the ways that comes up quite often in especially in large, Fortune 500 enterprises, is that they are very, they're not very comfortable with, for example, open source architecture. Open source tools. And there is some sort of residual bias that that's somehow dangerous. A security vulnerability. And I think that that's part of the cultural challenge that they often have in terms of how do I build a more data driven organization? Well a lot of the talent really wants to use these kind of tools. And I mean, just to give you an example, we are partnering with one of the major cloud providers to sort of help make open source tools more user friendly on their platform. So trying to help them attract the best technologists to use their platform because they want and they understand the value of having that kind of open source technology work seamlessly on their platforms. So I think that just sort of goes to show you how important open source is in this movement. And how much large companies and Fortune 500 companies and a lot of the ones we work with have to embrace that. >> Yeah, and I'm seeing it in our work. Even when we're working with Fortune 500 companies, is that they've already gone through the first phase of data science work. Where I explain it was all about the tools and getting the right tools and architecture in place. And then companies started moving into getting the right skill set in place. Getting the right talent. And what you're talking about with culture is really where I think we're talking about the third phase of data science, which is looking at communication of these technical frameworks so that we can get non technical people really comfortable in the same room with data scientists.
That is going to be the phase, that's really where I see the pain point. And that's why at Sudden Compass, we're really dedicated to working with each other to figure out how do we solve this problem now? >> And I think that communication between the technical stakeholders and management and leadership. That's a very critical piece of this. You can't have a successful data science organization without that. >> Absolutely. >> And I think that actually some of the most popular trainings we've had recently are from managers and executives who are looking to say, how do I become more data savvy? How do I figure out what is this data science thing and how do I communicate with my data scientists? >> You guys made this way too easy. I was just going to get some popcorn and watch it play out. >> Nir, last 30 seconds. I want to leave you with an opportunity to, anything you want to add to this conversation? >> I think one thing to conclude is to say that for companies that are not data driven, it's about time to hit refresh and figure out how to transition the organization to become data driven. To become agile and nimble so they can actually seize the opportunities from this important industrial revolution. Otherwise, unfortunately they will have a hard time surviving. >> [Katie] All agreed? >> [Tricia] Absolutely, you're right. >> Michael, Trish, Nir, thank you so much. Fascinating discussion. And thank you guys again for joining us. We will be right back with another great demo. Right after this. >> Thank you Katie. >> Once again, thank you for an excellent discussion. Weren't they great guys? And thank you for everyone who's tuning in on the live webcast. As you can hear, we have an amazing studio audience here. And we're going to keep things moving. I'm now joined by Daniel Hernandez and Siva Anne. And we're going to turn our attention to how you can deliver on what they're talking about using the Data Science Experience to do data science faster. >> Thank you Katie.
Siva and I are going to spend the next 10 minutes showing you how you can deliver on what they were saying using the IBM Data Science Experience to do data science faster. We'll demonstrate through new features we introduced this week how teams can work together more effectively across the entire analytics life cycle. How you can take advantage of any and all data no matter where it is and what it is. How you could use your favorite tools from open source. And finally how you could build models anywhere and deploy them close to where your data is. Remember the financial adviser app Rob showed you? To build an app like that, we needed a team of data scientists, developers, data engineers, and IT staff to collaborate. We do this in the Data Science Experience through a concept we call projects. When I create a new project, I can now use the new Github integration feature. We're doing for data science what we've been doing for developers for years. Distributed teams can work together on analytics projects. And take advantage of Github's version management and change management features. This is a huge deal. Let's explore the project we created for the financial adviser app. As you can see, our data engineer Joane, our developer Rob, and others are collaborating on this project. Joane got things started by bringing together the trusted data sources we need to build the app. Taking a closer look at the data, we see that our customer and profile data is stored on our recently announced IBM Integrated Analytics System, which runs safely behind our firewall. We also needed macroeconomic data, which she was able to find from the Federal Reserve. And she stored it in our Db2 Warehouse on Cloud. And finally, she selected stock news data from NASDAQ.com and landed that in a Hadoop cluster, which happens to be powered by Hortonworks.
We added a new feature to the Data Science Experience so that when it's installed with Hortonworks, it automatically uses the native security and governance controls within the cluster so your data is always secure and safe. Now we want to show you the news data we stored in the Hortonworks cluster. This is the main administrative console. It's powered by an open source project called Ambari. And here's the news data. It's in parquet files stored in HDFS, which happens to be a distributed file system. To get the data from NASDAQ into our cluster, we used IBM's BigIntegrate and BigQuality to create automatic data pipelines that acquire, cleanse, and ingest that news data. Once the data's available, we use IBM's Big SQL to query that data using SQL statements that are much like the ones we would use for any relational data, including the data that we have in the Integrated Analytics System and Db2 Warehouse on Cloud. This and the federation capabilities that Big SQL offers dramatically simplify data acquisition. Now we want to show you how we support a brand new tool that we're excited about. Since we launched last summer, the Data Science Experience has supported Jupyter and R for data analysis and visualization. In this week's update, we deeply integrated another great open source project called Apache Zeppelin. It's known for having great visualization support, advanced collaboration features, and is growing in popularity amongst the data science community. This is an example of Apache Zeppelin and the notebook we created through it to explore some of our data. Notice how wonderful and easy the data visualizations are. Now we want to walk you through the Jupyter notebook we created to explore our customer preference for stocks. We use notebooks to understand and explore data. To identify the features that have some predictive power. Ultimately, we're trying to assess what is driving customer stock preference.
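The federation idea Daniel describes, one SQL statement spanning data that lives in different stores, can be sketched in miniature. This is not Big SQL itself: SQLite's `ATTACH` stands in for the warehouse and the Hadoop cluster, and the table names and rows are made up for illustration.

```python
import sqlite3

# Stand-in for the warehouse: customer/profile data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, ticker TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "IBM"), (2, "F")])

# Stand-in for the Hadoop cluster: stock news data.
conn.execute("ATTACH DATABASE ':memory:' AS news")
conn.execute("CREATE TABLE news.articles (ticker TEXT, headline TEXT)")
conn.execute("INSERT INTO news.articles VALUES ('IBM', 'Earnings beat')")

# One query joins data living in two different stores.
rows = conn.execute("""
    SELECT c.id, a.headline
    FROM customers c
    JOIN news.articles a ON c.ticker = a.ticker
""").fetchall()
print(rows)
```

The payoff is the same as in the demo: the analyst writes one join instead of hand-building an export/import pipeline between the two systems.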
Here we did the analysis to identify the attributes of customers that are likely to purchase auto stocks. We used this understanding to build our machine learning model. For building machine learning models, we've always had tools integrated into the Data Science Experience. But sometimes you need to use tools you already invested in. Like our very own SPSS as well as SAS. Through a new import feature, you can easily import those models created with those tools. This helps you avoid vendor lock-in, and simplifies the development, training, deployment, and management of all your models. To build the models we used in the app, we could have coded, but we prefer a visual experience. We used our customer profile data in the Integrated Analytics System, used the Auto Data Preparation to cleanse our data, chose the binary classification algorithms, and let the Data Science Experience evaluate between logistic regression and gradient boosted tree. It's doing the heavy work for us. As you can see here, the Data Science Experience generated performance metrics that show us that the gradient boosted tree is the best performing algorithm for the data we gave it. Once we save this model, it's automatically deployed and available for developers to use. Any application developer can take this endpoint and consume it like they would any other API inside of the apps they built. We've made training and creating machine learning models super simple. But what about the operations? A lot of companies are struggling to ensure their model performance remains high over time. In our financial adviser app, we know that customer data changes constantly, so we need to always monitor model performance and ensure that our models are retrained as necessary. This is a dashboard that shows the performance of our models and lets our teams monitor and retrain those models so that they're always performing to our standards.
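The bake-off between logistic regression and a gradient boosted tree can be sketched with scikit-learn standing in for the platform's automated evaluation. This is a sketch under assumptions: the synthetic dataset is a placeholder for the customer profile data, and the scoring setup here is generic rather than what the Data Science Experience runs internally.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the customer-profile data in the demo.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "gradient_boosted_tree": GradientBoostingClassifier(random_state=0),
}

# Score each candidate identically and keep the better performer,
# mirroring what the platform's evaluation step does automatically.
scores = {name: cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
          for name, model in candidates.items()}
best = max(scores, key=scores.get)
print({k: round(v, 3) for k, v in scores.items()}, "->", best)
```

Which model wins depends entirely on the data, which is the point of letting the tool run the comparison instead of picking an algorithm up front.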
So far we've been showing you the Data Science Experience available behind the firewall that we're using to build and train models. Through a new publish feature, you can build models and deploy them anywhere. In another environment, private, public, or anywhere else with just a few clicks. So here we're publishing our model to the Watson machine learning service. It happens to be in the IBM cloud. And also deeply integrated with our Data Science Experience. After publishing and switching to the Watson machine learning service, you can see that our stock affinity model that we just published is there and ready for use. So this is incredibly important. I just want to say it again. The Data Science Experience allows you to train models behind your own firewall, take advantage of your proprietary and sensitive data, and then deploy those models wherever you want with ease. So to summarize what we just showed you. First, IBM's Data Science Experience supports all teams. You saw how our data engineer populated our project with trusted data sets. Our data scientists developed, trained, and tested a machine learning model. Our developers used APIs to integrate machine learning into their apps. And how IT can use our Integrated Model Management dashboard to monitor and manage model performance. Second, we support all data. On premises, in the cloud, structured, unstructured, inside of your firewall, and outside of it. We help you bring analytics and governance to where your data is. Third, we support all tools. The data science tools that you depend on are readily available and deeply integrated. This includes capabilities from great partners like Hortonworks. And powerful tools like our very own IBM SPSS. And fourth, and finally, we support all deployments. You can build your models anywhere, and deploy them right next to where your data is. Whether that's in the public cloud, private cloud, or even on the world's most reliable transaction platform, IBM z.
So see for yourself. Go to the Data Science Experience website, take us for a spin. And if you happen to be ready right now, our recently created Data Science Elite Team can help you get started and run experiments alongside you with no charge. Thank you very much. >> Thank you very much Daniel. It seems like a great time to get started. And thanks to Siva for taking us through it. Rob and I will be back in just a moment to add some perspective right after this. All right, once again joined by Rob Thomas. And Rob obviously we got a lot of information here. >> Yes, we've covered a lot of ground. >> This is intense. You got to break it down for me cause I think we should zoom out and see the big picture. What can better data science deliver to a business? Why is this so important? I mean we've heard it through and through. >> Yeah, well, I heard it a couple times. But it starts with businesses have to embrace a data driven culture. And it is a change. And we need to make data accessible with the right tools in a collaborative culture because we've got diverse skill sets in every organization. But data driven companies succeed when data science tools are in the hands of everyone. And I think that's a new thought. I think most companies think just get your data scientist some tools, you'll be fine. This is about tools in the hands of everyone. I think the panel did a great job of describing how we get to data science for all. Building a data culture, making it a part of your everyday operations, and the highlights of what Daniel just showed us, that's some pretty cool features for how organizations can get to this, which is you can see IBM's Data Science Experience, how that supports all teams. You saw data analysts, data scientists, application developer, IT staff, all working together. Second, you saw how we support all tools. And your choice of tools. So the most popular data science libraries integrated into one platform.
And we saw some new capabilities that help companies avoid lock-in, where you can import existing models created from specialist tools like SPSS or others. And then deploy them and manage them inside of Data Science Experience. That's pretty interesting. And lastly, you see we continue to build on this best of open tools. Partnering with companies like H2O, Hortonworks, and others. Third, you can see how you use all data no matter where it lives. That's a key challenge every organization's going to face. Private, public, federating all data sources. We announced new integration with the Hortonworks data platform where we deploy machine learning models where your data resides. That's been a key theme. Analytics where the data is. And lastly, supporting all types of deployments. Deploy them in your Hadoop cluster. Deploy them in your Integrated Analytic System. Or deploy them in z, just to name a few. A lot of different options here. But look, don't believe anything I say. Go try it for yourself. Data Science Experience, anybody can use it. Go to datascience.ibm.com and look, if you want to start right now, we just created a team that we call Data Science Elite. These are the best data scientists in the world that will come sit down with you and co-create solutions, models, and prove out a proof of concept. >> Good stuff. Thank you Rob. So you might be asking what does an organization look like that embraces data science for all? And how could it transform your role? I'm going to head back to the office and check it out. Let's start with the perspective of the line of business. What's changed? Well, now you're starting to explore new business models. You've uncovered opportunities for new revenue sources and all that hidden data. And being disrupted is no longer keeping you up at night. As a data science leader, you're beginning to collaborate with a line of business to better understand and translate the objectives into the models that are being built. 
Your data scientists are also starting to collaborate with the less technical team members and analysts who are working closest to the business problem. And as a data scientist, you stop feeling like you're falling behind. Open source tools are keeping you current. You're also starting to operationalize the work that you do. And you get to do more of what you love. Explore data, build models, put your models into production, and create business impact. All in all, it's not a bad scenario. Thanks. All right. We are back and coming up next, oh this is a special time right now. Cause we got a great guest speaker. New York Magazine called him the spreadsheet psychic and number crunching prodigy who went from correctly forecasting baseball games to correctly forecasting presidential elections. He even invented a proprietary algorithm called PECOTA for predicting future performance by baseball players and teams. And his New York Times bestselling book, The Signal and the Noise was named by Amazon.com as the number one best non-fiction book of 2012. He's currently the Editor in Chief of the award winning website, FiveThirtyEight and appears on ESPN as an on air commentator. Big round of applause. My pleasure to welcome Nate Silver. >> Thank you. We met backstage. >> Yes. >> It feels weird to re-shake your hand, but you know, for the audience. >> I had to give the intense firm grip. >> Definitely. >> The ninja grip. So you and I have crossed paths kind of digitally in the past, which is really interesting, is I started my career at ESPN. And I started as a production assistant, then later back on air for sports technology. And I go to you to talk about sports because-- >> Yeah. >> Wow, has ESPN upped their game in terms of understanding the importance of data and analytics. And what it brings. Not just to MLB, but across the board. >> No, it's really infused into the way they present the broadcast. You'll have win probability on the bottom line.
And they'll incorporate FiveThirtyEight metrics into how they cover college football for example. So, ESPN ... Sports is maybe the perfect, if you're a data scientist, like the perfect kind of test case. And the reason being that sports consists of problems that have rules. And have structure. And when problems have rules and structure, then it's a lot easier to work with. So it's a great way to kind of improve your skills as a data scientist. Of course, there are also important real world problems that are more open ended, and those present different types of challenges. But it's such a natural fit. The teams. Think about the teams playing the World Series tonight. The Dodgers and the Astros are both like very data driven, especially Houston. Golden State Warriors, the NBA Champions, extremely data driven. New England Patriots, relative to an NFL team, it's shifted a little bit, the NFL bar is lower. But the Patriots are certainly very analytical in how they make decisions. So, you can't talk about sports without talking about analytics. >> And I was going to save the baseball question for later. Cause we are moments away from game seven. >> Yeah. >> Is everyone else watching game seven? It's been an incredible series. Probably one of the best of all time. >> Yeah, I mean-- >> You have a prediction here? >> You can mention that too. So I don't have a prediction. FiveThirtyEight has the Dodgers with a 60% chance of winning. >> [Katie] LA Fans. >> So you have two teams that are about equal. But the Dodgers pitching staff is in better shape at the moment. The end of a seven game series. And they're at home. >> But the statistics behind the two teams is pretty incredible. >> Yeah. It's like the first World Series in I think 56 years or something where you have two 100 win teams facing one another. There has been a lot of parity in baseball for a lot of years. Not that many offensive overall juggernauts.
But this year, and last year with the Cubs and the Indians too really. But this year, you have really spectacular teams in the World Series. It kind of is a showcase of modern baseball. Lots of home runs. Lots of strikeouts. >> [Katie] Lots of extra innings. >> Lots of extra innings. Good defense. Lots of pitching changes. So if you love the modern baseball game, it's been about the best example that you've had. If you like a little bit more contact, and fewer strikeouts, maybe not so much. But it's been a spectacular and very exciting World Series. >> It's amazing to talk. MLB is huge with analysis. I mean, hands down. But across the board, if you can provide a few examples. Because there's so many teams in front offices putting such an, just a heavy intensity on the analysis side. And where the teams are going. And if you could provide any specific examples of teams that have really blown your mind. Especially over the last year or two. Because every year it gets more exciting if you will. >> I mean, so a big thing in baseball is defensive shifts. So if you watch tonight, you'll probably see a couple of plays where if you're used to watching baseball, a guy makes really solid contact. And there's a fielder there that you don't think should be there. But that's really very data driven where you analyze where does this guy hit the ball. That part's not so hard. But also there's game theory involved. Because you have to adjust for the fact that he knows where you're positioning the defenders. He's trying therefore to make adjustments to his own swing and so that's been a major innovation in how baseball is played. You know, how bullpens are used too. Where teams have realized that actually having a guy, across all sports pretty much, realizing the importance of rest. And of fatigue. And that you can be the best pitcher in the world, but guess what? After four or five innings, you're probably not as good as a guy who has a fresh arm necessarily.
So I mean, it really is like, these are not subtle things anymore. It's not just oh, on base percentage is valuable. It really affects kind of every strategic decision in baseball. The NBA, if you watch an NBA game tonight, see how many three point shots are taken. That's in part because of data. And teams realizing hey, three points is worth more than two, once you're more than about five feet from the basket, the shooting percentage gets really flat. And so it's revolutionary, right? Like teams that will shoot almost half their shots from the three point range nowadays. Larry Bird, who wound up being one of the greatest three point shooters of all time, took only eight three pointers his first year in the NBA. It's quite noticeable if you watch baseball or basketball in particular. >> Not to focus too much on sports. One final question. In terms of Major League Soccer, and now in NFL, we're having the analysis and having wearables where it can now showcase if they wanted to on screen, heart rate and breathing and how much exertion. How much data is too much data? And when does it ruin the sport? >> So, I don't think, I mean, again, it goes sport by sport a little bit. I think in basketball you actually have a more exciting game. I think the game is more open now. You have more three pointers. You have guys getting higher assist totals. But you know, I don't know. I'm not one of those people who thinks look, if you love baseball or basketball, and you go in to work for the Astros, the Yankees or the Knicks, they probably need some help, right? You really have to be passionate about that sport. Because it's all based on what questions am I asking? As I'm a fan or I guess an employee of the team. Or a player watching the game. And there isn't really any substitute I don't think for the insight and intuition that a curious human has to kind of ask the right questions.
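The three-point arithmetic Nate mentions works out in one line of expected value. The shooting percentages below are illustrative round numbers, not actual league averages.

```python
# Expected points per attempt = point value x field-goal percentage.
def expected_points(points: int, fg_pct: float) -> float:
    return points * fg_pct

long_two = expected_points(2, 0.40)  # long two-pointer (illustrative %)
three = expected_points(3, 0.36)     # three-pointer (illustrative %)
print(f"long two: {long_two:.2f} pts/attempt, three: {three:.2f} pts/attempt")
```

Even at a noticeably lower shooting percentage, the three-pointer yields more points per attempt, which is why shot charts have migrated beyond the arc.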
So we can talk at great length about what tools do you then apply when you have those questions, but that still comes from people. I don't think machine learning could help with what questions do I want to ask of the data. It might help you get the answers. >> If you have a mid-fielder in a soccer game though, not exerting, only 80%, and you're seeing that on a screen as a fan, and you're saying could that person get fired at the end of the day? One day, with the data? >> So we found that actually, in soccer in particular, some of the better players are actually more still. So Leo Messi, maybe the best player in the world, doesn't move as much as other soccer players do. And the reason being that A) he kind of knows how to position himself in the first place. B) he realizes that if you make a run, and you're out of position. That's quite fatiguing. And particularly soccer, like basketball, is a sport where it's incredibly fatiguing. And so, sometimes the guys who conserve their energy, that kind of old school mentality, you have to hustle at every moment. That is not helpful to the team if you're hustling on an irrelevant play. And therefore, on a critical play, can't get back on defense, for example. >> Sports, but also data is moving exponentially as we're just speaking about today. Tech, healthcare, every different industry. Is there any particular that's a favorite of yours to cover? And I imagine they're all different as well. >> I mean, I do like sports. We cover a lot of politics too. Which is different. I mean in politics I think people aren't intuitively as data driven as they might be in sports for example. It's impressive to follow the breakthroughs in artificial intelligence. It started out just as kind of playing games and playing chess and poker and Go and things like that. But you really have seen a lot of breakthroughs in the last couple of years. But yeah, it's kind of infused into everything really.
>> You're known for your work in politics though. Especially presidential campaigns. >> Yeah. >> This year, in particular. Was it insanely challenging? What was the most notable thing that came out of any of your predictions? >> I mean, in some ways, looking at the polling was the easiest lens to look at it. So I think there's kind of a myth that last year's result was a big shock and it wasn't really. If you did the modeling in the right way, then you realized that number one, polls have a margin of error. And so when a candidate has a three point lead, that's not particularly safe. Number two, the outcome between different states is correlated. Meaning that it's not that much of a surprise that Clinton lost Wisconsin and Michigan and Pennsylvania and Ohio. You know I'm from Michigan. Have friends from all those states. Kind of the same types of people in those states. Those outcomes are all correlated. So what people thought was a big upset for the polls I think was an example of how data science done carefully and correctly where you understand probabilities, understand correlations. Our model gave Trump a 30% chance of winning. Other models gave him a 1% chance. And so that was interesting in that it showed that number one, that modeling strategies and skill do matter quite a lot. When you have someone saying 30% versus 1%. I mean, that's a very very big spread. And number two, that these aren't like solved problems necessarily. Although again, the problem with elections is that you only have one election every four years. So I can be very confident that I have a better model. Even one year of data doesn't really prove very much. Even five or 10 years doesn't really prove very much. And so, being aware of the limitations to some extent intrinsically in elections when you only get one kind of new training example every four years, there's not really any way around that. There are ways to be more robust to sparse data environments.
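The correlated-states point can be simulated: give every state a shared national polling error plus its own noise, and several "safe" small leads fail together far more often than independent errors would suggest. The state margins and error sizes below are made-up illustrative numbers, not the FiveThirtyEight model.

```python
import random

random.seed(42)

# Hypothetical polling leads (points) in four correlated states.
leads = {"WI": 1.0, "MI": 1.5, "PA": 2.0, "OH": -1.0}

def p_lose_all(trials=100_000, shared_sd=3.0, state_sd=2.0):
    """Fraction of trials in which the candidate loses all four states."""
    sweeps = 0
    for _ in range(trials):
        national_err = random.gauss(0.0, shared_sd)  # shared across states
        if all(lead + national_err + random.gauss(0.0, state_sd) < 0
               for lead in leads.values()):
            sweeps += 1
    return sweeps / trials

print(f"P(lose all four at once) ~= {p_lose_all():.3f}")
```

Set `shared_sd=0` and the same four-state sweep becomes far rarer, because each miss then has to happen independently; the shared error term is what makes the upsets cluster.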
But if you're identifying different types of business problems to solve, figuring out what's a solvable problem where I can add value with data science is a really key part of what you're doing. >> You're such a leader in this space. In data and analysis. It would be interesting to kind of peek behind the curtain, understand how you operate but also how large is your team? How you're putting together information. How quickly you're putting it out. Cause I think in this right now world where everybody wants things instantly-- >> Yeah. >> There's also, you want to be first too in the world of journalism. But you don't want to be inaccurate because that's your credibility. >> We talked about this before, right? I think on average, speed is a little bit overrated in journalism. >> [Katie] I think it's a big problem in journalism. >> Yeah. >> Especially in the tech world. You have to be first. You have to be first. And it's just pumping out, pumping out. And there's got to be more time spent on stories if I can speak subjectively. >> Yeah, for sure. But at the same time, we are reacting to the news. And so we have people that come in, we hire most of our people actually from journalism. >> [Katie] How many people do you have on your team? >> About 35. But, if you get someone who comes in from an academic track for example, they might be surprised at how fast journalism is. That even though we might be slower than the average website, the fact that there's a tragic event in New York, are there things we have to say about that? A candidate drops out of the presidential race, are there things we have to say about that? In periods ranging from minutes to days as opposed to kind of weeks to months to years in the academic world. The corporate world moves faster. What is a little different about journalism is that you are expected to have more precision where people notice when you make a mistake. In corporations, you have maybe less transparency.
If you make 10 investments and seven of them turn out well, then you'll get a lot of profit from that, right? In journalism, it's a little different. If you make kind of 10 predictions or say 10 things, and seven of them are very accurate and three of them aren't, you'll still get criticized a lot for the three. Just because that's kind of the way that journalism is. And so the kind of combination of needing, not having that much tolerance for mistakes, but also needing to be fast. That is tricky. And I criticize other journalists sometimes including for not being data driven enough, but the best excuse any journalist has is, this is happening really fast and it's my job to kind of figure out in real time what's going on and provide useful information to the readers. And that's really difficult. Especially in a world where literally, I'll probably get off the stage and check my phone and who knows what President Trump will have tweeted or what things will have happened. But it really is a kind of 24/7. >> Well because it's 24/7 with FiveThirtyEight, one of the most well known sites for data, are you feeling micromanagey on your people? Because you do have to hit this balance. You can't have something come out four or five days later. >> Yeah, I'm not -- >> Are you overseeing everything? >> I'm not by nature a micromanager. And so you try to hire well. You try and let people make mistakes. And the flip side of this is that if a news organization never had any mistakes, never had any corrections, that's wrong, right? You have to have some tolerance for error because you are trying to decide things in real time. And figure things out. I think transparency's a big part of that. Say here's what we think, and here's why we think it. If we have a model to say it's not just the final number, here's a lot of detail about how that's calculated. In some cases we release the code and the raw data. Sometimes we don't because there's a proprietary advantage. 
But quite often we're saying we want you to trust us and it's so important that you trust us, here's the model. Go play around with it yourself. Here's the data. And that's also I think an important value. >> That speaks to open source. And your perspective on that in general. >> Yeah, I mean, look, I'm a big fan of open source. I worry that I think sometimes the trends are a little bit away from open source. But by the way, one thing that happens when you share your data or you share your thinking at least in lieu of the data, and you can definitely do both is that readers will catch embarrassing mistakes that you made. By the way, even having open sourceness within your team, I mean we have editors and copy editors who often save you from really embarrassing mistakes. And by the way, it's not necessarily people who have a training in data science. I would guess that of our 35 people, maybe only five to 10 have a kind of formal background in what you would call data science. >> [Katie] I think that speaks to the theme here. >> Yeah. >> [Katie] That everybody's kind of got to be data literate. >> But yeah, it is like you have a good intuition. You have a good BS detector basically. And you have a good intuition for hey, this looks a little bit out of line to me. And sometimes that can be based on domain knowledge, right? We have one of our copy editors, she's a big college football fan. And we had an algorithm we released that tries to predict what the human being selection committee will do, and she was like, why is LSU rated so high? Cause I know that LSU sucks this year. And we looked at it, and she was right. There was a bug where it had forgotten to account for their last game where they lost to Troy or something and so -- >> That also speaks to the human element as well. >> It does. 
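The LSU story above is a classic stale-data bug: a rating silently computed over an incomplete game list. A minimal, hypothetical sketch of the kind of cheap guard that catches it; the rating function and the data are invented for illustration, not FiveThirtyEight's actual model.

```python
def rate_team(results):
    """Toy rating: a team's win fraction across the games we have on file."""
    wins = sum(1 for r in results if r == "W")
    return wins / len(results)

def check_freshness(results, games_played):
    """Fail loudly if the result list is missing games the team has played.

    This is the guard that would have caught the 'forgot their last game'
    bug: the rating itself runs fine on stale data, so only an explicit
    cross-check against an independently known game count surfaces it.
    """
    if len(results) != games_played:
        raise ValueError(
            f"rating built on {len(results)} games, "
            f"but the team has played {games_played}"
        )
```

Calling `check_freshness(["W", "W", "W", "L"], games_played=5)` raises immediately, instead of letting an over-rated team ship in the published forecast.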
In general as a rule, if you're designing a kind of regression based model, it's different in machine learning where you have more, when you kind of build in the tolerance for error. But if you're trying to do something more precise, then so much of it is just debugging. It's saying that looks wrong to me. And I'm going to investigate that. And sometimes it's not wrong. Sometimes your model actually has an insight that you didn't have yourself. But fairly often, it is. And I think kind of what you learn is like, hey if there's something that bothers me, I want to go investigate that now and debug that now. Because the last thing you want is where all of a sudden, the answer you're putting out there in the world hinges on a mistake that you made. Cause you never know if you have, so to speak, 1,000 lines of code and they each do something different. You never know when you get in a weird edge case where this one decision you made winds up being the difference between your having a good forecast and a bad one. In a defensible position and an indefensible one. So we definitely are quite diligent and careful. But it's also kind of knowing like, hey, where is an approximation good enough and where do I need more precision? Cause you could also drive yourself crazy in the other direction where you know, it doesn't matter if the answer is 91.2 versus 90. And so you can kind of go 91.2, three, four and it's like kind of A) false precision and B) not a good use of your time. So that's where I do still spend a lot of time is thinking about which problems are "solvable" or approachable with data and which ones aren't. And when they're not by the way, you're still allowed to report on them. We are a news organization so we do traditional reporting as well. And then kind of figuring out when do you need precision versus when is being pointed in the right direction good enough? 
>> I would love to get inside your brain and see how you operate on just like an everyday walking to Walgreens movement. It's like oh, if I cross the street in .2-- >> It's not, I mean-- >> Is it like maddening in there? >> No, not really. I mean, I'm like-- >> This is an honest question. >> If I'm looking for airfares, I'm a little more careful. But no, part of it's like you don't want to waste time on unimportant decisions, right? I will sometimes, if I can't decide what to eat at a restaurant, I'll flip a coin. If the chicken and the pasta both sound really good-- >> That's not high tech Nate. We want better. >> But that's the point, right? It's like both the chicken and the pasta are going to be really darn good, right? So I'm not going to waste my time trying to figure it out. I'm just going to have an arbitrary way to decide. >> Seriously though, in business, how have organizations in the last three to five years evolved with this data boom? How are you seeing it from a consultant's point of view? Do you think it's an exciting time? Do you think it's a you-must-act-now time? >> I mean, we do know that you definitely see a lot of talent among the younger generation now. So FiveThirtyEight has been at ESPN for four years now. And man, the quality of the interns we get has improved so much in four years. The quality of the kind of young hires that we make straight out of college has improved so much in four years. So you definitely do see a younger generation for which this is just part of their bloodstream and part of their DNA. And also, particular fields that we're interested in. So we're interested in people who have both a data and a journalism background. We're interested in people who have a visualization and a coding background. A lot of what we do is very much interactive graphics and so forth. And so we do see those skill sets coming into play a lot more. 
And so the kind of shortage of talent that had I think frankly been a problem for a long time, I'm optimistic based on the young people in our office, it's a little anecdotal but you can tell that there are so many more programs that are kind of teaching students the right set of skills that maybe weren't taught as much a few years ago. >> But when you're seeing these big organizations, ESPN as a perfect example, moving more towards data and analytics than ever before. >> Yeah. >> You would say that's obviously true. >> Oh for sure. >> If you're not moving that direction, you're going to fall behind quickly. >> Yeah and the thing is, if you read my book or I guess people have a copy of the book. In some ways it's saying hey, there are a lot of ways to screw up when you're using data. And we've built bad models. We've had models that were bad and got good results. Good models that got bad results and everything else. But the point is that the reason to be out in front of the problem is so you give yourself more runway to make errors and mistakes. And to learn kind of what works and what doesn't and which people to put on the problem. I sometimes do worry that a company says oh we need data. And everyone kind of agrees on that now. We need data science. Then they have some big test case. And they have a failure. And they maybe have a failure because they didn't know really how to use it well enough. But learning from that and iterating on that. And so by the time that you're on the third generation of kind of a problem that you're trying to solve, and you're watching everyone else make the mistake that you made five years ago, I mean, that's really powerful. But that does mean that getting invested in it now, both in technology and on the human capital side, is important. >> Final question for you as we run out of time. 2018 and beyond, what is your biggest project in terms of data gathering that you're working on? >> There's a midterm election coming up. 
That's a big thing for us. We're also doing a lot of work with NBA data. So for four years now, the NBA has been collecting player tracking data. So they have 3D cameras in every arena. So they can actually kind of quantify, for example, how fast a fast break is. Or literally where a player is and where the ball is. For every NBA game now for the past four or five years. And there hasn't really been an overall metric of player value that's taken advantage of that. The teams do it. But in the NBA, the teams are a little bit ahead of journalists and analysts. So we're trying to have a really truly next generation stat. It's a lot of data. Sometimes I now oversee things more than doing them myself. And so you're parsing through many, many, many lines of code. But yeah, so we hope to have that out at some point in the next few months. >> Anything you've personally been passionate about that you've wanted to work on and kind of solve? >> I mean, the NBA thing, I am a pretty big basketball fan. >> You can do better than that. Come on, I want something real personal that you're like I got to crunch the numbers. >> You know, we tried to figure out where the best burrito in America was a few years ago. >> I'm going to end it there. >> Okay. >> Nate, thank you so much for joining us. It's been an absolute pleasure. Thank you. >> Cool, thank you. >> I thought we were going to chat World Series, you know. Burritos, important. I want to thank everybody here in our audience. Let's give him a big round of applause. >> [Nate] Thank you everyone. >> Perfect way to end the day. And for a replay of today's program, just head on over to ibm.com/dsforall. I'm Katie Linendoll. And this has been Data Science for All: It's a Whole New Game. Test one, two. One, two, three. Hi guys, I just want to quickly let you know as you're exiting. A few heads up. Downstairs right now there's going to be a meet and greet with Nate. 
And we're going to be doing that with clients and customers who are interested. So I would recommend before the game starts, and you lose Nate, head on downstairs. And also the gallery is open until eight p.m. with demos and activations. And tomorrow, make sure to come back too. Because we have exciting stuff. I'll be joining you as your host. And we're kicking off at nine a.m. So bye everybody, thank you so much. >> [Announcer] Ladies and gentlemen, thank you for attending this evening's webcast. If you are not attending our Cloud and Cognitive Summit tomorrow, we ask that you recycle your name badge at the registration desk. Thank you. Also, please note there are two exits at the back of the room, one on either side. Have a good evening. Ladies and gentlemen, the meet and greet will be on stage. Thank you.
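The NBA player-tracking work Silver describes above reduces, at its simplest, to per-frame (x, y) positions for each player. As a toy sketch of quantifying "how fast a fast break is": the coordinate units and frame rate below are assumptions for illustration, not the league's actual camera feed.

```python
import math

def top_speed(track, fps=25.0):
    """Peak speed, in court units per second, from per-frame (x, y) positions.

    track -- list of (x, y) coordinates, one per camera frame
    fps   -- frames per second of the tracking feed (25 is an assumption)
    """
    best = 0.0
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        step = math.hypot(x1 - x0, y1 - y0)  # distance covered in one frame
        best = max(best, step * fps)
    return best
```

A real player-value metric would layer much more on top (who has the ball, spacing, defensive assignments), but every such stat starts from simple kinematics like this over millions of frames.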

Published Date : Nov 1 2017



Data Science: Present and Future | IBM Data Science For All


 

>> Announcer: Live from New York City it's The Cube, covering IBM data science for all. Brought to you by IBM. (light digital music) >> Welcome back to data science for all. It's a whole new game. And it is a whole new game. >> Dave Vellante, John Walls here. We've got quite a distinguished panel. So it is a new game-- >> Well we're in the game, I'm just happy to be-- (both laugh) Have a swing at the pitch. >> Well let's see what we have here. Five distinguished members of our panel. It'll take me a minute to get through the introductions, but believe me they're worth it. Jennifer Shin joins us. Jennifer's the founder of 8 Path Solutions, the director of data science at Comcast and part of the faculty at UC Berkeley and NYU. Jennifer, nice to have you with us, we appreciate the time. Joe McKendrick, an analyst and contributor to Forbes and ZDNet. Joe, thank you for being here as well. Another ZDNetter next to him, Dion Hinchcliffe, who is a vice president and principal analyst at Constellation Research and also contributes to ZDNet. Good to see you, sir. To the back row, but that doesn't mean anything about the quality of the participation here. Bob Hayes with a killer Batman shirt on by the way, which we'll get to explain in just a little bit. He runs Business Over Broadway. And Joe Caserta, who is the founder of Caserta Concepts. Welcome to all of you. Thanks for taking the time to be with us. Jennifer, let me just begin with you. Obviously as a practitioner you're very involved in the industry, and you're on the academic side as well. We mentioned Berkeley, NYU, deep experience. So I want you to kind of take your foot in both worlds and tell me about data science. I mean where do we stand now from those two perspectives? How have we evolved to where we are? And how would you describe, I guess, the state of data science? >> Yeah so I think that's a really interesting question. There's a lot of changes happening. 
In part because data science has now become much more established, both on the academic side as well as in industry. So now you see some of the bigger problems coming out. People have managed to have data pipelines set up. But now there are these questions about models and accuracy and data integration. So the really cool stuff from the data science standpoint. We get to get really into the details of the data. And I think on the academic side you now see undergraduate programs, not just graduate programs, but undergraduate programs being involved. UC Berkeley just did a big initiative that they're going to offer data science to undergrads. So that's huge news for the university. So I think there's a lot of interest from the academic side to continue data science as a major, as a field. But I think in industry one of the difficulties you're now having is businesses are now asking that question of ROI, right? What do I actually get in return in the initial years? So I think there's a lot of work to be done and just a lot of opportunity. It's great because people now understand data science better, but I think data scientists have to take that question seriously and really think about how am I actually getting a return, or adding value to the business? >> And there's a lot to be said, is there not, just in terms of increasing the workforce, the acumen, the training that's required now. It's a still relatively new discipline. So is there a shortage issue? Or is there just a great need? Is the opportunity there? I mean how would you look at that? >> Well I always think there's opportunity to be smart. If you can be smarter, you know it's always better. It gives you advantages in the workplace, it gives you an advantage in academia. The question is, can you actually do the work? The work's really hard, right? You have to learn all these different disciplines, you have to be able to technically understand data. 
Then you have to understand it conceptually. You have to be able to model with it, you have to be able to explain it. There's a lot of aspects that you're not going to pick up overnight. So I think part of it is endurance. Like are people going to feel motivated enough and dedicate enough time to it to get very good at that skill set. And also of course, you know in terms of industry, will there be enough interest in the long term that there will be a financial motivation. For people to keep staying in the field, right? So I think it's definitely a lot of opportunity. But that's always been there. Like I tell people I think of myself as a scientist and data science happens to be my day job. That's just the job title. But if you are a scientist and you work with data you'll always want to work with data. I think that's just an inherent need. It's kind of a compulsion, you just kind of can't help yourself, but dig a little bit deeper, ask the questions, you can't not think about it. So I think that will always exist. Whether or not it's an industry job in the way that we see it today, and like five years from now, or 10 years from now. I think that's something that's up for debate. >> So all of you have watched the evolution of data and how it effects organizations for a number of years now. If you go back to the days when data warehouse was king, we had a lot of promises about 360 degree views of the customer and how we were going to be more anticipatory in terms and more responsive. In many ways the decision support systems and the data warehousing world didn't live up to those promises. They solved other problems for sure. And so everybody was looking for big data to solve those problems. And they've begun to attack many of them. We talked earlier in The Cube today about fraud detection, it's gotten much, much better. Certainly retargeting of advertising has gotten better. But I wonder if you could comment, you know maybe start with Joe. 
As to the effect that data and data science have had on organizations in terms of fulfilling that vision of a 360 degree view of customers and anticipating customer needs. >> So. Data warehousing, I wouldn't say failed. But I think it was unfinished in order to achieve what we need done today. At the time I think it did a pretty good job. I think it was the only place where we were able to collect data from all these different systems, have it in a single place for analytics. The big difference, I think, between data warehousing and data science is data warehouses were primarily made for consumption by human beings. To be able to have people look through some tool and be able to analyze data manually. That really doesn't work anymore, there's just too much data to do that. So that's why we need to build a science around it so that we can actually have machines doing the analytics for us. And I think that's the biggest stride in the evolution over the past couple of years, that now we're actually able to do that, right? It used to be, you know you go back to when data warehouses started, you had to be a deep technologist in order to be able to collect the data, write the programs to clean the data. But now your average casual IT person can do that. Right now I think we're back in data science where you have to be a fairly sophisticated programmer, analyst, scientist, statistician, engineer, in order to do what we need to do, in order to make machines actually understand the data. But I think part of the evolution, we're just in the forefront. We're going to see over the next, not even years, within the next year I think a lot of new innovation where the average person within business and definitely the average person within IT will be able to say, "What are my sales going to be next year?" as easily as it is to say, "What were my sales last year?" Where now it's a big deal. 
Right now in order to do that you have to build some algorithms, you have to be a specialist in predictive analytics. And I think, you know as the tools mature, as people using data mature, and as the technology ecosystem for data matures, it's going to be easier and more accessible. >> So it's still too hard. (laughs) That's something-- >> Joe C.: Today it is, yes. >> You've written about and talked about. >> Yeah, no question about it. We see this citizen data scientist. You know we talked about the democratization of data science but the way we talk about analytics and warehousing and all the tools we had before, they generated a lot of insights and views on the information, but they didn't really give us the science part. And that's, I think, what's missing: the forming of the hypothesis, the closing of the loop. We now have use of this data, but are we changing, are we thinking about it strategically? Are we learning from it and then feeding that back into the process? I think that's the big difference between data science and the analytics side. But, you know just like Google made search available to everyone, not just people who had highly specialized indexers or crawlers. Now we can have tools that make these capabilities available to anyone. You know going back to what Joe said I think the key thing is we now have tools that can look at all the data and ask all the questions. 'Cause we can't possibly do it all ourselves. Our organizations are increasingly awash in data. Which is the life blood of our organizations, but we're not using it, you know this is the whole concept of dark data. And so I think the concept, or the promise of opening these tools up for everyone to be able to access those insights and activate them, I think that, you know, that's where it's headed. >> This is kind of where the T shirt comes in, right? So Bob if you would, so you've got this Batman shirt on. 
We talked a little bit about it earlier, but it plays right into what Dion's talking about. About tools and, I don't want to spoil it, but you go ahead (laughs) and tell me about it. >> Right, so. Batman is a superhero, but he doesn't have any supernatural powers, right? He can't fly on his own, he can't become invisible on his own. But the thing is he has the utility belt and he has these tools he can use to help him solve problems. For example he has the batarang when he's confronted with a building that he wants to get over, right? So he pulls it out and uses that. So as data professionals we have all these tools now that these vendors are making. We have IBM SPSS, we have Data Science Experience, IBM Watson, that these data pros can now use as part of their utility belt and solve problems that they're confronted with. So if you're ever confronted with, like, a churn problem and you have somebody who has access to that data, they can put that into IBM Watson, ask a question and it'll tell you what's the key driver of churn. So it's not that you have to be superhuman to be a data scientist, but these tools will help you solve certain problems and help your business go forward. >> Joe McKendrick, do you have a comment? >> Does that make the Batmobile the Watson? (everyone laughs) Analogy? >> I was just going to add that, you know, of all of the billionaires in the world today, none of them has decided to become Batman yet. It's very disappointing. >> Yeah. (Joe laughs) >> Go ahead Joe. >> And I just want to add some thoughts to our discussion about what happened with data warehousing. I think it's important to point out as well that data warehousing, as it existed, was fairly successful, but for larger companies. Data warehousing is a very expensive proposition, it remains an expensive proposition. Something that's in the domain of the Fortune 500. But today's economy is based on a very entrepreneurial model. The Fortune 500 is out there, of course it's ever shifting. 
But you have a lot of smaller companies, a lot of people with startups. You have people within divisions of larger companies that want to innovate and not be tied to the corporate balance sheet. They want to be able to innovate and experiment without having to go through the finance department. So there's all these open source tools available. There's cloud resources as well as open source tools, Hadoop of course being a prime example, where you can work with the data and experiment with the data and practice data science at a very low cost. >> Dion mentioned the C word, citizen data scientist, last year at the panel. We had a conversation about that. And the data scientists on the panel generally were like, "Stop." Okay, we're not all of a sudden going to turn everybody into data scientists, however, what we want to do is get people thinking about data, more focused on data, becoming a data-driven organization. I mean as a data scientist I wonder if you could comment on that. >> Well I think the other side of that is, you know there are also many people who maybe didn't follow through with science, 'cause it's also expensive. A PhD takes a lot of time. And you know if you don't get funding it's a lot of money. And for very little security, if you think about how hard it is to get a teaching job that's going to give you enough of a payoff to pay that back. Right, the time that you took off, the investment that you made. So I think the other side of that is by making data more accessible, you allow people who could have been great in science an opportunity to be great data scientists. And so I think for me the idea of citizen data scientist, that's where the opportunity is. I think in terms of democratizing data and making it available for everyone, I feel as though it's something similar to the way we didn't really know what KPIs were, maybe 20 years ago. 
People didn't use them as readily, didn't teach them in schools. I think maybe 10, 20 years from now, some of the things that we're building today with data science, hopefully more people will understand how to use these tools. They'll have a better understanding of working with data and what that means, and just data literacy, right? Just being able to use these tools and be able to understand what data's saying and actually what it's not saying. Which is the thing that most people don't think about. But you can also say that data doesn't say anything. There's a lot of noise in it. There's too much noise to be able to say that there is a result. So I think that's the other side of it. So yeah I guess for me, in terms of the citizen data scientist, I think it's a great idea to have that, right? But at the same time of course everyone kind of emphasized you don't want everyone out there going, "I can be a data scientist without education, "without statistics, without math," without understanding of how to implement the process. I've seen a lot of companies implement the same sort of process from 10, 20 years ago just on Hadoop instead of SQL. Right, and it's very inefficient. And the only difference is that you can build more tables wrong than they could before. (everyone laughs) >> Which is I guess an accomplishment. >> For less. >> And for less, it's cheaper, yeah. >> It is cheaper. >> Otherwise we're like, I'm not a data scientist but I did stay at a Holiday Inn Express last night, right? >> Yeah. (panelists laugh) And there's like a little bit of pride that they used 2,000 computers to do it. Like a little bit of pride about that, but you know of course maybe not a great way to go. I think 20 years ago we couldn't do that, right? One computer was already an accomplishment to have that resource. 
So I think you have to think about the fact that if you're doing it wrong, you're going to just make that mistake bigger, which is also the other side of working with data. >> Sure, Bob. >> Yeah I have a comment about that. I've never liked the term citizen data scientist, or citizen scientist. I get the point of it, and I think employees within companies can help in the data analytics problem by maybe being a data collector or something. I mean I would never have just somebody become a scientist based on a few classes he or she takes. It's like saying, "Oh I'm going to be a citizen lawyer," and so you come to me with your legal problems, or a citizen surgeon. Like you need training to be good at something. You can't just be good at something just 'cause you want to be. >> John: Joe you wanted to say something too on that. >> Since we're in New York City I'd like to use the analogy of a real scientist versus a data scientist. So a real scientist requires tools, right? And the tools are not new, like microscopes and a laboratory and a clean room. And these tools have evolved over years and years, and since we're in New York we could walk within a 10 block radius and buy any of those tools. It doesn't make us a scientist because we use those tools. I think with data, making the tools evolve and become easier to use, you know like Bob was saying, it doesn't make you a better data scientist, it just makes the data more accessible. You know we can go buy a microscope, we can go buy Hadoop, we can buy any kind of tool in a data ecosystem, but it doesn't really make you a scientist. I'm very involved in the NYU data science program and the Columbia data science program, like these kids are brilliant. You know these kids are not someone who is just trying to run a day to day job in corporate America. I think the people who are running the day to day job in corporate America are going to be the recipients of data science. 
Just like people who take drugs, right? As a result of a smart data scientist coming up with a formula that can help people, I think we're going to make it easier to distribute the data that can help people with all the new tools. But the access to the data and tools available doesn't really make you a better data scientist without, like Bob was saying, better training and education. >> So how-- I'm sorry, how do you then, if it's not for everybody, but yet I'm the user at the end of the day at my company and I've got these reams of data before me, how do you make it make better sense to me? So that's where machine learning comes in, or artificial intelligence and all this stuff. So how at the end of the day, Dion? How do you make it relevant and usable, actionable, to somebody who might not be as practiced as you would like? >> I agree with Joe that many of us will be the recipients of data science. Just like you had to be a computer scientist at one point to develop programs for a computer, now we can just get the programs. You don't need to be a computer scientist to get a lot of value out of our IT systems. The same thing's going to happen with data science. There's far more demand for data science than could ever be produced by, you know, an ivory tower filled with data scientists. Which, we need those guys, too, don't get me wrong. But we need to productize it and make it available in packages such that it can be consumed. The outputs and even some of the inputs can be provided by mere mortals, whether that's machine learning or artificial intelligence or bots that go off and run the hypotheses and select the algorithms, maybe with some human help. We have to productize it. This is the concept of data science as a service, which is becoming a thing now. It's, "I need this, I need this capability at scale. "I need it fast and I need it cheap." The commoditization of data science is going to happen. 
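[Editor's note] The "bots that run the hypotheses and select the algorithms" idea above can be sketched in a few lines. This is a minimal, illustrative toy (not any panelist's actual system, and the function names are invented for this example): try a couple of candidate models, score each on a holdout set, and automatically keep the winner so the consumer only sees the forecast.

```python
# A minimal sketch of automated model selection: fit each candidate model
# on training data, score it on a validation split, and keep the best.
# Pure stdlib; all names here are illustrative, not a real product API.

def fit_mean(xs, ys):
    # Baseline model: always predict the historical average.
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_linear(xs, ys):
    # Ordinary least squares for y = a*x + b, in closed form.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var if var else 0.0
    b = my - a * mx
    return lambda x: a * x + b

def auto_select(xs, ys):
    # Split into train/validation, fit each candidate, keep the one
    # with the lowest mean squared error on the holdout.
    split = int(len(xs) * 0.7)
    tx, ty, vx, vy = xs[:split], ys[:split], xs[split:], ys[split:]
    best_name, best_model, best_err = None, None, float("inf")
    for name, fit in [("mean", fit_mean), ("linear", fit_linear)]:
        model = fit(tx, ty)
        err = sum((model(x) - y) ** 2 for x, y in zip(vx, vy)) / len(vx)
        if err < best_err:
            best_name, best_model, best_err = name, model, err
    return best_name, best_model

# Monthly sales with a clear upward trend: the linear model should win.
months = list(range(12))
sales = [100 + 10 * m for m in months]
name, model = auto_select(months, sales)
print(name, round(model(12)))  # → linear 220
```

Real "data science as a service" platforms search far larger model spaces, but the consumable shape is the same: data in, best-available forecast out, with the hypothesis testing hidden behind the interface.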
>> That goes back to what I was saying about, the recipient of data science is also machines, right? Because I think the other thing that's happening now in the evolution of data is that the data is so tightly coupled. Back when you were talking about data warehousing you had all the business transactions, then you took the data out of those systems and put it in a warehouse for analysis, right? Maybe they'd make a decision to change that system at some point. Now the analytics platform and the business application are very tightly coupled. They become dependent upon one another. So you know people who are using the applications are now able to take advantage of the insights of data analytics and data science, just through the app. Which never really existed before. >> I have one comment on that. You were talking about how do you get the end user more involved, well like we said earlier data science is not easy, right? As an end user, I encourage you to take a stats course, just a basic stats course, understanding what a mean is, variability, regression analysis, just basic stuff. So you as an end user can glean more insight from the reports that you're given, right? If you go to France and don't know French, then people can speak really slowly to you in French, but you're not going to get it. You need to understand the language of data to get value from the technology we have available to us. >> Incidentally French is one of the languages that you have the option of learning if you're a mathematician. So math PhDs are required to learn a second language. France being the country of algebra, that's one of the languages you could actually learn. Anyway, tangent. But going back to the point. So statistics courses, definitely encourage it. I teach statistics. And one of the things that I'm finding as I go through the process of teaching it is I'm actually bringing in my experience. 
And by bringing in my experience I'm actually kind of making the students think about the data differently. So the other thing people don't think about is the fact that statisticians typically were expected to do, you know, just basic sort of tasks. In the sense that their knowledge is specialized, right? But the day to day operations was they ran a test on some data, looked at the results, interpreted the results based on what they were taught in school. They didn't develop the model a lot of times, they just understood what the tests were saying, especially in the medical field. So when you think about things like, we have words like population, census. Which is when you have every single data point, versus a sample, which is a subset. It's a very different story now that we're collecting data faster than we used to. It used to be that the idea of collecting information from everyone, like the census, happened once every 10 years, we built that in. But nowadays, you know, you hear about Facebook, for instance, I think they claimed earlier this year that their data was more accurate than the census data. So now there are these claims being made about which data source is more accurate. And I think the other side of this is now statisticians are expected to know data in a different way than they were before. So it's not just data science changing as a field, but I think the sciences that are using data are also changing their fields as well. >> Dave: So is sampling dead? >> Well no, because-- >> Should it be? (laughs) >> Well if you're sampling wrong, yes. That's really the question. >> Okay. You know it's been said that the data doesn't lie, people do. Organizations are very political. Oftentimes, you know, lies, damned lies and statistics, Benjamin Disraeli. Are you seeing a change in the way in which organizations are using data in the context of the politics? 
So, some strong P&L manager, say, gets data and crafts it in a way that he or she can advance their agenda. Or they'll maybe attack a data set that probably should drive them in a different direction, but might be antithetical to their agenda. Are you seeing data, you know we talked about democratizing data, are you seeing that reduce the politics inside of organizations? >> So you know we've always used data to tell stories, at the top level of an organization that's what it's all about. And I still see very much that no matter how much data science, or the access to the truth through looking at the numbers, that storytelling is still the political filter through which all that data still passes, right? But with the advent of things like blockchain, more and more corporate records and corporate information are going to end up in these open and shared repositories where there is no alternate truth. It'll come back to whoever tells the best stories at the end of the day. So I still see that organizations are very political. We are seeing more open data though. Open data initiatives are a big thing, both in government and in the private sector. It is having an effect, but it's slow and steady. So that's what I see. >> Go ahead. >> I was just going to say as well, ultimately I think data driven decision making is a great thing. And it's especially useful at the lower tiers of the organization, where you have the routine day to day decisions that could be automated through machine learning and deep learning. The algorithms can be improved on a constant basis. On the upper levels, you know, that's why you pay executives the big bucks, to make the strategic decisions. And data can help them, but ultimately data, IT, technology alone will not create new markets, it will not drive new businesses, it's up to human beings to do that. The technology is the tool to help them make those decisions. 
But creating businesses, growing businesses, is very much a human activity. And that's something I don't see ever getting replaced. Technology might replace many other parts of the organization, but not that part. >> I tend to be a foolish optimist when it comes to this stuff. >> You do. (laughs) >> I do believe that data will make the world better. I do believe that data doesn't lie, people lie. You know I'm already seeing trends in all different industries where conventional wisdom is starting to get trumped by analytics. You know I think it's still up to the human being today to ignore the facts and go with what they think in their gut, and sometimes they win, sometimes they lose. But generally if they lose, the data will tell them that they should have gone the other way. I think as we start relying more on data and trusting data through artificial intelligence, as we start making our lives a little bit easier, as we start using smart cars for safety before replacement of humans, as we start using data and analytics and data science really as the bumpers, instead of the vehicle, eventually we're going to start to trust it as the vehicle itself. And then it's going to make lying a little bit harder. >> Okay, so great, excellent. Optimism, I love it. (John laughs) So I'm going to play devil's advocate here a little bit. There's a couple of elephant-in-the-room topics that I want to explore a little bit. >> Here it comes. >> There was an article today in Wired. And it was called, "Why AI Is Still Waiting for Its Ethics Transplant." And I will just read a little segment from there. It says, new ethical frameworks for AI need to move beyond individual responsibility to hold powerful industrial, government and military interests accountable as they design and employ AI. 
When tech giants build AI products, too often user consent, privacy and transparency are overlooked in favor of frictionless functionality that supports profit driven business models based on aggregate data profiles. This is from Kate Crawford and Meredith Whittaker, who founded AI Now. And they're calling for, sort of, almost clinical trials on AI, if I could use that analogy. Before you go to market you've got to test the human impact, the social impact. Thoughts. >> And also have the ability for a human to intervene at some point in the process. This goes way back. Is everybody familiar with the name Stanislav Petrov? He's the Soviet officer who, back in 1983, was in the control room, I guess somewhere outside of Moscow, which detected a nuclear missile attack against the Soviet Union coming out of the United States. Ordinarily, I think if this was an entirely AI driven process we wouldn't be sitting here right now talking about it. But this gentleman looked at what was going on on the screen and, I'm sure he's accountable to his authorities in the Soviet Union, he probably got in a lot of trouble for this, but he decided to ignore the signals, ignore the data coming from the Soviet satellites. And as it turned out, of course, he was right. The Soviet satellites were seeing glints of the sun and they were interpreting those glints as missile launches. And I think that's a great example why, you know, not every situation of course means the end of the world, (laughs) though it would have in this case. But it's a great example why there needs to be a human component, a human ability for intervention at some point in the process. >> So, other thoughts. I mean organizations are driving AI hard for profit. The best minds of our generation are trying to figure out how to get people to click on ads. Jeff Hammerbacher is famous for saying it. >> You can use data for a lot of things, data analytics, you can solve, you can cure cancer. 
You can make customers click on more ads. It depends on what your goal is. But there are ethical considerations we need to think about. When we have data that has a racial bias against blacks, giving them higher prison sentences or worse credit scores and so forth, that has an impact on a broad group of people. And as a society we need to address that. And as scientists we need to consider how we are going to fix that problem. Cathy O'Neil in her book, "Weapons of Math Destruction," excellent book, I highly recommend that your listeners read that book. And she talks about these issues, about if algorithms have a widespread impact, if they adversely impact a protected group. And I forget the last criterion, but we need to really think about these things as a people, as a country. >> So I always think the idea of ethics is interesting. I have this conversation come up a lot of times when I talk to data scientists. I think as a concept, right, as an idea, yes you want things to be ethical. The question I always pose to them is, "Well in the business setting "how are you actually going to do this?" 'Cause I find the most difficult thing working as a data scientist is to be able to make the day to day decision of, when someone says, "I don't like that number," how do you actually get around that, if that's the right data to be showing someone, or if that's accurate. And say the business decides, "Well we don't like that number." Many people feel pressured to then change the data, or change what the data shows. So I think being able to educate people, to be able to find ways to say what the data is saying, but not going past some line where it's a lie, where it's unethical. 'Cause you can also say what data doesn't say. You don't always have to say what the data does say. You can leave it as, "Here's what we do know, "but here's what we don't know." There's a don't-know part that many people will omit when they talk about data. 
So I think, you know, especially when it comes to things like AI it's tricky, right? Because, I always tell people, I don't know why everyone thinks AI's going to be so amazing. I started in the industry by fixing problems with computers that people didn't realize computers had. For instance when you have a system there are a lot of bugs, we all have bug reports that we've probably submitted. I mean really it's nowhere near the point where it's going to start dominating our lives and taking over all the jobs. Because frankly it's not that advanced. It's still run by people, still fixed by people, still managed by people. I think with ethics, you know, a lot of it has to do with the regulations, what the laws say. That's really going to be what's involved in terms of what people are willing to do. A lot of businesses, they want to make money. If there's no rule that says they can't do certain things to make money, then there's no restriction. I think the other thing to think about is we as consumers, in our everyday lives, shouldn't separate the idea of data as a business, which we think of as business people, from our day to day consumer lives. Meaning, yes I work with data. Incidentally I also always opt out of my credit card data sharing, you know when they send you that information, they make you actually mail them, like old school snail mail, a document that says, okay I don't want to be part of this data collection process. Which I always do. It's a little bit more work, but I go through that step of doing it. Now if more people did that, perhaps companies would feel more incentivized to pay you for your data, or give you more control of your data. Or at least, you know, if a company's going to collect information, I'd want there to be certain processes in place to ensure that it doesn't just get sold, right? For instance if a startup gets acquired, what happens with the data they have on you? You agreed to give it to the startup. But I mean what are the rules on that? 
So I think we have to really think about the ethics not just from, you know, the perspective of someone who's going to implement something, but as consumers, what control we have over our own data. 'Cause that's going to directly impact what businesses can do with our data. >> You know you mentioned data collection. So slightly on that subject. All these great new capabilities we have coming. We talked about what's going to happen with media in the future and what 5G technology's going to do to mobile and these great bandwidth opportunities. The internet of things and the internet of everywhere. And all these great inputs, right? Do we have an arms race, like are we keeping up with the capabilities to make sense of all the new data that's going to be coming in? And how do those things square up in this? Because the potential is fantastic, right? But are we keeping up with the ability to make it make sense and to put it to use, Joe? >> So I think data ingestion and data integration is probably one of the biggest challenges, especially as the world is starting to become more dependent on data. You know, just because we're dependent on numbers we've come up with GAAP, the generally accepted accounting principles, which can be audited and proven true or false. I think in our lifetime we will see something similar for data, where we have formal checks and balances on the data we use that can be audited. Getting back to what Dave was saying earlier, I personally would rather trust a machine that was programmed to do the right thing than trust a politician or some leader that may have their own agenda. And I think the other thing about machines is that they are auditable. You know you can look at the code and see exactly what it's doing and how it's doing it. Human beings, not so much. So I think getting to the truth, even if the truth isn't the answer that we want, I think is a positive thing. 
It's something that we can't do today, but once we start relying on machines to do it we'll be able to get there. >> Yeah I was just going to add that we live in exponential times. And the challenge is that the way that we're structured traditionally as organizations is not allowing us to absorb advances exponentially, it's linear at best. Everyone talks about change management and how we are going to do digital transformation. Evidence shows that technology's forcing the leaders and the laggards apart. There's a few leading organizations that are eating the world and they seem to be somehow rolling out new things. I don't know how Amazon rolls out all this stuff. There's all this artificial intelligence and the IoT devices, Alexa, natural language processing, and that's just a fraction, just a tip of what they're releasing. So it just shows that there are some organizations that have found the path. Most of the Fortune 500 from the year 2000 are gone already, right? The disruption is happening. And so we have to find some way to adopt these new capabilities and deploy them effectively, or the writing is on the wall. I've spent a lot of time exploring this topic; how are we going to get there, and all of us have a lot of hard work, is the short answer. >> I read that it was predicted there's going to be more data created in this year than in the past 5,000 years. >> Forever. (laughs) >> And, to mix in another statistic, that we're currently analyzing less than 1% of the data. Taking those numbers and hearing what you're all saying, it's like, we're not keeping up. It seems like it's not even linear. I mean that gap is just going to grow and grow and grow. How do we close that? >> There's a guy out there named Chris Dancy, he's known as the human cyborg. He has 700 sensors all over his body. And his theory is that data's not new, having access to the data is new. 
You know we've always had a blood pressure, we've always had a sugar level. But we were never able to actually capture it in real time before. So now that we can capture and harness it, now we can be smarter about it. So I think that being able to use this information is really incredible; this is something that over our lifetime we've never had, and now we can do it. Hence the big explosion in data. But I think how we use it and how it's governed is the challenge right now. It's kind of cowboys and Indians out there right now. And without proper governance and without rigorous regulation I think we are going to have some bumps in the road along the way. >> The data is the oil, the question is how are we actually going to operationalize around it? >> Or find it. Go ahead. >> I will say the other side of it is, if you think about information, we always have the same amount of information, right? What we choose to record, however, is a different story. Now if you wanted to know things about the Olympics, but you decided to collect information every day for years instead of just the Olympic year, yes you have a lot of data, but did you need all of that data? For that question about the Olympics, you don't need to collect data during years there are no Olympics, right? Unless of course you're comparing it relatively. But I think that's another thing to think about. Just 'cause you collect more data does not mean that data will produce more statistically significant results, it does not mean it'll improve your model. You can be collecting data about your shoe size trying to get information about your hair. I mean it really does depend on what you're trying to measure, what your goals are, and what the data's going to be used for. If you don't factor the real world context into it, then yeah you can collect data, you know, an infinite amount of data, but you'll never process it. Because you have no question to ask, you're not looking to model anything. 
There is no universal truth about everything, that just doesn't exist out there. >> I think she's spot on. It comes down to what kind of questions you are trying to ask of your data. You can have one given database that has 100 variables in it, right? And you can ask it five different questions, all valid questions, and that data may have the variables that'll tell you what's the best predictor of churn, or what's the best predictor of cancer treatment outcome. If you can ask the right question of the data you have, then that'll give you some insight. Just data for data's sake, that's just hype. We have a lot of data, but it may not lead to anything if we don't ask it the right questions. >> Joe. >> I agree, but I just want to add one thing. This is where the science in data science comes in. Scientists often will look at data that's already been in existence for years, weather forecasts, weather data, climate change data for example, that go back to data charts and so forth going back centuries if that data is available. And they reformat it, they reconfigure it, they get new uses out of it. And the potential I see with the data we're collecting is it may not be of use to us today, because we haven't thought of ways to use it, but maybe 10, 20, even 100 years from now someone's going to think of a way to leverage the data, to look at it in new ways and to come up with new ideas. That's just my thought on the science aspect. >> Knowing what you know about data science, why did Facebook miss Russia and the fake news trend? They came out and admitted it. You know, "We missed it." Why? Could they have, is it because they were focused elsewhere? Could they have solved that problem? (crosstalk) >> It's what you said, which is, are you asking the right questions? And if you're not looking for that problem in exactly the way that it occurred you might not be able to find it. >> I thought the ads were paid in rubles. 
Shouldn't that be your first clue (panelists laugh) that something's amiss? >> You know, red flag, so to speak. >> Yes. >> I mean with Bitcoin maybe it could have hidden it. >> Bob: Right, exactly. >> I would think too that what happened last year actually was the end of an age of optimism. I'll bring up the Soviet Union again. (chuckles) It collapsed back in 1990, 1991, and Russia was reborn. And I think there was a general feeling of optimism in the '90s through the 2000s that Russia was now being well integrated into the world economy, as other nations all over the globe, all continents, were being integrated into the global economy thanks to technology. And technology is lifting entire continents out of poverty and ensuring more connectedness for people. Across Africa, India, Asia, we're seeing those economies become very different than they were 20 years ago, and that extended into Russia as well. Russia is part of the global economy. We're able to communicate as a global network. I think as a result we kind of overlooked the dark side that occurred. >> John: Joe? >> Again, the foolish optimist here. But I think that... It shouldn't be the question of how did we miss it. It's, do we have the ability now to catch it? And I think without data science, without machine learning, without being able to train machines to look for patterns that involve corruption or result in corruption, I think we'd be out of luck. But now we have those tools. And now hopefully, optimistically, by the next election we'll be able to detect these things before they become public. >> It's a loaded question, because my premise was Facebook had the ability and the tools and the knowledge and the data science expertise if in fact they wanted to solve that problem, but they were focused on other problems, which is how do I get people to click on ads? >> Right, they had the ability to train the machines, but they were giving the machines the wrong training. >> Looking under the wrong rock. 
>> (laughs) That's right. >> It is easy to play armchair quarterback. Another topic I wanted to ask the panel about is IBM Watson. You guys spend time in the Valley, I spend time in the Valley. People in the Valley poo-poo Watson. Ah, Google, Facebook, Amazon, they've got the best AI. Watson, and some of that's fair criticism. Watson's a heavy lift, very services oriented, you've just got to apply it in a very focused way. At the same time Google's trying to get you to click on ads, as is Facebook, Amazon's trying to get you to buy stuff. IBM's trying to solve cancer. Your thoughts on that sort of juxtaposition of the different AI suppliers, and there may be others. Oh, nobody wants to touch this one, come on. I told you, elephant in the room questions. >> Well, I mean you're looking at two very different types of organizations. One of which has really spent decades in applying technology to business, and these other companies are ones that are primarily into the consumer, right? When we talk about things like IBM Watson you're looking at a very different type of solution. You used to be able to buy IT and once you installed it you pretty much could get it to work and store your records or, you know, do whatever it is you needed it to do. But these types of tools, like Watson, actually try to learn your business. And they need to spend time doing that, watching the data and having their models tuned. And so you don't get the results right away. And I think that's been kind of the challenge that organizations like IBM have had. Like, this is a different type of technology solution, one that has to actually learn first before it can provide value. And so I think, you know, you have organizations like IBM that are much better at applying technology to business, and then they have the further hurdle of having to try to apply these tools that work in very different ways. There's education too on the side of the buyer. 
>> I'd have to say that, you know, I think there's plenty of businesses out there also trying to solve very significant, meaningful problems. You know, with Microsoft AI and Google AI and IBM Watson, I think it's not really the tool that matters, like we were saying earlier. A fool with a tool is still a fool, regardless of who the manufacturer of that tool is. And I think, you know, having a thoughtful, intelligent, trained, educated data scientist using any of these tools can be equally effective. >> So do you not see core AI competence, and I left out Microsoft, as a strategic advantage for these companies? Is it going to be so ubiquitous and available that virtually anybody can apply it? Or is all the investment in R&D and AI going to pay off for these guys? >> Yeah, so I think there's different levels of AI, right? So there's AI where you can actually improve the model. I remember when Watson was kind of first out, I was invited by IBM to a private sort of presentation. And my question was, "Okay, so when do I get to access the corpus?" The corpus being sort of the foundation of NLP, which is natural language processing. So it's what you use as almost like a dictionary, like how you're actually going to measure things, or look things up. And they said, "Oh, you can't." "What do you mean I can't?" It's like, "We do that." "So you're telling me as a data scientist you're expecting me to rely on the fact that you did it better than me, and I should rely on that." I think over the years after that IBM started opening it up and offering different ways of being able to access the corpus and work with that data. But I remember at the first Watson hackathon there were only two corpora available. It was either travel or medicine. There was no other foundational data available. 
So I think one of the difficulties was, you know, IBM being a little bit more on the forefront of it, they kind of had that burden of having to develop these systems and learning kind of the hard way that if you don't have the right models and you don't have the right data and you don't have the right access, that's going to be a huge limiter. I think with things like medical information, that's extremely difficult data to start with. Partly because, you know, anything that you do find or don't find, the impact is significant. If I'm looking at things like what people clicked on, the impact of using that data wrong is minimal. You might lose some money. If you do that with healthcare data, if you do that with medical data, people may die. Like, this is a much more difficult data set to start with. So I think from a scientific standpoint it's great to have any information about a new technology, new process. That's the nice thing, that IBM's obviously invested in it and collected information. I think the difficulty there, though, is just 'cause you have it, you can't solve everything. And I feel like, as someone who works in technology, I think in general when you appeal to developers you try not to market. And with Watson it's very heavily marketed, which tends to turn off people who are more from the technical side. Because I think they don't like it when it's gimmicky, in part because they do the opposite of that. They're always trying to build up the technical components of it. They don't like it when you're trying to convince them that you're selling them something when you could just give them the specs and look at it. So it could be something as simple as communication. But I do think it is valuable to have had a company who leads on the forefront of that and tries to, so we can actually learn from what IBM has learned from this process. >> But you're an optimist. (John laughs) All right, good. >> Just one more thought. >> Joe, go ahead first. 
>> Joe: I want to see how Alexa or Siri do on Jeopardy. (panelists laugh) >> All right. Going to go around, a final thought, give you a second. Let's just think about, like, your 12-month crystal ball. In terms of either challenges that need to be met in the near term or opportunities you think will be realized. 12-, 18-month horizon. Bob, you've got the microphone headed your way, so I'll let you lead off and let's just go around. >> I think a big challenge for business, for society, is getting people educated on data and analytics. There's a study that was just released, I think last month, by ServiceNow, I think, or some vendor, or Qlik. They found that only 17% of the employees in Europe have the ability to use data in their job. Think about that. >> 17. >> 17. Less than 20%. So these people don't have the ability to understand or use data intelligently to improve their work performance. That says a lot about the state we're in today. And that's Europe. It's probably a lot worse in the United States. So that's a big challenge I think. To educate the masses. >> John: Joe. >> I think we probably have a better chance of improving technology over training people. I think using data needs to be iPhone easy. And, you know, that means that a lot of innovation is in the years to come. I do think that a keyboard is going to be a thing of the past for the average user. We are going to start using voice a lot more. I think augmented reality is going to become a real reality. Where we can hold our phone in front of an object and it will have an overlay of prices and where it's available; if it's a person, I think that we will see, within an organization, holding a camera up to someone and being able to see what is their salary, what sales did they do last year, some key performance indicators. I hope that we are beyond the days of everyone around the world walking around like this, and we start actually becoming more social as human beings through augmented reality. 
I think it has to happen. I think we're going through kind of foolish times at the moment in order to get to the greater good. And I think the greater good is using technology in a very, very smart way. Which means that you shouldn't have to be, sorry to contradict, but maybe it's good to counterpoint. I don't think you need to have a PhD in SQL to use data. Like, I think that's 1990. I think as we evolve it's going to become easier for the average person. Which means people like the brain trust here need to get smarter and start innovating. I think the innovation around data is really at the tip of the iceberg; we're going to see a lot more of it in the years to come. >> Dion, why don't you go ahead, then we'll come down the line here. >> Yeah, so I think over that time frame two things are likely to happen. One is somebody's going to crack the consumerization of machine learning and AI, such that it really is available to the masses and we can do much more advanced things than we could. We see that industries tend to reach an inflection point and then there's an explosion. No one's quite cracked the code on how to really bring this to everyone, but somebody will. And that could happen in that time frame. And then the other thing that I think almost has to happen is that the forces for openness, open data, data sharing, open data initiatives, things like blockchain, are going to run headlong into data protection, data privacy, customer privacy laws and regulations that have to come down and protect us. Because the industry's not doing it, the government is stepping in, and it's going to re-silo a lot of our data. It's going to make it recede and make it less accessible, making data science harder for a lot of the most meaningful types of activities. Patient data, for example, is already all locked down. We could do so much more with it, but health startups are really constrained about what they can do. 'Cause they can't access the data. 
We can't even access our own health care records, right? So I think that's the challenge: we have to have that battle next to be able to go and take the next step. >> Well, I see, with the growth of data, a lot of it's coming through IoT, the internet of things. I think that's a big source. And we're going to see a lot of innovation. New types of Ubers or Airbnbs. Uber's so 2013 though, right? We're going to see new companies with new ideas, new innovations; they're going to be looking at the ways this data can be leveraged, all this big data. Or data coming in from the IoT can be leveraged. You know, there's some examples out there. There's a company, for example, that is outfitting tools, putting sensors in the tools. Industrial sites can therefore track where the tools are at any given time. This is an expensive, time-consuming process, constantly losing tools, trying to locate tools. Assessing whether the tool's being applied to the production line, or the right tool is at the right torque, and so forth. With the sensors implanted in these tools, it's now possible to be more efficient. And there's going to be innovations like that. Maybe small startup-type things or smaller innovations. We're going to see a lot of new ideas and new types of approaches to handling all this data. There's going to be new business ideas. The next Uber, we may be hearing about it a year from now, whatever that may be. And that Uber is going to be applying data, probably IoT-type data, in some new, innovative way. >> Jennifer, final word. >> Yeah, so I think with data, you know, it's interesting, right? For one thing, I think one of the things that's made data more available, and made people open to the idea, has been startups. But what's interesting about this is a lot of startups have been acquired. And a lot of people at startups that got acquired, now these people work at bigger corporations. 
Which was the way it was maybe 10 years ago: data wasn't available and open, companies kept it very proprietary, you had to sign NDAs. It was within the last 10 years that open source and all of those initiatives became much more popular, much more open, an acceptable sort of way to look at data. I think that what I'm kind of interested in seeing is what people do within the corporate environment. Right, 'cause they have resources. They have funding that startups don't have. And they have backing, right? Presumably, if you're acquired you went in at a higher title in the corporate structure, whereas if you had started there you probably wouldn't be at that title at that point. So I think you have an opportunity where people who have done innovative things and have proven that they can build really cool stuff can now be in that corporate environment. I think part of it's going to be whether or not they can really adjust to sort of the corporate landscape, you know, the politics of it or the bureaucracy. I think every organization has that. Being able to navigate that is a difficult thing, in part 'cause it's a human skill set, it's a people skill, it's a soft skill. It's not the same thing as just being able to code something and sell it. So, you know, it's going to really come down to people. I think if people can figure out, for instance, what people want to buy, what people think, in general that's where the money comes from. You know, you make money 'cause someone gave you money. So if you can find a way to look at data, or even look at technology, and understand what people are doing, aren't doing, what they're happy about, unhappy about, there's always opportunity in collecting the data in that way and being able to leverage that. So you build cooler things, and offer things that haven't been thought of yet. So it's a very interesting time, I think, with the corporate resources available, if you can do that. 
You know, who knows what we'll have in like a year. >> I'll add one. >> Please. >> The majority of companies in the S&P 500 have a market cap that's greater than their revenue. The reason is 'cause they have IP related to data that's of value. But most of those companies, most companies, the vast majority of companies, don't have any way to measure the value of that data. There's no GAAP accounting standard. So they don't understand the value contribution of their data in terms of how it helps them monetize. Not the data itself necessarily, but how it contributes to the monetization of the company. And I think that's a big gap. If you don't understand the value of the data, that means you don't understand how to refine it, if data is the new oil, and how to protect it and so forth and secure it. So that to me is a big gap that needs to get closed before we can actually say we live in a data-driven world. >> So you're saying I've got an asset, I don't know if it's worth this or this. And they're missing that great opportunity. >> So I devolve to what I know best. >> Great discussion. Really, really enjoyed it; the time has flown by. Joe, if you get that augmented reality thing to work on the salary, point it toward that guy, not this guy, okay? (everyone laughs) It's much more impressive if you point it over there. But Joe, thank you, Dion, Joe and Jennifer and Batman. We appreciate it, and Bob Hayes, thanks for being with us. >> Thank you guys. >> Really enjoyed >> Great stuff. >> the conversation. >> And a reminder, coming up at the top of the hour, six o'clock Eastern time, IBMgo.com featuring the live keynote, which is being set up just about 50 feet from us right now. Nate Silver is one of the headliners there, John Thomas as well, or rather Rob Thomas. John Thomas we had on earlier on The Cube. But a panel discussion as well coming up at six o'clock on IBMgo.com, six to 7:15. Be sure to join that live stream. That's it from The Cube. We certainly appreciate the time. 
Glad to have you along here in New York. And until the next time, take care. (bright digital music)

Published Date : Nov 1 2017


John Thomas, IBM | IBM Data Science For All


 

(upbeat music) >> Narrator: Live from New York City, it's the Cube, covering IBM Data Science for All. Brought to you by IBM. >> Welcome back to Data Science for All. It's a whole new game here at IBM's event, a two-day event going on, 6:00 tonight the big keynote presentation on IBM.com, so be sure to join the festivities there. You can watch it live stream, all that's happening. Right now, we're live here on the Cube, along with Dave Vellante, I'm John Walls and we are joined by John Thomas, who is a distinguished engineer and director at IBM. John, thank you for your time, good to see you. >> Same here, John. >> Yeah, pleasure, thanks for being with us here. >> John Thomas: Sure. >> I know, in fact, you just wrote this morning about machine learning, so that's obviously very near and dear to you. Let's talk first off about IBM, >> John Thomas: Sure. >> Not a new concept by any means, but what is new with regard to machine learning in your work? >> Yeah, well, that's a good question, John. Actually, I get that question a lot. Machine learning itself is not new, companies have been doing it for decades, so exactly what is new, right? I actually wrote this in a blog today, this morning. It's really three different things, I call them democratizing machine learning, operationalizing machine learning, and hybrid machine learning, right? And we can talk through each of these if you like. But I would say hybrid machine learning is probably closest to my heart. So let me explain what that is, because it sounds fancy, right? (laughter) >> Right. Just what we need, another hybrid something, right? >> In reality, what it is is: let data gravity decide where your data stays, and let your performance requirements, your SLAs, dictate where your machine learning models go, right? So what do I mean by that? You might have sensitive data, customer data, which you want to keep on a certain platform, right? 
Instead of moving data off that platform to do machine learning, bring machine learning to that platform, whether that be the mainframe or specialized appliances or Hadoop clusters, you name it, right? Bring machine learning to where the data is. Do the training, the building of the model, where that is, but then have complete flexibility in terms of where you deploy that model. As an example, you might choose to build and train your model on premises behind the firewall using very sensitive data, but the model that has been built, you may choose to deploy that into a Cloud environment because you have other applications that need to consume it. That flexibility is what I mean by hybrid. Another example is, especially when you get into more complex machine learning, deep learning domains, you need acceleration, and there is hardware that provides that acceleration, right? For example, GPUs provide acceleration. Well, you need to have the flexibility to train and build the models on hardware that provides that kind of acceleration, but then the model that has been built might go inside of a CICS mainframe transaction for sub-second scoring of a credit card transaction as to whether it's fraudulent or not, right? So there's flexibility off prem, on prem, different platforms; this is what I mean by hybrid. >> What is the technical enabler to allow that to happen? Is it just a modern software architecture, microservices, containers, blah, blah, blah? Explain that in more detail. >> Yeah, that's a good question, and it's a couple different things. One is bringing native machine learning to these platforms themselves. So you need native machine learning on the mainframe, in the Cloud, in a Hadoop cluster environment, in an appliance, right? So you need the runtimes, the libraries, the frameworks running native on those platforms. And that is not easy to do, you know? You've got machine learning running native on z/OS, not even Linux on Z. 
It's native to z/OS on the mainframe. >> At the very primitive level you're talking about. >> Yeah. >> So you get the performance you need. >> You have the runtime environments there, and then what you need is a seamless experience across all of these platforms. You need a way to export models, repositories into which you can save models, the same APIs to save models into a different repository and then consume from them there. So it's a bit of engineering that IBM is doing to enable this, right? Native capabilities on the platforms, the same APIs to talk to repositories and consume from the repositories. >> So the other piece of that architecture is a lot of tooling that's integrated and native. >> John Thomas: Yes. >> And the tooling, as you know, changes, I feel like daily. There's a new tool out there and everybody gloms onto it, so the architecture has to be able to absorb those. What is the enabler there? >> Yeah, so you actually bring up a very good point. There is a new language, a new framework every day, right? I mean, we all know that, in the world of machine learning, Python and R and Scala. Frameworks like Spark and TensorFlow, they're table stakes now, you know? You have to support all of these, scikit-learn, you name it, right? Obviously, you need a way to support all these frameworks on the platforms you want to enable, right? And then you need an environment which lets you work with the tools of your choice. So you need an environment like a workbench which can allow you to work in the language, the framework that you are the most comfortable with. And that's what we are doing with Data Science Experience. I don't know if you have thought of this, but Data Science Experience is an enterprise ML platform, right? Runs in the Cloud, on prem, on x86 machines, you can have it on a (mumbles) box. The idea here is support for a variety of open languages and frameworks, enabled through a collaborative workbench kind of interface. 
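The train-here, deploy-there pattern described above can be sketched with generic open-source pieces. This is an illustrative sketch only: it uses scikit-learn and joblib rather than IBM's actual repository APIs, and the dataset and artifact name are made up for the example.

```python
# Sketch of the hybrid pattern: train a model where the sensitive data
# lives, then ship only the serialized model elsewhere for scoring.
# Illustrative only -- scikit-learn/joblib stand in for any platform's
# native runtime and model repository.
import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# "On-prem" side: train against data that never leaves the platform.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Export the trained model -- the training data is not part of the artifact.
joblib.dump(model, "churn_model.joblib")  # hypothetical artifact name

# "Cloud" side: load the artifact and serve predictions.
deployed = joblib.load("churn_model.joblib")
score = deployed.predict_proba(X[:1])[0][1]  # probability of the positive class
print(round(score, 3))
```

The point of the sketch is that only the serialized model crosses environments; the data stays where gravity holds it, which is the data-gravity half of the hybrid argument. In practice the serialization format and the scoring runtime have to agree across platforms, and that "same APIs everywhere" engineering is exactly the hard part being described.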
>> And the decision to move, whether it's on-prem or in the Cloud, it's a function of many things, but let's talk about those. I mean, data volume is one. You can't just move your business into the Cloud. It's not going to work that well. >> It's a journey, yeah. >> It's too expensive. But then there's others, there's governance edicts and security edicts, not that the security in the Cloud is any worse, it might just be different than what your organization requires, and the Cloud supplier might not support that. It's different Clouds, it's location, etc. When you talked about the data staying on prem, maybe training a model, and then that model moving to the Cloud, so obviously, it's a lighter weight ... It's not as much-- >> Yeah, yeah, yeah, you're not moving the entire data. Right. >> But I have a concern. I wonder if clients ask you about this. Okay, well, it's my data, my data, I'm going to keep behind my firewall. But that data trained that model and I'm really worried that that model is now my IP that's going to seep out into the industry. What do you tell a client? >> Yeah, that's a fair point. Obviously, you still need your security mechanisms, your access control mechanisms, your governance control mechanisms. So you need governance whether you are on the Cloud or on prem. And your encryption mechanisms, your version control mechanisms, your governance mechanisms, all need to be in place, regardless of where you deploy, right? And to your question of how do you decide where the model should go, as I said earlier to John, you know, let data gravity, SLAs, performance, and security requirements dictate where the model should go. >> We're talking so much about concepts, right, and theories that you have. Let's roll up our sleeves and get to the nitty-gritty a little bit here and talk about what are people really doing out there? >> John Thomas: Oh yeah, use cases. >> Yeah, just give us an idea for some of the ... 
Kind of the latest and greatest that you're seeing. >> Lots of very interesting use cases out there. So actually, I'm part of what IBM calls a data science elite team. We go out and engage with customers on very interesting use cases, right? And we see a lot of these hybrid discussions happen as well. On one end of the spectrum is understanding customers better. So I call this reading the customer's mind. So can you understand what is in the customer's mind and have an interaction with the client without asking a bunch of questions, right? Can you look at his historical data, his browsing behavior, his purchasing behavior, and have an offer that he will really love? Can you really understand him and give him a celebrity experience? That's one class of use cases, right? Another class of use cases is around improving operations, improving your own internal processes. One example is fraud detection, right? I mean, that is a hot topic these days. So how do you, as the credit card is swiped, right, it's just a few milliseconds before that travels through a network, kicks back to the mainframe, and a scoring is done as to whether this should be approved or not. Well, you need to have a prediction of how likely this is to be fraudulent or not in the span of the transaction. Here's another one. I don't know if you call help desks now. I sometimes call them "helpless desks." (laughter) >> Try not to. >> Dave: Hell desks. >> Try not to. Helpless desks aside, you know, for pretty much every enterprise that I am talking to, there is a goal to optimize their help desk, their call centers. And call center optimization is good. So as the customer calls in, can you understand the intent of the customer? See, he may start off talking about something, but as the call progresses, the intent might change. Can you understand that? In fact, not just understand, but predict it and intercept with something that the client will love before the conversation takes a bad turn? 
(laughter) >> You must be listening in on my calls. >> Your calls, must be your calls! >> I meander, I go every which way. >> I game the system and just go really mad and go, let me get you an operator. (laughter) Agent, okay. >> You two guys, your data is a special case. >> Dave: Yeah right, this guy's pissed. >> We are red-flagged right off the top. >> We're not even analyzing you. >> Day job, forget about, you know. What about things, you know, because they're moving so far out to the edge, and now with mobile and that explosion there, and sensor data being what it is, and all this is tremendous growth. Tough to manage. >> Dave: It is, it really is. >> I guess, maybe tougher to make sense of it, so how are you helping people make sense of this so they can really filter through and find the data that matters? >> Yeah, there's a lot of things rolled up into that question, right? One is just managing those devices, those endpoints, in multiple thousands, tens of thousands, millions of these devices. How would you manage them? Then, are you doing the processing of the data and applying ML and DL right at the edge, or are you bringing the data back behind the firewall or into the Cloud and then processing it there? If you are doing image recognition in a car, in a self-driving car, can you afford the latency of shipping an image of a pedestrian jumping in front across the Cloud for a deep-learning network to process it and give you an answer - oh, that's a pedestrian? You know, you may not have that latency. So you may want to do some processing on the edge, so that is another interesting discussion, right? And you need acceleration there as well. Another aspect now is, as you said, separating the signal from the noise, you know. It really comes down to the different industries that we go into: what are the signals that we understand now? Can we build on them and can we re-use them? That is an interesting discussion as well. 
But, yeah, you're right. With the world of exploding data that we are in, with all these devices, it's very important to have a systematic approach to managing your data, cataloging it, understanding where to apply ML, where to apply exploration, governance. All of these things become important. >> I want to ask you about, come back to the use cases for a moment. You talk about celebrity experiences, I put that in sort of a marketing category. Fraud detection's always been one of the favorite big data use cases, help desks, recommendation engines and so forth. Let's start with the fraud detection. First of all, fraud detection in the last six, seven years has been getting immensely better, no question. And it's great. However, the number of false positives, about a year ago, was too many. We're a small company but we buy a lot of equipment and lights and cameras and stuff. The number of false positives that I personally got was overwhelming. >> Yeah. >> They've gone down dramatically. >> Yeah. >> In the last 12 months. Is that just a coincidence, happenstance, or is it getting better? >> No, it's not that the bad guys have gone down in number. It's not that at all, no. (laughter) >> Well, that, I know. >> No, I think there is a lot of sophistication in terms of the algorithms that are available now. In terms of... if you have tens of thousands of features that you're looking at, how do you collapse that space and how do you do that efficiently, right? There are techniques that are evolving in terms of handling that kind of information. In terms of the actual algorithms, there are different types of innovations happening in that space. But I think, perhaps, the most important one is that things that used to take weeks or days to train and test can now be done in days or minutes, right?
The acceleration that comes from GPUs, for example, allows you to test out different algorithms, different models and say, okay, well, this performs well enough for me to roll it out and try this out, right? It gives you a very quick cycle of innovation. >> The time to value is really compressed. Okay, now let's take one that's not so good. Ad recommendations, the Google ads that pop up. One in a hundred are maybe relevant, if that, right? And they pop up on the screen and they're annoying. I worry that Siri's listening somehow. I talk to my wife about Israel and then next thing I know, I'm getting ads for going to Israel. Is that a coincidence or are they listening? What's happening there? >> I don't know about what Google's doing. I can't comment on that. (laughter) I don't want to comment on that. >> Maybe just from a technology perspective. >> From a technology perspective, this notion of understanding what is in the customer's mind and really getting to a customer segment of one, this is a top interest for many, many organizations. Regardless of which industry you are in, insurance or banking or retail, doesn't matter, right? And it all comes down to the fundamental principles of how efficiently you can do this. Now, can you identify the features that have the most predictive power? There is a level of sophistication in terms of the feature engineering, in terms of collapsing that space of features that I had talked about, and then, how do I actually apply the latest science to this? How do I do the exploratory analysis? How do I actually build and test my machine learning models quickly? Do the tools allow me to be very productive about this? Or do I spend weeks and weeks coding in lower-level formats? Or do I get help, do I get guided interfaces, which guide me through the process, right? And then, the topic of exploration we talk about, right? These things come together and then couple that with cognitive APIs.
For example, speech to text, the word (mumbles) have gone down dramatically now. So as you talk on the phone, with a very high accuracy, we can understand what is being talked about. Image recognition, the accuracy has gone up dramatically. You can create custom classifiers for industry-specific topics that you want to identify in pictures. Natural language processing, natural language understanding, all of these have evolved in the last few years. And all these come together. So machine learning's not an island. All these things coming together is what makes these dramatic advancements possible. >> Well, John, if you've figured out anything over the past 20 minutes or so, it's that Dave and I want ads delivered that matter and we want our help desk questions answered right away. (laughter) So if you can help us with that, you're welcome back on theCUBE anytime, okay? >> We will try, John. >> That's all we want, that's all we ask. >> You guys, your calls are still being screened. (laughter) >> John Thomas, thank you for joining us, we appreciate that. >> Thank you. >> Our panel discussion coming up at 4:00 Eastern time. Live here on theCUBE, we're in New York City. Be back in a bit. (upbeat music)
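Two of the ideas in this segment, collapsing a space of many features and quickly comparing candidate models, can be sketched together. This assumes scikit-learn is installed and uses a synthetic dataset as a stand-in; it illustrates the workflow discussed, not the specific tooling from the interview.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: 100 features, only 15 of them informative.
X, y = make_classification(n_samples=1000, n_features=100,
                           n_informative=15, random_state=0)

# Collapse the feature space (100 features -> 10 components).
X_small = PCA(n_components=10, random_state=0).fit_transform(X)

# Quickly compare candidate models on the reduced data.
results = {}
for model in (LogisticRegression(max_iter=1000),
              RandomForestClassifier(n_estimators=50, random_state=0)):
    name = type(model).__name__
    results[name] = cross_val_score(model, X_small, y, cv=3).mean()
    print(name, round(results[name], 3))
```

The quick train-and-compare loop is the "quick cycle of innovation" point: pick whichever candidate performs well enough, roll it out, and iterate.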

Published Date : Nov 1 2017

Tricia Wang, Sudden Compass | IBM Data Science For All


 

>> Narrator: Live from New York City, it's theCUBE covering IBM Data Science For All brought to you by IBM. >> Welcome back here on theCUBE. We are live in New York continuing our coverage here for Data Science for All where all things happen. Big things are happening. In fact, there's a huge event tonight I'm going to tell you about a little bit later on, but Tricia Wang who is our next guest is a part of that panel discussion that you'll want to tune in for live on ibmgo.com. 6 o'clock, but more on that a little bit later on. Along with Dave Vellante, John Walls here, and Tricia Wang now joins us. A first ever for us. How are you doing? >> Good. >> A global tech ethnographer. >> You said it correctly, yay! >> I learned a long time ago when you're not sure slow down. >> A plus already. >> Slow down and breathe. >> Slow down. >> You did a good job. Want to do it one more time? >> A global tech ethnographer. >> Tricia: Good job. >> Studying ethnography and putting ethnography into practice. How about that? >> Really great. >> That's taking on the challenge stretch. >> Now say it 10 times faster in a row. >> How about when we're done? Also co-founder of Sudden Compass. So first off, let's tell our viewers a little bit about Sudden Compass. Then I want to get into the ethnography and how that relates to tech. So let's go first off about Sudden Compass and the origins there. >> So Sudden Compass, we're a consulting firm based in New York City, and we help our partners embrace and understand the complexity of their customers. So whenever there are, wherever there's data and wherever there's people, we are there to help them make sure that they can understand their customers at the end of the day. And customers are really the most unpredictable, the most unknown, and the most difficult to quantify thing for any business. 
We see a lot of our partners really investing in big data data science tools and they're hiring the most amazing data scientists, but we saw them still struggling to make the right decisions, they still weren't getting their ROI, and they certainly weren't growing their customer base. And what we are helping them do is to say, "Look, you can't just rely only on data science. "You can't put it all into only the tool. "You have to think about how to operationalize that "and build a culture around it "and get the right skillsets in place, "and incorporate what we call the thick data, "which is the stuff that's very difficult to quantify, "the unknown, "and then you can figure out "how to best mathematically scale your data models "when it's actually based on real human behavior, "which is what the practice of ethnography is there to help "is to help you understand what do humans actually do, "what is unquantifiable. "And then once you find out those unquantifiable bits "you then have the art and science of figuring out "how do you scale it into a data model." >> Yeah, see that's what I find fascinating about this is that you've got hard and fast, right, data, objective, black and white, very clear, and then you've got people, you know? We all react differently. We have different influences, and different biases, and prejudices, and all that stuff, aptitudes. So you are meshing this art and science. >> Tricia: Absolutely. >> And what is that telling you then about how best to your clients and how to use data (mumbles)? >> Well, we tell our clients that because people are, there are biases, and people are not objective and there's emotions, that all ends up in the data set. 
To think that your data set, your quantitative data set, is free of biases and has somehow been scrubbed of emotion is a total fallacy, and it's something that needs to be corrected, because that means decision makers are making decisions based off of numbers thinking that they're objective when in fact they contain all the biases of the very complexity of the humans that they're serving. So, there is an art and science of making sure that when you capture that complexity ... We're saying, "Don't scrub it away." Traditional marketing wants to say, "Put your customers in boxes. "Put them in segments. "Use demographic variables like education, income. "Then you can just put everyone in a box, "figure out where you want to target, "figure out the right channels, "and you buy against that and you reach them." That's not how it works anymore. Customers now are moving faster than corporations. The new networked customer of today has multiple identities and is better understood in relationship to other people. And we're not saying get rid of the data science. We're saying absolutely have it. You need to have scale. What is thick data going to offer you? Not scale, but it will offer you depth. So, that's why you need to combine both to be able to make effective decisions. >> So, I presume you work with a lot of big consumer brands. Is that a safe assumption? >> Absolutely. >> Okay. So, we work with a lot of big tech brands, like IBM and others, and they tend to move at the speed of the CIO, which tends to be really slow and really risk averse, and they're afraid to over-rotate and get out over their skis. What do you tell folks like that? Is that a mistake being so cautious in this digital age? >> Well, I think the new CIO is on the cutting edge. I was just at Constellation Research Annual Conference in Half Moon Bay at-- >> Our friend Ray Wang. >> Yeah, Ray Wang.
And I just spoke about this at their Constellation Connected Enterprise where they had the most, I would have to say the most amazing forward-thinking collection of CIOs, CTOs, CDOs all in one room. And the conversation there was like, "We cannot afford to be slow anymore. "We have to be on the edge "of helping our companies break new ground." So, CIOs and CTOs need to ensure that their teams are diverse, multi-functional, and that they're totally integrated and embedded into the business. And I don't mean just involve a business analyst as if that's cutting edge. I'm saying, "No, you need to make sure that every team "has qualitative people, "and that they're embedded and working closely together." The problem is we don't teach these skills. We're not graduating data scientists or ethnographers who even want to talk to each other. In fact, each side thinks the other side is useless. We're saying, "No, "we need to be able to have these skills "being taught within companies." And you don't need to hire a PhD data scientist or a PhD ethnographer. What we're saying is that these skills can be taught. We need to teach people to be data literate. You've hired the right experts, you have bought the right tools, but we now need to make sure that we're creating data literacy among decision makers so that we can turn these data into insights and then into action.
And when I say data I'm talking about it needs the human component and it needs the numbers. And so one of the things that I saw, this is really close to my heart, was when I was at Nokia, and I remember I spent a decade understanding China. I really understood China. And when I finally had the insight where I was like, "Look, after spending 10 years there, "following 100 to 200 families around, "I had the insight back in 2009 that look, "your company is about to go out of business because "people don't want to buy your feature phones anymore. "They're going to want to buy smartphones." But, I only had qualitative data, and I needed to work alongside the business analysts and the data scientists. I needed access to their data sets, but I needed us to play together and to be on a team together so that I could scale my insights into quantitative models. And the problem was that, your question is, "What does that look like?" That looks like sitting on a team, having a mandate to say, "You have to play together, "and be able to tell an effective story "to the management and to leadership." But back then they were saying, "No, "we don't even consider your data set "to be worthwhile to even look at." >> We love our candy bar phone, right? It's a killer. >> Tricia: And we love our numbers. We love our surveys that tell us-- >> Market share was great. >> Market share is great. We've done all of the analysis. >> Forget the razor. >> Exactly. I'm like, "Look, of course your market share was great, "because your surveys were optimized "for your existing business model." So, big data is great if you want to optimize your supply chain or in systems that are very contained and quantifiable that's more or less fine. You can get optimization. You can get that one to two to five percent. But if you really want to grow your company and you want to ensure its longevity, you cannot just rely on your quantitative data to tell you how to do that. 
You actually need thick data for discovery, because you need to find the unknown. >> One of the things you talk about your passion is to understand how human perspectives shape the technology we build and how we use it. >> Tricia: Yes, you're speaking my language. >> Okay, so when you think about the development of the iPhone, it wasn't a bunch of surveys that led Steve Jobs to develop the iPhone. I guess the question is does technology lead and shape human perspectives or do human perspectives shape technology? >> Well, it's a dialectical relationship. It's like does a hamburger ... Does a bun shape the burger or does the bun shape the burger? You would never think of asking someone who loves a hamburger that question, because they both shape each other. >> Okay. (laughing) >> So, it's symbiote here, totally symbiotic. >> Surprise answer. You weren't expecting that. >> No, but it is kind of ... Okay, so you're saying it's not a chicken and egg, it's both. >> Absolutely. And the best companies are attuned to both. The best companies know that. The most powerful companies of the 21st century are obsessed with their customers and they're going to do a great job at leveraging human models to be scaled into data models, and that gap is going to be very, very narrow. You get big data. We're going to see more AI or ML disasters when their data models are really far from their actual human models. That's how we get disasters like Tesco or Target, or even when Google misidentified black people as gorillas. It's because their model of their data was so far from the understanding of humans. And the best companies of the future are going to know how to close that gap, and that means they will have the thick data and big data closely integrated. >> Who's doing that today? It seems like there are no ethics in AI. People are aggressively AI for profit and not really thinking about the human impacts and the societal impacts. >> Let's look at IBM. They're doing it. 
I would say that some of the most innovative projects happening at IBM with Watson are ones where people are using AI to solve meaningful social problems. I don't think that has to be-- >> Like IBM For Social Good. >> Exactly, but it's also, it's not just experimental. I think IBM is doing really great stuff using Watson to understand, to identify skin cancer, or looking at the ways that people are using AI to understand eye diseases, things that you can do at scale. But businesses are also figuring out how to use AI for actually doing better things. I think some of the most interesting ... We're going to see more examples of people using AI for solving meaningful social problems and making a profit at the same time. I think one really great example is WorkIt. They're using AI. They're actually working with Watson. Watson is who they hired to create their engine, where union workers can ask questions of Watson that they may not want to ask or may be too costly to ask. So you can be like, "If I want to take one day off, "will this affect my contract or my job?" That's a very meaningful social problem that unions are now working on, and I think that's a really great example of how Watson is really pushing the edge to solve meaningful social problems at the same time. >> I worry sometimes that that's like the little device that you put in your car for the insurance company to see how you drive. >> How do you brake? How do you drive? >> Do people trust feeding that data to Watson because they're afraid Big Brother is watching? >> That's why we always have to have human intelligence working with machine intelligence. This idea of AI versus humans is a false binary, and I don't even know why we're engaging in those kinds of questions. We're not, clearly, but there are people who are talking about it as if it's one or the other, and I find it to be a total waste of time.
It's like clearly the best AI systems will be integrated with human intelligence, and we need the human training the data with machine learning systems. >> Alright, I'll play the yeah but. >> You're going to play the what? >> Yeah but! >> Yeah but! (crosstalk) >> That machines are replacing humans in cognitive functions. You walk into an airport and there are kiosks. People are losing jobs. >> Right, no that's real. >> So okay, so that's real. >> That is real. >> You agree with that. >> Job loss is real and job replacement is real. >> And I presume you agree that education is at least a part the answer, and training people differently than-- >> Tricia: Absolutely. >> Just straight reading, writing, and arithmetic, but thoughts on that. >> Well what I mean is that, yes, AI is replacing jobs, but the fact that we're treating AI as some kind of rogue machine that is operating on its own without human guidance, that's not happening, and that's not happening right now, and that's not happening in application. And what is more meaningful to talk about is how do we make sure that humans are more involved with the machines, that we always have a human in the loop, and that they're always making sure that they're training in a way where it's bringing up these ethical questions that are very important that you just raised. >> Right, well, and of course a lot of AI people would say is about prediction and then automation. So think about some of the brands that you serve, consult with, don't they want the machines to make certain decisions for them so that they can affect an outcome? >> I think that people want machines to surface things that is very difficult for humans to do. So if a machine can efficiently surface here is a pattern that's going on then that is very helpful. I think we have companies that are saying, "We can automate your decisions," but when you actually look at what they can automate it's in very contained, quantifiable systems. 
It's around systems around their supply chain or logistics. But, you really do not want your machine automating any decision when it really affects people, in particular your customers. >> Okay, so maybe changing the air pressure somewhere on a widget that's fine, but not-- >> Right, but you still need someone checking that, because will that air pressure create some unintended consequences later on? There's always some kind of human oversight. >> So I was looking at your website, and I always look for, I'm intrigued by interesting, curious thoughts. >> Tricia: Okay, I have a crazy website. >> No, it's very good, but back in your favorite quotes, "Rather have a question I can't answer "than an answer I can't question." So, how do you bring that kind of there's no fear of failure to the boardroom, to people who have to make big leaps and big decisions and enter this digital transformative world? >> I think that a lot of companies are so fearful of what's going to happen next, and that fear can oftentimes corner them into asking small questions and acting small where they're just asking how do we optimize something? That's really essentially what they're asking. "How do we optimize X? "How do we optimize this business?" What they're not really asking are the hard questions, the right questions, the discovery level questions that are very difficult to answer that no big data set can answer. And those are questions ... The questions about the unknown are the most difficult, but that's where you're going to get growth, because when something is unknown that means you have not either quantified it yet or you haven't found the relationship yet in your data set, and that's your competitive advantage. And that's where the boardroom really needs to set the mandate to say, "Look, I don't want you guys only answering "downstream, company-centric questions like, "'How do we optimize XYZ?"'" which is still important to answer. 
We're saying you absolutely need to pay attention to that, but you also need to ask upstream very customer-centric questions. And that's very difficult, because all day you're operating inside a company . You have to then step outside of your shoes and leave the building and see the world from a customer's perspective or from even a non existing customer's perspective, which is even more difficult. >> The whole know your customer meme has taken off in a big way right now, but I do feel like the pendulum is swinging. Well, I'm sanguined toward AI. It seems to me that ... It used to be that brands had all the power. They had all the knowledge, they knew the pricing, and the consumers knew nothing. The Internet changed all that. I feel like digital transformation and all this AI is an attempt to create that asymmetry again back in favor of the brand. I see people getting very aggressive toward, certainly you see this with Amazon, Amazon I think knows more about me than I know about myself. Should we be concerned about that and who protects the consumer, or is just maybe the benefits outweigh the risks there? >> I think that's such an important question you're asking and it's totally important. A really great TED talk just went up by Zeynep Tufekci where she talks about the most brilliant data scientists, the most brilliant minds of our day, are working on ad tech platforms that are now being created to essentially do what Kenyatta Jeez calls advertising terrorism, which is that all of this data is being collected so that advertisers have this information about us that could be used to create the future forms of surveillance. And that's why we need organizations to ask the kind of questions that you did. So two organizations that I think are doing a really great job to look at are Data & Society. Founder is Danah Boyd. Based in New York City. This is where I'm an affiliate. 
And they have all these programs that really look at digital privacy, identity, ramifications of all these things we're looking at with AI systems. Really great set of researchers. And then Vint Cerf (mumbles) co-founded People-Centered Internet. And I think this is another organization that we really should be looking at, it's based on the West Coast, where they're also asking similar questions of like instead of just looking at the Internet as a one-to-one model, what is the Internet doing for communities, and how do we make sure we leverage the role of communities to protect what the original founders of the Internet created? >> Right, Danah Boyd, CUBE alum. Shout out to Jeff Hammerbacher, founder of Cloudera, the originator of the greatest minds of my generation are trying to get people to click on ads. Quit Cloudera and now is working at Mount Sinai as an MD, amazing, trying to solve cancer. >> John: A lot of CUBE alums out there. >> Yeah. >> And now we have another one. >> Woo-hoo! >> Tricia, thank you for being with us. >> You're welcome. >> Fascinating stuff. >> Thanks for being on. >> It really is. >> Great questions. >> Nice to really just change the lens a little bit, look through it a different way. Tricia, by the way, part of a panel tonight with Michael Li and Nir Kaldero who we had earlier on theCUBE, 6 o'clock to 7:15 live on ibmgo.com. Nate Silver also joining the conversation, so be sure to tune in for that live tonight 6 o'clock. Back with more of theCUBE though right after this. (techno music)

Published Date : Nov 1 2017

Nir Kaldero, Galvanize | IBM Data Science For All


 

>> Announcer: Live from New York City, it's theCUBE, covering IBM Data Science For All. Brought to you by IBM. >> Welcome back to Data Science for All. This is IBM's event here on the west side of Manhattan, here on theCUBE. We're live, we'll be here all day, along with Dave Vellante, I'm John Walls. Poor Dave had to put up with all that howling music at this hotel last night, kept him up 'til all hours. >> Lots of fun here in the city. >> Yeah, yeah. >> All the crazies out last night. >> Yeah, but the headphones, they worked for ya. Glad to hear that. >> People are already dressed for Halloween, you know what I mean? >> John: Yes. >> In New York, you know what I mean? >> John: All year. >> All the time. >> John: All year. >> 365. >> Yeah. We have with us now the head of data science and VP at Galvanize, Nir Kaldero, and Nir, good to see you, sir. Thanks for being with us. We appreciate the time. >> Well of course, my pleasure. >> Tell us about Galvanize. I know you're heavily involved in education in terms of the tech community, but you've got corporate clients, you've got academic clients. You cover the waterfront, and I know data science is your baby. >> Nir: Right. >> But tell us a little bit about Galvanize and your mission there. >> Sure, so Galvanize is the learning community for technology. We provide training in data science, data engineering, and also modern software engineering. We recently built a very large, fast-growing enterprise corporate training department, where we basically help companies become digital, become nimble, and also very data driven, so they can actually go through this digital transformation and survive in this fourth industrial revolution. We do it across all layers of the business, from the executives, to managers, to data scientists and data analysts, and kind of transform and upskill all current skills to be modern, to be digital, so companies can actually go through this transformation.
>> Hit on one of those items you talked about, data driven. >> Nir: Right. >> It seems like a no-brainer, right? The more information you give me, the more analysis I can apply to it, the more I can put it into my business practice, the more money I make, the more my customers are happy. It's a layup, right? >> Nir: It is. >> What is a data driven organization, then? Do you have to convince people that this is where they need to be today? >> Sometimes I need to convince them, but (laughs) anyway, let's back up a little bit. We are in the midst of the fourth industrial revolution, and in order to survive in this fourth industrial revolution, companies need to become nimble, as I said, become agile, but most importantly become data driven, so the organization can actually best respond to all the predictions that are coming from these very sophisticated machine intelligence models. If the organization can immediately best respond to all of that, companies will be able to enhance the user experience, get insight about their customers, enhance performance, et cetera, and we know that the winners in this revolution, in this era, will be companies that are very digital, that master the skills of becoming a data driven organization, and you know, we can talk more about the transformation and what it consists of. Do you want me to? >> John: Sure. >> Can I just ask you a question? This fourth wave, this is what, the cognitive machine wave? Or how would you describe it? >> Some people call it artificial intelligence. I think artificial intelligence is like big data, kind of a buzzword. I think more appropriately, we should call it the machine intelligence industrial revolution. >> Okay. I've got a lot of questions, but carry on. >> So hitting on that, you see that as being a major era. >> Nir: It's a game changer. >> If you will, not just a chapter, but a major game changer. >> Nir: Yup. >> Why so? >> So, okay, I'll jump in again.
Machines have always replaced man, people. >> John: The automation, right. >> Nir: To some extent. >> But certain machines have replaced certain human tasks, let's say that. >> Nir: Correct. >> But for the first time in history, in this fourth era, machines are replacing humans in cognitive tasks, and that scares a lot of people, because if you look at the United States, the median income of the U.S. worker has dropped since 1999, from $55,000 to $52,000, and a lot of people believe it's sort of the hollowing out of that factor that we just mentioned. Education, many believe, is the answer. You know, Galvanize is an organization that plays a critical role in helping deal with that problem, does it not? >> So, as Mark Zuckerberg says, there is a lot of love-hate relationship with A.I. People love it on one side, because they're excited about all the opportunities that can come from this utilization of machine intelligence, but many people actually are afraid of it. I read a survey a few weeks ago that says that 36% of the population thinks that A.I. will destroy humanity and will conquer the world. That's a fact; that's what people think. Do I think it's going to happen? I don't think so. I highly believe that education is one of the pillars that can address this fear of machine intelligence, and you spoke a lot about jobs, I could talk about it forever, but my belief is that machines can actually replace some of our responsibilities, right? Not necessarily take over and replace the entire job. Let's talk about lawyers, right? Lawyers currently spend between 40% to 60% of their time writing contracts or looking at previous cases. The machine can write a contract in two minutes, or look up millions of data points of previous cases in zero time. Why does a lawyer today need to spend 40% to 60% of the time on that? >> Billable hours, that's why. >> It is, so I don't think the machine will replace the job of the lawyer.
I think in the future, the machine replaces some of the responsibilities, like auditing, or writing contracts, or looking at previous cases. >> Menial labor, if you will. >> Yes, but you know, for example, the machine is not that great right now with negotiation skills. So maybe in the future, the job of the lawyer will be mostly around negotiation skills, rather than writing contracts, et cetera, but yeah, you're absolutely right. There is a big fear in the market right now among executives, among people in the public. I think we should educate people about what the true implications of machine intelligence are in this fourth industrial revolution and era, and education is definitely one of those. >> Well, one of my favorite stories, when people bring up this topic, is when Garry Kasparov lost to the IBM supercomputer, Deep Blue. >> Nir: Yup. >> Instead of giving up, he started a competition, where he proved that humans and machines could beat the IBM supercomputer. So to this day there's a competition where the best chess player in the world is a combination of humans and machines, and so it's that creativity. >> Nir: Imagination. >> Imagination, right, the combinatorial effects of different technologies, which education, hopefully, can help foster either way.
We can speak about Jeopardy and Watson, and we can speak about AlphaGo, from Google's DeepMind, which kind of outperformed the world champion. These are very specific tasks, right? Again, like the lawyer, the machines can write beautiful contracts with NLP, machines can look at millions and trillions of data points and figure out what the conclusion is, right? Or summarize text very fast, but they're not necessarily good at negotiation yet. >> So when you think about a digital business, to us a digital business is a business that uses data to differentiate, and serve customers, and maintain customers. So when you talk about data driven, it strikes me that when everybody's saying digital business, digital transformation, it's about a data transformation, how well they utilize data, and if you look at the bell curve of organizations, most are not. Everybody wants to be data driven, many say they are data driven. >> Right. >> Dave: Would you agree most are not? >> I will agree that most companies say that they are data driven, but actually they're not. I work with a lot of Fortune 500 companies on a daily basis. I meet their executives and functional leaders, and actually see their data and the business problems that they have. Most of them do tend to say that they are data driven, but truly, just ask them if they put data and decisions in the same place every time they have to make a decision; they don't do it. It's a habit that they don't yet have. Companies need to start investing in building what we call a healthy data culture in order to enable and become data driven. Part of it is democratization of data, right? Currently what I see is lots of organizations actually open the data just to the analysts, or the marketers, people who need to make decisions with data, but not throughout the entire organization. I know I always say that everyone in the organization makes decisions on a daily basis, from the barista to the CEO, right?
And the whole point of becoming data driven is that data can actually help us make better decisions on a daily basis, so how about democratizing the data to everyone? So everyone, from the barista to the CEO, can actually make better decisions on a daily basis, and companies don't excel yet in doing it. Not every company is as digital as Amazon. Amazon, I think, is actually one of the most digital companies in the world, if you look at the digital index. Not everyone is Google or Facebook. Most companies want to be there; most companies understand that they will not be able to survive in this era if they do not become data driven, so it's a big problem. We try at Galvanize to address this problem, from executive-type education, where we actually meet with the C-level executives in companies and actually guide them through how to write their data strategy, how to think about prioritizing data investment, to actual implementation of that, and so far we are highly successful. We were able to make a big transformation in very large, important organizations. So I'm actually very proud of it. >> How long are these eras? Is it a century, or more? >> This fourth industrial? >> Yeah. >> Well it's hard to predict that, and I'm not a machine, or Watson. (laughs) >> But certainly more than 50 years, would you say? Or maybe not, I don't know. >> I actually don't think so. I think it's going to be fast, and we're going to move to the next one pretty soon, one that will be even more, with more intelligence, with more data. >> So the reason I ask is there was an article I saw and linked, and I haven't had time to read it, but it talked about the Four Horsemen, Amazon, Google, Facebook, and Apple, and it said they will all be out of business in 50 years.
Now, I don't know, I think Apple probably has 50 years of cash flow in the bank, but then the author said, if I had to predict one that would survive, it would be Amazon, to your point, because they are so data driven. The premise, again I didn't read the whole thing, was that some new data driven digital upstart will disrupt them. >> Yeah, and you know, companies like Amazon, and Alibaba lately, which is kind of in a competition with Amazon about who is becoming more data driven, utilizing more machine intelligence, are the ones that invested in these capabilities many, many years ago. It's not that they started investing in it last year, or five years ago. We're speaking about 15, 20 years ago. So companies that were really pioneers and invested very early on are actually predicted to survive in the future, and you know, that's very much aligned. >> Yeah, I'm going to touch on something. It might be a bridge too far, I don't know, but you talk about, Dave brought it up, replacing human capital, right? Because of artificial intelligence. >> Nir: Yup. >> Is there a reluctance, perhaps, on behalf of executives to embrace that, because they are concerned about their own place? >> Nir: You should be in the room with me. (laughing) >> You provide data, but you also provide that capability to analyze and make the best informed decision, and therefore eliminate the human element of a C-suite executive, so that maybe they're not as necessary today, or tomorrow, as they were two years ago. >> So it is absolutely true, and there is a lot of fear in the room; especially when I show them robots, they freak out typically, (John and Dave laugh) but the fact is well known. Leaders who will not embrace these skills and this understanding, and will not help the organization become agile, nimble, and data driven, will not survive. They will be replaced. So on the one hand, they're afraid of it.
On the other side, they see that if they will not actually do something and take action today, they might be replaced in the future. >> Where should organizations start? Hey, I want to be data driven. Where do I start? >> That's a good question. So data science, machine learning, is a top-down initiative. It requires a lot of funding. It requires a change in culture and habits. So it has to start from the top. The journey has to start with the executives, with educating executives about what data science is, what machine learning is, how to prioritize investments in this field, how to build a data driven culture, right? When we speak about data driven, we mainly speak about the culture aspect here, not specifically about the technical side of it. So it has to come from the top; leaders have to incorporate it in the organization, they have to give authority and power to people, they have to put the funding in first, and then, and this is how it's beautiful, you actually see it trickle down through the organization when they have a very powerful CEO who makes a decision and moves the organization quickly to become data driven, makes executives look at data every time they make a decision, gets them into the habit. When people look up to executives, they try to do the same, and if my boss is an example for me, someone who is looking at data every time he is making a decision, asking the right questions, knowing how to prioritize, setting the right goals for me, this helps me, and helps the organization perform better. >> Follow the leader, right? >> Yup. >> Follow the leader. >> Yup, follow the leader. >> Thanks for being with us. >> Nir: Of course, it's my pleasure. >> We've pinned this interesting love-hate thing that we have going on. >> We should address that. >> Right, right. That's the next segment, how about that? >> Nir Kaldero from Galvanize joining us here live on The Cube. Back with more from New York in just a bit.

Published Date : Nov 1 2017

Vikram Murali, IBM | IBM Data Science For All


 

>> Narrator: Live from New York City, it's theCUBE. Covering IBM Data Science For All. Brought to you by IBM. >> Welcome back to New York here on theCUBE. Along with Dave Vellante, I'm John Walls. We're at Data Science For All, IBM's two-day event, and we'll be here all day long, wrapping up with that panel discussion from four to five Eastern Time, so be sure to stick around all day here on theCUBE. Joining us now is Vikram Murali, who is a program director at IBM. Vikram, thanks for joining us here on theCUBE. Good to see you. >> Good to see you too. Thanks for having me. >> You bet. So, among your primary responsibilities, the Data Science Experience. So first off, if you would, share with our viewers a little bit about that. You know, the primary mission. You've had two fairly significant announcements, updates if you will, here over the past month or so, so share some information about that too if you would. >> Sure, so my team, we build the Data Science Experience, and our goal is to enable data scientists, in their path, to gain insights into data using data science techniques, machine learning, and especially the latest and greatest open source, and to be able to collaborate with fellow data scientists, with data engineers, business analysts, and it's all about freedom. Giving freedom to data scientists to pick the tool of their choice, and to program and code in the language of their choice. So that's been the mission of Data Science Experience since we started this. The two releases that you mentioned, that we had in the last 45 days: there was one in September, and then there was one on October 30th. Both of these releases are very significant, in the machine learning space especially. We now support the Scikit-Learn, XGBoost, and TensorFlow libraries in Data Science Experience. We have deep integration with Hortonworks Data Platform, which is a hallmark of our partnership with Hortonworks.
Something that we announced back in the summer, and this last release of Data Science Experience, two days back, specifically can do authentication with Kerberos with Hadoop. So now our Hadoop customers, our Hortonworks Data Platform customers, can leverage all the goodies that we have in Data Science Experience. It's more deeply integrated with our Hadoop-based environments. >> A lot of people ask me, "Okay, when IBM announces a product like Data Science Experience... You know, IBM has a lot of products in its portfolio. Are they just sort of cobbling things together? You know? Taking older products and putting a skin on them? Or are they developing them from scratch?" How can you help us understand that? >> That's a great question, and I hear that a lot from our customers as well. Data Science Experience started off with a design-first methodology. And what I mean by that is we are using IBM Design to lead the charge here, along with product and development. And we are actually talking to customers, to data scientists, to data engineers, to enterprises, and we are trying to find out what problems they have in data science today and how we can best address them. So it's not about taking older products and just re-skinning them; Data Science Experience, for example, started off as a brand new product: a completely new slate with completely new code. Now, IBM has done data science and machine learning for a very long time. We have a lot of assets, like SPSS Modeler and Stats, and Decision Optimization. And we are re-investing in those products, and we are investing in such a way, and doing product research in such a way, not to make the old fit with the new, but in a way where it fits into the realm of collaboration. How can data scientists leverage our existing products with open source, and how can we do collaboration? So it's not just re-skinning, but building from the ground up.
>> So this is really important, because you say architecturally it's built from the ground up. Because, you know, given enough time and enough money, you know, smart people, you can make anything work. So the reason why this is important is you mentioned, for instance, TensorFlow. You know that down the road there's going to be some other tooling, some other open source project that's going to take hold, and your customers are going to say, "I want that." You've got to then integrate that, or you have to choose whether or not to. If it's a super heavy lift, you might not be able to do it, or do it in time to hit the market, unless you architected your system to be able to accommodate that. Future proof is the term everybody uses, so how have you done that? I'm sure APIs are involved, but maybe you could add some color. >> Sure. So our Data Science Experience and machine learning... It is a microservices-based architecture, so we are completely dockerized, and we use Kubernetes under the covers for container orchestration. And all of these are tools that are used in The Valley, across different companies, and also in products across IBM as well. So some of these legacy products that you mentioned, we are actually using some of these newer methodologies to re-architect them, and we are dockerizing them, and the microservice architecture actually helps us address issues that we have today, as well as be open to development and taking newer methodologies and frameworks into consideration that may not exist today. So with the microservices architecture, for example, TensorFlow is something that you brought in. We can just spin up a Docker container just for TensorFlow and attach it to our existing Data Science Experience, and it just works.
Same thing with other frameworks like XGBoost, and Keras, and Scikit-Learn; all of these are frameworks and libraries that have come up in open source within the last, I would say, one-, two-, three-year timeframe. Previously, integrating them into our product would have been a nightmare. We would have had to re-architect our product every time something new came along, but now with the microservice architecture it is very easy for us to keep up with those. >> We were just talking to Daniel Hernandez a little bit about the Hortonworks relationship at a high level. One of the things that I've... I mean, I've been following Hortonworks since day one, when Yahoo kind of spun them out, and I know those guys pretty well. And they always make a big deal out of the fact that when they do partnerships, it's deep engineering integration. And so they're very proud of that, so I want to test that a little bit. Can you share with our audience the kind of integrations you've done? What you've brought to the table? What Hortonworks brought to the table? >> Yes, so Data Science Experience today can work side by side with Hortonworks Data Platform, HDP. And we could have actually made that work about two, three months back, but as part of our partnership that was announced back in June, we set up joint engineering teams. We have multiple touch points every day. We call it co-development, and they have put resources in, we have put resources in, and today, especially with the release that came out on October 30th, Data Science Experience can authenticate using Kerberos, which I previously mentioned, and that was a direct example of our partnership with Hortonworks. So that is phase one. Phase two and phase three are going to be deeper integration, so we are planning on making Data Science Experience an Ambari management pack. And so for a Hortonworks customer, if you have HDP already installed, you don't have to install DSX separately. It's going to be a management pack. You just spin it up.
And the third phase is going to be... We're going to be using YARN for resource management. YARN is very good at resource management. And for infrastructure as a service for data scientists, we can actually delegate that work to YARN. So Hortonworks, they are putting resources into YARN, doubling down actually. And they are making changes to YARN where it will act as the resource manager not only for the Hadoop and Spark workloads, but also for Data Science Experience workloads. So that is the level of deep engineering in which we are engaged with Hortonworks. >> YARN stands for Yet Another Resource Negotiator. There you go for... >> John: Thank you. >> The trivia of the day. (laughing) Okay, so... But of course, Hortonworks are big on committers. And obviously a big committer to YARN. Probably wouldn't have YARN without Hortonworks. So you mentioned that's kind of what they're bringing to the table, and you guys primarily are focused on the integration, as well as some other IBM IP? >> That is true, as well as the Knox piece that I mentioned. We have a Knox committer. We have multiple Knox committers on our side, and that helps us as well. So Apache Knox is part of the HDP package. We need knowledge on our side to work with Hortonworks developers to make sure that we are contributing and making inroads into Data Science Experience. That way the integration becomes a lot easier. And from an IBM IP perspective... So Data Science Experience already comes with a lot of packages and libraries that are open source, but IBM Research has worked on a lot of these libraries. I'll give you a few examples: Brunel and PixieDust are something that our developers love. These are visualization libraries that were actually cooked up by IBM Research and then open sourced. And these are prepackaged into Data Science Experience, so there is IBM IP involved, and there are a lot of algorithms, machine learning algorithms, that we put in there. So that comes right out of the package.
>> And you guys, the development teams, are really both in The Valley? Is that right? Or are you really distributed around the world? >> Yeah, so we are. The Data Science Experience development team is in North America, between The Valley and Toronto. The Hortonworks team, they are situated about eight miles from where we are in The Valley, so there's a lot of synergy. We work very closely with them, and that's what we see in the product. >> I mean, what impact does that have? Is it... You know, you hear today, "Oh, yeah. We're a virtual organization. We have people all over the world: Eastern Europe, Brazil." How much of an impact is that? To have people so physically proximate? >> I think it has major impact. I mean, IBM is a global organization, so we do have teams around the world, and we work very well. With the advent of IP telephony, and screen-shares, and so on, yes, we work. But it really helps being in the same timezone, especially working with a partner just eight miles or ten miles away. We have a lot of interaction with them, and that really helps. >> Dave: Yeah. Body language? >> Yeah. >> Yeah. You talked about problems. You talked about issues. You know, customers. What are they now? Before it was like, "First off, I want to get more data." Now they've got more data. Is it figuring out what to do with it? Finding it? Having it available? Having it accessible? Making sense of it? I mean, what's the barrier right now? >> The barrier, I think, for data scientists... The number one barrier continues to be data. There's a lot of data out there, a lot of data being generated, and the data is dirty. It's not clean. So the number one problem that data scientists have is how do I get to clean data, and how do I access data. There are so many data repositories, data lakes, and data swamps out there. Data scientists, they don't want to be in the business of finding out how to access data.
They want to have instant access to data, and-- >> Well, if you would let me interrupt you. >> Yeah? >> You say it's dirty. Give me an example. >> So it's not structured data, so data scientists-- >> John: So unstructured versus structured? >> Unstructured versus structured. And if you look at all the social media feeds that are being generated, the amount of data that is being generated, it's all unstructured data. So we need to clean up the data, and the algorithms need structured data, or data in a particular format. And data scientists don't want to spend too much time cleaning up that data. And access to data, as I mentioned. And that's where Data Science Experience comes in. Out of the box, we have so many connectors available. It's very easy for customers to bring in their own connectors as well, and you have instant access to data. And as part of our partnership with Hortonworks, you don't have to bring data into Data Science Experience. The data is becoming so big, you want to leave it where it is, and instead push analytics down to where it is. And you can do that. We can connect to remote Spark. We can push analytics down through remote Spark. All of that is possible today with Data Science Experience. The second thing that I hear from data scientists is all the open source libraries. Every day there's a new one. It's a boon and a bane as well. The open source community is very vibrant, and there are a lot of data science competitions, machine learning competitions, that are helping move this community forward, and that's a good thing. The bad thing is data scientists like to work in silos on their laptops. How do you, from an enterprise perspective... How do you take that, and how do you move it? Scale it to an enterprise level? And that's where Data Science Experience comes in, because now we provide all the tools, the tools of your choice, open source or proprietary. You have it in here, and you can easily collaborate.
You can do all the work that you need with open source packages and libraries, bring your own, as well as collaborate with other data scientists in the enterprise. >> So, you're talking about dirty data. I mean, with Hadoop and no schema on write, right? We kind of knew this problem was coming. So technology sort of got us into this problem. Can technology help us get out of it? I mean, from an architectural standpoint. When you think about dirty data, can you architect things in to help? >> Yes. So, if you look at the machine learning pipeline, the pipeline starts with ingesting data and then cleansing or cleaning that data. And then you go into creating a model, training, picking a classifier, and so on. So we have tools built into Data Science Experience, and we're working on tools coming up and down our roadmap, which will help data scientists do that themselves. I mean, they don't have to be really in-depth coders or developers to do that. Python is very powerful. You can do a lot of data wrangling in Python itself, so we are enabling data scientists to do that within the platform, within Data Science Experience. >> If I look at sort of the demographics of the development teams... We were talking about Hortonworks and you guys collaborating. What are they like? I mean, people picture IBM, you know, like this 100-plus-year-old company. What's the persona of the developers on your team? >> The persona? I would say we have a very young, agile development team, and by that I mean... So we've had six releases this year in Data Science Experience, just for the on-premises side of the product, and the cloud side of the product has had huge delivery. We have releases coming out faster than we can code. And it's not just re-architecting it every time, but it's about adding features, giving features that our customers are asking for, and not making them wait for three months, six months, one year.
So our releases are becoming a lot more frequent, and customers are loving it. And that is, in part, because of the team. The team is able to evolve. We are very agile, and we have an awesome team. That's all. It's an amazing team. >> But six releases in... >> Yes. We had our initial release in April, and since then we've had about five revisions of the release, where we add a lot more features to our existing releases. A lot more packages, libraries, functionality, and so on. >> So you know what monster you're creating now, don't you? I mean, you know? (laughing) >> I know, we are setting expectations. >> You still have two months left in 2017. >> We do. >> These are not mainframe release cycles. >> They are not, and that's the advantage of the microservices architecture. I mean, when a customer upgrades, right? They don't have to bring that entire system down to upgrade. You can target one particular part, one particular microservice. You componentize it, and just upgrade that particular microservice. It's become very simple, so... >> Well, some of those microservices aren't so micro. >> Vikram: Yeah. Not always. Yeah, so it's a balance. >> You're growing, but yeah. >> It's a balance you have to keep, making sure that you componentize it in such a way that when you're doing an upgrade, it affects just one small piece of it, and you don't have to take everything down. >> Dave: Right. >> But, yeah, I agree with you. >> Well, it's been a busy year for you, to say the least, and I'm sure 2017-2018 is not going to slow down. So continued success. >> Vikram: Thank you. >> Wish you well with that. Vikram, thanks for being with us here on theCUBE. >> Thank you. Thanks for having me. >> You bet. >> Back with Data Science For All, here in New York City. IBM, coming up here on theCUBE right after this.
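The "cleansing" stage of the pipeline Vikram describes, turning dirty, semi-structured records into structured rows a model can consume, can be sketched in plain Python. This is an illustrative sketch only, not Data Science Experience code; the field names and cleaning rules below are hypothetical.

```python
# Minimal sketch of the cleansing step in a machine learning pipeline:
# parse, validate, normalize, and drop rows that cannot be repaired.
# The 'name,age,city' layout and the rules here are hypothetical.

def clean_records(raw_rows):
    """Turn messy 'name,age,city' strings into structured dicts."""
    cleaned = []
    for row in raw_rows:
        parts = [p.strip() for p in row.split(",")]
        if len(parts) != 3:
            continue  # malformed row: drop it
        name, age, city = parts
        if not name or not age.isdigit():
            continue  # missing name or non-numeric age: drop it
        cleaned.append({
            "name": name.title(),  # normalize casing
            "age": int(age),       # structured, typed value
            "city": city.title(),
        })
    return cleaned

raw = [
    "  alice , 34, new york ",
    "bob,thirty,boston",   # non-numeric age: dropped
    "carol,29",            # missing field: dropped
    "dan , 41 , toronto",
]
print(clean_records(raw))  # two structured rows survive
```

A real pipeline would typically lean on pandas or Spark for this, but the shape of the work, parse, validate, normalize, and discard what can't be repaired, is the same.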

Published Date : Nov 1 2017



Daniel Hernandez, Analytics Offering Management | IBM Data Science For All


 

>> Announcer: Live from New York City, it's theCUBE. Covering IBM Data Science For All. Brought to you by IBM. >> Welcome to the Big Apple, John Walls and Dave Vellante here on theCUBE. We are live at IBM's Data Science For All, going to be here throughout the day with a big panel discussion wrapping up our day. So be sure to stick around all day long on theCUBE for that. Dave, always good to be here in New York, is it not? >> Well, you know, it's been kind of the data science weeks, months. Last week we were in Boston at an event with the chief data officer conference. All the Boston Datarati were there. Bring it all down to New York City, getting hardcore really with data science, so it's from chief data officer to the hardcore data scientists. >> The CDO, hot term right now. Daniel Hernandez now joins us as our first guest here at Data Science For All, who's a VP of IBM Analytics. Good to see you. Daniel, thanks for being with us. >> Pleasure. >> Alright, well give us first off your take. Let's just step back high level here. Data science, it's certainly been evolving for decades if you will. First off, how do you define it today? And then, just from the IBM side of the fence, how do you see it in terms of how businesses should be integrating this into their mindset? >> So the way I describe data science simply to my clients is it's using the scientific method to answer questions or deliver insights. It's kind of that simple. Or answering questions quantitatively. So it's a methodology, it's a discipline, it's not necessarily tools. So that's kind of the way I approach describing what it is. >> Okay, and then from the IBM side of the fence, in terms of how wide of a net are you casting these days? I assume it's as big as you can get your arms out. >> So when you think about any particular problem that's a data science problem, you need certain capabilities. We happen to deliver those capabilities. You need the ability to collect, store, manage any and all data.
You need the ability to organize that data so you can discover it and protect it. You've got to be able to analyze it. Automate the mundane, explain the past, predict the future. Those are the capabilities you need to do data science. We deliver a portfolio of it, including, on the analyze part of our portfolio, our data science tools that we would declare as such. >> So data science for all is very aspirational, and when you guys made the announcement of the Watson Data Platform last fall, one of the things that you focused on was collaboration between data scientists, data engineers, quality engineers, application development, the whole sort of chain. And you made the point that most of the time that data scientists spend is on wrangling data. You're trying to attack that problem, and you're trying to break down the stovepipes between those roles that I just mentioned. All that has to happen before you can actually have data science for all. I mean, that's just data science for all hardcore data people. Where are we in terms of sort of the progress that your clients have made in that regard? >> So you know, I would say there are two major vectors of progress we've made. So if you want data science for all, you need to be able to address people that know how to code and people that don't know how to code. So if you consider kind of the history of IBM in the data science space, especially in SPSS, which has been around for decades, we're mastering and solving data science problems for non-coders. The data science experience really started with embracing coders. Developers that grew up in open source, that lived and learned Jupyter or Python and were more comfortable there. And integration of these is kind of our focus. So that's one aspect. Serving the needs of people that know how to code and don't in the kind of data science role.
And then for all means supporting an entire analytics life cycle, from collecting the data you need in order to answer the question that you're trying to answer, to organizing that information once you've collected it so you can discover it inside of tools like our own Data Science Experience and SPSS, and then of course the set of tools around exploratory analytics. All integrated so that you can do that end to end life cycle. So where clients are, I think they're getting certainly much more sophisticated in understanding that. You know, most people have approached data science as a tool problem, as a data prep problem. It's a life cycle problem. And that's kind of how we're thinking about it. We're thinking about it in terms of, alright, if our job is answering questions, delivering insights through scientific methods, how do we decompose that problem into a set of things that people need to get the job done, serving the individuals that have to work together. >> And when you think about it, go back to the days where the data warehouse was king. Something we talked about in Boston last week. It used to be the data warehouse was king, now the process is much more important. But very few people had access to that data, you had the elapsed time of getting answers, and the inflexibility of the systems. Has that changed, and to what degree has it changed? >> I think if you were to go ask anybody in business whether or not they have all the data they need to do their job, they would say no. Why? So we've invested in EDWs, we've invested in Hadoop. In part, sometimes, the problem might be I just don't have the data. Most of the time it is I have the data, I just don't know where it is.
So there's a pretty significant issue on data discoverability, and the reality is I might have data in my operational systems, I might have data inside my EDW, I don't have everything inside my EDW, I've stood up one or more data lakes, and to solve my problem, like customer segmentation, I have data everywhere. How do I find and bring it in? >> That seems like that should be a fundamental consideration, right? If you're going to gather this much more information, make it accessible to people. And if you don't, it's a big flaw, it's a big gap, is it not? >> So yes, and I think part of the reason why is because governance professionals, which I am, you know, I spent quite a bit of time trying to solve governance related problems. We've been focusing pretty maniacally on kind of the compliance, and the regulatory and security related issues. Like how do we keep people from going to jail, how do we ensure regulatory compliance with things like e-discovery and records, for instance. And it just so happens the same discipline that you use, even though in some cases lighter weight implementations, are what you need in order to solve this data discovery problem. So the discourse around governance has been historically about compliance, about regulations, about cost takeout, not analytics. And so a lot of our time certainly in R&D is trying to solve that data discovery problem, which is how do I discover data using semantics that I have, which as a regular user is not physical understandings of my data, and once I find it, how am I assured that what I get is what I should get, so that I'm not subject to compliance related issues and not making the company more vulnerable to data breach. >> Well, so presumably part of that anyway involves automating classification at the point of creation or use, which has actually been a technical challenge for a number of years. Has that challenge been solved in your view?
>> I think machine learning is, and in fact later on today I will be doing some demonstrations of technology which will show how we're making the application of machine learning easy. Inside of everything we do we're applying machine learning techniques, including to classification problems that help us solve the problem. So it could be we're automatically harvesting technical metadata. Are there business terms that could be automatically extracted that don't require some data steward to have to know and assert, right? Or can we automatically suggest, and still have the steward for a case where I need a canonical data model, so I just don't want the machine to tell me everything, but I want the machine to assist the data curation process. We are not just exploring the application of machine learning to solve that data classification problem, which historically was a manual one. We're embedding that into most of the stuff that we're doing. Often you won't even know that we're doing it behind the scenes. >> So that means that oftentimes the machine ideally is making the decisions as to who gets access to what, and is helping at least automate that governance, but there's a natural friction that occurs. And I wonder if you can talk about the balance sheet, if you will, between information as an asset and information as a liability. You know, the more restrictions you put on that information, the more it constricts a business user's ability. So how do you see that shaping up? >> I think it's often a people process problem, not necessarily a technology problem. I don't think as an industry we've figured it out. Certainly a lot of our clients haven't figured out that balance. I mean, there are plenty of conversations I'll go into where I'll talk to a data science team in the same line of business as a governance team, and what the data science team will tell us is I'm building my own data catalog, because the stuff that the governance guys are doing doesn't help me.
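The machine-assisted classification Daniel describes, where the machine suggests a business term and a data steward confirms it, can be sketched with a rule-based tagger. The patterns, term names, and column values below are invented for illustration; a real system would learn its classifiers rather than hard-code them.

```python
import re

# Hypothetical sketch of machine-assisted data classification: the machine
# *suggests* a business term for a column, a data steward signs off.
RULES = [
    (re.compile(r"\d{3}-\d{2}-\d{4}"), "US-SSN"),
    (re.compile(r"[^@\s]+@[^@\s]+\.[a-z]+"), "Email"),
    (re.compile(r"\d{4}-\d{2}-\d{2}"), "Date"),
]

def suggest_term(sample_values):
    """Return (term, confidence): the fraction of samples matching a rule."""
    best = ("Unclassified", 0.0)
    for pattern, term in RULES:
        hits = sum(1 for v in sample_values if pattern.fullmatch(v))
        conf = hits / len(sample_values)
        if conf > best[1]:
            best = (term, conf)
    return best

column = ["alice@example.com", "bob@example.com", "not-an-email"]
term, conf = suggest_term(column)
print(term, round(conf, 2))   # the suggestion goes to the steward for sign-off
```

The design point is the one Daniel makes: the machine proposes with a confidence, and the human curator stays in the loop for anything below certainty.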
And the reason why it doesn't help me is because they're going through this top down data curation methodology, and I've got a question, I need to go find the data that's relevant. I might not know what that is straight away. So the CDO function in a lot of organizations is helping bridge that. So you'll see governance responsibilities line up with the CDO, with analytics. And I think that's gone a long way to bridge those gaps. But that conversation that I was just mentioning is not unique to one or two customers. Still a lot of customers are doing it. Often customers that either haven't started a CDO practice or are early days on it still. >> So about that, because this is being introduced to the workplace, a new concept, right, fairly new, CDOs, as opposed to CIO or CTO, you know, you have these others. I mean, how do you talk to your clients about trying to broaden their perspective on that, and I guess emphasizing the need for them to consider putting somebody with sole responsibility, or primary responsibility, for their data, instead of just lumping it in somewhere else? >> So we happen to have one of the best CDOs inside of our group, which is like a handy tool for me. So if I go into a client and it's purporting to be a data science problem and it turns out they have a data management issue around data discovery, and they haven't yet figured out how to install the process and people design to solve that particular issue, one of the key things I'll do is bring in our CDO and his delegates to have a conversation with them around what we're doing inside of IBM, what we're seeing in other customers, to help institute that practice inside of their own organization. We have forums like the CDO event in Boston last week, which are designed to, you know, it's not designed to be here's what IBM can do in technology, it's designed to say here's how the discipline impacts your business and here's some best practices you should apply.
So if ultimately I enter into those conversations where I find that there's a need, I typically am like, alright, tools are part of the problem but not the only issue, let me bring someone in that can describe the people process related issues, which you've got to get right in order for, in some cases, the tools that I deliver to matter. >> We had Seth Dobrin on last weekend in Boston, and Inderpal Bhandari as well, and he put forth this enterprise, sort of data blueprint if you will. CDOs are sort of-- >> Daniel: We're using that in IBM, by the way. >> Well, this is the thing, it's a really well thought out sort of structure that seems to be trickling down to the divisions. And so it's interesting to hear how you're applying Seth's expertise. I want to ask you about the Hortonworks relationship. You guys made a big deal about that this summer. To me it was a no brainer. Really, what was the point of IBM having a Hadoop distro, and Hortonworks gets this awesome distribution channel. IBM has always had an affinity for open source, so that made sense there. What's behind that relationship and how's it going? >> It's going awesome. Perhaps what we didn't say, and we probably should have focused on, is the why-customers-care aspect. There are three main buying-occasion use cases that customers are implementing where, even before the relationship, they were asking IBM and Hortonworks to work together. And so we were coming to the table working together as partners before the deeper collaboration we started in June. The first one was bringing data science to Hadoop. So running data science models, doing data exploration where the data is. And if you were to actually rewind the clock on the IBM side and consider what we did with Hortonworks in full consideration of what we did prior, we brought the Data Science Experience and machine learning to Z in February. The highest value transactional data was there.
The next step was bringing data science to, often for a lot of clients, the second most valuable set of data, which is Hadoop. So that was kind of part one. And then we've kind of continued that by bringing Data Science Experience to the private cloud. So that's one use case. I've got a lot of data, I need to do data science, I want to do it in residence, I want to take advantage of the compute grid I've already laid down, and I want to take advantage of the performance benefits and the integrated security and governance benefits by having these things co-located. That's kind of play one. So we're bringing Data Science Experience and HDP and HDF, which are the Hortonworks distributions, way closer together and optimized for each other. Another component of that is not all data is going to be in Hadoop, as we were describing. Some of it's in an EDW, and that data science job is going to require data outside of Hadoop, and so we brought Big SQL. It was already supporting Hortonworks, we just optimized the stack, and so the combination of Data Science Experience and Big SQL allows you to do data science against a broader surface area of data. That's kind of play one. Play two is I've got an EDW; either for cost or agility reasons I want to augment it, or in some cases I might want to offload some data from it to Hadoop. And so the combination of Hortonworks plus Big SQL and our data integration technologies is a perfect combination there, and we have plenty of clients using that for kind of analytics offloading from an EDW. And then the third piece that we're doing quite a bit of engineering and go-to-market work around is governed data lakes. So I want to enable self service analytics throughout my enterprise. I want self service analytics tools for everyone that has access to them. I want to make data available to them, but I want that data to be governed, so that they can discover what's in the lake, and whatever I give them is what they should have access to.
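The federated-query play described above, one SQL statement spanning an EDW and a Hadoop-based lake, can be loosely sketched using SQLite's ATTACH as a stand-in for the federation layer. The schemas and rows below are invented, and this is an analogy for the concept, not Big SQL itself.

```python
import sqlite3

con = sqlite3.connect(":memory:")          # plays the role of the EDW
con.execute("ATTACH ':memory:' AS lake")   # plays the role of the data lake

con.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
con.execute("CREATE TABLE lake.clicks (customer_id INTEGER, url TEXT)")
con.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])
con.executemany("INSERT INTO lake.clicks VALUES (?, ?)",
                [(1, "/pricing"), (1, "/docs"), (2, "/pricing")])

# One query, two stores: the 'federated' join the text describes.
rows = con.execute("""
    SELECT c.name, COUNT(*) AS clicks
    FROM customers c JOIN lake.clicks k ON k.customer_id = c.id
    GROUP BY c.name ORDER BY c.name
""").fetchall()
print(rows)   # [('Acme', 2), ('Globex', 1)]
```

The value being claimed is exactly this shape: the analyst writes one query and the engine worries about where each table physically lives.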
So those are kind of the three tracks that we're working with Hortonworks on, and all of them are producing stunning results inside of clients. >> And so that involves actually some serious engineering as well-- >> Big time. >> It's not just sort of a Barney deal or just a pure go to market-- >> It's certainly more than marketecture, and it just works. >> Big picture down the road then. Whatever challenges that you see on your side of the business for the next 12 months. What are you going to tackle, what's that monster out there that you think, okay, this is our next hurdle to get by? >> I forgot if Rob said this before, but you'll hear him say often, and it's statistically proven, that the majority of the data that's available is not available to be Googled, so it's behind a firewall. And so we started last year with the Watson Data Platform, creating an integrated data and analytics system. What if customers have data that's on-prem that they want to take advantage of, what if they're not ready for the public cloud? How do we deliver public cloud benefits to them when they want to run that workload behind a firewall? So we're doing a significant amount of engineering, really starting with the work that we did on Data Science Experience, bringing it behind the firewall, but still delivering similar benefits you would expect if you were delivering it in the public cloud. A major advancement that IBM made is IBM Cloud Private. I don't know if you guys are familiar with that announcement. We made it, I think, already two weeks ago. So it's a (mumbles) foundation on top of which we have micro services, on top of which our stack is going to be made available. So when I think of kind of where the future is, you know, our customers ultimately, we believe, want to run data and analytic workloads in the public cloud. How do we get them there, considering they're not there now, in a stepwise fashion that is sensible economically, project management-wise, culturally.
Without having them having to wait. That's kind of big picture, kind of a big problem space we're spending considerable time thinking through. >> We've been talking a lot about this on theCUBE in the last several months, or even years: people realize they can't just take their business and stuff it into the cloud. They have to bring the cloud model to their data, wherever that data exists. If it's in the cloud, great. And the key there is you've got to have a capability and a solution that substantially mimics that public cloud experience. That's kind of what you guys are focused on. >> What I tell clients is, if you're ready for certain workloads, especially greenfield workloads, and the capability exists in a public cloud, you should go there now. Because you're going to want to go there eventually anyway. And if not, then a vendor like IBM helps you take advantage of that behind a firewall, often in form factors that are ready to go. The Integrated Analytics System, I don't know if you're familiar with that. That includes our super advanced data warehouse, the Data Science Experience, our query federation technology powered by Big SQL, all in a form factor that's ready to go. You get started there for data and data science workloads, and that's a major step in the direction of the public cloud. >> Alright, well Daniel, thank you for the time, we appreciate that. We didn't get to touch at all on baseball, but next time, right? >> Daniel: Go Cubbies. (laughing) >> Sore spot with me but it's alright, go Cubbies. Alright, Daniel Hernandez from IBM, back with more here from Data Science For All, IBM's event here in Manhattan. Back with more on theCUBE in just a bit. (electronic music)

Published Date : Nov 1 2017



Matt Kixmoeller, Pure Storage | CUBEcoversation, April 2019


 

>> Welcome to this special CUBE Conversation. We're here in Mountain View, California, at the Pure Storage headquarters on Castro Street, one of the many buildings they have here as they continue to grow as a public company. Our next guest is Kix, Vice President of Strategy and employee number six at Pure. Great to see you. Thanks for spending time. >> Thanks for having me. >> So cloud is the big wave that's coming, and the future is here now. People are really impacted by it operationally, coming to the reality that they've got to actually use the cloud for its many benefits. But you guys have major bona fides in storage, flash arrays continuing to take territory. So as you guys do that, what's the cloud play? How do customers who are using Pure get there? We've heard some good testimonials, a lot of happy customers. We've seen great performance: easy to get in, reliability, performance on the storage side on premise. Right? Okay. Now operations says, hey, I want to build faster. Cloud is certainly a path there. Certainly a good one. Your thoughts on strategy for the cloud? >> Absolutely. So look, we're about ten years into the journey here at Pure, and a lot of what we did in the first ten years was help bring flash onto the scene. Um, and you know, we had a vision when we started the company of the all-flash data center, and I'd like to first of all remind people that, look, we ain't there yet. If you look at the analyst numbers, about a third of the storage sold this year will be flash, two thirds disk. So we still have a long way to go on the all-flash data center and a lot of work to do there. But of course, increasingly customers are wanting to move workloads to the cloud. And I think the last couple of years have almost seen a pendulum swing a little bit more back to reality. You know, when I met with CEOs two or three years ago, you often heard we're going all cloud, we're going cloud first, and, you know, now they're a few years into it.
And they've realized that cloud is a very powerful weapon in their arsenal for agility, for flexibility. But it's not necessarily cheaper. And so I think the swing back to really believing in hybrid is the model of the day, and what I think people have realised in that journey is that the cloud really works best when you build an app for the cloud natively. But what if you have a bunch of on-prem apps that are in a traditional architecture? How do I get them to the cloud? And so one of the things we really focused on is how we can help customers take their mission critical applications and move them seamlessly to the cloud without re-architecture. Because for most customers, that's really where it's going to start. I mean, they could build some new stuff in the cloud, but the bulk of their business, if they want to move substantial portions to the cloud, they've got to figure out how to move what they've got. And we think we really add value in that. >> And the economics of the cloud is undeniable. People who are born in the cloud will testify to that. Certainly as you guys have been successful on premise, with the cloud, how do you make those economics seamless, as well as the operations? This seems to be the number one goal. Talk about how important that is and how hard it is, because it sounds easy just to say it. But it's actually really difficult to have seamless operations, because, you know, Amazon, Google, Microsoft, they all have compute and storage in the cloud, and you've got storage on premise. This equation is a really important one to figure out. What's the importance, how hard is it, and what are some of the things you guys are doing to solve that?
You know, the first thing I think that has been nice to see over the last couple of years as people realizing that both the cloud and on from our cost effective in different ways, and I think a little bit about the way that I think about owning a car. Owning a car is relatively cost effective for me, and there's times and taken uber is relatively cost effective. I think they're both cheap when you look it on one metric, though, about what I pay per mile, it's way more expensive to own a car to take a number look about acquisition cost. It's way more expensive. Car, right? And so I think both of them provide value of my lives in the way that hybrid does today. But once you start to use both than the operational, part of your question comes in. How do I think about these two different worlds? And I think we believe that that storage is actually one of the areas where these two worlds are totally different on dso a couple things we've done to find a bridge together. First off on the cost side, one of the things we realised was that people that are going to run large amounts of on prime infrastructure increasingly want to do it in the cloud model. And so we introduced a new pricing model that we call the S to evergreen storage service, which will essentially allows you to subscribe to our storage even in your own data center. And so you can have an optics experience in the cloud. You gotta monoprix experience on Prem and when you buy and yes, to those licenses are transferrable so you can start on Prem, Move your stories to the cloud with pure go back and forth tons of flexibility. From the operational point of view, I think we're trying to get to the same experience as well such that you have a single storage experience for a manageability and automation point of view across both. And I think that last word of automation is key, because if you look at people who are really invested in cloud, it's all about automation. 
And one of the nice things I think that's made Pure so successful in on-prem and cloud environments is this combination of simplicity and automation. You can't automate what isn't simple to begin with. And so we started with simplicity, but as we've added rich APIs, we're really seeing that become the dominant way that people administer our storage. And so as we've gone to the cloud, because it's the same software on both sides, literally the same integrations, the same API calls, everything works transparently across both places. >> That's a great point. We've been reporting on this on SiliconANGLE and theCUBE for years. Automation's great, you take a couple of manual tasks and automate them, but the value is shifting, and you guys in the storage business know this: data is very valuable. You mentioned the car analogy; just take Uber. Uber is an app. It's got web services on the back end. So when you start thinking about cloud, you hear APIs, you hear micro services, as more and more applications are going to need the data. They're going to need to have that in real time, in some cases near real time, and they're going to need to have it at the right time. So the role of data becomes important, which makes storage more important. So you automate the storage, okay, take away the mundane tasks. Now the value shifts to making sure data is being presented properly. This is the renaissance of application development; right now we're seeing this. How do you guys attack that market? How do you guys enable that market, how do you satisfy that market? Because this is where the APIs can be connectors. This is where the data can be valuable. Whether it's analytics or an app like Uber that's just, you know, slinging APIs together for a service that is now going to go public. >> Yeah,
Yeah, monolithic databases that kind of privately owned their own data stores, and the whole name of the game was delivering that as reliably as possible, kind of locking it down, making it super reliable. If you look at the idea of the web-scale application, the application is broken up into lots of little microservices, and those microservices somehow have to work together on data. So what does that mean at the data level? It's not this kind of monolithic database anymore; it's got to be this open, shared environment. And, you know, as a result, if you look at the web, in Amazon's case for example, the vast majority of applications are written on S3 object storage that's inherently shared. And so I think one of the bigger, more interesting challenges right now is how you get data constructs to actually go both ways. You know, if you want to take an on-prem app that's built around the database, you've got to figure out a way to move it to the cloud and run it reliably. On the flip side of the coin, if you want to build on web-scale tools and then be hybrid and run some of those things on-prem, well, you need an object store on-prem, and most people don't have that. And so, you know, this whole kind of compatibility to make hybrid a reality is forcing people on both sides of the wire to understand the other architecture and make sure they're compatible both ways. >> And throw more complexity into that equation: the skills gaps. People know the cloud natively, but on-premise is a different skill set. You guys have had announcements come out, so I want to ask you about your product announcements and your acquisitions. Go back over the past six months: what are the most notable product announcements and acquisitions that you guys have done? And what do they mean for Pure and your customers? >> Yeah, absolutely.
So I'll just kind of walk through it. The first thing we announced was our new set of Cloud Data Services, and this was in essence bringing our core software, which runs in our Purity operating environment, right into the cloud. We call that Cloud Block Store. And again, this is a lot of what I've been talking about: how you can take a tier-one block storage application on-prem and seamlessly move it to the cloud. Along that same timeline we also introduced something called PSO, the Pure Service Orchestrator. This is a tool set that we built specifically for the containers world, for Kubernetes, so that in a container environment our storage can be completely automated. It's been really fun watching customers use it and seeing how different storage is in a container environment. You know, we look at our call-home data in our Pure1 application, and in our traditional on-prem environment the average array sees about one administrative task per day: make a volume, delete something, whatever. If you look at a container environment, it's tens of thousands, and so it's just a much more fluid environment. There's no way a storage admin is going to do something ten thousand times a day by hand. >> And that's where automation comes in. But what does that mean for the customers? It means the clients are using containers to be more flexible; they're deploying more. What's the insight into this container trend? >> You know, I think ultimately it's just a far more fluid environment. It's totally automated, and it's built on a world of shared data. And so you need a shared, reliable data service that can power these containers. And then, you know, back to your original question about product expansion: the next thing we announced last year was the acquisition of a company called StorReduce, which we've subsequently brought out as a product we call ObjectEngine.
And this is all about a new type of data moving into the cloud, which is backup data, and facilitating that backup process. You know, in the past people moved from tape backup to disk-based backup, and we saw two new inflection points here. Number one, the opportunity to use flash on-prem, so that people have really fast recoveries, because in most environments now disk-based recoveries just aren't fast enough. And then, number two, using low-cost object storage in the cloud for retention. So the combination of flash on-prem and object storage in the cloud can completely replace both disk and tape in the backup process. >> You kind of won the competition because you guys came in really with the vision of the all-flash data center. You now have cloud software that runs on Amazon and others; no hardware, it's just block storage as a solution. How has the competition fallen behind? You guys really catapulted into the lead and took share, certainly from other vendors. When you went public, some predicted that Pure would never make it to escape velocity; some other pundits and CEOs of tech companies said the same, but you guys achieved it, and now you go to the next level. What is the importance of that ability you have, and what's the inability of the competition? >> So, you know, I like to joke with folks that when we started the company, flash was almost an excuse. We just tried to build a better storage company. We went out and I talked to many, many, many customers, and I found in general they didn't just not like their storage products; they didn't like the companies that sold them. And so we tried to look at that overall experience. And, you know, we of course innovated around flash: we used consumer flash and brought the price down with deduplication so you could actually afford to use it. But we also just looked at that ownership experience.
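The backup split described above, fast flash on-prem for recovery and low-cost cloud object storage for retention, boils down to an age-based placement rule. A toy sketch follows; the 30-day threshold and tier names are made up for illustration, not a product setting:

```python
# Toy age-based placement rule for backup data: recent backups stay
# on fast on-prem flash for quick recovery; older ones move to cheap
# cloud object storage for long-term retention.

RETENTION_THRESHOLD_DAYS = 30  # hypothetical cutoff, not a product default

def placement(age_days, threshold=RETENTION_THRESHOLD_DAYS):
    """Decide where a backup of a given age should live."""
    return "on-prem-flash" if age_days <= threshold else "cloud-object"

backups = {"daily-0": 1, "weekly-4": 28, "monthly-6": 180}
tiers = {name: placement(age) for name, age in backups.items()}
print(tiers)
# {'daily-0': 'on-prem-flash', 'weekly-4': 'on-prem-flash', 'monthly-6': 'cloud-object'}
```

The point of the two-tier rule is that neither disk nor tape appears anywhere: recovery speed comes from flash, retention economics from object storage.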
And when I talk to folks in the industry, I think now we might even be better known for our Evergreen approach than for flash. And it's been neat to watch: customers now, even the earliest Pure customers who are two or three cycles into refreshing, have seen a dramatic difference in just the storage experience. You can essentially subscribe to an array over time, through many generations of technology turns, as opposed to that cycle of replacing arrays. >> Share a story of a customer that's been through these refresh cycles, from their first experience to what they experience now. What are some of those experiences like? Share some insight. >> Yeah, so, you know, one of the first customers that really turned us on to this at scale was a large telco provider. They were interesting: they run hundreds of tier-one arrays from competitors, and, you know, they do a three-year cycle. But as they really looked at the cost of that three-year cycle, they realized there was only eighteen months of usable life in those three years, because it took them nine months to get the data onto the array, and then, when they knew the end was coming, it took them nine months to get the data off the array. And so per array it was costing them a million dollars just in data migration costs alone. Then you've wasted half of the life of the array. Add that up over hundreds of arrays in your environment and you can quickly do the math. >> It's just that the total cost of ownership gets out of control, right? >> And so as we brought in Evergreen, there was an immediate ROI. I mean, it was a cost equation; it was, you know, on parity with flash versus disk anyway, but if you look at all those operational savings, it just completely changed the equation. And so with what we started with Evergreen, we realized it was much more of a subscription model, where people subscribe to a service with us, we update and refresh the hardware over time, and it just keeps getting better over time.
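The lifecycle math in that telco anecdote works out as follows; a quick back-of-the-envelope sketch with the figures taken straight from the story (36-month cycle, nine months migrating data on, nine months migrating off, roughly a million dollars of migration cost per array):

```python
# Back-of-the-envelope model of the array-refresh economics described
# in the telco anecdote. All figures are illustrative, from the story.

def usable_fraction(cycle_months=36, migrate_on=9, migrate_off=9):
    """Fraction of an array's life spent doing useful work."""
    return (cycle_months - migrate_on - migrate_off) / cycle_months

def fleet_migration_cost(arrays, cost_per_array=1_000_000):
    """Migration spend per refresh cycle across the whole fleet."""
    return arrays * cost_per_array

print(usable_fraction())          # 0.5 -> half the array's life is usable
print(fleet_migration_cost(200))  # 200000000 across a 200-array fleet
```

With half the life wasted and nine figures of migration cost per cycle across a few hundred arrays, the appeal of an in-place, subscription-style refresh is just arithmetic.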
>> Sounds a lot like the cloud, right? And so really your strategy is to bring a common set of tools in there and deliver them as that kind of service; that's been key. >> Yeah, I think, you know, another thing we did from day one was say we're never going to build a piece of on-prem management software. So our on-prem management experience from day one was Pure1, which is our SaaS-based management platform. You know, it started out as a call-home application, but now it's a very full-featured SaaS-based management experience. And that's also served us well as we go to the cloud, because when you want to manage on-prem and cloud together, you're best able to do it from the cloud itself. >> Tell us about the application environment you mentioned earlier, hybrid and multi-cloud. A lot of the pressure in IT is to drive top-line revenue, not just cost reduction; that was a good benefit you mentioned, and it certainly gets their attention. But changing the organization's value proposition to their customers is about the experience, whether app-driven or some other tech. This is now an imperative, and it's happening very fast. Modernization, renaissance, people call it all these things. How are you guys helping that piece of the puzzle? >> Yeah, I mean, I think ultimately most customers are starting to really get into the mindset that technology is their differentiation, so speed and agility for their developers become key. And so, you know, the modern CIO is much less about being a cost-cutting CIO today and much more about being that empowering CIO: how you can actually build the tools and bring them to the organization so it can run faster. And a lot of that is about unlocking consumption. And so it's been fun to see some of the lessons of the cloud, in terms of instant consumption, agility, and growth, actually come to the mindset of how people think about on-prem.
Well, and so a lot of what we've done is try to arm people on-prem with those same capabilities, so that they can easily deliver storage as a service to their customers: folks can consume via API without having to call somebody to ask for storage, so things can take seconds, not weeks of procurement. And now, as we bridge those models between on-prem and cloud, it becomes a single spot where you can basically have that same experience to request storage wherever it may be in the organization. >> The infrastructure-as-code idea is really just, you know, pushing code not from localhost or the machine but to cloud or on-prem, and having it trickle all the way through. This is one of the focuses we're hearing in cloud-native conversations: words like containers, which we talked briefly about, and you mentioned Kubernetes. Kubernetes is really hot right now; service meshes, microservices, stateful data, stateless data. These are really hyped-up areas, but with a lot of traction, and it's forcing people to take a look. How do you guys speak to the customers when they say, hey Kix, we love all the Pure stuff, we're on our third generation of arrays, we love being a customer, but I've got this looming trend I've got to understand and either operationalize or not: Kubernetes, service mesh, these kinds of cloud-native tools. How do you guys talk to that customer? What's the pitch? What's the value proposition? >> Yeah, I mean, I think, you know, your new Kubernetes environment is the last place you should consider legacy storage. All joking aside, we've been really positively impressed with how fast the adoption started around containers in general and Kubernetes in particular. You know, it started out as a developer thing, and we first saw it in our own environment.
When we started to build our second product, FlashBlade, four or five years ago, the engineering team started with containers from day one. It was like, that's interesting. And so we started to >> see how useful that is. You have containers and Kubernetes orchestration, pretty nice. >> And so, you know, we just started to see that grow. We also started to see it more within analytics and AI. You know, as we got into the AI area and our broader push around going after big data and analytics, those tool chains in particular were very well set up to take advantage of containers, because they're much more modern; it's much more about fluidly creating this data pipeline. And so it started in these key use cases, but I think, you know, it's at a point right now where every enterprise is considering it, and there's certainly an opportunity in the development environment. And, you know, despite all of that, the folks who tend to use these containers don't think about storage. If they go to the cloud and start to build applications, they're not thinking many layers down in the organization about what the storage that supports them looks like. And so if you look at a storage team's job, or an infrastructure team's job, it is to provide that same experience to your container-centric consumers, right? They should just be able to orchestrate and build, and then storage should just happen underneath. >> I totally agree, and I think that's a success milestone: if you can have that conversation, then, you know, you're winning. What they do care about, and we're hearing more of what you mentioned earlier, is the data pipeline. Data they care about, because applications will be needing data, whether it's a retail app or whatever. I might need access to multiple data sources, not some siloed data warehouse with, you know, high latency. They need data in the app at the right moment. This has been a key discussion: real time.
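The "storage should just happen underneath" model described above is essentially dynamic provisioning: an application declares a claim, and an orchestrator satisfies it with no human in the loop. A minimal simulation of that loop; the names and sizes are hypothetical, and this toy stands in for the real Kubernetes claim plus driver machinery (the role PSO plays for Pure), not any actual API:

```python
# Toy simulation of dynamic volume provisioning: apps file claims, a
# provisioner binds them to backing volumes automatically -- no ticket,
# no storage admin in the loop. All names here are illustrative.

import itertools

class Provisioner:
    def __init__(self):
        self._ids = itertools.count(1)   # monotonically increasing volume ids
        self.volumes = {}                # volume name -> size in GiB

    def bind(self, claim):
        """Satisfy a claim by creating a backing volume on demand."""
        name = f"pvc-{next(self._ids)}"
        self.volumes[name] = claim["size_gib"]
        return name

prov = Provisioner()
# A microservice declares what it needs; storage "just happens".
vol = prov.bind({"app": "orders-db", "size_gib": 50})
print(vol, prov.volumes[vol])  # pvc-1 50
```

This is also why the admin-task rate jumps from one a day to tens of thousands: every claim in the loop above is an automated storage operation.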
I mean, this is the data piece. It's been a hard problem. How do you guys look at that solution opportunity for your customers? >> I think one of the insights we had was that fundamentally folks needed infrastructure that can run not just one tool or another tool, but a whole bunch of them. And, you know, you look at people building a data pipeline: they're stitching together six, eight, ten tools that exist today and another twenty that don't exist yet. And that flexibility is key, right? A lot of the original thought in that space was to pick the right storage for this piece and the right storage for that piece. But as we introduced our FlashBlade product, we really positioned it as a data hub for these modern applications. Each of them requires something a little different, but the flexibility and scale of FlashBlade was able to provide everything those applications needed. We're now seeing another opportunity in that space with DAS and the traditional architecture. You know, as we came out with NVMe over Fabrics within our FlashArray product line, we see this as a way to really take web-scale architecture on-prem. Look within Google and Amazon and whatnot: they're not using hyperconverged, they're not using DAS disk inside the same chassis that happens to run applications. They have dedicated infrastructure for storage that's simply designed for storage, connected to dedicated servers with fast networking on demand. And so we're basically trying to bring that same architecture to the on-prem environment with NVMe over Fabrics, because NVMe over Fabrics can make shared storage feel as fast as local disk. >> But this is the shift that's really going on here. This is a complete re-architecture of computing and storage resources. >> Absolutely, you know, and I think the thing that's changing is that need for consolidation. In the early days I might have said, okay, I'm going to deploy,
I don't know, two hundred nodes of Hadoop, and I'll just design a server for Hadoop with the right amount of disk in it and put those over in their racks, and that will be like this. Then I'll design something else for something else. Right now, people are looking to define a rack they can stamp out, over and over and over again, and that rack needs to be flexible enough to deliver the right amount of storage to every application on demand, over and over. >> You know, one trend I want to get your reaction to is serverless, because it kind of points at that value proposition. Functions have been very popular. It's still early days on what functions are, but it's a telltale sign, a little bit, of where this is going, to your point around rethinking on-prem: not a radical, wholesale business model change, but more of an operating change in how IT is deployed and how it works with the cloud. Because those two things, working together, make serverless very interesting.
It still feels like we have as much aggression and as much excitement to go after the market every day as we always have; the energy is very, very strong. But on the flip side, it's now fun that we get to solve customer problems at a scale that we probably couldn't have even imagined in the early days. And I would also say right now it really feels like there's this next chapter opening up. You know, the first chapter was delivering the all-flash data center, and we're not even done with that yet. But as we bring our software to the cloud and really port it natively, optimized for each of the clouds, it kind of opens up our engineers to be creative in different ways. >> A generational shift is happening; we're seeing it, you know, again: application modernization, hybrid multi-cloud, just some key pillars. But there's so much more opportunity to go after, and I want your thoughts. You've had the luxury of working under two CEOs that have been very senior veterans, Scott Dietzen and Charlie. What's it like working with both of them? And what's it like with Charlie now? What's the big mandate? What's the hill you guys are trying to climb? Share some of the vision around Charlie's plans. >> Well, I'd say the thing that binds both Scott and Charlie together in DNA is that they're fundamentally both innovators. And, you know, if you look at Pure, we're never going to be the low-cost leader, and we're not going to be the company that sells you everything, so we have to be the company that's most innovative in the spaces we play in. And so, you know, that's job number one at Pure, after reliability; so let's say that's number two, but it's key. And I think both of our CEOs have shared that common DNA, which is that they're fundamentally product innovators. And I think that's the fun thing about working for Charlie: he's really thoughtful about how you run a company at very large scale.
How you manage the customer relationship so you never sacrifice that experience, because that's been great for Pure, but ultimately also how you unlock people to run faster in a big organization. >> You know, John Chambers, who Charlie worked with at Cisco back in the day, said one of the key things about a CEO is picking the right wave at the right time. What is that wave for Pure? What are you guys riding that takes advantage of the work you've still got to do in the data center, on the storage side? What's the big wave? >> So, you know, look, the first wave was flash. That was a great wave to be on, and it's not over. But we really see an enormous opportunity where the cloud infrastructure mentality comes on-prem. And, you know, we think that's going to finally be the thing that gets people out of the mindset of doing things the old way. You fundamentally can take the lessons learned over here in the cloud and apply them to the other side of the hybrid cloud. Everybody talks about hybrid cloud, and all the thought process is about what happens in the cloud half of the hybrid. Well, the on-prem half of the hybrid is just as important, and getting that to be truly cloud-like is a key focus of ours. >> And then again, microservices only help accelerate that, and you need modern storage, to your point, to make it work. Absolutely. Kix, thanks for spending the time and sharing the insights; I really appreciate it. It's theCUBE conversation here at Pure Storage headquarters; we're in the arcade room, getting the insights and sharing the data with you. I'm John Furrier. Thanks for watching this CUBE conversation.

Published Date : Apr 18 2019


Arun Garg, NetApp | Cisco Live 2018


 

>> Live from Orlando, Florida, it's theCUBE covering Cisco Live 2018. Brought to you by Cisco, NetApp and theCUBE's ecosystem partners. >> Hey, welcome back everyone. This is theCUBE's coverage here in Orlando, Florida at Cisco Live 2018. Our first year here at Cisco Live; we were in Barcelona this past year. Again, Cisco is transforming to a next-generation set of networking capabilities while maintaining all the existing networks and all the security. I'm John Furrier, your host, with Stu Miniman, my co-host for the next three days. Our next guest is Arun Garg. Welcome to theCUBE. You're the Director of Product Management, Converged Infrastructure Group, at NetApp. >> Correct, thank you very much for having me on your show, and it's a pleasure to meet with you. >> One of the things that we've been covering a lot lately is NetApp's rise in the cloud. I mean, NetApp's been doing a lot of work in the cloud. I wrote stories back when Tom Georgens was the CEO, when Amazon first came on the scene. NetApp has been really into the cloud from the customer's standpoint, but now with storage and elastic resources and serverless, the customers are now startin' to be mindful. >> Absolutely. >> Of how to maximize the scale, and with All-Flash it's kind of a perfect storm. What are you guys up to? What's your core thing that you guys are talking about here at Cisco Live? >> So absolutely, thank you. So George Kurian, our CEO at NetApp, is very much taking us to the next generation and the cloud. Within that I take care of some of the expansion plans we have on FlexPod with Cisco, and in that we have two new things that we are announcing right now. One is the FlexPod for Healthcare. In FlexPod we've been doing horizontal applications so far, like the tier-one databases, as well as applications from Microsoft and virtual desktops. Now we are going vertical.
Within that, the first vertical application we're looking at is healthcare, so it's FlexPod for Healthcare. That's the first piece that we are addressing. >> What's the big update on FlexPod? Obviously FlexPod's been very successful. What's the modernization aspect of it? Cisco's CEO was onstage today talking about Cisco's value proposition, about the old ways now transitioning to a new network architecture in the modern era. What's the update on FlexPod? Take a minute to explain the cool, new things going on with FlexPod. >> Correct. So the All Flash FAS, which is the underlying technology driving the FlexPod, has really picked up over the last year. As customers keep wanting to improve their infrastructure with better latencies and better performance, All Flash FAS has driven even the FlexPod into the next generation. So that's the place where we are seeing double-digit growth over the last five quarters, consistently, in FlexPod. That's a very important development for us. We've also done more of the standard CVDs that we do, on SAP, and a few others are coming out. So those are all out there. Now we are going to make sure that all these assets can be consumed by the vertical industry in healthcare. And there's another solution we'll talk about, the managed private cloud on FlexPod. >> Yeah, Arun, I'd love to talk about the private cloud. So I think back to when Cisco launched UCS: it was the storage partners that really helped drive that modernization for virtualization. NetApp with FlexPod has been very successful over the years doing that. As we know, virtualization isn't enough to really be a private cloud, with all the things that Chuck Robbins is talking about onstage: how do I modernize, how do I get, you know, automation in there? So help us connect the dots as to how we get from a good virtualized platform to, I think you said, managed private cloud FlexPod with Cisco. >> Absolutely.
So everybody likes to consume a cloud. It's easy to consume a cloud: you go and you click on "I need a VM", small, medium, large, and you just want to see a dashboard showing how your VMs are doing. But in reality it's more difficult to build your own cloud. There's complexity associated with it. You need a service platform where you can raise a ticket, then you need an orchestration platform where you can set up the infrastructure, then you need a monitoring platform which will show you all the ways your infrastructure's working. You need a capacity planning tool. There are tens of tools that need to be integrated. So what we have done is we have partnered with some of the premium partners and some DSIs who have already built this. The risk of a customer standing up their private cloud infrastructure is minimized, and these partners also have a managed service. So when you combine the fact that you have a private cloud infrastructure in the software domain as well as a managed service, and you put it on the on-prem FlexPods that are already sold, then the customer benefits from having the best of both worlds: a cloud-like experience on their own premises. And that is what we are delivering with this FlexPod managed private cloud solution. >> Talk about the relationship with Cisco. So we're here at Cisco Live; you guys have a good relationship with Cisco. What should customers understand about the relationship? What are the top bullet points and value opportunities, and what's the impact for the customer? >> So for all these solutions we work very closely with the Cisco business unit, and we jointly develop them. Within that there's the BU-to-BU interaction, where the solution is developed and defined. There's a marketing-to-marketing interaction, where the collateral gets created and reviewed by both parties. So you will not see the FlexPod brand on something unless the two companies agree.
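Garg's point earlier about the tool chain, a ticket, then orchestration, then monitoring, is the core of what "managed private cloud" integrates. A minimal sketch of that request pipeline; the stage names and sizes are generic stand-ins for illustration, not any vendor's API:

```python
# Toy private-cloud request pipeline: a VM request passes through a
# ticketing step, an orchestration step, and lands in a monitoring/
# capacity view. Stage names are stand-ins for the integrated tools.

SIZES = {"small": 2, "medium": 4, "large": 8}  # vCPUs, illustrative

def request_vm(size, inventory):
    ticket = {"size": size, "vcpus": SIZES[size]}   # 1. service ticket
    vm_id = f"vm-{len(inventory) + 1}"              # 2. orchestration assigns the VM
    inventory[vm_id] = ticket                       # 3. monitoring/capacity view updated
    return vm_id

inventory = {}
vm = request_vm("medium", inventory)
print(vm, inventory[vm]["vcpus"])  # vm-1 4
```

The value of buying this pre-integrated, as described above, is that someone has already wired these stages together and tested them, so the customer gets the "click for a VM" experience without assembling the tens of tools themselves.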
>> It's tightly integrated. The sales teams are aligned; the marketing, the communications team, the channel partner team. That's the whole value the end customer gets, because when a partner goes to a high-end enterprise customer, he knows that both the Cisco and NetApp teams can be brought to the table for the customer, to showcase the value as well as help them through it all. >> Yeah, one of the other areas being talked about at this show is modernization. You talk about things like microservices. >> Yes. >> Containers are pretty important. How does that story of containerization fit into FlexPod? >> Absolutely. So containerization helps you with the cloud-native workloads, the mode-two workloads as Gartner calls them. What we do is we work with the Cisco teams, and we already had a CVD for a hybrid cloud with the Cisco CloudCenter platform, which came from the CliQr acquisition, and we showed a design with that. What we are now bringing to the table is the ability for our customers to benefit from a managed service on top of it. So that's the piece we are doing with the cloud teams. With the Cisco team, the ACI fabric is very important to them, so that ACI fabric is visible and shown in our designs, whether you do SAP, Oracle, VDI, basic infrastructure, the managed private cloud, or FlexPod for Healthcare. All of these have the core networking technologies from Cisco, as well as the cloud technologies from Cisco, in a form factor or in a manner that is easily consumable by our customers. >> Arun, talk about the customer use cases. You guys have a lot of customers together with Cisco doing some complex things with the technology, but for the customer out there that has not yet gone down the NetApp-Cisco route, what do they do? 'Cause a lot of storage guys are lookin' at All-Flash, so check, you guys have that.
They want great performance, check. But then they gotta integrate. So what do you say to the folks watching that aren't yet customers about what they should look at and evaluate vis-a-vis your opportunity with them and, say, the competition? >> So yes, there are customers who are doing all this as separate silos, but the advantage of taking a converged infrastructure approach is that you benefit from the years of experience that we have put in in our labs to architect this, make sure that everything is working correctly, and therefore it reduces their deployment time and reduces the risk. And if you want to be agile and faster even in the traditional infrastructure, while you're being asked to go to the cloud you can do it with our FlexPod design guides. If you want the cloud-like experience then you can do it with a managed private cloud solution on your premises. >> So they got options and they got flexibility on migrating to the cloud or architecting that. >> Yes. >> Okay, great, now I'm gonna ask you another question. This comes up a lot on theCUBE and certainly we see it in the industry. One of the trends is verticalization. >> Yes. >> So verticalization is not a new thing. Vertical industry, people go to market that way, they build products that are custom to verticals. But with cloud, one of the benefits of cloud and kind of a cloud operations is you have a horizontally scalable capability. So how do you guys look at that, because these verticals, they gotta get closer to the front lines and have apps that are customized. I mean data that's delivered fast to the app. How should verticals think about architecting storage to maintain the scale of being horizontally scalable but yet provide customization into the applications that might be unique to the vertical? >> Okay, so let me give a trend first and then I'll get to the specifics. So in the vertical industry, the next trend is industry clouds.
For example, you have healthcare clouds and you'll have clouds for specific industries. And the reason is because these industries have to keep their data on-prem. So data gravity has a lot of impact in all of these decisions. And the security of their data. So that is getting into industry-specific clouds. The second piece is analytics. So customers now are finding that data is valuable and the insights you can get from the data are actually more valuable. So what they want is the data on their premises, they want it all in their control so to speak, they want the ability to not only run their production applications but also the ability to run analytics on top of that. In the specific example for health care, when you have All Flash FAS it provides you a faster response for the patient because the physician is able to get the diagnostics done better if he has some kind of analytics helping him. [Interviewer] - Yeah. >> Plus the first piece I talked about, the rapid deployment, is very important because you want to get your infrastructure set up, so I can give an example on that too. >> Well before we get to the example, this is an important point because I think this is really the big megatrend. It's not really talked about much but it's really happening. What you just pointed out is that it's not just about speeds and feeds and IOPs; the performance criteria for the industry cloud include other new things like data, the role of data, what they're using for the application. >> Correct. >> So it's just you've gotta have table stakes of great, fast storage. >> Yes. >> But it's gotta be integrated into what is becoming a use case for the verticals. Did I get that right? >> Yes, absolutely. So I'll give two examples. One I can name the customer. So they'll be at our booth tomorrow, in a minute here. So LCMC Health, part of UMC, and they have the UMC Medical Center.
So when New Orleans had the Katrina disaster in Louisiana, they realized they needed a hospital, fast. And they decided on FlexPod because within three months, with the wire-once architecture and application, they could scale their whole IT data center for health care. So that has helped them tremendously to get it up and running. Second is with the All Flash FAS they're able to provide faster response to their customer. So that's a typical example that we see in these kinds of industries. >> Arun, thanks for coming on theCUBE. We really appreciate it. You guys are doing a great job. Following NetApp's recent success lately, as always, NetApp's always goin' to the next level. Quick question for you to end the segment. What's your take on Cisco Live this year? What's some of the vibe of the show? So I know it's day one, there's a lot more to come and you're just getting a sense of it. What's the vibe? What's coming out of the show this year? What's the big ah-ha? >> So I attended the keynote today and it was very interesting because Cisco has taken networking to the next level with intent-based networking, its data and analytics, where you can put a subscription model on all the pieces of the infrastructure networking. And that's exactly the same thing which NetApp is doing, where we are going up in the cloud with this subscription base. And when you add the two subscription bases, then for us, at least in the managed private cloud solution, we can provide the subscription base through the managed private cloud through our managed service provider. So knowing where the industry was going, knowing where Cisco was going and knowing where we want to go, we have come up with this solution which matches both these trends of Cisco as well as NetApp. >> And the number of connected devices going up every day. >> Yes. >> More network connections, more geo domains, it's complicated. >> It is complicated, but if you do it correctly we can help you find a way through it.
>> Arun, thank you for coming on theCUBE. I'm John Furrier here on theCUBE with Stu Miniman here with NetApp at Cisco Live 2018. Back with more live coverage after this short break. (upbeat music)

Published Date : Jun 11 2018


Caitlin Gordon, Dell EMC | Dell Technologies World 2018


 

>> Announcer: Live from Las Vegas, it's the Cube. Covering Dell Technologies World 2018. Brought to you by Dell EMC and its ecosystem partners. >> Well welcome back. Glad to have you live here on the Cube as we continue our coverage of Dell Technologies World 2018. We are live in Las Vegas. We're in the Sands Exposition Center. I'm with Keith Townsend who had a heck of a night last night. Just a good chicken-and-waffle Las Vegas night. >> You know what? One o'clock in the morning is chicken and waffles here in the Grand Lux, and the view of Venetian, I have to eat at Palazzo because the one in the Venetian closes at 11. >> Oh my, well you know how to live. You know how to live. And I've always said that about you. (laughs) It's a pleasure to welcome as our first guest of the day, Caitlin Gordon, who is the Director of Storage Marketing at Dell EMC. And good afternoon, Caitlin. Thanks for joining us. >> Thank you so much for having me. >> John: A Cube vet, right? You're a Cube veteran. >> I mean at three, is that like, are you over the hump as a veteran? >> John: Oh absolutely. >> All right, then yes, I'm in. >> You deserve a varsity letter now. >> Aw, do I get a letter jacket too? >> Well, we'll work on that later. We'll give you a Cube sticker for now, how about that? >> Okay, I'll take a sticker. >> All right, so you've given, you've launched I would say given birth, but you've launched a brand new product today, PowerMax. Tell us all about that. First off, paint us the big picture, and we'll drill down a little bit and find out what's so new about this. >> Yeah, absolutely. So hot off the presses. Announced just two hours ago in the keynote this morning. So PowerMax is, really, the future of storage. The way we're talking about it, it is fast. It is smart and it's efficient. So we could kind of go through each one of those, but the headline here, this is modern tier zero storage.
It's designed for traditional applications of today, but also next gen applications like real-time analytics. We have some metrics that show us that up to 70% of companies are going to have these mission-critical, real-time analytic workloads. And they're going to need a platform to support those and why shouldn't it be the same platform that they already have for those traditional workloads. >> So let's just go back. What makes it smarter? And what makes it more efficient? You know, what makes it faster? >> Caitlin: Can we start with fast? >> Yeah sure. >> Okay, that's my favorite one. So fast. I've got some good hero numbers for ya. So we'll start there. 10 million IOPS. That makes it the world's fastest storage array. Full stop. No caveats to that. 150 gigabytes a second throughput. We've got under 300 microseconds latency. That's up to 50% faster than what we already have with VMAX All Flash. So that's great. That's wicked fast, as Bob said, right? But how do we actually do that is a little bit more interesting. So the architecture behind that, it is a multi-controller, scale out architecture. Okay, that's good. That's check. You had a good start with that. But the next thing we did is we built that with end-to-end NVME. So end-to-end NVME means it's NVME-based drives, flash drives now, SCM drives, next generation media coming soon. It's also NVME over Fabric ready. So we're going to have a non-disruptive upgrade in the very near future to add support for NVME over Fabric. So that means you can get all the way from server across the network, to your storage array with NVME. It's really NVME done right. >> So let's talk about today. We've NVME, Fabric ready, which I love NVME over Fabric. Connectivity getting 10 million IOPS to the server in order to take care of that. What are the practical use cases for that much performance? What type of workloads are we seeing? 
>> Where we see this going in is to data centers where they want to consolidate all of their workloads, all of their practices, all of their processes, on a single platform. 10 million IOPS means you will never have to think about if that array can support that workload. You will be able to support everything. And again, traditional apps, but also these emerging apps, but also mainframe. IBM i, file, all on the same system. >> So can we talk about that as opposed to, let's say you even compare it to another Dell family technology. We just had the team Sean Amay and his VMware customer talking about SAP HANA on XtremIO. XtremIO is really great for one-to-one application mapping, so that's as SAP HANA. So are you telling me that PowerMax is positioned that I can run SAP HANA and in addition to my other data center workloads and get similar performance? >> Absolutely, it is the massive consolidator. It's kind of an app hoarder. You can put anything on it that you've got. And it's block, it's file, and then it's also got support for mainframe and IBM i, which there's still a significant amount of that out there. >> So that's an interesting thing. You're having all of these traditional data services. Usually when we see tier zero type of arrays, Dell EMC had one just last year, there's no services because you just, it's either go really fast or moderately fast and data services. How do you guys do that? >> Yeah well the benefit of where we're coming from is that we built this on the platform of the flagship storage array that's been leading the industry for decades. So what we did is we took the foundation of what we had with VMAX, and we built from that this end-to-end NVME PowerMax. So you get all of that best-in-class hardware, that optimized software, but it comes with all the data services. So you get six nines availability, best-in-class data protection, resiliency, everything that you'd need, so you never have to worry. 
So this is truly built for your mission-critical applications. >> Yeah, so really interesting speeds and feeds. Let's talk about managing this box. VMAX has come a long way from the Symmetrix days, so much easier to manage. However, we're worried today about data tiering, moving workloads from one area to another. These analytics workloads move fast. How does PowerMax help with day two operations? >> So you've heard the mention of autonomous infrastructure, right? Really PowerMax is autonomous storage. So what it has is a built-in, real-time, machine learning engine. And that's designed to use pattern recognition. It actually looks at the IOs and it can determine in sub-millisecond time what data is hot, what data should be living where, which data should be compressed. It can optimize the data placement. It can optimize the data reduction. And we see this as a critical enabler to actually leveraging next-generation media in the most effective way. We see some folks out there talking about SCM and using it more as a cache. We're going to have SCM in the array, side-by-side with Flash. Now we know that the price point on that when it comes out the door is going to be more than Flash. So how do you cost-effectively use that? You have a machine learning engine that can analyze that data set and automatically place the data on that when it gets hot or before it even gets hot, and then move it off it when it needs to. So you can put in just as much as you need and no more than that. >> So let's talk about scale. You know I'm a typical storage admin. I have my spreadsheet. I know what LUNs I map to what data and to what application. And I've statically managed this for the past 15 years. And it's served me well. How much better is PowerMax than this storage admin? I can move two or three data sets a day from cache to Flash. >> Really what this enables from a storage administrator perspective, you can focus on much more strategic initiatives.
You don't have to do the day-to-day management. You don't have to worry about what data's sitting where. You don't have to worry about how much of the different media types you've put into that array. You just deploy it and it manages itself. You can focus on more strategic tasks. The other part I wanted to mention is the fact that you heard Jeff mention this morning that we have CloudIQ in the portfolio. CloudIQ we're going to be bringing across the entire storage portfolio, including to PowerMax. So that will also really enable this Cloud-based monitoring and predictive analytics to really take that to the next level as well. Simplify that even more. >> You know, I'd like to step back to the journey. More or less. When you start out on a project like this and you're reinventing, right, in a way. Do you set, how do you set the specs? You just rattled off a really impressive array of capabilities. >> Caitlin: Yeah. >> Was that the initial goal line or how was that process, how do you manage that? How do you set those kinds of goals? And how do you get your teams to realize that kind of potential, and some people might look at you a little cross-eyed and say, are you kidding? >> Caitlin: Right, right. >> How are we going to get there? I don't know. (laughs) >> We always shoot for the moon. >> John: Right. >> So we always, this type of product takes well over a year to get into market. So you saw PowerMax Bob on stage there talking about it. So his team is the one that really brings this to market. They developed those requirements two years ago. And they were really looking to make sure that at this time, as soon as the technology curve was ready on NVME, we were there, right? So this is just shipping with enterprise class, dual port, NVME drives. Those were not ready until right now. Right, those boxes start shipping next week. They are ready next week, right? So we're at the cutting edge of that. And that takes an extraordinary world-class engineering team.
A product management team that understands our customers' requirements that we have today, 'cause we have thousands of customers, but more importantly is looking to what's also coming in the future. And then at some point in the process things do fall off, right? So we have even more coming in future releases as well. >> So let's talk connectivity into the box. How do I connect to this? Is this iSCSI, is this fiber channel? What connectivity-- >> So this is definitely fiber channel. And so our NVME over Fabric will be supported over fiber channel with this array. But we find, with our VMAX install base especially, they're very heavily invested in fiber channel today. So right now that's where we're still focused. 'Cause that's going to enable the most people to leverage it as quickly as possible. We're obviously looking at when it makes sense to have an IP-based protocol supported as well. >> So this storage is expensive on the back end. Talk to me about data efficiency, dedup, what are we coming out with? 'Cause a lot of these tier zero solutions don't have dedup out of the box. >> Or they have it, but if you use it you can't actually get the performance that you paid for, right? >> There's no point in turning it on.
The way we've implemented it, you can actually do both the data reduction and the data services you need, especially encryption. >> So before we say goodbye, I'm just, I'm curious, when you see something like this get launched, right. Huge project. Year-long as you've been saying. And even further back in the making. Just from a personal standpoint, you get pumped? Are you, I would imagine-- >> Caitlin: I got to tell ya-- >> This is the end of a really long road for you. >> We have been worked, for the marketing team, we've been working on this for months. It is the best product I've ever launched. It's the best team I've ever worked with. In the past two days since I landed here to getting that keynote out the door has been so much adrenaline, built up, that we're just so excited to get this out there and share it with customers. >> And what's this done to the bar in your mind? Because you were here, now you're here. But tell me about this. What have you jumped over in your mind? >> We have set a very high bar. I'm not really sure what we're going to do at this point, right? From a product standpoint it is in a class by itself. There is just nothing else like it And from an overall what the team has delivered, from engineering all the way from my team, what we've brought together, what we've gotten from the executive, we've never done anything like it before. So we've set a high bar for ourselves, but we've jumped over some high bars before. So we've got some other plans in the future. >> I'm sorry go ahead. >> Let's not end the conversation too quickly. >> All right, all right, sure, all right. >> There is some-- >> He's got some burning questions. >> Yeah, I have burning, this is a big product. So I still have a lot of questions from a customer perspective. Let's talk data protection. You can't have mission-critical all this consolidation without data protection. >> Caitlin: Absolutely. >> What are the data protection features of the PowerMax? 
>> I'm so glad you asked. I spent a decade in data protection. It is a passionate topic of mine, right? So you look at data protection and kind of think of it as layered, within the array, so we have very efficient snapshot technology. You can take as many snaps as you need. Very, very efficient to take those. They don't take any extra space when you make those copies. >> Then can I use those as tertiary copies to actually perform, to point to workloads such as refreshing, QA, dev, et cetera? >> Yeah, absolutely. You can mount those snapshots and leverage those for any type of use case. So it's not just for data protection. It's absolutely for active use as well. So that's kind of the on-the-array level, and then the next level out is okay, how do I make a copy of that off the array? So the first one would be, well, do that to another PowerMax. So as you probably know, the VMAX really pioneered the entire primary storage replication concept. So we have certainly async if you have a longer distance, but synchronous replication, but also Metro, if you have that truly active-active use case, so, truly the gold standard in replication technologies. And our customers, it's one of the number one reasons why they say there is no other platform on the planet that they would ever use. And then, you go to the next level of, we're really talking about backup. We have built into PowerMax the capabilities to do a direct backup from PowerMax to a Data Domain. And that gets you that second protection copy also on protection storage. So you have those multiple layers of protection. All the copies across all of the different places to ensure that you have that operational recovery, disaster recovery in that array, and that the data's accessible at all times no matter what the scenario. >> So let's talk about what else we see.
When we look at it, we go into our data center and you see a VMAX array, there's a big box with cabinets of shelves, and you're thinking, wow, this thing is rock solid. Look at the PowerMax. That thing is what about a six-- >> Caitlin: I think it's pretty cute, right? >> Yeah it's pretty cute. I love, that's a pretty array. (laughs) >> Yeah. >> You have one over there. So when you see a VMAX, it just gives you this feeling of comfort. PowerMax, let's talk about resiliency. Do we still have that same VMAX, rock solid, you go into a data center and you see two VMAX, and you're thinking this company's never going to go down. >> Caitlin: Right. >> What about PowerMax? >> Guess what? It is the same system. It's just a lot more compact. We have people consolidating from either VMAXs or competitive arrays, but they're in four racks and they come down into maybe half a rack. But you have all the same operating system, all the same data services, so you have non-disruptive upgrades. If you have to do a code upgrade across the whole array at the same time. You don't have to do rolling reboots of all the controllers. You can just upgrade that all at the same time. We have component-level fault isolation. So if a component fails, the whole controller doesn't go down. All you lose is that one little component on there until you're able to swap that out. So you have all of the resiliency that over six nines availability built into this array. Just like you did with the ones that used to be taking up a bit more floor tile space. The PowerMax is about 40% lower power consumption than you have with VMAX All Flash 'cause it can be supported in such a small footprint. >> So are we going to see PowerMax and converge system configurations? >> Yeah, absolutely. So if you're familiar with the VxBlock 1000, which we launched back in February, it will be available in a VxBlock 1000. And of course the big news on that is you have the flexibility to really choose any array. 
So it could be an X2 and a PowerMax in a VxBlock 1000. >> So that's curious. What is the, now that we have PowerMax, where's the position of the VMAX 250? >> So the, I'm glad you asked, 'cause it's an important thing to remember. VMAX All Flash is absolutely still around and we expect people to buy it for a good amount of time. The main reason being that the applications, the workloads, the customers, the data centers, that are buying these arrays, they have a very strict qualification policy. They take six, nine months, sometimes a year, to really qualify, even a new operating system. >> Keith: Right. >> Let alone a new platform. So we absolutely will be selling a lot of VMAX All Flash for the foreseeable future. >> Well, Caitlin, it's been a long time in the making, right? >> Absolutely. >> Huge day for you. >> Yes. >> So congratulations on that. >> Thank you, thank you. >> Great to have you here on the Cube. And best of luck, I'm sure, well you don't need it. Like I said, superior product, great start. And I wish you all the best down the road. >> Thank you. Hope to see you guys again soon. >> Caitlin Gordon. Now that'd be four. >> Yes, it'd be four. >> We'd love to have you back. Caitlin Gordon joining us from Dell EMC. PowerMax, the big launch coming just a couple hours ago here at Dell Technologies World 2018. Back with more live coverage here on the Cube after this short time out. (upbeat music)
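The autonomous placement Gordon describes earlier in this segment — a real-time engine that scores how hot each piece of data is and keeps the hottest set on the scarce, pricier SCM while everything else stays on flash — can be sketched as a toy policy. Everything here (names, capacities, the decay constant) is illustrative, not PowerMax's actual machine learning engine:

```python
from collections import defaultdict

SCM_CAPACITY_EXTENTS = 2  # illustrative: SCM is small and expensive, so only the hottest extents fit

class TieringEngine:
    """Toy heat-based placement: hot extents go to SCM, the rest stay on flash."""

    def __init__(self, decay=0.5):
        self.heat = defaultdict(float)  # extent name -> recency-weighted IO count
        self.decay = decay

    def record_io(self, extent):
        self.heat[extent] += 1.0

    def tick(self):
        # Age the counters so recent activity dominates old hot spots.
        for extent in self.heat:
            self.heat[extent] *= self.decay

    def placement(self):
        # Rank extents by heat and put only the top few on SCM.
        ranked = sorted(self.heat, key=self.heat.get, reverse=True)
        hot = set(ranked[:SCM_CAPACITY_EXTENTS])
        return {e: ("SCM" if e in hot else "FLASH") for e in ranked}

engine = TieringEngine()
for e in ["db-log", "db-log", "db-log", "vdi-gold", "vdi-gold", "archive"]:
    engine.record_io(e)
print(engine.placement())  # db-log and vdi-gold land on SCM; archive stays on flash
```

A production engine would also weigh migration cost and media wear before moving anything; the decay step is what keeps a stale hot spot from pinning SCM capacity forever.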

Published Date : May 1 2018

Chhandomay Mandal, Dell EMC | Dell Technologies World 2018


 

>> Announcer: Live from Las Vegas. It's theCube! Covering Dell Technologies World 2018. Brought to you by Dell EMC and its ecosystem partners. >> Welcome back to theCube's coverage of Day One of Dell Technologies World. I'm Lisa Martin with Dave Vellante in Las Vegas. Excited to welcome back to theCube one of our alumni Chhandomay Mandal, the Director of Marketing at Dell EMC. Chhandomay, nice to see you again. >> Happy to be here. >> We had an exciting keynote this morning, Michael Dell was talking about number one in market share for servers and storage, expecting, when the 2018 first calendar quarter numbers come out, to gain share. What's going on with storage with All-Flash? >> We are excited about our storage All-Flash portfolio. We are going to have a couple of surprising announcements tomorrow, I cannot give away all of this. But all of our portfolio is going to continue to innovate based on all the things Michael touched upon, ranging from artificial intelligence, machine learning, all of those things. We have a complete portfolio of All-Flash products covering different market segments, customers. Ranging from VMAX All-Flash, XtremIO, Unity, SC Series. So we are really excited about the pace of innovation we are doing, the way we are capturing the market. So it's a great time to be in All-Flash storage. >> Chhandomay, I wonder if we can talk about how we got here. So the first modern instantiation of Flash, and there were a lot of SSD's and battery backed up memories in the past, but it was, I think it was EMC, dropped a flash drive into a Symmetrix way back when, and that began to change things. But people soon realized, the controller architecture's not going to support that, so we need All-Flash architectures. And then people quickly realized, oh wow, it's taken us decades to build this rich stack of services. Now fast forward basically a decade plus, where are we today in terms of All-Flash capabilities and adoption?
>> In the enterprises today, you see All-Flash getting adopted at a very high rate. In fact, of the storage that we ship, almost 80% of it is All-Flash storage, and again, we have different products for different segments. And as you mentioned, we started from dropping SSDs into the enterprise arrays and worked the whole thing through the process. Now if you look at us, we have modern purpose-built All-Flash arrays like XtremIO, and then All-Flash arrays like VMAX All-Flash, and some announcements where you are going to see the maturity built up over the last decade: all the data services that got brought in, and the very high performance, low latency with mission-critical availability that we are able to deliver across the platform for all of our enterprise products. >> So Flash everywhere. And then we've made the observation a lot that, and it sounds trite, but I'll put it out there anyway: historically, when you think about storage, it was all about persisting data. And you'd try to make it go as fast as you could, but it was mechanical. Now with Flash, it's all about doing stuff faster, real-time, low latency, massive IOPS; we're shifting the bottlenecks around. What's your take on that dynamic? >> Flash is a fast medium, so having great performance is really, it will stay. That is not really the differentiator so to speak, but it needs to be coupled with advanced data services. You need to have very high resiliency. The customers can rely on you with five nines, six nines of availability day in and day out. As well as, you need to deliver the business solutions, transforming IT, helping businesses transform in their digital transformation process. Let me give you some quick examples. Let's take XtremIO, for example. It started out as a purpose-built, modern, leading All-Flash array. And it is built upon a unique architecture taking advantage of flash media.
It is a content-aware, metadata-centric, active-active controller architecture that helps us deliver very high performance, hundreds of thousands to millions of IOPS, with very low, consistent latency, no matter how much you have written to it, what workloads you are running, what the system load is, etc. But again, that's the first layer. The second layer of it is the advanced data services, always-on inline data reduction. So for example, inline deduplication and compression, making sure we are not writing duplicate data to the SSDs, thereby increasing the longevity of the SSD media as well as reducing the capacity footprint and driving down costs. Speaking of that, you wrap it around into a very simple, modern UI that's very easy to manage. No tuning needed. That's where today's IT could go from the tactical day-to-day operations to strategic innovations: how they can do the IT transformation, get into the digital transformation, get ahead of their competition, not only today but for tomorrow. >> And the content awareness and the metadata-centricity are what you just explained? Is that right? Can you connect those? >> Uh, sure. Suppose the data is being written, right? It might have duplicate data. Say, for example, you are running a VDI environment, right? For your tens of thousands of users, everybody has their Windows VM, probably the same data across all the laptops. When you look at it in the XtremIO metadata-centric, always-in-memory architecture, the request comes in, you try to look it up. Now, your metadata is always in memory, and you are doing data reduction based on a unique fingerprinting algorithm, checking whether you have seen the data before. If you haven't seen the data before, only then do you write it, doing the other data services on top of it. But if you have seen the data before, then you update the metadata in memory and acknowledge the write.
You get very fast write performance that is actually at memory speed, not even at SSD speed. So this metadata-centric architecture that has all the metadata in memory all the time helps you accelerate the process, especially in the case where a lot of duplicate data is present. >> It's at memory speed? Because you somehow eliminated an IO? Or is that NVMe? Or, or..? >> When you access data, right? An application says, I want to access block XYZ. Any controller will need to have the metadata for it, and then based on the metadata it needs to do the access. It's like when you go to a library: you want to find a book on a bookshelf. First you need to know the call number, and then based on the call number, which shelf, which rack, you go and fetch it. Storage controllers of every type work the same way. If you cannot have your metadata in memory, then the first step the controller has to do is go down to the array, fetch the metadata, and then based on the metadata you fetch the data and serve the IO request. If you have the metadata always in memory, then that step is eliminated. You can guarantee that your metadata is there, and all you need to do is look it up and serve the IO request. That's the key to delivering consistent performance. Okay? In other arrays, if the metadata is not in memory, you'll not get that consistency. But here we can deliver, day in and day out, 90% full or 10% full, whether it's OLTP or VDI, that high performance with very minimal latency. That's the key here. >> High performance, low latency. You've given us some really good overview into the potential the technology has to help IT innovate. And as Michael Dell even said this morning, IT innovation is key. IT can be a profit center of an organization, really a catalyst for digital transformation. Talk to us about some of the business benefits.
That if a business is really wrapping their head around IT as a profit center and as a driver of business strategy, what are some of the business benefits that an All-Flash array can deliver to an organization? Any examples come to mind? >> Yes, I'll answer your question with one of the customer examples. Let's see how they have been doing it. It's my favorite example: the Boston Red Sox. I'm from the Boston area. >> You're a fan, right? >> Absolutely. All the Boston sports teams. When the Boston Red Sox were on their digital transformation journey, they had to transform a lot of things. First of all, the experiences of the spectators like us, who are in the field living in the moment, whether it's the jumbotrons or getting the experience digitally on their smartphones. That's one aspect. The other aspect is there are a lot of analytics on all the players across MLB, to get a competitive advantage in terms of which pitcher or which batter has what capabilities or deficiencies, so that they can go after the right player, or when they are up against them, know how to take advantage of them. And then there are a lot of business applications in a virtualized environment. As you look, it ranges from a better spectator experience, to the coaches getting a competitive advantage over the opposing players, to the scouting department, and running the general back-office applications, like Exchange and (mumbling), whatever the need might be. Now, they were able to consolidate all of these things onto the XtremIO All-Flash array platform. And the ability to deliver this performance, as well as getting a data reduction of almost seven to one, was key for the Red Sox's digital transformation journey. >> So the business impact, to Lisa's point, is lower cost obviously, simpler management. But also faster time to result? How did they turn that into a competitive advantage? >> If they could run... Those analytics previously used to take ten hours. Now they can do it in two hours.
That's an 80% faster turnaround time, right? Previously they could support 10,000 spectators on one particular wireless network; now they have 80,000. It's the experience that's transformative for folks who are enjoying the game. It's the number of applications they are running. It's how they are running them. They're viewing IT as a strategic investment, as opposed to something that's just needed to run the operations. >> Well, baseball games are like five hours now, so you can even do an in-game analysis at that speed. How about the data services? When Flash first came out, All-Flash architectures were not very rich in terms of data services. That's evolved. I mean, the industry in general, and Dell EMC specifically, has put a lot of effort into that. Maybe you could describe some of the data services. What do we mean by data services? Let's talk about copy services, migration services, snapshotting, etc. What are the important ones that we should know about? >> The important data services are thin provisioning, the data reduction technologies, deduplication, compression. Then you have your data protection in the form of various types of RAID technologies. The most important ones I'll call out are how mature your snapshot services are, as well as what you can do for your data protection, business continuity, and disaster recovery. Those are very critical for any business that needs to rely upon having their systems up and running 365 days, 24/7. Having those types of data services is key. And not only having them, but also having the maturity. For example, taking VMAX All-Flash in this particular case, right? It's built upon two (mumbling) of reliability, where SRDF is the gold standard in the industry in terms of resiliency, right? Six nines of availability. Those... Somebody coming up with a brand new array on day one cannot have it. We have seen that evolution with folks who originally had very fast storage, but then there were no data services, right?
It's the evolution of having the performance as well as the right data services. That helps the customer transform their journey, both in terms of modernizing the IT infrastructure as well as having the digital transformation to be competitive today and tomorrow. >> And the positioning of XtremIO, just to clarify for our audience, 'cause you've got All-Flash VMAX, you've got XtremIO. It's really... It's the high end of the midrange. Is that how we should think about that? >> We have a lot of... As you said, the VMAX All-Flash and XtremIO are all important, and effectively we have the portfolio because with one product you cannot solve each and every customer's needs. So picking on your very specific example, XtremIO is great for mixed-workload consolidation, virtualized applications, VDI, as well as situations where you have lots of copies. So for example, you have a database, you need to create (mumbling) copies. You have copies for your backup, sandboxing. In these types of scenarios XtremIO is extremely good, and that's kind of the sweet spot. We are going to... We are having new XtremIO X-Bricks that are at an even lower price point than the previous generation. Literally a 55% better entry price point. Now these enterprise-plus capabilities of XtremIO will also be available in the mid-market, at a mid-range price. >> Well, Chhandomay, thanks so much for stopping by, and not only expanding on the customer awards that we saw this morning by sharing with us the impact that the Boston Red Sox were making, but also sharing with us what's new with XtremIO and All-Flash. >> Thank you. >> And speaking between two Bostonians... >> Big night tonight. You got the Bruins. We got the Celtics. The Red Sox take a back seat for a while. But they'll be back. >> We want to thank you for watching theCUBE. We are live at Day One of Dell Technologies World. I'm Lisa Martin with Dave Vellante. Thanks for watching. Stick around, we'll be right back after a short break.
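The inline deduplication write path and the always-in-memory metadata lookup described in the interview can be sketched in a few lines. This is a toy model, not XtremIO's actual implementation: the class name, the use of SHA-1 fingerprints over whole blocks, and the Python dicts standing in for controller memory and SSD capacity are all illustrative assumptions.

```python
import hashlib


class DedupStore:
    """Toy model of an inline-deduplicating array with all metadata in memory."""

    def __init__(self):
        self.block_by_fingerprint = {}  # fingerprint -> physical block (stand-in for SSD)
        self.addr_to_fingerprint = {}   # logical address -> fingerprint (in-memory metadata)
        self.refcount = {}              # fingerprint -> number of logical references
        self.physical_writes = 0        # how many blocks actually hit the "SSD"

    def write(self, addr, data: bytes):
        # Fingerprint the incoming data and check whether we've seen it before.
        fp = hashlib.sha1(data).hexdigest()
        if fp not in self.block_by_fingerprint:
            # Unseen data: only now does a physical write happen.
            self.block_by_fingerprint[fp] = data
            self.refcount[fp] = 0
            self.physical_writes += 1
        # Seen before (or just stored): update in-memory metadata and acknowledge.
        old = self.addr_to_fingerprint.get(addr)
        if old is not None:
            self.refcount[old] -= 1
        self.addr_to_fingerprint[addr] = fp
        self.refcount[fp] += 1

    def read(self, addr) -> bytes:
        # Metadata is already in memory, so no extra trip to fetch it first.
        fp = self.addr_to_fingerprint[addr]
        return self.block_by_fingerprint[fp]
```

In a VDI-like scenario where thousands of logical blocks hold identical data (everybody's Windows VM image), only one physical write reaches the "SSD"; that is the effect behind data-reduction ratios like the almost-seven-to-one figure in the Red Sox example.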

Published Date : Apr 30 2018



Bob Wambach, Dell EMC | VMworld 2017


 

(upbeat music) >> Narrator: Live from Las Vegas, it's theCUBE covering VMWorld 2017, brought to you by VMWare and its ecosystem partners. >> Welcome back to VMWorld 2017 everybody. This is theCube, the leader in live tech coverage. My name is Dave Vellante, and I'm with my co-host, Peter Burris. Bob Wambach is here. He's the Vice President of Marketing for Converged Platforms and Solutions at Dell EMC. Bob, good to see you again. >> Good to see you, guys. Always a pleasure. >> It's been a good week, you guys have had a lot going on. We were at the Influencer reception last night. Great shindig, thank you for that. >> Peter: Very much. >> Lot of momentum in this ecosystem: VMWare, financials are looking good. We just had Pat Gelsinger on, he has a spring in his step. What's going on from your perspective? >> You know, I see the spring in Pat's step, and I look at it and, you know, I know the stock's up, everything's going great for them, but what I really see is the plan they've put in place, right? And this is a long time coming. If you remember last year, Pat was talking about, it's a multi-cloud world, right? And everything VMWare has been doing for the last couple of years has been leading up to some of these announcements that you're seeing now. So I see a guy who's really happy because they made some big bets, had a plan, and the bets are paying off. And most of the benefit is actually going to be in the future. And as you see, Michael's looking pretty happy too this week, right? (laughter) So I think if you heard Pat in the opening keynote, one of the things that struck me is he said we're going from data centers to centers of data. And it's really recognizing that there's this explosion of data going on and this data has to be handled in a different fashion, and that's a cloud operating model. The cloud's an operating model, not a place, and it's a multi-cloud world out there.
So, you look at most large companies, maybe they have Concur, they have ADP, they have Salesforce.com. There are multiple SaaS providers that they have, and then they use on-premises equipment, and they want to cloud-ify that, right? It's how do I get to, I've got my own journey to cloud. Our job is to really help them both on their journey for on-premises equipment, but then, working with VMWare, working with Pivotal, making it easy to utilize and navigate the multi-cloud world as well. >> So, we've been talking all week, Peter is really sort of driving our research at Wikibon, helping us think through the customer implications, and one of the things we've been talking about all week is the reality of that data and not being able to move that data into the cloud, bringing that cloud operating model, as you were just pointing out, to the data. But, the implication there, as you've talked about many times Peter, is you've got to have the simplicity and other attributes of the cloud in order to make that brand promise come true, what we call true private cloud. So, what are you guys doing in that regard to achieve that vision? >> First, it's listening. Michael Dell likes to say, and he says it very frequently, that we have big ears. Our job is to really listen to customers, understand their business. You need to understand their business, and then once you understand their business, you better know how to help them. And, there are also preferences. They've got capex versus opex preferences. They're going to make decisions on on-premises versus off-premises based upon data gravity, based upon governance, based upon SLAs, latency. All these things that have to do with the characteristics of the data; data movement. And, then you have a, there's actually a preference for, I want to build it myself. Or, I'm actually very focused on my business and I'd like to be nearly out of the IT business. So, we look at this, everybody's a builder, you're a builder at some level.
If you are a builder down at the component level, where you want to pick your servers, you're going to pick vSAN. Then we have our Ready portfolio. vSAN Ready Nodes cover that, right? It's the easiest way to buy vSAN in a PowerEdge server. And, if you start going up the stack and you want that packaged with software, we have Ready bundles. And then we start moving into where people are realizing, I don't add a lot of value to the business by putting together pieces of hardware and software. So, I want to rely on Dell EMC to do some of that for us. That's where our VxRail, VxRack, VxBlock comes in. Where we own the engineering, manufacturing, management, support, sustaining of that. All the lifecycle assurance, single-contact support. That's from us. Then there are customers further up that say, well, I want a stack, a software stack. We increasingly see that the world's evolving into, sometimes people refer to it as stack wars. And VMWare is doing exceptionally well in the stack wars. They're very prevalent on premises and now they also have the integrations with the Googles, with AWS, with IBM Cloud. Our announcement this week about the Ready system is taking Dell EMC's expertise in hyper-converged infrastructure, which we co-engineered, co-developed with VMWare, and VMWare taking the lead on how you package up vSphere, NSX, and vSAN together with vRealize. They control the roadmap for that, they know how to do the lifecycle automation updates, so what we do is we provide the hyper-converged infrastructure, and it's actually a simple overall environment for customers when they combine these. Michael has talked about peanut butter and chocolate a couple of times, and that's really how I think about the Ready systems.
That's VMWare; for Pivotal, we'll also have a Pivotal Ready system that can give you either Pivotal Cloud Foundry, the easiest way to get a Pivotal Cloud Foundry environment on our hyper-converged infrastructure, or the Pivotal Container Services, PKS, on hyper-converged infrastructure. >> So Bob, you mentioned early on, having different overview of the portfolio, you mentioned early on that VMWare had a plan, and they've been executing against that plan. But, you've also got a plan within the hyper-converged team, within the whole enterprise cloud team. So, software and hardware are once again co-mingled in ways that they haven't been for a long time. The kind of normal separation: you just get the hardware and then you get the software. But, now we're seeing that, because of the complexities of trying to bring all this together. Talk a little bit about how you're influencing the VMWare plan and how the VMWare plan is influencing the hardware side of things. >> You know, it's a great question. I think there's been a great learning experience. As you know, for several years we've had Enterprise Hybrid Cloud. Enterprise Hybrid Cloud started with a request from customers to make it easier to create a full cloud. People were realizing, I've been trying to build my cloud. It's super hard. I actually don't want to spend my best people and my time and money on this. So, Enterprise Hybrid Cloud initially started working with some very large enterprises. And, it was a way to take any type of converged or hyper-converged infrastructure and bring the whole VMWare portfolio to market as a full turnkey system. Full stop, it's: we own it, we will make this stuff work. So, the goodness there is that the customers would get something that was incredibly rich, and remember this, a lot of this started out on converged infrastructure, so you're basing it on a SAN fabric: VMAX All-Flash, XtremIO, Data Domain.
So you have all the flexibility and options of the data services, rich data services and data protection. Now, it turns out Enterprise Hybrid Cloud is really, really hard, right? We don't have magic software to do this. There are hundreds of people that are making all this stuff work so that when it goes into these large enterprises it adapts to their environment and it's very reliable, robust, scalable, flexible. The other side of the coin is, it takes so long to test and QA the new VMWare, perfectly fine, very solid VMWare features, that they don't show up to market for a long time. The largest enterprises understand this, but for many customers, you end up having this misalignment, where VMWare's saying, "I want you to take these features now," and we're saying, "That's six months away in Enterprise Hybrid Cloud." So, what you've seen develop in the Ready systems is a perfect example of this: if we constrain down, for most people, most people are not the largest banks in the world, the largest pharmas, or governments. Hyper-converged infrastructure is ready for the vast majority of workloads today, and they need a pretty well-defined set of features and functionality. So, VMWare more takes the lead on, this is how we're going to package these up. This is our software suite. We know how to do lifecycle. Together, you work on the hyper-converged infrastructure, which is also co-developed with them. And, it ends up being a very good path to get these into the hands of many more customers. We're talking 10x the customers, if you think about the hundreds of customers that are likely EHC, Enterprise Hybrid Cloud candidates, versus the many thousands that are VMWare Ready system candidates. So, I think it's a great example of how we work together to figure out what is the sweet spot for volume and velocity, of being able to provide value very quickly to the largest number of customers.
>> So, we had Chad on theCube yesterday, and Dave and I asked him a series of questions, and one of them was, so tell us about how the cloud experience is going to manifest itself through Dell EMC products. One of the things he said was, in anticipation of these cloud wars, or in these platform wars, I think was his term, that increasingly it is going to be about how well you bind between different clouds. Interesting, I was walking through the show earlier and I saw one of our big user clients and I stopped and said hi to him. And, the two things that he mentioned when I asked him what he's looking for are, one, he used the same word, bind: how well does this bind to that, tell me about how your platform is going to bind to other platforms. And, automation was the second one. He said, I want to see, increasingly we're going to bring new technology in based on its demonstrable automated characteristics. What do you think about that, as you think about building platforms and how the portfolio is going to evolve against those two dimensions: the ability to bind things better and the ability to automate things more? >> Right, so, I think it's spot on, first of all. And, if we look at two different use cases. The one use case of most customers today, VMWare customers, they're using the VMWare suite, the environment, on premises. VMWare actually now binds those to AWS, to IBM Cloud, to Google Cloud. And, for me the killer app is NSX, right? If you think about, you want to traverse, navigate these different clouds. You want to do it securely, protected, with segmentation and all of the richness of security and control over that. NSX is really the way to do that. When we talk about automation, VMWare is the best company to take the lead in how to automate binding that together. So, whereas in the past, with Enterprise Hybrid Cloud, we, and that continues to go on, we did all the automation, there's a much more efficient path for most customers with VMWare doing that.
And, Enterprise Hybrid Cloud still remains the realm of, I'm going to say, hundreds of customers where these are huge deals. These are $50 million and up deals. Where you're providing incredible value all in, for all their different applications, right? And, most, you know, the vast majority of customers today are clearly not on hyper-converged infrastructure, but they could be, and the value prop is so compelling, it's so compelling that it's definitely, that's where things are going. So, we look at where things are going and try to optimize for that. Pivotal Cloud Foundry is also something that, in my view, binds the developer environment together. You develop it once and then you can publish it wherever you want. So there is a strategy within the Dell Technologies companies to work together to do this, and the more we work together, another great thing happens: your field teams end up being aligned and telling the same story. So, whereas with Enterprise Hybrid Cloud we would have inherent conflict. Because we'd be speaking about the virtues of Enterprise Hybrid Cloud, but VMWare is telling them you need these new features, right? And this is where, when that little friction goes away and you have full alignment, so we're all on the same page, we're all saying the same things, it's far more credible. >> Well, it also accelerates the customer. >> Bob: It sure does. >> And, I think that's probably one of the most important things. At the end of the day, it's to get the customers going. >> Yeah, we've got to wrap, but somebody said the other day that VMWare is moving at the speed of the CIO. Robin Matlock said today, yeah, but the CIO has to move faster, but it's hard. So, you're right, you're trying to accelerate that. And, I guess to my last point, when you were talking about, we've been talking about, forming the cloud model to your business, when you were describing sort of what you do for Enterprise Hybrid Cloud, that's not a trivial exercise.
It requires a lot of expertise and a lot of process, and a lot of good thinking. >> Right, and it is very, it's by definition customizable. You end up doing something different for every customer. Whereas, Ready, the Ready solutions portfolio, I think, is going to be huge. Just huge in the coming year. And the whole idea is to make it easy. It's ready for wherever you are on this journey. If you are ready for more of a, I want to jump into cloud and I see this path, I'm ready to move, then it's Ready Systems, right? If you are more of a, I want to put the software elements together myself and build that, then we have Ready bundles. And, high-performance computing has been huge for us. Data analytics, increasingly I think those are connected together. So, there's synergy between the two of them. Then, the Ready nodes, for people who are, I really want to build this stuff myself, this is the path that I'm going down. And it takes all of the, we have an opinion, right? Our opinion is we want you moving quickly because we see the customers benefiting from it. Ultimately, all our customers are trying to be very competitive and successful at whatever their mission is, and we know the further up the stack you go, we can help you be more competitive. But, it takes the conflict out of the relationship when they know that I can help you wherever you are, we have something that is right for you. >> Alright, we've got to wrap. Thanks, Bob, for coming on. Taking you on a journey through VMworld 2017. Bob Wambach, thanks for coming back in theCube. >> Thanks. >> You're welcome. Keep right there, buddy. We'll be back with our next guest. This is theCube. We're live from VMworld 2017. Be right back. (exciting music)

Published Date : Aug 30 2017



Erik Kaulberg, Infinidat & Jason Chamiak, Peak 10 + ViaWest | VMworld 2017


 

>> Announcer: Live from Las Vegas, it's The Cube covering VMworld 2017 brought to you by VMware and its ecosystem partners. (electronic music) >> Okay, welcome back everyone. Live here, day three coverage, I'm John Furrier, Dave Vellante. VMworld 2017, we're in the VM village for wall to wall coverage of VMworld. Our next two guests, Erik Kaulberg who's the senior director of cloud solutions and Jason Chamiak who's the senior systems engineer of Peak 10. Guys, welcome back. Infinidat, you guys are doing great. >> Absolutely, it's been a wonderful year for us. >> We were just talking on camera, got surprised when we kind of went live. Day three, and we were just talking about Infinidat's history and the growth you guys have and just kind of the DNA of the company, how you guys attack the accounts and then kind of profile storage guys you go after and you're disruptive but you're not doing anything super-radical technically, you just come in blocking and tackling with storage solutions for big industrial clients. Give us the update. >> Absolutely, I mean I'd say that the disruption is in two areas. One, it's in how we're approaching the clients and where we're going in the data center. Most typical disruptors would start at the edge and eventually get to the core but Infinidat's modus operandi, from day one, was let's start in the core and then broaden the aperture so we're out there displacing VMAX, we're out there displacing legacy storage arrays that are used for Tier 1 workloads from day one and that strategy has worked out great for us with 260% year over year growth just this past quarter. It's been a wild ride. >> So one of the things that people may or may not know is that this whole scene here at VMworld is all about disruption, oh, the computer industry's thrown upside down. You guys have a very simple approach, come in and just get a better price performance, more bang for the buck if you will, but really deliver some of that core storage.
Can you just take a minute to elaborate on that specific point? >> Absolutely, so the story line is really about commodity hardware paired with awesome software that makes all the difference versus the traditional architectures. So what we do with our combination of flash and DRAM and high-capacity hard drives allows us to make sure that the workloads are in the right place at the right time all the time and that means something transformational for our large-scale clients. And the challenge that we see is, versus all the other startups in this space or the smaller companies in this space, that ultimately you have real challenges doing that at scale unless you have the intelligence and the expertise that our three generations of storage leadership have really brought together. >> So Jason, I wonder if we can bring Peak 10 and ViaWest, recent merger, but bring you into the conversation. Maybe talk about, briefly, your company and your role. >> Yeah, sure, so Peak 10 and ViaWest are a hybrid IT company. We specialize in colocation and cloud services and we package that in with managed and professional services. We were looking for a way to consolidate a bunch of the dedicated client arrays that we had out there and we needed a good shared solution that offered high performance that we could throw a bunch of different workloads onto. We evaluated a bunch of flash arrays and other hybrid arrays and Infinidat just happened to outperform pretty much everything that we benchmarked. >> And your role is to look after that infrastructure? >> Yeah, so currently, we have 11 InfiniBox arrays ranging from the 1000 series up to the 6000. We have about four petabytes of physical space and almost 10 petabytes of virtual space.
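As an aside, the capacity figures Jason quotes imply a thin-provisioning overcommit ratio of roughly 2.5x. A trivial sketch of that arithmetic, using the numbers as stated in the interview:

```python
# Thin-provisioning overcommit implied by the capacities quoted above:
# ~4 PB physical backing ~10 PB of provisioned (virtual) space.
physical_pb = 4.0
virtual_pb = 10.0

overcommit = virtual_pb / physical_pb
print(f"overcommit: {overcommit:.1f}x")  # → overcommit: 2.5x
```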
>> So, before we get into the environment, we want to do that, what are the, I mean, as a service provider, obviously, SLAs are super important, you're merging companies so you got a bunch of different infrastructure, you're going to have to deal with that down the road. But like a lot of service providers, you mentioned sort of you wanted to consolidate things, you are probably servicing different workloads with different types of infrastructure but what are the big drivers in your business? You know, cloud obviously, the big wave is here, what are the things that are driving your business that affect IT specifically? >> So one of the things is we want our clients to be able to get to market faster. So, with the InfiniBox, the implementation and configuration of it is extremely simplified over some of the other storage products that we've used in the past. So we're able to get our clients up to speed, they start to use the infrastructure sooner and the performance benefit is amazing. We've actually had testimonials from clients that have put their workload that they had residing on other vendor products, as soon as we put them on, even a shared InfiniBox, not even a dedicated but a shared InfiniBox with other workloads running, they've seen as much as a 500% to 800% improvement in application performance. >> So, paint a picture of your environment, at least the part that you're responsible and have visibility on. What's it look like? I mean, kind of workloads, servers, storage capacities, I mean, whatever you feel comfortable sharing. >> Yeah, sure. So I work on the platform engineering team and we're responsible for the infrastructure and code that make up our client center cloud offering and that is based on VMware and the InfiniBox. So we have a mixed workload. We have clients that have physical servers connecting that run Oracle RAC installations.
They'll have Hadoop clusters, large SQL servers, whether that's normal OLTP or analytical workloads in addition to large and small VMware deployments. And we just run that all together on the same unit and there's no hotspots. >> Dave: Are you virtualizing RAC? >> I don't believe so, we may have some. >> Dave: But it's possible, and common, that people do? >> Yeah, I can tell you we do have some virtualized SQL server clusters out there along with physical, you name it, we have it out there. >> Okay, so take us back to pre-Infinidat. What was life like? What was the conversation like with Infinidat? You know, small company comes in knocking at your door, hey, I got an array to sell you. Take us through that story. >> We ended up with, like I mentioned before, we ended up with a lot of dedicated arrays for clients. I think, at one point, we were over 70 dedicated arrays. >> Dave: 70? >> Yeah. So that becomes kind of a management nightmare when it comes to patching and things like that. But even before we get to how we got that many, for each individual client, we try and talk to them, take a look at their workload and then from that, we would have to model what kind of RAID groups we need, how many disks within those RAID groups, so there was a lot of consulting time involved in getting the correct configuration for them. Moving to the InfiniBox, we don't have that problem. We don't have an option to do different types of RAID groups, everything just works within the infrastructure that's there. So we've saved a ton of time having to do all that consulting work beforehand and that also adds to, you know, quicker time to market for our clients.
>> Time, there's less time for deployment configuration. We spend less time looking at performance problems so we have more time to focus on the more important things. We do a lot of monitoring and things like that for these arrays now, we do trending and everything. We have time to actually put forth for creating those scripts and those infrastructures. >> So can you talk about performance? I mean, Erik, you could maybe address this too. Infinidat has basically said, look, you don't need an all-flash array, we can deliver a little bit of flash and a lot of spinning disk and work our algorithmic magic and deliver better performance than an all-flash array. Am I summarizing your point of view correctly? >> You got it, exactly. I mean, we would say that the all-flash array movement is great for certain workloads but by and large, for the 80, 90% of common data center environments, it's just a way to make storage expensive again. (laughing) >> Hear, hear, come to the party. And so Jason, from your experience, can you talk about the performance, did you look at other all-flash alternatives or other alternatives to Infinidat? >> Yeah, so we actually started looking at all-flash arrays to start off with because we knew that, with a cloud type infrastructure, we're going to be putting all these varied workloads on there. And we tested several flash arrays, we benchmarked those when we get them in, and we actually saw more consistent and better performance across all those workloads from the InfiniBox. And, as you know, with the flash, you pay a lot for a much smaller amount of capacity so that was a problem too. So, from a cost perspective and performance perspective, the InfiniBox pretty much beat out all the competitors. >> I'm sorry if I missed this, how much capacity are you managing? >> So, right now, we have four petabytes of physical, about 10 petabytes of virtual. >> And how many people manage that?
>> Probably just a handful of people and it's basically set it and forget it. >> So it's arms and legs? You know, like constantly tuning and... >> Yeah, we don't have to do any of that stuff, it's optimized from the start. >> And that was obviously different prior to the installation of Infinidat or? >> Yeah, before, there was a lot of, you know, like I said, tweaking of disk configurations and storage pools and cache settings and things like that so there was a lot more hand-holding. >> So, what'd you do with all that time that freed up? I mean, what did you do with that labor resource? Where did you point it? >> We put that into our analytics and monitoring platform on the backend so we create a lot of scripts to help us kind of trend capacity and performance for the InfiniBox arrays. >> Erik, I want to ask you the final question for me. The story I'm hearing at VMworld is that as you do more of these projects, some of the costs kind of add up. Where are you guys seeing kind of the opportunity to come in, stabilize operations from storage to endpoint, free up that time, that's always a great value proposition, reduce steps and save time and money. But where is the action happening where the costs start to get out of control, when people start thinking about true private cloud, hybrid cloud, where's the hotspots that customers should look at saying, if you don't be careful, that's going to blow out of control in terms of costs. >> I personally think it's all about scale at some level. Whether you're thinking about a large-scale public cloud deployment or whether you're thinking about going from five all-flash arrays to 50, let's say, that's when the cumulative costs grow at an exponential rate. And that's the opportunity for companies like Infinidat, successfully bringing these multi-petabyte architectures to fruition while managing all the labor costs and all the implementation costs and operational costs.
>> So vSAN's been growing like crazy, for instance, let's take that as an example. Those things can add up in price. How do you guys compare to, say, vSAN? >> So, head-to-head against vSAN at scale, there is no comparison frankly. Whether you're looking at-- >> John: You guys benefit over them or? >> Yeah, definitely us over them. When we look at multi-petabyte scale deployments of which there are relatively few in the market today, you have so much investment. One customer quoted $12 million to do what Infinidat could do for $2 million comparing against the vSAN base. >> I'm kind of skeptical on those numbers, I'd like to see, that's a huge delta so we'll have to kind of follow up on that. >> Erik: You'll have to see it to believe it. >> I mean, that's a $10 million savings. >> Erik: Absolutely. >> You're saying that you guys, it's going to save $10 million off the vSAN number. >> In terms of TCO, when you look at, again, it's not the cost of the hardware or even necessarily the software so much but it's the cost of the implementation, it's the opportunity cost versus all of the innovation, like he was mentioning previously, that really eats into the overall budget-- >> Okay, so let's go to the customers, okay, so that's a good value proposition, puts a stake in the ground, good order of magnitude in terms of solar system of value, right, two versus 12, that's significant. How does that play out in reality when you think about those kinds of numbers? Where's that saving coming from? Just the box deployment, the consolidation, where's that coming from? >> It's pretty much all over. So, part of the cost savings that we have too is once you have a large number of individual arrays, you've got to re-up on maintenance costs and things like that. So we're able to have a much lower number of arrays to service that same workload. We've saved there, we save on man-hours for configuration, for performance troubleshooting and things like that.
So across the board, we're saving on time for our employees. >> John: Awesome, Erik, Jason, thanks so much for sharing. Bold statement, huge stake in the ground. Good job you guys are aggressive and hey, lower prices and potential performance is what people want so congratulations Infinidat. Here inside The Cube I'm John Furrier, Dave Vellante, back with more live coverage, day three of three days of coverage after this short break. Back from VMworld 2017. (electronic music)

Published Date : Aug 30 2017


Bill Philbin, HPE - HPE Discover 2017


 

>> Announcer: Live from Las Vegas, it's theCUBE. Covering HPE Discover 2017. Brought to you by Hewlett-Packard Enterprise. >> Okay, welcome back everyone. We're here live in Las Vegas for HPE, Hewlett-Packard Enterprise, Discover 2017. I'm John Furrier, co-host of theCUBE with Dave Vellante, and our next guest is Bill Philbin, who's the general manager of storage and big data for Hewlett-Packard Enterprise. Bill, welcome to theCUBE. Again, good to see you. I think you've been on since 2012, '13, '15. >> Is that right? What, are we carbon dating ourselves now or something? >> We've been tracking our CUBE alumni, but you're heading up the storage business-- >> Do I get a pen? >> We're working on that, Jerry Chen-- >> Seven of them. >> Jerry Chen at Greylock wants to have, now, badge values. So, welcome back. >> Thank you, thank you for having me. >> You were just on theCUBE at VeeamON, which is an event Dave was hosting, I missed it in New Orleans. But a lot of stuff going on around storage, certainly. Virtualization has been around for a while, but now with Cloud; whole new ballgame. Programmable infrastructure, hybrid IT, Wikibon's true private Cloud report came out showing that private Cloud on-prem is a $250 billion market. So nothing's really changing radically in the enterprise, per se, certainly maybe servers and storage, but people got to store their data. >> Bill: That's right. >> What's the update from your perspective, what's the story here at HPE Discover? >> So I think there's really three things we're talking about amongst a number of announcements. One is sort of the extension of our All Flash environment for customers, who, as I was saying at Veeam, have the always-on new world order. We expect everything to be available at a moment's notice, so I was in the middle of the Indian Ocean, using Google Voice over satellite IP on the boat, talking to San Jose, and it worked.
That's an always-on environment, and the best way to get that is, you know, with an All Flash [unknown], so that's number one. Number two, going back to the story about programmable infrastructures, storage also needs to be programmable, and so, if you've had Rick Lewis on, or Rick Lewis is coming, he'll talk about composable infrastructures with Synergy, but the flip side of that is our belief that storage really needs to be invisible. And the acquisition of Nimble gets us a lot closer to sort of doing that in the same way that a safe self-driving car is all the rage. All that rich telemetry comes back, it's analyzed, fingerprinted, and sent out to customers to a point where it's, I call it the Rule of 85. 85% of the customers, the cases are raised by InfoSight and closed by InfoSight, and they have an 85 net promoter score. We're getting to a point where storage can be invisible, cause that's the experience you get on Amazon or as you swipe your credit card, say I want ten terabytes of storage, and that's the last time you have to think about it. We need to have the economics of the web, we need to have the programmability of the web, that's number two, and number three of what we talked about, and this is a big issue, a big thing we talked about with VeeamON, was data protection. The rules of data protection are also changing. Conventional backup does not protect data. I was with a customer a couple weeks ago in London. 120 petabytes; this is a financial services customer now. 120 petabytes of storage: not unusual. 40 of it was Hadoop, and they were surprised because it's unprotected, it's on servers, it's sort of the age of the client-server, and the age of Excel spreadsheets all over again. We realized that most businesses were running on Excel, so All Flash, a different way of supporting our customer support experience, and number three, it's all around how do you protect your data differently.
>> What's the big trend from your standpoint, because a lot of that self-driving storage concept, or self-driving car analogy, it speaks to simplicity and automation. >> That's right. >> The other thing that's going on is data is becoming more and more relevant, certainly in the Cloud. Whether that's a data protection impact or having data availability for Cloud-native apps, or in memory, or all kinds of cool stuff going on. So you got a lot of stuff happening, so to be invisible, and be programmable, customers' architectures are changing. What's the big trend that you're seeing from a customer standpoint? Are there new ways to lay out storage so that they can be invisible? Certainly a lot of people were looking at their simplification in IT operationally, and then have to prepare for the Cloud, whether that's Multicloud or hybrid or true private Cloud. What architectures are you seeing changing, what are people doubling down on, what's the big trends in storage, kind of laying out storage as a strategy? >> So I think the thing about storage in the large, one of the trends obviously that we're seeing is sort of storage co-located with the server. When I started at HP now seven years ago, gen six to gen ten, which we've announced here at this show, the amount of locally attached storage in the box itself is massive. And then the applications are now becoming more and more responsible for data placement, and data replication. And so, even while capacities are growing, I think six or seven percent is what I saw from the latest IDC survey, the actual storage landscape, from a shared storage company, they're actually going down. And the reason is, application provisioning, application-aware storage is really the trend, that's sort of number one. Number two, you see customers looking at deploying the right storage for the right applications.
Hyperconverged with SimpliVity is a really good example of that, which is they're trying to find the right sort of storage to sort of serve up the right application. And that's where, if you're a single-point provider company now in storage, and you don't have a software-only, a hyperconverged, an All Flash in a couple different flavors, including XP at the top, you're going to find it very, very difficult to sort of continue to compete in this market, and frankly, we're driving a lot of that consolidation, we put some bookends around what we're prepared to pay for. But if you're a point-providing storage company now? Life is a lot harder for you than it was a couple years ago. When we started with All Flash, I think it was like 94 All Flash companies. There are not 94 All Flash companies today. And so, I think that's sort of what we see. >> Well, to your point about point companies are going to have a hard time remaining independent, and that's why a lot of 'em are in business to basically sell to a company like yours, cause they fill a need. So my question relates to R&D strategy. As the GM, relatively new GM, you know well that a large company like HPE has to participate in multiple markets, and in order to expand your TAM, you have to have the right product at the right time. One size does not fit all. So the Nimble acquisition brings in a capability at the lower end of the market, lower price bands, but it also has some unique attributes with regard to the way it uses data and analytics. You've got 3PAR, legendary at the high end. What's the strategy in terms of, and is there one, to bring the best of both of those worlds together, or is it sort of let 20 flowers bloom? >> So, I don't know if it's going to be 'let 20 flowers bloom', but I would probably answer a couple different ways. One is that InfoSight, you're right, is unique value proposition, is part of Nimble.
I would bet if I come see you in Madrid, if you have me back for the, whatever, 13th time, [Laughing] that we'll be talking about how InfoSight and 3PAR can come together. So that's sort of the answer to number one. The answer to number two is, even though within the Nimble acquisition, one party acquired the other party, what we're really looking at is the best breed of both organizations. Whether that's a process, a person, a technology, we don't feel wedded to, "Just because we do it a certain way at HP, that means the Nimble team must conform." It's really, "Bring us the best and brightest." That's what we got. At the end of the day, we got a company, we got revenue, but we got the people, and in this storage business, these are serial entrepreneurs who have actually developed a product, we want to keep those people, and the way you do that is you bring 'em in and you use the best and greatest of all the technologies. There's probably other optimizations we'll look at, but looking at InfoSight across the entire portfolio, and one day maybe across the server portfolio, is the right thing to do. >> And just to follow up on that, Tom, if I may, so that's a hard core of sort of embedded technology, and then you've got a capability, we talk about the API economy all the time. How are you, and are you able to leverage other HPE activities to create infrastructure as code, specifically within the storage group? >> So if you look at us, at our converged systems appliances like our SAP HANA appliance, databases greater than six terabytes, we have 85% market share at Hewlett-Packard. And the way we do that, and that's all on 3PAR by the way, and the way we do that is we've got a fixed system that is designed solely to deliver HANA. On the flip of that, you have Synergy, which is a composable programmable infrastructure from the start, where it's all template-based and based on application provisioning. 
You provision storage, you provision the fabric, you provision compute. That programmable infrastructure also is supported by HP storage. And so, you have-- You can roll it the way you want to, and to some degree I think it's all about choice. If you want to go alone, and build your own programmable infrastructure in OpenStack or vCloud Director, whatever it is, we have one of those. If you think simplicity is key, and app and server integration is important part of how you want to roll it out, we have one of those, that's called SimpliVity. If you want a traditional shared storage environment, we have one of those in 3PAR and Nimble, and if you want composable we have that. Now, choice means more than one, I don't know what it means in Latin or Italian, but I'm pretty sure choice means more than one. What we don't want to do is introduce, however, the complexity of what owning more than one is. And that's where things like Synergy make sense, or federation between StoreVirtual and 3PAR, and soon we'll have federation between Nimble and 3PAR. So to help customers with that operational complexity problem, but we actually believe that choice is the most important thing we can provide our customers. >> I've always been a big fan of that compose thing, going back a couple years when you guys came and brought it out to the market. We're first, by the way, props to HP, also first on converged infrastructure way back in the day. I got to ask you, one of the things I love doing with theCUBE interviews is that we get to kind of get inspiration around some of the things that you're working on in your business unit. Back in 2010, Dave and I really kind of saw storage move from being boring storage, provisioning storage, to really the center of the action, and really since 2010 you've seen storage really at the center of all these converging trends.
Virtualization, and hyperconvergence, all this great stuff, now Cloud, so storage is kind of like the center point of all the action, so I got to ask you the question on virtualization, certainly changed the game with storage. Containerization is also changing the game, so I was telling some HP Labs guys last night that I've been looking at provisioning containers in microseconds. Where virtualization is extending and continuing to have a nice run, on the heels of that we got containerization, where apps are going to start working with storage. What's your vision and how do you guys look at that trend? How are you riding that next wave? >> It all comes down to an application-driven approach. As we were saying a little earlier, our view is that storage will be silent. You're going to provision an application. That's really the-- see, look at the difference between us and, let's say, Nutanix with SimpliVity. It's all about the application being provisioned into the hyperconverged environment. And if you look at the virtualization business alone, VMware's going to have a tough go because Hyper-V has actually gotten good enough, and it's cheaper, but people are really giving Hyper-V a much better look than we've seen over the course of the last couple years. But guess what? That tool will commoditize, and the next commoditization point is going to be containers. Our vantage point, and if you look at 3PAR, you look at Nimble, we've already got it, we've already supported containers within the product, we've actually invested in companies that are container-rich. I think it's all about, "What's the next--" >> And we at DockerCon last year said, "We know you're partnering with all the guys." But this is a big wave. You see containers as-- >> I see containers as sort of the place that virtualization sort of didn't ever get to. If you look at-- >> John: Well, the apps. >> On the apps absolutely, positively.
And also it's a much simpler way to deploy an application over a conventional VM. I think containers will be important. Is it going to be as important as the technology inflection point around All Flash? >> John: Flash is certainly very-- >> That I don't know, but I think as far as limiting costs in your datacenter, making it easier to deploy your applications, et cetera, I think containers is the one. >> What's the big news here, at HPE Discover 2017, for you guys? What's the story that you're telling, what's going on in the booth? Share some insight into what's happening here on the ground in Las Vegas from your standpoint. >> So I would say a couple of things. I think if you look out on the show floor, it seems more intimate and smaller this year. And there's a lot of concern, I think, that HP is chopping itself off into various pieces and parts, but I think the story that maybe we're not telling well enough, or that it gets missed, is out of that is actually a brand new company called Hewlett-Packard Enterprise, which is uniquely focused on serving enterprise infrastructure customers. And so I think, if I was going to encourage a news story, it's about the phoenix of that, and not the fact that we've taken the ES guys, and the software guys, and the PC guys. It's that company, maybe in Madrid we'll do this, and that company, that's really, really, really exciting. And as you said, storage; sort of in a Ptolemy versus Galileo approach. We believe everything, first of all, revolves around storage. We don't believe in Galileo. So if you look in here at the booth, we've announced the next generation of MSA platforms, the 2052, we've got the 9450 3PAR -- three times as fast, more connectivity for All Flash solutions. We've talked about the secondary Flash array for Nimble, most effective place to protect your data is on an array, is on a tier where the data came from, and that is the secondary Flash market.
We're big into cloud; we've talked about CloudBank here, which is the ability to keep a copy of your StoreOnce data in any S3-compliant interface, including Scality. I don't know if I'm forgetting -- I'm sure I'm forgetting something. >> John: There's a lot there. >> There's a lot there. >> I mean, you guys -- I love your angle on the phoenix. We've been seeing that -- we've been covering this seven years now -- and it is a phoenix. And the point that I think the news media is not getting on HP -- there's a lot of FUD out there -- is that this is not a divestiture strategy. There are some things that went away, like the outsourcing business, but that was just natural. This is HP-owned; it's not that you're getting out of that, it's just how you're organizing it. >> And with a balance sheet that now is really a competitive weapon, if you will, you're going to see HP both grow organically and inorganically. And as the market continues to consolidate, the thing to remember also is there are fewer places to consolidate to. So if you're a start-up, there's a handful of companies that you can go to now, and probably the best-equipped, right-sized company, with a great balance sheet -- a great company -- is Hewlett-Packard Enterprise. >> Well, we had hoped to get Chris Hsu on; maybe another day we'll have the debates on management style. But I've always been a big believer, as a computer science undergraduate, that a decoupled, highly cohesive strategy is a really viable one. I think that's a great one. >> Yeah, and there's still a good partnership with DXC, and there'll be a great partnership with Micro Focus, both financially as well as from a business perspective. But it's really an opportunity to focus, and if I was at another company, I would wonder whether their strategy continues to be appropriate.
>> Bill Philbin, Senior Vice President and General Manager of Storage and Big Data at Hewlett-Packard Enterprise. theCUBE has more live coverage after this short break. From Las Vegas, HPE Discover 2017, I'm John Furrier with Dave Vellante, with theCUBE. We'll be right back after this short break.
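Philbin's point that "storage will be silent -- you're going to provision an application" describes an application-driven provisioning model: the admin declares what the app needs, and the platform, not the admin, picks the backing storage. The sketch below is a hypothetical illustration of that idea; the names (`AppProfile`, `provision`, the tier labels) are invented for this example and are not an HPE API.

```python
# Hypothetical sketch of application-driven, "silent" storage provisioning:
# the caller describes the application; the platform maps it to a tier.
# All names here are illustrative, not a real HPE interface.

from dataclasses import dataclass

@dataclass
class AppProfile:
    name: str
    capacity_gb: int
    latency_sensitive: bool  # e.g. a transactional database vs. a backup target

def provision(app: AppProfile) -> dict:
    """Map an application profile to a storage tier. The caller never names
    an array, a LUN, or a RAID level -- the storage stays 'silent'."""
    tier = "all-flash" if app.latency_sensitive else "secondary-flash"
    return {"app": app.name, "tier": tier, "capacity_gb": app.capacity_gb}

volume = provision(AppProfile("orders-db", capacity_gb=500, latency_sensitive=True))
print(volume)  # {'app': 'orders-db', 'tier': 'all-flash', 'capacity_gb': 500}
```

The design point is the inversion: in the conventional model the storage admin carves out a LUN first; here the application profile drives the placement, which is what makes the storage layer invisible to the person deploying the app.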

Published Date : Jun 6 2017

SUMMARY :

Brought to you by Hewlett-Packard Enterprise.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave Vellante | PERSON | 0.99+
Dave | PERSON | 0.99+
John | PERSON | 0.99+
Bill Philbin | PERSON | 0.99+
Jerry Chen | PERSON | 0.99+
New Orleans | LOCATION | 0.99+
London | LOCATION | 0.99+
Hewlett-Packard Enterprise | ORGANIZATION | 0.99+
Rick Lewis | PERSON | 0.99+
six | QUANTITY | 0.99+
John Furrier | PERSON | 0.99+
HP | ORGANIZATION | 0.99+
Nimble | ORGANIZATION | 0.99+
Madrid | LOCATION | 0.99+
Hewlett-Packard | ORGANIZATION | 0.99+
2010 | DATE | 0.99+
Excel | TITLE | 0.99+
Las Vegas | LOCATION | 0.99+
Chris Hsu | PERSON | 0.99+
Bill | PERSON | 0.99+
$250 billion | QUANTITY | 0.99+
Hewlett-Packard Enterprises | ORGANIZATION | 0.99+
PoINT | ORGANIZATION | 0.99+
HPE | ORGANIZATION | 0.99+
Tom | PERSON | 0.99+
85% | QUANTITY | 0.99+
San Jose | LOCATION | 0.99+
Indian Ocean | LOCATION | 0.99+
CUBE | ORGANIZATION | 0.99+
last year | DATE | 0.99+
DXC | ORGANIZATION | 0.99+
InfoSight | ORGANIZATION | 0.99+
Amazon | ORGANIZATION | 0.99+
Micro Focus | ORGANIZATION | 0.99+
Dacron | ORGANIZATION | 0.99+
2052 | DATE | 0.99+
Seven | QUANTITY | 0.99+
seven years | QUANTITY | 0.99+
2012 | DATE | 0.99+
HANA | TITLE | 0.99+
120 petabytes | QUANTITY | 0.99+
both | QUANTITY | 0.99+
Wikibond | ORGANIZATION | 0.99+
40 | QUANTITY | 0.99+
seven percent | QUANTITY | 0.99+
seven years ago | DATE | 0.98+
One size | QUANTITY | 0.98+
85. 85% | QUANTITY | 0.98+
All Flash | ORGANIZATION | 0.98+
One | QUANTITY | 0.98+
both organizations | QUANTITY | 0.98+
first | QUANTITY | 0.98+
13th time | QUANTITY | 0.98+
ten terabytes | QUANTITY | 0.98+
VeeamON | ORGANIZATION | 0.98+
one | QUANTITY | 0.98+
one party | QUANTITY | 0.97+
HPE Discover | ORGANIZATION | 0.97+
HP Labs | ORGANIZATION | 0.97+
20 flowers | QUANTITY | 0.97+

Eric Herzog, IBM Storage - #VMworld - #theCUBE


 

>> Announcer: From the Mandalay Bay Convention Center in Las Vegas, it's theCUBE, covering VMworld 2016. Brought to you by VMware and its ecosystem sponsors. Now, here are your hosts, John Furrier and John Walls. >> Well, welcome back to Mandalay Bay here at VMworld. Along with John Furrier, I'm John Walls. Glad to be with you here on theCUBE as we continue our coverage of what's happening at VMworld, an exclusive broadcast, a partner here for the show. And along with John, we're joined by Eric Herzog, the Vice President of Product Marketing and Management at IBM Storage. And Eric, I just found out you're one of the all-time 10 most popular CUBE guests -- most prolific. Congratulations. >> Well, thank you. We always love coming to theCUBE. It's always energizing. >> You love controversy. >> And I love controversy, and you get down to the heart of it. You're the hard copy of high tech. >> And we could probably mark each of your appearances by the Hawaiian shirt, I think. What do you think? >> Either the Hawaiian shirt or one of my luggage, sure. We could trace those back. >> First off, the vibe about the show. I mean, just your thoughts about it -- we've been here for three, four days now -- just your general feel about the messaging here, what's actually being conveyed, and the enthusiasm out on the show floor. >> Well, it's pretty clear that the world has gone cloud. The world is doing cognitive and big data analytics, and VMware is leading that charge. They're a strong partner of IBM; we do a lot of things with them, both with our cloud division and our storage division, and VMware is a very strong partner of IBM. We have all kinds of integration in our storage technology products with VAAI, with VASA, with vCenter Ops, all the various product lines that VMware offers. And the key thing is, everyone wants to go to the cloud, so working with IBM and VMware together makes it easier and easier for customers, whether it be the small shop, Herzog's Bar and Grill, or whether it be the giant Fortune 500 global entity working with us, to get to the cloud sooner, faster, and have a better cloud experience. >> So you've got everybody on cloud and virtualization, big themes, big topics. So why does storage still matter? >> Well, the big thing is, if you're going to go to a cloud infrastructure and you're going to run everything on the cloud, think of storage as the solid foundation. It has to be rock solid. It has to be highly resilient. It has to be able to handle error codes and error messaging, and things failing and things falling off the earth. At the same time, it needs to be incredibly fast, which is where things like all-flash arrays come in, and even flexible, so things like software-defined storage. So think of storage as the critical foundation underneath any cloud or virtualized environment. If you don't have a strong storage foundation with great resiliency, great availability, great serviceability, and great performance, your cloud or your virtual infrastructure is going to be mediocre, and that's a very generous term. >> So, controversially speaking, to get to the controversy: the whole complexity around converged infrastructure and hyperconverged, or whatever the customers are deploying for compute, they're putting the storage close to that, whether it's SaaS in the cloud, which is basically a data center that no one knows the address of, as we were saying. Storage always has to sit somewhere. What are the key trends right now for you? Because software is leading the way; IBM has been doing a lot of work in software, I know, and we've been covering you guys -- we'll be at IBM Edge coming up shortly, in a couple weeks. Where's the innovation on the storage side for you guys? And how do you talk to the customer base to say, okay, I've got some SaaS options now for backup and recovery -- we heard one of your partners earlier talking about that -- where is the physical storage innovation? Is it in the software? What are your thoughts? >> So we have a couple paths of innovation. For us, first is software-defined storage. Several of the analyst firms have named us the number one software-defined storage company in the world for several years in a row now. Software-defined storage gives you a flexible infrastructure. You don't have to buy any of the underlying media or the underlying array controller from us; just buy our software, and then you can put it on anybody else's hardware you want. You can work with your cloud provider, with your reseller, with your distributor. Enterprises can create their own cloud. Software-defined storage gives you a wide swath of storage functionality: backup, archive, primary store, grid scale-out, software only. Ultimate flexibility. So that's one area of innovation. The second is all-flash. All-flash is not expensive, essentially. I love old Schwarzenegger movies. In the 1980s it was all about tape: he was a spy, and they'd show what is supposedly the CIA, all tape. Mid-'90s Schwarzenegger, another spy movie, they show a data center, all hard drive arrays. Now, in the next Schwarzenegger movie, hopefully it'll be all-flash arrays from IBM in the background. Flash is just an evolution. >> And the Hawaiian shirts keep coming. >> I keep swapping them -- I get one from Maui, one from Kauai, one from the Big Island. So flash is where it's at from a system-level perspective. So you've got that innovation, and then you've got converged infrastructure, as you mentioned, where you get the server, the storage, the networking, and the VMware hypervisor all packaged up. So we have a product called the VersaStack that we do jointly with Cisco and VMware. We were late to market on that, we freely admit that, but just to give you an idea: in the first half of this year, we have done almost 2x what we did in the entire year of 2015. So that's another growth engine. Particularly, cloud service providers love to get these pre-canned, pre-racked VersaStacks and deploy them, and a number of our public references are cloud service providers, both big and small. Essentially they wheel in a VersaStack when they need it, wheel in another one pre-configured and ready to go, and they're up and running quickly. So those are three trends. >> We just had a client on, Scott Equipment, out of Monroe, Louisiana, that went to the VersaStack and was singing your praises -- a great example of medium-sized, small-sized businesses. We keep thinking about enterprises and all this, and it doesn't have to be the case. There are services you're providing to companies of all sizes that are gaining new efficiencies. >> Everybody needs storage, and you think about it as really, how do you want to consume the storage? In a smaller shop, you may choose one way. So VersaStack is converged infrastructure. Our software-defined storage, like Spectrum Accelerate and Spectrum Virtualize, is a software-only model. Several of the products, like Spectrum Accelerate and Spectrum Protect, are available through SoftLayer or other clouds, so you can consume them as a cloud entity. So whether you want to consume it on-premises as software only, as a full array, a fully integrated stack, or a cloud configuration, we offer any way in which you want to eat that cake: big cake, small cake, fruit cake, chocolate cake, vanilla cake. We've got cake for whatever you need, and we can cover every base with that. >> A good point about the diversity of choices, from tape to flash, and the multi-integrated VersaStack -- a lot of different choices. I want to ask you, with that kind of array of options, how you view the competitive strategy for IBM in storage. I know you're a wrestler, so is there a judo move on the competition? How would you talk about your differentiation? How do you choke-hold the competition? >> Well, a couple of ways. First, from a technical perspective, by leading with software-defined storage, and we are unmatched in that capacity, according to the industry analysts, in what we do. And we have it in all areas: in block storage, we've got scale-out file storage and scale-out big data analytics, we've got backup, we've got archive. Almost no one has that panoply of offerings in the software-defined space, and you don't need to buy the hardware from us; you can buy it from our competitors. >> Two things I hear: software, and then the all-flash arrays. What specifically in the software are you guys leading in, where you're unmatched, as you said? >> Well, Spectrum Protect has been a leader in the enterprise for years. Spectrum Scale is approaching 5,000 customers now, and we have customers close to an exabyte in production -- a single customer with an exabyte, pretty incredible -- for big data analytic workloads around astronomic research. So for us, it's all about the application, workload, and use case. Part of the reason we have a broad offering is that anyone who comes in here, sits in front of you guys, and says my array or my software will do everything for you is smoking something that's not legal. It's just not true. >> Maybe in Colorado. >> Yeah, okay, maybe. But the reality is, workloads, applications, and use cases vary dramatically. Let's take an easy example. We have multiple all-flash arrays. Why do we have multiple all-flash arrays? A, we have a version for mainframe attach. Everyone there wants six or seven 9s -- guess what, we can provide that. It's as expensive as anything is at six or seven 9s, but now they can get all-flash performance on the mainframe and the upper end of the Linux world; that's what you would consume there. At the other end, we have our Storwize V5030F, which can be had, at as low a street price as eighteen thousand dollars, as an all-flash array to get started -- basically the same price as a hard drive array -- and it has all the enterprise data services: snapshot, replication, data encryption at rest, migration capability, tiering capability. It's basically what a hard drive array used to cost, so why not go all-flash? >> Let's talk about the evolution of IBM storage. IBM was a leader in storage in the beginning, but there was a period of time there -- and Dave and I have talked on theCUBE about this -- where storage lost its way; EMC took a lot of share. But there's been a huge investment in storage over the past, I'd say, maybe five years in particular, maybe the past three specifically. I think over a billion dollars has been spent -- we've had Jamie Thomas and a variety of folks on from IBM. What's the update? Take a minute to explain how IBM has regained its mojo in storage. Where did that come from? Just add some color to that, because I think that's something where people go, hmm, I expect great things from IBM, but they didn't always have it in storage. >> So as you know, IBM invented the hard drive and essentially created the storage industry. So saying that we lost our mojo is a fair statement -- but boy, do we have it back. >> Explain. >> So the first thing is, in this cloud and analytic, cognitive era, you need a solid foundation of storage, and IBM has publicly talked about the future of the world being around cloud and cognitive infrastructure, cognitive applications. So if your storage is not the best from an availability perspective and from a performance perspective, then the reality is, the cloud and cognitive that you're trying to do is basically going to suck. So in order to have the cloud and cognitive, you need this underlying infrastructure that's rock solid. Quite honestly, as you mentioned, Dave, we've actually invested over three and a half billion dollars in the last three years, not to mention we bought a company called Texas Memory Systems, which is the grandfather of our FlashSystems, before that. So we've invested well over three billion dollars. We've also made a number of executive hires. Ed Walsh just joined us -- CEO of several startups, former general manager from EMC. I myself was a Senior Vice President at EMC. We just hired a new VP of Sales. >> They're serious. You guys are serious. You guys are all-in: investing, bringing on the right team, focusing on applications, workloads, and use cases. >> As much as I love storage, most CEOs hate it. There's almost no CIO who was ever a storage guy; they're all app guys. You've got to talk their lingo: application, workload, and use case -- how the storage enables the availability of those apps, workloads, and use cases, and how it gives them the right performance to meet their SLAs to the business guys. >> What's interesting -- I want to highlight that, because I think it's a good point people might not know -- is that having just good storage in and of itself was an old, siloed model. But now -- and we cover all the IBM events: World of Watson, Insights, Edge, and InterConnect, the cloud show -- cognitive is front and center. There's absolutely a moonshot mandate from IBM to be number one in cognitive computing, which means big data analytics integrated at the application level, obviously Bluemix in the cloud. Philip Blank was here on stage talking about IBM Cloud and the relationship with VMware. So that fails if it doesn't have good storage and doesn't perform well -- and latency matters, right? I mean, data matters. >> Let me add a couple things there. First of all, absolutely correct, but the other thing is, we actually have cognitive storage. We automate processes automatically -- for example, tiering your data. Some of our competitors have tiering; most of them tier only within their own box. We can tier not only within our own box, but from our box to EMC, our box to NetApp, our box to HP, HP to Dell, Dell to Hitachi. We can tier from anything to anything, so that's a huge advantage right there. But we don't just tier by setting a policy -- when data's 90 days old, automatically move it; that's automation. Cognition is where we not only watch the applications, we watch the data set, and we move it from hot to cold. Let's take, for example, financial data. You're a publicly traded company -- theCUBE and SiliconANGLE are going to be public soon, I'm sure, you guys are getting so big. Your finance guys are going to say, Dave, John, team, this financial data is white-hot; it's got to be on all-flash. After you guys do your announcement of your incredible earnings -- and hopefully, as a friend of the company, my stock goes way up as your stock goes way up -- >> What are we smoking now, come on! >> Let me tell you, when that happens, the data is going to go stone-cold. We see that. You don't have to set a policy to tier the data. With IBM, we automatically learn when the data is hot and when it's cold, and move it back and forth for you. There's no policy setting. >> Cognition, or cognitive storage: it understands the workload. Some big data mojo coming into the storage. >> Right, and that's a huge change. So again, not only is it critical for any cognitive application to have incredibly performant storage with incredible resiliency, availability, and reliability -- when there is cognitive healthcare, true cognitive healthcare, and Dave's on the table and they bring out their cognitive wand because they found something in his chest that they didn't see before, if the storage fails, that's not going to be good for Dave. At the same time, if the storage is too slow, that might not be good for Dave either. When they run that cognitive wand, that hospital knows it's never going to fail. The doctor says, oh, Dave, okay, we'd better take that thing out -- boom, he takes it out, Dave's healthy again. >> Well, that's a real example, by the way -- not necessarily Dave on the table -- but there was a story we wrote on SiliconANGLE, one of our most popular posts last month: IBM Watson actually found a diagnosis and cured a patient the doctors had missed. I don't know if you saw that story; it went super viral. But that's the kind of business use case that you're illuminating with the storage. >> Yeah. In fact, at one of the recent trade shows, what's called the Flash Memory Summit, we won an award for best enterprise application with a commercial developer, SparkCognition. They develop cybersecurity applications. They recommend IBM FlashSystems, and Watson's actually embedded in their application, and it detects security threats for enterprises. So there's an example of combining the cognition of Watson, the capability of FlashSystems, and then their software, which is commercially available -- it's not an in-house thing; it's regular software. >> All right, now we're in big-time intoxication mode with all this awesome, futuristic, real technology. How does a customer get this? Because now, back to IT: the silos are still out there, and they're breaking down the silos. How do you take this to customers? What's the use case? How do you guys deploy this? What are you seeing for success stories? >> Well, the key thing is to make it easy to use and deploy, which we do. So if you want the cloud model, we're available in software. IBM Global Resiliency Services uses us for their resiliency service. Over 300 cloud providers use Spectrum Protect for backup -- pick the cloud guy, just pick the one you want; we work with all of them. If you want to deploy in-house, we have a whole set of channel partners globally, we have the IBM sales team, and IBM Global Services uses IBM's own storage, of course, to provide to the larger enterprises. So whether you're a big shop, medium shop, or small shop, we have a whole set of people out there, with our partner base and with our own sales guys, who can help with that. And then we back it up. As you know, IBM is renowned for support and service across all of our divisions and our entire product portfolio, not just storage. If they need support and service, our storage service guys are there right away. We can install it, our partners can install this stuff, so we try to make it as easy as possible. And being cognitive, some of our user interfaces are as easy as a Macintosh: drag and drop, move your LUNs around, run analytics on when you're going to run out of storage so you know ahead of time. All these things are what people want today. Remember, IT budgets were cut dramatically in the downturn of '08-'09, and while budgets have returned, they're not hiring storage guys; they're hiring developers and they're hiring cloud guys, and those guys don't know how to use storage well. So you've got to make it easy, always fast, and always resilient. That way it doesn't fail anyway, but when it does, you just go into the GUI, it tells you what's wrong, bingo, and IBM service or our partner service comes right out and fixes it. That's what you need today, because there aren't as many storage guys as there used to be. >> No question, you've got the waterfront covered, no doubt about that. And again, congratulations on cracking the top 10. >> Way back. We consider it an honor and a privilege to be a part of that. >> Great. Welcome back anytime; we really appreciate it. Thank you. We'll continue the coverage here on theCUBE at VMworld right after this.
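Herzog's tiering contrast -- a fixed age policy ("when data's 90 days old, move it") versus a system that watches access patterns and promotes or demotes data on its own -- can be sketched as a toy model. This is an illustrative simplification, not IBM's actual tiering algorithm; the class name and threshold are invented for the example.

```python
# Toy model of access-driven ("cognitive") tiering: instead of moving data
# on a fixed age policy, watch reads over an observation window and place
# each object on flash or disk accordingly. Illustrative only -- not IBM's
# real implementation.

from collections import defaultdict

class AccessDrivenTiering:
    def __init__(self, hot_threshold=3):
        self.hot_threshold = hot_threshold  # reads per window to count as "hot"
        self.reads = defaultdict(int)       # object -> read count this window
        self.tier = {}                      # object -> "flash" or "disk"

    def read(self, obj):
        self.reads[obj] += 1

    def rebalance(self):
        """End of an observation window: promote hot data, demote cold data."""
        for obj, count in self.reads.items():
            self.tier[obj] = "flash" if count >= self.hot_threshold else "disk"
        self.reads.clear()                  # start a fresh window

t = AccessDrivenTiering()
for _ in range(5):
    t.read("q3-earnings.xlsx")              # white-hot before the announcement
t.read("q1-earnings.xlsx")                  # barely touched anymore
t.rebalance()
print(t.tier)  # {'q3-earnings.xlsx': 'flash', 'q1-earnings.xlsx': 'disk'}
```

The point of the contrast: an age-based policy would have left last quarter's white-hot file on flash until an arbitrary deadline, whereas the access-driven model demotes it as soon as the observed reads drop off.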

Published Date : Aug 31 2016

**Summary and Sentiment Analysis are not shown because of an improper transcript**

ENTITIES

Entity | Category | Confidence
Dave | PERSON | 0.99+
IBM | ORGANIZATION | 0.99+
Cisco | ORGANIZATION | 0.99+
Colorado | LOCATION | 0.99+
eric Herzog | PERSON | 0.99+
eighteen thousand dollars | QUANTITY | 0.99+
John | PERSON | 0.99+
Eric Herzog | PERSON | 0.99+
Erica | PERSON | 0.99+
5,000 customers | QUANTITY | 0.99+
John wall | PERSON | 0.99+
vmworld | ORGANIZATION | 0.99+
John wall | PERSON | 0.99+
vmware | ORGANIZATION | 0.99+
Herzog | ORGANIZATION | 0.99+
CIA | ORGANIZATION | 0.99+
John furrier | PERSON | 0.99+
HP | ORGANIZATION | 0.99+
John furrier | PERSON | 0.99+
Macintosh | COMMERCIAL_ITEM | 0.99+
five years | QUANTITY | 0.99+
Mandalay Bay | LOCATION | 0.98+
Schwarzenegger | PERSON | 0.98+
last month | DATE | 0.98+
today | DATE | 0.98+
Hawaiian | OTHER | 0.98+
over three and a half billion dollars | QUANTITY | 0.97+
iBM | ORGANIZATION | 0.97+
both | QUANTITY | 0.97+
over a billion dollars | QUANTITY | 0.97+
mid 90s | DATE | 0.97+
10 most popular cute guests | QUANTITY | 0.96+
BMC | ORGANIZATION | 0.96+
over three billion dollars | QUANTITY | 0.96+
#VMworld | ORGANIZATION | 0.95+
Linux | TITLE | 0.95+
first thing | QUANTITY | 0.94+
10 | QUANTITY | 0.94+
Watson | TITLE | 0.94+
Big Island | LOCATION | 0.93+
three four days | QUANTITY | 0.93+
one | QUANTITY | 0.93+
las vegas | LOCATION | 0.93+
1980s | DATE | 0.92+
first | QUANTITY | 0.92+
IBM Storage | ORGANIZATION | 0.92+
each | QUANTITY | 0.92+
six | QUANTITY | 0.92+
2016 | DATE | 0.91+
one way | QUANTITY | 0.91+
50 30 f | OTHER | 0.91+
90 days old | QUANTITY | 0.9+
over 300 cloud providers | QUANTITY | 0.9+
2015 | DATE | 0.89+
first half of this year | DATE | 0.89+
Philip blank | PERSON | 0.88+
last three years | DATE | 0.86+