Breaking Analysis: Snowflake Summit 2022...All About Apps & Monetization


 

>> From theCUBE studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is "Breaking Analysis" with Dave Vellante. >> Snowflake Summit 2022 underscored that the ecosystem excitement which was once forming around Hadoop is being reborn and escalated, coalescing around Snowflake's data cloud. What was once seen as simply a cloud data warehouse with good "data cloud" marketing is evolving rapidly, with new workloads, a vertical industry focus, data applications, monetization, and more. The question is, will the promise of data be fulfilled this time around, or is it same wine, new bottle? Hello, and welcome to this week's Wikibon CUBE Insights powered by ETR. In this "Breaking Analysis," we'll talk about the event, the announcements that Snowflake made that are of greatest interest, the major themes of the show, what was hype and what was real, the competition, and some concerns that remain in many parts of the ecosystem and pockets of customers. First let's look at the overall event. It was held at Caesars Forum. Not my favorite venue, but I'll tell you, it was packed. Fire marshal full, as we sometimes say. Nearly 10,000 people attended the event. Here's Snowflake's CMO Denise Persson on theCUBE describing how this event has evolved. >> Yeah, two, three years ago, we were about 1,800 people at a Hilton in San Francisco. We had about 40 partners attending. This week we're close to 10,000 attendees here, almost 10,000 people online as well, and over 200 partners here on the show floor. >> Now, those numbers from 2019 remind me of the early days of Hadoop World, which was put on by Cloudera, but then Cloudera handed off the event to O'Reilly, as the article that we've inserted, if you bring back that slide, would say. The headline almost got it right: Hadoop World was a failure, but it didn't have to be. 
Snowflake has filled the void created by O'Reilly when it first killed Hadoop World, then killed the name, and then killed Strata. Now, ironically, the momentum and excitement from Hadoop's early days probably could have stayed with Cloudera, but the beginning of the end was when they gave the conference over to O'Reilly. We can't imagine Frank Slootman handing the keys to the kingdom to a third party. Serious business was done at this event. I'm talking substantive deals. Salespeople from the host sponsor and the ecosystems that support these events, they love physical. They really don't like virtual, because physical, belly to belly, means relationship building, pipeline, and deals. And that was blatantly obvious at this show. And in fairness, that's true of all theCUBE events we've done this year, but this one was more vibrant because of its attendance and the action in the ecosystem. Ecosystem is a hallmark of a cloud company, and that's what Snowflake is. We asked Frank Slootman on theCUBE, was this ecosystem evolution by design or did Snowflake just kind of stumble into it? Here's what he said. >> Well, when you are a data cloud, you have data, people want to do things with that data. They don't want to just run data operations, populate dashboards, run reports. Pretty soon they want to build applications, and after they build applications, they want to build businesses on it. So it goes on and on and on. So it drives your development to enable more and more functionality on that data cloud. Didn't start out that way, you know, we were very, very much focused on data operations. Then it becomes application development, and then it becomes, hey, we're developing whole businesses on this platform. So similar to what happened to Facebook in many ways. >> So it sounds like it was maybe a little bit of both. 
The Facebook analogy is interesting because Facebook is a walled garden, as is Snowflake, but when you come into that garden, you have assurances that things are going to work in a very specific way, because a set of standards and protocols is being enforced by a steward, i.e. Snowflake. This means things run better inside of Snowflake than if you try to do all the integration yourself. Now, maybe over time an open source version of that will come out, but if you wait for that, you're going to be left behind. That said, Snowflake has made moves to make its platform more accommodating to open source tooling in many of its announcements this week. Now, I'm not going to do a deep dive on the announcements. Matt Sulkins from Monte Carlo wrote a decent summary of the keynotes, and a number of analysts like Sanjeev Mohan, Tony Baer and others are posting some deeper analysis on these innovations, and so we'll point to those. I'll say a few things though. Unistore extends the type of data that can live in the Snowflake data cloud. It's enabled by a new feature called hybrid tables, a new table type in Snowflake. One of the big knocks against Snowflake was that it couldn't handle transactional data. Several database companies are creating this notion of a hybrid, where both analytic and transactional workloads can live in the same data store. Oracle's doing this, for example, with MySQL HeatWave, and there are many others. We saw Mongo earlier this month add an analytics capability to its transaction system. Mongo also added SQL, which was kind of interesting. Here's what Constellation Research analyst Doug Henschen said about Snowflake's moves into transaction data. Play the clip. >> Well, with Unistore, they're reaching out and trying to bring transactional data in. 
Hey, don't limit this to analytical information. And there's other ways to do that, like CDC and streaming, but they're very closely tying that again to that marketplace, with the idea of, bring your data over here and you can monetize it. Don't just leave it in that transactional database. So another reach to a broader play across a big community that they're building. >> And you're also seeing Snowflake expand its workload types in its unique way, and through Snowpark and its Streamlit acquisition, enabling Python so that native apps can be built in the data cloud and benefit from all that structure and the features that Snowflake has built in. Hence that Facebook analogy, or maybe the App Store, the Apple App Store, as I proposed as well. Python support also widens the aperture for machine intelligence workloads. We asked Snowflake senior VP of product, Christian Kleinerman, which announcements he thought were the most impactful. And despite the who's-your-favorite-child nature of the question, he did answer. Here's what he said. >> I think the native applications is the one that looks like, eh, I don't know about it on the surface, but it has the biggest potential to change everything. That can create an entire ecosystem of solutions, whether within a company or across companies, that I don't know that we know what's possible. >> Snowflake also announced support for Apache Iceberg, which is a new open table format standard that's emerging. So you're seeing Snowflake respond to these concerns about its lack of openness, and they're building optionality into their cloud. 
They also showed some cost optimization tools, both from Snowflake itself and from the ecosystem, notably Capital One, which launched a software business on top of Snowflake focused on optimizing cost and eventually the rollout of data management capabilities. And there were all kinds of features that Snowflake announced at the show around governance, cross cloud, what we call supercloud, a new security workload, and they reemphasized their ability to read non-native on-prem data into Snowflake through partnerships with Dell and Pure, and a lot more. Let's hear from some of the analysts that came on theCUBE this week at Snowflake Summit to see what they said about the announcements and their takeaways from the event. This is Dave Menninger, Sanjeev Mohan, and Tony Baer, roll the clip. >> Our research shows that the majority of organizations, the majority of people, do not have access to analytics. And so a couple of the things they've announced, I think, address those or help to address those issues very directly. So Snowpark and support for Python and other languages is a way for organizations to embed analytics into different business processes. And so I think that'll be really beneficial to try and get analytics into more people's hands. And I also think that the native applications as part of the marketplace is another way to get applications into people's hands, rather than just analytical tools. Because most people in the organization are not analysts. They're doing some line of business function. They're HR managers, they're marketing people, they're sales people, they're finance people, right? They're not sitting there mucking around in the data, they're doing a job and they need analytics in that job. >> Primarily, I think it is to counteract this whole notion that once you move data into Snowflake, it's a proprietary format. 
So I think that's how it started, but it's hugely beneficial to the customers, to the users, because now if you have a large amount of data in Parquet files, you can leave it on S3, but then, using the Apache Iceberg table format in Snowflake, you get all the benefits of Snowflake's optimizer. So for example, you get the micro-partitioning, you get the metadata. And in a single query, you can join, you can do a select from a Snowflake table unioned with a select from an Iceberg table, and you can use stored procedures and user-defined functions. So I think what they've done is extremely interesting. Iceberg by itself still does not have multi-table transactional capabilities. So if I'm running a workload, I might be touching 10 different tables. So if I use Apache Iceberg in its raw form, it doesn't have that, but Snowflake does. So the way I see it is Snowflake is adding more and more capabilities right into the database. So for example, they've gone ahead and added security and privacy. So you can now create policies and do even cell-level masking, dynamic masking, but most organizations have more than Snowflake. So what we are starting to see all around here is that there's a whole series of data catalog companies, a bunch of companies that are doing dynamic data masking, security and governance, data observability, which is not a space Snowflake has gone into. So there's a whole ecosystem of companies that is mushrooming. You know, they're using the native capabilities of Snowflake, but they are at a level higher. So if you have a data lake and a cloud data warehouse and you have other relational databases, you can run these cross-platform capabilities in that layer. So that way, you know, Snowflake's done a great job of enabling that ecosystem. >> I think it's like the last mile, essentially. 
In other words, it's like, okay, you have folks that are very comfortable with Tableau, but you do have developers who don't want to have to shell out to a separate tool. And so this is where Snowflake is essentially working to address that constituency. To Sanjeev's point, and I think part of what plays into it, what makes this different from the Hadoop era, is the fact that a lot of vendors are taking it very seriously to make all these capabilities native. Now, obviously Snowflake acquired Streamlit, so we can expect that the Streamlit capabilities are going to be native. >> I want to share a little bit about the higher level thinking at Snowflake. Here's a chart from Frank Slootman's keynote. It's his version of the modern data stack, if you will. Now, Snowflake of course was built on the public cloud. If there were no AWS, there would be no Snowflake. Now, they're all about bringing data, and live data, and expanding the types of data, including structured, we just heard about that, unstructured, geospatial, and the list is going to continue on and on. Eventually I think it's going to bleed into the edge, if we can figure out what to do with that edge data. Executing on new workloads is a big deal. They started with data sharing, they recently added security, and they've essentially created a PaaS layer. We call it a SuperPaaS layer, if you will, to attract application developers. Snowflake has a developer-focused event coming up in November, and they've extended the marketplace with 1,300 native app listings. And at the top, that's the holy grail, monetization. We always talk about building data products, and we saw a lot of that at this event, very, very impressive and unique. Now here's the thing. There's a lot of talk in the press, on Wall Street, and in the broader community about consumption-based pricing, and concerns over Snowflake's visibility and its forecast, and how analytics may be discretionary. 
But if you're a company building apps in Snowflake and monetizing like Capital One intends to do, and you're now selling in the marketplace, that is not discretionary, unless of course your costs are greater than your revenue for that service, in which case it's going to fail anyway. But the point is, we're entering a new era where data apps and data products are beginning to be built, and Snowflake is attempting to make the data cloud the de facto place where you're going to build them. In our view, they're well ahead in that journey. Okay, let's talk about some of the bigger themes that we heard at the event. Bringing apps to the data instead of moving the data to the apps, this was a constant refrain, and one that certainly makes sense from a physics point of view. With a single source of data that is discoverable, sharable, and governed, and with increasingly robust ecosystem options, the data doesn't have to be moved. Sometimes it may have to be moved if you're going across regions, but that's unique to, and a differentiator for, Snowflake in our view. I mean, I've yet to see a data ecosystem that is as rich and growing as fast as the Snowflake ecosystem. Monetization, we talked about that. Industry clouds: financial services, healthcare, retail, and media, all front and center at the event. My understanding is that Frank Slootman was a major force behind this shift, this development and go-to-market focus on verticals. It's really an attempt, and he talked about this in his keynote, to align with the customer mission, and ultimately align with their objectives, which not surprisingly are increasingly monetizing with data as a differentiating ingredient. We heard a ton about data mesh. There were numerous presentations about the topic. 
And I'll say this: if you map the seven pillars Snowflake talks about, Benoit Dageville talked about this in his keynote, into Zhamak Dehghani's data mesh framework and its four principles, they align better than most of the data mesh washing that I've seen. The seven pillars: all data, all workloads, global architecture, self-managed, programmable, marketplace, and governance. Those are the seven pillars that he talked about in his keynote. All data, well, maybe with hybrid tables that becomes more of a reality. Global architecture means the data is globally distributed; it's not necessarily physically in one place. Self-managed is key. Self-service infrastructure is one of Zhamak's four principles. And then inherent governance. Zhamak talks about computational, what I'll call automated, governance built in. And with all the talk about monetization, that aligns with the second principle, which is data as product. So while it's not a pure hit, and to its credit, by the way, Snowflake doesn't use data mesh in its messaging anymore. But its customers do, several customers talked about it. Geico, JPMC, and a number of other customers and partners are using the term, and using it pretty closely to the concepts put forth by Zhamak Dehghani. But back to the point, they essentially, Snowflake that is, is building a proprietary system that substantially addresses some, if not many, of the goals of data mesh. Okay, back to the list. Supercloud, that's our term. We saw lots of examples of clouds on top of clouds that are architected to span multiple clouds, not just run on individual clouds as separate services. And this includes Snowflake's data cloud itself, but also a number of ecosystem partners that are headed in a very similar direction. Snowflake still talks about data sharing, but now it uses the term collaboration in its high level messaging, which is, I think, smart. Data sharing is kind of a geeky term. 
And also this is an attempt by Snowflake to differentiate from everyone else that's saying, hey, we do data sharing too. And finally, Snowflake doesn't say data marketplace anymore. It's now just marketplace, accounting for its application market. Okay, let's take a quick look at the competitive landscape via this ETR X-Y graph. The vertical axis measures net score, or spending momentum, and the x-axis measures penetration, or pervasiveness, in the data set. That's what ETR calls overlap. Snowflake continues to lead on the vertical axis. They guided conservatively last quarter, remember, so even though that lofty height is well down from its earlier levels, I wouldn't be surprised if it ticks down again a bit in the July survey, which will be in the field shortly. Databricks is a key competitor, obviously with strong spending momentum, as you can see. We didn't draw it here, but we usually draw a red line at 40%; anything above that is considered elevated. So you can see Databricks is quite elevated. But it doesn't have the market presence of Snowflake. It didn't get to IPO during the bubble, and it doesn't have nearly as deep and capable a go-to-market machinery. Now, they're getting better and they're getting some attention in the market nonetheless. But with Databricks still a private company, naturally more people are aware of Snowflake. Some analysts, Tony Baer in particular, believe Mongo and Snowflake are on a bit of a collision course long term. I actually can see his point. You know, I mean, they're both platforms, they're both about data. It's a long ways off, but you can see them on a similar path. They talk about similar aspirations and visions, even though they're in quite different markets today, but they're definitely participating in a similar TAM. The cloud players are probably the biggest, or definitely the biggest, partners and probably the biggest competitors to Snowflake. And then there's always Oracle. 
Doesn't have the spending velocity of the others, but it's got strong market presence. It owns a cloud, it knows a thing or two about data, and it definitely is a go-to-market machine. Okay, we're going to end on some of the things that we heard in the ecosystem. 'Cause look, we've heard before how a particular technology, enterprise data warehouses, data hubs, MDM, data lakes, Hadoop, et cetera, was going to solve all of our data problems, and of course they didn't. And in fact, sometimes they create more problems, which allows vendors to push more incremental technology to solve the problems that they created, like tools and platforms to clean up the schema-on-read nature of data lakes, or data swamps. But here are some of the things that I heard firsthand from some customers and partners. First thing is, they said to me that they're having a hard time keeping up sometimes with the pace of Snowflake. It reminds me of AWS in the 2014, 2015 timeframe. You remember that fire hose of announcements, which causes increased complexity for customers and partners. I talked to several customers that said, well, yeah, this is all well and good, but I still need skilled people to understand all these tools that I'm integrating in the ecosystem: the catalogs, the machine learning, the observability. A number of customers said, I just can't use one governance tool, I need multiple governance tools, and a lot of other technologies as well, and they're concerned that that's going to drive up their cost and their complexity. I heard other concerns from the ecosystem, that it used to be sort of clear as to where they could add value, you know, when Snowflake was just a better data warehouse. But to point number one, they're either concerned that they'll be left behind or they're concerned that they'll be subsumed. Look, I mean, just like we tell AWS customers and partners, you've got to move fast, you've got to keep innovating. If you don't, you're going to be left behind. 
If you're a customer, you're going to be left behind your competitor; or if you're a partner, somebody else is going to get there first, or AWS is going to solve the problem for you. Okay, and there were a number of skeptical practitioners, really thoughtful and experienced data pros, that suggested that they've seen this movie before. Hence the same wine, new bottle. Well, this time around I certainly hope not, given all the energy and investment that is going into this ecosystem. And the fact is, Snowflake is unquestionably making it easier to put data to work. They built on AWS so you didn't have to worry about provisioning compute, storage, and networking, and scaling. Snowflake is optimizing its platform to take advantage of things like Graviton, so you don't have to, and they're doing some of their own optimization tools. The ecosystem is building optimization tools, so that's all good. And our firm belief is that the less expensive it is, the more data will get brought into the data cloud. And they're building a data platform on which their ecosystem can build and run data applications, aka data products, without having to worry about all the hard work that needs to get done to make data discoverable, shareable, and governed. And unlike the last 10 years, you don't have to be a zookeeper and integrate all the animals in the Hadoop zoo. Okay, that's it for today, thanks for watching. Thanks to my colleague Stephanie Chan, who helps research "Breaking Analysis" topics. Sometimes Alex Myerson is on production, and he manages the podcasts. Kristin Martin and Cheryl Knight help get the word out on social and in our newsletters, and Rob Hof is our editor in chief over at SiliconANGLE, and Hailey does some wonderful editing. Thanks to all. Remember, all these episodes are available as podcasts wherever you listen. All you've got to do is search Breaking Analysis Podcasts. 
I publish each week on wikibon.com and siliconangle.com, and you can email me at David.Vellante@siliconangle.com or DM me @DVellante. If you've got something interesting, I'll respond; if you don't, I'm sorry, I won't. Or comment on my LinkedIn posts. Please check out etr.ai for the best survey data in the enterprise tech business. This is Dave Vellante for theCUBE Insights powered by ETR. Thanks for watching, and we'll see you next time. (upbeat music)

Published Date : Jun 18 2022



Bill Schmarzo, Hitachi Vantara | CUBE Conversation, August 2020


 

>> Announcer: From theCUBE studios in Palo Alto and Boston, connecting with thought leaders all around the world. This is a CUBE Conversation. >> Hey, welcome back everybody, Jeff Frick here with theCUBE. We are still getting through the year of 2020. It's still the year of COVID, and there's no end in sight, I think, until we get to a vaccine. That said, we're really excited to have one of our favorite guests. We haven't had him on for a while; I haven't talked to him for a long time. He used to, I think, have the record for the most CUBE appearances of probably any CUBE alumni. We're excited to have him joining us from his house in Palo Alto. Bill Schmarzo, you know him as the Dean of Big Data, and he's got more titles. He's the chief innovation officer at Hitachi Vantara. We used to call him the Dean of Big Data kind of for fun, but Bill goes out and writes a bunch of books. And now he teaches at the University of San Francisco, School of Management, as an executive fellow. He's an honorary professor at NUI Galway. He likes to go to that side of the pond, and he's a many-time author now. Go check out his author profile on Amazon: the "Big Data MBA," "The Art of Thinking Like A Data Scientist" and another Big Data book, kind of a workbook. Bill, great to see you. >> Thanks, Jeff. You know, I miss my time on theCUBE. These conversations have always been great. We've always kind of poked around the edges of things. A lot of our conversations have always been, I thought, very leading edge, and the title Dean of Big Data is courtesy of theCUBE. You guys were the first ones to give me that name, out of one of the very first Strata Conferences, where you dubbed me the Dean of Big Data, because I taught a class there called the Big Data MBA. And look what's happened since then. >> I love it. >> It's all on you guys. >> I love it, and we've outlasted Strata; Strata doesn't exist as a conference anymore. 
So, you know, part of that I think is because Big Data is now everywhere, right? It's not a standalone thing anymore. But there's a topic, and I'm holding in my hands a paper that you worked on with a colleague, Dr. Sidaoui, talking about what is the value of data. What is the economic value of data? And this is a topic that's been thrown around quite a bit. I think you list a total of 28 reference sources in this document, so it's a well-researched piece of material, but it's a really challenging problem. So before we kind of get into the details, you know, from your position, having done this for a long time, and I don't know what you're doing today, but you used to travel every single week to go out and visit customers and actually do implementations and really help people think these through. When you think about the value, the economic value, how did you start to frame that, to make sense of it and make it a manageable problem to attack? >> So, Jeff, the research project was eye-opening for me. And one of the advantages of being a professor is, you have access to all these very smart, very motivated, very free research resources. And one of the problems that I've wrestled with as long as I've been in this industry is, how do you figure out what is data worth? And so what I did is I took these research students and I set them on this problem. I said, "I want you to do some research. Let me understand, what is the value of data?" I've seen all these different papers and analysts and consulting firms talk about it, but nobody's really got this thing cracked. And so we launched this research project at USF, professor Mouwafac Sidaoui and I together, and we were bumping along the same old path that everyone else had taken, which was anchored on, how do we get data on our balance sheet? That was always the motivation, because as a company we're worth so much more because our data is so valuable, and how do I get it on the balance sheet? 
So we're headed down that path, trying to figure out how do you get it on the balance sheet, and then one of my research students comes up to me and she says, "Professor Schmarzo," she goes, "data is kind of an unusual asset." I said, "Well, what do you mean?" She goes, "Well, think about data as an asset. It never depletes, it never wears out. And the same dataset can be used across an unlimited number of use cases at a marginal cost equal to zero." And when she said that, it's like, "Holy crap." The light bulb went off. It's like, "Wait a second. I've been thinking about this entirely wrong for the last 30-some years of my life in this space. I've had the wrong frame. I keep thinking about this as an accounting conversation, and accounting determines valuation based on what somebody is willing to pay for it." So if you go back to Adam Smith, 1776, "The Wealth of Nations," he talks about valuation techniques. And one of the valuation techniques he talks about is valuation in exchange. That is, the value of an asset is what someone's willing to pay you for it. So the value of this bottle of water is what someone's willing to pay you for it. So everybody fixates on this valuation-in-exchange methodology. That's how you put it on the balance sheet, that's how you run depreciation schedules, that dictates everything. But Adam Smith also talked about, in that book, another valuation methodology, which is valuation in use, which is an economics conversation, not an accounting conversation. And when I realized that my frame was wrong, yeah, I had the right book. I had Adam Smith, I had "The Wealth of Nations." I had all that good stuff, but I hadn't read the whole book. I had missed this whole concept about the economic value, where value is determined not by how much someone's willing to pay you for it, but by the value you can drive by using it. 
So, Jeff, when that person made that comment, the entire research project, and I've got to tell you, my entire life, did a total 180, right? Just a total 180-degree change in how I was thinking about data as an asset. >> Right. Well, Bill, it's funny though, that's kind of captured in how I always think of finance versus accounting, right? And you're right on accounting. We learn a lot of things in accounting; basically we learn how much we don't know. But it's really hard to put data in an accounting framework, because as you said, it's not like a regular asset. You can use it a lot of times, you can use it across lots of use cases, it doesn't degrade over time. In fact, it used to be a liability, 'cause you had to buy all this hardware and software to maintain it. But if you look at the finance side, if you look at the pure play internet companies like Google, like Facebook, like Amazon, and you look at their valuation, right? We used to have this thing, we still have this thing, called goodwill, which was kind of this capture between what the market established the value of the company to be, but which wasn't reflected when you summed up all the assets on the balance sheet, and you had this leftover thing, you could just plug in goodwill. And I would hypothesize that for these big giant tech companies, the market has baked in the value of the data, has kind of put in that present value on it for a long period of time over multiple projects. And we see it captured probably in goodwill, versus being called out as an individual balance sheet item. >> So, I don't know accounting. I'm not an accountant, thank God, right? And I know that goodwill, if I remember from my MBA program, is something where, when you buy a company and you look at the value you paid versus what it was worth, the difference gets stuck into this category called goodwill, because no one knew how to figure it out. 
So the company at book value was a billion dollars, but you paid five billion for it. Well, you're not an idiot, so that four billion extra you paid must be goodwill, and they'd stick it in goodwill. And I think there's actually a way that goodwill gets depreciated as well. So it could be that, but I stay totally away from the accounting framework. I think that's distracting; trying to work within the GAAP rules is more of an inhibitor. And we talk about the Googles of the world and the Facebooks of the world and the Netflixes of the world and the Amazons, companies that are great at monetizing data. Well, they're great at monetizing it because they're not selling it, they're using it. Google is using their data to dominate search, right? Netflix is using it to be the leader in on-demand video. And it's how they use all the data, how they use the insights about their customers, their products, and their operations to really drive new sources of value. So to me, when you start thinking about it from an economics perspective, for example, why is the same car that I buy and an Uber driver buys, why is that car more valuable to an Uber driver than it is to me? Well, the bottom line is, Uber drivers are going to use that car to generate value, right? That $40,000 car they bought is worth a lot more, because they're going to use it to generate value. For me it sits in the driveway and the birds poop on it. So, right, it's this value-in-use concept. And when organizations can make that, by the way, most organizations really struggle with this. They struggle with this value-in-use concept. When you talk to them about data monetization, I'm thinking about the chief data officer trying to sell data, knocking on doors, shaking their tin cup, saying, "Buy my data." No, no one wants your data. Your data is more valuable for how you use it to drive your operations than it is to sell to somebody else. >> Right, right. 
Well, one of the other things that's really important from an economics concept is scarcity, right? And a whole lot of economics is driven around scarcity. And how do you price for scarcity so that the market evens out and the price matches up to the supply? What's interesting about the data concept is, there is no scarcity anymore. And you know, you've outlined, and everyone has, giant numbers going up and to the right in terms of the quantity of data and how much data there is and is going to be. But what you point out very eloquently in this paper is that the scarcity is around the resources to actually do the work on the data to get the value out of the data. And I think there's just this interesting step function between just raw data, which has really no value in and of itself, right? Until you start to apply some concepts to it, you start to analyze it. And most importantly, that you have some context by which you're doing all this analysis to then drive that value. And I thought a really interesting part of this paper was: get beyond the arguing that we're kind of discussing here and get into some specifics where you can measure value around a specific business objective. And not only that, but then also the investment of the resources on top of the data to be able to extract the value to then drive your business process. So it's a really different way to think about scarcity, not on the data per se, but on the ability to do something with it. >> You're spot on, Jeff, because organizations don't fail because of a lack of use cases. They fail because they have too many. So how do you prioritize? Now that scarcity is not an issue on the data side, but it is an issue on the people resources side, you don't have unlimited data scientists, right? So how do you prioritize and focus on those opportunities that are most important? I'll tell you, that's not a data science conversation, that's a business conversation, right? 
And figuring out how you align organizations to identify and focus on those use cases that are most important. Like in the paper, we go through several different use cases using Chipotle as an example. The reason why I picked Chipotle is because, well, I like Chipotle, so I could go there and write it off as research. But think about the number of use cases where a company like Chipotle, or any other company, can leverage their data to drive their key business initiatives and their key operational use cases. It's almost unbounded, which, by the way, is a huge challenge. In fact, I think part of the problem we see with a lot of organizations is that because they do such a poor job of prioritizing and focusing, they try to solve the entire problem in one big fell swoop, right? It's like the old ERP big-bang projects: "Well, I'm just going to spend $20 million to buy this analytic capability from company X, and I'm going to install it, and then magic is going to happen," right? And magic never happens. We get crickets instead, because the biggest challenge isn't around how do I leverage the data, it's about where do I start? What problems do I go after? And how do I make sure the organization is bought in to, basically, use case by use case, building out your data and analytics architecture and capabilities. >> Yeah, and you start backwards from really specific business objectives in the use cases that you outline here, right? I want to increase my average ticket by X. I want to increase my frequency of visits by X. I want to increase the amount of items per order from X to 1.2X or 1.3X. So from there you get a nice, big revenue number that you can plan around, and then work backwards into the amount of effort it takes, and then you can come up with, "Is this a good investment or not?" So it's a really different way to get back to the value of the data. 
And more importantly, the analytics and the work to actually call out the information. >> The data and analytic technologies available to us, the very composable nature of these, allow us to take this use case by use case approach. I can build out my data lake one use case at a time. I don't need to stuff 25 data sources into my data lake and hope there's something valuable in there. I can use the first use case to say, "Oh, I need these three data sources to solve that use case. I'm going to put those three data sources in the data lake. I'm going to go through the entire curation process of making sure the data has been transformed and cleansed and aligned and enriched, and all the other governance, all that kind of stuff that goes on. But I'm going to do that use case by use case," 'cause a use case can tell me which data sources are most important for that given situation. And I can build up my data lake, and I can build up my analytics, one use case at a time. And there is a huge impact, a huge impact, when I build out use case by use case, that does not happen otherwise. Let me throw in something that's not really covered in the paper, but is very much covered in my new book that I'm working on, which is: in knowledge-based industries, the economies of learning are more powerful than the economies of scale. Now think about that for a second. >> Say that again, say that again. >> Yeah, the economies of learning are more powerful than the economies of scale. And what that means is, what I learned on the first use case that I build out, I can apply that learning to the second use case, to the third use case, to the fourth use case. So when I put my data into my data lake for my first use case, and the paper covers this, well, once it's in my data lake, the cost of reusing that data in the second, third and fourth use cases is basically, you know, a marginal cost of zero. 
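The "build the data lake one use case at a time" economics can be made concrete with a toy sketch. This is purely illustrative (the source names, costs, and use cases below are invented): each use case needs a few data sources, but sources already curated for earlier use cases cost nothing to reuse, so the marginal onboarding cost of each new use case falls toward zero.

```python
# Toy model of marginal data-onboarding cost flattening as use cases
# share data sources. Numbers and source names are hypothetical.

ONBOARD_COST = 10  # assumed cost to curate one new data source

use_cases = [
    {"sales", "customers"},             # use case 1: everything is new
    {"customers", "web_logs"},          # use case 2: reuses "customers"
    {"sales", "web_logs", "returns"},   # use case 3: only "returns" is new
    {"sales", "customers", "returns"},  # use case 4: reuses everything
]

curated = set()   # data sources already in the lake
costs = []        # marginal cost of each successive use case
for needed in use_cases:
    new_sources = needed - curated       # only pay for sources not yet curated
    curated |= needed                    # the lake grows use case by use case
    costs.append(ONBOARD_COST * len(new_sources))

print(costs)  # [20, 10, 10, 0] — the marginal cost flattens to zero
```

The last use case costs nothing to stand up: every data source it needs was already curated for an earlier use case, which is exactly the "marginal cost approaching zero" effect described here.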
So I get this ability to learn about what datasets are most important and to reapply that across the organization. So this learning concept: I learn use case by use case. I don't have to do a big economies-of-scale approach and start with 25 datasets, of which only three or four might be useful, while I'm incurring the overhead for all those other non-important datasets because I didn't take the time to go through and figure out what are my most important use cases and what data I need to support those use cases. >> I mean, should people even think of the data per se, or should they really readjust their thinking around the application of the data? Because the data in and of itself means nothing, right? 55, is that fast or slow? Is that old or young? Well, it depends on a whole lot of things. Am I walking or am I in a brand new Corvette? So it's funny to me that the data in and of itself really doesn't have any value and doesn't really provide any direction into a decision or a higher-order predictive analytic until you start to manipulate the data. So is it even the wrong discussion? Is data the right discussion? Or should we really be talking about the capabilities to do stuff with it and really get people focused on that? >> So Jeff, there's so many points to hit on there. The application of data is where the value is, and theCUBE, you guys used to be famous for saying, "Separating noise from the signal." >> Signal from the noise, right. >> Well, how do you know in your dataset what's signal and what's noise? Well, the use case will tell you. If you don't know the use case, you have no way of figuring out what's important. One of the things I still rail against, and it happens still: somebody will walk up to my data science team and say, "Here's some data, tell me what's interesting in it." Well, how do you separate signal from noise if I don't know the use case? So I think you're spot on, Jeff. 
The way to think about this is: don't become data-driven, become value-driven, and value is driven from the use case, or the application, or the use of the data to solve that particular use case. So organizations get fixated on being data-driven. I hate the term data-driven. It's as if there's some sort of frigging magic from having data. No, data has no value. It's how you use it to derive customer, product, and operational insights that drive value. >> Right, so there's an interesting step function, and we talk about it all the time. You're out in the weeds, working with Chipotle lately, increasing their average ticket by 1.2X; we talk more here kind of conceptually. And one of the great conceptual holy grails within a data-driven economy is working up this step function. And you've talked about it here: it's from descriptive, to diagnostic, to predictive, and then the holy grail, prescriptive. We're way ahead of the curve; this comes into tons of stuff around unscheduled maintenance, and you know, there's a lot of specific applications. But do you think we spend too much time shooting for the fourth order of greatest impact, instead of focusing on the small wins? >> Well, you certainly have to build your way there. I don't think you can get to prescriptive without doing predictive, and you can't do predictive without doing descriptive and such. But let me throw a real one at you, Jeff. I think there's even one beyond prescriptive, one we're talking more and more about: autonomous analytics, right? And one of the things that paper talked about that didn't click with me at the time was this idea of orphaned analytics. You and I kind of talked about this before the call here. 
And one thing we noticed in the research was that a lot of these very mature organizations, who had advanced from the retrospective analytics of BI to the descriptive, to the predictive, to the prescriptive, were building one-off analytics to solve a problem and getting value from it, but never reusing those analytics over and over again. They were done one-off and then thrown away, and these organizations were so good at data science and analytics that it was easier for them to just build from scratch than to try to dig around and find something that was never actually built to be reused. And so I have this whole idea of orphaned analytics, right? It didn't really occur to me, it didn't make any sense to me, until I read this quote from Elon Musk. Elon Musk made this statement. He says, "I believe that when you buy a Tesla, you're buying an asset that appreciates in value, not depreciates, through usage." I was thinking, "Wait a second, what does that mean?" He didn't actually say "through usage." He said he believes you're buying an asset that appreciates, not depreciates, in value. And of course the first response I had was, "Oh, it's like a 1964-and-a-half Mustang. It's rare, so everybody is going to want these things. So buy one, stick it in your garage, and 20 years later you bring it out and it's worth more money." No, no, there's 600,000 of these things roaming around the streets, they're not rare. What he meant is that he is building an autonomous asset. The more that it's used, the more valuable it's getting: the more reliable, the more efficient, the more predictive, the more safe this asset's getting. So there is this level beyond prescriptive where we can think about, "How do we leverage artificial intelligence, reinforcement learning, deep learning, to build these assets that, the more they are used, the smarter they get?" That's beyond prescriptive. 
That's an environment where these things are learning. In many cases, they're learning with minimal or no human intervention. That's the real aha moment. That's what I missed with orphaned analytics, and why it's important to build analytics that can be reused over and over again. Because every time you use these analytics in a different use case, they get smarter, they get more valuable, they get more predictive. To me that's the aha moment that blew my mind. I realized I had missed that in the paper entirely, and it took me basically two years to realize, d'oh, I missed the most important part of the paper. >> Right, well, it's an interesting take, really, on why the valuation, I would argue, is reflected in Tesla, which is a function of the data. And there's a phenomenal video, if you've never seen it, where they have Autonomous Vehicle Day; it might be a year or so old. And he's got his number-one engineers from, I think, the microprocessor group, the computer vision group, as well as the autonomous driving group. And there's a couple of really great concepts I want to follow up on from what you said. One is that they have this thing called the fleet. To your point, there's hundreds of thousands of these things, if they haven't hit a million, that are calling home every day as to exactly how everyone took the northbound 101 on-ramp off of University Avenue. How fast did they go? What line did they take? What G-forces did they take? And every one of those cars feeds into the system, so that when they do the autonomous update, not only are they using all the regular things they would use to map out that 101 northbound entry, but they've got all the data from all the cars that have been doing it. And you know, when that other car, the autonomous car, hit the pedestrian a couple years ago, I think in Phoenix, which is not good, sad, killed a person, a dark, tough situation. 
But you know, we were doing an autonomous vehicle show, and a guy made a really interesting point, right? That when something like that happens, typically if I'm in a car wreck or you're in a car wreck, hopefully not, I learn, the person that we hit learns, and maybe a couple of witnesses learn, maybe the inspector. >> But nobody else learns. >> But nobody else learns. But now with the autonomy, every single person can learn from every single experience, with every vehicle contributing data within that fleet. To your point, it's just an order-of-magnitude different way to think about things. >> Think about a 1% improvement compounded 365 times; it equals, I think, a 38X improvement. That's the power of 1% improvements over these 600,000-plus cars that are learning. By the way, even when the autonomous FSD, the full self-driving module, isn't turned on, it runs in shadow mode. So it's learning from the human drivers, the human overlords; it's constantly learning. And by the way, not only are they collecting all this data, I did a little research, I pulled out some of their job search ads, and they've built a giant simulator, right? And they're basically every night simulating billions and billions more driven miles because of the simulator. He's going to have a simulator not only for driving; think about all the data he's capturing as these cars are riding down the road. By the way, they don't use Lidar, they use video, right? So he's driving by malls, he knows how many cars are in the mall. He's driving down roads, he knows how old the cars are and which ones should be replaced. I mean, he's sitting on this incredible wealth of data. If anybody could simulate what's going on in the world and figure out how to get out of this COVID problem, it's probably Elon Musk and the data he's captured, courtesy of all those cars. >> Yeah, yeah, it's really interesting, and we're seeing it now. 
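The "1% compounded 365 times" figure quoted above is easy to verify; it works out to roughly 37.8x, which matches the "I think 38X" in the transcript:

```python
# Check the compounding claim: a 1% improvement applied 365 times.
daily_gain = 1.01
yearly_factor = daily_gain ** 365

print(round(yearly_factor, 1))  # 37.8 — i.e. roughly a 38x improvement
```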
There's a new autonomous drone out, the Skydio, and they just announced their commercial product. And again, it completely changes the way you think about how you use that tool, because you've just eliminated the complexity of flying it. I don't want to fly that, I want to tell it what to do. And so you're seeing this whole application of aerial drones in companies around things like measuring piles of coal and measuring these huge assets that are volumetrically measured, that these things can go and map out, and farming, et cetera, et cetera. So the autonomy piece, that's really insightful. I want to shift gears a little bit, Bill, and talk about, you had some theories in here about thinking of data as an asset, data as a currency, data as monetization. I mean, how should people think of it? 'Cause I don't think currency is very good; it's really not an exchange of value that we're doing, this kind of classic asset. I think data as oil is horrible, right? To your point, it doesn't get burned up once and then can't be used again. It can be used over and over and over. It's basically like feedstock for all kinds of stuff, but the feedstock never goes away. So is that even the right way to think about it? Do we really need to shift our conversation and get past the idea of data and get much more into the idea of information, and actionable information, and useful information that, oh, by the way, happens to be powered by data under the covers? >> Yeah, good question, Jeff. Data is an asset in the same way that a human is an asset. But just having humans in your company doesn't drive value, it's how you use those humans. And so it's really, again, the application of the data around the use cases. So I still think data is an asset, but I'm not fixated on putting it on my balance sheet. The minute you talk about putting it on a balance sheet, I immediately put the blinders on. It inhibits what I can do. 
I want to think about this as an asset that I can use to drive value, value for my customers. So I'm trying to learn more about my customers' tendencies and propensities and interests and passions, and try to learn the same things about my cars' behaviors and tendencies, and my operations' tendencies. And so I do think data is an asset, but it's a latent asset in the sense that it has potential value, but it actually has no value per se in putting it on a balance sheet. So I think it's an asset, but I worry about the accounting concept immediately hijacking what we can do with it. To me, the value of data becomes how it interacts with, maybe, other assets. So maybe data itself is not so much an asset as it is fuel for driving the value of assets. So, you know, it fuels my use cases. It fuels my ability to retain and get more out of my customers. It fuels my ability to predict when my products are going to break down, and even to have products that self-monitor, self-diagnose and self-heal. So, data is an asset, but it's only a latent asset, in the sense that it sits there and it doesn't have any value until you actually put something to it and shock it into action. >> So let's shift gears a little bit away from the data and talk about the human factors. 'Cause you said one of the challenges is people trying to bite off more than they can chew. And we have the role of chief data officer now, and to your point, maybe that mucks things up more than it helps. But in all the customer cases that you've worked on, is there a consistent pattern of behavior, personality, types of projects that enables some people to grab those resources to apply to their data and have successful projects? Because to your point, there's too much data and there's too many projects, and you talk a lot about prioritization. 
But there's a lot of assumptions in the prioritization model, assumptions that you know a whole lot of things, especially if you're comparing project A in group A with project B in group B, and the two may not really know the economics across that. But for an individual person who sees the potential, what advice do you give them? What kind of characteristics do you see, either in the type of the project, the type of the boss, or the type of the individual, that really lend themselves to a higher probability of a successful outcome? >> So first off, you need to find somebody who has a vision for how they want to use the data, and not just collect it, but how they're going to try to change the fortunes of the organization. So it always takes a visionary. It may not be the CEO; it might be somebody who's the head of marketing or the head of logistics, or it could be a CIO, it could be a chief data officer as well. But you've got to find somebody who says, "We have this latent asset we could be doing more with, and we have a series of organizational problems and challenges against which I could apply this asset. And I need to be the matchmaker that brings these together." Now, the most powerful tool I've found for marrying the latent capabilities of the data with all the revenue-generating opportunities on the application side, because there's a countless number of them, is design thinking. Now, the reason why I think design thinking is so important is because one of the things design thinking does a great job of is giving everybody a voice in the process of identifying, validating, valuing, and prioritizing the use cases you're going to go after. Let me say that again: the challenge organizations have is identifying, validating, valuing, and prioritizing the use cases they want to go after. 
Design thinking is a marvelous tool for driving organizational alignment around where we're going to start, what's going to be next, why we're going to start there, and how we're going to bring everybody together. Big data and data science projects don't die because of technology failure. Most of them die because of passive-aggressive behaviors in the organization, because you didn't bring everybody into the process. Everybody's voice didn't get a chance to be heard. And that one person whose voice didn't get a chance to be heard, they're going to get you. They may own a certain piece of data, they may own something, but they're just lying there waiting for their chance to come up and snag it. So what you've got to do is proactively bring these people together. This is part of our value engineering process. We have a value engineering process around envisioning, where we bring all these people together. We help them to understand how data in itself is a latent asset, but how it can be used, from an economics perspective, to drive all that value. We get them all fired up on how it can solve any one of these use cases. But you've got to start with one, and you've got to embrace this idea that I can build out my data and analytic capabilities one use case at a time. And the first use case I go after and solve makes my second one easier, makes my third one easier, right? When you start going use case by use case, two really magical things happen. Number one, your marginal costs flatten. That is, because you're building out your data lake one use case at a time, and you're bringing all the important data into that data lake one use case at a time, at some point you've got most of the important data you need, and you don't need to add another data source. You've got what you need, so your marginal costs start to flatten. 
And by the way, if you build your analytics as composable, reusable, continuous-learning analytic assets, not as orphaned analytics, pretty soon you have all the analytics you need as well. So your marginal costs flatten. But effect number two is that, because you have the data and the analytics, I can accelerate time to value, and I can de-risk projects as I go use case by use case. And so then the biggest challenge becomes not the data and the analytics; it's getting all the business stakeholders to agree on, here's the roadmap we're going to go after. This one's first, and it's going first because it helps to drive the value of the second and third ones. And then this one drives this one, and you create a whole roadmap rippling through of how the data and analytics are driving value across all these use cases at a marginal cost approaching zero. >> So should we have chief design thinking officers instead of chief data officers, to really actually move the data process along? I mean, I first heard about design thinking years ago, actually interviewing Dan Gordon from Gordon Biersch, and he had just hired a couple of Stanford grads, I think that's where they pioneered it, and they were doing some work around introducing, I think it was a new apple-based alcoholic beverage, an apple cider, and they talked a lot about it. And it's pretty interesting. But I mean, are you seeing design thinking proliferate into the organizations that you work with, either formally as design thinking, or as some derivation of it that pulls in some of those attributes that you highlighted that are so key to success? >> So I think we're seeing the birth of this new role that's marrying the capabilities of design thinking with the capabilities of data and analytics. And they're calling this dude or dudette the chief innovation officer. Surprise. >> Title for someone we know. >> And I've got to tell a little story. 
So I have a very experienced design thinker on my team. All of our data science projects have a design thinker on them. Every one of our data science projects has a design thinker, because the nature of how you build and successfully execute a data science project models almost exactly how design thinking works. I've written several papers on it, and it's a marvelous fit; design thinking and data science are different sides of the same coin. But my respect for design thinking took a major shot in the arm, a major boost, when the design thinking person on my team, whose name is John Morley, introduced me to a senior data scientist at Google. I bought him a coffee, and I said, this is back before I even joined Hitachi Vantara, I said, "So tell me the secret to Google's data science success? You guys are marvelous, you're doing things that no one else was even contemplating. What's your key to success?" And he giggles and laughs and he goes, "Design thinking." I go, "What the hell is that? Design thinking? I've never even heard of the stupid thing before." He goes, "I'll make a deal with you. Friday afternoon, let's pop over to Stanford's B-school and I'll teach you about design thinking." So I went with him on a Friday to the d.school, the design school over at Stanford, and I was blown away, not just by how design thinking was used to ideate and explore, but by how powerful that concept is when you marry it with data science. What is data science in its simplest sense? Data science is about identifying the variables and metrics that might be better predictors of performance. It's that word "might" that's the real key. And who are the people who have the best insights into what variables or metrics or KPIs you might want to test? It ain't the data scientists; it's the subject matter experts on the business side. 
And when you use design thinking to bring those subject matter experts together with the data scientists, all kinds of magic stuff happens. It's unbelievable how well it works. All of our projects leverage design thinking. Our whole value engineering process is built around marrying design thinking with data science, around this prioritization, around these concepts of: all ideas are worthy of consideration, and all voices need to be heard. And the idea of how you embrace ambiguity and diversity of perspectives to drive innovation, it's marvelous. But I feel like I'm a lone voice out in the wilderness, crying out, "Yeah, Tesla gets it, Google gets it, Apple gets it, Facebook gets it." But you know, most other organizations in the world, they don't think like that. They think design thinking is this woo-woo thing. "Oh yeah, you're going to bring people together and sing Kumbaya." It's like, "No, I'm not singing Kumbaya. I'm picking their brains, because they're going to help make the data science team much more effective in knowing what problems we're going to go after and how we're going to measure success and progress." >> Maybe that's the next Dean for the next 10 years, the Dean of design thinking instead of data science, and who knew they're one and the same? Well, Bill, that's super insightful. I mean, it's so validated and supported by the trends that we see all over the place, just in terms of democratization, right? Democratization of the tools, more people having access to data, more opinions, more perspectives, more people with the ability to manipulate the data and basically experiment, does drive better business outcomes. And it's so consistent. 
And what we're actually finding is that if we think about machine learning driven by AI, and human empowerment driven by design thinking, we're seeing the opportunity to exploit these economies of learning at the front lines, where every customer engagement, every operational execution, is an opportunity to gather not only more data, but more learnings, to empower the humans at the front lines of the organization to constantly be seeking, to try different things, to explore, and to learn from each of these engagements. AI to me is incredibly powerful, and I think about it as a source of driving more learning, a continuously learning and continuously adapting organization, where it's not just the machines that are doing this, but the humans who've been empowered to do that. And chapter nine in my new book, Jeff, is all about team empowerment, because nothing you do with AI is going to matter squat if you don't have empowered teams who know how to take and leverage that continuous learning opportunity at the front lines of customer and operational engagement. >> Bill, I couldn't have said it better. I think we'll leave it there; that's a great close. When is the next book coming out? >> So today I do my second-to-last final review. Then it goes back to the editor, he does a review, and we start looking at formatting. So I think we're probably four to six weeks out. >> Okay, well, thank you so much, and congratulations on all the success. I just love how the Dean is really the Dean now, teaching all over the world, sharing the knowledge, and attacking some of these big problems. And like all great economics problems, often the answer is not economics at all; you have to completely twist the lens and not think of it in that construct. >> Exactly. >> All right, Bill. Thanks again and have a great week. >> Thanks, Jeff. >> All right. He's Bill Schmarzo, I'm Jeff Frick. You're watching theCUBE. 
Thanks for watching, we'll see you next time. (gentle music)

Published Date : Aug 3 2020



Kathryn IBM promo v1


 

Hi, I'm Katie Kubek, global portfolio product marketing manager for IBM Master Data Management. Master data management is a key part within the DataOps toolchain, so you can deliver a trusted, complete view of your customers and products and offer unique and personalized digital experiences. Learn more about this at our DataOps Strata event on May 27th. Hope to chat with you there.

Published Date : May 4 2020


Basil Faruqui, BMC Software | BigData NYC 2017


 

>> Live from Midtown Manhattan, it's theCUBE. Covering BigData New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. (calm electronic music) >> Basil Faruqui, who's the Solutions Marketing Manager at BMC, welcome to theCUBE. >> Thank you, good to be back on theCUBE. >> So first of all, heard you guys had a tough time in Houston, so hope everything's gettin' better, and best wishes to everyone down in-- >> We're definitely in recovery mode now. >> Yeah and so hopefully that can get straightened out quick. What's going on with BMC? Give us a quick update in context to BigData NYC. What's happening, what is BMC doing in the big data space now, the AI space now, the IOT space now, the cloud space? >> So like you said, you know, the data lake space, the IOT space, the AI space, there are four components of this entire picture that literally haven't changed since the beginning of computing. If you look at those four components of a data pipeline, it's ingestion, storage, processing, and analytics. What keeps changing around it is the infrastructure, the types of data, the volume of data, and the applications that surround it. And the rate of change has picked up immensely over the last few years, with Hadoop coming into the picture, public cloud providers pushing it. It's obviously creating a number of challenges, but one of the biggest challenges that we are seeing in the market, and we're helping customers address, is the challenge of automating this, and, obviously, the benefit of automation is in scalability as well as reliability. So when you look at this rather simple data pipeline, which is now becoming more and more complex, how do you automate all of this from a single point of control? How do you continue to absorb new technologies, and not re-architect your automation strategy every time, whether it's Hadoop, whether it's bringing in machine learning from a cloud provider? 
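The four fixed stages Faruqui names (ingestion, storage, processing, analytics) and the "single point of control" idea can be sketched in a few lines. This is an illustrative toy, not BMC's product: the stage names and functions are invented, and the only point is that a new technology plugs in as a registered stage rather than forcing a re-architecture of the control layer.

```python
from typing import Any, Callable

class Pipeline:
    """Single point of control for an ordered set of pipeline stages."""

    def __init__(self) -> None:
        self.stages: list[tuple[str, Callable[[Any], Any]]] = []

    def register(self, name: str, fn: Callable[[Any], Any]) -> None:
        """Absorb a new technology by registering a stage, not re-architecting."""
        self.stages.append((name, fn))

    def run(self, payload: Any) -> Any:
        for name, fn in self.stages:
            payload = fn(payload)  # each stage hands its output to the next
        return payload

pipe = Pipeline()
pipe.register("ingest", lambda raw: raw.splitlines())             # ingestion
pipe.register("store", list)                                      # stand-in for a warehouse write
pipe.register("process", lambda rows: [r.upper() for r in rows])  # processing
pipe.register("analytics", lambda rows: {"row_count": len(rows)}) # analytics

print(pipe.run("a\nb\nc"))  # {'row_count': 3}
```

Swapping the ingestion stage for, say, a cloud ML step is just another `register` call; the control loop itself never changes.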
And that is the issue we've been solving for customers-- >> Alright let me jump into it. So, first of all, you mention some things that never change, ingestion, storage, and what's the third one? >> Ingestion, storage, processing, and eventually analytics. >> And analytics. >> Okay so that's cool, totally buy that. Now if you move on and say, hey okay, you believe that's standard, but now in the modern era that we live in, which is complex, you want breadth of data, but also you want the specialization when you get down to machine limits highly bounded, that's where the automation is right now. We see the trend essentially making that automation broader as it goes into the customer environments. >> Correct. >> How do you architect that? If I'm a CXO, or I'm a CDO, what's in it for me? How do I architect this? 'Cause that's really the number one thing, as I know what the building blocks are, but they've changed in their dynamics to the marketplace. >> So the way I look at it is that what defines success and failure, particularly in big data projects, is your ability to scale. If you start a pilot, and you spend three months on it, and you deliver some results, but if you cannot roll it out worldwide, nationwide, whatever it is, essentially the project has failed. The analogy I often give is Walmart has been testing the pick-up tower, I don't know if you've seen it. So this is basically a giant ATM for you to go pick up an order that you placed online. They're testing this at about a hundred stores today. Now if that's a success, and Walmart wants to roll this out nationwide, how much time do you think their IT department's going to have? Is this a five year project, a ten year project? No, the management's going to want this done in six months, ten months. 
So essentially, this is where automation becomes extremely crucial because it is now allowing you to deliver speed to market and without automation, you are not going to be able to get to an operational stage in a repeatable and reliable manner. >> But you're describing a very complex automation scenario. How can you automate in a hurry without sacrificing the details of what needs to be? In other words, there would seem to call for repurposing or reusing prior automation scripts and rules, so forth. How can the Walmart's of the world do that fast, but also do it well? >> Yeah so we do it, we go about it in two ways. One is that out of the box we provide a lot of pre-built integrations to some of the most commonly used systems in an enterprise. All the way from the Mainframes, Oracles, SAPs, Hadoop, Tableaus of the world, they're all available out of the box for you to quickly reuse these objects and build an automated data pipeline. The other challenge we saw, and particularly when we entered the big data space four years ago was that the automation was something that was considered close to the project becoming operational. Okay, and that's where a lot of rework happened because developers had been writing their own scripts using point solutions, so we said alright, it's time to shift automation left, and allow companies to build automations and artifact very early in the developmental life cycle. About a month ago, we released what we call Control-M Workbench, its essentially a community edition of Control-M, targeted towards developers so that instead of writing their own scripts, they can use Control-M in a completely offline manner, without having to connect to an enterprise system. As they build, and test, and iterate, they're using Control-M to do that. So as the application progresses through the development life cycle, and all of that work can then translate easily into an enterprise edition of Control-M. 
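The "shift left" and jobs-as-code ideas discussed here amount to treating job definitions as versioned data that developers can validate offline, long before anything touches a production scheduler. A hedged sketch follows; the field names are invented for illustration and are not Control-M's actual schema:

```python
# Job definitions as plain data, checked into version control next to the
# application code. The schema is made up for this example -- it echoes the
# JSON style of modern scheduler APIs but is not any vendor's real format.
job_flow = {
    "IngestOrders":    {"Type": "Job:Command", "Command": "ingest.sh"},
    "TransformOrders": {"Type": "Job:Command", "Command": "transform.sh"},
    "Flow":            {"Type": "Flow", "Sequence": ["IngestOrders", "TransformOrders"]},
}

def validate(flow: dict) -> list[str]:
    """An offline check a developer (or a CI step) can run with no server connection."""
    errors = []
    for step in flow.get("Flow", {}).get("Sequence", []):
        if step not in flow:
            errors.append(f"Flow references undefined job '{step}'")
    return errors

assert validate(job_flow) == []          # a clean definition passes early, in dev
broken = {"Flow": {"Type": "Flow", "Sequence": ["NoSuchJob"]}}
print(validate(broken))                  # ["Flow references undefined job 'NoSuchJob'"]
```

Because the definition is data, the same artifact that passed validation in development can flow unchanged into the production environment, which is the point of shifting automation left.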
>> Just want to quickly define what shift left means for the folks that might not know software methodologies, they don't think >> Yeah, so. of left political, left or right. >> So, we're not shifting Control-M-- >> Alt-left, alt-right, I mean, this is software development, so quickly take a minute and explain what shift left means, and the importance of it. >> Correct, so if you think of software development as a straight line continuum, you've got, you will start with building some code, you will do some testing, then unit testing, then user acceptance testing. As it moves along this chain, there was a point right before production where all of the automation used to happen. Developers would come in and deliver the application to Ops and Ops would say, well hang on a second, all this Crontab, and these other point solutions we've been using for automation, that's not what we use in production, and we need you to now go right in-- >> So test early and often. >> Test early and often. So the challenge was the developers, the tools they used were not the tools that were being used on the production end of the site. And there was good reason for it, because developers don't need something really heavy and with all the bells and whistles early in the development lifecycle. Now Control-M Workbench is a very light version, which is targeted at developers and focuses on the needs that they have when they're building and developing it. So as the application progresses-- >> How much are you seeing waterfall-- >> But how much can they, go ahead. >> How much are you seeing waterfall, and then people shifting left becoming more prominent now? What percentage of your customers have moved to Agile, and shifting left percentage wise? 
>> So we survey our customers on a regular basis, and the last survey showed that eighty percent of the customers have either implemented a more continuous integration delivery type of framework, or are in the process of doing it, And that's the other-- >> And getting close to a 100 as possible, pretty much. >> Yeah, exactly. The tipping point is reached. >> And what is driving. >> What is driving all is the need from the business. The days of the five year implementation timelines are gone. This is something that you need to deliver every week, two weeks, and iteration. >> Iteration, yeah, yeah. And we have also innovated in that space, and the approach we call jobs as code, where you can build entire complex data pipelines in code format, so that you can enable the automation in a continuous integration and delivery framework. >> I have one quick question, Jim, and I'll let you take the floor and get a word in soon, but I have one final question on this BMC methodology thing. You guys have a history, obviously BMC goes way back. Remember Max Watson CEO, and Bob Beach, back in '97 we used to chat with him, dominated that landscape. But we're kind of going back to a systems mindset. The question for you is, how do you view the issue of this holy grail, the promised land of AI and machine learning, where end-to-end visibility is really the goal, right? At the same time, you want bounded experiences at root level so automation can kick in to enable more activity. So there's a trade-off between going for the end-to-end visibility out of the gate, but also having bounded visibility and data to automate. How do you guys look at that market? Because customers want the end-to-end promise, but they don't want to try to get there too fast. There's a diseconomies of scale potentially. How do you talk about that? >> Correct. 
>> And that's exactly the approach we've taken with Control-M Workbench, the Community Edition, because earlier on you don't need capabilities like SLA management and forecasting and automated promotion between environments. Developers want to be able to quickly build and test and show value, okay, and they don't need something that is with all the bells and whistles. We're allowing you to handle that piece, in that manner, through Control-M Workbench. As things progress and the application progresses, the needs change as well. Well now I'm closer to delivering this to the business, I need to be able to manage this within an SLA, I need to be able to manage this end-to-end and connect this to other systems of record, and streaming data, and clickstream data, all of that. So that, we believe that it doesn't have to be a trade off, that you don't have to compromise speed and quality for end-to-end visibility and enterprise grade automation. >> You mentioned trade offs, so the Control-M Workbench, the developer can use it offline, so what amount of testing can they possibly do on a complex data pipeline automation when the tool's offline? I mean it seems like the more development they do offline, the greater the risk that it simply won't work when they go into production. Give us a sense for how they mitigate, the mitigation risk in using Control-M Workbench. >> Sure, so we spend a lot of time observing how developers work, right? And very early in the development stage, all they're doing is working off of their Mac or their laptop, and they're not really connected to any. And that is where they end up writing a lot of scripts, because whatever code business logic they've written, the way they're going to make it run is by writing scripts. 
And that, essentially, becomes the problem, because then you have scripts managing more scripts, and as the application progresses, you have this complex web of scripts and Crontabs and maybe some opensource solutions, trying to simply make all of this run. And by doing this on an offline manner, that doesn't mean that they're losing all of the other Control-M capabilities. Simply, as the application progresses, whatever automation that the builtin Control-M can seamlessly now flow into the next stage. So when you are ready to take an application into production, there's essentially no rework required from an automation perspective. All of that, that was built, can now be translated into the enterprise-grade Control M, and that's where operations can then go in and add the other artifacts, such as SLA management and forecasting and other things that are important from an operational perspective. >> I'd like to get both your perspectives, 'cause, so you're like an analyst here, so Jim, I want you guys to comment. My question to both of you would be, lookin' at this time in history, obviously in the BMC side we mention some of the history, you guys are transforming on a new journey in extending that capability of this world. Jim, you're covering state-of-the-art AI machine learning. What's your take of this space now? Strata Data, which is now Hadoop World, which is Cloud Air went public, Hortonworks is now public, kind of the big, the Hadoop guys kind of grew up, but the world has changed around them, it's not just about Hadoop anymore. So I'd like to get your thoughts on this kind of perspective, that we're seeing a much broader picture in big data in NYC, versus the Strata Hadoop show, which seems to be losing steam, but I mean in terms of the focus. The bigger focus is much broader, horizontally scalable. And your thoughts on the ecosystem right now? >> Let the Basil answer fist, unless Basil wants me to go first. 
>> I think that the reason the focus is changing, is because of where the projects are in their lifecycle. Now what we're seeing is most companies are grappling with, how do I take this to the next level? How do I scale? How do I go from just proving out one or two use cases to making the entire organization data driven, and really inject data driven decision making in all facets of decision making? So that is, I believe what's driving the change that we're seeing, that now you've gone from Strata Hadoop to being Strata Data, and focus on that element. And, like I said earlier, the difference between success and failure is your ability to scale and operationalize. Take machine learning for an example. >> Good, that's where there's no, it's not a hype market, it's show me the meat on the bone, show me scale, I got operational concerns of security and what not. >> And machine learning, that's one of the hottest topics. A recent survey I read, which pulled a number of data scientists, it revealed that they spent about less than 3% of their time in training the data models, and about 80% of their time in data manipulation, data transformation and enrichment. That is obviously not the best use of a data scientist's time, and that is exactly one of the problems we're solving for our customers around the world. >> That needs to be automated to the hilt. To help them >> Correct. to be more productive, to deliver faster results. >> Ecosystem perspective, Jim, what's your thoughts? >> Yeah, everything that Basil said, and I'll just point out that many of the core uses cases for AI are automation of the data pipeline. It's driving machine learning driven predictions, classifications, abstractions and so forth, into the data pipeline, into the application pipeline to drive results in a way that is contextually and environmentally aware of what's goin' on. 
The history, historical data, what's goin' on in terms of current streaming data, to drive optimal outcomes, using predictive models and so forth, in line to applications. So really, fundamentally then, what's goin' on is that automation is an artifact that needs to be driven into your application architecture as a repurposable resource for a variety of-- >> Do customers even know what to automate? I mean, that's the question, what do I-- >> You're automating human judgment. You're automating effort, like the judgments that a working data engineer makes to prepare data for modeling and whatever. More and more that can be automated, 'cause those are pattern structured activities that have been mastered by smart people over many years. >> I mean we just had a customer on with a Glass'Gim CSK, with that scale, and his attitude is, we see the results from the users, then we double down and pay for it and automate it. So the automation question, it's an option question, it's a rhetorical question, but it just begs the question, which is who's writing the algorithms as machines get smarter and start throwing off their own real-time data? What are you looking at? How do you determine? You're going to need machine learning for machine learning? Are you going to need AI for AI? Who writes the algorithms >> It's actually, that's. for the algorithm? >> Automated machine learning is a hot, hot not only research focus, but we're seeing it more and more solution providers, like Microsoft and Google and others, are goin' deep down, doubling down in investments in exactly that area. That's a productivity play for data scientists. >> I think the data markets going to change radically in my opinion. I see you're startin' to some things with blockchain and some other things that are interesting. Data sovereignty, data governance are huge issues. Basil, just give your final thoughts for this segment as we wrap this up. 
Final thoughts on data and BMC, what should people know about BMC right now? Because people might have a historical view of BMC. What's the latest, what should they know? What's the new Instagram picture of BMC? What should they know about you guys? >> So I think what I would say people should know about BMC is that all the work that we've done over the last 25 years, in virtually every platform that came before Hadoop, we have now innovated to take this into things like big data and cloud platforms. So when you are choosing Control-M as a platform for automation, you are choosing a very, very mature solution, an example of which is Navistar. Their CIO's actually speaking at the Keno tomorrow. They've had Control-M for 15, 20 years, and they've automated virtually every business function through Control-M. And when they started their predictive maintenance project, where they're ingesting data from about 300,000 vehicles today to figure out when this vehicle might break, and to predict maintenance on it. When they started their journey, they said that they always knew that they were going to use Control-M for it, because that was the enterprise standard, and they knew that they could simply now extend that capability into this area. And when they started about three, four years ago, they were ingesting data from about 100,000 vehicles. That has now scaled to over 325,000 vehicles, and they have no had to re-architect their strategy as they grow and scale. So I would say that is one of the key messages that we are taking to market, is that we are bringing innovation that spans over 25 years, and evolving it-- >> Modernizing it, basically. >> Modernizing it, and bringing it to newer platforms. >> Well congratulations, I wouldn't call that a pivot, I'd call it an extensibility issue, kind of modernizing kind of the core things. >> Absolutely. 
>> Thanks for coming and sharing the BMC perspective inside theCUBE here, on BigData NYC, this is the theCUBE, I'm John Furrier. Jim Kobielus here in New York city. More live coverage, for three days we'll be here, today, tomorrow and Thursday, and BigData NYC, more coverage after this short break. (calm electronic music) (vibrant electronic music)

Published Date : Feb 11 2019



Rob Thomas, IBM | Change the Game: Winning With AI 2018


 

>> [Announcer] Live from Times Square in New York City, it's theCUBE covering IBM's Change the Game: Winning with AI, brought to you by IBM. >> Hello everybody, welcome to theCUBE's special presentation. We're covering IBM's announcements today around AI. IBM, as theCUBE does, runs sessions and programs in conjunction with Strata, which is down at the Javits, and we're here with Rob Thomas, who's the General Manager of IBM Analytics. Longtime CUBE alum, Rob, great to see you. >> Dave, great to see you. >> So you guys got a lot going on today. We're here at the Westin Hotel, you've got an analyst event, you've got a partner meeting, you've got an event tonight, Change the Game: Winning with AI at Terminal 5, check that out, ibm.com/WinWithAI, go register there. But Rob, let's start with what you guys have going on, give us the run down. >> Yeah, it's a big week for us, and like many others, it's great when you have Strata, a lot of people in town. So, we've structured a week where, today, we're going to spend a lot of time with analysts and our business partners, talking about where we're going with data and AI. This evening, we've got a broadcast, it's called Winning with AI. What's unique about that broadcast is it's all clients. We've got clients on stage doing demonstrations, how they're using IBM technology to get to unique outcomes in their business. So I think it's going to be a pretty unique event, which should be a lot of fun. >> It looks like a cool event, a cool venue, Terminal 5, it's just up the street on the west side highway, probably a mile from the Javits Center, so definitely check that out. Alright, let's talk about, Rob, we've known each other for a long time, we've seen the early Hadoop days, you guys were very careful about diving in, you kind of let things settle and watched very carefully, and then came in at the right time. 
But we saw the evolution of so-called Big Data go from a phase of really reducing investments, cheaper data warehousing, and what that did is allow people to collect a lot more data, and kind of get ready for this era that we're in now. But maybe you can give us your perspective on the phases, the waves that we've seen of data, and where we are today and where we're going. >> I kind of think of it as a maturity curve. So when I go talk to clients, I say, look, you need to be on a journey towards AI. I think probably nobody disagrees that they need something there, the question is, how do you get there? So you think about the steps, it's about, a lot of people started with, we're going to reduce the cost of our operations, we're going to use data to take out cost, that was kind of the Hadoop thrust, I would say. Then they moved to, well, now we need to see more about our data, we need higher performance data, BI data warehousing. So, everybody, I would say, has dabbled in those two areas. The next leap forward is self-service analytics, so how do you actually empower everybody in your organization to use and access data? And the next step beyond that is, can I use AI to drive new business models, new levers of growth, for my business? So, I ask clients, pin yourself on this journey, most are, depends on the division or the part of the company, they're at different areas, but as I tell everybody, if you don't know where you are and you don't know where you want to go, you're just going to wind around, so I try to get them to pin down, where are you versus where do you want to go? >> So four phases, basically, the sort of cheap data store, the BI data warehouse modernization, self-service analytics, a big part of that is data science and data science collaboration, you guys have a lot of investments there, and then new business models with AI automation running on top. Where are we today? 
Would you say we're kind of in-between BI/DW modernization and on our way to self-service analytics, or what's your sense? >> I'd say most are right in the middle between BI data warehousing and self-service analytics. Self-service analytics is hard, because it requires you, sometimes, to take a couple steps back and look at your data. It's hard to provide self-service if you don't have a data catalog, if you don't have data security, if you haven't gone through the processes around data governance. So, sometimes you have to take one step back to go two steps forward, that's why I see a lot of people, I'd say, stuck in the middle right now. And the examples that you're going to see tonight as part of the broadcast are clients that have figured out how to break through that wall, and I think that's pretty illustrative of what's possible. >> Okay, so you're saying that, got to maybe take a step back and get the infrastructure right with, let's say, a catalog; there are some basic things that they have to do, some x's and o's, you've got the Vince Lombardi playbook out here, and also, skillsets, I imagine, is a key part of that. So, that's what they've got to do to get prepared, and then, what's next? They start creating new business models, imagining this is where the chief data officer comes in, and it's an executive level, what are you seeing clients as part of digital transformation, what's the conversation like with customers? >> The biggest change, the great thing about the times we live in, is technology's become so accessible, you can do things very quickly. We created a team last year called Data Science Elite, and we've hired what we think are some of the best data scientists in the world. Their only job is to go work with clients and help them get to a first success with data science. So, we put a team in. 
Normally, one month, two months, normally a team of two or three people, our investment, and we say, let's go build a model, let's get to an outcome, and you can do this incredibly quickly now. I tell clients, if I see somebody that says, we're going to spend six months evaluating and thinking about this, I was like, why would you spend six months thinking about this when you could actually do it in one month? So you just need to get over the edge and go try it. >> So we're going to learn more about the Data Science Elite team. We've got John Thomas coming on today, who is a distinguished engineer at IBM, and he's very much involved in that team, and I think we have a customer who's actually gone through that, so we're going to talk about what their experience was with the Data Science Elite team. Alright, you've got some hard news coming up, you've actually made some news earlier with Hortonworks and Red Hat, I want to talk about that, but you've also got some hard news today. Take us through that. >> Yeah, let's talk about all three. First, on Monday we announced the expanded relationship with both Hortonworks and Red Hat. This goes back to one of the core beliefs I talked about, every enterprise is modernizing their data and application estates, I don't think there's any debate about that. We are big believers in Kubernetes and containers as the architecture to drive that modernization. The announcement on Monday was, we're working closer with Red Hat to take all of our data services as part of Cloud Private for Data, which are basically microservices for data, and we're running those on OpenShift, and we're starting to see great customer traction with that. And where does Hortonworks come in? Hadoop has been the outlier on moving to microservices and containers, we're working with Hortonworks to help them make that move as well. So, it's really about the three of us getting together and helping clients with this modernization journey. 
>> So, just to remind people, you remember ODPI, folks? There was all this kerfuffle about, why do we even need this? Well, what's interesting to me about this triumvirate is, first of all, Red Hat and Hortonworks are hardcore open source, and IBM's always been a big supporter of open source. You three got together, and you're proving now the productivity for customers of this relationship. You guys don't talk about this, but Hortonworks had to disclose, on its public call, that the relationship with IBM drove many, many seven-figure deals, which obviously means that customers are getting value out of this. So it's great to see that come to fruition, and it wasn't just a Barney announcement a couple years ago, so congratulations on that. Now, there's this other news that you guys announced this morning, talk about that. >> Yeah, two other things. One is, we announced a relationship with Stack Overflow. 50 million developers go to Stack Overflow a month; it's an amazing environment for developers that are looking to do new things, and we're sponsoring a community around AI. Back to your point before, you asked, is there a skills gap in enterprises? There absolutely is, I don't think that's a surprise. Data science, AI developers, not every company has the skills they need, so we're sponsoring a community to help drive the growth of skills in and around data science and AI. So things like Python, R, Scala, these are the languages of data science, and it's a great relationship between us and Stack Overflow to build a community to get things going on skills. >> Okay, and then there was one more. >> The last one's a product announcement. This is one of the most interesting product announcements we've had in quite a while. Imagine this: you write a SQL query, and the traditional approach is, I've got a server, I point it at that server, I get the data, it's pretty limited. We're announcing technology where I write a query, and it can find data anywhere in the world.
I think of it as wide-area SQL. So it can find data on an automotive device, a telematics device, an IoT device, it could be a mobile device; we think of it as SQL for the whole world. You write a query, you can find the data anywhere it is, and we take advantage of the processing power on the edge. The biggest problem with IoT is that it's been the old mantra of, go find the data, bring it all back to a centralized warehouse, and that makes it impossible to do in real time. We're enabling real time because you can write a query once and find data anywhere. This is technology we've had in preview for the last year. We've been working with a lot of clients to prove out use cases, and we're integrating this capability inside of IBM Cloud Private for Data. So if you buy IBM Cloud Private for Data, it's there. >> Interesting, so when you've been around as long as I have, long enough to see some of the pendulum swings, it's clearly a pendulum swing back toward decentralization and the edge. But the key, from what you just described, is that you're sort of redefining the boundary, so I presume it's the edge, any Cloud, or on premises, wherever you can find that data, is that correct? >> Yeah, so it's multi-Cloud. I mean, look, every organization is going to be multi-Cloud, like 100%, that's going to happen, and that could be private, it could be multiple public Cloud providers, but the key point is, data on the edge is not just limited to what's in those Clouds. It could be anywhere that you're collecting data. And we're enabling an architecture which performs incredibly well, because you take advantage of processing power on the edge, and you can get data anywhere that it sits.
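The "write a query once, find data anywhere" idea Rob describes is essentially federated query: ship the same SQL to every source and union the small result sets that come back. A minimal sketch, using in-memory SQLite databases as stand-ins for edge sources; all names here are hypothetical and this is not IBM's actual API:

```python
import sqlite3

# Two independent data sources standing in for "edge" locations
# (an IoT device, a telematics unit, a cloud region, ...).

def make_source(rows):
    """Create an in-memory SQLite database with a readings table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (device TEXT, temp REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    return conn

edge_sources = [
    make_source([("truck-1", 71.2), ("truck-2", 95.8)]),
    make_source([("plant-a", 64.0), ("plant-b", 99.1)]),
]

def federated_query(sources, sql, params=()):
    """Run the same query on every source and union the results."""
    results = []
    for conn in sources:
        results.extend(conn.execute(sql, params).fetchall())
    return results

hot = federated_query(
    edge_sources,
    "SELECT device, temp FROM readings WHERE temp > ?",
    (90.0,),
)
print(sorted(hot))  # -> [('plant-b', 99.1), ('truck-2', 95.8)]
```

The real-time point from the transcript shows up here: the filter runs where the data lives, so only the matching rows travel back, not the full tables.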
And then of course, Cloud, and what's interesting, I think many of the Hadoop distro vendors kind of missed Cloud early on, and are now sort of saying, oh wow, it's a hybrid world and we've got a part to play. You guys obviously made some moves, a couple billion-dollar moves, to do some acquisitions and get hardcore into Cloud, so that becomes a critical component. You're not just limiting your scope to the IBM Cloud. You're recognizing that it's a multi-Cloud world, and that's what customers want to do. Your comments. >> It's multi-Cloud, and it's not just the IBM Cloud; I think the most predominant Cloud that's emerging is every client's private Cloud. Every client I talk to is building out a containerized architecture. They need their own Cloud, and they need seamless connectivity to any public Cloud that they may be using. This is why you see such a premium being put on things like data ingestion and data curation. It's not popular, it's not exciting, people don't want to talk about it, but the biggest inhibitor, to this AI point, comes back to data curation and data ingestion, because if you're dealing with multiple Clouds, suddenly your data's in a bunch of different spots. >> Well, so you're basically, and we talked about this a lot on theCUBE, you're bringing the Cloud model to the data, wherever the data lives. Is that the right way to think about it? >> I think organizations have spoken; set aside what they say, look at their actions. Their actions say, we don't want to move all of our data to any particular Cloud, we'll move some of our data. We need to give them seamless connectivity so that they can leave their data where they want. We can bring Cloud-native architecture to their data, and we can also help move their data to a Cloud-native architecture if that's what they prefer.
>> Well, it makes sense, because you've got physics and latency, and you've got economics; moving all the data into a public Cloud is expensive and just doesn't make economic sense. And then you've got things like GDPR and the laws of the land, if you will, that say you've got to keep the data in Germany, or whatever the country is. So those sorts of edicts dictate how you approach managing workloads and what you put where, right? Okay, what's going on with Watson? Give us the update there. >> I get a lot of questions, people trying to peel back the onion of what exactly it is, so I want to make that super clear here. Watson is a few things; start at the bottom. You need a runtime for models that you've built. So we have a product called Watson Machine Learning; it runs anywhere you want, and that is the runtime for how you execute models that you've built. Anytime you have a runtime, you need somewhere you can build models, you need a development environment. That is called Watson Studio. So, we had a product called Data Science Experience, and we've evolved that into Watson Studio, connecting in some of those features. So we have Watson Studio, that's the development environment, and Watson Machine Learning, that's the runtime. Now you move further up the stack. We have a set of APIs that bring in human features: vision, natural language processing, audio analytics, those types of things. You can integrate those as part of a model that you build. And then on top of that, we've got things like Watson Applications, we've got Watson for call centers doing customer service and chatbots, and then we've got a lot of clients who've taken pieces of that stack and built their own AI solutions. They've taken some of the APIs, they've taken some of the design time, the studio, they've taken some of the Watson Machine Learning.
So, it is really a stack of capabilities, and where we're driving the greatest productivity, and this is in a lot of the examples you'll see tonight from clients, is clients that have bought into this idea of, I need a development environment, I need a runtime, and I can deploy models anywhere. We're getting a lot of momentum on that, and then that raises the question of, well, do I have explainability, do I have trust and transparency, and that's another thing that we're working on. >> Okay, so it's an API-oriented architecture, exposing all these services to make it very easy for people to consume. Okay, so the question we've been asking all week at CUBE NYC is, Big Data and AI, is this old wine in a new bottle? I mean, it's clear, Rob, from the conversation here, there's a lot of substantive innovation, and early adoption, anyway, of some of these innovations, but a lot of potential going forward. Last thoughts? >> What people have to realize is AI is not magic, it's still computer science. So it actually requires some hard work. You need to roll up your sleeves, you need to understand how to get from point A to point B, you need a development environment, you need a runtime. I want people to really think about this: it's not magic. I think for a while, people have gotten the impression that there's some magic button. There's not, but if you put in the time, and it's not a lot of time, you'll see the examples tonight, most of them have been done in one or two months, there's great business value in starting to leverage AI in your business. >> Awesome, alright, so if you're in this city or you're at Strata, go to ibm.com/WinWithAI, register for the event tonight. Rob, we'll see you there, thanks so much for coming back. >> Yeah, it's going to be fun, thanks Dave, great to see you. >> Alright, keep it right there everybody, we'll be back with our next guest right after this short break, you're watching theCUBE.
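The studio/runtime split Rob describes comes down to a build-then-deploy contract: the development environment fits a model and emits a serialized artifact, and a separate runtime loads that artifact and scores requests. A minimal, product-agnostic sketch, with a deliberately tiny hand-rolled least-squares model standing in for anything a real studio would produce:

```python
import io
import pickle

# --- "Studio" side: build a model ----------------------------------
def fit(xs, ys):
    """Fit y = a*x + b by ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return {"a": a, "b": my - a * mx}

model = fit([1, 2, 3, 4], [3, 5, 7, 9])        # recovers y = 2x + 1

# Serialize the model artifact, as a studio would on "deploy".
artifact = io.BytesIO()
pickle.dump(model, artifact)

# --- "Runtime" side: load the artifact and score -------------------
artifact.seek(0)
deployed = pickle.load(artifact)

def predict(m, x):
    """Score one input with a deployed model artifact."""
    return m["a"] * x + m["b"]

print(predict(deployed, 10))   # -> 21.0
```

The runtime never needs the training code or the training data, only the artifact, which is what lets the two sides live in different environments.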

Published Date : Sep 18 2018



Basil Faruqui, BMC | theCUBE NYC 2018


 

(upbeat music) >> Live from New York, it's theCUBE. Covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Okay, welcome back everyone to theCUBE NYC. This is theCUBE's live coverage of the CubeNYC Strata Hadoop Strata Data Conference. All things data happen here in New York this week. I'm John Furrier with Peter Burris. Our next guest is Basil Faruqui, lead solutions marketing manager for digital business automation at BMC. He returns; he was here last year with us, and also at Big Data SV, which has been renamed Cube SV, just as this event became CubeNYC, because it's not just big data anymore. We're hearing words like multi-cloud, Istio, all those Kubernetes terms. Data is now so important, it's up and down the stack, impacting everyone. We talked about this last year with Control M, how you guys are automating in a hurry, the four pillars of pipelining data. The setup days are over; welcome to theCUBE. >> Well, thank you, and it's great to be back on theCUBE. And yeah, what you said is exactly right, so you know, big data has really, I think, now been distilled down to data. Everybody understands data is big, and it's important, and it is really, you know, it's quite a cliche, but to a large degree, data is the new oil, as some people say. And I think what you said earlier is important in that we've been very fortunate to be able to not only follow the journey of our customers but be a part of it. So about six years ago, some of the early adopters of Hadoop came to us and said that look, we use your products for traditional data warehousing on the ERP side for orchestrating workloads. We're about to take some of these projects on Hadoop into production and really feel that the Hadoop ecosystem is lacking enterprise-grade workflow orchestration tools.
So we partnered with them, and one of the earliest goals they wanted to achieve was to build a data lake and provide richer and wider data sets to the end users, to be able to do some dashboarding, customer 360, and things of that nature. Very quickly, in about five years' time, we have seen a lot of these projects mature from how do I build a data lake to now applying cutting-edge ML and AI, and cloud is a major enabler of that. You know, it's really, as we were talking about earlier, it's really taking away excuses for not being able to scale quickly from an infrastructure perspective. Now you're talking about, is it Hadoop or is it S3, or is it Azure Blob Storage, is it Snowflake? And from a Control M perspective, we're very platform and technology agnostic, so some of our customers who had started with Hadoop as a platform are now looking at other technologies like Snowflake. One of our customers describes it as kind of the spine or a power strip of orchestration: regardless of what technology you have, you can just plug and play and not worry about how do I rewire the orchestration workflows, because Control M is taking care of it.
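The "power strip" metaphor comes down to writing workflows against a common interface, so the platform underneath can be swapped without rewiring. A minimal sketch of that design idea with hypothetical backend classes; this illustrates the pattern only and is not Control M's actual API:

```python
from abc import ABC, abstractmethod

class Storage(ABC):
    """Common interface the workflow is written against."""
    @abstractmethod
    def read(self, path: str) -> str:
        ...

class HadoopStorage(Storage):
    def read(self, path):
        return f"hdfs://{path}"       # stand-in for a real HDFS read

class SnowflakeStorage(Storage):
    def read(self, path):
        return f"snowflake://{path}"  # stand-in for a real query

def workflow(storage: Storage):
    # The workflow never names a platform, so swapping Hadoop for
    # Snowflake (or S3, or Azure Blob) needs no rewiring here.
    return storage.read("sales/2018/q3")

print(workflow(HadoopStorage()))     # -> hdfs://sales/2018/q3
print(workflow(SnowflakeStorage()))  # -> snowflake://sales/2018/q3
```

Adding a new platform means adding one adapter class; every existing workflow picks it up unchanged.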
That suggests that ultimately, the historical approach of focusing on the technology and trying to apply it to a periodic series of data science problems has to become a little bit more mature, so it actually becomes a strategic capability. So the business can say we're operating on this, but taking that underlying data science technology and turning it into business operations, that's where a lot of the real work has to happen. Is that what you guys are focused on?
A developer who's building a complex, multi-application platform has an API and a programmatic interface to do that. Operations, which has to monitor all of this, has rich dashboards to be able to do that. That's one of the areas that has been key to our success over the last couple of decades, and we're seeing that translate very well into the big data space. >> So I just want to go under the hood for a minute, because I love that answer. And I'd like to pivot off what Peter said, tying it back to the business, okay, that's awesome. And I want to learn a little bit more about this, because we talked about this last year and I kind of am seeing it now. Kubernetes and all this orchestration is about workloads. You guys nailed the workflow issue, complex workflows. Because if you look at it, if you're adding line of business into the equation, that's just complexity in and of itself. As more workflows exist within each line of business, whether it's recommendations and offers and workflow issues, more lines of business in there is complex for even IT to deal with, so you guys have nailed that. How does that work? Do you plug it in and the lines of business have their own developers, so the people who work with the workflows engage how? >> So that's a good question. With orchestration and automation now becoming very, very generic, it's kind of important to classify where we play. So there's a lot of tools that do release and build automation. There's a lot of tools that'll do infrastructure automation and orchestration. All of this infrastructure and release management process is done ultimately to run applications on top of it, and the workflows of the application need orchestration, and that's the layer that we play in. And if you think about it, how the end user, the business, and the consumer interact with all of this technology is through applications, okay?
So the orchestration of the workflows inside the applications, whether you start all the way from an ERP or a CRM and then land into a data lake and then run an ML model, and out come the recommendations and analytics, that's the layer we are automating today. Obviously, all of this-- >> By the way, the technical complexity for the user is in the app. >> Correct, and the line of business obviously has a lot more control. You're seeing roles like chief digital officer emerge, you're seeing CTOs that have mandates like, okay, you're going to be responsible for all applications that are customer facing, while the CIO is going to take care of everything that's inward facing. There's no settled structure or science involved yet. >> It's evolving fast. >> It's evolving fast. But what's clear is that the line of business has a lot more interest and influence in driving these technology projects, and it's important that technologies evolve in a way where the line of business can not only understand but take advantage of that. >> So I think it's a great question, John, and I want to build on that and then ask you something. So the way we look at the world is, we say the first fifty years of computing were known process, unknown technology. The next fifty years are going to be unknown process, known technology. It's all going to look like a cloud. But think about what that means. Known process, unknown technology: Control M and related types of technologies tended to focus on how you put in place predictable workflows in the technology layer. And now, unknown process, known technology, driven by the line of business, now we're talking about controlling process flows that are being created bespoke, strategic, differentiating ways of doing business. >> Well, dynamic, too, I mean, dynamic. >> Highly dynamic, and those workflows in many respects, those technologies, piecing applications and services together, become the process that differentiates the business.
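The application-workflow layer Basil describes, an ERP feed landing in a data lake, then an ML model, then recommendations, is at its core a dependency graph executed in an order that respects every upstream step. A toy sketch of that core idea; real orchestrators add scheduling, retries, and monitoring on top, and the task names here are made up:

```python
# task -> tasks it depends on
deps = {
    "extract_erp": [],
    "load_lake": ["extract_erp"],
    "train_model": ["load_lake"],
    "recommend": ["train_model"],
}

def run_order(deps):
    """Return tasks in an order that respects all dependencies
    (depth-first topological sort; assumes the graph is acyclic)."""
    done, order = set(), []
    def visit(task):
        if task in done:
            return
        for upstream in deps[task]:
            visit(upstream)   # make sure prerequisites run first
        done.add(task)
        order.append(task)
    for task in deps:
        visit(task)
    return order

print(run_order(deps))
# -> ['extract_erp', 'load_lake', 'train_model', 'recommend']
```

The same structure works whether a task is an ERP extract, a Spark job, or a model-training step, which is why one orchestration layer can span all of them.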
Again, you're still focused on the infrastructure a bit, but you've moved it up. Is that right? >> Yeah, that's exactly right. We see our goal as abstracting the complexity of the underlying application, data, and infrastructure. So, I mean, it's quite amazing--
Primarily, their entire infrastructure is built on AWS, but they are now utilizing Google Cloud for some of their recommendation and sentiment analysis, because their goal is to pick the best-of-breed technology for the problem they're looking to solve.
>> Correct, and the other thing I would like to add here is that you can not only build complex multi-platform, multi-application workflows, but never lose focus of the business service or business process there. So you can tie all of this to a business service, and then, because these things are complex and there are problems, let's say there's an ETL job that fails somewhere upstream, Control M will immediately be able to predict the impact and tell you that this means the recommendation engine will not be able to make the recommendations. Now, the staff that's going to work on remediation understands the business impact, versus looking at a screen where there are 500 jobs and one of them has failed. What does that really mean? >> Set priorities and focal points and everything else. >> Right. >> So I just want to wrap up by asking you how your talk went at the Strata Hadoop Data Conference. What were you talking about, what was the core message? Was it Control M, was it customer presentations? What was the focus? >> So the focus of yesterday's talk was, you know, academic talk is great, but it's important to show how things work in real life. The session was focused on a real use case from a customer, Navistar. They have IoT data-driven pipelines where they are predicting failures of parts inside the trucks and buses that they manufacture, you know, reducing vehicle downtime. So we wanted to simulate a demo like that, and that's exactly what we did. It was very well received. In real time, we spun up an EMR environment in AWS, automatically provisioned the infrastructure there, applied Spark and machine learning algorithms to the data, and out came the recommendation at the end, which was, you know, here are the vehicles that are-- >> Fix their brakes. (laughing) >> Exactly, so it was very, very well received. >> I mean, there's a real-world example, there's real money to be saved: maintenance, scheduling, potential liability, accidents.
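The failure-impact prediction described above, where an upstream ETL failure is translated into "the recommendation engine can't make recommendations," is essentially a reachability check over the job dependency graph plus a mapping from jobs to business services. A hedged sketch of that idea with invented job names, not Control M's engine:

```python
# job -> jobs that consume its output
downstream = {
    "ingest": ["curate"],
    "curate": ["train"],
    "train": ["recommend"],
    "recommend": [],
}
# jobs that directly deliver a business service
service_of = {"recommend": "recommendation engine"}

def impacted(failed, downstream):
    """All jobs reachable downstream from the failed job."""
    seen, stack = set(), [failed]
    while stack:
        job = stack.pop()
        for nxt in downstream[job]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

hit = impacted("curate", downstream)
services = sorted(service_of[j] for j in hit if j in service_of)
print(sorted(hit))   # -> ['recommend', 'train']
print(services)      # -> ['recommendation engine']
```

Instead of "1 of 500 jobs failed," the operator sees which business service is at risk, which is the point Basil makes about prioritizing remediation.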
>> Liability is a huge issue for a lot of manufacturers. >> And Navistar has been at the leading edge of how to apply technologies in that business. >> They really have been a poster child for digital transformation. >> They sure have. >> Here's a company that's been around for 100-plus years, and when we talk to them, they tell us that they have every technology under the sun that has come along since the mainframe, and for them to be transforming and leading in this way, we're very fortunate to be part of their journey. >> Well, we'd love to talk more about some of these customer use cases. That's something people love about theCUBE; we want to do more of them and share those examples. People love to see proof in real-world examples, not just talk, so appreciate you sharing. >> Absolutely. >> Thanks for sharing, thanks for the insights. We're here with theCUBE live in New York City, part of CubeNYC; we're getting all the data and sharing that with you. I'm John Furrier with Peter Burris. Stay with us for more day two coverage after this short break. (upbeat music)

Published Date : Sep 13 2018



Ronen Schwartz, Informatica | theCUBE NYC 2018


 

>> Live from New York, it's theCUBE covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. (techy music) >> Welcome back to the Big Apple, everybody. This is theCUBE, the leader in live tech coverage. My name is Dave Vellante, I'm here with my cohost Peter Burris, and this is our week-long coverage of CUBENYC. It used to be, really, a big data theme; it has evolved into data, AI, machine learning. Ronen Schwartz is here; he's the senior vice president and general manager of cloud, big data, and data integration at data integration company Informatica. Great to see you again, Ronen, thanks so much for coming on. >> Thanks for inviting me, it's a good, warm day in New York. >> Yeah, the storm is coming and... Well, speaking of storms, the data center is booming. Data is, you know, this crescendo of storms that (chuckles) has occurred, and you guys are at the center of that. It's been a tailwind for your business. Give us the update, how's business these days? >> So, we finished Q2 with great, great success, the best Q2 that we've ever had, and the third quarter looks just as promising. So I think the short answer is that we are seeing strong demand for data and for technologies that support data. We're seeing more users, new use cases, and definitely huge growth in the need to support data, big data, data in the cloud, and so on. So I think a very, very good Q2, and it looks like Q3 is going to be just as good, if not better. >> That's great. So, there's been a decades-long conversation, of course, about data and the value of data, but more often than not over recent history, and when I say recent I mean the last 20 years or so, data's been a problem for people. It's been expensive: how do you manage it, when do you delete it? It's sort of this nasty thing that people have to deal with. Fast forward to 2010 and the whole Hadoop movement, and all of a sudden data's the new oil, data's...
You know, which Peter, of course, disagrees with for many reasons. >> No, it's... >> We don't have to get into it. >> It's subtlety. >> It's a subtlety, but you're right about it, and well, maybe if we have time we can talk about that, but the bromide of... But really focused attention on data and the importance of data and the value of data, and that was really a big contribution that Hadoop made. There were a lot of misconceptions. "Oh, we don't need the data warehouse anymore." "Oh, we don't need old," you know, "legacy databases." Of course none of those are true. Those are fundamental components of people's big data strategy, but talk about the importance of data and where Informatica fits. >> In a way, if I look into the same history that you described, Informatica has definitely been a player through this history. We divide it into three eras. The first one is when data was like this thing that sits below the application, that used the application to feed the data in, and if you want to see the data you go through the application, you see the data. We sometimes call that Data 1.0. Data 2.0 was the time that companies, including Informatica, kind of arose and were able to give you a single view of the data across multiple systems, across your organization, and so on. This is where Informatica's ETL, with data quality, even with master data management, kind of came into play and allowed an organization to actually build analytics as a system, to build single view as a system, et cetera. I think what is happening, and Hadoop was definitely a trigger, but I would say the cloud is just as big of a trigger as the big data technologies, and definitely everything that's happening right now with Spark and the processing power, et cetera, is contributing to that. This is the time of Data 3.0, when data is actually in the center. It's not a single application like it was in Data 2.0. It's not this thing below the application in Data 1.0.
Data is in the center and everything else basically just has to be connected to the data, and I think it's an amazing time. A big part of digitalization is the fact that the data is actually there. It's the most important asset the organization has. >> Yeah, so I want to follow up on something. So, last night we had a session Peter hosted on the future of AI, and he made the point, I said earlier data's the new oil. I said you disagreed, there's a nuance there. You made the point last night that oil, I can put oil in my car, I can put oil in my house, I can't do both. Data is the new currency, people said, "Well, I can spend a dollar or I can spend a dollar on sports tickets, I can't do both." Data's different in that... >> It doesn't follow the economics of scarcity, and I think that's one of the main drivers here. As you talk about 1.0, 2.0, and 3.0, 1.0 it's locked in the application, 2.0 it's locked in a model, 3.0 now we're opening it up so that the same data can be shared, it can be evolved, it can be copied, it can be easily transformed, but the big issue is we have to sustain overall coherence of it. Security has to remain in place, we have to avoid corruption. Talk to us about some of the new demands given, especially that we've got this, more data but more users of that data. As we think about evidence-based management, where are we going to ensure that all of those new claims from all of those new users against those data sources can be satisfied? >> So, first, I truly like... This is a big nuance, it's not a small one. (laughs) The fact that you have better data actually means that you do a lot of things better. It doesn't mean that you do one thing better and you cannot do the other. >> Right. I agree 100%, and I actually attribute that to two things.
One is more users, and the other thing is more ways to use the data, so the fact that you have better data, more data, big data, et cetera, actually means that your analytics is going to be better, right, but it actually means that if you are looking into hyperautomation and AI and machine learning and so on, suddenly this is possible to do because you have this data foundation that is big enough to actually support machine learning processes, and I think we're just in the beginning of that. I think we're going to see data being used for more and more use cases. We're in the integration business and in the data management business, and we're seeing, within what our customers are asking us to support, this huge growth in the number of patterns of how they want the data to be available, how they want to bring data into different places, into different users, so all of that is truly supporting what you just mentioned. I think if you look into the Data 2.0 timeframe, it was the time that a single team that is very, very strong with the right tools can actually handle the organization needs. In what you described, suddenly self-service. Can every group consume the data? Can I get the data in both batch and realtime? Can I get the data in a massive amount as well as in small chunks? These are all becoming very, very central. 
>> And very use case, but also user and context, you know, we think about time, dependent, and one of the biggest challenges that we have is to liberate the data in the context of the multiple different organization uses, and one of the biggest challenges that customers have, or that any enterprise has, and again, evidence-based management, nice trend, a lot of it's going to happen, but the familiarity with data is still something that's not, let's say, broadly diffused, and a lot of the tools for ensuring that people can be made familiar, can discover, can reuse, can apply data, are modestly endowed today, so talk about some of these new tools that are going to make it easier to discover, capture, catalog, sustain these data assets? >> Yeah, and I think you're absolutely right, and if this is such a critical asset, and data is, and we're actually looking into more users consuming the data in more ways, it actually automatically creates a bottleneck in how do I find the data, how do I identify the data that I need, and how am I making this available in the right place at the right time? In general, it looks like a problem that is almost unsolvable, like I got more data, more users, more patterns, and nobody has their budget tripled or quadrupled just to be able to consume it. How do you address that? I think Informatica very early on identified this growing need, and we have invested in a product that we call the enterprise data catalog, and it's actually... The concept of a catalog or a metadata repository, a place that you can actually identify all the data that exists, is not necessarily a new concept-- >> No, it's been around for years. >> Yes, but doing it in an enterprise-unified way is unique, and I think what we're trying to do is basically empower any user to do what, you know, we're all doing with Google. You type something and you find it.
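Ronen's Google analogy, where any user types a query and finds the data, can be sketched as a toy catalog index. This is purely illustrative: the dataset names are invented, and Informatica's actual Enterprise Data Catalog relies on machine-learning-driven indexing and curation far beyond keyword matching.

```python
# A toy illustration of Google-style search over a metadata catalog.
# Dataset names, tags, and the ranking scheme are all hypothetical.

from dataclasses import dataclass, field

@dataclass
class DatasetEntry:
    name: str
    description: str
    tags: list = field(default_factory=list)

class MiniCatalog:
    def __init__(self):
        self.entries = []

    def register(self, entry):
        self.entries.append(entry)

    def search(self, query):
        """Rank entries by how many query terms hit the name, description, or tags."""
        terms = query.lower().split()
        scored = []
        for e in self.entries:
            haystack = " ".join([e.name, e.description] + e.tags).lower()
            score = sum(1 for t in terms if t in haystack)
            if score:
                scored.append((score, e.name))
        return [name for _, name in sorted(scored, reverse=True)]

catalog = MiniCatalog()
catalog.register(DatasetEntry("crm_customers", "customer master records", ["crm", "pii"]))
catalog.register(DatasetEntry("web_clicks", "raw clickstream events", ["web", "behavioral"]))
catalog.register(DatasetEntry("customer_360", "unified customer view", ["crm", "analytics"]))

results = catalog.search("customer crm")
```

A real enterprise catalog layers ML-assisted curation, recommendations, and access control on top of this kind of index; the sketch only shows why a single searchable place for all metadata changes the IT-to-business interface.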
If you're trying to find data in the organization in a similar way, it's a much harder task, and basically the catalog, Informatica's enterprise-unified catalog, is doing that, leveraging a lot of machine learning and AI behind the scenes to basically make this search possible, make the identification of the data possible, the curation of the data possible, and basically empowering every user to find the data that he wants, see recommendations for other data that can work with it, and then basically consume the data in the way that he wants. I totally think that this will change the way IT is functioning. It is actually an amazing bridge between IT and the business. If there is one place that you can search all your data, suddenly the whole interface between IT and the business is changing, and Informatica's actually leading this change. >> So, the catalog gives you line-of-sight on all, (clears throat) all those data sources, what's the challenge in terms of creating a catalog and making it performant and useful? >> I think there are a few levels of the challenge. I chose the words enterprise-unified intelligent catalog deliberately, and I think each one of them is representing a different challenge. The first challenge is the unified. There is technical metadata, this is the mapping and the processes that move data from one place to the other, then there is business metadata. These are the definitions the business is using, and then there is the operational metadata as well, as well as the physical location and so on. Unifying all of them so that you can actually connect and see them in one place is a unique challenge that at this stage we have already completely addressed. The second one is enterprise, and when talking about enterprise metadata it means that you want all of your applications, you want applications in the cloud, you want your cloud environment, your big data environment.
You want, actually, your APIs, you want your integration environment. You want to be able to collect all of this metadata across the enterprise, so unified was all the types, and enterprise is the second one. The third challenge is actually the most exciting one, which is how can you leverage intelligence so it's not limited by the human factor, by the amount of people that you have to actually put the data together, right? >> Mm-hm. >> And today we're using very, very sophisticated, interesting algorithms to run on the metadata and be able to tell you that even though you don't know how the data got from here to here, it actually did get from here to here. >> Mm-hm. >> It's a dotted line, maybe somebody copied it, maybe something else happened, but the data is so similar that we can actually tell you it came from one place. >> So, actually, let me see, because I think there's... I don't think you missed a step, but let me reveal a step that's in there. One of the key issues in the enterprise side of things is to reveal how data's being used. The value of data is tied to its context, and having catalogs that can do, as you said, the unified, but also the metadata becomes part of how it's used, makes that opportunity, that ability to then create audit trails and create lineage, possible. >> You're absolutely right, and I think it actually is one of the most important things, to see where the data came from and what steps it went through. >> Right. >> There's also one other very interesting value of lineage that I think sometimes people tend to ignore, which is who else is using it? >> Right. >> Who else is consuming it, because that is actually, like, a very good indicator of how good the data is or how common the data is. The ability to actually leverage and create this lineage is a mandatory thing. The ability to create lineage that is inferred, and not actually specifically defined, is also very, very interesting, but we're now doing, like, things that are, I think, really exciting.
For example, let's say that a user is looking into a data field in one source and he is actually identifying that this is a certain, specific ID that his organization is using. Now we're able to actually automatically understand that this field actually exists in 700 places, and actually, leverage the intelligence that he just gave us and actually ask him, "Do you want it to be automatically updated everywhere? "Do you want to do it in a step-by-step, guided way?" And this is how you actually scale to handle the massive amount of data, and this is how organizations are going to learn more and more and get the data to be better and better the more they work with the data. >> Now, Ronan, you have hard news this week, right? Why don't you update us on what you've announced? >> So, I think in the context for our discussion, Informatica announced here, actually today, this morning in Strata, a few very exciting news that are actually helping the customer go into this data journey. The first one is basically supporting data across, big data across multi-clouds. The ability to basically leverage all of these great tools, including the catalog, including the big data management, including data quality, data governance, and so on, on AWS, on Azure, on GCP, basically without any effort needed. We're even going further and we're empowering our user to use it in a serverless mode where we're actually allowing them full control over the resources that are being consumed. This is really, really critical because this is actually allowing them to do more with the data in a lower cost. I think the last part of the news that is really exciting is we added a lot, a lot of functionality around our Spark processing and the capabilities of the things that you can do so that the developers, the AI and machine learning can use their stuff, but at the same time we actually empower business users to do more than they ever did before. 
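The two ideas Ronen describes here, inferring a "dotted line" lineage from how similar the data is, and propagating a user's identification of a field to every place it appears, can be sketched together with a naive value-overlap fingerprint. Everything below is hypothetical and illustrative; a real catalog uses far more sophisticated profiling than Jaccard similarity on distinct values.

```python
# Illustrative sketch only: infer undocumented "dotted line" lineage and
# propagate a user-supplied tag by comparing the distinct values two fields hold.

def jaccard(a, b):
    """Overlap between two collections of values, from 0.0 (disjoint) to 1.0 (identical sets)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def infer_links(fields, threshold=0.9):
    """Flag field pairs whose value overlap suggests an undocumented copy."""
    names = list(fields)
    return [(s, d) for i, s in enumerate(names) for d in names[i + 1:]
            if jaccard(fields[s], fields[d]) >= threshold]

def propagate_tag(fields, tagged_field, tag, threshold=0.9):
    """Once a user identifies one field, suggest the same tag everywhere it matches."""
    return {name: tag for name, vals in fields.items()
            if name != tagged_field
            and jaccard(vals, fields[tagged_field]) >= threshold}

fields = {
    "crm.customer_id": ["C1", "C2", "C3", "C4"],
    "dw.cust_id":      ["C1", "C2", "C3", "C4"],  # undocumented copy
    "web.session_id":  ["S7", "S8", "S9"],
}
links = infer_links(fields)                                   # dotted-line lineage
suggested = propagate_tag(fields, "crm.customer_id", "customer_id")
```

The point of the sketch is the scaling argument from the conversation: one human identification, combined with automated matching, can fan out to the hundreds of places the same field actually lives.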
So, kind of being able to expand the amount of users that can access the data, one wanting a more sophisticated way, and one wanting a very simple but still very powerful way, I think this is kind of the summary of the news. >> And just a quick followup on that. If I understand it, it's your full complement of functionality across these clouds, is that right? You're not neutering... (chuckles) >> That is absolutely correct, yes, and we are seeing, definitely within our customers, a growing choice to focus their big data efforts in the cloud, and it makes a lot of sense. The ability to scale up and down in the cloud is significantly superior, but also the ability to give more users access in the cloud is typically easier, so Informatica has chosen enterprise cloud data management as the market we're focusing on. We talked a lot about data management. This is a lot about the cloud, the cloud part of it, and it's basically a very, very focused effort in optimizing things across clouds. >> Cloud is critical, obviously. That's how a lot of people want to do business. They want to do business in a cloud-like fashion, whether it's on-prem or off-prem. A lot of people want things to be off-prem. Cloud's important because it's where innovation is happening, and scale. Ronen, thanks so much for coming on theCUBE today. >> Yeah, thank you very much, and I did learn something, oil is not one of the terms that I'm going to use for data in the future. >> Makes you think about that, right? >> I'm going to use something different, yes. >> It's good, and I also... My other takeaway is, in that context, being able to use data in multiple places. There's a proportional relationship between usage and value, so thanks for that. >> Excellent. >> Happy to be here. >> And thank you, everybody, for watching. We will be right back right after this short break. You're watching theCUBE at #CUBENYC, we'll be right back. (techy music)

Published Date : Sep 13 2018



Jim Franklin & Anant Chintamaneni | theCUBE NYC 2018


 

>> Live from New York. It's theCUBE. Covering theCUBE New York City, 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. >> I'm John Furrier with Peter Burris, our next two guests are Jim Franklin, Director of Product Management at Dell EMC, and Anant Chintamaneni, who is the Vice President of Products at BlueData. Welcome to theCUBE, good to see you. >> Thanks, John. >> Thank you. >> Thanks for coming on. >> I've been following BlueData since the founding. Great company, and the founders are great. Great teams, so thanks for coming on and sharing what's going on, I appreciate it. >> It's a pleasure, thanks for the opportunity. >> So Jim, talk about the Dell relationship with BlueData. What are you guys doing? You have the Dell EMC Ready Solutions. How is that related now, because you've seen this industry with us over the years morph. It's really now about, the set-up days are over, it's about proof points. >> That's right. >> AI and machine learning are driving the signal, which is saying, 'We need results'. There's action on the developer's side, there's action on the deployment, people want ROI, that's the main focus. >> That's right. That's right, and we've seen this journey happen from the new batch processing days, and we're seeing that customer base mature and come along, so the reason why we partnered with BlueData is, you have to have the software, you have to have the containers. They have to have the algorithms, and things like that, in order to make this real. So it's been a great partnership with BlueData, and it dates back actually a little farther than some may realize, all the way to 2015, believe it or not, when we used to incorporate BlueData with Isilon. So it's been actually a pretty positive partnership.
Now we've talked with you guys in the past, you guys were on the cutting edge, this was back when Docker containers were fashionable, but now containers have become so proliferated out there, it's not just Docker, containerization has been the wave. Now, Kubernetes on top of it is really bringing in the orchestration. This is really making the storage and the network so much more valuable with workloads, with their respective workloads, and AI is a part of that. How do you guys navigate those waters now? What's the BlueData update, how are you guys taking advantage of that big wave? >> I think, great observation. We embraced Docker containers even before Docker was formed as a company, and Kubernetes was just getting launched, so we saw the value of Docker containers very early on, in terms of being able to obviously provide the agility, elasticity, but also, from a packaging of applications perspective, as we all know it's a very dynamic environment, and today, I think we are very happy to know that, with Kubernetes being a household name now, especially in tech companies, the way we're navigating this is, we have a turnkey product, which has containerization, and now we are taking our value proposition of big data and AI and lifecycle management and bringing it to Kubernetes with an open source project that we launched called KubeDirector under our umbrella. So, we're all about bringing stateful applications like Hadoop, AI, ML to the community and to our customer base, which includes some of the largest financial services and health care customers. >> So the container revolution has certainly gripped developers, and developers have always had a history of chasing after the next cool technology, and for good reason, it's not like just chasing after...
Developers tend not to just chase after the shiny thing, they chase after the most productive thing, and they start using it, and they start learning about it, and they make themselves valuable, and they build more valuable applications as a result. But there's this interesting meshing of creators, makers, in the software world, between the development community and the data science community. How are data scientists, who you must be spending a fair amount of time with, starting to adopt containers, what are they looking at? Are they even aware of this, as you try to help these communities come together? >> We absolutely talk to the data scientists, and they're the drivers of determining what applications they want to consume for the different use cases. But, at the end of the day, the person who has to deliver these applications, you know data scientists care about time to value, getting the environment quickly all prepared so they can access the right data sets. So, in many ways, most of our customers, many of them are unaware that there's actually containers under the hood. >> So this is the data scientists. >> The data scientists, but the actual administrators and the system administrators who are making these tools available, are using containers as a way to accelerate the way they package the software, which has a whole bunch of dependent libraries, and there's a lot of complexity out there. So they're simplifying all that and providing the environment as quickly as possible. >> And in so doing, making sure that whatever workloads are put together can scale, can be combined differently and recombined differently, based on requirements of the data scientists. So the data scientist sees the tool... >> Yeah.
>> The tool is manifest as, in concert with some of these new container related technologies, and then the whole CICD process supports the data scientist. >> The other thing to think about though, is that this also allows freedom of choice, and we were discussing off camera before, these developers want to pick out what they want to work with, they don't want to have to be locked in. So with containers, you can also speed that deployment but give them freedom to choose the tools that make them most productive. That'll make them much happier, and probably much more efficient. >> So there's a separation between the data science tools and the developer tools, but they end up all supporting the same basic objective. So how does the infrastructure play in this, because the challenge of big data for the last five years, as John and I both know, is that a lot of people conflated the outcome of data science, the outcome of big data, with the process of standing up clusters and lining up Hadoop, and if they failed on the infrastructure, they said it was a failure overall. So how are you making the infrastructure really simple, and lining up with this time to value? >> Well, the reality is, we all need food and water. IT still needs server and storage in order to work. But at the end of the day, the abstraction has to be there, just like VMware in the early days, clouds; containers with BlueData is just another way to create a layer of abstraction. But this one is in the context of what the data scientist is trying to get done, and that's the key to why we partnered with BlueData and why we delivered big data as a service. >> So at that point, what's the update from Dell EMC and Dell, in particular, Analytics? Obviously you guys work with a lot of customers, have challenges, how are you solving those problems? What are those problems?
Because we know there's some AI rumors, big Dell event coming up, there's rumors of a lot of AI involved, I'm speculating there's going to be probably a new kind of hardware device and software. What's the state of the analytics today? >> I think a lot of the customers we talked about, they were born in that batch processing, that Hadoop space we just talked about. I think they largely got that right, they've largely got that figured out, but now we're seeing proliferation of AI tools, proliferation of sandbox environments, and you're starting to see a little bit of silo behavior happening, so what we're trying to do is that IT shop is trying to dispatch those environments, dispatch with some speed, with some agility. They want to have it at the right economic model as well, so we're trying to strike a better balance, say 'Hey, I've invested in all this infrastructure already, I need to modernize it, and I also need to offer it up in a way that data scientists can consume it'. Oh, by the way, we're starting to see them hire more and more of these data scientists. Well, you don't want your data scientists, this very expensive, intelligent resource, sitting there doing data mining, data cleansing, ETL offloads; we want them actually doing modeling and analytics. So we find that a lot of times right now, as you're doing an operational change, the operational mindset, as you're starting to hire these very expensive people to do this very good work at the core of the data, they need to get productive in the way that you hired them to be productive. >> So what is this ready solution, can you just explain what that is? Is it a program, is it a hardware, is it a solution? What is the ready solution?
>> Generally speaking, what we do as a division is we look for value workloads, just generally speaking, not necessarily in batch processing, or AI, or applications, and we try and create an environment that solves that customer challenge. Typically they're very complex, SAP, Oracle Database, AI, my goodness. Very difficult. >> Variety of tools, using Hive, NoSQL, all this stuff's going on. >> Cassandra, you've got Tensorflow, so we try to fit together a set of knowledge experts, that's the key, the intellectual property of our engineers, and their deep knowledge expertise in a certain area. So for AI, we have a set of them back at the shop, they're in the lab, and this is what they do, and they're serving up these models, they're putting data through its paces, they're doing the work of a data scientist. They are data scientists. >> And so this is where BlueData comes in. You guys are part of this abstraction layer in the ready solutions. Offering? Is that how it works? >> Yeah, we are the software that enables the self-service experience, the multitenancy, that the consumers of the ready solution would want in terms of being able to onboard multiple different groups of users, lines of business, so you could have a user that wants to run basic Spark clusters, Spark jobs, or you could have another user group that's using Tensorflow, or accelerated by a special type of CPU or GPU, and so you can have them all on the same infrastructure. >> One of the things Peter and I were talking about, Dave Vellante, who was here, he's at another event right now getting some content, but one of the things we observed was, we saw this awhile ago so it's not new to us, but certainly we're seeing the impact at this event, Hadoop World, which is now called Strata Data NYC, is that we hear words like Kubernetes, and Multi Cloud, and Istio for the first time. At this event. This is the impact of the Cloud.
The Cloud has essentially leveled the Hadoop World, certainly there's some Hadoop activity going on there, people have clusters, there's standing up of infrastructure for analytics, obviously AI drives that, but now you have the Cloud being a power base. Changing that analytics infrastructure. How has it impacted you guys? BlueData, how are you guys impacted by the Cloud? Tailwind for you guys? Helpful? Good? >> You described it well, it is a tailwind. This space is about the data, not where the data lives necessarily, but the robustness of the data. So whether that's in the Cloud, whether that's on Premise, whether that's on Premise in your own private Cloud, I think anywhere where there's data that can be gathered, modeled, and new insights being pulled out of, this is wonderful, so wherever we get data, whether it's born in the Cloud or born on Premise, this is actually an accelerant to the solutions that we built together. >> As BlueData, we're all in on the Cloud, we support all three major Cloud providers, that was the big announcement that we made this week, we're generally available for AWS, GCP, and Azure, and, in particular, we start with customers who weren't born in the Cloud, so we're talking about some of the large financial services. >> We had Barclays UK here, who we nominated, and they won the Cloudera Data Impact Award, and what they're actually going through right now, is they started on Prem, they have these really packaged certified technology stacks, whether it's Cloudera Hadoop, whether it's Anaconda for data science, and what they're trying to do right now is, they're obviously getting value from that on Premise with BlueData, and now they want to leverage the Cloud. They want to be able to extend into the Cloud.
So, we as a company have made our product a hybrid Cloud-ready platform, so it can span on Prem as well as multiple Clouds, and you have the ability to move the workloads from one to the other, depending on data gravity, SLA considerations. >> Compliancy. >> I think there's one more thing, I want to test this with you guys, John, and that is, analytics is, I don't want to call it inert, or passive, but analytics has always been about getting the right data to human beings so they can make decisions, and now we're seeing, because of AI, the distinction that we draw between analytics and AI is, AI is about taking action on the data, it's about having a consequential action as a result of the data, so in many respects, NCL, Kubernetes, a lot of these not only do some interesting things for the infrastructure associated with big data, but they also facilitate the incorporation of new classes of applications that act on behalf of the brand. >> Here's the other thing I'll add to it, there's a time element here. It used to be we were passive, and it was in the past, and you're trying to project forward, that's no longer the case. You can do it right now. Exactly. >> In many respects, the history of the computing industry can be drawn in this way, you focused on the past, and then with spreadsheets in the 80s and personal computing, you focused on getting everybody to agree on the future, and now, it's about getting action to happen right now. >> At the moment it happens. >> And that's why there's so much action. We're past the set-up phase, and I think this is why we're hearing, seeing machine learning being so popular, because it's like, people want to take action, there's a demand, that's a signal that it's time to show where the ROI is and get action done. Clearly we see that. >> We're capitalists, right? We're all trying to figure out how to make money in these spaces.
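The hybrid placement choice Anant describes above, moving workloads between on-Prem and the clouds depending on data gravity and SLA considerations, can be caricatured as a small scoring function. The sites, numbers, and rules below are entirely invented for illustration; real decisions also fold in the cost, compliance, and security constraints raised in the conversation.

```python
# Toy placement scorer for hybrid big-data workloads. All attributes are
# hypothetical: feasibility is a startup-time SLA check, and among feasible
# sites the one holding the most required data wins (data gravity).

SITES = {
    "on_prem": {"data_gb_local": 50_000, "startup_minutes": 45},
    "aws":     {"data_gb_local": 2_000,  "startup_minutes": 5},
    "azure":   {"data_gb_local": 0,      "startup_minutes": 5},
}

def place(workload_gb_needed, deadline_minutes):
    """Prefer the site holding the most required data, among sites
    that can spin the environment up within the SLA deadline."""
    feasible = {s: a for s, a in SITES.items()
                if a["startup_minutes"] <= deadline_minutes}
    if not feasible:
        return None
    return max(feasible, key=lambda s: min(feasible[s]["data_gb_local"],
                                           workload_gb_needed))

choice_batch = place(workload_gb_needed=40_000, deadline_minutes=60)  # data gravity wins
choice_burst = place(workload_gb_needed=1_000, deadline_minutes=10)   # SLA rules out on-prem
```

The two calls illustrate the tradeoff: a heavy batch job follows its data on-Prem, while a small, deadline-driven job goes to whichever cloud can spin up in time.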
>> Certainly there's a lot of movement, and Cloud has proven that spinning up an instance concept has been a great thing, and certainly analytics. It's okay to have these workloads, but how do you tie it together? So, I want to ask you, because you guys have been involved in containers, Cloud has certainly been a tailwind, we agree with you 100 percent on that. What is the relevance of Kubernetes and Istio? You're starting to see these new trends. Kubernetes, Istio, Kubeflow. Higher level microservices with all kinds of stateful and stateless dynamics. I call it API 2.0, it's a whole other generation of abstractions that are going on, that are creating some goodness for people. What is the impact, in your opinion, of Kubernetes and this new revolution? >> I think the impact of Kubernetes is, I just gave a talk here yesterday, called Hadoop-la About Kubernetes. We were thinking very deeply about this. We're thinking deeply about this. So I think Kubernetes, if you look at the genesis, it's all about stateless applications, and I think as new applications are being written folks are thinking about writing them in a manner that are decomposed, stateless, microservices, things like Kubeflow. When you write it like that, Kubernetes fits in very well, and you get all the benefits of auto-scaling and the controller pattern, and ultimately Kubernetes is this finite state machine-type model where you describe what the state should be, and it will crank towards making it to that state. I think it's a little bit harder for stateful applications, and I think that's where we believe that the Kubernetes community has to do a lot more work, and folks like BlueData are going to contribute to that work which is, how do you bring stateful applications like Hadoop where there's a lot of interdependent services, they're not necessarily microservices, they're actually almost close to monolithic applications. 
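The "describe what the state should be, and it will crank towards that state" model described above can be sketched as a toy control loop. All names and structures here are invented for illustration; this is not the real Kubernetes API, just the shape of the reconcile pattern:

```python
# Toy illustration of the declarative, state-reconciling model: you declare
# the desired state, and a control loop keeps nudging the actual state
# towards it, one step per pass.

desired = {"replicas": 3}


def reconcile(actual, desired):
    """One pass of the control loop: diff desired vs. actual, then act."""
    diff = desired["replicas"] - actual["replicas"]
    if diff > 0:
        actual["replicas"] += 1   # start one more replica
    elif diff < 0:
        actual["replicas"] -= 1   # stop one replica
    return actual


actual = {"replicas": 0}
while actual != desired:          # a real controller loops like this forever
    actual = reconcile(actual, desired)

print(actual)  # {'replicas': 3}
```

The point of the pattern is that the caller never issues imperative commands; it only edits `desired`, and the loop converges, which is also why stateful services with ordered, interdependent startup are harder to fit into it.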
So I think new applications, new AI ML tooling that's going to come out, they're going to be very conscious of how they're running in a Cloud world today, in a way that folks weren't aware of seven or eight years ago, so it's really going to make a huge difference. And I think things like Istio are going to make a huge difference because you can start in the cloud and maybe now expand on to Prem. So there's going to be some interesting dynamics. >> Without hopping management frameworks, absolutely. >> And this is really critical, you just nailed it. Stateful is where ML will shine, if you can then cross the chasm to the on Premise side where the workloads can have state sharing. >> Right. >> Scales beautifully. It's a whole other level. >> Right. Rather than moving the data to the action, or the activity, you're going to have to move the processing to the data, and you want to have, nonetheless, a common, seamless management development framework so that you have the choices about where you do those things. >> Absolutely. >> Great stuff. We can do a whole Cube segment just on that. We love talking about these new dynamics going on. We'll see you at CNCF KubeCon coming up in Seattle. Great to have you guys on. Thanks, and congratulations on the relationship between BlueData and Dell EMC and Ready Solutions. This is theCUBE, with the Ready Solutions here. New York City, talking about big data and the impact, the future of AI, all things stateful, stateless, Cloud and all. It's theCUBE bringing you all the action. Stay with us for more after this short break.

Published Date : Sep 13 2018



Influencer Panel | theCUBE NYC 2018


 

- [Announcer] Live, from New York, it's theCUBE. Covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media, and its ecosystem partners. - Hello everyone, welcome back to CUBE NYC. This is a CUBE special presentation of something that we've done now for the past couple of years. IBM has sponsored an influencer panel on some of the hottest topics in the industry, and of course, there's no hotter topic right now than AI. So, we've got nine of the top influencers in the AI space, and we're in Hell's Kitchen, and it's going to get hot in here. (laughing) And these guys, we're going to cover the gamut. So, first of all, folks, thanks so much for joining us today, really, as John said earlier, we love the collaboration with you all, and we'll definitely see you on social after the fact. I'm Dave Vellante, with my cohost for this session, Peter Burris, and again, thank you to IBM for sponsoring this and organizing this. IBM has a big event down here, in conjunction with Strata, called Change the Game, Winning with AI. We run theCUBE NYC, we've been here all week. So, here's the format. I'm going to kick it off, and then we'll see where it goes. So, I'm going to introduce each of the panelists, and then ask you guys to answer a question, I'm sorry, first, tell us a little bit about yourself, briefly, and then answer one of the following questions. Two big themes that have come up this week. One has been, because this is our ninth year covering what used to be Hadoop World, which kind of morphed into big data. Question is, AI, big data, same wine, new bottle? Or is it really substantive, and driving business value? So, that's one question to ponder. The other one is, you've heard the term, the phrase, data is the new oil. Is data really the new oil? Wonder what you think about that? Okay, so, Chris Penn, let's start with you. Chris is cofounder of Trust Insights, long time CUBE alum, and friend. Thanks for coming on. 
Tell us a little bit about yourself, and then pick one of those questions. - Sure, we're a data science consulting firm. We're an IBM business partner. When it comes to "data is the new oil," I love that expression because it's completely accurate. Crude oil is useless, you have to extract it out of the ground, refine it, and then bring it to distribution. Data is the same way, where you have to have developers and data architects get the data out. You need data scientists and tools, like Watson Studio, to refine it, and then you need to put it into production, and that's where marketing technologists, technologists, business analytics folks, and tools like Watson Machine Learning help bring the data and make it useful. - Okay, great, thank you. Tony Flath is a tech and media consultant, focus on cloud and cyber security, welcome. - Thank you. - Tell us a little bit about yourself and your thoughts on one of those questions. - Sure thing, well, thanks so much for having us on this show, really appreciate it. My background is in cloud, cyber security, and certainly in emerging tech with artificial intelligence. Certainly touched it from a cyber security play, how you can use machine learning, machine control, for better controlling security across the gamut. But I'll touch on your question about wine, is it a new bottle, new wine? Where does this come from, from artificial intelligence? And I really see it as a whole new wine that is coming along. When you look at emerging technology, and you look at all the deep learning that's happening, it's going just beyond being able to machine learn and know what's happening, it's making some meaning to that data. And things are being done with that data, from robotics, from automation, from all kinds of different things, where we're at a point in society where data, our technology is getting beyond us. Prior to this, it's always been command and control. You control data from a keyboard. Well, this is passing us. 
So, my passion and perspective on this is, the humanization of it, of IT. How do you ensure that people are in that process, right? - Excellent, and we're going to come back and talk about that. - Thanks so much. - Carla Gentry, @DataNerd? Great to see you live, as opposed to just in the ether on Twitter. Data scientist, and owner of Analytical Solution. Welcome, your thoughts? - Thank you for having us. Mine is, is data the new oil? And I'd like to rephrase that is, data equals human lives. So, with all the other artificial intelligence and everything that's going on, and all the algorithms and models that's being created, we have to think about things being biased, being fair, and understand that this data has impacts on people's lives. - Great. Steve Ardire, my paisan. - Paisan. - AI startup adviser, welcome, thanks for coming to theCUBE. - Thanks Dave. So, uh, my first career was geology, and I view AI as the new oil, but data is the new oil, but AI is the refinery. I've used that many times before. In fact, really, I've moved from just AI to augmented intelligence. So, augmented intelligence is really the way forward. This was a presentation I gave at IBM Think last spring, has almost 100,000 impressions right now, and the fundamental reason why is machines can attend to vastly more information than humans, but you still need humans in the loop, and we can talk about what they're bringing in terms of common sense reasoning, because big data does the who, what, when, and where, but not the why, and why is really the Holy Grail for causal analysis and reasoning. - Excellent, Bob Hayes, Business Over Broadway, welcome, great to see you again. - Thanks for having me. So, my background is in psychology, industrial psychology, and I'm interested in things like customer experience, data science, machine learning, so forth. And I'll answer the question around big data versus AI. 
And I think there's other terms we could talk about, big data, data science, machine learning, AI. And to me, it's kind of all the same. It's always been about analytics, and getting value from your data, big, small, what have you. And there's subtle differences among those terms. Machine learning is just about making a prediction, and knowing if things are classified correctly. Data science is more about understanding why things work, and understanding maybe the ethics behind it, what variables are predicting that outcome. But still, it's all the same thing, it's all about using data in a way that we can get value from that, as a society, in residences. - Excellent, thank you. Theo Lau, founder of Unconventional Ventures. What's your story? - Yeah, so, my background is driving technology innovation. So, together with my partner, what our work does is we work with organizations to try to help them leverage technology to drive systematic financial wellness. We connect founders, startup founders, with funders, we help them get money in the ecosystem. We also work with them to look at, how do we leverage emerging technology to do something good for the society. So, very much on point to what Bob was saying about. So when I look at AI, it is not new, right, it's been around for quite a while. But what's different is the amount of technological power that we have allow us to do so much more than what we were able to do before. And so, what my mantra is, great ideas can come from anywhere in the society, but it's our job to be able to leverage technology to shine a spotlight on people who can use this to do something different, to help seniors in our country to do better in their financial planning. - Okay, so, in your mind, it's not just a same wine, new bottle, it's more substantive than that. - [Theo] It's more substantive, it's a much better bottle. - Karen Lopez, senior project manager for Architect InfoAdvisors, welcome. - Thank you. 
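The distinction drawn above, machine learning as prediction (the what) versus data science as understanding (the why), is easy to see in a toy confounder example: two variables can correlate strongly with neither causing the other, which is exactly the "why" work a prediction alone never does. All numbers below are fabricated:

```python
# Toy illustration: ice cream sales and sunburns correlate strongly,
# but neither causes the other -- a hidden confounder (temperature)
# drives both. Correlation answers "what moves together", not "why".
import random

random.seed(0)


def pearson(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Hidden cause: summer temperature drives both fabricated series.
temperature = [random.uniform(10, 35) for _ in range(500)]
ice_cream = [t * 2.0 + random.gauss(0, 3) for t in temperature]
sunburns = [t * 0.5 + random.gauss(0, 2) for t in temperature]

r = pearson(ice_cream, sunburns)
print(round(r, 2))  # strongly positive, though neither causes the other
```

A model predicting sunburns from ice cream sales would score well; only root cause analysis reveals the confounder, which is the gap "explainable AI" is chasing.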
So, I'm DataChick on twitter, and so that kind of tells my focus is that I'm here, I also call myself a data evangelist, and that means I'm there at organizations helping stand up for the data, because to me, that's the proxy for standing up for the people, and the places and the events that that data describes. That means I have a focus on security, data privacy and protection as well. And I'm going to kind of combine your two questions about whether data is the new wine bottle, I think is the combination. Oh, see, now I'm talking about alcohol. (laughing) But anyway, you know, all analogies are imperfect, so whether we say it's the new wine, or, you know, same wine, or whether it's oil, is that the analogy's good for both of them, but unlike oil, the amount of data's just growing like crazy, and the oil, we know at some point, I kind of doubt that we're going to hit peak data where we have not enough data, like we're going to do with oil. But that says to me that, how did we get here with big data, with machine learning and AI? And from my point of view, as someone who's been focused on data for 35 years, we have hit this perfect storm of open source technologies, cloud architectures and cloud services, data innovation, that if we didn't have those, we wouldn't be talking about large machine learning and deep learning-type things. So, because we have all these things coming together at the same time, we're now at explosions of data, which means we also have to protect them, and protect the people from doing harm with data, we need to do data for good things, and all of that. - Great, definite differences, we're not running out of data, data's like the terrible tribbles. (laughing) - Yes, but it's very cuddly, data is. - Yeah, cuddly data. Mark Lynd, founder of Relevant Track? - That's right. - I like the name. What's your story? - Well, thank you, and it actually plays into what my interest is. It's mainly around AI in enterprise operations and cyber security. 
You know, these teams that are in enterprise operations both, it can be sales, marketing, all the way through the organization, as well as cyber security, they're often under-sourced. And they need, what Steve pointed out, they need augmented intelligence, they need to take AI, the big data, all the information they have, and make use of that in a way where they're able to, even though they're under-sourced, make some use and some value for the organization, you know, make better use of the resources they have to grow and support the strategic goals of the organization. And oftentimes, when you get to budgeting, it doesn't really align, you know, you're short people, you're short time, but the data continues to grow, as Karen pointed out. So, when you take those together, using AI to augment, provided augmented intelligence, to help them get through that data, make real tangible decisions based on information versus just raw data, especially around cyber security, which is a big hit right now, is really a great place to be, and there's a lot of stuff going on, and a lot of exciting stuff in that area. - Great, thank you. Kevin L. Jackson, author and founder of GovCloud. GovCloud, that's big. - Yeah, GovCloud Network. Thank you very much for having me on the show. Up and working on cloud computing, initially in the federal government, with the intelligence community, as they adopted cloud computing for a lot of the nation's major missions. And what has happened is now I'm working a lot with commercial organizations and with the security of that data. And I'm going to sort of, on your questions, piggyback on Karen. There was a time when you would get a couple of bottles of wine, and they would come in, and you would savor that wine, and sip it, and it would take a few days to get through it, and you would enjoy it. The problem now is that you don't get a couple of bottles of wine into your house, you get two or three tankers of data. 
So, it's not that it's a new wine, you're just getting a lot of it. And the infrastructures that you need, before you could have a couple of computers, and a couple of people, now you need cloud, you need automated infrastructures, you need huge capabilities, and artificial intelligence and AI, it's what we can use as the tool on top of these huge infrastructures to drink that, you know. - Fire hose of wine. - Fire hose of wine. (laughs) - Everybody's having a good time. - Everybody's having a great time. (laughs) - Yeah, things are booming right now. Excellent, well, thank you all for those intros. Peter, I want to ask you a question. So, I heard there's some similarities and some definite differences with regard to data being the new oil. You have a perspective on this, and I wonder if you could inject it into the conversation. - Sure, so, the perspective that we take in a lot of conversations, a lot of folks here in theCUBE, what we've learned, and I'll kind of answer both questions a little bit. First off, on the question of data as the new oil, we definitely think that data is the new asset that business is going to be built on, in fact, our perspective is that there really is a difference between business and digital business, and that difference is data as an asset. And if you want to understand digital transformation, you understand the degree to which businesses are reinstitutionalizing work, reorganizing their people, and reestablishing their mission around what you can do with data as an asset. The difference between data and oil is that oil still follows the economics of scarcity. Data is one of those things, you can copy it, you can share it, you can easily corrupt it, you can mess it up, you can do all kinds of awful things with it if you're not careful. 
And it's that core fundamental proposition that as an asset, when we think about cyber security, we think, in many respects, that is the approach to how we can go about privatizing data so that we can predict who's actually going to be able to appropriate returns on it. So, it's a good analogy, but as you said, it's not entirely perfect, but it's not perfect in a really fundamental way. It's not following the laws of scarcity, and that has an enormous effect. - In other words, I could put oil in my car, or I could put oil in my house, but I can't put the same oil in both. - Can't put it in both places. And now, the issue of the wine, I think it's, we think that it is, in fact, it is a new wine, and very simple abstraction, or generalization we come up with is the issue of agency. That analytics has historically not taken on agency, it hasn't acted on behalf of the brand. AI is going to act on behalf of the brand. Now, you're going to need both of them, you can't separate them. - A lot of implications there in terms of bias. - Absolutely. - In terms of privacy. You have a thought, here, Chris? - Well, the scarcity is our compute power, and our ability for us to process it. I mean, it's the same as oil, there's a ton of oil under the ground, right, we can't get to it as efficiently, or without severe environmental consequences to use it. Yeah, when you use it, it's transformed, but our scarcity is compute power, and our ability to use it intelligently. - Or even when you find it. I have data, I can apply it to six different applications, I have oil, I can apply it to one, and that's going to matter in how we think about work. - But one thing I'd like to add, sort of, you're talking about data as an asset. The issue we're having right now is we're trying to learn how to manage that asset. Artificial intelligence is a way of managing that asset, and that's important if you're going to use and leverage big data. 
- Yeah, but see, everybody's talking about the quantity, the quantity, it's not always the quantity. You know, we can have just oodles and oodles of data, but if it's not clean data, if it's not alphanumeric data, which is what's needed for machine learning. So, having lots of data is great, but you have to think about the signal versus the noise. So, sometimes you get so much data, you're looking at over-fitting, sometimes you get so much data, you're looking at biases within the data. So, it's not the amount of data, it's the, now that we have all of this data, making sure that we look at relevant data, to make sure we look at clean data. - One more thought, and we have a lot to cover, I want to get inside your big brain. - I was just thinking about it from a cyber security perspective, one of my customers, they were looking at the data that just comes from the perimeter, your firewalls, routers, all of that, and then not even looking internally, just the perimeter alone, and the amount of data being pulled off of those. And then trying to correlate that data so it makes some type of business sense, or they can determine if there's incidents that may happen, and take a predictive action, or threats that might be there because they haven't taken a certain action prior, it's overwhelming to them. So, having AI now, to be able to go through the logs to look at, and there's so many different types of data that come to those logs, but being able to pull that information, as well as looking at end points, and all that, and people's houses, which are an extension of the network oftentimes, it's an amazing amount of data, and they're only looking at a small portion today because they know, there's not enough resources, there's not enough trained people to do all that work. So, AI is doing a wonderful way of doing that. 
And some of the tools now are starting to mature and be sophisticated enough where they provide that augmented intelligence that Steve talked about earlier. - So, it's complicated. There's infrastructure, there's security, there's a lot of software, there's skills, and on and on. At IBM Think this year, Ginni Rometty talked about, there were a couple of themes, one was augmented intelligence, that was something that was clear. She also talked a lot about privacy, and you own your data, etc. One of the things that struck me was her discussion about incumbent disruptors. So, if you look at the top five companies, roughly, Facebook with fake news has dropped down a little bit, but top five companies in terms of market cap in the US. They're data companies, all right. Apple just hit a trillion, Amazon, Google, etc. How do those incumbents close the gap? Is that concept of incumbent disruptors actually something that is being put into practice? I mean, you guys work with a lot of practitioners. How are they going to close that gap with the data haves, meaning data at their core of their business, versus the data have-nots, it's not that they don't have a lot of data, but it's in silos, it's hard to get to? - Yeah, I got one more thing, so, you know, these companies, and whoever's going to be big next is, you have a digital persona, whether you want it or not. So, if you live in a farm out in the middle of Oklahoma, you still have a digital persona, people are collecting data on you, they're putting profiles of you, and the big companies know about you, and people that first interact with you, they're going to know that you have this digital persona. 
Personal AI, when AI from these companies could be used simply and easily, from a personal deal, to fill in those gaps, and to have a digital persona that supports your family, your growth, both personal and professional growth, and those type of things, there's a lot of applications for AI on a personal, enterprise, even small business, that have not been done yet, but the data is being collected now. So, you talk about the oil, the oil is being built right now, lots, and lots, and lots of it. It's the applications to use that, and turn that into something personally, professionally, educationally, powerful, that's what's missing. But it's coming. - Thank you, so, I'll add to that, and in answer to your question you raised. So, one example we always used in banking is, if you look at the big banks, right, and then you look at from a consumer perspective, and there's a lot of talk about Amazon being a bank. But the thing is, Amazon doesn't need to be a bank, they provide banking services, from a consumer perspective they don't really care if you're a bank or you're not a bank, but what's different between Amazon and some of the banks is that Amazon, like you say, has a lot of data, and they know how to make use of the data to offer something as relevant that consumers want. Whereas banks, they have a lot of data, but they're all silos, right. So, it's not just a matter of whether or not you have the data, it's also, can you actually access it and make something useful out of it so that you can create something that consumers want? Because otherwise, you're just a pipe. - Totally agree, like, when you look at it from a perspective of, there's a lot of terms out there, digital transformation is thrown out so much, right, and go to cloud, and you migrate to cloud, and you're going to take everything over, but really, when you look at it, and you both touched on it, it's the economics. 
You have to look at the data from an economics perspective, and how do you find some way to make this data meaningful to your customers, that's going to work effectively for them, that they're going to drive? So, when you look at the big, big cloud providers, I think the push in things that's going to happen in the next few years is there's just going to be a bigger migration to public cloud. So then, between those, they have to differentiate themselves. The obvious one is artificial intelligence, in a way that makes it easy to aggregate data from across platforms, to aggregate data from multi-cloud, effectively. To use that data in a meaningful way that's going to drive, not only better decisions for your business, and better outcomes, but drives opportunities for customers, drives opportunities for employees and how they work. We're at a really interesting point in technology where we get to tell technology what to do. It's going beyond us, it's no longer what we're telling it to do, it's going to go beyond us. So, how we effectively manage that is going to be where we see that data flow, and those big five or big four, really take that to the next level. - Now, one of the things that Ginni Rometty said was, I forget the exact stat, but it was like, 80% of the data is not searchable. Kind of implying that it's sitting somewhere behind a firewall, presumably on somebody's premises. So, it was kind of interesting. You're talking about, certainly, a lot of momentum for public cloud, but at the same time, a lot of data is going to stay where it is. - Yeah, we're assuming that a lot of this data is just sitting there, available and ready, and we look at the desperate, or disparate kind of database situation, where you have 29 databases, and two of them have unique identifiers that tie together, and the rest of them don't. So, there's nothing that you can do with that data. 
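The silo problem described above, many databases where only a couple share an identifier, is ultimately a join problem: value only emerges where two stores carry a key you can connect on. A minimal sketch with fabricated records:

```python
# Toy sketch of the silo problem: two stores are only useful together
# because they share a customer_id. All records are fabricated.

crm = {  # silo 1: keyed by a shared customer_id
    "c1": {"name": "Ada"},
    "c2": {"name": "Grace"},
}

transactions = [  # silo 2: also carries customer_id, so it can be joined
    {"customer_id": "c1", "amount": 120.0},
    {"customer_id": "c2", "amount": 80.0},
    {"customer_id": "c1", "amount": 45.5},
]

# A simple inner join, only possible because the identifier is shared.
spend_by_name = {}
for tx in transactions:
    name = crm[tx["customer_id"]]["name"]
    spend_by_name[name] = spend_by_name.get(name, 0.0) + tx["amount"]

print(spend_by_name)  # {'Ada': 165.5, 'Grace': 80.0}
```

With 29 stores and no shared keys, no amount of AI recovers this linkage, which is the master data management point made just below.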
So, artificial intelligence is just that, it's artificial intelligence, so, they know, that's machine learning, that's natural language, that's classification, there's a lot of different parts of that that are moving, but we also have to have IT, good data infrastructure, master data management, compliance, there's so many moving parts to this, that it's not just about the data anymore. - I want to ask Steve to chime in here, go ahead. - Yeah, so, we also have to change the mentality that it's not just enterprise data. There's data on the web, the biggest thing is Internet of Things, the amount of sensor data will make the current data look like chump change. So, data is moving faster, okay. And this is where the sophistication of machine learning needs to kick in, going from just mostly supervised learning today, to unsupervised learning. And in order to really get there, as I said, big data and current AI do the who, what, where, when, and how, but not the why. And this is really the Holy Grail to crack, and it's actually under a new moniker, it's called explainable AI, because it moves beyond just correlation into root cause analysis. Once we have that, then you have the means to be able to tap into augmented intelligence, where humans are working with the machines. - Karen, please. - Yeah, so, one of the things, like what Carla was saying, and what a lot of us had said, I like to think of the advent of ML technologies and AI are going to help me as a data architect to love my data better, right? So, that includes protecting it, but also, when you say that 80% of the data is unsearchable, it's not just an access problem, it's that no one knows what it was, what the sovereignty was, what the metadata was, what the quality was, or why there's huge anomalies in it. So, my favorite story about this is, in the 1980s, about, I forget the exact number, but like, 8 million children disappeared out of the US in April, on April 15th. 
And that was when the IRS enacted a rule that, in order to have a dependent, a deduction for a dependent on your tax returns, they had to have a valid social security number, and people who had accidentally miscounted their children and over-claimed them (laughter) over the years, stopped doing that. Well, some days it does feel like you have eight children running around. (laughter) - Agreed. - When that rule came about, literally, and they're not all children, because they're dependents, but literally millions of children disappeared off the face of the earth in April, but if you were doing analytics, or AI and ML, and you don't know that this anomaly happened, I can imagine in a hundred years, someone is saying some catastrophic event happened in April, 1983. (laughter) And what caused that, was it healthcare? Was it a meteor? Was it the clown attacking them? - That's where I was going. - Right. So, those are really important things that I want to use AI and ML to help me, not only document and capture that stuff, but to provide that information to the people, the data scientists and the analysts that are using the data. - Great story, thank you. Bob, you got a thought? You got the mic, go, jump in here. - Well, yeah, I do have a thought, actually. I was talking about, what Karen was talking about. I think it's really important that, not only that we understand AI, and machine learning, and data science, but that the regular folks in companies understand that, at the basic level. Because those are the people who will ask the questions, or who know what questions to ask of the data. And if they don't have the tools, and the knowledge of how to get access to that data, or even how to pose a question, then that data is going to be less valuable, I think, to companies. And the more that everybody knows about data, even people in congress. Remember when Zuckerberg testified? (laughter) - That was scary. - How do you make money? 
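The vanishing-dependents story above is exactly the kind of anomaly a simple profiling check can surface before analytics or ML runs on the data. A toy sketch flagging an outsized year-over-year change, with fabricated yearly counts (millions of claimed dependents, with a cliff in year 5):

```python
# Toy data-profiling check: flag year-over-year changes that sit far
# outside the typical change, so an analyst knows to ask "what happened
# here?" (e.g. a validation rule change) before modeling on the data.

def zscore_flags(series, threshold=2.0):
    """Return indices whose jump from the prior value is anomalous."""
    deltas = [b - a for a, b in zip(series, series[1:])]
    mean = sum(deltas) / len(deltas)
    std = (sum((d - mean) ** 2 for d in deltas) / len(deltas)) ** 0.5
    return [i + 1 for i, d in enumerate(deltas) if abs(d - mean) > threshold * std]


# Fabricated counts of claimed dependents per year; year 5 has a cliff.
claimed = [70.0, 70.5, 71.1, 71.4, 72.0, 64.3, 64.8, 65.2]

print(zscore_flags(claimed))  # [5]
```

Nothing here explains the anomaly; it only makes sure it gets documented rather than silently absorbed into a model, which is the point Karen makes.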
It's like, we all know this. But, we need to educate the masses on just basic data analytics. - We could have an hour-long panel on that. - Yeah, absolutely. - Peter, you and I were talking about, we had a couple of questions, sort of, how far can we take artificial intelligence? How far should we? You know, so that brings into the conversation ethics, and bias, why don't you pick it up? - Yeah, so, one of the crucial things that we all are implying is that, at some point in time, AI is going to become a feature of the operations of our homes, our businesses. And as these technologies get more powerful, and they diffuse, and knowledge about how to use them diffuses more broadly, and you put more options into the hands of more people, the question slowly starts to turn from can we do it, to should we do it? And, one of the issues that I introduce is that I think the difference between big data and AI, specifically, is this notion of agency. The AI will act on behalf of, perhaps you, or it will act on behalf of your business. And that conversation is not being had, today. It's being had in arguments between Elon Musk and Mark Zuckerberg, which pretty quickly get pretty boring. (laughing) At the end of the day, the real question is, should this machine, whether in concert with others, or not, be acting on behalf of me, on behalf of my business, or, and when I say on behalf of me, I'm also talking about privacy. Because Facebook is acting on behalf of me, it's not just what's going on in my home. So, the question of, can it be done? A lot of things can be done, and an increasing number of things will be able to be done. We got to start having a conversation about should it be done? - So, humans exhibit tribal behavior, they exhibit bias. The machines are going to pick that up, go ahead, please. - Yeah, one thing that sort of tags onto agency of artificial intelligence.
Every industry, every business is now about identifying information and data sources, and their appropriate sinks, and learning how to draw value out of connecting the sources with the sinks. Artificial intelligence enables you to identify those sources and sinks, and when it gets agency, it will be able to make decisions on your behalf about what data is good, what data means, and who it should be. - What actions are good. - Well, what actions are good. - And what data was used to make those actions. - Absolutely. - And was that the right data, and is there bias of data? And all the way down, turtles all the way down. - So, all this, the data pedigree will be driven by the agency of artificial intelligence, and this is a big issue. - It's really fundamental to understand and educate people on, there are four fundamental types of bias, so there's, in machine learning, there's intentional bias, "Hey, we're going to make "the algorithm generate a certain outcome "regardless of what the data says." There's the source of the data itself, historical data that the models are trained on; models built on flawed data will behave in a flawed way. There's target source, which is, for example, we know that if you pull data from a certain social network, that network itself has an inherent bias. No matter how representative you try to make the data, it's still going to have flaws in it. Or, if you pull healthcare data about, for example, African-Americans from the US healthcare system, because of societal biases, that data will always be flawed. And then there's tool bias, there's limitations to what the tools can do, and so we will intentionally exclude some kinds of data, or not use it because we don't know how to, our tools are not able to, and if we don't teach people what those biases are, they won't know to look for them, and I know.
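Chris's "source of the data" bias can sometimes be surfaced before any model is trained, simply by auditing label base rates per group in the training set. A toy sketch, with invented records and group names (none of this data comes from the panel):

```python
from collections import defaultdict

# Invented records of (group, outcome label); the skew is deliberate, to show
# how historical data can encode a biased outcome distribution before training.
records = [("A", 1)] * 80 + [("A", 0)] * 20 + [("B", 1)] * 30 + [("B", 0)] * 70

def positive_rates(rows):
    """Per-group rate of positive labels, a first-pass audit for source bias."""
    totals, positives = defaultdict(int), defaultdict(int)
    for group, label in rows:
        totals[group] += 1
        positives[group] += label
    return {g: positives[g] / totals[g] for g in totals}

print(positive_rates(records))  # → {'A': 0.8, 'B': 0.3}
```

A model fit to this data will faithfully reproduce the 80% vs 30% skew, which is exactly the "flawed data, flawed model" failure Chris describes; the audit won't fix it, but it tells you the flaw is there before you ship.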
- Yeah, it's like, one of the things that we were talking about before, I mean, artificial intelligence is not going to just create itself, it's lines of code, it's input, and it spits out output. So, if it learns from these learning sets, we don't want AI to become another buzzword. We don't want everybody to be an "AI guru" that has no idea what AI is. It takes months, and months, and months for these machines to learn. These learning sets are so very important, because that input is how this machine, think of it as your child, and that's basically the way artificial intelligence is learning, like your child. You're feeding it these learning sets, and then eventually it will make its own decisions. So, we know from some of us having children that you teach them the best that you can, but then later on, when they're doing their own thing, they're really, it's like a little myna bird, they've heard everything that you've said. (laughing) Not only the things that you said to them directly, but the things that you said indirectly. - Well, there are some very good AI researchers that might disagree with that metaphor, exactly. (laughing) But, having said that, what I think is very interesting about this conversation is that this notion of bias, one of the things that fascinates me about where AI goes, are we going to find a situation where tribalism more deeply infects business? Because we know that human beings do not seek out the best information, they seek out information that reinforces their beliefs. And that happens in business today. My line of business versus your line of business, engineering versus sales, that happens today, but it happens at a planning level, and when we start talking about AI, we have to put the appropriate dampers, understand the biases, so that we don't end up with deep tribalism inside of business. Because AI could have the deleterious effect that it actually starts ripping apart organizations.
- Well, input is data, and then the output is, could be a lot of things. - Could be a lot of things. - And that's where I said data equals human lives. So, we look at the case in New York where the penal system was using this artificial intelligence to make choices on people that were released from prison, and they saw that that was a miserable failure, because people that were released actually re-offended, some committed murder and other things. So, I mean, it's, it's more than what anybody really thinks. It's not just, oh, well, we'll just train the machines, and a couple of weeks later they're good, we never have to touch them again. These things have to be continuously tweaked. So, just because you built an algorithm or a model doesn't mean you're done. You got to go back later, and continue to tweak these models. - Mark, you got the mic. - Yeah, no, I think one thing, we've talked a lot about the data that's collected, but what about the data that's not collected? Incomplete profiles, incomplete datasets, that's a form of bias, and sometimes that's the worst. Because they'll fill that in, right, and then you can get some bias, but there's also a real issue for that around cyber security. Logs are not always complete, things are not always done, and when that happens, people make assumptions based on what they've collected, not what they didn't collect. So, when they're looking at this, and they're using the AI on it, that's only on the data collected, not on the data that wasn't collected. So, if something is down for a little while, and no data's collected off that, the assumption is, well, it was down, or it was impacted, or there was a breach, or whatever, it could be any of those.
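Mark's point about uncollected data can be made concrete: even knowing that a hole exists in your logs requires explicitly looking for it, because the AI only ever sees what was collected. A minimal sketch with made-up timestamps:

```python
from datetime import datetime, timedelta

# Made-up log timestamps with a hole in the middle. The gap could be downtime,
# a breach, or a failed collector; the data alone cannot say which, which is
# exactly why a human still has to interpret it.
stamps = [datetime(2018, 9, 13, 10, m) for m in (0, 1, 2, 3, 30, 31, 32)]

def find_gaps(times, max_gap=timedelta(minutes=5)):
    """Return (start, end) pairs where consecutive entries are too far apart."""
    return [(a, b) for a, b in zip(times, times[1:]) if b - a > max_gap]

for start, end in find_gaps(stamps):
    print(f"no data from {start:%H:%M} to {end:%H:%M}")
```

Anything a model infers about that 27-minute window is an assumption layered on absence, not evidence, so the gap itself should be surfaced alongside the analysis.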
So, you got to, there's still this human need, there's still the need for humans to look at the data and realize that there is the bias in there, there is, we're just looking at what data was collected, and you're going to have to make your own thoughts around that, and assumptions on how to actually use that data before you go make those decisions that can impact lots of people, at a human level, enterprise's profitability, things like that. And too often, people think of AI, when it comes out of there, that's the word. Well, it's not the word. - Can I ask a question about this? - Please. - Does that mean that we shouldn't act? - It does not. - Okay. - So, where's the fine line? - Yeah, I think. - Going back to this notion of can we do it, or should we do it? Should we act? - Yeah, I think you should do it, but you should use it for what it is. It's augmenting, it's helping you, assisting you to make a valued or good decision. And hopefully it's a better decision than you would've made without it. - I think it's great, I think also, your answer's right too, that you have to iterate faster, and faster, and faster, and discover sources of information, or sources of data that you're not currently using, and, that's why this thing starts getting really important. - I think you touch on a really good point about, should you or shouldn't you? You look at Google, and you look at the data that they've been using, and some of that out there, from a digital twin perspective, is not being approved, or not authorized, and even once they've made changes, it's still floating around out there. Where do you know where it is? So, there's this dilemma of, how do you have a digital twin that you want to have, and is going to work for you, and is going to do things for you to make your life easier, to do these things, mundane tasks, whatever? But how do you also control it to do things you don't want it to do? - Ad-based business models are inherently evil. 
(laughing) - Well, there's incentives to appropriate our data, and so, are things like blockchain potentially going to give users the ability to control their data? We'll see. - No, I, I'm sorry, but that's actually a really important point. The idea of consensus algorithms, whether it's blockchain or not, blockchain includes game theory, and something along those lines, whether it's Byzantine fault tolerance, or whether it's Paxos, consensus-based algorithms are going to be really, really important parts of this conversation, because the data's going to be more distributed, and you're going to have more elements participating in it. And so, something that allows, especially in the machine-to-machine world, which is a lot of what we're talking about right here, you may not have blockchain, because there's no need for a sense of incentive, which is what blockchain can help provide. - And there's no middleman. - And, well, all right, but there's really, the thing that makes blockchain so powerful is it liberates new classes of applications. But for a lot of the stuff that we're talking about, you can use a very powerful consensus algorithm without having a game side, and do some really amazing things at scale. - So, looking at blockchain, that's a great thing to bring up, right. I think what's inherently wrong with the way we do things today, and the whole overall design of technology, whether it be on-prem, or off-prem, is both the lock and key is behind the same wall. Whether that wall is in a cloud, or behind a firewall. So, really, when there is an audit, or when there is a forensics, it always comes down to a sysadmin, or something else, and the system administrator will have the finger pointed at them, because it all resides, you can edit it, you can augment it, or you can do things with it that you can't really determine. Now, take, as an example, blockchain, where you've got really the source of truth.
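Peter's distinction between blockchain and plain consensus can be reduced to its simplest kernel: a value is accepted only once a majority quorum of participants reports it. This toy sketch is nothing like real Paxos or Byzantine fault tolerance (no rounds, no leaders, no fault handling), just the quorum idea he's pointing at:

```python
from collections import Counter

def reach_consensus(votes, quorum=None):
    """Accept a value only if a majority quorum of nodes reports it, else None."""
    quorum = quorum if quorum is not None else len(votes) // 2 + 1
    value, count = Counter(votes).most_common(1)[0]
    return value if count >= quorum else None

# Five nodes; one faulty node disagrees, but the majority still commits "x".
print(reach_consensus(["x", "x", "x", "x", "y"]))  # → x
# Split vote: no quorum is reached, so no value is chosen and the system retries.
print(reach_consensus(["x", "x", "y", "y", "z"]))  # → None
```

The "game side" Peter mentions is everything this sketch omits: the incentive layer (mining rewards, stake) that makes strangers participate honestly. Machine-to-machine systems inside one organization often don't need it, which is why plain consensus can suffice there.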
Now you can take and have the lock in one place, and the key in another place. So that's certainly going to be interesting to see how that unfolds. - So, one of the things, it's good that, we've hit a lot of buzzwords, right now, right? (laughing) AI, and ML, block. - Bingo. - We got the blockchain bingo, yeah, yeah. So, one of the things is, you also brought up, I mean, ethics and everything, and one of the things that I've noticed over the last year or so is that, as I attend briefings or demos, everyone is now claiming that their product is AI or ML-enabled, or blockchain-enabled. And when you try to get answers to the questions, what you really find out is that some things are being pushed as, because they have if-then statements somewhere in their code, and therefore that's artificial intelligence or machine learning. - [Peter] At least it's not "go-to." (laughing) - Yeah, you're that experienced as well. (laughing) So, I mean, this is part of the thing you try to do as a practitioner, as an analyst, as an influencer, is trying to, you know, cut through the hype of it all. And recently, I attended one where they said they use blockchain, and I couldn't figure it out, and it turns out they use GUIDs to identify things, and that's not blockchain, it's an identifier. (laughing) So, one of the ethics things that I think we, as an enterprise community, have to deal with, is the over-promising of AI, and ML, and deep learning, and recognition. It's not, I don't really consider it visual recognition services if they just look for red pixels. I mean, that's not quite the same thing. Yet, this is also making things much harder for your average CIO, or worse, CFO, to understand whether they're getting any value from these technologies. - Old bottle. - Old bottle, right.
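Karen's GUID story is worth pinning down: a GUID is just a unique label, while the thing that makes a blockchain-style ledger interesting is a hash chain that makes history tamper-evident. A minimal sketch (a bare hash chain only, with no consensus or distribution, so it is not a full blockchain either):

```python
import hashlib
import uuid

def chain(entries):
    """Link every entry to the hash of all history before it (tamper-evident)."""
    prev, out = "genesis", []
    for entry in entries:
        digest = hashlib.sha256((prev + entry).encode()).hexdigest()
        out.append((entry, digest))
        prev = digest
    return out

log = chain(["claim filed", "claim approved", "payment sent"])
tampered = chain(["claim filed", "claim DENIED", "payment sent"])
# Editing one record changes every later hash; a GUID carries no such guarantee.
print(log[2][1] != tampered[2][1])  # → True
print(str(uuid.uuid4())[:8], "<- just an identifier, says nothing about history")
```

That is the practical test for "blockchain-enabled" claims: if altering a past record leaves later records unchanged, the product has identifiers, not a ledger.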
- And I wonder if the data companies, like that you talked about, or the top five, I'm more concerned about their nearly, or actual $1 trillion valuations having an impact on the ability of other companies to disrupt or enter into the field, more so than their data technologies. Again, we're coming to another perfect storm of the companies that have data as their asset, even though it's still not on their financial statements, which is another indicator whether it's really an asset, is that, do we need to think about the terms of AI, about whose hands it's in, and who's, like, once one large trillion-dollar company decides that you are not a profitable company, how many other companies are going to buy that data and make that decision about you? - Well, and for the first time in business history, I think, this is true, we're seeing, because of digital, because it's data, you're seeing tech companies traverse industries, get into, whether it's content, or music, or publishing, or groceries, and that's powerful, and that's awful scary. - If you're a manager, one of the things your ownership is asking you to do is to reduce asset specificities, so that their capital could be applied to more productive uses. Data reduces asset specificities. It brings into question the whole notion of vertical industry. You're absolutely right. But you know, one quick question I got for you, playing off of this is, again, it goes back to this notion of can we do it, and should we do it? I find it interesting, if you look at those top five, all data companies, but all of them are very different business models, or they can be classified into two different business models. Apple is transactional, Microsoft is transactional, Google is ad-based, Facebook is ad-based, before the fake news stuff. Amazon's kind of playing it both sides. - Yeah, they're kind of all on a collision course though, aren't they? - But, well, that's what's going to be interesting.
I think, at some point in time, the "can we do it, should we do it" question is, brands are going to be identified by whether or not they have gone through that process of thinking about, should we do it, and say no. Apple is clearly, for example, incorporating that into their brand. - Well, Silicon Valley, broadly defined, if I include Seattle, and maybe Armonk, not so much IBM. But they've got a dual disruption agenda, they've always disrupted horizontal tech. Now they're disrupting vertical industries. - I was actually just going to pick up on what she was talking about, we were talking about buzzwords, right. So, one we haven't heard yet is voice. Voice is another big buzzword right now, when you couple that with IoT and AI, here you go, bingo, do I got three points? (laughing) Voice recognition, voice technology, so all of the smart speakers, if you think about that in the world, there are 7,000 languages being spoken, but yet if you look at Google Home, you look at Siri, you look at any of the devices, I would challenge you, it would have a lot of problems understanding my accent, and even when my British accent creeps out, or it would have trouble understanding seniors, because the way they talk, it's very different than a typical 25-year-old person living in Silicon Valley, right. So, how do we solve that, especially going forward? We're seeing voice technology is going to be so much more prominent in our homes, we're going to have it in the cars, we have it in the kitchen, it does everything, it listens to everything that we are talking about, not talking about, and records it. And to your point, is it going to start making decisions on our behalf, but then my question is, how much does it actually understand us? - So, I just want one short story. Siri can't translate a word that I ask it to translate into French, because my phone's set to Canadian English, and that's not supported. So I live in a bilingual French English country, and it can't translate.
- But what this is really bringing up is if you look at society, and culture, what's legal, what's ethical, changes across the years. What was right 200 years ago is not right now, and what was right 50 years ago is not right now. - It changes across countries. - It changes across countries, it changes across regions. So, what does this mean when our AI has agency? How do we make ethical AI if we don't even know how to manage the change of what's right and what's wrong in human society? - One of the most important questions we have to worry about, right? - Absolutely. - But it also says one more thing, just before we go on. It also says that the issue of economies of scale, in the cloud. - Yes. - Are going to be strongly impacted, not just by how big you can build your data centers, but some of those regulatory issues that are going to influence strongly what constitutes good experience, good law, good acting on my behalf, agency. - And one thing that's underappreciated in the marketplace right now is the impact of data sovereignty, if you get back to data, countries are now recognizing the importance of managing that data, and they're implementing data sovereignty rules. Everyone talks about California issuing a new law that's aligned with GDPR, and you know what that meant. There are 30 other states in the United States alone that are modifying their laws to address this issue. - Steve. - So, um, so, we got a number of years, no matter what Ray Kurzweil says, until we get to artificial general intelligence. - The singularity's not so near? (laughing) - You know that he's changed the date over the last 10 years. - I did know it. - Quite a bit. And I don't even prognosticate where it's going to be. But really, where we're at right now, I keep coming back to, is that's why augmented intelligence is really going to be the new rage, humans working with machines. One of the hot topics, and the reason I chose to speak about it, is the future of work.
I don't care if you're a millennial, mid-career, or a baby boomer, people are paranoid. As machines get smarter, if your job is routine cognitive, yes, you have a higher propensity to be automated. So, this really shifts a number of things. A, you have to be a lifelong learner, you've got to learn new skillsets. And the dynamics are changing fast. Now, this is also a great equalizer for emerging startups, and even in SMBs. As the AI improves, they can become more nimble. So back to your point regarding colossal trillion dollar, wait a second, there's going to be quite a sea change going on right now, and regarding demographics, in 2020, millennials take over as the majority of the workforce, by 2025 it's 75%. - Great news. (laughing) - As a baby boomer, I try my damnedest to stay relevant. - Yeah, surround yourself with millennials is the takeaway there. - Or retire. (laughs) - Not yet. - One thing I think, this goes back to what Karen was saying, if you want a basic standard to put around the stuff, look at the old ISO 38500 framework. Business strategy, technology strategy. You have risk, compliance, change management, operations, and most importantly, the balance sheet in the financials. AI and what Tony was saying, digital transformation, if it's of meaning, it belongs on a balance sheet, and should factor into how you value your company. All the cyber security, and all of the compliance, and all of the regulation, is all stuff, this framework exists, so look it up, and every time you start some kind of new machine learning project, or data science project, say, have we checked the box on each of these standards that's within this machine? And if you haven't, maybe slow down and do your homework. - We're going to see a day when data is valued on the balance sheet. - It is. - It's already valued as part of the current, but it's goodwill. - Certainly market value, as we were just talking about.
- Well, we're talking about all of the companies that have opted in, right. There's tens of thousands of small businesses just in this region alone that are opt-out. They're small family businesses, or businesses that really aren't even technology-aware. But data's being collected about them, it's on Yelp, they're being rated, they're being reviewed, the success of their business is out of their hands. And I think what's really going to be interesting is, you look at the big data, you look at AI, you look at things like that, blockchain may even be a potential for some of that, because of immutability, but it's when all of those businesses, when the technology becomes a cost, it's cost-prohibitive now, for a lot of them, or they just don't want to do it, and they're proudly opt-out. In fact, we talked about that last night at dinner. But when they opt-in, the company that can do that, and can reach out to them in a way that is economically feasible, and bring them back in, where they control their data, where they control their information, and they do it in such a way where it helps them build their business, and it may be a generational business that's been passed on. Those kind of things are going to make a big impact, not only on the cloud, but the data being stored in the cloud, the AI, the applications that you talked about earlier, we talked about that. And that's where this bias, and some of these other things are going to have a tremendous impact if they're not dealt with now, at least ethically. - Well, I feel like we just got started, we're out of time. Time for a couple more comments, and then officially we have to wrap up. - Yeah, I had one thing to say, I mean, really, Henry Ford, and the creation of the automobile, back in the early 1900s, changed everything, because now we're no longer stuck in the country, we can get away from our parents, we can date without grandma and grandpa sitting on the porch with us.
(laughing) We can take long trips, so now we're looked at, we've sprawled out, we're not all living in the country anymore, and it changed America. So, AI has those same capabilities, it will automate mundane routine tasks that nobody wanted to do anyway. So, a lot of that will change things, but it's not going to be any different than the way things changed in the early 1900s. - It's like you were saying, constant reinvention. - I think that's a great point, let me make one observation on that. Every period of significant industrial change was preceded by the formation, a period of formation of new assets that nobody knew what to do with. Whether it was, what do we do, you know, industrial manufacturing, it was row houses with long shafts tied to an engine that was coal-fired, and drove a bunch of looms. Same thing, railroads, large factories for Henry Ford, before he figured out how to do an information-based notion of mass production. This is the period of asset formation for the next generation of social structures. - Those chip-makers are going to be all over these cars, I mean, you're going to have augmented reality right there, on your windshield. - Karen, bring it home. Give us the drop-the-mic moment. (laughing) - No pressure. - Your AV guys are not happy with that. So, I think the, it all comes down to, it's a people problem, a challenge, let's say that. The whole AI ML thing, people, it's a legal compliance thing. Enterprises are going to struggle with trying to meet five billion different types of compliance rules around data and its uses, about enforcement, because ROI is going to mean risk of incarceration as well as return on investment, and we'll have to manage both of those. I think businesses are struggling with a lot of this complexity, and you just opened a whole bunch of questions that we didn't really have solid, "Oh, you can fix it by doing this."
So, it's important that we think of this new world of data focus, data-driven, everything like that, is that the entire IT and business community needs to realize that focusing on data means we have to change how we do things and how we think about it, but we also have some of the same old challenges there. - Well, I have a feeling we're going to be talking about this for quite some time. What a great way to wrap up CUBE NYC here, our third day of activities down here at 37 Pillars, or Mercantile 37. Thank you all so much for joining us today. - Thank you. - Really, wonderful insights, really appreciate it, now, all this content is going to be available on theCUBE.net. We are exposing our video cloud, and our video search engine, so you'll be able to search our entire corpus of data. I can't wait to start searching and clipping up this session. Again, thank you so much, and thank you for watching. We'll see you next time.

Published Date : Sep 13 2018


Hemanth Manda, IBM & James Wade, Guidewell | Change the Game: Winning With AI 2018


 

>> Live from Times Square in New York City, it's theCUBE, covering IBM's Change the Game, Winning with AI. (theCUBE theme music) Brought to you by IBM. >> Hello everybody, welcome back to theCUBE's special presentation. We're covering IBM's announcement. Changing the Game, Winning with AI is the theme of IBM. And IBM has these customer meet-ups, analyst meet-ups, partner meet-ups and they do this in conjunction with Strata every year. And theCUBE has been there covering 'em. I'm Dave Vellante, with us is James Wade, who's the Director of Application Hosting at Guidewell, and Hemanth Manda, who's the Director of Platform Offerings at IBM. Gentlemen, welcome to theCUBE, thanks for coming on. >> Thank you. >> Hemanth, let's start with you. Platform offerings. A lot of platforms inside of IBM. What do you mean platform offerings? Which one are you responsible for? >> Yeah, so IBM's data and analytics portfolio is pretty wide. It's close to a six billion dollar business. And we have a hundred plus products. What we are trying to do, is we're trying to basically build a platform through IBM Cloud Private for Data. Bring capabilities that cut across our portfolio and build upon it. We also make it open. Support multiple clouds and support other partners who want to run on the platform. So that's what I'm leading. >> Okay, great and we'll come back and talk about that. But James, tell us more about Guidewell. Where are you guys based? What'd you do and what's your role? >> Guidewell is the largest insurer in the state of Florida. We have about six and a half million members. We also do about 38, 39% of the government processing for Medicare, Medicaid claims. Very large payer. We've also recently moved over into the provider space. We actually have clinics throughout the state of Florida where our members can go in and actually get services there. So we're actually morphing as a company, away from just an insurance company, really to a healthcare company.
Very exciting time to be there. We've doubled in size in the last six years, from an eight billion dollar company to an 18 billion dollar company. >> So both health insurer and provider, bringing those two worlds together. And the thinking there is just more efficient, you'd be able to drive efficiencies obviously out of your business, right? >> Yup, yes. I mean, the ultimate goal for us is just to have better health outcomes for our members. And the way you deliver that is, one, you do the insurance right, you do it well. You make sure that they're processed and handled properly, that they're getting all the services that they need. But two, from a provider space, how do you take the information that you have about your members and use them in a provider space to make sure they're getting the right prescriptions at the right time, for the right situations that they're having, whatever's going on in their life. >> And keeping cost down. I mean, there's a lot of finger pointing in the industry. If you bring those two areas together, you know, now they got a single throat to choke, >> That's right, we get that too. (laughing) >> Buck stops with you. Okay, and you're responsible for the entire application portfolio across the insurance and the clinical side? >> Yes, I have, you know, be it both sides, we have Guidewell as the holding company, we have multiple companies underneath it. So all of those companies roll up into a single kind of IT infrastructure. And I manage that for them, for the entire company. >> Okay. Talk about the big drivers in your business. Obviously on the insurance side, the claims system is the lifeblood, the agency system to deal with, the channel. And now of course, you've got the clinical thing to worry about, but so, talk about sort of the drivers of your business and what's changing.
>> Right, I mean, the biggest change we've had, obviously in last few years, has been the Affordable Care Act. It changed the way that, you know, from a group policy where if you're a big corporation and you work for a big corporation, that company actually buys insurance for you and provides it to their employees. Well now the individual market has grown significantly. We're still a group policy insurance company, don't get me wrong, we have a great portfolio of companies that we work with, but we also now sell directly to individuals. So they're in the consumer space directly. And that's just a different way of interacting with folks. You have to have sales sites. You have to have websites that are up, where folks can come and browse your products. You have to interface with government websites. Like CMS has their site where they set up and you're able to buy products through that. So it's really changed our marketing and sales channels completely. And on the back side, the volume of growth, I mean, with the new individual insurance market we've grown in size significantly in our number of members. And that's really stressed our IT systems, it's stressed our database environment. And it's really stressed our ability to kind of analyze the thing that we're doing. And make sure that we're processing claims efficiently and making sure that the members are getting what they expect from us. So, the velocity and change in size has really stressed us. >> Yeah, so you got the Affordable Care Act and some uncertainties around that, the regulations around that. You've got things like EMR and meaningful use that you got to worry about. So a lot of complexity in the application portfolio. And Hemanth, I imagine this is not a unique discussion that you have with some of your insurance clients and healthcare folks, although, you guys are a little different in that you're bringing those two worlds together. But your thoughts on what you're seeing the marketplace. 
>> Yeah, so I mean, this is not unique because the data is exploding and there are multiple data sources spread across multiple clouds. So in terms of trying to get a sense of where the data is, how to actually start leveraging it, how to govern it, how to analyze it, is a problem that is across all industry verticals. And especially as we are going through digital transformation, right, trying to leverage and monetize your data becomes even more important. >> Yeah, so, well let's talk a little bit about the data. So your data, like a lot of companies, you must have a lot of data silos. And we have said on theCUBE a lot, that the innovation engine in the future is data. Applying machine intelligence to that data. Using cloud models, whether that cloud is in a private cloud or a public cloud or now even at the edge. But having a cloud-like experience for scale and agility is critical. So, that seems to be the innovation, whereas, for the last 20, 30 years, the innovation has been, you know, kind of Moore's Law and being able to get the latest and greatest systems, so I can get data out of my data warehouse faster. So, with the change in the innovation engine driven by data, what are you seeing, James? >> I mean, absolutely. Again, we go back to the mission of the company. It's to provide better health outcomes for our members, right. And IT, and using the data that we collect more effectively and efficiently, allows us to do that. I mean, if you take, you know, across the board, you may have four or five doctors that you're working with and they've prescribed multiple things to you, but they're not talking. They have no idea what your other doctor is doing with you, unless you tell 'em, and a lot of people forget. So just as an example, we would know as the payer, what you've been prescribed, what you've been using for multiple years.
If we see something, using AI, machine learning, that you've just been prescribed is going to have a detrimental impact on something else that you're doing, we can alert you. We can send you SMS messages, we can send you emails, we could alert your doctors. Just to say, hey, this could be a problem, it could cause a prescription collision and you could end up in the hospital or worse. And that's just one example of the things that we look at every day to try to better the outcome for our members. But, you know, that's just the first layer. What else can you do with that? Are there predictive medicines? Are there things we could alert your doctors to, that we're seeing from other places, or populations, that kind of match, you know, your current, you know, kind of what you look like, what you do, what you think, what you're using. All the information we have about you, can we predict health outcomes down the road and let your doctors know? So, exciting time to be in this industry. >> Let's talk about the application architecture to support that outcome, because, you know, you're not starting from a greenfield. You probably got some COBOL running, and it works, you can't mess with that stuff. And traditionally you built, especially in a regulated industry, you're building applications that are hardened. And as I said, you have these data silos that really, you know, it's like, it works, don't touch it. How much of a challenge is it for you to enter this sort of new era? And how are you getting there? I'd like to understand IBM's role as well. >> Well, it's very challenging, number one. You have your, I don't want to call it legacy 'cause that makes it sound bad, but you do have kind of your legacy environments where we're collecting the information. It's kind of like the silos that have gathered the information, the sales information, the claims information, that type of stuff.
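The prescription-collision alerting James describes above reduces to checking a member's combined medication list, drawn from claims across all of their doctors, against known risky interactions. A minimal sketch of that pattern (the drug names and interaction table below are invented placeholders, not clinical data; a real payer system would use a clinical knowledge base plus ML risk models rather than a hard-coded dictionary):

```python
# Hypothetical sketch of a payer-side "prescription collision" check.
# INTERACTIONS and the drug names are invented for illustration only.

# Known risky pairs (hypothetical examples, not medical advice).
INTERACTIONS = {
    frozenset({"drug_a", "drug_b"}): "may cause severe hypotension",
    frozenset({"drug_c", "drug_d"}): "raises bleeding risk",
}

def collision_alerts(member_prescriptions):
    """Return alert messages for any risky pair in a member's active scripts."""
    alerts = []
    scripts = list(member_prescriptions)
    for i in range(len(scripts)):
        for j in range(i + 1, len(scripts)):
            pair = frozenset({scripts[i], scripts[j]})
            if pair in INTERACTIONS:
                alerts.append(f"{scripts[i]} + {scripts[j]}: {INTERACTIONS[pair]}")
    return alerts

# A member seeing two doctors who don't talk to each other:
print(collision_alerts(["drug_a", "drug_x", "drug_b"]))
```

In practice the alert step would then fan out to the SMS, email, and provider notifications James mentions.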
But those may not be the best systems currently, to actually do the processing and the data analysis and having the machine learning run against it. So we have, you know, really complex ETL, you know, moving data from our kind of legacy environments into these newer open source models that you guys support with, you know, IBM Cloud Private for Data. But basically, moving into these open source areas where we can kind of focus our tools on it and learn from that data. So that, you know, having your legacy environment and moving it to the new environment where you can do this processing, has been a challenge. I mean, the velocity of change in the new environment, the types of databases that are out there, Hadoop, and then the products that you guys have that run through the information, that's one of the bigger challenges that we have. Our company is very supportive of IT, they give us plenty of budget, they give us plenty of resources. But even with all of the support that we get, the velocity of change in the new environment, in the AI space and the machine learning, is very difficult to keep up with. >> Yeah, and you can't just stop doing what you're doing in the existing environment, you still got to make changes to it. You got regulatory, you got HIPAA stuff that you've got to deal with. So you can't just freeze your code there. So, are things like containers and, you know, cloud native techniques coming into play? >> Absolutely, absolutely. Our CIO kind of drew a line in the sand about two years ago: everything that we develop now is in our cloud-first strategy. That doesn't necessarily mean it's going to go into the external cloud. We have an internal cloud. And we have a very large Power environment at Guidewell. Our mainframe is still sort of a cloud-like infrastructure. So, we develop it to be cloud native, cloud-first.
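The "really complex ETL" James mentions, moving data out of legacy claims systems into newer analytics environments, is at its core an extract-transform-load job. A toy sketch of that shape (the record layout and field names are invented for illustration; real pipelines would read from the source systems and write to the analytics store rather than in-memory lists):

```python
# Minimal ETL sketch: pull claim records from a "legacy" source,
# normalize them, and load them into an "analytics" store.
# Rows, field names, and stores are invented stand-ins.

legacy_claims = [
    {"CLAIM_ID": "C001", "AMT": "1250.00", "STATUS": "P "},
    {"CLAIM_ID": "C002", "AMT": "89.50",   "STATUS": "D "},
]

def transform(row):
    """Normalize one legacy record into the analytics schema."""
    return {
        "claim_id": row["CLAIM_ID"],
        "amount": float(row["AMT"]),              # string -> numeric
        "status": row["STATUS"].strip().lower(),  # trim fixed-width padding
    }

# Extract + transform + load in one pass.
analytics_store = [transform(r) for r in legacy_claims]
print(analytics_store[0])
```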
And then, you know, it more than likely stays in our four walls, but there's also the option that we can move it out. Move it to various clouds that are out there. Be it IBM Cloud, Amazon, Microsoft, Google, any of those clouds. So we're developing with a cloud-first strategy, all of the new things. Now, like you said, the legacy side we have to maintain. I mean, still the majority of our business is processing claims for our members, right, and that's still in that kind of legacy environment. Runs on a mainframe in the Power environment today. So we have to keep it up and running as well. >> How large of an organization are you, head count wise? >> We have about 2,100 IT people at Guidewell. Probably a 17,000 person organization. So there is a significant percentage of the population of our employees that are IT directly. >> Right, 'cause it is an IT-heavy business, always has been. I was at a conference recently and they threw out a stat that the average organization has eight clouds. And I said, "we're like a 60 person company and we have eight clouds." I mean, you must have 8,000 clouds. (laughing) Imagine when you throw in the SaaS and so forth. But, you mentioned a number of other clouds. You mentioned IBM Cloud and some others. So, it's a multi-cloud world. >> Yes, yes. >> Okay, so I'm interested in how IBM is approaching that, right. You're not just saying, okay, IBM Cloud or nothing, I think, you know. And cloud is defined on-prem, off-prem, maybe now at the edge, your thoughts. >> Yeah, so, absolutely, I think that is our strategy. We would like to support all the clouds out there, we don't want to discriminate one versus the other. We do have our own public cloud, but our strategy is to support our products and platforms on any cloud. For example, IBM Cloud Private for Data can run in the data center, it can provide the benefits of the cloud within your firewall.
But if you want to deploy it on any other public cloud infrastructures, such as Amazon or Red Hat OpenStack, we do support it. We are also looking to expand that support to Microsoft and Google in the future. So we are going forward with the multi-cloud strategy. Also, if you look at IBM's strength, right, we have a significant on-premise business, right, that's our strength. So we want to basically start with enterprise-out. So by focusing on private cloud, and making sure that customers can actually move their offerings and products to private cloud, we are essentially providing a path for our customers and clients to move to cloud, embrace cloud. So that's been our approach. >> So James, I'm interested in how you guys look at cloud-first. When you say cloud-first, first of all, I'm hearing, it's not about where it goes, it's about the experience. So we're going to bring the cloud model to the data, wherever the data lives. If it's in the public cloud, of course it's cloud. If we bring it on-prem, we want a cloud-like experience. How do you guys look at that cloud-like experience? Is it utility pricing, is it defined in sort of agility terms? Maybe you could elaborate. >> Actually, we're trying to go with the agility piece first, right. The hardest thing right now is to keep up with the pace that customers demand. I mean, you know, my boss Paul Stallings always talks about, you know, consumer-grade is now the industrial strength. Now you go home at night, your network at home is very fast to your PC. Your phone, you just hit an app, you always expect it to work. Well, we have to be able to provide that same level of support and reliability in the applications we're deploying inside of our infrastructure. So, to do that, you have to be fast, you have to be agile. And our cloud-first being, how do you get things to market faster, right. So you can build servers faster, build out your networks faster, and build your databases faster.
Already have, like, defined sizes, click a button and it's there. On-demand infrastructure, much like they do in the public cloud, we want to have that internally. But second, and our finance department would tell you this, is that, you know, most important is the utility piece. So once you can define these individual modules that you can hit a button and immediately spin up and instantiate, you should be able to figure out what that costs the company. How do you define what a server costs? Total cost of ownership through the lifetime that server is in service for the company. Because if we can lower that cost, if we can do these things very well, automate 'em, get the data where it needs to be, spin up quickly, we can reduce our administrative cost and then pass those savings right back to our members. You know, if we can find a way to save your grandmother $20 a month off her health insurance, that can make a lot of difference in a person's life, right. Just by cutting our cost on the IT side, we can deliver savings back to the company. And that's very key to us. >> And in terms of sort of what goes where, I guess it's a function of the physics, right, if there's latencies involved, the economics, which you mentioned are critical obviously in your business. And I guess the laws, you know, the edicts of the government-- >> Yes, and the various contracts that you sign with companies. I mean, there's some companies that we deal with in the state of Florida that want their data to stay in the state of Florida. Well, if you move it out to a cloud provider, you don't know which data center it's in. So there's the laws and regulations based on your contracts. But you're exactly right. It's what have you signed up for, what've you agreed to, what are your members comfortable with as to where the data can actually go? >> How does IBM help Guidewell and other companies sort of manage through that complexity? >> Yeah, absolutely.
So I think, in addition to what James mentioned, right, it's also about agility. Because for example, if you look at insurance applications, there's a specific time period where you probably would expect 10x the load, right. So you should be able to easily scale up and down. And also, as you're changing your business model, if you have new laws, or if you want to go after new businesses, you should be able to easily embrace that, right. So cloud provides that sort of flexibility and elasticity and also the agility. So that's one. The other thing that you mentioned is around regulation, especially in healthcare and also with the financial services industry. So what we're trying to do is, on our platform, we would like to actually have industry-specific accelerators. We've been working with Fortune 500 companies for the last 30, 40 years. So we've gained a depth of knowledge that we currently have within our company. So we want to basically start exposing the accelerators. And this is on our roadmap and will be available fairly quickly. So that's one approach we're taking. The other approach we're taking is, we're also working with our business partners and technology partners, because we do believe, in today's world, you cannot go after an opportunity all by yourself. You need to build an ecosystem, and that's what we're doing. We're trying to work with, basically, specialty vendors who might be focused on that particular vertical, who can bring the depth of knowledge that we might not have. And work with them and team up, so that they can build their solutions on top of the platform. So that's another approach that we're taking. >> So I've got to ask you, I always ask this question of customers. Why IBM? >> I mean, you guys have been a part of our business for so long. You have very detailed sales guys that are really embedded with our IT folks. You understand our systems. You understand what we do, when we do it, why we do it. You understand our business cycle.
IBM really invests in their customers and understanding what they're doing, what needs to be done. And quite honestly, you guys bring some ideas to the table we haven't even thought of. You have such a breadth of understanding, and you're dealing with so many other companies, you'll see things out there that could be a nugget that we could use. And IBM's never shied away from bringing that to us. Just a history and a legacy of really bringing innovative solutions to us to really help our business. And very few companies out there really get to know a company's business as well as IBM does. >> Hemanth, I'll give you the last word. We've got Change the Game: Winning with AI tonight. You go to IBM.com/winwithAI and register there. I just did, I'm part of the analyst program. So, Hemanth, last word for you. >> Yeah, so, I think the world is changing really fast, and unless enterprises embrace cloud, embrace artificial intelligence, and use their data to monetize new business models, it's very hard to compete. Like, digital transformation is impacting every industry vertical, including IBM. So, I think going after this opportunistically is critical. And IBM Cloud Private for Data, the platform, provides this. And please join us today, it's going to be a great event. And I look forward to meeting you guys, thank you. >> Awesome, and definitely agree. It's all about digital meets data, applying machine intelligence, machine learning, AI, to that data. Being able to run it in a cloud-like model so you can scale, you can be fast. That's the innovation sandwich for the future. It's not just about the speed of the processor, or the size of the disk drive, or the flash or whatever it is. It's really about that combination. theCUBE bringing you all the intelligence we can find. You're watching CUBE NYC. We'll be right back right after this short break. (theCUBE theme music)

Published Date : Sep 13 2018


Rob Thomas, IBM | Change the Game: Winning With AI


 

>> Live from Times Square in New York City, it's The Cube covering IBM's Change the Game: Winning with AI, brought to you by IBM. >> Hello everybody, welcome to The Cube's special presentation. We're covering IBM's announcements today around AI. IBM, as The Cube does, runs a number of sessions and programs in conjunction with Strata, which is down at the Javits, and we're here with Rob Thomas, who's the General Manager of IBM Analytics. Long time Cube alum, Rob, great to see you. >> Dave, great to see you. >> So you guys got a lot going on today. We're here at the Westin Hotel, you've got an analyst event, you've got a partner meeting, you've got an event tonight, Change the Game: Winning with AI, at Terminal 5, check that out, ibm.com/WinWithAI, go register there. But Rob, let's start with what you guys have going on, give us the run down. >> Yeah, it's a big week for us, and like many others, it's great when you have Strata, a lot of people in town. So, we've structured a week where, today, we're going to spend a lot of time with analysts and our business partners, talking about where we're going with data and AI. This evening, we've got a broadcast, it's called Winning with AI. What's unique about that broadcast is it's all clients. We've got clients on stage doing demonstrations, how they're using IBM technology to get to unique outcomes in their business. So I think it's going to be a pretty unique event, which should be a lot of fun. >> So this place, it looks like a cool event, a venue, Terminal 5, it's just up the street on the west side highway, probably a mile from the Javits Center, so definitely check that out. Alright, let's talk about, Rob, we've known each other for a long time, we've seen the early Hadoop days, you guys were very careful about diving in, you kind of let things settle and watched very carefully, and then came in at the right time.
But we saw the evolution of so-called Big Data go from a phase of really reducing investments, cheaper data warehousing, and what that did is allowed people to collect a lot more data, and kind of get ready for this era that we're in now. But maybe you can give us your perspective on the phases, the waves that we've seen of data, and where we are today and where we're going. >> I kind of think of it as a maturity curve. So when I go talk to clients, I say, look, you need to be on a journey towards AI. I think probably nobody disagrees that they need something there, the question is, how do you get there? So you think about the steps, it's about, a lot of people started with, we're going to reduce the cost of our operations, we're going to use data to take out cost, that was kind of the Hadoop thrust, I would say. Then they moved to, well, now we need to see more about our data, we need higher performance data, BI data warehousing. So, everybody, I would say, has dabbled in those two areas. The next leap forward is self-service analytics, so how do you actually empower everybody in your organization to use and access data? And the next step beyond that is, can I use AI to drive new business models, new levers of growth, for my business? So, I ask clients, pin yourself on this journey. Most are, depending on the division or the part of the company, at different areas, but as I tell everybody, if you don't know where you are and you don't know where you want to go, you're just going to wind around, so I try to get them to pin down, where are you versus where do you want to go? >> So four phases, basically, the sort of cheap data store, the BI data warehouse modernization, self-service analytics, a big part of that is data science and data science collaboration, you guys have a lot of investments there, and then new business models with AI automation running on top. Where are we today?
Would you say we're kind of in-between BI/DW modernization and on our way to self-service analytics, or what's your sense? >> I'd say most are right in the middle between BI data warehousing and self-service analytics. Self-service analytics is hard, because it requires you, sometimes, to take a couple steps back and look at your data. It's hard to provide self-service if you don't have a data catalog, if you don't have data security, if you haven't gone through the processes around data governance. So, sometimes you have to take one step back to go two steps forward. That's why I see a lot of people, I'd say, stuck in the middle right now. And the examples that you're going to see tonight as part of the broadcast are clients that have figured out how to break through that wall, and I think that's pretty illustrative of what's possible. >> Okay, so you're saying that, got to maybe take a step back and get the infrastructure right with, let's say, a catalog, and some basic things that they have to do, some x's and o's. You've got the Vince Lombardi playbook out here. And also, skillsets, I imagine, is a key part of that. So, that's what they've got to do to get prepared, and then, what's next? They start creating new business models. I imagine this is where the chief data officer comes in, and it's an executive-level role. What are you seeing with clients as part of digital transformation, what's the conversation like with customers? >> The biggest change, the great thing about the times we live in, is technology's become so accessible, you can do things very quickly. We created a team last year called Data Science Elite, and we've hired what we think are some of the best data scientists in the world. Their only job is to go work with clients and help them get to a first success with data science. So, we put a team in.
Normally one month, two months, normally a team of two or three people, our investment, and we say, let's go build a model, let's get to an outcome, and you can do this incredibly quickly now. I tell clients, when I see somebody that says, we're going to spend six months evaluating and thinking about this, I'm like, why would you spend six months thinking about this when you could actually do it in one month? So you just need to get over the edge and go try it. >> So we're going to learn more about the Data Science Elite team. We've got John Thomas coming on today, who is a distinguished engineer at IBM, and he's very much involved in that team, and I think we have a customer who's actually gone through that, so we're going to talk about what their experience was with the Data Science Elite team. Alright, you've got some hard news coming up, you've actually made some news earlier with Hortonworks and Red Hat, I want to talk about that, but you've also got some hard news today. Take us through that. >> Yeah, let's talk about all three. First, Monday we announced the expanded relationship with both Hortonworks and Red Hat. This goes back to one of the core beliefs I talked about: every enterprise is modernizing their data and application estates, I don't think there's any debate about that. We are big believers in Kubernetes and containers as the architecture to drive that modernization. The announcement on Monday was, we're working closer with Red Hat to take all of our data services as part of Cloud Private for Data, which are basically microservices for data, and we're running those on OpenShift, and we're starting to see great customer traction with that. And where does Hortonworks come in? Hadoop has been the outlier on moving to microservices and containers, so we're working with Hortonworks to help them make that move as well. So, it's really about the three of us getting together and helping clients with this modernization journey.
>> So, just to remind people, you remember ODPI, folks? It was all this kerfuffle about, why do we even need this? Well, what's interesting to me about this triumvirate is, well, first of all, Red Hat and Hortonworks are hardcore open source, IBM's always been a big supporter of open source. You three got together and you're now proving out the productivity of this relationship for customers. You guys don't talk about this, but Hortonworks had to, on its public call; the relationship with IBM drove many, many seven-figure deals, which obviously means that customers are getting value out of this, so it's great to see that come to fruition, and it wasn't just a Barney announcement a couple years ago, so congratulations on that. Now, there's this other news that you guys announced this morning, talk about that. >> Yeah, two other things. One is, we announced a relationship with Stack Overflow. 50 million developers go to Stack Overflow a month, it's an amazing environment for developers that are looking to do new things, and we're sponsoring a community around AI. Back to your point before, you asked, is there a skills gap in enterprises? There absolutely is, I don't think that's a surprise. Data science, AI developers, not every company has the skills they need, so we're sponsoring a community to help drive the growth of skills in and around data science and AI. So things like Python, R, Scala, these are the languages of data science, and it's a great relationship with us and Stack Overflow to build a community to get things going on skills. >> Okay, and then there was one more. >> Last one's a product announcement. This is one of the most interesting product announcements we've had in quite a while. Imagine this: you write a SQL query, and the traditional approach is, I've got a server, I point it at that server, I get the data, it's pretty limited. We're announcing technology where I write a query, and it can find data anywhere in the world.
I think of it as wide-area SQL. So it can find data on an automotive device, a telematics device, an IoT device, it could be a mobile device; we think of it as SQL the whole world. You write a query, you can find the data anywhere it is, and we take advantage of the processing power on the edge. The biggest problem with IoT has been the old mantra of, go find the data, bring it all back to a centralized warehouse; that makes it impossible to do it in real time. We're enabling real time because we can write a query once, find data anywhere. This is technology we've had in preview for the last year. We've been working with a lot of clients to prove out use cases, and we're integrating it as a capability inside of IBM Cloud Private for Data. So if you buy IBM Cloud Private for Data, it's there. >> Interesting, so when you've been around as long as I have, long enough to see some of the pendulum swings, and it's clearly a pendulum swing back toward decentralization and the edge, but the key is, from what you just described, is you're sort of redefining the boundary, so I presume it's the edge, any Cloud, or on premises, where you can find that data, is that correct? >> Yeah, so it's multi-Cloud. I mean, look, every organization is going to be multi-Cloud, like 100%, that's going to happen, and that could be private, it could be multiple public Cloud providers, but the key point is, data on the edge is not just limited to what's in those Clouds. It could be anywhere that you're collecting data. And, we're enabling an architecture which performs incredibly well, because you take advantage of processing power on the edge, where you can get data anywhere that it sits. >> Okay, so, then, I'm setting up a Cloud, I'll call it a Cloud architecture, that encompasses the edge, where essentially, there are no boundaries, and you're bringing security. We talked about containers before, we've been talking about Kubernetes all week here at a Big Data show.
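The "write a query once, find data anywhere" capability Rob describes is query federation: the same SQL statement is pushed to each source, executed where the data sits, and the results are merged centrally. A minimal stand-in for the pattern using two in-memory SQLite databases (this illustrates the idea only; it is not IBM's implementation, and the table and sensor names are invented):

```python
import sqlite3

# Two independent "edge" data sources, each with its own readings table.
def make_source(rows):
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE readings (device TEXT, temp REAL)")
    db.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    return db

source_east = make_source([("sensor-1", 21.5), ("sensor-2", 30.2)])
source_west = make_source([("sensor-3", 28.9)])

def federated_query(sql, params, sources):
    """Run the same SQL against every source and merge the results."""
    merged = []
    for db in sources:
        merged.extend(db.execute(sql, params).fetchall())
    return merged

# One query, executed wherever the data lives:
hot = federated_query("SELECT device, temp FROM readings WHERE temp > ?",
                      (25,), [source_east, source_west])
print(hot)
```

A production federation engine would additionally push filters and aggregations down to each source so only small result sets travel over the network, which is the edge-processing advantage Rob points to.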
And then of course, Cloud, and what's interesting, I think many of the Hadoop distro vendors kind of missed Cloud early on, and then now are sort of saying, oh wow, it's a hybrid world and we've got a part. You guys obviously made some moves, a couple billion dollar moves, to do some acquisitions and get hardcore into Cloud, so that becomes a critical component. You're not just limiting your scope to the IBM Cloud. You're recognizing that it's a multi-Cloud world, that's what customers want to do. Your comments. >> It's multi-Cloud, and it's not just the IBM Cloud, I think the most predominant Cloud that's emerging is every client's private Cloud. Every client I talk to is building out a containerized architecture. They need their own Cloud, and they need seamless connectivity to any public Cloud that they may be using. This is why you see such a premium being put on things like data ingestion, data curation. It's not popular, it's not exciting, people don't want to talk about it, but the biggest inhibitor, to this AI point, comes back to data curation, data ingestion, because if you're dealing with multiple Clouds, suddenly your data's in a bunch of different spots. >> Well, so you're basically, and we talked about this a lot on The Cube, you're bringing the Cloud model to the data, wherever the data lives. Is that the right way to think about it? >> I think organizations have spoken; set aside what they say, look at their actions. Their actions say, we don't want to move all of our data to any particular Cloud, we'll move some of our data. We need to give them seamless connectivity so that they can leave their data where they want, we can bring Cloud-native architecture to their data, we could also help move their data to a Cloud-native architecture if that's what they prefer.
>> Well, it makes sense, because you've got physics, latency, you've got economics, moving all the data into a public Cloud is expensive and just doesn't make economic sense, and then you've got things like GDPR, which says, well, you have to keep the data, certain laws of the land, if you will, that say, you've got to keep the data in whatever it is, in Germany, or whatever country. So those sort of edicts dictate how you approach managing workloads and what you put where, right? Okay, what's going on with Watson? Give us the update there. >> I get a lot of questions, people trying to peel back the onion of what exactly is it? So, I want to make that super clear here. Watson is a few things, start at the bottom. You need a runtime for models that you've built. So we have a product called Watson Machine Learning, runs anywhere you want, that is the runtime for how you execute models that you've built. Anytime you have a runtime, you need somewhere where you can build models, you need a development environment. That is called Watson Studio. So, we had a product called Data Science Experience, we've evolved that into Watson Studio, connecting in some of those features. So we have Watson Studio, that's the development environment, Watson Machine Learning, that's the runtime. Now you move further up the stack. We have a set of APIs that bring in human features, vision, natural language processing, audio analytics, those types of things. You can integrate those as part of a model that you build. And then on top of that, we've got things like Watson Applications, we've got Watson for call centers, doing customer service and chatbots, and then we've got a lot of clients who've taken pieces of that stack and built their own AI solutions. They've taken some of the APIs, they've taken some of the design time, the studio, they've taken some of the Watson Machine Learning. 
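The studio/runtime split Rob lays out, build a model in one environment, serialize it as an artifact, then load and score it in another, can be sketched with nothing but the standard library (a trivial linear model stands in for whatever a real studio would produce; this is not Watson's actual API):

```python
import pickle

class LinearModel:
    """Trivially simple model: weighted sum of features plus a bias."""
    def __init__(self, weights, bias):
        self.weights, self.bias = weights, bias
    def predict(self, features):
        return sum(w * x for w, x in zip(self.weights, features)) + self.bias

# "Studio" side: build the model and serialize it as a deployable artifact.
model = LinearModel(weights=[0.5, -0.25], bias=1.0)
artifact = pickle.dumps(model)

# "Runtime" side: load the artifact and score new data.
# This half could run on a different machine, cloud, or edge device.
deployed = pickle.loads(artifact)
print(deployed.predict([4.0, 2.0]))  # 0.5*4 - 0.25*2 + 1 = 2.5
```

The point of the split is exactly what Rob describes: the development environment and the runtime are separate concerns, so the same artifact can be deployed wherever it needs to run.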
So, it is really a stack of capabilities, and where we're driving the greatest productivity, and this is in a lot of the examples you'll see tonight from clients, is clients that have bought into this idea of, I need a development environment, I need a runtime, where I can deploy models anywhere. We're getting a lot of momentum on that, and then that raises the question of, well, do I have explainability, do I have trust and transparency, and that's another thing that we're working on. >> Okay, so there's an API oriented architecture, exposing all these services makes it very easy for people to consume. Okay, so we've been talking all week at CUBE NYC, as Big Data moves into AI, is this old wine, new bottle? I mean, it's clear, Rob, from the conversation here, there's a lot of substantive innovation, and early adoption, anyway, of some of these innovations, but a lot of potential going forward. Last thoughts?

Published Date : Sep 13 2018



Daniel Hernandez, IBM | Change the Game: Winning With AI 2018


 

>> Live from Times Square in New York City, it's theCUBE, covering IBM's Change the Game, Winning with AI, brought to you by IBM. >> Hi everybody, welcome back to theCUBE's special presentation. We're here at the Westin Hotel in the theater district covering IBM's announcements. They've got an analyst meeting today, partner event. They've got a big event tonight. IBM.com/winwithAI, go to that website, if you're in town register. You can watch the webcast online. You'll see this very cool play of Vince Lombardi, one of his famous plays. It's kind of a power sweep right which is a great way to talk about sort of winning and with X's and O's. So anyway, Daniel Hernandez is here the vice president of IBM analytics, long time CUBE alum. It's great to see you again, thanks for coming on. >> My pleasure Dave. >> So we've talked a number of times. We talked earlier this year. Give us the update on momentum in your business. You guys are doing really well, we see this in the quadrants and the waves, but your perspective.
What exactly is it all about? I mean it started kind of back in the ODPI days and it's really evolved into something that now customers are taking advantage of. >> You go back to June last year, we entered into a relationship with Hortonworks where the basic premise was: customers care about data, and any data driven initiative was going to require data science. We had to do a better job bringing these ecosystems, one focused on kind of Hadoop, the other one on classic enterprise analytical and operational data, together. We did that last year. The other element of that was we're going to bring our data science and machine learning tools and run times to where the data is, including Hadoop. That's been a resounding success. The next step up is how do we proliferate that single integrated stack everywhere, including private Cloud or preferred Clouds like OpenShift. So there was two elements of the announcement. We did the hybrid Cloud architecture initiative which is taking the Hadoop data stack and bringing it to containers and Kubernetes. That's a big deal for people that want to run the infrastructure with Cloud characteristics. And the other was we're going to bring that whole stack onto OpenShift. So on IBM's side, with IBM Cloud Private for Data we are driving certification of that entire stack on OpenShift so any customer that's betting on OpenShift as their Cloud infrastructure can benefit from that and the single integrated data stack. It's a pretty big deal. >> So OpenShift is really interesting because OpenShift was kind of quiet for awhile. It was quiescent if you will. And then containers come on the scene and OpenShift has just exploded. What are your perspectives on that and what's IBM's angle on OpenShift? >> Containers and Kubernetes basically allow you to get Cloud characteristics everywhere.
It used to be locked in to kind of the public Cloud or CSP providers that were offering as a service, whether PaaS or IaaS, and Docker and Kubernetes are making the same underlying technology that enabled elasticity, pay as you go models, available anywhere including your own data center. So I think it explains why OpenShift, why IBM Cloud Private, why IBM Cloud Private for Data just got on there. >> I mean the CoreOS move by Red Hat was genius. They picked that up for a song in our view anyway and it's really helped explode that. And in this world, everybody's talking about Kubernetes. I mean we're here at a big data conference all week. It used to be Hadoop world. Everybody's talking about containers, Kubernetes and Multi cloud. Those are kind of the hot trends. I presume you've seen the same thing. >> 100 percent. There's not a single client that I know, and I spend the majority of my time with clients, that are running their workloads in a single stack. And so what do you do? If data is an imperative for you, you better run your data analytic stack wherever you need to and that means Multi cloud by definition. So you've got a choice. You can say, I can port that workload to every distinct programming model and data stack, or you can have a data stack everywhere including Multi clouds and OpenShift in this case. >> So thinking about the three companies, so Hortonworks obviously the Hadoop distro specialists, open source, brings that end to end sort of data management from you know Edge, or Cloud, on-prem. Red Hat doing a lot of the sort of hardcore infrastructure layer. IBM bringing in the analytics and really empowering people to get insights out of data. Is that the right way to think about that triangle? >> 100 percent and you know with the Hortonworks and IBM data stacks, we've got our common services, particularly around open metadata, which means wherever your data is, you're going to know about it and you're going to be able to control it.
Privacy, security, data discovery reasons, that's a pretty big deal. >> Yeah and as the Cloud, well obviously the Cloud whether it's on-prem or in the public Cloud expands now to the Edge, you've also got this concept of data virtualization. We've talked about this in the past. You guys have made some announcements there. But let's put a double click on that a little bit. What's it all about? >> Data virtualization has been going on for a long time. Its basic intent is to help you access data through whatever tools, no matter where the data is. Traditional approaches of data virtualization are pretty limiting. So they work relatively well when you've got small data sets, but when you've got highly fragmented data, which is the case in virtually every enterprise that exists, a lot of the underlying technology for data virtualization breaks down. Data coming through a single head node. Ultimately that becomes the critical issue. So you can't take advantage of data virtualization technologies largely because of that when you've got wide scale deployments. We've been incubating technology under this project codename query plex, it was a code name that we used internally and that we were working with Beta clients on and testing it out, validating it technically, and it was pretty clear that this is a game changing method for data virtualization that allows you to drive the benefits of accessing your data wherever it is, pushing down queries to where the data is, and getting benefits of that through a highly fragmented data landscape. And so what we've done is take that extremely innovative next generation data virtualization technology and include it in our data platform called IBM Cloud Private for Data, and made it a critical feature inside of that.
Okay, so what's the secret sauce though of query plex and data virtualization? How does it all work? What's the tech behind it? >> So technically, instead of data coming and getting funneled through one node. If you ever think of your data as kind of a graph of computational data nodes. What query plex does is take advantage of that computational mesh to do queries and analytics. So instead of bringing all the data and funneling it through one of the nodes, and depending on the computational horsepower of that node and all the data being able to get to it, this just federates it out. It distributes out that workload so it's some magic behind the scenes but relatively simple technique. Low computing aggregate, it's probably going to be higher than whatever you can put into that single node. >> And how do customers access these services? How long does it take? >> It would look like a standard query interface to them. So this is all magic behind the scenes. >> Okay and they get this capability as part of what? IBM's >> IBM's Club Private for Data. It's going to be a feature, so this project query plex, is introduced as next generation data virtualization technology which just becomes a part of IBM Club Private for Data. >> Okay and then the other announcement that we talked to Rob, I'd like to understand a little bit more behind it. Actually before we get there, can we talk about the business impact of query plex and data virtualization? Thinking about it, it dramatically simplifies the processes that I have to go through to get data. But more importantly, it helps me get a handle on my data so I can apply machine intelligence. It seems like the innovation sandwich if you will. Data plus AI and then Cloud models for scale and simplicity and that's what's going to drive innovation. So talk about the business impact that people are excited about with regard to query plex. 
>> Better economics, so in order for you to access your data, you don't have to do ETL in this particular case. So data at rest gets consumed because of this online technology. Two, performance, so because of the way this works you're actually going to get faster response times. Three, you're going to be able to query more data simply because this technology allows you to access all your data in a fragmented way without having to consolidate it. >> Okay, so it eliminates steps, right, and gets you time to value and gives you a bigger corpus of data that you can then analyze and drive insight. >> 100 percent. >> Okay, let's talk about Stack Overflow. You know, Rob took us through a little bit about what that's, what's going on there but why Stack Overflow, you're targeting developers? Talk to me more about that. >> So Stack Overflow, 50 million active developers each month on that community. You're a developer and you want to know something, you have to go to Stack Overflow. You think about data science and AI as disciplines. The idea that that domain is confined to AI and data scientists is a very limiting idea. In order for you to actually apply artificial intelligence for whatever your use case is inside of a business, it's going to require multiple individuals working together to get that particular outcome done, including developers. So instead of having a distinct community for AI that's focused on AI machine developers, why not bring the artificial intelligence community to where the developers already are, which is Stack Overflow. So, if you go to AI.stackexchange.com, it's going to be the place for you to go to get all your answers to any question around artificial intelligence and of course IBM is going to be there in the community helping out.
This is like five or six years ago. He said data is the new development kit and now you guys are essentially targeting developers around AI, obviously data centric. People trying to put data at the core of the organization. You see that that's a winning strategy. What do you think about that? >> 100 percent, I mean we're the data company inside of IBM, so you're probably asking the wrong guy if you think >> You're biased. (laughing) >> Yeah possibly, but I'll acknowledge it. Data over opinions. >> Alright, tell us about tonight, what can we expect? I was referencing the Vince Lombardi play here. You know, what's behind that? What are we going to see tonight? >> We were joking a little bit about the old school power-I formation, but that obviously works for you, you're a New England fan aren't you? >> I am actually, if you saw the games this weekend the Pats were in the power-I for quite a bit of the game which I know upset a lot of people. But it works. >> Yeah, maybe we should've used it as a Dallas Cowboys team. But anyways, it's going to be an amazing night. So we're going to have a bunch of clients talking about what they're doing with AI. And so if you're interested in learning what's happening in the industry, it's kind of the perfect event to get it. We're going to do some expert analysis. It will be a little bit of fun breaking down what those customers did to be successful and maybe some tips and tricks that will help you along your way. >> Great, it's right up the street on the west side highway, probably about a mile from the Javits Center people that are at Strata. We've been running programs all week. One of the themes that we talked about, we had an event Tuesday night. We had a bunch of people coming in. There were people from financial services, we had folks from New York State, the city of New York.
It was a great meet up and we had a whole conversation going, and one of the things that we talked about, and I'd love to get your thoughts and kind of know where you're headed here: with big data, with all that talk, people ask, now that the conversation has moved to AI, is it same wine, new bottle, or is there something substantive here? The consensus was, there's substantive innovation going on. Your thoughts about where that innovation is coming from and what the potential is for clients? >> So if you're going to implement AI for, let's say, customer care for instance, you're going to need three ingredients. You need data, you need algorithms, you need compute. With a lot of distributed infrastructure we were able to capture data that wasn't captured in the traditional data systems, anchored by Hadoop and the big data movement. Where we landed, we created a data and computational grid for that data today. With all the advancements going on in algorithms, particularly in Open Source, you now have, you can build neural networks, you can do classic machine learning in any language that you want. And bringing those together is exactly the combination that you need to implement any AI system. You already have data and computational grids here. You've got algorithms; bringing them together, solving some problem that matters to a customer, is like the natural next step. >> And despite the skills gap, the skill gaps that we talked about, you're seeing a lot of knowledge transfer, a lot of expertise getting out there into the wild. When you follow people like Kirk Borne on Twitter you'll see that he'll post like the 20 different models for deep learning and people are starting to share that information. And then that skills gap is closing. Maybe not as fast as some people like but it seems like the industry is paying attention to this and really driving hard to work toward it 'cause it's real. >> Yeah I agree.
You're going to have Seth Dulpren, I think it's Niagara, one of our clients. What I like about them is, in general there's two skill issues. There's one, where does data science and AI help us solve problems that matter in business? That's really trying to build a treasure map of potential problems you can solve with a stack. And Seth and Niagara are going to give you a really good basis for the kinds of problems that we can solve. I don't think there's enough of that going on. There's a lot of commentary, communication, and actual work underway on the technical skill problem. You know, how do I actually build these models. But there's not enough in how do I, now that I solved that problem, how do we marry it to problems that matter? So the skills gap, you know, we're doing our part with our Data Science Elite team, which Seth runs, which is telling a customer, pick a hard problem, give us some data, give us some domain experts. We're going to bring in the AI and ML experts and we're going to see what happens. So the skill problem is very serious but I don't think most people are having the right conversations about it necessarily. They understand intuitively there's a tech problem, but tech not linked to a business problem that matters means nothing. >> Yeah it's not insurmountable, I'm glad you mentioned that. We're going to be talking to Niagara Bottling and how they use the Data Science Elite team as an accelerant, to kind of close that gap. And I'm really interested in the knowledge transfer that occurred and of course the one thing about IBM and companies like IBM is you get not only technical skills but you get deep industry expertise as well. Daniel, always great to see you. Love talking about the offerings and going deep. So good luck tonight. We'll see you there and thanks so much for coming on theCUBE. >> My pleasure. >> Alright, keep it right there everybody. This is Dave Vellante. We'll be back right after this short break.
You're watching theCUBE. (upbeat music)

Published Date : Sep 13 2018



Yaron Haviv, Iguazio | theCUBE NYC 2018


 

>> Live from New York It's theCUBE! Covering theCUBE New York City 2018 Brought to you by Silicon Angle Media and its ecosystem partners >> Hey welcome back and we're live in theCUBE in New York city. It's our 2nd day of two days of coverage CUBE NYC. The hashtag CUBENYC. Formerly Big Data NYC, renamed because it's about big data, it's about serverless, it's about Kubernetes and multi-cloud data. It's all about data, and that's the fundamental change in the industry. Our next guest is Yaron Haviv, who's the CTO of Iguazio, key alumni, always coming out with some good commentary, smart analysis. Kind of a guest host as well as an industry participant supplier. Welcome back to theCUBE. Good to see you. >> Thank you John. >> Love having you on theCUBE because you always bring some good insight and we appreciate that. Thank you so much. First, before we get into some of the comments because I really want to delve into comments that David Richards said a few years ago, CEO of WANdisco. He said, "Cloud's going to kill Hadoop". And people were looking at him like, "Oh my God, who is this heretic? He's crazy. What is he talking about?" But you might not need Hadoop, if you can run serverless Spark, TensorFlow.... You talk about this off camera. Is Hadoop going to be the OpenStack of the big data world? >> I don't think cloud necessarily killed Hadoop, although it is working on that, you know because you go to Amazon and you know, you can consume a bunch of services and you don't really need to think about Hadoop. I think cloud native serverless is starting to kill Hadoop, cause Hadoop is three layers, you know, it's a file system, HDFS, then you have resource scheduling, YARN, then you have applications starting with MapReduce and then you evolve into things like Spark. Okay, so, file system I don't really need in the cloud. I use S3, I can use a database as a service, as you know, pretty efficient way of storing data.
For scheduling, Kubernetes is a much more generic way of scheduling workloads and not confined to Spark and specific workloads. I can run with TensorFlow, I can run with data science tools, etc., just containerized. So essentially, why would I need Hadoop? If I can take the traditional tools people are now evolving in and using, like Jupyter Notebooks, Spark, TensorFlow, you know, those packages with Kubernetes on top of a database as a service and some object store, I have a much easier stack to work with. And I could mobilize that whether it's in the cloud, you know, on different vendors. >> Scale is important too. How do you scale it? >> Of course, you have independent scaling between data and computation, unlike Hadoop. So I can just go to Google and use BigQuery, or use, you know, DynamoDB on Amazon or Redshift, or whatever, and automatically scale it down and then, you know >> That's a unique position, so essentially, Hadoop versus Kubernetes is a top-line story. And wouldn't that be ironic for Google, because Google essentially created MapReduce and Cloudera ran with it and went public, but when we're talking about 2008 timeframe, 2009 timeframe, back when ventures with cloud were just emerging in the mainstream. So wouldn't it be ironic if Kubernetes, which is being driven by Google, ends up taking over Hadoop? In terms of running things on Kubernetes in the cloud, vis-a-vis on premise with Hadoop. >> People tend to give this comment about Google, but essentially Yahoo started Hadoop. Google started the technology, and a couple of years after Hadoop started, Google essentially moved to a different architecture, with something called Percolator. So Google's not too associated with Hadoop. They haven't really been using this approach for a long time. >> Well they wrote the MapReduce paper and the internal conversations we report on theCUBE about Google was, they just let that go. And Yahoo grabbed it.
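The MapReduce pattern the two are referring to is separable from the Hadoop stack itself — the same map, shuffle, and reduce steps can run on Spark, on Kubernetes, or, as in this hedged sketch with invented sample data, in plain Python over in-memory partitions:

```python
from collections import defaultdict
from itertools import chain

# Word count as classic map -> shuffle -> reduce, with no Hadoop layer:
# the pattern only needs partitioned data and something to schedule it.

partitions = [
    ["kubernetes runs spark", "spark replaces mapreduce"],
    ["kubernetes schedules anything"],
]

def map_phase(lines):
    # Emit (key, value) pairs from one partition's records
    return [(word, 1) for line in lines for word in line.split()]

def shuffle(pairs):
    # Group values by key, as the framework would between phases
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Combine each key's values into a final result
    return {key: sum(values) for key, values in groups.items()}

counts = reduce_phase(shuffle(chain.from_iterable(map_phase(p) for p in partitions)))
print(counts["kubernetes"], counts["spark"])  # 2 2
```

Which scheduler fans those phases out across machines — YARN, Kubernetes, or a cloud service — is exactly the layer the conversation argues has become interchangeable.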
(cross-conversation) >> The companies that had the most experience were the first to leave. And I think it maps to what you're saying. As the marketplace realizes the outcomes that Hadoop is associated with, they will find other ways of achieving those outcomes. It might be more depth. >> There's also a fundamental shift in the consumption, where Hadoop was about ranking pages in a batch form. You know, just collecting logs and ranking pages, okay. The challenges that people have today revolve around applying AI to business applications. It needs to be a lot more concurrent, transactional, real-time ish, you know? It's nothing to do with Hadoop, okay? So that's why you'll see more and more workloads mobilizing to different serverless functions, into pre-canned services, etc. And Kubernetes playing a good role here is providing the transport for migrating workloads across cloud providers, because I can use GKE, the Google Kubernetes Engine, or Amazon Kubernetes, or Azure Kubernetes, and I could write a similar application and deploy it on any cloud, or on-prem on my own private cluster. It makes the infrastructure agnostic, really application focused. >> Question about Kubernetes we heard on theCUBE earlier, the VP of Project BlueData said that the Kubernetes ecosystem and community needs to do a better job with stateful; they nailed stateless, stateful application support is something that they need help on. Do you agree with that comment, and then if so, what alternatives do you have for customers who care about stateful? >> They should use our product (laughing) >> (mumbling) Is Kubernetes struggling there? And if so, talk about your product >> So, I think the challenge is around that there are many solutions in that space. I think that they are attacking it from a different approach. Many of them are essentially providing some block storage to different containers, which isn't really cloud-native. What you want is to be able to have multiple containers access the same data.
That means either sharing through file systems, or objects, or through databases, because one container is generating, for example, ingestion or __________. Another container is manipulating that same data. A third container may look for something in the data, and generate a trigger or an action. So you need shared access to data from those containers. >> So the state of the data synchronizes all three of those things. >> Yes, because the data is the form of state. The form of state cannot be associated with a single container, which is why, and I am very active and sincere in those committees, and you have all the storage guys in the committees, and they think block storage is the solution. Cause they still think like virtual machines, okay? But the general idea is that if you think about Kubernetes as like the new OS, where you have many processes, they're just scattered around. In an OS, the way for us to share state between processes is either through files, or through databases, in those forms. And that's really what >> Threads and databases as a positive engagement. >> So essentially I gave, maybe two years ago, a session at KubeCon in Europe about what we're doing on storing state. It's really high-performance access from those container processes to our database. It presents objects, files, streams or time series data, etc. And then essentially, all those workloads just mount on top of it and we can all share state. We can even control the access for each >> Do you think you nailed the state problem? >> Yes, by the way, we have a managed service. Anyone could go today to our cloud, to our website, that's in our cloud. It gets its own Kubernetes cluster, provisioned within less than 10 minutes, five to 10 minutes. With all of those services pre-integrated with Spark, Presto, ______________, real-time, these serverless functions. All that pre-configured on its own time.
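The shared-state model being described — several containers reading and writing the same records through a common data service instead of each holding a private block volume — can be sketched roughly as follows. This is an illustration only, with an in-memory dictionary standing in for the shared database and plain functions standing in for the three containers:

```python
# Minimal stand-in for a shared data service: in a real deployment this
# would be a database or object store reachable from every pod, not a dict.
class SharedStore:
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

store = SharedStore()

def ingest_container(store):
    # First "container": writes new data into the shared service
    store.put("sensor/1", {"temp": 71})

def enrich_container(store):
    # Second "container": manipulates the same record in place
    record = store.get("sensor/1")
    store.put("sensor/1", {**record, "alert": record["temp"] > 70})

def trigger_container(store):
    # Third "container": looks at the shared state and acts on it
    return store.get("sensor/1")["alert"]

ingest_container(store)
enrich_container(store)
print(trigger_container(store))  # True
```

With per-container block volumes, each of these three workloads would own a disk the others cannot see; routing all access through one shared service is what lets them cooperate on the same state.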
I figured all of these- >> 100% compatible with Kubernetes, it's a good investment. >> Well, we're just expanding it to the managed Kubernetes services. Now it's working on Amazon Kubernetes, EKS I think, and we're working on AKS and GKE. We partner with Azure and Google. And we're also building an edge solution that is essentially exactly the same stack. It can run on an edge appliance in a factory. You can essentially mobilize data and functions back and forth. So you can go and develop your workloads, your application in the cloud, test it under simulation, push a single button and teleport the artifacts into the edge factory. >> So is it like a real-time Kubernetes? >> Yes, it's a real-time Kubernetes. >> If you _______like the things we're doing, it's all real-time. >> Talk about real-time in the database world, because you mentioned time-series databases. You gave object store versus block. Talk about time series. You're talking about data that is very relevant in the moment, and also understanding time series data. And then it's important post-event, if you will, meaning: How do you store it? Do you care? I mean, it's important to manage the time series. At the same time, it might not be as valuable as other data, or valuable at certain points in time, which changes its relationship to how it's stored and how it's used. Talk about the dynamic of time series. >> We figured out in the last six or 12 months that real-time is about time series. Everything you think about in real-time is sensor data; even video is a time series of frames, okay. And what everyone wants to do is ingest huge amounts of time series and cross-correlate it, because, for example, you think about stock tickers, you know, the stock has an impact from news feeds or Twitter feeds of a company or a segment.
So essentially, what they need to do is something called multi-variate analysis of multiple time series to be able to extract some meaning, and then decide if you want to sell or buy a stock, as in that application example. And there is a huge gap in the solutions in that market, because most of the time series databases were designed as operational databases, you know, things that monitor apps. Nothing that ingests millions of data points per second, cross-correlates, and runs real-time AI analytics. Ah, so we've essentially extended, because we have a programmable database essentially under the hood. We've extended it to support time series data with about a 50-to-1 compression ratio compared to some other solutions. You know, we've been with a customer, we've done sizing, and they told us they need half a petabyte. After a small sizing exercise, it came to about 10 to 20 terabytes of storage for the same data they stored in Cassandra in 500 terabytes. Plus huge ingestion rates, and what's very important, we can do all those cross-correlations in-flight, so that's something that's working very well for us. >> This could help on smart mobility. Connectivity, 5G coming on, certainly. Intelligent edge. >> So for the customers we have, the use cases that we've applied right now are in financial services, two or three main applications. One is tick data and analytics; everyone wants to be smarter about how to buy and sell stocks or manage risk. The second one is infrastructure monitoring: critical infrastructure monitoring, SLA monitoring, being able to monitor network devices, latencies, applications, you know, transaction rates, and be able to predict potential failures or escalations. We have similar applications; we have about three Telco customers using it for real-time time-series analytics on metric data, cybersecurity attacks, congestion avoidance, SLA management, and also automotive.
Fleet management, file linking, they are also essentially feeding huge data sets into time series analytics. They're running cross-correlation and AI logic, so now they can generate triggers. Now, compare that to Hadoop. What does Hadoop have to do with those kinds of applications? It cannot ingest huge amounts of data, it cannot react in real-time, and it doesn't store time-series efficiently. >> Hapoop. (laughing) >> You said that. >> Yeah. That's good. >> One, I know we don't have a lot of time left. We're running out of time, but I want to make sure we get this out here. How are you engaging with customers? You guys have great technical support. We can vouch for the tech chops that you guys have. We've seen the solution. If it's compatible with Kubernetes, certainly this is an alternative for really great analytical infrastructure, the cloud-native goodness you're building. You do POCs, they go to your website, but how do you engage, how do you get deals? How do people work with you? >> So, because now we have a cloud service, we also engage through the cloud. Mainly, we're going after customers and leads from webinars and activities on the internet, and we sort of follow up with those customers, we know >> Direct sales? >> Direct sales, but through a lead generation mechanism. Marketplace activity, Amazon, Azure. >> Partnerships with Azure and Google now. And Azure joint selling activities. They can actually resell and get compensated. Our solution is an edge solution for Azure. We're working on a similar solution for Google. We're very focused on retailers. That's the current market focus, since you think about stores: a single supermarket will have more than a thousand cameras, okay, just because they're monitoring shelves in real-time. Think about Amazon Go, that kind of replication. Real-time inventory management. You cannot push a thousand camera feeds into the cloud in order to analyze them and then decide on inventory levels. Proactive action, so, those are the kinds of applications.
>> So bigger deals, you've had some big deals. >> Yes, we're really not a Raspberry Pi kind of solution. That's where the bigger customers are. >> Got it. Yaron, thank you so much. The CTO of Iguazio. Check him out. It's actually been great commentary, the Hadoop versus Kubernetes narrative. Love to explore that further with you. Stay with us for more coverage after this short break. We're live in day 2 of CUBE NYC. Strata, Hadoop World, CUBE Hadoop World, whatever you want to call it. It's all because of the data. We'll bring it to ya. Stay with us for more after this short break. (upbeat music)

Published Date : Sep 13 2018



Mick Hollison, Cloudera | theCUBE NYC 2018


 

(lively peaceful music) >> Live, from New York, it's The Cube. Covering "The Cube New York City 2018." Brought to you by SiliconANGLE Media and its ecosystem partners. >> Well, everyone, welcome back to The Cube special conversation here in New York City. We're live for Cube NYC. This is our ninth year covering the big data ecosystem, now evolved into AI, machine learning, cloud. All things data in conjunction with Strata Conference, which is going on right around the corner. This is the Cube studio. I'm John Furrier. Dave Vellante. Our next guest is Mick Hollison, who is the CMO, Chief Marketing Officer, of Cloudera. Welcome to The Cube, thanks for joining us. >> Thanks for having me. >> So Cloudera, obviously we love Cloudera. Cube started in Cloudera's office, (laughing) everyone in our community knows that. I keep, keep saying it all the time. But we're so proud to have the honor of working with Cloudera over the years. And, uh, the thing that's interesting though is that the new building in Palo Alto is right in front of the old building where the first Palo Alto office was. So, a lot of success. You have a billboard in the airport. Amr Awadallah is saying, hey, it's a milestone. You're in the airport. But your business is changing. You're reaching new audiences. You have, you're public. You guys are growing up fast. All the data is out there. Tom's doing a great job. But, the business side is changing. Data is everywhere, it's a big, hardcore enterprise conversation. Give us the update, what's new with Cloudera. >> Yeah. Thanks very much for having me again. It's, it's a delight. I've been with the company for about two years now, so I'm officially part of the problem now. (chuckling) It's been a, it's been a great journey thus far. And really the first order of business when I arrived at the company was, like, welcome aboard. We're going public. Time to dig into the S-1 and reimagine who Cloudera is going to be five, ten years out from now. 
And we spent a good deal of time, about three or four months, actually crafting what turned out to be just 38 total words and kind of a vision and mission statement. But the, the most central to those was what we were trying to build. And it was a modern platform for machine learning analytics in the cloud. And, each of those words, when you unpack them a little bit, are very, very important. And this week, at Strata, we're really happy on the modern platform side. We just released Cloudera Enterprise Six. It's the biggest release in the history of the company. There are now over 30 open-source projects embedded into this, something that Amr and Mike could have never imagined back in the day when it was just a couple of projects. So, a very very large and meaningful update to the platform. The next piece is machine learning, and Hilary Mason will be giving the kickoff tomorrow, and she's probably forgotten more about ML and AI than somebody like me will ever know. But she's going to give the audience an update on what we're doing in that space. But, the foundation of having that data management platform, is absolutely fundamental and necessary to do good machine learning. Without good data, without good data management, you can't do good ML or AI. Sounds sort of simple but very true. And then the last thing that we'll be announcing this week, is around the analytics space. So, on the analytic side, we announced Cloudera Data Warehouse and Altus Data Warehouse, which is a PaaS flavor of our new data warehouse offering. And last, but certainly not least, is just the "optimize for the cloud" bit. So, everything that we're doing is optimized not just around a single cloud but around multi-cloud, hybrid-cloud, and really trying to bridge that gap for enterprises and what they're doing today. So, it's a new Cloudera to say the very least, but it's all still based on that core foundation and platform that, you got to know it, with very early on. 
>> And you guys have operating history too, so it's not like it's a pivot for Cloudera. I know for a fact that you guys had very large-scale customers, both with three letter, letters in them, the government, as well as just commercial. So, that's cool. Question I want to ask you is, as the conversation changes from, how many clusters do I have, how am I storing the data, to what problems am I solving because of the enterprises. There's a lot of hard things that enterprises want. They want compliance, all these, you know things that have either legacy. You guys work on those technical products. But, at the end of the day, they want the outcomes, they want to solve some problems. And data is clearly an opportunity and a challenge for large enterprises. What problems are you guys going after, these large enterprises in this modern platform? What are the core problems that you guys knock down? >> Yeah, absolutely. It's a great question. And we sort of categorize the way we think about addressing business problems into three broad categories. We use the terms grow, connect, and protect. So, in the "grow" sense, we help companies build or find new revenue streams. And, this is an amazing part of our business. You see it in everything from doing analytics on clickstreams and helping people understand what's happening with their web visitors and the like, all the way through to people standing up entirely new businesses based simply on their data. One large insurance provider that is a customer of ours, as an example, has taken on the challenge and asked us to engage with them on building really, effectively, insurance as a service. So, think of it as data-driven insurance rates that are gauged based on your driving behaviors in real time. So no longer simply just using demographics as the way that you determine, you know, all 18-year old young men are poor drivers. As it turns out, with actual data you can find out there's some excellent 18 year olds. 
>> Telematic, not demographics! >> Yeah, yeah, yeah, exactly! >> That Tesla don't connect to the >> Exactly! And Parents will love this, love this as well, I think. So they can find out exactly how their kids are really behaving by the way. >> They're going to know I rolled through the stop signs in Palo Alto. (laughing) My rates just went up. >> Exactly, exactly. So, so helping people grow new businesses based on their data. The second piece is "Connect". This is not just simply connecting devices, but that's a big part of it, so the IOT world is a big engine for us there. One of our favorite customer stories is a company called Komatsu. It's a mining manufacturer. Think of it as the ones that make those, just massive mines that are, that are all over the world. They're particularly big in Australia. And, this is equipment that, when you leave it sit somewhere, because it doesn't work, it actually starts to sink into the earth. So, being able to do predictive maintenance on that level and type and expense of equipment is very valuable to a company like Komatsu. We're helping them do that. So that's the "Connect" piece. And last is "Protect". Since data is in fact the new oil, the most valuable resource on earth, you really need to be able to protect it. Whether that's from a cyber security threat or it's just meeting compliance and regulations that are put in place by governments. Certainly GDPR is got a lot of people thinking very differently about their data management strategies. So we're helping a number of companies in that space as well. So that's how we kind of categorize what we're doing. >> So Mick, I wonder if you could address how that's all affected the ecosystem. I mean, one of the misconceptions early on was that Hadoop, Big Data, is going to kill the enterprise data warehouse. NoSQL is going to knock out Oracle. And, Mike has always said, "No, we are incremental". And people are like, "Yeah, right". But that's really, what's happened here. >> Yes. 
>> EDW was a fundamental component of your big data strategies. As Amr used to say, you know, SQL is the killer app for, for big data. (chuckling) So all those data sources that have been integrated. So you kind of fast forward to today, you talked about IOT and The Edge. You guys have announced, you know, your own data warehouse and platform as a service. So you see this embracing in this hybrid world emerging. How has that affected the evolution of your ecosystem? >> Yeah, it's definitely evolved considerably. So, I think I'd give you a couple of specific areas. So, clearly we've been quite successful in large enterprises, so the big SI type of vendors want a, want a piece of that action these days. And they're, they're much more engaged than they were early days, when they weren't so sure all of this was real. >> I always say, they like to eat at the trough and then the trough is full, so they dive right in. (all laughing) They're definitely very engaged, and they built big data practices and distinctive analytics practices as well. Beyond that, sort of the developer community has also begun to shift. And it's shifted from simply people that could spell, you know, Hive or could spell Kafka and all of the various projects that are involved. And it is elevated, in particular into a data science community. So one of additional communities that we sort of brought on board with what we're doing, not just with the engine and SPARK, but also with tools for data scientists like Cloudera Data Science Workbench, has added that element to the community that really wasn't a part of it, historically. So that's been a nice add on. And then last, but certainly not least, are the cloud providers. And like everybody, they're, those are complicated relationships because on the one hand, they're incredibly valuable partners to it, certainly both Microsoft and Amazon are critical partners for Cloudera, at the same time, they've got competitive offerings. 
So, like most successful software companies, there's a lot of coopetition to contend with that also wasn't there just a few years ago, when we didn't have cloud offerings and they didn't have, you know, data warehouse in the cloud offerings. But those are things that have sort of impacted the ecosystem. >> So, I've got to ask you a marketing question, since you're the CMO. By the way, great messaging. I like the "grow, connect, protect." I think that's really easy to understand. >> Thank you. >> And the other one was modern. The phrase, say the phrase again. >> Yeah. It's "Cloudera builds the modern platform for machine learning analytics optimized for the cloud." >> Very tight mission statement. Question on the name. Cloudera. >> Mmhmm. >> It's spelled, it's actually cloud with ERA in the letters, so "the cloud era." People use that term all the time. We're living in the cloud era. >> Yes. >> Cloud-native is the hottest market right now in the Linux Foundation. The CNCF has over two hundred and forty members and growing. Cloud-native clearly indicates that the new, modern developers are here in the renaissance of software development; in general, enterprises want more developers. (laughs) Not that you want to be against developers, because, clearly, they're going to hire developers. >> Absolutely. >> And you're going to enable that. And then you've got the, obviously, cloud-native on-premise dynamic, hybrid cloud and multi-cloud. So are there plans to think about that cloud era? Is it a cloud positioning? You see cloud certainly important in what you guys do, because the cloud creates more compute, more capabilities to move data around. >> Sure. >> And (laughs) process it. And make it, make machine learning go faster, which gives more data, more AI capabilities, >> It's the flywheel you and I were discussing. >> It's the flywheel of, what's the innovation sandwich, Dave? You know?
(laughs) >> A little bit of data, a little bit of machine intelligence, in the cloud. >> So, the innovation's in play. >> Yeah, absolutely. >> Positioning around cloud. How are you looking at that? >> Yeah. So, it's a fascinating story. You were with us in the earliest days, so you know that the original architecture of everything that we built was intended to be run in the public cloud. It turns out, in 2008, there were exactly zero customers that wanted all of their data in a public cloud environment. So the company actually pivoted and re-architected the original design of the offerings to work on-prem. And no sooner did we do that than it was time to re-architect it yet again. And we are right in the midst of doing that. So, we really have offerings that span the whole gamut. If you want to just pick up your whole current Cloudera environment in an infrastructure-as-a-service model, we offer something called Altus Director that allows you to do that. Just pick up the entire environment, ship it up onto AWS or Microsoft Azure, and off you go. If you want the convenience and the elasticity and the ease of use of a true platform as a service, just this past week we announced Altus Data Warehouse, which is a platform-as-a-service kind of model. Beyond data warehousing, we have the data engineering module for Altus as well. Last, but not least, not everybody's going to sign up for just one cloud vendor. So we're big believers in multi-cloud, and that's why we support the major cloud vendors that are out there. And, in addition to that, it's going to be a hybrid world for as far out as we can see it. People are going to have certain workloads that, either for economics or for security reasons, they're going to continue to want to run in-house. And they're going to have other workloads, certainly more transient workloads, and I think ML and data science will fall into this camp, where the public cloud's going to make a great deal of sense.
And, allowing companies to bridge that gap while maintaining one security compliance and management model, something we call a Shared Data Experience, is really our core differentiator as a business. That's at the very core of what we do. >> Classic cloud workload experience that you're bringing, whether it's on-prim or whatever cloud. >> That's right. >> Cloud is an operating environment for you guys. You look at it just as >> The delivery mechanism. In effect. Awesome. All right, future for Cloudera. What can you share with us. I know you're a public company. Can't say any forward-looking statements. Got to do all those disclaimers. But for customers, what's the, what's the North Star for Cloudera? You mentioned going after a much more hardcore enterprise. >> Yes. >> That's clear. What's the North Star for you guys when you talk to customers? What's the big pitch? >> Yeah. I think there's a, there's a couple of really interesting things that we learned about our business over the course of the past six, nine months or so here. One, was that the greatest need for our offerings is in very, very large and complex enterprises. They have the most data, not surprisingly. And they have the most business gain to be had from leveraging that data. So we narrowed our focus. We have now identified approximately five thousand global customers, so think of it as kind of Fortune or Forbes 5000. That is our sole focus. So, we are entirely focused on that end of the market. Within that market, there are certain industries that we play particularly well in. We're incredibly well-positioned in financial services. Very well-positioned in healthcare and telecommunications. Any regulated industry, that really cares about how they govern and maintain their data, is really the great target audience for us. And so, that continues to be the focus for the business. And we're really excited about that narrowing of focus and what opportunities that's going to build for us. 
To not just land new customers, but more to expand our existing ones into a broader and broader set of use cases. >> And data is coming down faster. There's more data growth than ever seen before. It's never stopping.. It's only going to get worse. >> We love it. >> Bring it on. >> Any way you look at it, it's getting worse or better. Mick, thanks for spending the time. I know you're super busy with the event going on. Congratulations on the success, and the focus, and the positioning. Appreciate it. Thanks for coming on The Cube. >> Absolutely. Thank you gentlemen. It was a pleasure. >> We are Cube NYC. This is our ninth year doing all action. Everything that's going on in the data world now is horizontally scaling across all aspects of the company, the society, as we know. It's super important, and this is what we're talking about here in New York. This is The Cube, and John Furrier. Dave Vellante. Be back with more after this short break. Stay with us for more coverage from New York City. (upbeat music)

Published Date : Sep 13 2018



DD, Cisco + Han Yang, Cisco | theCUBE NYC 2018


 

>> Live from New York, it's theCUBE! Covering theCUBE, New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to the live CUBE coverage here in New York City for CUBE NYC, #CubeNYC. This is coverage of all things data, all things cloud, all things machine learning here in the big data realm. I'm John Furrier with Dave Vellante. We've got two great guests from Cisco: DD, who is the Vice President of Data Center Marketing at Cisco, and Han Yang, who is the Senior Product Manager at Cisco. Guys, welcome to the Cube. Thanks for coming on again. >> Good to see ya. >> Thanks for having us. >> So obviously one of the things that has come up this year at the big data show, which used to be called Hadoop World, then Strata Data, and now carries the latest name. And obviously CUBE NYC, we changed from Big Data NYC to CUBE NYC, because there's a lot more going on. I heard hallway conversations around blockchain and cryptocurrency; Kubernetes has been said on theCUBE at least a dozen times here today; multicloud. So you're seeing the analytical world, in a way, brought into the dynamics around IT infrastructure operations, both cloud and on premises. So interesting dynamics this year, almost a DevOps kind of culture coming to analytics. This is a new kind of sign from this community. Your thoughts? >> Absolutely. I think data and analytics is one of those things that's pervasive. Every industry, it doesn't matter. Even at Cisco, I know we're going to talk a little more about the new AI and ML workloads, but for the last few years, we've been using AI and ML techniques to improve networking, to improve security, to improve collaboration. So it's everywhere. >> You mean internally, in your own IT? >> Internally, yeah. Not just in IT, but in the way we're designing our network equipment.
We're storing data that's flowing through the data center, flowing in and out of clouds, and using that data to make better predictions for better networking, application performance, security, what have you. >> The first topic I want to talk to you guys about is around the data center. Obviously, you do data center marketing; that's where all the action is. The cloud, obviously, has been all the buzz, people going to the cloud, but Andy Jassy's announcement at VMworld really is a validation that, for the first time, hybrid multicloud is being validated. >> That's right. This is the first time Amazon's ever done anything of this magnitude on-premises. So this is a signal from the customers, voting with their wallets, that on-premises is a dynamic. The data center is where the data is; that's where the main footprint of IT is. This is important. What's the impact of that dynamic: the data center, where the data is, with the option of a cloud? How does that impact data, machine learning, and the things that you guys see as relevant? >> I'll start, and Han, feel free to chime in here. So I think those boundaries between "this is a data center, and this is a cloud, and this is campus, and this is the edge," I think those boundaries are going away. Like you said, the data center is where the data is. And it's the ability of our customers to be able to capture that data, process it, curate it, and use it for insight to take decisions locally. A drone is a data center that flies, and a boat is a data center that floats, right? >> And a cloud is a data center that no one sees. >> That's right. So those boundaries are going away. We at Cisco see this as a continuum. It's the edge-cloud continuum. The edge is exploding, right? There are just more and more devices, and those devices are cranking out more data than ever before. Like I said, it's the ability of our customers to harness the data to make more meaningful decisions.
So Cisco's take on this is a new architectural approach. It starts with the network, because the network is the one piece that connects everything: every device, every edge, every individual, every cloud. There's a lot of data within the network which we're using to make better decisions. >> I've been pretty close with Cisco over the years, since the '95 timeframe. I've had hundreds of meetings, some technical, some kind of business. But I've heard that term, the edge of the network, many times over the years. This is not a new concept at Cisco. Edge of the network actually means something in Cisco parlance. The edge of the network >> Yeah. >> that the packets are moving around. So again, this is not a new idea at Cisco. It's just materialized itself in a new way. >> It's not, but what's happening is the edge is just now generating so much data, and if you can use that data, convert it into insight and make decisions, that's the exciting thing. And that's why this whole thing about machine learning and artificial intelligence, it's the data that's being generated by these cameras, these sensors. So that's what is really, really interesting. >> Go ahead, please. >> One of our own studies pointed out that by 2021, there will be 847 zettabytes of information out there, but only 1.3 zettabytes will actually ever make it back to the data center. That just means an opportunity for analytics at the edge to make sense of that information before it ever makes it home. >> What were those numbers again? >> I think it was like 847 zettabytes of information. >> And how much makes it back? >> About 1.3. >> Yeah, there you go. So- >> So a huge compression- >> That confirms your research, Dave. >> We've been saying for a while now that most of the data is going to stay at the edge. There's no reason to move it back. The economics don't support it, the latency doesn't make sense. >> The network cost alone is going to kill you. >> That's right.
>> I think you really want to collect it, you want to clean it, and you want to correlate it before ever sending it back. Otherwise you're just sending back useless information. A status of "things are going well" 99.9 percent of the time is not very valuable. >> Temperature hasn't changed. (laughs) >> If it really goes wrong, that's when you want to alert or send more information. How did it go bad? Why did it go bad? Those are the more insightful things that you want to send back. >> This is not just for IoT. I mean, cat pictures moving between campuses cost money too, so why not just keep them local, right? But the basic concepts of networking. This is what I want to get to in my point, too. You guys have some new announcements around UCS and some of the hardware and the gear and the software. What are some of the new announcements that you're announcing here in New York, and what does it mean for customers? Because they want to know not only speeds and feeds. It's a software-driven world. How does the software relate? How does the gear work? What's the management look like? Where's the control plane? Where's the management plane? Give us all the data. >> I think the biggest issue starts from this. Data scientists, their task is to explore different data sources, find out the value. But at the same time, IT is somewhat lagging behind. Because as the data scientists go from data source A to data source B, it could be 3 petabytes of difference. IT is like, 3 petabytes? That's only from Monday through Wednesday? That's a huge infrastructure requirement change. So Cisco's way to help the customer is to make sure that we're able to come out with blueprints. Blueprints enabling the IT team to scale, so that the data scientists can work beyond their own laptop. As they work through the petabytes of data that's come in from all these different sources, they're able to collaborate well together and make sense of that information.
It's only by scaling, with IT helping the data scientists work at scale, that they can succeed. So that's why we announced a new server. It's called the C480 ML. It happens to have 8 GPUs from Nvidia inside, helping customers that want those deep learning kinds of capabilities. >> What are some of the use cases on these products? It's got some new data capabilities. What are some of the impacts? >> Some of the things that Han just mentioned. For me, I think the biggest differentiation in our solution is the things that we put around the box. So the management layer, right? I mean, this is not going to be one server in one data center. It's going to be multiple of them. You're never going to have one data center. You're going to have multiple data centers. And we've got a really cool management tool called Intersight, and this is supported in Intersight, day one. And Intersight also uses machine learning techniques to look at data from multiple data centers. And that's really where the innovation is. Honestly, I think every vendor is bending sheet metal around the latest chipset, and we've done the same. But the real differentiation is how we manage it, how we use the data for more meaningful insight. I think that's where some of our magic is. >> Can you add some color to that, in terms of infrastructure for AI and ML? How is it different than traditional infrastructures? So is the management different? The sheet metal is not different, you're saying. But what are some of those nuances that we should understand? >> I think especially for deep learning, multiple scientists around the world have pointed out that if you're able to use GPUs, they're able to run the deep learning frameworks faster by roughly two orders of magnitude. So that's part of the reason why, from an infrastructure perspective, we wanted to bring in those GPUs. But for the IT teams, we didn't want them to just add yet another infrastructure silo just to support AI or ML.
Therefore, we wanted to make sure it fits in with a UCS-managed unified architecture, enabling the IT team to scale but without adding more infrastructures and silos just for that new workload. Having that unified architecture helps IT be more efficient and, at the same time, better support the data scientists. >> The other thing I would add is, again, the things around the box. Look, this industry is still pretty nascent. There are lots of start-ups, lots of different solutions, and when we build a server like this, we don't just build a server and toss it over the fence to the customer and say "figure it out." No, we've done validated design guides. With Google, with some of the leading vendors in the space, to make sure that everything works as we say it would. And so it's all of those integrations, those partnerships, all the way through our systems integrators, to really understand a customer's AI and ML environment and fine-tune it for the environment. >> So is that really where a lot of the innovation comes from? Doing that hard work to say, "yes, it's going to be a solution that's going to work in this environment. Here's what you have to do to ensure best practice," etc.? Is that right? >> So I think some of our blueprints or validated designs are basically enabling the IT team to scale. Scale their storage, scale their CPU, scale their GPU, and scale their network. But do it in a way so that we work with partners like Hortonworks or Cloudera, so that they're able to take advantage of the data lake. And adding in the GPU so they're able to do the deep learning with TensorFlow, with PyTorch, or whatever curated deep learning framework the data scientists need to be able to get value out of those multiple data sources. These are the kind of solutions that we're putting together, making sure our customers are able to get to that business outcome sooner and faster, not just a-- >> Right, so there's innovation at all altitudes.
There's the hardware, there's the integrations, there's the management. So it's innovation. >> So not to go too much into the weeds, but I'm curious. As you introduce these alternate processing units, what is the relationship between traditional CPUs and these GPUs? Are you managing them differently, kind of communicating somehow, or are they sort of fenced off architecturally? I wonder if you could describe that. >> We actually want it to be integrated, because by having it separated and fenced off, well, that's an IT infrastructure silo. You're not going to have the same security policy or the storage mechanisms. We want it to be unified so it's easier on IT teams to support the data scientists. So therefore, the latest software is able to manage both CPUs and GPUs, as well as having a new file system. Those are the solutions that we're putting forth, so that our IT folks can scale and our data scientists can succeed. >> So IT's managing a logical block. >> That's right. And even for things like inventory management, or going back and adding patches in the event of some security event, it's so much better to have one integrated system rather than silos of management, which we see in the industry. >> So the hard news is basically UCS for AI and ML workloads? >> That's right. This is our first server custom built from the ground up to support these deep learning, machine learning workloads. We partnered with Nvidia, with Google. We announced earlier this week, and the phone is ringing constantly. >> I don't want to say god box. I just said it. (laughs) This is basically the power tool for deep learning. >> Absolutely. >> That's how you guys see it. Well, great. Thanks for coming out. Appreciate it, good to see you guys at Cisco. Again, deep learning dedicated technology around the box, not just the box itself. Ecosystem, Nvidia, good call. Those guys really get the hot GPUs out there. Saw those guys last night, great success they're having. They're a key partner with you guys.
>> Absolutely. >> Who else is partnering, real quick before we end the segment? >> We've been partnering on the software side with folks like Anaconda, with their Anaconda Enterprise, which data scientists love to use as their Python data science framework. We're working with Google, with their Kubeflow, which is an open source project integrating TensorFlow on top of Kubernetes. And of course we've been working with folks like Cloudera as well as Hortonworks to access the data lake from a big data perspective. >> Yeah, I know you guys didn't get a lot of credit. Google Cloud, we were certainly amplifying it. You guys were co-developing the Google Cloud servers with Google. I know they were announcing it, and you guys had Chuck on stage there with Diane Greene, so it was pretty positive. Good integration with Google can make a >> Absolutely. >> Thanks for coming on theCUBE, thanks, we appreciate the commentary. Cisco here on theCUBE. We're in New York City for theCUBE NYC. This is where the world of data is converging with IT infrastructure, developers, operators, all running analytics for future business. We'll be back with more coverage, after this short break. (upbeat digital music)
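The edge pattern the guests describe in this segment (collect it, clean it, correlate it, and only send back what went wrong) can be sketched in a few lines. The threshold rule and the sensor values below are invented for illustration; this is not Cisco's implementation:

```python
from statistics import mean, stdev

# Hypothetical edge-side filter: keep routine readings local and forward
# only anomalies, so "99.9 percent, things are going well" never leaves
# the edge and the network cost stays down.
def readings_to_forward(readings, threshold=3.0):
    """Return only readings more than `threshold` std devs from the mean."""
    if len(readings) < 2:
        return list(readings)
    mu, sigma = mean(readings), stdev(readings)
    if sigma == 0:
        return []  # perfectly flat signal: nothing worth sending home
    return [r for r in readings if abs(r - mu) / sigma > threshold]

# A thousand boring temperature samples and one failure spike: only the
# spike gets forwarded to the data center.
sensor = [20.1, 20.0, 19.9, 20.2, 20.1] * 200 + [85.0]
print(readings_to_forward(sensor))  # -> [85.0]
```

In a real deployment the cleaning and correlation steps would be far richer, but the economics are the same: the 847-zettabyte firehose stays at the edge, and only the insightful fraction travels back.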

Published Date : Sep 12 2018


Stephanie McReynolds, Alation | theCUBE NYC 2018


 

>> Live from New York, It's theCUBE! Covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Hello and welcome back to theCUBE, live in New York City, here for CUBE NYC, in conjunction with Strata Conference, Strata Data, Strata Hadoop. This is our ninth year covering the big data ecosystem, which has evolved into machine learning, A.I., data science, cloud, a lot of great things happening, all things data, impacting all businesses. I'm John Furrier, your host, with Dave Vellante and Peter Burris; Peter is filling in for Dave Vellante. Next guest, Stephanie McReynolds, who is the CMO, VP of Marketing for Alation. Thanks for joining us. >> Thanks for having me. >> Good to see you. So you guys have a pretty spectacular exhibit here in New York. I want to get to that right away, top story is Attack of the Bots. And you're showing a great demo. Explain what you guys are doing in the show. >> Yeah, well it's robot fighting time in our booth, so we brought a little fun to the show floor. My kids are.. >> You mean big data is not fun enough? >> Well big data is pretty fun, but occasionally you got to get your geek battle on there, so we're having fun with robots. But I think the real story in the Alation booth is about the product and how machine learning data catalogs are helping a whole variety of users in the organization, everything from improving analyst productivity, and even some business user productivity with data, to then really supporting data scientists in their work by helping them to distribute their data products through a data catalog. >> You guys are one of the new guard companies that are doing things that make it really easy for people who want to use data, practitioners, what has been called the average data citizen, or people who want productivity. Not necessarily the hardcore, setting up clusters, really kind of like the big data user.
What's that market look like right now, has it met your expectations, how's business, what's the update? >> Yeah, I think we have a strong perspective that for us to close the final mile and get to real value out of the data, it's a human challenge; there's a trust gap with managers. Today on stage over at Strata it was interesting, because Google had a speaker, and it wasn't their chief data officer, it was their chief decision scientist, and I think that reflects what that final mile is: making decisions. And it's the trust gap that managers have with data, because they don't know how the insights are coming to them, what are all the details underneath. In order to be able to trust decisions you have to understand who processed the data, what decision making criteria did they use, was this data governed well, are we introducing some bias into our algorithms, and can that be controlled? And so Alation becomes a platform for supporting getting answers to those issues. And then there's plenty of other companies that are optimizing the performance of those queries and the storage of that data, but we're trying to really close that trust gap. >> It's very interesting, because from a management standpoint we're trying to do more evidence-based management. So there's a major trend in board rooms and executive offices to try to find ways to acculturate the executive team to using data, evidence-based management in healthcare now being applied to a lot of other domains. We've also historically had a situation where the people who focused on or worked with the data were a relatively small coterie of individuals who created these crazy systems to try to bring those two together. It sounds like what you're doing, and I really like the idea of the data scientists being able to create data products that then can be distributed.
It sounds like you're trying to look at data as an asset to be created and distributed so it can be more easily used by more people in your organization. Have we got that right? >> Absolutely. So we're now in just over a hundred production implementations of Alation, at large enterprises, and we're now seeing those production implementations get into the thousands of users. So this is going beyond those data specialists. Beyond the unicorn data scientists that understand the systems and math and technology. >> And business. >> And business, right. In business. So what we're seeing now is that a data catalog can be a point of collaboration across those different audiences in an enterprise. So whereas three years ago some of our initial customers kept the data catalog implementations small, right, they gave access to this catalog to the specialists and asked them to certify data assets for others, what we're starting to see is a proliferation of creation of self-service data assets, a certification process that now is enterprise-wide, and thousands of users in these organizations. So eBay has over a thousand weekly logins. Munich Reinsurance was on stage yesterday; their head of data engineering said they have 2,000 users on Alation at this point on their data lake. Fiserv is going to speak on Thursday, and they're getting up to those numbers as well. So we see some really solid organizations that are solving medical and pharmaceutical issues, right, the largest reinsurer in the world, leading tech companies, starting to adopt a data catalog as a foundation for how they're going to make those data-driven decisions in the organization.
>> Talk about how the product works, because essentially you're bringing kind of the decision scientists, for lack of a better word, and the productivity worker, almost like a business office suite concept, as a SaaS. So you've got a SaaS model that says "Hey, you want to play with data, use it, but you have to do some front end work." Take us through how you guys roll out the platform, how are your customers consuming the service, take us through the engagement with customers. >> I think for customers, the most interesting part of this product is that it displays itself as an application that anyone can use, right? So there's a super familiar search interface that, rather than bringing back webpages, allows you to search for data assets in your organization. If you want more information on that data asset, you click on those search results and you can see all of the information of how that data has been used in the organization, as well as the technical details and the technical metadata. And I think what's even more powerful is we actually have a recommendation engine that recommends data assets to the user. And that can be plugged into Tableau and Salesforce Einstein Analytics, and a whole variety of other data science tools like Dataiku that you might be using in your organization. So this looks like a very easy to use application that folks are familiar with, that you just need a web browser to access, but on the backend, the hard work that's happening is the automation that we do with the platform. So by going out and crawling these source systems and looking at not just the technical descriptions of data, the metadata that exists, but then being able to understand, by parsing the SQL query logs, how that data is actually being used in the organization. We call it behavior I.O.
By looking at the behaviors of how that data's being used, from those logs, we can actually give you a really good sense of how that data should be used in the future, or where you might have gaps in governing that data, or how you might want to reorient your storage or compute infrastructure to support the type of analytics that are actually being executed by real humans in your organization. And that's eye opening to a lot of I.T. sources. >> So you're providing insights into the data usage so that the business could get optimized, whether it's the I.T. footprint component or the kinds of use cases. Is that kind of how it's working? >> So what's interesting is the optimization actually happens in a pretty automated way, because we can make recommendations to those consumers of data of how they want to navigate the system. Kind of like Google makes recommendations as you browse the web, right? >> If you misspell something, "Oh did you mean this", kind of thing? >> "Did you mean this, might you also be interested in this", right? It's kind of a cross between Google and Amazon. Others like you may have used these other data assets in the past to determine revenue for that particular region; have you thought about using this filter, have you thought about using this join, did you know that you're trying to do analysis that maybe the sales ops guy has already done, and here's the certified report, why don't you just start with that?
We're seeing a lot of reuse in organizations, where in the past, I think as an industry, when Tableau and Qlik and all these B.I. tools that were very self-service oriented started to take off, it was all about democratizing visualization by letting every user do their own thing, and now we're realizing, to get speed and accuracy and efficiency and effectiveness, maybe there's more reuse of the work we've already done in existing data assets. And by recommending those and expanding the data literacy around the interpretation of those, you might actually close this trust gap with the data. >> But there's one really important point that you raised, and I want to come back to it, and that is this notion of bias. So you know, Alation knows something about the data, knows a lot about the metadata, so therefore, I don't want to say understands, but it's capable of categorizing data in that way. And you're also able to look at the usage of that data by parsing some of the SQL statements and then making a determination of whether the data, as it's identified, is appropriately being used, based on how people are actually applying it, so you can identify potential bias or potential misuse or whatever else it might be. That is an incredibly important thing. As you know John, we had an event last night, and one of the things that popped up is how do you deal with emergence in data science and A.I., etc. And what methods do you put in place to actually ensure that the governance model can be extended to understand how those things are potentially, in a very soft way, corrupting the use of the data? So could you spend a little bit more time talking about that? Because it's something a lot of people are interested in, and quite frankly we don't know about a lot of tools that are doing that kind of work right now. It's an important point. >> I think the traditional viewpoint was if we can just manage the data we will be able to have a governed system.
So if we control the inputs then we'll have a safe environment, and that was kind of like the classic single-source-of-truth, data warehouse type model. >> Stewards of the data. >> What we're seeing is, with the proliferation of sources of data and how quickly, with IoT and new modern sources, data is getting created, you're not able to manage data at that entry point. And it's not just about systems, it's about individuals that go on the web and find a dataset and then load it into a corporate database, right? Or you merge an Excel file with something that's in a database. And so I think what we see happening, not only when you look at bias but if you look at some of the new regulations like [Inaudible] >> Sure. Ownership, [Inaudible] >> The logic that you're using to process that data, the algorithm itself, can be biased. If you have a biased training data set that you feed into a machine learning algorithm, the algorithm itself is going to be biased. And so the control point in this world, where data is proliferating and we're not sure we can control that entirely, becomes the logic embedded in the algorithm. Even if that's a simple SQL statement that's feeding a report. And so Alation is able to introspect that SQL and highlight that maybe there is bias at work in how this algorithm is composed. So with GDPR the consumer owns their own data; if they want to pull it out from a training data set, you've got to rerun that algorithm without that consumer data, and that's your control point going forward for the organization on different governance issues that pop up.
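The GDPR control point described above (pull one consumer's rows out of the training set and rerun the algorithm) is easy to see in a toy example. The per-segment average below stands in for a real machine learning model, and all the names and values are invented for illustration:

```python
# Toy "model": average spend per customer segment, fit from raw rows of
# (user, segment, spend). A real pipeline would train an actual model,
# but the erasure mechanics are the same.
def fit(rows):
    totals = {}
    for user, segment, spend in rows:
        totals.setdefault(segment, []).append(spend)
    return {seg: sum(v) / len(v) for seg, v in totals.items()}

training = [
    ("u1", "gold", 100.0),
    ("u2", "gold", 300.0),
    ("u3", "silver", 50.0),
]

model = fit(training)  # gold averages 200.0 with u2 included

# Consumer u2 exercises their right to erasure: drop their rows and refit.
opted_out = {"u2"}
retrained = fit([r for r in training if r[0] not in opted_out])

print(model["gold"], retrained["gold"])  # 200.0 vs 100.0
```

The point of the sketch is that the *logic* (here, `fit`) is the governed artifact: as long as you can rerun it against the filtered inputs, the opted-out consumer's influence is fully removed from the output.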
>> Talk about the psychology of the user base, because one of the things that shifted in the data world is a few stewards of data managed everything; now you've got a model where literally thousands of people in an organization could be users, productivity users. So you get a social component in here, that people know who's doing data work, which in a way creates a new persona or class of worker. A non-techy worker. >> Yeah. It's interesting, if you think about moving access to the data and moving the individuals that are creating algorithms out to a broader user group, what's important is you have to make sure that you're educating and training and sharing knowledge with that democratized audience, right? And to be able to do that you kind of want to work with human psychology, right? You want to be able to give people guidance in the course of their work rather than have them memorize a set of rules and try to remember to apply those. If you had a specialist group you could kind of control and force them to memorize and then apply; the more modern approach is to say "look, with some of these machine learning techniques that we have, why don't we make a recommendation." Otherwise what you're going to do is introduce bias into that calculation. >> And we're capturing that information as you use the data. >> Well, we're also making a recommendation to say "Hey, do you know you're doing this? Maybe you don't want to do that." Most people using the data are not bad actors. They just can't remember all the rule sets to apply. So what we're trying to do is catch someone behaviorally in the act before they make that mistake and say hey, just a bit of a reminder, a bit of a coaching moment, do you know what you're doing? Maybe you can think of another approach to this. And we've found that in many organizations that changes the discussion around data governance. It's no longer this top down constraint to finding insight, which frustrates an audience that is trying to use that data.
It's more like a coach helping you improve, and then the social aspect of wanting to contribute to the system comes into play, and people start communicating, collaborating on the platform, and curating information a little bit. >> I remember when Microsoft Excel came out, the spreadsheet, or Lotus 123, oh my God, people are going to do these amazing things with spreadsheets, and they did. You're taking a similar approach with analytics, much bigger surface area of work to kind of attack from a data perspective, but in a way kind of the same concept: put it in the hands of the users, have the data in their hands so to speak. >> Yeah, enable everyone to make data driven decisions. But make sure that they're interpreting that data in the right way, right? Give them enough guidance; don't let them just kind of attack the wild west and ferret it out. >> Well looking back at the Microsoft Excel spreadsheet example, I remember when a finance department would send a formatted spreadsheet with all the rules for how to use it out to 50 different groups around the world, and everyone figured out that you could go in and manipulate the macros and deliver any results they want. And so it's that same notion, you have to know something about that. But, that said, in many respects Stephanie you're describing a data governance model that really is more truly governance, that if we think about a data asset, it's how do we mediate a lot of different claims against that set of data so that it's used appropriately, so it's not corrupted, so that it doesn't affect other people, but very importantly so that the outcomes are easier to agree upon, because there's some trust and there's some valid behaviors and there's some verification in the flow of the data utilization. >> And where we give voice to a number of different constituencies. Because business opinions from different departments can run slightly counter to one another.
There can be friction in how to use particular data assets in the business depending on the lens that you have in that business, and so what we're trying to do is surface those different perspectives, give them voice, allow those constituencies to work that out in a platform that captures that debate, captures that knowledge, and makes that debate and knowledge a foundation to build upon. So in many ways it's kind of like the scientific method, right? As a scientist I publish a paper. >> Get peer reviewed. >> Get peer reviewed, let other people weigh in. >> And it becomes part of the canon of knowledge. >> And it becomes part of the canon. And in the scientific community over the last several years you see that folks are publishing their data sets out publicly, so why can't an enterprise do the same thing internally for different business groups? Take the same approach. Allow others to weigh in. It gets them better insights and it gets them more trust in that foundation. >> You get collective intelligence from the user base to help come in and make the data smarter and sharper. >> Yeah, and have reusable assets that you can then build upon to find the higher level insights. Don't run the same report that a hundred people in the organization have already run. >> So the final question for you. As you guys are emerging, starting to do really well, you have a unique approach; honestly we think it fits in kind of the new guard of analytics, a productivity worker with data, which we think is going to be a huge persona. Where are you guys winning, and why are you winning with your customer base? What are some things that are resonating as you go in and engage with prospects and customers and existing customers? What are they attracted to, what do they like, and why are you beating the competition in your sales and opportunities?
>> I think this concept of a more agile, grassroots approach to data governance is a breath of fresh air for anyone who's spent their career in the data space. We're at a turning point in the industry where you're now seeing chief decision scientists, chief data officers, chief analytics officers take a leadership role in organizations. Munich Reinsurance is using their data team to actually invest in and build new arms of their business. That's how they're pushing the envelope on leadership in the insurance space, and we're seeing that across our install base. Alation becomes this knowledge repository for all of those minds in the organization, and encourages a community to be built around data and insightful questions of data. And in that way the whole organization rises to the next level, and I think it's that vision of what can be created internally, how we can move away from just claiming that we're a big data organization and really starting to see the impact of how new business models can be created from these data assets, that's exciting to our customer base. >> Well congratulations. A hot startup. Alation here on theCUBE in New York City for CUBENYC. Changing the game on analytics, bringing a breath of fresh air to the hands of the users. A new persona developing. Congratulations, great to have you. Stephanie McReynolds. It's theCUBE. Stay with us for more live coverage, day one of two days live in New York City. We'll be right back.

Published Date : Sep 12 2018

SUMMARY :

Brought to you by SiliconANGLE Media the CMO, VP of Marketing for Alation, thanks for joining us. So you guys have a pretty spectacular so we brought a little fun to the show floor in the Alation booth is about the product You guys are one of the new guard companies is that making decisions and it's the trust gap and I really like the idea of the data scientists, production implementations get into the thousands of users. and asked them to certify data assets for others, kind of the decision scientists, gaps in governing that data or how you might want to so that the business could get optimized as you browse the web, right? in the past to determine revenue for that particular region, and one of the things that popped up is how do you deal and that was kind of like the classic it's about individuals that go on the web and find a dataset the algorithm itself is going to be biased. because one of the things that shifted in the data world And to be able to do that you kind of They just can't remember all the rule sets to apply. have the data in their hands so to speak. that data in the right way, right? and everyone figured out that you can go in in the business depending on the lens that you have And in the scientific community over the last several years You get collective intelligence from the user base Yeah and have reusable assets that you can then build upon and why are you winning with your customer base? and really starting to see the impact of how new business bringing a breath of fresh air to hands of the users.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Stephanie McReynolds | PERSON | 0.99+
Amazon | ORGANIZATION | 0.99+
Dave Vellante | PERSON | 0.99+
John | PERSON | 0.99+
Peter Burris | PERSON | 0.99+
Google | ORGANIZATION | 0.99+
Stephanie | PERSON | 0.99+
Thursday | DATE | 0.99+
New York | LOCATION | 0.99+
John Furrier | PERSON | 0.99+
50 different groups | QUANTITY | 0.99+
Peter | PERSON | 0.99+
New York City | LOCATION | 0.99+
Ebay | ORGANIZATION | 0.99+
2,000 users | QUANTITY | 0.99+
Excel | TITLE | 0.99+
Attack of the Bots | TITLE | 0.99+
thousands | QUANTITY | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
two days | QUANTITY | 0.99+
yesterday | DATE | 0.99+
ninth year | QUANTITY | 0.99+
two | QUANTITY | 0.99+
STRATA | ORGANIZATION | 0.99+
Today | DATE | 0.99+
Fiserv | ORGANIZATION | 0.99+
last night | DATE | 0.99+
three years ago | DATE | 0.99+
Alation | PERSON | 0.99+
NYC | LOCATION | 0.98+
Lotus 123 | TITLE | 0.98+
Munich Reinsurance | ORGANIZATION | 0.98+
one | QUANTITY | 0.98+
GDPR | TITLE | 0.97+
Alation | ORGANIZATION | 0.96+
Microsoft | ORGANIZATION | 0.94+
SAS | ORGANIZATION | 0.94+
over a thousand weekly logins | QUANTITY | 0.91+
theCUBE | ORGANIZATION | 0.9+
Strata Conference | EVENT | 0.89+
single source | QUANTITY | 0.86+
thousands of people | QUANTITY | 0.86+
thousands of users | QUANTITY | 0.84+
Tablo | ORGANIZATION | 0.83+
day one | QUANTITY | 0.78+
2018 | EVENT | 0.75+
CUBE | ORGANIZATION | 0.75+
Salesworth | ORGANIZATION | 0.74+
Einstein Analytics | ORGANIZATION | 0.73+
Tablo | TITLE | 0.73+
Strata Hadoop | EVENT | 0.73+
a hundred people | QUANTITY | 0.7+
2018 | DATE | 0.66+
point | QUANTITY | 0.63+
years | DATE | 0.63+
Alation | LOCATION | 0.62+
Click | ORGANIZATION | 0.62+
Munich Reinsurance | TITLE | 0.6+
over a hundred | QUANTITY | 0.59+
Data | ORGANIZATION | 0.58+
Strata Data | EVENT | 0.57+
last | DATE | 0.55+
Haiku | TITLE | 0.47+

Kickoff | theCUBE NYC 2018


 

>> Live from New York, it's theCUBE covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. (techy music) >> Hello, everyone, welcome to this CUBE special presentation here in New York City for CUBENYC. I'm John Furrier with Dave Vellante. This is our ninth year covering the big data industry, starting with Hadoop World and evolving over the years. This is our ninth year, Dave. We've been covering Hadoop World, Hadoop Summit, Strata Conference, Strata Hadoop. Now it's called Strata Data, I don't know what Strata O'Reilly's going to call it next. As you all know, theCUBE has been present at the creation of the Hadoop big data ecosystem. We're here for our ninth year, certainly a lot's changed. AI's the center of the conversation, and certainly we've seen some horses come in, some haven't come in, and trends have emerged, some gone away, your thoughts. Nine years covering big data. >> Well, John, I remember fondly, vividly, the call that I got. I was in Dallas at a storage networking world show and you called and said, "Hey, we're doing "Hadoop World, get over there," and of course, Hadoop, big data, was the new, hot thing. I told everybody, "I'm leaving." Most of the people said, "What's Hadoop?" Right, so we came, we started covering, it was people like Jeff Hammerbacher, Amr Awadallah, Doug Cutting, who invented Hadoop, Mike Olson, you know, head of Cloudera at the time, and people like Abi Mehda, who at the time was at B of A, and some of the things we learned then that were profound-- >> Yeah. >> As much as Hadoop is sort of on the back burner now and people really aren't talking about it, some of the things that are profound about Hadoop, really, were the idea, the notion of bringing five megabytes of code to a petabyte of data, for example, or the notion of no schema on write. You know, put it into the database and then figure it out. >> Unstructured data. >> Right. >> Object storage.
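The "no schema on write" idea Dave recalls — land the raw data first, impose structure only when you read it — is easy to sketch. Below is a minimal, hypothetical Python illustration of schema-on-read versus the classic warehouse-style schema-on-write; the field names and the in-memory "store" are invented for the example, not any particular Hadoop API.

```python
import json

# Schema-on-write: validate and shape records BEFORE storing (classic EDW).
# Schema-on-read: store raw records as-is, apply structure at query time.

raw_store = []  # stands in for a file in HDFS / object storage

def ingest(raw_line):
    """Schema-on-read ingest: keep the raw record untouched."""
    raw_store.append(raw_line)

def read_with_schema(schema):
    """Apply a schema at read time; fields absent from a record become None."""
    rows = []
    for line in raw_store:
        rec = json.loads(line)
        rows.append({field: rec.get(field) for field in schema})
    return rows

ingest('{"user": "alice", "clicks": 3, "referrer": "email"}')
ingest('{"user": "bob", "clicks": 7}')  # no referrer -- ingest still succeeds

rows = read_with_schema(["user", "clicks"])
# Each reader can impose a different schema on the same raw data.
```

The point of the sketch: ingestion never rejects a record for not matching a schema, and two different readers can project two different schemas over the same raw store — which is exactly why the "put it into the database and then figure it out" model appealed to early Hadoop users.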
>> And so, that created a spate of innovation, of funding. We were talking last night about how, you know, many, many years ago at this event, this time of the year, concurrent with Strata, you would have VCs all over the place. There really aren't a lot of VCs here this year, not a lot of VC parties-- >> Mm-hm. >> As there used to be, so that's somewhat waned, but some of the things that we talked about back then, we said that big money in big data is going to be made by the practitioners, not by the vendors, and that's proved true. I mean... >> Yeah. >> The big three Hadoop distro vendors, Cloudera, Hortonworks, and MapR, you know, Cloudera's $2.5 billion valuation, you know, not bad, but it's not a $30, $40 billion value company. The other thing we said is there will be no Red Hat of big data. You said, "Well, the only Red Hat of big data might be "Red Hat," and so, (chuckles) that's basically proved true. >> Yeah. >> And so, I think if we look back we always talked about Hadoop and big data being a reduction, the ROI was a reduction on investment. >> Yeah. >> It was a way to have a cheaper data warehouse, and that's essentially-- Well, what did we get right and wrong? I mean, let's look at some of the trends. I mean, first of all, I think we got pretty much everything right, as you know. We tend to make the calls pretty accurately with theCUBE. Got a lot of data, we look, we have the analytics in our own system, plus we have the research team digging in, so you know, we pretty much do a good job. I think one thing that we predicted was that Hadoop certainly would change the game, and it did. We also predicted that there wouldn't be a Red Hat for Hadoop, that was a prediction. The other prediction was that we said Hadoop won't kill data warehouses, it didn't, and then data lakes came along. You know my position on data lakes. >> Yeah. >> I've always hated the term.
I always liked data ocean because I think it conveyed much more the fluidity of the data, so I think we got that one right, and data lakes still don't look like they're panning out well. I mean, most people that deploy data lakes, it's really either not a core thing or it's part of something else and it's turning into a data swamp, so I think the data lake piece is not panning out the way people thought it would. I think one thing we did get right, also, is that data would be the center of the value proposition, and it continues and remains to be, and I think we're seeing that now, and we said data's the development kit back in 2010 when we said data's going to be part of programming. >> Some of the other things: our early data, when we went out and talked to a lot of practitioners, who were hard to find in the early days. They were just a select few, I mean, other than inside of Google and Yahoo! But what they told us is that things like SQL and the enterprise data warehouse were key components of their big data strategy, so to your point, you know, it wasn't going to kill the EDW, but it was going to surround it. The other thing we called was cloud. Four years ago our data showed clearly that much of this work, the modeling, the big data wrangling, et cetera, was being done in the cloud, and Cloudera, Hortonworks, and MapR, none of them at the time really had a cloud strategy. Today all they're talking about is cloud and hybrid cloud. >> Well, it's interesting, I think it was like four years ago, I think, Dave, when we actually were riffing on the notion of, you know, Cloudera's name. It's called Cloudera, you know. If you spell it out, in Cloudera we're in a cloud era, and I think we were very aggressive at that point. I think Amr Awadallah even made a comment on Twitter. He was like, "I don't understand "where you guys are coming from."
We were actually saying at the time that Cloudera should actually leverage more cloud at that time, and they didn't. They stayed on their IPO track and they had to, because they had everything bet on Impala and this data model that they had being the business model, and then they went public, but I think clearly cloud is now part of Cloudera's story, and I think that's a good call, and it's not too late for them. It never was too late, but you know, Cloudera has executed. I mean, if you look at what's happened with Cloudera, they were the only game in town. When we started theCUBE we were in their office, as most people know in this industry, that we were there with Cloudera when they had like 17 employees. I thought Cloudera was going to run the table, but then what happened was Hortonworks came out of Yahoo! That, I think, changed the game, and that competitive battle between Hortonworks and Cloudera, in my opinion, changed the industry, because if Hortonworks did not come out of Yahoo! Cloudera would've had an uncontested run. I think the landscape of the ecosystem would look completely different had Hortonworks not competed, because you think about, Dave, they had that competitive battle for years. The Hortonworks-Cloudera battle, and I think it changed the industry. I think it could've been a different outcome. If Hortonworks wasn't there, I think Cloudera probably would've taken Hadoop and made it so much more, and I think they would've gotten more done. >> Yeah, and I think the other point we have to make here is complexity really hurt the Hadoop ecosystem, and it was just bespoke, new projects coming out all the time, and you had Cloudera, Hortonworks, and maybe to a lesser extent MapR, doing a lot of the heavy lifting, particularly, you know, Hortonworks and Cloudera.
They had to invest a lot of their R&D in making these systems work and integrating them, and you know, complexity just really broke the back of the Hadoop ecosystem, and so then Spark came in, everybody said, "Oh, Spark's going to basically replace Hadoop." You know, yes and no, the people who got Hadoop right, you know, embraced it and they still use it. Spark definitely simplified things, but now the conversation has turned to AI, John. So, I got to ask you, I'm going to use your line on you in kind of the ask-me-anything segment here. AI, is it same wine, new bottle, or is it really substantively different in your opinion? >> I think it's substantively different. I don't think it's the same wine in a new bottle. I'll tell you... Well, it's kind of, it's like the bad wine... (laughs) Is going to be kind of blended in with the good wine, which is now AI. If you look at this industry, the big data industry, if you look at what O'Reilly did with this conference. I think O'Reilly really has not done a good job with the conference of big data. I think they blew it, I think that they made it a, you know, monetization, closed system when the big data business could've been all about AI in a much deeper way. I think AI is subordinate to cloud, and you mentioned cloud earlier. If you look at all the action within the AI segment, Diane Greene talking about it at Google Next, Amazon, AI is a software layer substrate that will be underpinned by the cloud. 
Cloud will drive more action, you need more compute, that drives more data, more data drives the machine learning, machine learning drives the AI, so I think AI is always going to be dependent upon cloud ends or some sort of high compute resource base, and all the cloud analytics are feeding into these AI models, so I think cloud takes over AI, no doubt, and I think this whole ecosystem of big data gets subsumed under either an AWS, VMworld, Google, and Microsoft Cloud show, and then also I think specialization around data science is going to go off on its own. So, I think you're going to see the breakup of the big data industry as we know it today. Strata Hadoop, Strata Data Conference, that thing's going to crumble into multiple, fractured ecosystems. >> It's already starting to be forked. I think the other thing I want to say about Hadoop is that it actually brought such great awareness to the notion of data, putting data at the core of your company, data and data value, the ability to understand how data at least contributes to the monetization of your company. AI would not be possible without the data. Right, and we've talked about this before. You call it the innovation sandwich. The innovation sandwich, last decade, last three decades, has been Moore's law. The innovation sandwich going forward is data, machine intelligence applied to that data, and cloud for scale, and that's the sandwich of innovation over the next 10 to 20 years. >> Yeah, and I think data is everywhere, so this idea of being a categorical industry segment is a little bit off, I mean, although I know data warehouse is kind of its own category and you're seeing that, but I don't think it's like a Magic Quadrant anymore. Every quadrant has data. >> Mm-hm. >> So, I think data's fundamental, and I think that's why it's going to become a layer within a control plane of either cloud or some other system, I think. I think that's pretty clear, there's no, like, one. 
You can't buy big data, you can't buy AI. I think you can have AI, you know, things like TensorFlow, but it's going to be a completely... Every layer of the stack is going to be impacted by AI and data. >> And I think the big players are going to infuse their applications and their databases with machine intelligence. You're going to see this, you're certainly, you know, seeing it with IBM, the sort of Watson heavy lift. Clearly Google, Amazon, you know, Facebook, Alibaba, and Microsoft, they're infusing AI throughout their entire set of cloud services and applications and infrastructure, and I think that's good news for the practitioners. People aren't... Most companies aren't going to build their own AI, they're going to buy AI, and that's how they close the gap between the sort of data haves and the data have-nots, and again, I want to emphasize that the fundamental difference, to me anyway, is having data at the core. If you look at the top five companies in terms of market value, US companies, Facebook maybe not so much anymore because of the fake news, though Facebook will be back with its two billion users, but Apple, Google, Facebook, Amazon, who am I... And Microsoft, those five have put data at the core and they're the most valuable companies in the stock market from a market cap standpoint, why? Because it's a recognition that that intangible value of the data is actually quite valuable, and even though banks and financial institutions are data companies, their data lives in silos. So, these five have put data at the center, surrounded it with human expertise, as opposed to having humans at the center and having data all over the place. So, how do they, how do these companies close the gap? How do the companies in the flyover states close the gap?
The way they close the gap, in my view, is they buy technologies that have AI infused in it, and I think the last thing I'll say is I see cloud as the substrate, and AI, and blockchain and other services, as the automation layer on top of it. I think that's going to be the big tailwind for innovation over the next decade. >> Yeah, and obviously the theme of machine learning drives a lot of the conversations here, and that's essentially never going to go away. Machine learning is the core of AI, and I would argue that AI truly doesn't even exist yet. It's machine learning really driving the value, but to put a validation on the fact that cloud is going to be driving AI business is some of the terms in popular conversations we're hearing here in New York around this event and topic, CUBENYC and Strata Conference, is you're hearing Kubernetes and blockchain, and you know, these automation, AI operation kind of conversations. That's an IT conversation, (chuckles) so you know, that's interesting. You've got IT, really, with storage. You've got to store the data, so you can't not talk about workloads and how the data moves with workloads, so you're starting to see data and workloads kind of be tossed in the same conversation, that's a cloud conversation. That is all about multi-cloud. That's why you're seeing Kubernetes, a term I never thought I would be saying at a big data show, but Kubernetes is going to be key for moving workloads around, of which there's data involved. (chuckles) Instrumenting the workloads, data inside the workloads, data driving data. This is where AI and machine learning's going to play, so again, cloud subsumes AI, that's the story, and I think that's going to be the big trend. >> Well, and I think you're right, now. 
I mean, that's why you're hearing the messaging of hybrid cloud from the big distro vendors, and the other thing is you're hearing from a lot of the NoSQL database guys, they're bringing ACID compliance, they're bringing enterprise-grade capability, so you're seeing the world is hybrid. You're seeing those two worlds come together, so... >> Those worlds are converging, and the playing field is getting leveled out there. It's all about enterprise, B2B, AI, cloud, and data. That's theCUBE bringing you the data here. New York City, CUBENYC, that's the hashtag. Stay with us for more coverage live in New York after this short break. (techy music)

Published Date : Sep 12 2018

SUMMARY :

Brought to you by SiliconANGLE Media for the creation at the Hadoop big data ecosystem. and some of the things we learned then some of the things that are profound about Hadoop, We were talking last night about, you know, but some of the things that we talked about back then, You said, "Well, the only Red Hat of big data might be being a reduction, the ROI was a reduction I mean, first of all, I think we got and I think we're seeing that now, and the enterprise data warehouse were key components and I think we were very aggressive at that point. Yeah, and I think the other point and all the cloud analytics are and cloud for scale, and that's the sandwich Yeah, and I think data is everywhere, and I think that's why it's going to become I think that's going to be the big tailwind and I think that's going to be the big trend. and the other thing is you're hearing New York City, CUBENYC, that's the hashtag.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Apple | ORGANIZATION | 0.99+
Microsoft | ORGANIZATION | 0.99+
Amazon | ORGANIZATION | 0.99+
Diane Greene | PERSON | 0.99+
Google | ORGANIZATION | 0.99+
Facebook | ORGANIZATION | 0.99+
John | PERSON | 0.99+
Alibaba | ORGANIZATION | 0.99+
Dave | PERSON | 0.99+
Dave Vellante | PERSON | 0.99+
Jeff Hammerbacher | PERSON | 0.99+
$30 | QUANTITY | 0.99+
New York | LOCATION | 0.99+
2010 | DATE | 0.99+
IBM | ORGANIZATION | 0.99+
Doug Cutting | PERSON | 0.99+
Mike Olson | PERSON | 0.99+
Hortonworks | ORGANIZATION | 0.99+
Dallas | LOCATION | 0.99+
O'Reilly | ORGANIZATION | 0.99+
Yahoo | ORGANIZATION | 0.99+
Cloudera | ORGANIZATION | 0.99+
five | QUANTITY | 0.99+
AWS | ORGANIZATION | 0.99+
Abi Mehda | PERSON | 0.99+
John Furrier | PERSON | 0.99+
New York City | LOCATION | 0.99+
$2.5 billion | QUANTITY | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
MapR | ORGANIZATION | 0.99+
Amr Awadallah | PERSON | 0.99+
$40 billion | QUANTITY | 0.99+
17 employees | QUANTITY | 0.99+
VMworld | ORGANIZATION | 0.99+
Today | DATE | 0.99+
Impala | ORGANIZATION | 0.99+
Nine years | QUANTITY | 0.99+
four years ago | DATE | 0.98+
last night | DATE | 0.98+
last decade | DATE | 0.98+
Strata Data Conference | EVENT | 0.98+
Strata Conference | EVENT | 0.98+
Hadoop Summit | EVENT | 0.98+
ninth year | QUANTITY | 0.98+
Four years ago | DATE | 0.98+
two worlds | QUANTITY | 0.97+
five companies | QUANTITY | 0.97+
today | DATE | 0.97+
Strata Hadoop | EVENT | 0.97+
Hadoop World | EVENT | 0.96+
CUBE | ORGANIZATION | 0.96+
Google Next | ORGANIZATION | 0.95+
Twitter | ORGANIZATION | 0.95+
this year | DATE | 0.95+
Spark | ORGANIZATION | 0.95+
US | LOCATION | 0.94+
CUBENYC | EVENT | 0.94+
Strata O'Reilly | ORGANIZATION | 0.93+
next decade | DATE | 0.93+

Bala Chandrasekaran, Dell EMC | Dell EMC: Get Ready For AI


 

(techno music) >> Hey welcome back everybody, Jeff Frick here with theCUBE. We're in Austin, Texas at the Dell EMC HPC and AI Innovation Lab. As you can see behind me, there's racks and racks and racks of gear, where they build all types of system configurations around specific applications, whether it's Oracle or S.A.P. And more recently a lot more around artificial intelligence, whether it's machine learning, deep learning, so it's a really cool place to be. We're excited to be here. And our next guest is Bala Chandrasekaran. He's a member of the technical staff, a systems engineer. Bala, welcome! >> Thank you. >> So how do you like playing with all these toys all day long? >> Oh I love it! >> I mean you guys have literally everything in there. A lot more than just Dell EMC gear, but you've got switches and networking gear-- >> Right. >> Everything. >> And not just the gear, it's also all the software components, it's the deep learning libraries, deep learning models, so a whole bunch of things that we can get to play around with. >> Now that's interesting 'cause it's harder to see the software, right? >> Exactly right. >> The software's pumping through all these machines but you guys do all types of really, optimization and configuration, correct? >> Yes, we try to make it easy for the end customer. And the project that I'm working on, machine learning for Hadoop, we try to make things easy for the data scientists. >> Right, so we got all the Hadoop shows, Hadoop World, Hadoop Summit, Strata, Big Data NYC, Silicon Valley, and the knock on Hadoop is always it's too hard, there aren't enough engineers, I can't get enough people to do it myself. It's a cool open source project, but it's not that easy to do. You guys are really helping people solve that problem. >> Yes, and what you're saying is true for the infrastructure guys. Now imagine a data scientist, right? So a Hadoop cluster, accessing it, securing it, is going to be really tough for them.
And they shouldn't be worried about it, right? They should be focused on data science. So those are some of the things that we try to do for them. >> So what are some of the tips and tricks as you build these systems that throw people off all the time, that are relatively simple things to fix? And then what are some of the hard stuff where you guys have really applied your expertise to get over those challenges? >> Let me give you a small example. So this is a new A.I. project and we hired data scientists. So I walked a data scientist through the lab. He looked at the cluster and he pulled me aside and said, hey, you're not going to ask me to work on these things, right? I have no idea how to do these things. So that kind of gives you a sense of what a data scientist should focus on and what they shouldn't focus on. So some of the things that we do, and some of the things that are probably difficult for them, is all the libraries that are needed to run their project, the conflicts between libraries, the dependencies between them. So one of the things that we do is deliver this pre-configured engine that you can readily download into our product and run. So data scientists don't have to worry about which library they should use. >> Right. >> They have to worry about the models and accuracy and whatever data science needs to be done, rather than focusing on the infrastructure. >> So you not only package the hardware and the systems, but you've packaged the software distribution and all the kind of surrounding components of that as well. >> Exactly right. Right. >> So when you have the data scientists here talking about the Hadoop cluster, if they didn't want to talk about the hardware and the software, what were you helping them with? How did you engage with the customers here at the lab? >> So the example that I gave is for the data scientists that we newly hired for our team, so we had to set up environments for them.
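Bala's point about library conflicts and dependencies is the whole reason pre-validated environments exist: declare the versions a workload was tested against, then check the running environment before the job starts. A rough sketch of that idea in Python — the package names, versions, and the manifest itself are hypothetical illustrations, not Dell EMC's or Cloudera's actual packaging:

```python
# Hypothetical manifest of library versions a workload was validated against.
PINNED = {"numpy": "1.24.4", "pandas": "2.0.3"}

def check_env(installed):
    """Compare an installed-package mapping against the pinned manifest,
    returning a human-readable list of mismatches (empty means OK)."""
    problems = []
    for pkg, want in PINNED.items():
        have = installed.get(pkg)
        if have is None:
            problems.append(f"{pkg}: missing (want {want})")
        elif have != want:
            problems.append(f"{pkg}: have {have}, want {want}")
    return problems

# Simulated environment with one version mismatch:
issues = check_env({"numpy": "1.24.4", "pandas": "1.5.3"})
```

A pre-configured engine effectively does this validation for you at build time, so the data scientist never sees the mismatch at all — which is the "don't make them worry about libraries" point Bala is making.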
So that was the example, but the same thing applies for a customer as well. So again, to help them in solving the problem, we try to package some of the things as part of our product and deliver it to them, so it's easy for them to deploy and get started on things. >> Now the other piece that's included, and again is not in this room, is the services-- >> Right. >> And the support, so you guys have a full team of professional services. Once you configure and figure out what the optimum solution is for them, then you've got a team that can actually go deploy it at their actual site. >> So we have packaged things even for our services. So the services would go to the customer site. They would apply the solution and download and deploy our packages and be able to demonstrate how easy it is. Think of them as tutorials if you like. So here are the tutorials, here's how you run various models, so here's how easy it is for you to get started. So that's what they would train the customer on. So there's not just the deployment piece of it, but just packaging things for them so they can show customers how to get started quickly, how everything works, and kind of give a green check mark if you will. >> So what are some of your favorite applications that people are using these things for? Do you get involved in the application stack on the customer side? What are some of the fun use cases that people use your technology to solve? >> So for the application, my project is about machine learning on Hadoop, via packaging Cloudera's CDSW, that's Cloudera Data Science Workbench, as part of the product. So that allows data scientists access to the Hadoop cluster while abstracting the complexities of the cluster. So they can access the cluster, they can access the data, they can have security without worrying about all the intricacies of the cluster. In addition to that they can create different projects, have different libraries in different projects.
So they don't have to conflict with each other, and also they can add users to it. They can work collaboratively. So basically it's to help data scientists and software developers do their job and not worry about the infrastructure. >> Right. >> They should not be. >> Right, great. Well, Bala, it's a pretty exciting place to work. I'm sure you're having a ball. >> Yes I am, thank you. >> All right. Well, thanks for taking a few minutes with us, and really enjoyed the conversation. >> I appreciate it, thank you. >> All right, he's Bala, I'm Jeff. You're watching theCUBE from Austin, Texas at the Dell EMC High Performance Computing and Artificial Intelligence Labs. Thanks for watching. (techno music)

Published Date : Aug 7 2018

SUMMARY :

He is in the technical staff as a systems engineer. I mean you guys have literally everything in there. And not just the gear, And the project that I'm working on, but it's not that easy to do. So those are some of the things that we try to do for them. So some of the things that we do, They have to worry about the models and accuracy and all the kind of surrounding components of that as well. Right. So the example that I gave is for the data scientist And the support so you guys So the services would go to the customer side. So for the application my project is about Well Bala it's pretty exciting place to work. All right. at the Dell EMC High Performance Computing

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Jeff Frick | PERSON | 0.99+
Bala Chandrasekaran | PERSON | 0.99+
Oracle | ORGANIZATION | 0.99+
Jeff | PERSON | 0.99+
Bala | PERSON | 0.99+
Austin, Texas | LOCATION | 0.99+
AI Innovation Lab | ORGANIZATION | 0.99+
one | QUANTITY | 0.98+
Dell EMC High Performance Computing | ORGANIZATION | 0.98+
Dell EMC | ORGANIZATION | 0.98+
Cloudera | ORGANIZATION | 0.97+
Dell EMC HPC | ORGANIZATION | 0.96+
Hadoop | TITLE | 0.95+
S.A.P. | ORGANIZATION | 0.94+
Artificial Intelligence Labs | ORGANIZATION | 0.87+
NYC | LOCATION | 0.85+
theCUBE | ORGANIZATION | 0.83+
Silicone Valley | LOCATION | 0.79+
Hadoop Summit | EVENT | 0.78+
Big Data | EVENT | 0.72+
Strata | EVENT | 0.58+
Hadoop World | EVENT | 0.44+
Hadoop | ORGANIZATION | 0.41+

Thierry Pellegrino, Dell EMC | Dell EMC: Get Ready For AI


 

>> Welcome back, everybody. Jeff Frick here with theCUBE. We're in Austin, Texas at the Dell EMC High Performance Computing and Artificial Intelligence Labs. The lab's been here for a long time, and as you can see behind us there are racks and racks of some of the biggest, baddest computers on the planet. In fact, I think number 256 on the list, we were told earlier, is just behind us. We're excited to be here as Dell EMC puts together pre-configured solutions for artificial intelligence, machine learning and deep learning applications, because that's of growing importance to all the business people out there. So we're excited to have the guy running the show: Thierry Pellegrino, the VP of HPC and Business Strategy. You're a pretty busy guy. >> I'm busy, but you can see all those servers, they're very busy too. They're humming. >> So, just your perspective: the HPC part of this has been around for a while, but the rise of machine learning and artificial intelligence as a business priority is relatively recent, and you guys are jumping in with both feet. >> Oh, absolutely. HPC is not new to us. AI, machine learning, deep learning is happening, that's the buzzword, but we've been working on HPC clusters since back in the '90s, and it's great to see this technology, or this best practice, getting into the enterprise space where data scientists need help. Instead of looking for one processor that will solve it all, they look for the knowledge of HPC and what we've been able to put together, and apply it to their field. >> Right. So how do you delineate between HPC and, say, the AI portion of the lab? Or is it all on a continuum? How do you slice and dice it? >> It's all in one place, and you see it all behind us. In this area in front of us, we try to get all those servers put together and add value for all the different workloads. So you get HPC, AI, ML and DL all in one lab. >> Right, and they're all here. >> They're all here. >> Everything from what would only be called legacy applications all the way to the meanest and the newest and greatest. >> Exactly, the old stuff and the new stuff. And actually, something you don't see is that we're also looking at where technology is going to take all those workloads. AI, ML, DL is the buzzword today, but down the road you're going to see more applications, and we're already starting to test those technologies in this lab. So it's past, present and future.

>> Right. So one of the specific solutions you guys have put together is for deep learning using the new NVIDIA technology. Can you talk about that? We hear about NVIDIA all the time; obviously they're really well positioned in autonomous vehicles, and their GPUs are taking data centers by storm. How's that going? Where do you see some of the applications outside of autonomous vehicles for the NVIDIA-based solution? >> Oh, there are many applications. I think the technology itself is proving to solve a lot of customer problems, and you can apply it in many different verticals and many workloads. You can see it in autonomous vehicles; you can see it in healthcare, life sciences, and in financial services with risk management. It's really everywhere you need to solve a problem and you need dense compute solutions, and NVIDIA has one of the technologies that a lot of our customers leverage to solve their problems. >> Right. And you're also launching a machine learning solution based on Hadoop. We've been going to Hadoop Summit, Hadoop World and Strata for eight, nine years, I guess since 2010. >> Eight years. >> And it's kind of funny, because the knock on Hadoop has always been that there aren't enough people and it's too hard; it's just a really difficult technology. So you guys are again taking a solutions approach with Hadoop for machine learning, to basically deliver either a whole rack full of stuff, or a spec that you can build at your own place. >> Absolutely. That's one of the three major tenets that we have for the solutions we're launching. We really want it to be a solution that's faster. Performance is key: when you're trying to extract insights from your data set, you really need to be fast. You don't want it to take months; it has to be within accountable measures. So that's one of them. We also want to make it simple. A data scientist is never going to be a PhD in HPC or any kind of computer technology, so making it simple is critical. And the last one is that we want to have this proven, trusted-advisor feel for our customers. You see it around you: this HPC lab was not built yesterday. It's been here showcasing our capabilities in the HPC world, and our ability to combine the Hadoop environment with other environments to solve enterprise-class problems and bring business value to our customers. That's really where we think our differentiation comes from.

>> Right. And it's really a lab. I mean, you and I are both wearing sport coats right now, but there's gear stacked here of every shape and size. And I think what's interesting is that while we talk about the sexy stuff, the GPUs and the CPUs and Hadoop, there are a lot of details that make one of these racks actually work, and it's probably integrating some of those lower-tier things and making sure they all work seamlessly together, so you don't get some nasty bottleneck on an inexpensive part that's holding back all that capacity. >> Oh, absolutely. It's funny you mention that. We're talking to customers about the technologies we're assembling, and contrary to some web-tech-type companies that just look for any compute at all costs and stack up a lot of technologies because they want the compute, in HPC-type environments, or when you try to solve problems with deep learning and machine learning, you're only as strong as your weakest link. If you have a server, a storage unit, or an interconnect between all of those that is really weak, you're going to see your performance go way down, and we watch out for that. And the one thing you alluded to which I just wanted to point out: what you see behind us is the hardware, but the secret sauce is really in the aggregation of all the components and all the software stacks. AI, ML, DL are great, easy acronyms, but when you start peeling the layers you realize it's layers and layers of software which are moving very fast, where you don't want to be spending your life understanding the interop requirements between those layers and worrying about whether your compute and your storage solution are going to work. You want to solve problems as a scientist, and that's what we're trying to do: give you a solution which is an infrastructure plus a stack that's been validated and proven, so you can really get to work. >> Right. And even within that validated design for a particular workload, customers have an opportunity to adjust: maybe they need a little bit more I/O at a relative scale, a little bit more storage, a little bit more compute. So even within a basic structured system that you guys have spec'd and certified, customers can still come in and make little mods based on their specific workload. >> You've got it. We're not at the phase in the acceptance of AI, ML and DL where things are cookie-cutter. It's still going to be a collaboration. That's why we have a really strong team working with our customers directly, trying to build a solution for their problem. If you need a little bit more storage, if you need faster storage for your scratch, if you need a little bit more I/O bandwidth because you're in a remote environment, all those characteristics are going to be critical, and the solutions we're launching are not rigid. They're a perfect starting point for customers who want something that runs directly out of the box, but if you have a problem that's more pointed, we can definitely iterate. And that's what our team in the field, and all the engineers you have seen today walking through the lab, are here for. We want to be a consultant and a partner, designing the right solution for the customer.

>> Right. So, Thierry, before I let you go, just one question from your perspective on customers. You're out talking to customers; how has the conversation around artificial intelligence and machine learning evolved over the last several years, from kind of a cool science experiment, or all the HPC stuff with the government and weather and heavy lifting, to actually being a boardroom conversation as a priority and a strategic imperative going forward? >> Well, it has changed, you're right. Back in the '60s the science was there, but the technology wasn't. Today we have the science and we have the technology, and we're seeing all the C-class decision makers really wanting to find value in the data they've collected, and that's where the discussion takes place. This is not a CIO discussion most of the time. And what's really fantastic, contrary to a lot of the technologies that have grown up, like big data, cloud and all those buzzwords, is that here we're looking at something tangible. We have real-life examples of companies that are using deep learning and machine learning to solve problems, save lives, and get our technology into the hands of the right folks so they can impact the community. It's really fantastic, and that growth is set for success, and we want to be part of it. >> Right. It's just the continuation of this democratization trend: get more people more data, give more people more tools, get more people more power, and you're going to get innovation and solve more problems. >> It's so exciting. >> Absolutely, totally agree with you. All right, Thierry, thanks for taking a few minutes out of your busy day, and congrats on the Innovation Lab here. >> Thank you so much. >> All right. He's Thierry, I'm Jeff Frick. We're at the Dell EMC HPC and AI Innovation Labs in Austin, Texas. Thanks for watching.
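Thierry's "only as strong as your weakest link" point can be sketched with a toy throughput model. Everything below is invented for illustration: the stage names and GB/s figures are hypothetical placeholders, not measurements from any Dell EMC system.

```python
# Toy model: sustained pipeline throughput is gated by the slowest stage.
# Stage names and GB/s figures are hypothetical, for illustration only.
stages = {
    "storage_read": 12.0,   # GB/s the filesystem can deliver
    "interconnect": 25.0,   # GB/s across the fabric
    "gpu_ingest":   8.0,    # GB/s the accelerators can consume
}

bottleneck = min(stages, key=stages.get)   # the weakest link
effective = stages[bottleneck]             # caps the whole rack
print(bottleneck, effective)
```

In this toy model, doubling the interconnect changes nothing; only raising the slowest stage moves the effective number, which is the argument for balanced, validated designs rather than stacking up raw compute.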

Published Date: Aug 7, 2018

**Summary and sentiment analysis are not shown because of an improper transcript.**

ENTITIES

Entity | Category | Confidence
Terry Pellegrino | PERSON | 0.99+
Jeff Rick | PERSON | 0.99+
Dell | ORGANIZATION | 0.99+
EMC | ORGANIZATION | 0.99+
NVIDIA | ORGANIZATION | 0.99+
Terry | PERSON | 0.99+
Austin Texas | LOCATION | 0.99+
Austin Texas | LOCATION | 0.99+
2010 | DATE | 0.99+
today | DATE | 0.99+
Nvidia | ORGANIZATION | 0.98+
yesterday | DATE | 0.98+
both feet | QUANTITY | 0.98+
one question | QUANTITY | 0.98+
Hadoop | TITLE | 0.97+
one | QUANTITY | 0.97+
Thierry Pellegrino | PERSON | 0.96+
eight years | QUANTITY | 0.96+
three major tenants | QUANTITY | 0.95+
Innovation Lab | ORGANIZATION | 0.94+
both | QUANTITY | 0.93+
Dell EMC | ORGANIZATION | 0.93+
eight nine years | QUANTITY | 0.92+
deli MC | ORGANIZATION | 0.92+
Dell EMC HPC | ORGANIZATION | 0.9+
one thing | QUANTITY | 0.88+
one processor | QUANTITY | 0.87+
HPC | ORGANIZATION | 0.87+
one place | QUANTITY | 0.84+
60s | DATE | 0.71+
SPECT | ORGANIZATION | 0.62+
years | DATE | 0.58+
few minutes | QUANTITY | 0.55+
90s | DATE | 0.52+
256 | OTHER | 0.51+
last | DATE | 0.49+
strata | LOCATION | 0.44+

Steve Herrod, General Catalyst & Devesh Garg, Arrcus | CUBEConversation, July 2018


 

>> Welcome to this special CUBE Conversation here in the Palo Alto CUBE studios. I'm John Furrier, the founder of SiliconANGLE and theCUBE. We're here with Devesh Garg, the founder and CEO of Arrcus Inc. (that's arrcus.com, A-R-R-C-U-S), and Steve Herrod, General Partner at General Catalyst, who funded him. Congratulations on your launch. These guys launched on Monday: a hot new product, a software OS for networking, powering white boxes and a whole new generation of, potentially, cloud computing. Welcome to this CUBE Conversation, and congratulations on your launch. >> Thank you, John. >> So today let's talk about this startup. When were you guys founded? Let's get to the specifics: the date you were founded, some of the people on the team, and the funding. >> We were formally incorporated in February of 2016. We really got going in earnest in August of 2016, and have chosen to stay in stealth. The founding team consists of myself; a gentleman by the name of Keyur Patel, who is our CTO; and a gentleman by the name of Derek Young, our chief architect. Our backgrounds are a combination of the semiconductor industry, where I spent a lot of time, most recently as president of EZchip, which we sold to Mellanox, and networking: Keyur and Derek are networking protocol experts who spent 20-plus years at places like Cisco and are arguably some of the best protocol guys in the world. So the three of us got together and basically saw an opportunity to bring some of the insights and architectural innovation we had in mind to the marketplace. >> Some pedigree in there, some top talent. Absolutely some notable things that they've done in the past. >> Yeah, if you'd like some high-level numbers: we have 600-plus years of experience of deep networking expertise within the company. Our collective team has shipped over 400 products to production, we have over 200 IETF RFCs that have been filed by the team, as well as 150-plus patents. >> So you really can deliver on the pedigree, for sure. >> Yeah. We absolutely focused on getting the best talent in the world, because we felt it would be a significant differentiation to be able to start from a clean sheet of paper. Having people who have that expertise allowed us to take a step back and reimagine what could be possible with an operating system, and gave us the benefit of being able to choose best-in-class approaches.

>> So what's the point where this all came together? What was the guiding vision? Was it that network OSes are going to be cloud-based? Was it going to be more IoT? What were some of the founding principles that really got this going? Because clearly we see a trend where Intel's been dominating, we see what NVIDIA is doing competitively, certainly on the GPU side; you're seeing the white box become a trend; Google makes their own stuff; Apple's big into making their own silicon. That's a whole big scale-out world with a lot of hardware experience. What was the catalyst for you guys? What was the guiding principle? >> Yeah, I would say there were three, John, and you hit on a couple of them in your reference to Intel and NVIDIA. If I start at the top level: the networking market is a large market, and it's also very strategic and foundational in a hyper-connected world. That market is also dominated by a few people; there are essentially three vertically integrated OEMs that dominate it, and when you have that type of dominance, it leads ultimately to high prices and muted innovation. So we felt, number one, the market was going through tremendous change, but at the same time it had been tightly controlled by a few people. The other part was that there was a tremendous amount of innovation happening at the silicon component level, coming from the semiconductor industry. I was early at Broadcom, very involved in some of the networking things that happened in the early stages of the company, and we saw tremendous amounts of innovation and feature velocity at the silicon component level. That in turn led to a lot of system hardware people coming into the market and producing systems based on this wide variety of choices for the silicon. But the missing link was really an operating system that would unleash all that innovation. >> So Silicon Valley is back. Steve, you're a VC now, but you were the CTO at VMware, one of the companies that actually changed how data centers operate, certainly as a pretext to cloud computing, which we're seeing with microservices and the growth of cloud. Silicon's hot, and IT operations is being decimated as we once knew it; everything's being automated away, and you need more function now. There's demand. How do you see it? You always see things a little early as a technologist, now a VC. What got you excited about these guys? What's the bottom line? >> Yeah, maybe two points on that. One: silicon has definitely become interesting again, if you will, in the Silicon Valley area, and I think that's partly because cloud scale and web scale create these environments where you can afford to put in new hardware and really take advantage of it. I was a semiconductor person first, at a startup too, so it's exciting for me to see that. But for Devesh, it's kind of a straightforward story. Especially in a world of cloud or IoT or everything, networking is literally the core to all of this working going forward, and the opportunity to rethink it with a new design and a software-first mentality felt kind of perfect right now. And I think Devesh even sells the team a little short, even with all the numbers that are there. Keyur, for instance, his co-founder: everyone you talk to will call him Mr. BGP, which is one of the main routing protocols on the internet. So it's just a ridiculously deep team trying to take this on. There have been a few companies trying to do something kind of like this, and what do they say, the second mouse gets the cheese? I think we've seen some things that didn't work the first time around, and we can really improve on them and have a chance to make a major impact on the networking market.

>> Just to go on a tangent for a second, because as you're talking my brain is firing away: one of the things I've been talking about on theCUBE a lot is ageism, and if you look at the movement of the cloud, it's brought a systems mindset back. If you look at all the best successes out there right now, it's almost all old guys and gals, but it's really systems people, people who understand networking and systems, because the cloud is an operating system, and you have an operating system for networking. So you're seeing that trend certainly happen; that's awesome. The question I have for you, Devesh, is: what is the difference, what's the impact of this new network OS? Because I'm almost envisioning, if I think through my mind's eye, you've got servers and serverless, certainly big trends we're seeing in cloud; it's one resource pool, one operating system, and that needs to have cohesiveness and connectedness through services. Is that how you guys are thinking about it? What's different about what you're doing with ArcOS versus what's out there today? >> That's a great question, John. When we were talking about our team, I talked a little bit about the market opportunity and about the innovation happening at the semiconductor and systems level, and said the missing link, the third piece of the puzzle, was the OS. As I said at the onset, we had the benefit of hiring some of the best people in the world, and that gave us the opportunity to look at the twenty-plus years of development that had happened on the operating-system side for networking and identify the things that really made sense. We had the benefit of being able to adopt what worked, and then augment it with the things needed for a modern-day networking infrastructure environment. So we set about producing a product we call ArcOS. The characteristics that are unique are, first of all, best-in-class protocols. We have minimal dependency on open-source protocols, and the reason is that no serious network operator is going to put an open-source networking protocol in the core of their network; they're not going to risk their business and the efficacy and performance of their network for something like that. So we start with best-in-class protocols, and then we captured them in a very open, modular, microservices-based architecture, which gives us the flexibility and extensibility to compose it in a manner consistent with the end use case. It's designed from the onset to be very scalable and very versatile in where it can be deployed: in a physical environment, via a container, or in the cloud. We're agnostic to all of those scenarios. In addition, we knew we had to make it usable; it makes no sense to have best-in-class protocols if our customers can't use them. So we've adopted OpenConfig, YANG-based models, and we have programmable APIs, so in any environment people can leverage their existing tools and applications and relatively easily and efficiently integrate ArcOS into their networking environment. Then, similarly, we did the same thing on the hardware side. We have something we call DPAL, a data-plane adaptation layer. It's an intelligent HAL, and what it allows us to do is be hardware-agnostic: we're indifferent to the underlying hardware, and we want to take advantage of the advancements at the silicon component level as well as the system level and be able to deploy ArcOS anywhere.

>> Let's take a step back. The protocols, that's awesome. What's the value proposition for ArcOS, and who's the target audience? You mentioned data centers; is it data-center operators, developers, service providers? Who's your target customer? >> The piece of the puzzle that wraps everything together is that we wanted to do it at massive scale. We have the ability to support internet scale, with deep routing capabilities, within ArcOS, and as a byproduct of that and everything else we've done architecturally, we're the world's first operating system that's been ported to the high-end Broadcom StrataDNX family; that product is called Jericho+ in the marketplace. As a byproduct of that, we can ingest a full internet routing table, and so we can be used in the highest-end applications for network operators. Performance is a key value proposition, as measured by internet scale, by convergence times, and by the amount of control, visibility and access we provide. And by virtue of solving that high-end problem, it's very easy for us to come down. In terms of your specific question about use cases: we have active discussions in data-center-centric applications for the leaf and spine, active discussions for edge applications, and active discussions for cloud-centric applications. Arrcus can be used anywhere. >> Who's the buyer? >> The network operator. >> So a variety of personas: a network operator, a large telco, a person running a killer app that's high-scale and mission-critical. Am I right? >> You're absolutely getting it right. Basically, anybody that has a network and a networking infrastructure, and is consuming networking equipment, is a potential customer for us. The product has the extensibility to be used anywhere in the data center, at the edge, or in the cloud, and we're very focused on some of the use cases in the CDN, peering, and IP route-reflector space. >> Great. Steve, I want to get your thoughts, because I know how you invest; you guys are a great firm over there, and you're pretty finicky on investments. Certainly the team checks out; the pedigrees are there. Now, big markets: what's the market here for you? What's the bet for you guys on the market side? >> Yeah, it's pretty straightforward. As you look at the size of the networking market, with three major players and a longer tail, owning a small piece of a giant market is a great way to get started, if you believe in the secular trends going on with innovation in hardware and the ability to take advantage of them. I think we've identified a few really interesting starting use cases in web-scale companies that have a lot of cost and needs on the networking side. But what I love about the software architecture, and it reminds me a lot of the early virtualization pieces: if you can take advantage of movement and advances in hardware as they improve, and really bring them into a company more quickly than before, then those companies are going to have better economics on their networking early on. So get a great layer in, solve a particular use case, and then ride the trends of being able to take advantage of new hardware and to provide the data and the APIs to program and manage it. >> It's limitless opportunity, because with custom silicon that has purpose-built protocols, it's easy to put a box together in a large data center, or even a lot of boxes. >> Yeah, and you can imagine the vendors of the advances in the chips really love that there's a good company that can take advantage of them more quickly than others can. >> So cloud service providers, certainly, are a target audience here; the large clouds would love it. Is Broadcom a customer or a partner of you guys? >> In two parts. First, Broadcom's a partner. We've ported ArcOS onto multiple members of the Broadcom switching family, so we have five or six of their networking system-on-chip components that we've ported to, including the two highest-end, which is the Jericho+. >> And you've got a little tailwind with Broadcom buying CA; that's going to open up IT operations to you guys, and a whole set of applications. Talk about what you just said: the extensibility of taking boxes and tying application performance to what's going to be vertically integrated. >> Yeah, from a semiconductor perspective, since I spent a lot of time in the industry: one of the challenges we always had was the software. I had founded a high-core-count multiprocessor company, and at EZchip we had the world's highest-end network processor; the challenge was always software. I think if you take all the innovation in the silicon industry and couple it with the right software, the combination of those two things opens up a vast number of opportunities, and we feel that with ArcOS we provide that software piece that's going to help people take advantage of all the great innovation that's happening. >> You mentioned earlier that people don't want to bring open source into the core of the network, yet the open-source communities are growing at an exponential rate, and open source is becoming the lingua franca for all developers, especially modern software developers. Why not open source in the core? The network OS has got to be bulletproof, you need security, obviously, so the answer's partly there, but that seems counter to the trend. What's the answer on why not open source in the core? >> We take advantage of open source where it makes sense. We take advantage of ONL, Open Network Linux, and we have developed our protocols to run in that environment. The reason we feel the protocols should be developed in-house, as opposed to leveraging things from the open-source community, is the internet-scale multi-threading of BGP and integrating things like OpenConfig YANG-based models into that environment. It's not only proven, but the capabilities we're able to innovate on and bring unique differentiation to really required going back to a clean sheet of paper, so we designed it ground-up to be optimized for the needs of today. >> Steve, your old boss Paul Maritz used to talk about the hardened top. Similar here, right? No one's really going to care what's underneath if it works great, and it's under the hardened top where you use open source as a connection point for services and opportunities to grow. Similar concept? >> Yes. At the end of the day, open source is great for certain things, for community and extensibility and visibility, and on the flip side, customers look to a company that's accountable for making sure it performs and is high quality. So I think the modern way, especially for mission-critical infrastructure, is to have a mix of both: give back to the community where it makes sense, and be responsible for hardening things or building them where it doesn't.

>> So how did you land these guys? Did you get in early and tell them not to talk to any other VCs? How did it all come together between you two? >> We've actually been friends for a while, which has been great, and at one point we decided to ask, "Hey, what do you actually do?" I found out I was a venture investor and he was a network engineer. But I've actually really liked the networking space as a whole. As much as people talk about the cloud or open source or storage being tough, networking is literally everywhere and will be everywhere, whatever our world looks like, so I've always been looking for the most interesting companies in that space. And we always joke that in the investment world, San Francisco is kind of applications, the mid-Peninsula is sort of operating systems, and the lower you get, the more technical it gets. >> Well, that's interesting. I mean, we're a media company; I think we're doing things differently. We were talking before we came on camera: I think media is undervalued. I just wrote a tweet on that and got some traction. But it's shifting back to silicon, and you're seeing systems again. If you look at some of the hottest areas, IT operations is being automated away, AIOps, auto machine learning; you're starting to see some of these high-end systems plays. >> That's exactly where I was going to go. I especially just love very deep intellectual property that is hard to replicate, and that you can ultimately charge a premium for, something that is that hard to do. That's really what drew me into the deal with these guys. >> Any other investors in the syndicate? >> Sure. Our initial seed investor was Clear Ventures; a gentleman by the name of Chris Rust is on our board. Then Steve came in and led our most recent round of funding, and he's also on the board. Beyond that institutional money, we have a group of very strategic individual investors. Two people I would highlight among the vast number of advisers we have: a gentleman by the name of Pankaj Patel. Pankaj was the chief development officer at Cisco, basically number two at Cisco for a number of years, with deep operating experience across all facets of what we would need. And then there's another gentleman by the name of Amarjeet Gill. I've been friends with Amarjeet for 30 years; he's probably one of the single most successful entrepreneurs in the Valley. He's incubated companies that have been purchased by Broadcom, by Apple, by Google, by Facebook, by Intel, by EMC. So we were fortunate enough to get him involved and keep him busy. >> Great pedigree, great investors. With that kind of intellectual property and those smart minds, there's a lot of pressure on you as the CEO not to screw it up, right? I mean, come on, all those smart minds! >> You know, I welcome it, actually; I enjoy it. When you have a great team and as many capable people surrounding you, it really comes together, so I don't think it's about me. >> I was just kidding, by the way. >> I actually think, number one, it's about the team, and I'm merely a spokesperson to represent all the great work our team has done. I'm really proud of the people we have, and frankly it makes my job easier. >> You've got a lot of people to tap for advice. Certainly the shared experiences collectively in the different areas make a lot of sense, and the investors too. >> Absolutely, and it's not just at the board or investor level; it's at the adviser level and also with our individual team members. When we have a team that executes as well as ours, everything falls into place. >> Well, we think the software world's changed, we think the economics are changing, certainly when you look at cloud, whether it's cloud computing or token economics with blockchain and new emerging tech around AI. We think the world is certainly going to change, so you guys have a great team to figure it out. You've got to execute in real time, and you've got a real technology play with IP. The question is: what's the next step? What are your priorities now that you're out there? You came out of stealth mode, you've got some customers, you've got Broadcom relationships. Looking out at the landscape, what's your plan for the next year? What are your goals? >> Really, to take every facet of what you said and scale the business. We're actively hiring, and we have a lot of customer activity. This week happens to be the most recent IETF conference, which took place in Montreal, and given our company launch on Monday, there's been a tremendous amount of interest in everything we're doing. That, coupled with the existing customer discussions we have, is only going to expand. And we have a very robust roadmap to continue to augment and add capabilities to the baseline capabilities we brought to the market. So I really view the next year as scaling the business in all aspects, and increasingly my time is going to be focused on commercially centric activities. >> Well, congratulations; a great team, and we've seen a great investment. CUBE Conversation here, I'm John Furrier, with the hot startup launching this week here in California, in Silicon Valley, where silicon is back and software is back. It's theCUBE, bringing you all the action. Thanks for watching.

Published Date : Jul 20 2018

**Summary and Sentiment Analysis are not shown because of an improper transcript.**

ENTITIES

| Entity | Category | Confidence |
| --- | --- | --- |
| Steve | PERSON | 0.99+ |
| February of 2016 | DATE | 0.99+ |
| John Ferrier | PERSON | 0.99+ |
| Derek Young | PERSON | 0.99+ |
| August of 2016 | DATE | 0.99+ |
| Derek | PERSON | 0.99+ |
| Steve Herod | PERSON | 0.99+ |
| twenty plus years | QUANTITY | 0.99+ |
| 20 plus years | QUANTITY | 0.99+ |
| Steve Herrod | PERSON | 0.99+ |
| California | LOCATION | 0.99+ |
| EMC | ORGANIZATION | 0.99+ |
| Cisco | ORGANIZATION | 0.99+ |
| July 2018 | DATE | 0.99+ |
| Montreal | LOCATION | 0.99+ |
| 30 years | QUANTITY | 0.99+ |
| NVIDIA | ORGANIZATION | 0.99+ |
| Monday | DATE | 0.99+ |
| six | QUANTITY | 0.99+ |
| arcus Inc | ORGANIZATION | 0.99+ |
| John Fourier | PERSON | 0.99+ |
| Amarjeet Gill | PERSON | 0.99+ |
| 150 plus patents | QUANTITY | 0.99+ |
| John | PERSON | 0.99+ |
| 600 plus years | QUANTITY | 0.99+ |
| Apple | ORGANIZATION | 0.99+ |
| Facebook | ORGANIZATION | 0.99+ |
| five | QUANTITY | 0.99+ |
| today | DATE | 0.99+ |
| Google | ORGANIZATION | 0.99+ |
| VMware | ORGANIZATION | 0.99+ |
| easy chip | ORGANIZATION | 0.99+ |
| Silicon Valley | LOCATION | 0.99+ |
| Broadcom | ORGANIZATION | 0.99+ |
| two people | QUANTITY | 0.99+ |
| Mike | PERSON | 0.99+ |
| Intel | ORGANIZATION | 0.99+ |
| Palo Alto | LOCATION | 0.99+ |
| first time | QUANTITY | 0.98+ |
| Chris rust | PERSON | 0.98+ |
| three | QUANTITY | 0.98+ |
| one | QUANTITY | 0.98+ |
| next year | DATE | 0.98+ |
| two parts | QUANTITY | 0.98+ |
| over 400 products | QUANTITY | 0.98+ |
| first | QUANTITY | 0.97+ |
| third piece | QUANTITY | 0.97+ |
| John furry | PERSON | 0.97+ |
| Linux | TITLE | 0.97+ |
| two points | QUANTITY | 0.97+ |
| first operating system | QUANTITY | 0.97+ |
| this week | DATE | 0.97+ |
| three major players | QUANTITY | 0.96+ |
| both | QUANTITY | 0.95+ |
| Kop | PERSON | 0.95+ |
| General Catalyst | ORGANIZATION | 0.95+ |
| Mobius | ORGANIZATION | 0.94+ |
| San Francisco | LOCATION | 0.93+ |
| Palmer | PERSON | 0.92+ |
| Arrcus | ORGANIZATION | 0.9+ |
| Mellanox | ORGANIZATION | 0.89+ |
| single | QUANTITY | 0.88+ |
| one point | QUANTITY | 0.88+ |
| two things | QUANTITY | 0.88+ |
| lingua franca | TITLE | 0.87+ |
| General Catalyst VCU | ORGANIZATION | 0.87+ |
| Kher | PERSON | 0.86+ |
| VCS | ORGANIZATION | 0.8+ |

Dinesh Nirmal, IBM | IBM Think 2018


 

>> Voiceover: Live from Las Vegas, it's theCUBE. Covering IBM Think 2018. Brought to you by IBM. >> Welcome back to IBM Think 2018. This is theCUBE, the leader in live tech coverage. My name is Dave Vellante, and this is our third day of wall-to-wall coverage of IBM Think. Dinesh Nirmal is here; he's the Vice President of Analytics Development at IBM. Dinesh, great to see you again. >> I know. >> We just saw each other a couple of weeks ago. >> I know, in New York. >> Yeah and, of course, in Big Data SV. >> Right. >> Over at the Strata Conference. So, great to see you again. >> Well, thank you. >> A little different venue here. It was real intimate in New York City and in San Jose. >> I know, I know. >> Massive. What are your thoughts on bringing all the clients together like this? >> I mean, it's great because we have combined all the conferences into one, which obviously helps because the message is very clear to our clients on what we are doing end-to-end, and the feedback has been tremendous. I mean, you know, very positive. >> What has the feedback been like in terms of how you guys are making progress in the analytics group? What do they like? What are they asking you for more of? >> Right. So on the analytics side, the data is growing, you know, by terabytes a day, and the question is how do they create insights into this massive amount of data that they have on premises or on cloud. So we have been working to make sure that we build the tools to enable our customers to create insights whether the data is on private cloud, public, or hybrid. And that's a very unique value proposition that we bring to our customers. Regardless of where your data is, we can help you, whether it's cloud, private, or hybrid. >> Well so, we're living in this multi-petabyte world now. Like overnight it became multi-petabyte.
And one of the challenges of course people have is not only how do you deal with that volume of data, but how do I act on it and get insights quickly. How do I operationalize it? So maybe you can talk about some of the challenges of operationalizing data. >> Right. So, when I look at machine learning, there are three D's, I always say: the data, the development of the model, and the deployment of the model. When I talk about operationalization, especially the deployment piece is the one that gets the most challenging for our enterprise customers. Once you clean the data and you build the model, how do you take that model and bring it into your existing infrastructure? I mean, you know, look at your large enterprises, right? I mean, you know, they've been around for decades. So they have third-party software. They have existing infrastructure. They have legacy systems. >> Dave: A zillion data marts and data warehouses. >> Data marts, so into all of that, how do you infuse machine learning? It becomes very challenging. I met with the CTO of a major bank a few months ago, and his statement kind of stands out to me. He said, "Dinesh, it only took us three weeks to build the model. It's been 11 months, we still haven't deployed it." So that's the challenge our customers face, and that's where we bring in the skillset. Not just the tools, but we bring the skills to enable that and bring it into production. >> So is that the challenge? Is it the skillsets, or is it the organizational inertia around, well, I don't have the time to do that now because I've got to get this report out, or ... >> Dinesh: Right. >> Maybe you can talk about that a little. >> Right. So that is always there, right? I mean, because once a priority is set, obviously the different challenges pull you in different directions, so every organization faces that to a large extent. But I think if you take it from a pure technical perspective, I would say the challenge is two things.
Getting the right tools and getting the right skills. So, with IBM, what we are focusing on is how do we bring the right tools, regardless of the form factor you have, whether cloud, private cloud, or hybrid cloud, and then how do we bring the right skills into it. So this week we announced the Data Science Elite team, who can come in and help you with building models. Looking at the use cases. Should we be using vanilla machine learning or should we be using deep learning? All those things, and how do we bring that model into the production environment itself. So I would say tools and skills. >> So skills-wise, in the skills there are at least two paths. It's like the multi-tool athlete. You've got the understanding of the tech. >> Dinesh: Right. >> You know, the tools, most technology people say hey, I'll figure that out. But then there's this data and digital >> Right. >> Skills. It's like this double-deep skill set that is challenging. So you're saying you can help. >> Right. >> Sort of kick-start that. And how does that work? Is that sort of a services engagement? That's part of the ... >> So, once you identify a use case, the Data Science Elite team can come in, because they have some level of vertical knowledge of your industry. They are very well-trained data scientists. So they can come assess the use case. Help you pick the algorithms to build it. And then help you deploy, cleanse the data. I mean, you bring up a very, very good point. I mean, let's just look at the data, right. The personas that are involved in data: there is the data engineer, there's the data scientist, there's the data worker, there's the data steward, there's the CTO. So, that's just the data piece, right? I mean, there are so many personas that have to come together. And that's why I said the skills are a very critical piece of all of it, but also, working together. The collaboration is important. >> Alright, tell us more about IBM Cloud Private for Data. We've heard about IBM Cloud Private. >> Dinesh: Right.
>> Cloud Private for Data is new. What's that all about? >> Right, so we announced IBM Cloud Private for Data this week, and let me tell you, Dave, this has been the most significant announcement from an analytics perspective in a while, given that we are getting such a positive response. And I will tell you why. So when you look at the platform, our customers want three things. One, they want to be able to build on top of the platform. They want it to be open, and they want it to be extensible. And we have all three available. The platform is built on Kubernetes. So it's completely open, it's scalable, it's elastic. All those features come with it. And then we put that end-to-end, so you can ingest the data, you can cleanse it or transform it. You can build models or do deep analytics on it. You can visualize it. So you can do everything on the platform. So I'll take an example, like blockchain. I mean, if I were to simplify it, right? You have the ledger, where you are, obviously, putting your transactions in, and then you have a state database where you are putting your latest transactions in. The ledger's unstructured. So, as that is getting filled, how do you ingest that, transform it on the fly, and be able to write it into a persistent place and do analytics on it? Only a platform can deal with that kind of volume of data. And that's where the data platform comes in, which is very unique, especially for the modern applications that you want to build. >> Yes, because if you don't have the platform ... Let's unpack this a little bit. You've got a series of bespoke products, and then you've got just a lot of latency in terms of the elapsed time to get to the insights. >> Dinesh: Right. >> Along the way you've got data consistency issues, data quality >> Dinesh: Right >> maybe is variable. Things change. >> Right. I mean, think about it, right. If you don't have the platform, then you have siloed products.
So all of a sudden you've got to get a product for your governance, your integration catalog. You need to get a product for ingest. You've got to get a product for persistence. You've got to get a product for analytics. You've got to get a product for visualization. And then you add the complexity of the different personas working together across the multitude of products. You have a mess on your hands at that point. The platform solves that problem because it brings you an integrated end-to-end solution that you can use to build, for example, blockchain in this case. >> Okay, I've asked you this before, but I've got to ask again and get it on record at Think. So, a lot of people would hear that and say, okay, but it's a bunch of bespoke products that IBM has taken, put a UI layer on top of, and called a platform. So, what defines a platform, and how have you not done that? >> Right. >> And actually created the platform? >> Right. So, we are taking the functionality of the existing products, and that's what differentiates us. Right? If you look at our governance portfolio, I can sit here and very confidently say no one can match that, so >> Dave: Sure. We obviously have that strength >> Real Tap >> Right, Real Tap. That we can bring. So we are bringing the functionality. But what we have done is we have taken the existing products and decomposed them into microservices so we can make it cloud native. So that is a huge step for us, right? And then once you make that containerized with microservices, it fits into the open platform that we talked about before. And now you have an end-to-end, well-orchestrated pipeline that's available in the platform that can scale and be elastic as needed. So, it's not that we are bringing the products; we are bringing the functionality of them.
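The end-to-end flow Dinesh describes for the blockchain example — ingest the unstructured ledger, transform it on the fly, write it to a persistent place, and analyze it — can be sketched as a toy pipeline. This is only an illustration; the record format, field names, and in-memory "state database" are invented for the example and are not part of IBM Cloud Private for Data:

```python
# Raw "ledger" records arrive as unstructured strings; the platform's job
# is to ingest, transform on the fly, persist, and analyze them.
raw_ledger = [
    "2018-03-21|acct42|deposit|100.00",
    "2018-03-21|acct42|withdraw|30.00",
    "2018-03-21|acct77|deposit|250.00",
]

def ingest(lines):
    # Parse each raw line into its fields.
    for line in lines:
        yield line.split("|")

def transform(records):
    # Shape the parsed fields into typed, named records.
    for date, acct, kind, amount in records:
        yield {"date": date, "account": acct, "kind": kind,
               "amount": float(amount)}

state_db = {}  # stand-in for the persistent "state database"

def persist(rows):
    # Fold each transaction into the current per-account balance.
    for row in rows:
        delta = row["amount"] if row["kind"] == "deposit" else -row["amount"]
        state_db[row["account"]] = state_db.get(row["account"], 0.0) + delta
        yield row

# Drive the pipeline end to end, then analyze the persisted state.
list(persist(transform(ingest(raw_ledger))))
print(state_db)  # {'acct42': 70.0, 'acct77': 250.0}
```

Each stage is a generator, so records stream through one at a time instead of being copied wholesale between separate products — which is the point of doing it on a single platform.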
>> But I want to keep on this for a second. So the experience for the user is different if you've microserviced it as you say, because if you just did what I said and put a UI layer on top, you would be going into these stovepipes and then a cul-de-sac and then coming back >> Dinesh: Right. >> And coming back. So, the development effort for that must have been >> Oh, yeah. >> Fairly massive. You could have done the UI layer in, you know, in months. >> Right, right, right, but then it is not really the cloud-native way of doing it, right? I mean, if you're just changing the UI and the experience, that's completely different. What we have done is that we have completely re-architected the underlying product suite to match the experience and the underlying platform layer. >> So how long did this take? What kind of resources did you have to throw at this from a development standpoint? >> So this has been in development for 12-18 months. >> Yeah. >> And we put, you know, a tremendous amount of resources into making this happen. I mean, fortunately in our case we have the depth, we have the functionality. So it was about translating that into the cloud-native way of doing app development. >> So did you approach this with sort of multiple small teams? Or was there a larger team? What was your philosophy here? >> It was multiple small teams, right. So if you look at our governance portfolio, we had to take our governance catalog and rewrite that code. If you look at our master data management portfolio, we had to take, so it's multiple small teams with a very core focus. >> I mean, I ask you these questions because I think it adds credibility to the claims that you're making that you have a platform, not a series of bespoke products. >> Right, and we demoed it. Actually tomorrow at 11, I'm going to deep dive into the architecture of the whole platform itself. How we built it. What components we used. And I'm going to demo it.
So the code is up and running, and we are going to put it out there for everybody to go use. >> At Mandalay Bay, where is that demo? >> It's in Mandalay Bay, yeah. >> Okay. >> We have a session at 11:30. >> Talk more about machine learning and how you've infused machine learning into the portfolio. >> Right. So, every part of our product portfolio has machine learning, so I'll take two examples. One is DB2. So today, the DB2 optimizer is a cost-based optimizer. We have taken the optimizer and infused machine learning into it to say, you know, based on the query that's coming in, predict the right access path and take it. And that has been such a great experience, because we are seeing 30-50 percent performance improvement in most of the queries that we run through the machine learning. So that's one. The other one is classification. So let's say you have a business term and you want to classify. If you have a zip code, we can use that in our catalog to say there's an 80% chance this particular number is a zip code, and then it can learn over time; if you tell it, no, that's not a zip code, that's a post code in Canada, then the next time you put that in, it has learned. So we have infused machine learning into every product, and our goal is to become a completely cognitive platform pretty soon. I mean, you know, so that has also been a tremendous piece of work that we're doing. >> So what can we expect? I mean, you guys are moving fast. >> Yeah. >> We've seen you go from sort of a bespoke product division to this platform division. Injecting now machine learning into the equation. You're bringing in new technologies like blockchain, which you're able to do because you have a platform. >> Right. >> What should we expect in terms of the pace and the types of innovations that we could see going forward? What could you share with us without divulging secrets? >> Right.
So, from a product perspective, we want to infuse cognitive machine learning into every aspect of the product. We don't want our customers calling us to tell us there's a problem. We want to be able to tell our customer a day or two hours ahead that there is a problem. So that is predictability, right? And we want to infuse machine learning not just into the product, but even on the services side. From a platform perspective, we want to make it completely open and extensible, so our partners can come and build on top of it, and every customer can take advantage of vertical and other solutions that they build. >> You get a platform, you get this flywheel effect, inject machine learning everywhere, open APIs so you can bring in new technologies like blockchain as they evolve. Dinesh, thank you very much for coming on theCUBE. >> Oh, thank you so much. >> Always great to have you. >> It's a pleasure, thank you. >> Alright, keep it right there everybody. We'll be right back with our next guest. This is theCUBE, live from IBM Think 2018. We'll be right back. (techno music)
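The catalog classification Dinesh describes — guessing that a value is a zip code with some confidence, then learning from a user's correction — can be sketched roughly as follows. The class, the regex heuristic, and the confidence numbers are hypothetical stand-ins for illustration, not IBM's implementation:

```python
import re

class TermClassifier:
    """Guess a business term for a value, then learn from corrections."""

    def __init__(self):
        self.overrides = {}  # steward feedback: value -> corrected term

    def classify(self, value):
        # Learned corrections win outright.
        if value in self.overrides:
            return self.overrides[value], 1.0
        # Heuristic guess: five digits looks like a US zip code.
        if re.fullmatch(r"\d{5}", value):
            return "US zip code", 0.8
        return "unknown", 0.0

    def correct(self, value, term):
        """A data steward tells us the real meaning; remember it."""
        self.overrides[value] = term

clf = TermClassifier()
print(clf.classify("94304"))      # ('US zip code', 0.8)
clf.correct("90210", "store id")  # steward: this one is not a zip code
print(clf.classify("90210"))      # ('store id', 1.0)
```

The feedback loop is the key idea: the initial guess carries a confidence, and corrections permanently change future answers for that value.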

Published Date : Mar 21 2018



ENTITIES

| Entity | Category | Confidence |
| --- | --- | --- |
| Dave | PERSON | 0.99+ |
| Dave Vellante | PERSON | 0.99+ |
| IBM | ORGANIZATION | 0.99+ |
| Canada | LOCATION | 0.99+ |
| Dinesh Nirmal | PERSON | 0.99+ |
| New York | LOCATION | 0.99+ |
| San Jose | LOCATION | 0.99+ |
| New York City | LOCATION | 0.99+ |
| three weeks | QUANTITY | 0.99+ |
| two hours | QUANTITY | 0.99+ |
| 80% | QUANTITY | 0.99+ |
| Mandalay Bay | LOCATION | 0.99+ |
| 11 months | QUANTITY | 0.99+ |
| Dinesh | PERSON | 0.99+ |
| Las Vegas | LOCATION | 0.99+ |
| 11:30 | DATE | 0.99+ |
| two things | QUANTITY | 0.99+ |
| a day | QUANTITY | 0.99+ |
| One | QUANTITY | 0.99+ |
| third day | QUANTITY | 0.98+ |
| this week | DATE | 0.98+ |
| three things | QUANTITY | 0.98+ |
| two examples | QUANTITY | 0.98+ |
| three | QUANTITY | 0.98+ |
| today | DATE | 0.98+ |
| DB2 | TITLE | 0.98+ |
| one | QUANTITY | 0.98+ |
| 12-18 months | QUANTITY | 0.97+ |
| first | QUANTITY | 0.97+ |
| decades | QUANTITY | 0.93+ |
| terabytes a day | QUANTITY | 0.93+ |
| double | QUANTITY | 0.9+ |
| Think | ORGANIZATION | 0.88+ |
| Vice-President | PERSON | 0.87+ |
| few months ago | DATE | 0.85+ |
| petabyte | QUANTITY | 0.85+ |
| tomorrow at | DATE | 0.83+ |
| couple of weeks ago | DATE | 0.81+ |
| 30-50 percent | QUANTITY | 0.79+ |
| Danish | OTHER | 0.76+ |
| IBM Think 2018 | EVENT | 0.75+ |
| two paths | QUANTITY | 0.73+ |
| Think | COMMERCIAL_ITEM | 0.72+ |
| 11 | DATE | 0.68+ |
| multi | QUANTITY | 0.68+ |
| Kubernetes | TITLE | 0.68+ |
| a second | QUANTITY | 0.66+ |
| Cloud Private | TITLE | 0.62+ |
| Cloud Private for | TITLE | 0.61+ |
| Cloud Private for Data | TITLE | 0.61+ |
| IBM | EVENT | 0.56+ |
| CTO | PERSON | 0.56+ |
| Strata Conference | EVENT | 0.55+ |
| 2018 | EVENT | 0.54+ |
| Cube | TITLE | 0.5+ |
| Cube | COMMERCIAL_ITEM | 0.45+ |
| Cube | PERSON | 0.27+ |

Jacques Nadeau, Dremio | Big Data SV 2018


 

>> Announcer: Live from San Jose, it's theCUBE, presenting Big Data Silicon Valley. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to Big Data SV in San Jose. This is theCUBE, the leader in live tech coverage. My name is Dave Vellante, and this is day two of our wall-to-wall coverage. We've been here most of the week, had a great event last night; about 50 or 60 of our CUBE community members were here. We had a breakfast this morning where the Wikibon research team laid out its big data forecast, the eighth big data forecast and report that we've put out, so check that out online. Jacques Nadeau is here. He is the CTO and co-founder of Dremio. Jacques, welcome to theCUBE, thanks for coming on. >> Thanks for having me here. >> So we were talking a little bit about what you guys do. Three-year-old company. Well, let me start. Why did you co-found Dremio? >> So, it was a very simple thing I saw: over the last ten years or so, we saw a regression in the ability for people to get at data. You see all these really cool technologies that came out to store data — data lakes, you know, SQL systems, all these different things that make developers very agile with data. But what we were also seeing was a regression in the ability for analysts and data consumers to get at that data, because the systems weren't designed for analysts; they were designed for data producers and developers. And we said, you know what, there needs to be a way to solve this. We need to be able to empower people to be self-sufficient again at the data consumption layer. >> Okay, so you solved that problem how? You said you call it a self-service data platform. >> Yeah, yeah, so a self-service data platform, and the idea is pretty simple. It's that, no matter where the data is physically, people should be able to interact with a logical view of it. And so, we talk about it a little like it's Google Docs for your data.
So people can go into the system, they can see the different data sets that are available to them, collaborate around those, create changes to those that they can then share with other people in the organization, always dealing with the logical layer; and then, behind the scenes, we have physical capabilities to interact with all the different systems we work with. But that's something that business users shouldn't have to think as much about, and so, if you think about how people interact with data today, it's very much about copies. Every time you want to do something, typically you're going to make a copy. I want to reshape the data, I make a copy. I want to make it go faster, I make a copy. And those copies are very, very difficult for people to manage, and they mix the business meaning of data with the physical — I'm making copies to make them faster or whatever. And so our perspective is that, if you can separate the physical concerns from the logical, then business users have a much better likelihood of being able to do something self-service. >> So you're essentially virtualizing my corpus of data, independent of location, is that right, I mean-- >> It's part of what we do, yeah. No, it's part of what we do. So, the way we look at it is kind of several different components to try to make something self-service. It starts with, yeah, virtualizing or abstracting away the details of the physical, right? But then, on top of that, expose a very, sort of a very user-friendly interface that allows people to sort of catalog and understand the different things, you know, search for things that they want to interact with, and then curate things, even if they're non-technical users, right? So the goal is that, if you talk to sort of even large internet companies in the Valley, it's very hard to even hire the amount of data engineering that you need to satisfy all the requests of your end-users of data.
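The "logical view instead of a copy" idea can be illustrated with a small sketch. The class and dataset names here are invented for the example and are not Dremio's actual API:

```python
# Physical data stays where it is; consumers get named logical views.
physical_table = [
    {"customer": "acme", "region": "west", "revenue": 1200},
    {"customer": "initech", "region": "east", "revenue": 800},
    {"customer": "globex", "region": "west", "revenue": 950},
]

class VirtualDataset:
    """A saved transformation, evaluated lazily -- no copy is made."""

    def __init__(self, source, predicate):
        self.source = source        # a reference, not a duplicate
        self.predicate = predicate  # the saved "shape" of the view

    def rows(self):
        # Evaluate against the live physical data at read time.
        return [r for r in self.source if self.predicate(r)]

west_sales = VirtualDataset(physical_table, lambda r: r["region"] == "west")

# Because the view holds a reference, it reflects later physical changes.
physical_table.append({"customer": "hooli", "region": "west", "revenue": 400})
print([r["customer"] for r in west_sales.rows()])  # ['acme', 'globex', 'hooli']
```

Since there is no second copy, there is nothing extra to govern or keep in sync, and the business meaning of the view stays separate from physical concerns like speed.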
And so the goal of Dremio is basically to provide tools that offer a non-technical experience for getting at the data. So that's sort of the start of it, but then the second step is, once you've got access to this thing and people can collaborate and sort of deal with the data, then you've got these huge volumes of data, right? It's big data, and so how do you make that go faster? And then we have some components that deal with, sort of, speed and acceleration. >> So maybe talk about how people are leveraging this capability, this platform. What's the business impact? What have you seen there? >> So a lot of people have this problem, which is, they have data all over the place and they're trying to figure out "How do I expose this to my end-users?" And those end-users might be analysts, they might be data scientists, they might be product managers that are trying to figure out how their product is working. And so, what they're doing today is typically trying to build systems internally to provide these capabilities. So, for example, we're working with a large auto manufacturer. They've got a big initiative where they're trying to make the huge amounts of data they have across all sorts of different parts of the organization available to different data consumers. Now, of course, there's a bunch of security concerns that you need to have around that, but they just want to make the data more accessible. And so, what they're doing is using Dremio to, basically, catalog all the data below, expose it to the different users, apply lots of different security rules around that, and then create a bunch of reflections, which make things go faster as people are interacting with them. >> Well, what about the governance factor? I mean, you heard this in the Hadoop world years ago.
"Ah, we're going to harden Hadoop, we're going to..." and really, there was no governance, and it became more and more important. How do you guys handle that? Do you partner with people? Is it up to the customer to figure that out? Do you provide that? >> It's several different things, right? It's a complex ecosystem, right? So it's a combination of things. You start with partnering with different systems to make sure that you integrate well with those things. So the different things that control some parts of credentials inside the systems, all the way down to "What are the file system permissions?", right? "What are the permissions inside of something like Hive and the metastore there?" And then other systems on top of that, like Sentry or Ranger, are also exposing different credentialing, right? And so we work hard to sort of integrate with those things. On top of that, Dremio also provides a full security model inside of the sort of virtual space where we work. And so people can control the permissions, the ability to access or edit any object inside of Dremio, based on user roles and LDAP and those kinds of things. So it's kind of multiple layers that have to be working together. >> And tell me more about the company. So founded three years ago, I think a couple of raises, >> Yep >> who's backing you?
>> Yeah, yeah, yeah, so we founded just under three years ago. We had great initial investors in Redpoint and Lightspeed, so two great initial investors, and we raised about 15 million on that round. And then we actually just closed a B round in January of this year, and we added Norwest to the portfolio there. >> Awesome, so you're now in the mode of, I mean, they always say, you know, software is such a capital-efficient business, but you see software companies raising, you know, 900 million dollars and so, presumably, that's to compete, to go to market and, you know, differentiate with your messaging and branding.
Is that sort of the phase that you're in now? You've developed a product, it's technically sound, it's proven in the marketplace, and now you're scaling the go-to-market, is that right? >> That's exactly right. So we've had a lot of early successes, a lot of Fortune 100 companies using Dremio today. For example, we're working with TransUnion. We're working with Intel. We actually have a great relationship with OVH, which is the third-largest hosting company in the world, so a lot of great companies; Daimler is another one. So we're working with a lot of great companies, seeing sort of great early success with the product with those companies, and really looking to say "Hey, we're out here." We've got a booth for the first time at Strata here, and we're sort of letting people know about, sort of, a better way, or easier way, for people to deal with data >> Yeah. >> A happier way. >> I mean, it's a crowded space, right? There's a lot of tools out there, a lot of companies. I'm interested in how you sort of differentiate. Obviously simplification is a part of that, the breadth of your capabilities. But maybe, in your words, you could share with me how you differentiate from the competition and how you break out from the noise. >> Yeah, yeah, yeah, so you're absolutely right, it's a very crowded space. Everybody's using the same words, and that makes it very hard for people to understand what's going on. And so, what we've found is very simple: typically, in the first meeting with a customer, within the first 10 minutes we'll demo the product. Because so many technologies are technologies, not products, and so you have to figure out how to use the product. You've got to figure out how you would customize it for your certain use case.
And what we've found with our product is, by making it very, very simple, the light goes on for people in a very short amount of time, and so we also do things on our website so that you can see, in a couple of minutes or even less than that, little animations that sort of give you a sense of what it's about. But really, it's just "Hey, this is a product"; there's this light bulb that goes on, and it's great. And you figure this out over the course of working with different customers, right? But there's this light bulb that goes on for people who are so confused by all the things that are going on, and if we can just sit down with them and show them the product for a few minutes, all of a sudden they're like "Wait a minute, I can use this", right? So you're frequently talking to buyers that are not the most technical parts of the organization initially, and so most of the technologies they look at are very difficult to understand, and they have to look to others to try to even understand how it would fit into their architecture. With Dremio, we have customers that have installed it and gotten up, and within an hour or two, started to see real value. And that sort of excitement happens even in the demo, with most people. >> So you kind of have this bifurcated market. Since the big data meme, everybody says they're data-driven, and you've got a bifurcated market in that you've got the companies that are data-driven and you've got companies who say they're data-driven but really aren't. Who are your customers? Are they in both? Are they predominantly on the data-driven side? Are they predominantly in the trying-to-be data-driven? >> Well, I would say that they all would say that they're data-driven. >> Yeah, everyone, who's going to say "Well, we're not data-driven." >> Yeah, yeah, yeah. >> We're dead.
>> I would say that everybody has data, and they've got some ways that they're using it well and other places where they feel like they're not using it as well as they should. And so, I mean, the reason that we exist is to make it easier for people to get value out of data, and so, if they were getting all the value they think they could get out of data, then we probably wouldn't exist and they would be fully data-driven. So I think that for everybody it's a journey, and people are responding well to us, in part, because we're helping them down that journey. >> Well, the reason I asked that question is that we go to a lot of shows, and everybody likes to throw out the digital transformation buzzword and then use Uber and Airbnb as examples, but if you dig deeper, you see that data is at the core of those companies, and they're now beginning to apply machine intelligence and leverage all this data that they've built up, this data architecture that they built up over the last five or 10 years. And then you've got this set of companies where all the data lives in silos, and I can see you guys being able to help them. At the same time, I can see you helping the disruptors, so how do you see that? I mean, in terms of your role, in terms of affecting either digital transformations or digital disruptions. >> Well, I'd say that in either case, we believe in a very sort of simple thing, which is, going back to what I said at the beginning, that I see this regression in terms of data access, right? And so what happens is that, if you have a tightly coupled system between two layers, then it becomes very difficult for people to sort of accommodate two different sets of needs. And so, the change over the last 10 years was the rise of the developer as the primary person controlling data, and that brought a huge amount of great things with it, but analysis was not one of them. And there are tools that try to make that better, but that's really the problem. And so our belief is very simple, which is that a new tier needs to be introduced between the consumers and the producers of data. That tier may interact with different systems, and it may be more or less complex for certain organizations, but the tier is necessary in all organizations, because the analysts shouldn't be shaken around every time the developers change how they're doing data. >> Great. John Furrier has a saying that "Data is the new development kit", you know. He said that, I don't know, eight years ago, and it's really kind of turned out to be the case. Jacques Nadeau, thanks very much for coming on theCUBE. Really appreciate your time. >> Yeah. >> Great to meet you. Good luck and keep us informed, please. >> Yes, thanks so much for your time, I've enjoyed it. >> You're welcome. Alright, thanks for watching everybody. This is theCUBE. We're live from Big Data SV. We'll be right back. (bright music)

Published Date : Mar 9 2018



David Abercrombie, Sharethrough & Michael Nixon, Snowflake | Big Data SV 2018


 

>> Narrator: Live from San Jose, it's theCUBE. Presenting Big Data, Silicon Valley. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Hi, I'm George Gilbert, and we are broadcasting from the Strata Data Conference, we're right around the corner at the Forager Tasting Room & Eatery. We have this wonderful location here, and we are very lucky to have with us Michael Nixon, from Snowflake, which is a leading cloud data warehouse. And David Abercrombie from Sharethrough, which is a leading ad tech company. And between the two of them, they're going to tell us some of the most advanced use cases we have now for cloud-native data warehousing. Michael, why don't you start by giving us some context for how, on a cloud platform, one might rethink a data warehouse? >> Yeah, thank you. That's a great question, because let me first answer it from the end-user, business value perspective. When you run a workload on a cloud, there's a certain level of expectation you want out of the cloud. You want scalability, you want unlimited scalability, you want to be able to support all your users, you want to be able to support the data types, whatever they may be, that come into your organization. So, there's a level of expectation that one should expect from a service point of view once you're in a cloud. Now, a lot of the technologies that were built up to this point have been optimized for on-premises types of data warehousing, where perhaps that level of service and concurrency and unlimited scalability was not really expected but, guess what? Once it comes to the cloud, it's expected. So those on-premises technologies aren't suitable in the cloud, so for enterprises and, I mean, companies, organizations of all types from finance, banking, manufacturing, ad tech as we'll have today, they want that level of service in the cloud. And so, those technologies will not work, and so it requires a rethinking of how those architectures are built.
And it requires being built for the cloud. >> And just to break this down and be really concrete about some of the rethinking: we separate compute from storage, which is a familiar pattern that we've learned in the cloud, but we also then have to have this sort of independent elasticity between-- >> Yes. Storage and the compute, and then Snowflake's taken it even a step further, where you can spin out multiple compute clusters. >> Right. >> Tell us how that works and why that's so difficult and unique. >> Yeah, you know, that's taking us under the covers a little bit, but what makes our infrastructure unique is that we have a three-layer architecture. We separate, just as you said, storage from the compute layer, from the services layer. And that's really important because, as I mentioned before, you want unlimited capacity, unlimited resources. If you scale compute in today's world of on-premises MPP, what that really means is that you have to bring the storage along with the compute, because compute is tied to the storage. So when you scale the storage along with the compute, usually that involves a lot of burden on the data warehouse manager, because now they have to redistribute the data, and that means redistributing keys, managing keys if you will. And that's a burden. And in reverse, if all you wanted to do was increase storage but not the compute, because compute was tied to storage, why should you have to buy these additional compute nodes, which might add to the cost when, in fact, all you really wanted to pay for was additional storage? So, by separating those, you keep them independent, and so you can scale storage apart from compute. And then, once you have your compute resources in place, the virtual warehouses that you're talking about that have completed the job, you spun them up, it's done its job, and you take it down, guess what?
You can release those resources, and of course, in releasing those resources, basically you can cut your cost as well because, for us, it's pure usage-based pricing. You only pay for what you use, and that's really fantastic. >> Very different from the on-prem model where, as you were saying, compute and storage are tied together. >> Yeah, let's think about what that means architecturally, right? So if you have an on-premises data warehouse, and you want to scale your capacity, chances are you'll have to have that hardware in place already. And having that hardware in place already means you're paying that expense, so you may pay for that expense six months prior to needing it. Let's take a retailer example. >> Yeah. >> You're gearing up for a peak season, which might be Christmas, and so you put that hardware in place sometime in June. You'll always put it in in advance, because why? You have to bring up the environment, so you have to allow time for implementation or, if you will, deployment to make sure everything is operational. >> Okay. >> And then what happens is, when that peak period comes, you can't expand that capacity. But what happens once that peak period is over? You paid for that hardware, but you don't really need it. So, our vision is, or the vision we believe you should have when you move workloads to the cloud is, you pay for those when you need them. >> Okay, so now, David, help us understand, first, what was the business problem you were trying to solve? And why was Snowflake, you know, sort of uniquely suited for that? >> Well, let me talk a little bit about Sharethrough. We're ad tech, at the core of our business we run an ad exchange, where we're doing programmatic trading with the bids, with the real-time bidding spec. The data is very high in volume, with 12 billion impressions a month, that's a lot of bids that we have to process, a lot of bid requests.
The way it operates, the bids and the bid responses in programmatic trading are encoded in JSONs, so our ad exchange is basically exchanging messages in JSON with our business partners. And the JSONs are very complicated, there's a lot of richness and detail, such that the advertisers can decide whether or not they want to bid. Well, this data is very complicated, very high-volume. And in advertising, like any business, we really need to have good analytics to understand how our business is operating, how our publishers are doing, how our advertisers are doing. And it all depends upon this very high-volume, very complex JSON event data stream. So, Snowflake was able to ingest our high-volume data very gracefully. The JSON parsing techniques of Snowflake allow me to expose the complicated data structure in a way that's very transparent and usable to our analysts. Our use of Snowflake has replaced clunkier tools where the analysts basically had to be programmers, writing programs in Scala or something to do an analysis. And now, because we've transparently and easily exposed the complicated structures within Snowflake in a relational database, they can use good old-fashioned SQL to run their queries. Literally, an afternoon analysis is now a five-minute query. >> So, let me, as I'm listening to you describe this. We've had various vendors telling us about these workflows in the sort of data prep and data science tool chain. It almost sounds to me like Snowflake is taking semi-structured or complex data and it's sort of unraveling it, and normalizing is kind of an overloaded term, but it's making it business-ready, so you don't need as much of that manual data prep. >> Yeah, exactly, you don't need as much manual data prep, or you don't need as much expertise.
For instance, Snowflake's JSON capabilities, in terms of drilling down the JSON tree with dot path notation, or expanding nested objects is very expressive, very powerful, but still your typical analyst or your BI tool certainly wouldn't know how to do that. So, in Snowflake, we sort of have our cake and eat it too. We can have our JSONs with their full richness in our database, but yet we can simplify and expose the data elements that are needed for analysis, so that an analyst, their first day on the job, they can get right to work and start writing queries. >> So let me ask you about, a little more about the programmatic ad use case. So if you have billions of impressions per month, I'm guessing that means you have quite a few times more, in terms of bids, and then there's the, you know once you have, I guess a successful one, you want to track what happens. >> Correct. >> So tell us a little more about that, what that workload looks like, in terms of, what analytics you're trying to perform, what's your tracking? >> Yeah, well, you're right. There's different steps in our funnel. The impression request expands out by a factor of a dozen as we send it to all the different potential bidders. We track all that data, the responses come back, we track that, we track our decisions and why we selected the bidder. And then, once the ad is shown, of course there's various beacons and tracking things that fire. We'd have to track all of that data, and the only way we could make sense out of our business is by bringing all that data together. And in a way that is reliable, transparent, and visible, and also has data integrity, that's another thing I like about the Snowflake database is that it's a good old-fashioned SQL database that I can declare my primary keys, I can run QC checks, I can ensure high data integrity that is demanded by BI and other sorts of analytics. 
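Snowflake expresses both of the ideas David describes, dot-path traversal into nested JSON and primary-key QC checks, directly in SQL. As a language-neutral illustration only, here is a small Python sketch of the same two ideas; the event shape and field names are hypothetical, not Sharethrough's actual schema:

```python
import json
from collections import Counter

def extract_path(record, path):
    """Walk a nested dict with dot notation (e.g. 'bid.price'),
    returning None if any segment is missing, instead of raising."""
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current

def duplicate_keys(rows, key_field):
    """A basic QC check: return key values that occur more than once.
    An empty result means every event was captured exactly once."""
    counts = Counter(row[key_field] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)

# Hypothetical, heavily simplified bid-event records.
events = [json.loads(s) for s in (
    '{"id": "e1", "bid": {"price": 1.25, "advertiser": "acme"}}',
    '{"id": "e2", "bid": {"price": 0.80, "advertiser": "globex"}}',
    '{"id": "e1", "bid": {"price": 1.25, "advertiser": "acme"}}',
)]

print(extract_path(events[0], "bid.price"))    # 1.25
print(extract_path(events[0], "bid.missing"))  # None
print(duplicate_keys(events, "id"))            # ['e1'] - a duplicate beacon fire
```

The point of the sketch is the division of labor: the path traversal makes nested structures queryable without bespoke parsing code, and the key check is the kind of integrity test an analyst can run with plain SQL once the data is relational.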
>> What would be, as you continue to push the boundaries of the ad tech service, what's some functionality that you're looking to add, and Snowflake as your partner, either that's in there now that you still need to take advantage of or things that you're looking to in the future? >> Well, moving forward, of course, we, it's very important for us to be able to quickly gauge the effectiveness of new products. The ad tech market is fast-changing, there's always new ways of bidding, new products that are being developed, new ways for the ad ecosystem to work. And so, as we roll those out, we need to be able to quickly analyze, you know, "Is this thing working or not?" You know, kind of an agile environment, pivot or prove it. Does this feature work or not? So, having all the data in one place makes that possible for that very quick assessment of the viability of a new feature, new product. >> And, dropping down a little under the covers for how that works, does that mean, like you still have the base JSON data that you've absorbed, but you're going to expose it with different schemas or access patterns? >> Yeah, indeed. For instance, we make use of the SQL schemas, roles, and permissions internally where we can have the different teams have their own domain of data that they can expose internally, and looking forward, there's the share house feature of Snowflake that we're looking to implement with our partners, where, rather than sending them data, like a daily dump of data, we can give them access to their data in our database through this top layer that Michael mentioned, the service layer, essentially allows me to create a view grant select onto another customer. So I no longer have to send daily data dumps to partners or have some sort of API for getting data. They can simply query the data themselves so we'll be implementing that feature with our major partners. 
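The sharing pattern David describes, granting a partner select access on a view instead of shipping daily dumps, is done in Snowflake with SQL views and grants. Purely as a toy model of the semantics, with invented row and column names, it might look like this:

```python
def partner_view(events, partner_id, columns):
    """Toy model of view-based sharing: a partner queries a view
    restricted to its own rows and an approved set of columns,
    rather than receiving a daily data dump."""
    return [
        {col: row[col] for col in columns}
        for row in events
        if row["partner"] == partner_id
    ]

events = [
    {"partner": "p1", "impressions": 100, "revenue": 12.5, "internal_cost": 3.0},
    {"partner": "p2", "impressions": 80,  "revenue": 9.0,  "internal_cost": 2.5},
]

# Partner p1 sees only its own rows, and never the internal_cost column.
print(partner_view(events, "p1", ["impressions", "revenue"]))
# [{'impressions': 100, 'revenue': 12.5}]
```

The design point is that the data never moves: the provider keeps one copy and the view defines exactly which slice each consumer can query.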
>> I would be remiss in not asking at a data conference like this, now that there's the tie-in with CuBOL and Spark Integration and Machine Learning, is there anything along that front that you're planning to exploit in the near future? >> Well, yeah, Sharethrough, we're very experimental, playful, we're always examining new data technologies and new ways of doing things but now with Snowflake as sort of our data warehouse of curated data. I've got two petabytes of referential integrity data, and that is reliable. We can move forward into our other analyses and other uses of data knowing that we have captured every event exactly once, and we know exactly where it fits in a business context, in a relational manner. It's clean, good data integrity, reliable, accessible, visible, and it's just plain old SQL. (chuckles) >> That's actually a nice way to sum it up. We've got the integrity that we've come to expect and love from relational databases. We've got the flexibility of machine-oriented data, or JSON. But we don't have to give up the query engine, and then now you have more advanced features, analytic features that you can take advantage of coming down the pipe. >> Yeah, again we're a modern platform for the modern age, that's basically cloud-based computing. With a platform like Snowflake in the backend, you can now move those workloads that you're accustomed to to the cloud and have in the environment that you're familiar with, and it saves you a lot of time and effort. You can focus on more strategic projects. >> Okay, well, with that, we're going to take a short break. This has been George Gilbert, we're with Michael Nixon of Snowflake, and David Abercrombie of Sharethrough listening to how the most modern ad tech companies are taking advantage of the most modern cloud data warehouses. And we'll be back after a short break here at the Strata Data Conference, thanks. (quirky music)

Published Date : Mar 9 2018



Satyen Sangani, Alation | Big Data SV 2018


 

>> Announcer: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. (upbeat music) >> Welcome back to theCUBE, I'm Lisa Martin with John Furrier. We are covering our second day of our event Big Data SV. We've had some great conversations, John, yesterday, today as well. Really looking at Big Data, digital transformation, Big Data plus data science, lots of opportunity. We're excited to welcome back to theCUBE an alumni, Satyen Sangani, the co-founder and CEO of Alation. Welcome back! >> Thank you, it's wonderful to be here again. >> So you guys finished up your fiscal year at the end of December 2017; we're in the first quarter of 2018. You guys had some really strong results, really strong momentum. >> Yeah. >> Tell us what's going on at Alation, how are you pulling this momentum through 2018? >> Well, I think we have had an enterprise-focused business historically, because we solve a very complicated problem for very big enterprises, and so, in the last quarter we added customers like American Express, PepsiCo, Roche. And with huge expansions from our existing customers, some of whom, over the course of a year, I think went 12X from an initial base. And so, we found some just incredible momentum in Q4 and for us that was a phenomenal cap to a great year. >> What about the platform you guys are doing? Can you just take a minute to explain what Alation does again, just to refresh where you are on the product side? You mentioned some new accounts, some new use cases. >> Yeah. >> What's the update? Take a minute, talk about the update. >> Absolutely, so, you certainly know, John, but Alation's a data catalog, and a data catalog essentially, you can think of it as Yelp or Amazon for data and information inside of the enterprise.
So if you think about how many different databases there are, how many different reports there are, how many different BI tools there are, how many different APIs there are, how many different algorithms there are, it's pretty dizzying for the average analyst. It's pretty dizzying for the average CIO. It's pretty dizzying for the average chief data officer. And particularly, inside of Fortune 500s where you have hundreds of thousands of databases. You have a situation where people just have too much signal, or too much noise and not enough signal. And so what we do is we provide this Yelp for that information. You can come to Alation as a catalog. You can do a search on revenue 2017. You'll get all of the reports, all of the dashboards, all of the tables, all of the people that you might need to be able to find. And that gives you a single place of reference, so you can understand what you've got and what can answer your questions. >> What's interesting is, first of all, I love data. We're data driven, we're geeks on data. But when I start talking to folks that are outside the geek community or nerd community, you say data and they go, "Oh," because they cringe and they say, "Facebook." They see the data issues there. GDPR, data nightmare, where's it stored, you got to manage it. And then, people are actually using data, so they're realizing how hard (laughs) it is. >> Yeah >> How much data do we have? So it's kind of like a trough of disillusionment, if you will. Now they got to get their hands on it. They've got to put it to work. >> Yeah. >> And they know that. So, it's now becoming really hard (laughs) in their mind. This is business people. >> Yeah. >> They have data everywhere. How do you guys talk to that customer? Because, if you don't have quality data, if you don't have data you can trust, if you don't have the right people, it's hard to get it going. >> Yeah. >> How do you guys solve that problem and how do you talk to customers?
>> So we talk a lot about data literacy. There is a lot of data in this world and that data is just emblematic of all of the stuff that's going on in this world. There's lots of systems, there's lots of complexity and the data, basically, just is about that complexity. Whether it's weblogs, or sensors, or the like. And so, you can either run away from that data, and say, "Look, I'm going to not, "I'm going to bury my head in the sand. "I'm going to be a business. "I'm just going to forget about that data stuff." And that's certainly a way to go. >> John: Yeah. >> It's a way to go away. >> Not a good outlook. >> I was going to say, is that a way of going out of business? >> Or, you can basically train, it's a human resources problem fundamentally. You've got to train your people to understand how to use data, to become data literate. And that's what our software is all about. That's what we're all about as a company. And so, we have a pretty high bar for what we think we do as a business and we're this far into that. Which is, we think we're training people to use data better. How do you learn to think scientifically? How do you go use data to make better decisions? How do you build a data driven culture? Those are the sorts of problems that I'm excited to work on. >> Alright, now take me through how you guys play out in an engagement with the customer. So okay, that's cool, you guys can come in, we're getting data literate, we understand we need to use data. Where are you guys winning? Where are you guys seeing some visibility, both in terms of the traction of the usage of the product, the use cases? Where is it kind of coming together for you guys? >> Yeah, so we literally, we have a mantra. I think any early stage company basically wins because they can focus on doing a couple of things really well. And for us, we basically do three things. We allow people to find data. We allow people to understand the data that they find. 
And we allow them to trust the data that they see. And so if I have a question, the first place I start is, typically, Google. I'll go there and I'll try to find whatever it is that I'm looking for. Maybe I'm looking for a Mediterranean restaurant on 1st Street in San Jose. If I'm going to go do that, I'm going to do that search and I'm going to find the thing that I'm looking for, and then I'm going to figure out, out of the possible options, which one do I want to go to. And then I'll figure out whether or not the one that has seven ratings is the one that I trust more than the one that has two. Well, data is no different. You're going to have to find the data sets. And inside of companies, there could be 20 different reports and there could be 20 different people who have information, and so you're going to trust those people through having context and understanding. >> So, trust, people, collaboration. You mentioned some big brands that you guys added towards the end of calendar 2017. How do you facilitate these conversations with maybe the chief data officer. As we know, in large enterprises, there's still a lot of ownership over data silos. >> Satyen: Yep. >> What is that conversation like, as you say on your website, "The first data catalog designed for collaboration"? How do you help these organizations as large as Coca-Cola understand where all the data are and enable the human resources to extract values, and find it, understand it, and trust it? >> Yeah, so we have a very simple hypothesis, which is, look, people fundamentally have questions. They're fundamentally curious. So, what you need to do as a chief data officer, as a chief information officer, is really figure out how to unlock that curiosity. Start with the most popular data sets. Start with the most popular systems. Start with the business people who have the most curiosity and the most demand for information. And oh, by the way, we can measure that. Which is the magical thing that we do. 
So we can come in and say, "Look, "we look at the logs inside of your systems to know "which people are using which data sets, "which sources are most popular, which areas are hot." Just like a social network might do. And so, just like you can say, "Okay, these are the trending restaurants." We can say, "These are the trending data sets." And that curiosity allows people to know, what data should I document first? What data should I make available first? What data do I improve the data quality over first? What data do I govern first? And so, in a world where you've got tons of signal, tons of systems, it's totally dizzying to figure out where you should start. But what we do is, we go to these chief data officers and say, "Look, we can give you a tool and a catalyst so "that you know where to go, "what questions to answer, who to serve first." And you can use that to expand to other groups in the company. >> And this is interesting. A lot of people, you mentioned social networks, use data to optimize for something, and in the case of Facebook, they use my data to target ads for me. You're using data to actually say, "This is how people are using the data." So you're using data for data. (laughs) >> That's right. >> So you're saying-- >> Satyen: We're measuring how you can use data. >> And that's interesting because, I hear a lot of stories like, we bought a tool, we never used it. >> Yep. >> Or people didn't like the UI, it just kind of falls by the side. You're looking at it and saying, "Let's get it out there and let's see who's using the data." And then, are you doubling down? What happens? Do I get a little star, do I get a reputation point, am I being flagged to HR as a power user? How are you guys treating that gamification in this way? It's interesting, I mean, what happens? Do I become like-- >> Yeah, so it's funny because, when you think about search, how do you figure out that something's good?
So what Google did is, they came along and they've said, "We've got PageRank." What we're going to do is we're going to say, "The pages that are the best pages are the ones "that people link to most often." Well, we can do the same thing for data. The data sources that are the most useful ones are the people that are used most often. Now on top of that, you can say, "We're going to have experts put ratings," which we do. And you can say people can contribute knowledge and reviews of how this data set can be used. And people can contribute queries and reports on top of those data sets. And all of that gives you this really rich graph, this rich social graph, so that now when I look at something it doesn't look like Greek. It looks like, "Oh, well I know Lisa used this data set, "and then John used it "and so at least it must answer some questions "that are really intelligent about the media business "or about the software business. "And so that can be really useful for me "if I have no clue as to what I'm looking at." >> So the problem that you-- >> It's on how you demystify it through the social connections. >> So the problem that you solve, if what I hear you correctly, is that you make it easy to get the data. So there's some ease of use piece of it, >> Yep. >> cataloging. And then as you get people using it, this is where you take the data literacy and go into operationalizing data. >> Satyen: That's right. >> So this seems to be the challenge. So, if I'm a customer and I have a problem, the profile of your target customer or who your customers are, people who need to expand and operationalize data, how would you talk about it? >> Yeah, so it's really interesting. We talk about, one of our customers called us, sort of, the social network for nerds inside of an enterprise. And I think for me that's a compliment. (John laughing) But what I took from that, and when I explained the business of Alation, we start with those individuals who are data literate. 
The data scientists, the data engineers, the data stewards, the chief data officer. But those people have the knowledge and the context to then explain data to other people inside of that same institution. So in the same way that Facebook started with Harvard, and then went to the rest of the Ivies, and then went to the rest of the top 20 schools, and then ultimately to mom, and dad, and grandma, and grandpa. We're doing the exact same thing with data. We start with the folks that are data literate, we expand from there to a broader audience of people that don't necessarily have data in their titles, but have curiosity and questions. >> I like that on the curiosity side. You spent some time up at Strata Data. I'm curious, what are some of the things you're hearing from customers, maybe partners? Everyone used to talk about Hadoop, it was this big thing. And then there was a creation of data lakes, and swampiness, and all these things that are sort of becoming more complex in an organization. And with the rise of myriad data sources, the velocity, the volume, how do you help an enterprise understand and be able to catalog data from so many different sources? Is it that same principle that you just talked about in terms of, let's start with the lowest hanging fruit, start making the impact there and then grow it as we can? Or is an enterprise needs to be competitive and move really, really quickly? I guess, what's the process? >> How do you start? >> Right. >> What do people do? >> Yes! >> So it's interesting, what we find is multiple ways of starting with multiple different types of customers. And so, we have some customers that say, "Look, we've got a big, we've got Teradata, "and we've got some Hadoop, "and we've got some stuff on Amazon, "and we want to connect it all." And those customers do get started, and they start with hundreds of users, in some case, they start with thousands of users day one, and they just go Big Bang. 
And interestingly enough, we can get those customers enabled in a matter of weeks or months to go do that. We have other customers that say, "Look, we're going to start with a team of 10 people "and we're going to see how it grows from there." And, we can accommodate either model or either approach. From our perspective, you just have to have the resources and the investment corresponding to what you're trying to do. If you're going to say, "Look, we're going to have two dollars of budget, and we're not going to have the human resources, and the stewardship resources behind it," it's going to be hard to do the Big Bang. But if you're going to put the appropriate resources up behind it, you can do a lot of good. >> So, you can really facilitate the whole go big or go home approach, as well as the let's start small, think fast approach. >> That's right, and we always, actually ironically, recommend the latter. >> Let's start small, think fast, yeah. >> Because everybody's got a bigger appetite than they do the ability to execute. And what's great about the tool, and what I tell our customers and our employees all day long is, there's only one metric I track. So year over year, for our business, we basically grow in accounts, net of churn, by 55%. Year over year, and that's actually up from the prior year. And so from my perspective-- >> And what does that mean? >> So what that means is, the same customer gave us 55 cents more on the dollar than they did the prior year. Now that's best in class for most software businesses that I've heard. But what matters to me is not so much that growth rate in and of itself. What it means to me is this, that nobody's come along and says, "I've mastered my data. "I understand all of the information side of my company. "Every person knows everything there is to know." That's never been said.
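The "55 cents more on the dollar" figure Satyen cites is a net revenue retention metric: revenue from the same customer cohort, net of churn, compared year over year. A minimal sketch of the arithmetic, with invented customer names and figures:

```python
# Illustration of net revenue retention: compare what the *same* cohort
# of customers paid this year vs. last year, counting churned customers
# as zero. All names and dollar amounts below are invented.

def net_revenue_retention(prior_year, current_year):
    """Each argument maps customer -> annual revenue."""
    prior_total = sum(prior_year.values())
    # Only the prior-year cohort counts; absent customers churned.
    current_total = sum(current_year.get(c, 0.0) for c in prior_year)
    return current_total / prior_total

prior = {"acme": 100_000, "globex": 50_000, "initech": 50_000}
current = {"acme": 180_000, "globex": 130_000}  # initech churned
nrr = net_revenue_retention(prior, current)
assert round(nrr, 2) == 1.55  # 55 cents more on the dollar, net of churn
```

A ratio above 1.0 means expansion within existing accounts outweighed churn, which is the growth Satyen describes.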
So if we're solving a problem where customers are saying, "Look, we get, and we can find, and understand, "and trust data, and we can do that better this year "than we did last year, and we can do it even more "with more people," we're going to be successful. >> What I like about what you're doing is, you're bringing an element of operationalizing data for literacy and for usage. But you're really bringing this notion of a humanizing element to it. Where you see it in security, you see it in emerging ecosystems. Where there's a community of data people who know how hard it is and was, and it seems to be getting easier. But the tsunami of new data coming in, IOT data, whatever, and new regulations like GDPR. These are all more surface area problems. But there's a community coming together. How have you guys seen your product create community? Have you seen any data on that, 'cause it sounds like, as people get networked together, the natural outcome of that is possibly usage you attract. But is there a community vibe that you're seeing? Is there an internal collaboration where they sit, they're having meet ups, they're having lunches. There's a social aspect and a human aspect. >> No, it's human, and it's amazing. So in really subtle but really, really powerful ways. So one thing that we do for every single data source or every single report that we document, we just put who are the top users of this particular thing. So really subtly, day one, you're like, "I want to go find a report. "I don't even know "where to go inside of this really mysterious system". Post-Alation, you're able to say, "Well, I don't know where to go, but at least I can go call up John or Lisa," and say, "Hey, what is it that we know about this particular thing?" And I didn't have to know them. I just had to know that they had this report and they had this intelligence. So just by discovering people and who they are, you pick up on what people know.
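The "top users" annotation Satyen describes could be computed from access logs along these lines. This is a sketch with invented log entries and names, not Alation's implementation:

```python
from collections import Counter

# Sketch of the "top users" feature: for each data source or report,
# surface the people who access it most often, based on a usage log.
# The log entries and user names below are invented for illustration.

access_log = [
    ("revenue_report", "lisa"), ("revenue_report", "john"),
    ("revenue_report", "lisa"), ("media_dataset", "john"),
    ("revenue_report", "lisa"), ("media_dataset", "satyen"),
]

def top_users(log, asset, n=2):
    """Return the n most frequent users of the given asset."""
    counts = Counter(user for a, user in log if a == asset)
    return [user for user, _ in counts.most_common(n)]

assert top_users(access_log, "revenue_report") == ["lisa", "john"]
```

Surfacing those names next to an asset is what lets a newcomer "call up John or Lisa" instead of deciphering the data cold.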
>> So people are the new Google results. So you mentioned Google PageRank, which is web pages and relevance. You're taking a much more people approach to relevance. >> Satyen: That's right. >> To the data itself. >> That's right, and that builds community in very, very clear ways, because people have curiosity. Other people are the mechanism by which they satisfy that curiosity. And so that community builds automatically. >> They pay it forward, they know who to ask for help. >> That's right. >> Interesting. >> That's right. >> Last question, Satyen. The tag line, first data catalog designed for collaboration, is there a customer that comes to mind for you as really one that articulates that point exactly? Where Alation has come in and really kicked open the door, in terms of facilitating collaboration. >> Oh, absolutely. I was literally, this morning talking to one of our customers, Munich Reinsurance, the largest reinsurance company in the world. Their chief data officer said, "Look, three years ago, "we started with 10 people working on data. "Today, we've got hundreds. "Our aspiration is to get to thousands." We have three things that we do. One is, we actually discover insights. It's actually the smallest part of what we do. The second thing that we do is, we enable people to use data. And the third thing that we do is, drive a data driven culture. And for us, it's all about scaling knowledge, to centers in China, to centers in North America, to centers in Australia. And they've been doing that at scale. And they go to each of their people and they say, "Are you a data black belt, are you a data novice?" It's kind of like skiing. Are you a blue diamond or a black diamond? >> Always ski in pairs (laughs) >> That's right. >> And they do ski in pairs.
And what they end up ultimately doing is saying, "Look, we're going to train all of our workforce to become better, so that in three to 10 years, we're recognized as one of the most innovative insurance companies in the world." Three years ago, that was not the case. >> Process improvement at a whole other level. My final question for you is, for the folks watching or the folks that are going to watch this video, that could be a potential customer of yours, what are they feeling? If I'm the customer, what smoke signals am I seeing that say, I need to call Alation? What are some of the things that you've found that would tell a potential customer that they should be talkin' to you guys? >> Look, I think that they've got to throw out the old playbook. And this was a point that was made by some folks at a conference that I was at earlier this week. But they basically were saying, "Look, the DLNA's playbook was all about providing the right answer." Forget about that. Just allow people to ask the right questions. And if you let people's curiosity guide them, people are industrious, and ambitious, and innovative enough to go figure out what they need to go do. But if you see this as a world of control, where I'm going to just figure out what people should know and tell them what they're going to go know, that's going to be a pretty poor career to choose, because data's all about, sort of, freedom and innovation and understanding. And we're trying to push that along. >> Satyen, thanks so much for stopping by >> Thank you. >> and sharing how you guys are helping organizations, enterprises unlock data curiosity. We appreciate your time. >> I appreciate the time too. >> Thank you. >> And thanks John! >> And thank you. >> Thanks for co-hosting with me. For John Furrier, I'm Lisa Martin, you're watching theCUBE live from our second day of coverage of our event Big Data SV. Stick around, we'll be right back with our next guest after a short break. (upbeat music)

Published Date : Mar 9 2018


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
PepsiCo | ORGANIZATION | 0.99+
Lisa Martin | PERSON | 0.99+
Satyen Sangani | PERSON | 0.99+
John | PERSON | 0.99+
American Express | ORGANIZATION | 0.99+
Alation | ORGANIZATION | 0.99+
Roche | ORGANIZATION | 0.99+
Satyen | PERSON | 0.99+
thousands | QUANTITY | 0.99+
Lisa | PERSON | 0.99+
55 cents | QUANTITY | 0.99+
Australia | LOCATION | 0.99+
Amazon | ORGANIZATION | 0.99+
Coca-Cola | ORGANIZATION | 0.99+
2018 | DATE | 0.99+
10 people | QUANTITY | 0.99+
three | QUANTITY | 0.99+
John Furrier | PERSON | 0.99+
hundreds | QUANTITY | 0.99+
Yelp | ORGANIZATION | 0.99+
San Jose | LOCATION | 0.99+
China | LOCATION | 0.99+
Harvard | ORGANIZATION | 0.99+
Facebook | ORGANIZATION | 0.99+
two | QUANTITY | 0.99+
Today | DATE | 0.99+
2017 | DATE | 0.99+
55% | QUANTITY | 0.99+
second day | QUANTITY | 0.99+
North America | LOCATION | 0.99+
Google | ORGANIZATION | 0.99+
today | DATE | 0.99+
two dollars | QUANTITY | 0.99+
20 different people | QUANTITY | 0.99+
yesterday | DATE | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
last year | DATE | 0.99+
three years ago | DATE | 0.99+
first | QUANTITY | 0.99+
second thing | QUANTITY | 0.99+
One | QUANTITY | 0.99+
one | QUANTITY | 0.99+
first quarter of 2018 | DATE | 0.99+
20 different reports | QUANTITY | 0.99+
three things | QUANTITY | 0.98+
theCUBE | ORGANIZATION | 0.98+
last quarter | DATE | 0.98+
DLNA | ORGANIZATION | 0.98+
third thing | QUANTITY | 0.98+
Three years ago | DATE | 0.98+
each | QUANTITY | 0.98+
single | QUANTITY | 0.98+
both | QUANTITY | 0.98+
1st Street | LOCATION | 0.98+
Big Bang | EVENT | 0.98+
this year | DATE | 0.98+
Strata Data | ORGANIZATION | 0.97+
12 X | QUANTITY | 0.97+
GDPR | TITLE | 0.97+
seven ratings | QUANTITY | 0.96+
Alation | PERSON | 0.95+
this morning | DATE | 0.95+
Big Data SV 2018 | EVENT | 0.94+
first data | QUANTITY | 0.94+
Teradata | ORGANIZATION | 0.93+
10 years | QUANTITY | 0.93+

Ian Swanson, DataScience.com | Big Data SV 2018


 

(royal music) >> Announcer: John Cleese. >> There's a lot of people out there who have no idea what they're doing, but they have absolutely no idea that they have no idea what they're doing. Those are the ones with the confidence and stupidity who finish up in power. That's why the planet doesn't work. >> Announcer: Knowledgeable, insightful, and a true gentleman. >> The guy at the counter recognized me and said... Are you listening? >> John Furrier: Yes, I'm tweeting away. >> No, you're not. >> I tweet, I'm tweeting away. >> He is kind of rude that way. >> You're on your (bleep) keyboard. >> Announcer: John Cleese joins the Cube alumni. Welcome, John. >> John Cleese: Have you got any phone calls you need to answer? >> John Furrier: Hold on, let me check. >> Announcer: Live from San Jose, it's the Cube, presenting Big Data Silicon Valley, brought to you by Silicon Angle Media and its ecosystem partners. (busy music) >> Hey, welcome back to the Cube's continuing coverage of our event, Big Data SV. I'm Lisa Martin with my co-host, George Gilbert. We are down the street from the Strata Data Conference. This is our second day, and we've been talking all things big data, cloud data science. We're now excited to be joined by the CEO of a company called Data Science, Ian Swanson. Ian, welcome to the Cube. >> Thanks so much for having me. I mean, it's been a awesome two days so far, and it's great to wrap up my trip here on the show. >> Yeah, so, tell us a little bit about your company, Data Science, what do you guys do? What are some of the key opportunities for you guys in the enterprise market? >> Yeah, absolutely. My company's called datascience.com, and what we do is we offer an enterprise data science platform where data scientists get to use all they tools they love in all the languages, all the libraries, leveraging everything that is open source to build models and put models in production. 
Then we also provide IT the ability to manage this massive stack of tools that data scientists require, and it all boils down to one thing, and that is, companies need to use the data that they've been storing for years. It's about how you put that data into action. We give the tools to data scientists to get that data into action. >> Let's drill down on that a bit. For a while, we thought if we just put all our data in this schema-on-read repository, that would be nirvana. But it wasn't all that transparent, and we recognized we have to sort of go in and structure it somewhat, help us take the next couple steps. >> Ian: Yeah, the journey. >> From these partially curated data sets to something that turns into a model that is actionable. >> That's actually been the theme in the show here at the Strata Data Conference. If we went back years ago, it was, how do we store data. Then it was, how do we not just store and manage, but how do we transform it and get it into a shape that we can actually use. The theme of this year is how do we get it to that next step, the next step of putting it into action. To layer onto that, data scientists need to access data, yes, but then they need to be able to collaborate, work together, apply many different techniques, machine learning, AI, deep learning, these are all techniques of a data scientist to be able to build a model. But then there's that next step, and the next is, hey, I built this model, how do I actually get it in production? How does it actually get used? Here's the shocking thing. I was at an event where there were 500 data scientists in the audience, and I said, "Stand up if you worked on a model for more than nine months "and it never went into production." 90% of the audience stood up. That's the last mile that we're all still working on, and what's exciting is, we can make it possible today. >> Wanting to drill down into the sort of, it sounds like there's a lot of choice in the tools.
But typically, to do a pipeline, you either need well established APIs that everyone understands and plugs together with, or you need an end to end, single vendor solution that becomes the sort of collaboration backbone. How are you organized, how are you built? >> This might be self-serving, but at datascience.com, we have an enterprise data science platform, and we recommend a unified platform for data science. Now, that unified platform needs to be highly configurable. You need to make it so that in that workbench, you can use any tool that you want. Some data scientists might want to use a hammer, others want to be able to use a screwdriver over here. The power is how configurable it is, how extensible it is, how much open source you can adopt. The amazing trend that we've seen has been from proprietary solutions, going back decades, to now the rise of open source. Dozens if not hundreds of new machine learning libraries are being released every single day. We've got to give those capabilities to data scientists and make them scale. >> OK, and I think it's pretty easy to see how you would incorporate new machine learning libraries into a pipeline. But then there's also the tools for data preparation, and for feature extraction and feature engineering; you might even have some tools that help you with figuring out which algorithm to select. What holds all that together? >> Yeah, so orchestrating the enterprise data science stack is the hardest challenge right now. There has to be a company like us that is the glue, that is not just, do these solutions work together, but also, how do they collaborate, what is that workflow? What are those steps in that process? There's one thing that you might have left out, and that is, model deployment, model interpretation, model management. >> George: That's the black art, yeah. >> That's where this whole thing is going next.
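The "highly configurable" platform Ian describes, where each team plugs its preferred open source tool into each pipeline stage, could be sketched as a simple tool registry. The stage names, tool names, and API below are invented for illustration, not datascience.com's product:

```python
# Sketch of a pluggable pipeline-stage registry: teams register the open
# source tool they prefer for a stage, and the platform dispatches to it.
# Stage names, tool names, and the toy tools are invented.

class ToolRegistry:
    def __init__(self):
        self.tools = {}  # stage -> {tool name -> callable}

    def register(self, stage, name, fn):
        self.tools.setdefault(stage, {})[name] = fn

    def run(self, stage, name, data):
        return self.tools[stage][name](data)

registry = ToolRegistry()
# Two teams plug different tools into the same "prepare" stage.
registry.register("prepare", "dedupe", lambda rows: sorted(set(rows)))
registry.register("prepare", "lowercase", lambda rows: [r.lower() for r in rows])

assert registry.run("prepare", "dedupe", ["b", "a", "b"]) == ["a", "b"]
assert registry.run("prepare", "lowercase", ["A", "B"]) == ["a", "b"]
```

The design choice here is that the platform owns the stages and the glue, while the tools themselves stay swappable, which is the configurability being described.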
That was the exciting thing that I heard in terms of all these discussions with business leaders throughout the last two days: model deployment, model management. >> If I can kind of take this to maybe shift the conversation a little bit to the target audience. We've talked a lot about data scientists and needing to enable them. I'm curious about, we just talked with, a couple of guests ago, about the chief data officer. You work with enterprises; how common is the chief data officer role today? What are some of the challenges they've got that datascience.com can help them eliminate? >> Yeah, the CIO and the chief data officer, we have CIOs that have been selecting tools for companies to use, and now the chief data officer is sitting down with the CEO and saying, "How do we actually drive business results?" We work very closely with both of those personas. But on the CDO side, it's really helping them educate their teams on the possibilities of what could be realized with the data at hand, and making sure that IT is enabling the data scientists with the right tools. We supply the tools, but we also like to go in there with our customers and help coach, help educate on what is possible, and that helps with the CDO's mission. >> A question along that front. We've been talking about sort of empowering the data scientist, really from one end of the modeling life cycle all the way to deployment, which is currently the hardest part and least well supported. But we also have tons of companies that don't have data science trained people, or who are only modestly familiar. Where do, what do we do with them? How do we get those companies into the mainstream in terms of deploying this? >> I think whether you're a small company or a big company, digital transformation is the mandate.
Digital transformation is not just, how do I make a taxi company become Uber, or how do I make a speaker company become Sonos, the smart speaker, it's how do I exploit all the sources of my data to get better and improved operational processes, new business models, increased revenue, reduced operation costs. You could start small, and so we work with plenty of smaller companies. They'll hire a couple data scientists, and they're able to do small quick wins. You don't have to go sit in the basement for a year having something that is the thing, the unicorn in the business, it's small quick wins. Now we, my company, we believe in writing code, trained, educated, data scientists. There are solutions out there that you throw data at, you push a button, it gets an output. It's this magic black box. There's risk in that. Model interpretation, what are the features it's scoring on, there's risk, but those companies are seeing some level of success. We firmly believe, though, in hiring a data science team that is trained, you can start small, two or three, and get some very quick wins. >> I was going to say, those quick wins are essential for survivability, like digital transformation is essential, but it's also, I mean, to survival at a minimum, right? >> Ian: Yes. >> Those quick wins are presumably transformative to an enterprise being able to sustain, and then eventually, or ideally, be able to take market share from their competition. >> That is key for the CDO. The CDO is there pitching what is possible, he's pitching, she's pitching the dream. In order to be able to help visualize what that dream and the outcome could be, we always say, start small, quick wins, then from there, you can build. What you don't want to do is go nine months working on something and you don't know if there's going to be outcome. A lot of data science is trial and error. This is science, we're testing hypotheses. 
There's not always an outcome there, so small quick wins are something we highly recommend. >> A question, one of the things that we see more and more is the idea that actionable insights are perishable, and that latency matters. In fact, you almost have a budget for latency: in that short amount of time, the more features you can dynamically feed into a model to get a score, the better. Are you seeing more of that? How are the use cases that you're seeing, how's that pattern unfolding? >> Yeah, so we're seeing more streaming data use cases. We work with some of the biggest technology companies in the world, so IoT, connected services, streaming real time decisions that are happening. But then, also, there are so many use cases around the org, that could be marketing, finance, HR related, not just tech related. On the marketing side, imagine if you're customer service, and somebody calls you, and you know instantly the lifetime value of that customer, and it kicks off a totally new talk track, maybe gets escalated immediately to a new supervisor, because that supervisor can handle this top tier customer. These are decisions that can happen real time leveraging machine learning models, and these are things that, again, are small quick wins, but massive, massive impact. It's about decision process now. That's digital transformation. >> OK. Are you seeing patterns in terms of how much horsepower customers are budgeting for the training process, creating the model? Because we know it's very compute intensive; even Intel, some people call it high performance compute, like a supercomputer type workload. How much should people be budgeting? Because we don't see any guidelines or rules of thumb for this. >> I still think the boundaries are being worked out. There's a lot of great work that Nvidia's doing with GPU; we're able to do things faster on compute power.
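The real-time call-routing use case Ian described a moment ago, knowing a caller's lifetime value the instant the call arrives, could be sketched as follows. The scoring function, feature names, and escalation threshold are all invented stand-ins for a trained model:

```python
# Sketch of real-time model-driven call routing: look up a customer's
# predicted lifetime value (LTV) when a call arrives and escalate
# high-value callers. Model, features, and thresholds are invented.

def predicted_ltv(features):
    # Stand-in for a trained model: a simple linear scoring function.
    return (features["monthly_spend"] * 24
            + features["tenure_months"] * 50
            - features["support_tickets"] * 100)

def route_call(customer_features, escalate_above=10_000):
    ltv = predicted_ltv(customer_features)
    return "senior_supervisor" if ltv > escalate_above else "standard_queue"

caller = {"monthly_spend": 600, "tenure_months": 36, "support_tickets": 2}
assert route_call(caller) == "senior_supervisor"  # 14,400 + 1,800 - 200
```

In production the linear function would be a deployed model served at low latency, but the decision flow, score then route, is the same.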
But even if we just start from the basics, if you go and talk to a data scientist at a massive company where they have a team of over 1,000 data scientists, and you ask, to do this analysis, how do you spin up your compute power? Well, I go walk over to IT and I knock on the door, and I say, "Set up this machine, set up this cluster." That's ridiculous. A product like ours is able to instantly give them the compute power, scale it elastically with our cloud service partners or work with on-prem solutions, to get the power that you need to get the results in the time that's needed, quick, fast. In terms of the boundaries of the budget, that's still being defined. But at the end of the day, we are seeing return on investment, and that's what's key. >> Are you seeing a movement towards a greater scope of integration for the data science tool chain? Or is it that at the high end, where you have companies with 1,000 data scientists, they know how to deal with specialized components, whereas, when there's a smaller pool of expertise, the desire for end to end integration is greater? >> I think there's this kind of thought that is not necessarily right, and that is, if you have a bigger data science team, you're more sophisticated. We actually see the same sophistication level in a 1,000 person data science team, in many cases, as in a 20 person data science team, and sometimes the inverse, I mean, it's kind of crazy. But it's, how do we make sure that we give them the tools so they can drive value. Tools need to include collaboration and workflow, not just hammers and nails, but how do we work together, how do we scale knowledge, how do we get it in the hands of the line of business so they can use the results. It's that that is key. >> That's great, Ian. I also like that you really kind of articulated that start small, quick wins can make massive impact.
We want to thank you so much for stopping by the Cube and sharing that, and what you guys are doing at Data Science to help enterprises really take advantage of the value that data can really deliver. >> Thanks so much for having datascience.com on, really appreciate it. >> Lisa: Absolutely. George, thank you for being my co-host. >> You're always welcome. >> We want to thank you for watching the Cube. I'm Lisa Martin with George Gilbert, and we are at our event Big Data SV on day two. Stick around, we'll be right back with our next guest after a short break. (busy music)

Published Date : Mar 8 2018


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
George Gilbert | PERSON | 0.99+
Lisa Martin | PERSON | 0.99+
Ian Swanson | PERSON | 0.99+
George | PERSON | 0.99+
Ian | PERSON | 0.99+
Lisa | PERSON | 0.99+
Uber | ORGANIZATION | 0.99+
John Furrier | PERSON | 0.99+
Silicon Angle Media | ORGANIZATION | 0.99+
John | PERSON | 0.99+
John Cleese | PERSON | 0.99+
500 data scientists | QUANTITY | 0.99+
90% | QUANTITY | 0.99+
dozens | QUANTITY | 0.99+
Nvidia | ORGANIZATION | 0.99+
San Jose | LOCATION | 0.99+
20 person | QUANTITY | 0.99+
Data Science | ORGANIZATION | 0.99+
nine months | QUANTITY | 0.99+
1,000 person | QUANTITY | 0.99+
two | QUANTITY | 0.99+
two days | QUANTITY | 0.99+
more than nine months | QUANTITY | 0.99+
second day | QUANTITY | 0.99+
1,000 data scientists | QUANTITY | 0.99+
three | QUANTITY | 0.99+
Big Data SV | EVENT | 0.99+
over 1,000 data scientists | QUANTITY | 0.99+
Cube | ORGANIZATION | 0.99+
both | QUANTITY | 0.99+
Strata Data Conference | EVENT | 0.98+
one | QUANTITY | 0.98+
Intel | ORGANIZATION | 0.98+
Sonos | ORGANIZATION | 0.98+
one thing | QUANTITY | 0.97+
a year | QUANTITY | 0.96+
today | DATE | 0.95+
day two | QUANTITY | 0.95+
this year | DATE | 0.94+
single | QUANTITY | 0.92+
Big Data SV 2018 | EVENT | 0.88+
DataScience.com | ORGANIZATION | 0.87+
hundreds of new machine learning libraries | QUANTITY | 0.86+
lot of people | QUANTITY | 0.83+
decades | QUANTITY | 0.82+
every single day | QUANTITY | 0.81+
years ago | DATE | 0.77+
last two days | DATE | 0.76+
datascience.com | ORGANIZATION | 0.75+
one end | QUANTITY | 0.7+
years | QUANTITY | 0.67+
datascience.com | OTHER | 0.65+
couple steps | QUANTITY | 0.64+
Big Data | EVENT | 0.64+
couple of guests | DATE | 0.57+
couple | QUANTITY | 0.52+
Silicon Valley | LOCATION | 0.52+
things | QUANTITY | 0.5+
Cube | TITLE | 0.47+