Democratizing AI and Advanced Analytics with Dataiku x Snowflake

>> My name is Dave Vellante, and with me are two world-class technologists, visionaries, and entrepreneurs. Benoit Dageville co-founded Snowflake, and he's now the president of the product division. And Florian Douetteau is the co-founder and CEO of Dataiku. Gentlemen, welcome to theCUBE, two first-timers. Love it. >> Great to be here. >> Now, Florian, you and Benoit have a number of customers in common. And I have said many times on theCUBE that the first era of cloud was really about infrastructure, making it more agile, taking out costs. And the next generation of innovation is really coming from the application of machine intelligence to data, with the cloud as really the scale platform. So is that premise relevant to you? Do you buy that? And why do you think Snowflake and Dataiku make a good match for customers? >> I think it's because our values are aligned. It's all about lowering complexity for customers today, so you close the gap by democratizing access to data and access to technology. It's not only about data. Data is important, but it's also about the impact of data: how can you make the best out of data, as fast as possible and as easily as possible, within an organization? And another value is the openness of the platform, building the future together. I think of a platform that is not just about the platform but also about the full ecosystem of partners around it, bringing the level of accessibility and flexibility you need for the 10 years ahead. >> Yeah, so that's key. But it's not just data. It's turning data into insights. Now, Benoit, you came out of the world of very powerful but highly complex databases. And we all know that you and the Snowflake team get very high marks for really radically simplifying customers' lives. But can you talk specifically about the types of challenges that your customers are using Snowflake to solve? >> Yeah, so really the challenge before
Snowflake, I would say, was really to put all the data in one place and run all the compute, all the workloads that you wanted to run, against that data. And of course, existing legacy platforms were not able to support that level of concurrency, that many workloads. We talk about machine learning, data science, data engineering, data warehouse, and big data workloads; all of them running in one place didn't make sense at all. And therefore, what customers did was create silos, silos of data everywhere, with different systems each having a subset of the data. And of course, now you cannot analyze this data in one place. So at Snowflake, we really solved that problem by creating a single architecture where you can put all the data in the cloud. So it's really cloud native. We really thought about how to solve that problem, how to leverage the cloud and the elasticity of the cloud to really put all the data in one place, but at the same time not run all workloads in the same place. So each workload that runs in Snowflake has dedicated compute resources to run, and that makes it very agile, right? Florian talked about data scientists having to run analyses, so they need a lot of compute resources, but only for a few hours. And with Snowflake, they can add this new workload to the system and get the compute resources that they need to run it. And when it's over, they can shut down their system; it will be automatically shut down. Therefore, they do not pay for resources that they don't use. So it's a very agile system where you can do these analyses when you need to, and you have all the power to run all these workloads at the same time. >> Well, it's profound what you guys built, to me. I mean, of course, everybody's trying to copy it now.
It's like, I remember the notion of bringing compute to the data in the Hadoop days, and I think that, as I say, everybody is sort of following your suit now, or trying to. Florian, I've got to say, the first data scientist I ever interviewed on theCUBE was the amazing Hilary Mason, right after she started at Bitly. And she made data science sound so compelling. But data science is hard. So, same question for you. What do you see as the biggest challenges that customers are facing with data science? >> The biggest challenge, from my perspective, is that once you solve the issue of the data silo with Snowflake, you don't want to bring in another silo, which would be a silo of skills. Essentially, there is a gap between the talent available on the market, given how hard it is to actually find, recruit, and train data scientists, and what needs to be done. And so you actually need to simplify access to technology so that every organization can make it, whatever its talent, by bridging that gap. And to get there, there is a need to actually break up the silos with a collaborative approach, where technologists and the business work together and actually all put their hands into those data projects together. >> It makes sense. Florian, let's stay with you for a minute, if I can. Your observation space is pretty global, and so you have a unique perspective on how companies around the world might be using data and data science. Are you seeing any trends, maybe differences between regions or maybe within different industries? What are you seeing? >> Yeah, definitely. I do see trends that are not that much geographic, but much more in terms of the maturity of certain industries and certain sectors. Certain industries invested a lot in data, data access, and the ability to store data in the last few years, and have now reached a level of maturity where they can invest more and get to the next steps.
And it really relies on the ability of certain organizations to have built this long-term strategy a few years ago and now start reaping the benefits. >> You know, a decade ago, Florian, Hal Varian famously said that the sexy job in the next 10 years will be statisticians. And then everybody sort of changed that to data scientists. And then all the statisticians became data scientists, and they got a raise. But data science requires more than just statistics acumen. What skills do you see as critical for the next generation of data science? >> Yeah, it's a good question, because I think the first generation of data scientists became data scientists because they could learn some Python quickly and be flexible. And I think that the skills of the next generation of data scientists will definitely be different. It will be first about being able to speak the language of the business, meaning how you translate data insight, predictive modeling, all of this, into actionable insight or business impact. And it will be about how you collaborate with the rest of the business. It's not just how fast you can build something, how fast you can do a notebook in Python or build models of some sort. It's about how you actually build this bridge with the business. And obviously those things are important, but we also must be cognizant of the fact that technology will evolve in the future. There will be new tools and technologies, and data scientists will still need to keep this level of flexibility and understand quickly what the next tools or new languages are that they need to use to get there. >> As you look back on 2020, what are you thinking? What are you telling people as we head into next year? >> Yeah, I think it's very interesting, right? This crisis has taught us that the world really can change from one day to the next.
And this has dramatic and profound aspects. For example, some companies all of a sudden saw their revenue line dropping, and they had to do less with data. For some other companies it was the reverse, right? All of a sudden they were online, like Instacart, for example, and their business completely changed from one day to the other. So this agility of adjusting the resources that you have to the task, a need that can change, is something that using a solution like Snowflake really helps with. And we saw both in our customers: some customers, from one day to the next, were growing big time because their business benefited from COVID, but others had to drop. And what is nice with cloud is that it allows you to adjust compute resources to your business needs, and really adjust them within hours. The other aspect is understanding what is happening, right? You need to analyze. We saw that all our customers basically wanted to understand: what is going to be the impact on my business? How can I adapt? How can I adjust? And for that, they needed to analyze data. And of course, a lot of data, not necessarily data about their business, but also data from the outside. For example, COVID data: where is the state, what is the impact, the geographic impact of COVID, all the time. And access to this data is critical. So this is the promise of the Data Cloud, right? Having one single place where you can put all the data of the world.
So our customers all of a sudden started to consume the COVID data from our data marketplace, and we had literally thousands of customers looking at this data and analyzing it to make good decisions. So this agility, this adapting from one hour to the next, is really critical. And that goes with data, with cloud, with adjusting resources, and that doesn't exist on premises. So indeed, I think the lesson learned is that we are living in a world which is changing all the time, and we have to understand it. We have to adjust, and that's why cloud, in some ways, is great. >> Excellent. Thank you. You know, at theCUBE we like to talk about disruption, of course. Who doesn't? And also, I mean, you look at AI and the impact that it is beginning to have. And pre-COVID, you look at some of the industries that were getting disrupted. We talked about digital transformation, and you had on the one end of the spectrum industries like publishing or taxis, which are highly disrupted. And you could say, okay, well, that's bits versus atoms, the old Negroponte thing. But then on the flip side of that, look at financial services, which hadn't been dramatically disrupted. Certainly healthcare, which is ripe for disruption. Defense. So there were a number of industries that really hadn't leaned into digital transformation. If it ain't broke, don't fix it. Not on my watch. There was this complacency, and then, of course, COVID broke everything. So, Florian, I wonder if you could comment: what industry or industries do you think are going to be most impacted by data science and what I call machine intelligence, or AI, in the coming years and decades?
>> Honestly, I think it's all of them, or at least most of them. For some industries, the impact is very visible because we're talking about brand new products, drones, flying cars, or whatever, that are very visible to us. But for others, we are talking about changes in the way you operate as an organization. Even if the financial industry itself doesn't seem to be so impacted when you look at it from the consumer side or the outside, internally it's probably impacted, just because of the way you use data. The flexibility you need and the kind of cost gain you can get by leveraging the latest technologies are just enormous, and so it will actually transform that industry also. And overall, I think that 2020 is a year where, from the perspective of AI and analytics, we understood this idea of maturity and resilience. Maturity, meaning that when you've got a crisis, you actually need data and AI more than before. You need to actually call the people from data into the room to make better decisions and look forward and not backward. And I think that's a very important learning from 2020 that will tell us things about 2021. And resilience, it's like, yeah, data analytics today is a function concerning every industry and is so important that it's something that needs to work. So the infrastructure needs to work; the infrastructure needs to be super resilient. So probably not on-prem, or not fully on-prem, at some point. And it's the kind of resilience where you need to be able to plan for literally anything; no hypothesis in terms of behaviors can be taken for granted. And that's something that is new, and which is just signaling that we're getting to the next step for analytics. >> I wonder, Benoit, if you have anything to add to that. I mean, I often wonder, when are machines going to be able to make better diagnoses than doctors? Some people say they already can. Will the traditional banks in financial services lose control of payment systems?
What's going to happen to big retail stores? I mean, maybe bring us home with some of your final thoughts. >> Yeah, I would say I don't see that as a negative, right? The human being will always be involved very closely, but the machine and the data can really help see correlations in the data that would be impossible for a human being alone to discover. So I think it's going to be a complement, not a replacement. And everything that has made us faster doesn't mean that we have less work to do. It means that we can do more, and we have so much to do that I would not be worried about the effect of being more efficient and better at our work. And indeed, I fundamentally think that processing images, doing AI on these images, discovering patterns, and potentially flagging disease way earlier than was possible before is going to have a huge impact in health care. And as Florian was saying, every industry is going to be impacted by that technology. So yeah, I'm very optimistic. >> Great, guys. I wish we had more time. I've got to leave it there. But thanks so much for coming on theCUBE. It was really a pleasure having you.

Published Date : Nov 20 2020

SUMMARY :

Benoit Dageville co-founded And I have said many times on theCUBE that the first era of cloud was really about infrastructure, So you close the gap by democratizing access to data And we all know that you and the Snowflake team get very high marks for Yeah, so really the challenge before And so you actually need to simplify access to it's pretty global, and so you have a unique perspective on how companies the ability of certain organizations to have built this long-term strategy You know, a decade ago, Florian, Hal Varian famously said that the sexy job in the next And it will be about how you collaborate with the rest of the business. So our customers all of a sudden started to consume the COVID we talked about digital transformation, and you had on the one end of the spectrum industries what industry or industries do you think are going to be most impacted by data the kind of resilience where you need to be able to plan for literally I mean, I often wonder, when are machines going to be able to make better diagnoses processing images and doing AI on I've got to leave it there.

ENTITIES

Entity	Category	Confidence
Dave Volonte	PERSON	0.99+
Florian Duetto	PERSON	0.99+
Hilary Mason	PERSON	0.99+
Florian Hal Varian	PERSON	0.99+
Florian	PERSON	0.99+
Benoit	PERSON	0.99+
Ryan	PERSON	0.99+
Ben Wa	PERSON	0.99+
Data Aiko	ORGANIZATION	0.99+
2020	DATE	0.99+
10 years	QUANTITY	0.99+
Lee	PERSON	0.99+
Wa Dodgeville	PERSON	0.99+
next year	DATE	0.99+
python	TITLE	0.99+
Snowflake	ORGANIZATION	0.99+
first	QUANTITY	0.99+
one place	QUANTITY	0.99+
one hour	QUANTITY	0.98+
a decade ago	DATE	0.98+
Floyd	PERSON	0.98+
2021	DATE	0.98+
one day	QUANTITY	0.98+
both	QUANTITY	0.97+
today	DATE	0.97+
first generation	QUANTITY	0.96+
Adam	PERSON	0.93+
Onda	ORGANIZATION	0.93+
one single place	QUANTITY	0.93+
florian	PERSON	0.93+
each workload	QUANTITY	0.92+
one	QUANTITY	0.91+
four	QUANTITY	0.9+
few years ago	DATE	0.88+
thousands of customers	QUANTITY	0.88+
Cube	COMMERCIAL_ITEM	0.87+
first data scientist	QUANTITY	0.84+
single	QUANTITY	0.83+
Asai	PERSON	0.82+
two world	QUANTITY	0.81+
first era	QUANTITY	0.74+
next 10 years	DATE	0.74+
Negroponte	PERSON	0.73+
Zaveri	ORGANIZATION	0.72+
Dataiku	ORGANIZATION	0.7+
Cube	ORGANIZATION	0.64+
Ajai	ORGANIZATION	0.58+
years	DATE	0.57+
covitz	PERSON	0.53+
decades	QUANTITY	0.52+
Cube	PERSON	0.45+
Snowflake	TITLE	0.45+
Seidel	ORGANIZATION	0.43+
snowflake	EVENT	0.35+
Seidel	COMMERCIAL_ITEM	0.34+

Benoit Dageville and Florian Douetteau V1

>> Hello everyone, welcome back to theCUBE's wall-to-wall coverage of the Snowflake Data Cloud Summit. My name is Dave Vellante, and with me are two world-class technologists, visionaries, and entrepreneurs. Benoit Dageville co-founded Snowflake, and he's now the president of the Product division, and Florian Douetteau is the co-founder and CEO of Dataiku. Gentlemen, welcome to theCUBE, two first-timers, love it. >> Great time to be here. >> Now Florian, you and Benoit have a number of customers in common. And I've said many times on theCUBE that the first era of cloud was really about infrastructure, making it more agile, taking out costs. And the next generation of innovation is really coming from the application of machine intelligence to data, with the cloud as really the scale platform. So is that premise relevant to you, do you buy that? And why do you think Snowflake and Dataiku make a good match for customers? >> I think it's because our values align. It's all about lowering complexity for customers today, so you close the gap by democratizing access to data and access to technology. It's not only about data. Data is important, but it's also about the impact of data: how can you make the best out of data, as fast as possible and as easily as possible, within an organization? And another value is the openness of the platform, building a future together. I think of a platform that is not just about the platform but also about the ecosystem of partners around it, bringing the level of accessibility and flexibility you need for the 10 years ahead. >> Yes, so that's key, but it's not just data. It's turning data into insights. Now Benoit, you came out of the world of very powerful but highly complex databases. And we all know that you and the Snowflake team get very high marks for really radically simplifying customers' lives.
But can you talk specifically about the types of challenges that your customers are using Snowflake to solve? >> Yeah, so really the challenge before Snowflake, I would say, was to put all the data in one place and run all the compute, all the workloads that you wanted to run, against that data. And of course, existing legacy platforms were not able to support that level of concurrency, that many workloads. We talk about machine learning, data science, data engineering, data warehouse, and big data workloads; all of them running in one place didn't make sense at all. And therefore, what customers did was create silos, silos of data everywhere, with different systems each having a subset of the data. And of course, now you cannot analyze this data in one place. So at Snowflake, we really solved that problem by creating a single architecture where you can put all the data in the cloud. So it's really cloud native. We really thought about how to solve that problem, how to leverage the cloud and the elasticity of the cloud to really put all the data in one place, but at the same time not run all workloads in the same place. So each workload that runs in Snowflake has dedicated compute resources to run. And that makes it very agile, right? Florian talked about data scientists having to run analyses. So they need a lot of compute resources, but only for a few hours. And with Snowflake, they can add this new workload to the system and get the compute resources that they need to run it. And then when it's over, they can shut down their system. It will automatically shut down. Therefore, they do not pay for resources that they don't use. So it's a very agile system, where you can do these analyses when you need to, and you have all the power to run all these workloads at the same time. >> Well, it's profound what you guys built, to me. I mean, because everybody's trying to copy it now.
It's like, I remember the notion of bringing compute to the data in the Hadoop days. And I think that, as I say, everybody is sort of following your suit now, or trying to. Florian, I've got to say, the first data scientist I ever interviewed on theCUBE was the amazing Hilary Mason, right after she started at Bitly. And she made data science sound so compelling, but data science is hard. So same question for you. What do you see as the biggest challenges that customers are facing with data science? >> The biggest challenge, from my perspective, is that once you solve the issue of the data silo with Snowflake, you don't want to bring in another silo, which would be a silo of skills. Essentially, there is a gap between the talent available on the market, given how hard it is to actually find, recruit, and train data scientists, and what needs to be done. And so you actually need to simplify access to technology so that every organization can make it, whatever its talent, by bridging that gap. And to get there, there is a need to actually break up the silos with a collaborative approach, where technologists and the business work together and actually all put their hands into those data projects together. >> Yeah, it makes sense. So Florian, let's stay with you for a minute, if I can. Your observation space is pretty global. And so, you have a unique perspective on how companies around the world might be using data and data science. Are you seeing any trends, maybe differences between regions or maybe within different industries? What are you seeing? >> Yeah, definitely. I do see trends that are not that much geographic, but much more in terms of the maturity of certain industries and certain sectors. Certain industries invested a lot in data, data access, and the ability to store data in the last few years, and have now reached a level of maturity where they can invest more and get to the next steps.
And it really relies on the ability of certain organizations to have built this long-term data strategy a few years ago, and now start reaping the benefits. >> You know, a decade ago, Florian, Hal Varian famously said that the sexy job in the next 10 years will be statisticians. And then everybody sort of changed that to data scientists. And then all the statisticians became data scientists, and they got a raise. But data science requires more than just statistics acumen. What skills do you see as critical for the next generation of data science? >> Yeah, it's a good question, because I think the first generation of data scientists became data scientists because they could learn some Python quickly and be flexible. And I think that the skills of the next generation of data scientists will definitely be different. It will be first about being able to speak the language of the business, meaning how you translate data insight, predictive modeling, all of this, into actionable insights or business impact. And it will be about how you collaborate with the rest of the business. It's not just how fast you can build something, how fast you can do a notebook in Python or build models of some sort. It's about how you actually build this bridge with the business. And obviously those things are important, but we also must be cognizant of the fact that technology will evolve in the future. There will be new tools and technologies, and data scientists will still need to keep this level of flexibility and understand quickly what the next tools or new languages are that they need to use to get there. >> Thank you for that. Benoit, let's come back to you. This year has been tumultuous, to say the least, for everyone, but it's a good time to be in tech, ironically. And if you're in cloud, it's even better. But you look at Snowflake and Dataiku, you guys have done well, despite the economic uncertainty and the challenges of the pandemic.
As you look back on 2020, what are you thinking? What are you telling people as we head into next year? >> Yeah, I think it's very interesting, right? This crisis has taught us that the world really can change from one day to the next. And this has dramatic and profound aspects. For example, some companies all of a sudden saw their revenue line dropping, and they had to do less with data. And for some other companies it was the reverse, right? All of a sudden, they were online, like Instacart, for example, and their business completely changed from one day to the other. So this agility of adjusting the resources that you have to the task, a need that can change, is something that using a solution like Snowflake really helps with. And we saw both in our customers. Some customers, from one day to the next, were growing big time because their business benefited from COVID, but others had to drop. And what is nice with cloud is that it allows you to adjust compute resources to your business needs, and really adjust them within hours. The other aspect is understanding what is happening, right? You need to analyze. So we saw that all our customers basically wanted to understand: what is going to be the impact on my business? How can I adapt? How can I adjust? And for that, they needed to analyze data. And of course, a lot of data, not necessarily data about their business, but also data from the outside. For example, COVID data: where is the state, what is the impact, the geographic impact of COVID, all the time. And access to this data is critical. So this is the promise of the Data Cloud, right? Having one single place where you can put all the data of the world. So our customers, all of a sudden, started to consume the COVID data from our data marketplace. And we had literally thousands of customers looking at this data, analyzing this data to make good decisions.
So this agility, this adapting from one hour to the next, is really critical, and that goes with data, with cloud, with adjusting resources, and that doesn't exist on premises. So indeed, I think the lesson learned is that we are living in a world which is changing all the time, and we have to understand it. We have to adjust, and that's why cloud, in some ways, is great. >> Excellent, thank you. You know, on theCUBE, we like to talk about disruption, of course, who doesn't. And also, I mean, you look at AI and the impact that it's beginning to have. And pre-COVID, you look at some of the industries that were getting disrupted. Everybody talks about digital transformation, and you had on the one end of the spectrum industries like publishing or taxis, which are highly disrupted, and you can say, "Okay, well, that's bits versus atoms, the old Negroponte thing." But then on the flip side of this, look at financial services, which hadn't been dramatically disrupted, certainly healthcare, which is ripe for disruption, defense. So there were a number of industries that really hadn't leaned into digital transformation. If it ain't broke, don't fix it. Not on my watch. There was this complacency. And then of course COVID broke everything. So Florian, I wonder if you could comment: what industry or industries do you think are going to be most impacted by data science and what I call machine intelligence, or AI, in the coming years and decades? >> Honestly, I think it's all of them, or at least most of them. For some industries, the impact is very visible because we are talking about brand new products, drones, flying cars, or whatever, that are very visible to us. But for others, we are talking about changes in the way you operate as an organization. The financial industry itself may not seem so impacted when you look at it from the consumer side or the outside.
In fact, internally it's probably impacted, just because of the way you use data. The flexibility you need and the kind of cost gain you can get by leveraging the latest technologies are just enormous. And so it will actually transform that industry also. And overall, I think that 2020 is a year where, from the perspective of AI and analytics, we understood this idea of maturity and resilience. Maturity, meaning that when you've got a crisis, you actually need data and AI more than before. You need to actually call the people from data into the room to make better decisions and look forward and not backward. And I think that's a very important learning from 2020 that will tell us things about 2021. And resilience, it's like, yeah, data analytics today is a function concerning every industry, and is so important that it's something that needs to work. So the infrastructure needs to work; the infrastructure needs to be super resilient. So probably not on-prem, or not fully on-prem, at some point. And it's the kind of resilience where you need to be able to plan for literally anything; no hypothesis in terms of behaviors can be taken for granted. And that's something that is new, and which is just signaling that we are getting into the next step for data analytics. >> I wonder, Benoit, if you have anything to add to that. I mean, I often wonder, when are machines going to be able to make better diagnoses than doctors? Some people say they already can. Will the traditional banks in financial services lose control of payment systems? What's going to happen to big retail stores? I mean, maybe bring us home with some of your final thoughts. >> Yeah, I would say I don't see that as a negative, right? The human being will always be involved very closely, but the machine and the data can really help see correlations in the data that would be impossible for a human being alone to discover.
So, I think it's going to be a complement, not a replacement. And everything that has made us faster doesn't mean that we have less work to do. It means that we can do more, and we have so much to do that I would not be worried about the effect of being more efficient and better at our work. And indeed, I fundamentally think that processing images, doing AI on these images, discovering patterns, and potentially flagging disease way earlier than was possible before is going to have a huge impact in health care. And as Florian was saying, every industry is going to be impacted by that technology. So, yeah, I'm very optimistic. >> Great. Guys, I wish we had more time. We've got to leave it there, but thanks so much for coming on theCUBE. It was really a pleasure having you. >> [Benoit & Florian] Thank you. >> You're welcome, but keep it right there, everybody. We'll be back with our next guest right after this short break. You're watching theCUBE.

Published Date : Oct 21 2020

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Florian	PERSON	0.99+
Benoit	PERSON	0.99+
Florian Douetteau	PERSON	0.99+
Benoit Dageville	PERSON	0.99+
2020	DATE	0.99+
10 years	QUANTITY	0.99+
Dataiku	ORGANIZATION	0.99+
Hilary Mason	PERSON	0.99+
Python	TITLE	0.99+
Hal Varian	PERSON	0.99+
next year	DATE	0.99+
Snowflake	ORGANIZATION	0.99+
one place	QUANTITY	0.99+
both	QUANTITY	0.99+
one hour	QUANTITY	0.99+
Bitly	ORGANIZATION	0.99+
Snowflake Data Cloud Summit	EVENT	0.99+
a decade ago	DATE	0.98+
one day	QUANTITY	0.98+
theCUBE	ORGANIZATION	0.98+
first	QUANTITY	0.98+
each level	QUANTITY	0.98+
Snowflake	TITLE	0.98+
2021	DATE	0.97+
today	DATE	0.97+
first generation	QUANTITY	0.97+
pandemic	EVENT	0.97+
few years ago	DATE	0.93+
thousands of customers	QUANTITY	0.93+
single architecture	QUANTITY	0.92+
first era	QUANTITY	0.88+
Negroponte	PERSON	0.87+
first data scientist	QUANTITY	0.87+
Instacart	ORGANIZATION	0.87+
This year	DATE	0.86+
one single place	QUANTITY	0.86+
two	QUANTITY	0.83+
two world-	QUANTITY	0.78+
each workload	QUANTITY	0.78+
one	QUANTITY	0.76+
Adam	PERSON	0.74+
next 10 years	DATE	0.69+
first timers	QUANTITY	0.52+
COVID	OTHER	0.51+
COVID	ORGANIZATION	0.43+
COVID	EVENT	0.37+
decades	DATE	0.29+

Holly St. Clair, State of MA | Actifio Data Driven 2019

>> From Boston, Massachusetts, it's theCUBE, covering Actifio Data Driven 2019. Brought to you by Actifio. >> Welcome to Boston, everybody. This is Dave Vellante, and I'm here with Stu Miniman. Finally, Stu, in our hometown. You're watching theCUBE, the leader in live tech coverage. We're covering Actifio Data Driven, hashtag DataDriven19. Actifio is a company that started out focused on copy data management. They sort of popularized the term, the concept, the idea of data virtualization. There's big data, digital transformation, all the buzz; it's kind of been a tailwind for the company, and we've followed them quite closely over the years. Holly St. Clair is here. She's the Chief Digital Officer and Chief Data Officer of the state of Massachusetts. Holly, thanks for coming on theCUBE. >> Thanks for having me. >> So it's kind of rare that somebody shares the title of Chief Digital Officer and Chief Data Officer. >> I think it's rare right now. I think that will change. >> You think it will change? >> I think those two roles will come together. I just think data fuels our digital world. It both creates the content and also monitors how we're doing, and it's just inevitable, I think: either they're going to be joined at the hip or it's going to be the same person. >> That's interesting. I always thought the chief data officer sort of emerged from this wonky back-office role, data quality. >> Careful with the word wonky. >> Okay, well, yeah, let's talk about that. But the chief digital officer is kind of the mover, the shaker, has a little marketing genius. But okay, so you see those two roles coming together. That maybe makes sense. Why? Because there's some tension in a lot of organizations between those two roles? >> Well, I think the challenge with the way that people sometimes think about data is that they think it's only a technical process. Data is actually very creative, and you also have to tell a story in order to be good with it. It's the same thing as marketing, but it's just a little bit of a different hue, a different type of audience, a different type of pace. There's a technical component to the data work, but looking at my organization, I'm surrounded by additional technical folks: CTO, CSO, privacy officer, CIO. So we have a lot of support that might take away some of the roles that are otherwise scrunched in under the data officer or the digital officer.
>> So I used the term wonky before and it kind of triggered you a little bit. But you're a modeler, you're a data scientist, you're a developer, a programmer, right? >> No, but I know enough to read code and get in trouble. >> Okay, so you can direct coders, and you have data scientists working for you? >> Yeah. >> Right, so you've got that entire organization underneath you, and your mission is, fill in the blank. >> Our mission is to use the best information technology to ensure that every user's experience with the Commonwealth is fast, easy, and wicked awesome. >> Awesome. >> Holly, our team just got back from a very large public sector event down in DC, digging into how agencies are doing with cloud-first initiatives and how they're doing in the city environments. You're with the state of Massachusetts, which rolled out that first chief data, chief digital officer. Give us a little bit of insight into how Massachusetts is doing with these latest waves of innovation. >> Well, you know, we have our legacy systems, and as opportunities come up to improve those systems or reinvest in them, we are taking a step forward to cloud. We're not so dogmatic that it's cloud only, but it's definitely cloud when it's appropriate. I do think we'll always have some on-prem services, but really, when it's possible, whether it's a SaaS service off the shelf or a cloud environment, and it makes sense, then we are moving to that.
>> In your keynote this morning you talked about something called data minimalism. >> Yeah. >> I wonder if you could explain that for our audience, because for the longest time it's been, well, you want to hoard all the data, you want to get all the data, and, you know, what do you do with it? How do you manage it? >> Right, right. I mean, data's only as good as your ability to use it, and I often find that we're ingesting all this data and we don't really know what to do with it. Or rather, our business leaders and decision-makers can't quite figure out how to connect it to the mission, or how to properly interrogate the data to get the information they want. So this idea is one that's coming a little bit out of Europe, and out of some of the other trends we see in the cybersecurity and hacking worlds. It actually came from Fjord's digital trends for 2019: data minimalism. The idea is that you strongly connect your business objectives to the data collection program that you have. You don't collect data until you're sure that it supports your objectives. One of the things I also talked about in the keynote was not just data minimalism but a try, test, iterate approach. We often collect data hoping to see that we can create a change. I think we need to prove that we can create the change before we do a widespread, scalable data collection program, because often we collect data and still can't see in the data whether what we're doing has an effect: the signal's too strong or too weak, or you're asking the wrong question of the data, or it's the wrong collection technique. >> And that's largely driven by privacy? >> Privacy, and the reality of how costly it can sometimes be. You know, storage of data is cheap, but the actual reality of moving it and saving it and knowing where it is and accessing it later, that takes the time and energy of your actual people. So I think it's just important for us to think carefully about our resources. In government we sometimes have a little less resources than in the private sector, so we're very strategic about what we do, and I think we need to really think about the data we use.
>> The pendulum swings. I remember back to the days of, you know, 2006, when the Federal Rules of Civil Procedure said, okay, you've got to keep electronic records for whatever, seven years, depending on industry, and people said, okay, let's get rid of it as soon as we can. Data was viewed as a liability. And then of course came all the big data hype, which you talked about a little bit in your speech. Everybody said, I can collect everything, throw it into a data lake, and we all know those became data swamps. So do you feel like the pendulum is swinging and there's maybe a little balance? Are we reaching an equilibrium? Is it going to be a hard shift back to data as a liability? What are your thoughts? >> Well, I think as with any trend, there's always a little bit of a pendulum swing as we're learning where the equilibrium is. Equilibrium, I think that's a great word. I think the piece that I neglected to mention is the relationship to consumer trust. For us in government, we have to have the trust of our constituents. We do have a higher bar than the private sector in terms of handling data in a way that's respectful of individuals' privacy and the security of their data. So I think to the extent that we are able to lend transparency and show the utility of the data we're using, that will gain the trust of our users or customers. But if we continue to do things behind the scenes and are not overt about it, I think that can cause more problems. I think we have to ask ourselves as organizations: is having more data worth the vulnerability it introduces and the possible loss of our customers' trust? When you betray the trust of your customers, it's really hard to replace that. So to a certain extent I think we should be more deliberate about our data and earn the trust of our customers.
>> Okay, how does Massachusetts look at the boundary of data between the public sector and the private sector? I've talked to some states where, you know, we're helping businesses with parking by giving new mobile apps access to that information. You talked a little bit about health care. I've done interviews with the Mass Open Cloud initiative here locally. How do you look at that balance of sharing? >> I think it is a real balance. I don't think we do very much of it yet, and we certainly don't share data that we're not allowed to by law. We have very strict laws here in Massachusetts, stricter than most states. So I think it's very strategic when we do share data. We are looking for opportunities where we can. When I talk about demand-driven data, I look forward to opening the conversation a little bit to ask people what data they are looking for, to ask businesses and the different institutions we have throughout the Commonwealth: what data would help you do your job better and grow our economy and our jobs? I think that's a conversation we need to have over time to figure out what the right balance is. Some data will be easier for us to share than others, and some we'll never be able to share. >> The first data scientist I ever met is somebody I interviewed, the amazing Hilary Mason, and she said something that I want to tie back to something you said in your talk. She said one of the hardest parts of my job is that people come to me with data, and the most valuable thing I can do is show them which questions to ask. And you have talked about how a lot of times you don't know what questions to ask until you look at the data, or vice versa. What comes first, the chicken or the egg? What's your experience been? >> Well, I do think we need to be driven by the business objectives and goals. It doesn't mean there's not an iterative process in there somewhere, but, you know, as data wonks we can just throw data around all day long and still might not give you the answer they're looking for. So I think it's really important for us to be driven by the business. And I think executives don't know how to ask the questions of the data. They don't know how to interrogate it, or, honestly, more realistically, we don't have data that actually answers the question they want answered, so we often have to use proxies for that information. But I do think it's iterative after you get to a starting point. So I do think knowing what the business question is comes first.
>> I know you've got to go, but I want to ask you a last question and bring it back to the state, where we're both Massachusetts residents and use your services. It sounds like you're picking off some good wins with fast ROI. You mentioned driver's license renewals, et cetera. How about procurement? Has procurement been a challenge from the state's standpoint? Are you looking at the digital process and how to streamline procurement? >> That is a conversation that Secretary Wood is currently in, and I think it's a good one. I don't think we have any solutions yet, but I think we know a lot of the issues that we're struggling with. And we're not alone: the whole public sector is struggling with this type of procurement question. So we're working on it. >> All right, last question. Quick thoughts on what you've seen here? I know you're in and out, but Data Driven. >> Yeah, it's a great theme and a really exciting agenda. There are people from all these different organizations with different approaches to being data-driven, you know, from movie executives and casting to era. It's just really exciting to see the program. >> Holly St. Clair, thanks so much for coming on theCUBE. >> Thank you. >> Great to meet you. Okay, keep it right there, everybody. We'll be back with our next guest right after this short break. theCUBE is here at Data Driven, day one, special coverage. We'll be right back.

Published Date : Jun 19 2019

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Massachusetts	LOCATION	0.99+
Europe	LOCATION	0.99+
Boston	LOCATION	0.99+
Hilary Mason	PERSON	0.99+
2006	DATE	0.99+
two roles	QUANTITY	0.99+
DC	LOCATION	0.99+
seven years	QUANTITY	0.99+
Holly	PERSON	0.99+
Actifio	ORGANIZATION	0.99+
both	QUANTITY	0.98+
first	QUANTITY	0.97+
Boston Massachusetts	LOCATION	0.97+
ten most states	QUANTITY	0.95+
this morning	DATE	0.95+
Nate Claire	PERSON	0.94+
2019	DATE	0.93+
Federal Rules of Civil Procedure	TITLE	0.91+
MA	LOCATION	0.91+
Holly St. Clair	PERSON	0.9+
macleod	ORGANIZATION	0.85+
one	QUANTITY	0.84+
Claire	PERSON	0.83+
Commonwealth	ORGANIZATION	0.8+
first data	QUANTITY	0.79+
one of the things	QUANTITY	0.78+
Actifio 2019	EVENT	0.77+
waves of innovation	EVENT	0.71+
chief data officer	PERSON	0.67+
Commonwealth	LOCATION	0.64+
19 activity	QUANTITY	0.61+
Actifio	TITLE	0.56+
lot of times	QUANTITY	0.53+
the issues	QUANTITY	0.52+

Leigh Martin, Infor | Inforum DC 2018

>> Live from Washington, D.C., it's theCUBE! Covering Inforum D.C. 2018. Brought to you by Infor. >> Well, welcome back to Washington, D.C. We are live here at the Convention Center at Inforum 18, along with Dave Vellante, I'm John Walls. It's a pleasure now to welcome to theCUBE Leigh Martin, who is the Senior Director of the Dynamic Science Labs at Infor, and good afternoon to you, Leigh! >> Good afternoon, thank you for having me. >> Thanks for comin' on. >> Thank you for being here. Alright, well tell us about the Labs first off. Obviously, data science is a big push at Infor. What do you do there, and then why is data science such a big deal? >> So Dynamic Science Labs is based in Cambridge, Massachusetts. We have about 20 scientists with backgrounds in math and science areas, so typically PhDs in Statistics and Operations Research, and those types of areas. And we've really been working over the last several years to build solutions for Infor customers that are math and science based. So, we work directly with customers, typically through proof of concept, so we'll work directly with customers, we'll bring in their data, and we will build a solution around it. We like to see them implement it, and make sure we understand that they're getting the value back that we expect them to have. Once we prove out that piece of it, then we look for ways to deliver it to the larger group of Infor customers, typically through one of the Cloud Suites, perhaps functionality that's built into a Cloud Suite, or something like that. >> Well, give me an example, I mean it's so, as you think-- you're saying that you're using data that's math and science based, but, for application development or solution development if you will. How? 
>> So, I'll give you an example. We have a solution called Inventory Intelligence for Healthcare. It's moving towards a more generalized name of Inventory Intelligence, because we're going to move it out of the healthcare space and into other industries, but this is a product that we built over the last couple of years. We worked with a couple of customers, we brought in their Lawson data, so they're Lawson customers, we bring the data into an area where we can work on it. We have a scientist in our team, actually, she's one of the Senior Directors in the team, Dawn Rose, who led the effort to design and build this, design and build the algorithm underlying the product; and what it essentially does is, it allows hospitals to find the right level of inventory. Most hospitals are overstocked, so this gives them an opportunity to bring down their inventory levels to a manageable place without increasing stockouts. Obviously, it's very important in healthcare that you're not having a lot of stockouts. And so, we spent a lot of time working with these customers, really understanding what the data was like that they were giving to us, and then Dawn and her team built the algorithm that essentially says, here's what you've done historically, right? So it's based on historic data, at the item level, at the location level. What've you done historically, and how can we project out the levels you should have going forward, so that they're at the right level where you're saving money, but again, you're not increasing stockouts. So, it's a lot of time and effort to bring those pieces together and build that algorithm, and then test it out with the customers, try it out a couple of times, you make some tweaks based on their business process and exactly how it works. 
And then, like I said, we've now built that out into originally a stand-alone application, and in about a month, we're going to go live in Cloud Suite Financials, so it's going to be a piece of functionality inside of Cloud Suite Financials. >> So, John, if I may, >> Please. >> I'm going to digress for a moment here because the first data scientist that I ever interviewed was the famous Hilary Mason, who's of course now at Cloudera, but, and she told me at the time that the data scientist is a part mathematician, part scientist, part statistician, part data hacker, part developer, and part artist. >> Right. (laughs) >> So, you know it's an amazing field that Hal Varian, who is the Google Economist said, "It's going to be the hottest field, in the next 10 years." And this is sort of proven true, but Leigh, my question is, so you guys are practitioners of data science, and then you bring that into your product, and what we hear from a lot of data scientists, other than that sort of, you know, panoply of skill sets, is, they spend more time wrangling data, and the tooling isn't there for collaboration. How are you guys dealing with that? How has that changed inside of Infor? >> It is true. And we actually really focus on first making sure we understand the data and the context of the data, so it's really important if you want to solve a particular business problem that a customer has, to make sure you understand exactly what is the definition of each and every piece of data that's in all of those fields that they sent over to you, before you try to put 'em inside an algorithm and make them do something for you. So it is very true that we spend a lot of time cleaning and understanding data before we ever dive into the problem solving aspect of it. And to your point, there is a whole list of other things that we do after we get through that phase, but it's still something we spend a lot of time on today, and that has been the case for, a long time now. 
We, wherever we can, we apply new tools and new techniques, but actually just the simple act of going in there and saying, "What am I looking at, how does it relate?" Let me ask the customer to clarify this to make sure I understand exactly what it means. That part doesn't go away, because we're really focused on solving the customer's problem and then making sure that we can apply that to other customers, so really knowing what the data is that we're working with is key. So I don't think that part has actually changed too much. There are certainly tools that you can look at. People talk a lot about visualization, so you can start thinking, "Okay, how can I use some visualization to help me understand the data better?" But that whole act of understanding data is key and core to what we do, because we want to build the solution that really answers the business problem. >> The other thing that we hear a lot from data scientists is that they help you figure out what questions you actually have to ask. So, it sort of starts with the data, they analyze the data, maybe you visualize the data, as you just pointed out, and all these questions pop out. So what is the process that you guys use? You have the data, you've got the data scientist, you're looking at the data, you're probably asking all these questions. You, of course, get questions from your customers as well. You're building models maybe to address those questions, training the models to get better and better and better, and then you infuse that into your software. So, maybe, is that the process? Is it a little more complicated than that? Maybe you could fill in the gaps. >> Yeah, so, my personal opinion, and I think many of my colleagues would agree with me on this, is that starting with the business problem, for us, is really the key. There are ways to go about looking at the data and then pulling out the questions from the data, but generally, that is a long and involved process. 
Because, it takes a lot of time to really get that deep into the data. So when we work, we really start with, what's the business problem that the customer's trying to solve? And then, what's the data that needs to be available for us to be able to solve that? And then, build the algorithm around that. So for us, it's really starting with the business problem. >> Okay, so what are some of the big problems? We heard this morning, that there's a problem in that, there's more job openings than there are candidates, and productivity, business productivity is not being impacted. So there are two big chewy problems that data scientists could maybe attack, and you guys seem to be passionate about those, so. How does data science help solve those problems? >> So, I think that, at Infor, I'll start off by saying at Infor there's actually, I talked about the folks that are in our office in Cambridge, but there's quite a bit of data science going on outside of our team, and we are the data science team, but there are lots of places inside of Infor where this is happening. Either in products that contains some sort of algorithmic approach, the HCM team for sure, the talent science team which works on HCM, that's a team that's led by Jill Strange, and we work with them on certain projects in certain areas. They are very focused on solving some of those people-related problems. For us, we work a little bit more on the, some of the other areas we work on is sort of the manufacturing and distribution areas, we work with the healthcare side of things, >> So supply chain, healthcare? >> Exactly. So some of the other areas, because they are, like I said, there are some strong teams out there that do data science, it's just, it's also incorporated with other things, like the talent science team. So, there's lots of examples of it out there. 
In terms of how we go about building it, so we, like I was saying, we work on answering the business question upfront, understanding the data, and then really sitting with the customer and building that out. And so the problems that come to us are often through customers who have particular things that they want to answer. So, a lot of it is driven by customer questions, and particular problems that they're facing. Some of it is driven by us. We have some ideas about things that we think would be really useful to customers. Either way, it ends up being a customer collaboration with us, with the product team that eventually we'll want to roll it out to, to make sure that we're answering the problem in the way that the product team really feels it can be rolled out to customers, and better used, and more easily used by them. >> I presume it's a non-linear process. It's not like somebody comes to you with a problem, and it's okay, we're going to go look at that. Okay now, we got an answer, I mean it's-- Are you more embedded into the development process than that? Can you just explain that? >> So, we do have a development team in Prague that does work with us, and it's depending on whether we think we're going to actually build a more-- a product with aspects to it like a UI, versus just a back end solution. Depends on how we've decided we want to proceed with it. So, for example, I was talking about Inventory Intelligence for Healthcare. We also have Pricing Science for Distribution. Both of those were built initially with UIs on them, and customers could buy those separately. Now that we're in the Cloud Suites, those are both being incorporated into the Cloud Suite. So, going back to where I was talking about our team in Prague, we sometimes build product, sort of a fully encased product, working with them, and sometimes we work very closely with the development teams from the various Cloud Suites. 
And the product management team is always there to help us, to figure out sort of the long term plan and how the different pieces fit together. >> You know, kind of big picture, you've got AI right, and then machine learning, pumping all kinds of data your way. So, in a historical time frame, this is all pretty new, this confluence right? And in terms of development, but, where do you see it like 10 years from now, 20 years from now? What potential is there, we've talked about human potential, unlocking human potential, we'll unlock it with that kind of technology, what are we looking at, do you think? >> You know, I think that's such a fascinating area, and area of discussion, and sort of thinking, forward thinking. I do believe in sort of this idea of augmented intelligence, and I think Charles was talking a little bit about, about that this morning, although not in those particular terms; but this idea that computers and machines and technology will actually help us do better, and be better, and being more productive. So this idea of doing sort of the rote everyday tasks, that we no longer have to spend time doing, that'll free us up to think about the bigger problems, and hopefully, and my best self wants to say we'll work on famine, and poverty, and all those problems in the world that, really need our brains to focus on, and work. And the other interesting part of it is, if you think about, sort of the concept of singularity, and are computers ever going to actually be able to think for themselves? That's sort of another interesting piece when you talk about what's going to happen down the line. Maybe it won't happen in 10 years, maybe it will never happen, but there's definitely a lot of people out there, who are well known in sort of tech and science who talk about that, and talk about the fears related to that. That's a whole other piece, but it's fascinating to think about 10 years, 20 years from now, where we are going to be on that spectrum? 
>> How do you guys think about bias in AI and data science, because, humans express bias, tribalism, that's inherent in human nature. If machines are sort of mimicking humans, how do you deal with that and adjudicate? >> Yeah, and it's definitely a concern, it's another, there's a lot of writings out there and articles out there right now about bias in machine learning and in AI, and it's definitely a concern. I actually read, so, just being aware of it, I think is the first step, right? Because, as scientists and developers develop these algorithms, going into it consciously knowing that this is something they have to protect against, I think is the first step, for sure. And then, I was just reading an article just recently about another company (laughs) who is building sort of a, a bias tracker, so, a way to actually monitor your algorithm and identify places where there is perhaps bias coming in. So, I do think we'll see, we'll start to see more of those things, it gets very complicated, because when you start talking about deep learning and networks and AI, it's very difficult to actually understand what's going on under the covers, right? It's really hard to get in and say this is the reason why, your AI told you this, that's very hard to do. So, it's not going to be an easy process but, I think that we're going to start to see that kind of technology come. >> Well, we heard this morning about some sort of systems that could help, my interpretation, automate, speed up, and minimize the hassle of performance reviews. >> Yes. (laughs) >> And that's the classic example of, an assertive woman is called abrasive or aggressive, an assertive man is called a great leader, so it's just a classic example of bias. I mentioned Hilary Mason, rock star data scientist happens to be a woman, you happen to be a woman. Your thoughts as a woman in tech, and maybe, can AI help resolve some of those biases? >> Yeah. 
Well, first of all I want to say, I'm very pleased to work in an organization where we have some very strong leaders who happen to be women. So I mentioned Dawn Rose, who designed our IIH solution, and I mentioned Jill Strange, who runs the talent science organization. Half of my team is women, so, particularly inside of sort of the science area inside of Infor, I've been very pleased with the way we've built out some of that skill set. And I'm also an active member of WIN, so the Women's Infor Network is something I'm very involved with. So, I meet a lot of people across our organization, a lot of women across our organization who are just really strong technology supporters, really intelligent, sort of go-getter type of people, and it's great to see that inside of Infor. I think there's a lot of work to be done, for sure. And you can always find stories, whether it's coming out of Silicon Valley or other places, where you hear some really sort of arcane sounding things that are still happening in the industry, and so, some of those things, it's disappointing, certainly, to hear that. But I think Van Jones said something this morning, and I liked the way he said it, and I'm not going to be able to say it exactly, but he said something along the lines of, "The ground is there, the formation is starting, to get us moving in the right direction," and I think, I'm hopeful for the future, that we're heading in that way. And I think, you know, again, he sort of said something like, "Once the ground swell starts going in that direction, people will really jump in, and will see the benefits of being more diverse." Whether it's having more women, or having more people of color, however things expand, that's just going to make us all better, and more efficient, and more productive, and I think that's a great thing. >> Well, and I think there's a spectrum, right? 
And on one side of the spectrum, there's intolerable and unacceptable behavior, which should just be zero tolerance in my opinion, and that's a passion of ours at theCUBE. The other side of that spectrum is inclusion, and it's a challenge that we have as a small company, and I remember having a conversation earlier this year with an individual. And we talked about quotas, and I don't think that's the answer. Her comment was, "No, that's not the answer, you have to endeavor to reach deeper beyond your existing network." Which is hard sometimes for us, 'cause you're so busy, you're running around, it's like okay, it's the convenient thing to do. But you've got to peel the onion on that network, and actually take the extra time and make it a priority. I mean, your thoughts on that? >> No, I think that's a good point. I mean, if I think about who my circle is, right? And the people that I know and I interact with. If I only reach out to the smallest group of people, I'm not getting really out beyond my initial circle. So I think that's a very good point, and I think that we have to find ways to be more interactive, and pull from different areas. And I think it's interesting, so coming back to data science for a minute, if you sort of think about the evolution of how we got to today, where now we're really pulling people from science areas, and math areas, and technology areas, and data scientists are coming from lots of places, right? And you don't always have to have a PhD, right? You don't necessarily have to come up through that system to be a good data scientist. And I think to see more of that, and really people going beyond just sort of the traditional circles and the traditional paths to really find people that you wouldn't normally identify, to bring into that path, is going to help us, just in general, be more diverse in our approach. >> Well, it certainly seems like it's embedded in the company culture. 
I think that's a great reason for you to be so optimistic going forward, not only about your job, but about the way the company's going about doing your job. >> What would you advise young people generally, who want to crack into the data science field, but specifically women, who clearly are underrepresented in technology? >> Yeah, so, I think we're starting to see more and more women enter the field, again it's one of those, people know it, and so there's less of a-- because people are aware of it, there's more tendency to be more inclusive. But I definitely think, just go for it, right? I mean if it's something you're interested in, and you want to try it out, go to a coding camp, and take a science class, and there's so many online resources now, I mean, there's the massive online courses that you can take. So, even if you're hesitant about it, there are ways you can kind of be at home, and try it out, and see if that's the right thing for you. >> Just dip your toe in the water. >> Yes, exactly, exactly! Try it out and see, and then just decide if that's the right thing for you, but I think there's a lot of different ways to sort of check it out. Again, you can take a course, you can actually get a degree, there's a wide range of things that you can do to kind of experiment with it, and then find out if that's right for you. >> And if you're not happy with the hiring opportunities out there, just start a company, that's my advice. >> That's right. (laughing together) >> Agreed, I definitely agree! >> We thank you-- we appreciate the time, and great advice, too. >> Thank you so much. >> Leigh Martin joining us here at Inforum 18, we are live in Washington, D.C., you're watching the exclusive coverage, right here, on theCUBE. (bubbly music)

Published Date : Sep 25 2018


ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Hilary Mason	PERSON	0.99+
John Walls	PERSON	0.99+
Hal Varian	PERSON	0.99+
Jill Strange	PERSON	0.99+
Dynamic Science Labs	ORGANIZATION	0.99+
John	PERSON	0.99+
Leigh Martin	PERSON	0.99+
Washington, D.C.	LOCATION	0.99+
Cambridge	LOCATION	0.99+
Prague	LOCATION	0.99+
Silicon Valley	LOCATION	0.99+
Charles	PERSON	0.99+
Leigh	PERSON	0.99+
Infor	ORGANIZATION	0.99+
Van Jones	PERSON	0.99+
Dawn	PERSON	0.99+
WIN	ORGANIZATION	0.99+
first step	QUANTITY	0.99+
Cloudera	ORGANIZATION	0.99+
Dawn Rose	PERSON	0.99+
Cambridge, Massachusetts	LOCATION	0.99+
Cloud Suite	TITLE	0.99+
Women's Infor Network	ORGANIZATION	0.98+
Convention Center	LOCATION	0.98+
one	QUANTITY	0.98+
today	DATE	0.98+
both	QUANTITY	0.97+
10 years	QUANTITY	0.97+
this morning	DATE	0.96+
Cloud Suites	TITLE	0.96+
first	QUANTITY	0.96+
one side	QUANTITY	0.95+
Cloud Suite Financials	TITLE	0.93+
each	QUANTITY	0.92+
two big chewy problems	QUANTITY	0.92+
about 20 scientists	QUANTITY	0.92+
D.C.	LOCATION	0.9+
earlier this year	DATE	0.9+
20 years	QUANTITY	0.88+
last couple of years	DATE	0.88+
DC	LOCATION	0.87+
first data scientist	QUANTITY	0.85+
Inforum 18	ORGANIZATION	0.83+
Google	ORGANIZATION	0.79+
Half of my team	QUANTITY	0.76+
years	DATE	0.75+
couple	QUANTITY	0.74+
Inventory Intelligence	TITLE	0.71+
years	QUANTITY	0.69+
HCM	ORGANIZATION	0.68+
about a month	QUANTITY	0.68+
next 10 years	DATE	0.68+
2018	DATE	0.66+
20	DATE	0.63+
theCUBE	ORGANIZATION	0.62+
last	DATE	0.55+
Inforum	ORGANIZATION	0.54+
zero	QUANTITY	0.52+
Economist	TITLE	0.51+
Cloud	TITLE	0.49+
Inventory	ORGANIZATION	0.47+
Inforum	EVENT	0.42+

Mick Hollison, Cloudera | theCUBE NYC 2018

(lively peaceful music) >> Live, from New York, it's The Cube. Covering "The Cube New York City 2018." Brought to you by SiliconANGLE Media and its ecosystem partners. >> Well, everyone, welcome back to The Cube special conversation here in New York City. We're live for Cube NYC. This is our ninth year covering the big data ecosystem, now evolved into AI, machine learning, cloud. All things data in conjunction with Strata Conference, which is going on right around the corner. This is the Cube studio. I'm John Furrier. Dave Vellante. Our next guest is Mick Hollison, who is the CMO, Chief Marketing Officer, of Cloudera. Welcome to The Cube, thanks for joining us. >> Thanks for having me. >> So Cloudera, obviously we love Cloudera. Cube started in Cloudera's office, (laughing) everyone in our community knows that. I keep, keep saying it all the time. But we're so proud to have the honor of working with Cloudera over the years. And, uh, the thing that's interesting though is that the new building in Palo Alto is right in front of the old building where the first Palo Alto office was. So, a lot of success. You have a billboard in the airport. Amr Awadallah is saying, hey, it's a milestone. You're in the airport. But your business is changing. You're reaching new audiences. You have, you're public. You guys are growing up fast. All the data is out there. Tom's doing a great job. But, the business side is changing. Data is everywhere, it's a big, hardcore enterprise conversation. Give us the update, what's new with Cloudera. >> Yeah. Thanks very much for having me again. It's, it's a delight. I've been with the company for about two years now, so I'm officially part of the problem now. (chuckling) It's been a, it's been a great journey thus far. And really the first order of business when I arrived at the company was, like, welcome aboard. We're going public. Time to dig into the S-1 and reimagine who Cloudera is going to be five, ten years out from now. 
And we spent a good deal of time, about three or four months, actually crafting what turned out to be just 38 total words and kind of a vision and mission statement. But the most central to those was what we were trying to build. And it was a modern platform for machine learning and analytics in the cloud. And each of those words, when you unpack them a little bit, are very, very important. And this week, at Strata, we're really happy on the modern platform side. We just released Cloudera Enterprise Six. It's the biggest release in the history of the company. There are now over 30 open-source projects embedded into this, something that Amr and Mike could have never imagined back in the day when it was just a couple of projects. So, a very, very large and meaningful update to the platform. The next piece is machine learning, and Hilary Mason will be giving the kickoff tomorrow, and she's probably forgotten more about ML and AI than somebody like me will ever know. But she's going to give the audience an update on what we're doing in that space. But the foundation of having that data management platform is absolutely fundamental and necessary to do good machine learning. Without good data, without good data management, you can't do good ML or AI. Sounds sort of simple but very true. And then the last thing that we'll be announcing this week is around the analytics space. So, on the analytics side, we announced Cloudera Data Warehouse and Altus Data Warehouse, which is a PaaS flavor of our new data warehouse offering. And last, but certainly not least, is just the "optimize for the cloud" bit. So, everything that we're doing is optimized not just around a single cloud but around multi-cloud, hybrid-cloud, and really trying to bridge that gap for enterprises and what they're doing today. So, it's a new Cloudera to say the very least, but it's all still based on that core foundation and platform that you got to know us with very early on. 
>> And you guys have operating history too, so it's not like it's a pivot for Cloudera. I know for a fact that you guys had very large-scale customers, both with three letter, letters in them, the government, as well as just commercial. So, that's cool. Question I want to ask you is, as the conversation changes from, how many clusters do I have, how am I storing the data, to what problems am I solving because of the enterprises. There's a lot of hard things that enterprises want. They want compliance, all these, you know things that have either legacy. You guys work on those technical products. But, at the end of the day, they want the outcomes, they want to solve some problems. And data is clearly an opportunity and a challenge for large enterprises. What problems are you guys going after, these large enterprises in this modern platform? What are the core problems that you guys knock down? >> Yeah, absolutely. It's a great question. And we sort of categorize the way we think about addressing business problems into three broad categories. We use the terms grow, connect, and protect. So, in the "grow" sense, we help companies build or find new revenue streams. And, this is an amazing part of our business. You see it in everything from doing analytics on clickstreams and helping people understand what's happening with their web visitors and the like, all the way through to people standing up entirely new businesses based simply on their data. One large insurance provider that is a customer of ours, as an example, has taken on the challenge and asked us to engage with them on building really, effectively, insurance as a service. So, think of it as data-driven insurance rates that are gauged based on your driving behaviors in real time. So no longer simply just using demographics as the way that you determine, you know, all 18-year old young men are poor drivers. As it turns out, with actual data you can find out there's some excellent 18 year olds. 
>> Telematics, not demographics! >> Yeah, yeah, yeah, exactly! >> That Tesla don't connect to the >> Exactly! And parents will love this as well, I think. So they can find out exactly how their kids are really behaving, by the way. >> They're going to know I rolled through the stop signs in Palo Alto. (laughing) My rates just went up. >> Exactly, exactly. So, helping people grow new businesses based on their data. The second piece is "Connect". This is not just simply connecting devices, but that's a big part of it, so the IOT world is a big engine for us there. One of our favorite customer stories is a company called Komatsu. It's a mining manufacturer. Think of it as the ones that make those just massive machines that are all over the world. They're particularly big in Australia. And this is equipment that, when you leave it sit somewhere, because it doesn't work, it actually starts to sink into the earth. So, being able to do predictive maintenance on that level and type and expense of equipment is very valuable to a company like Komatsu. We're helping them do that. So that's the "Connect" piece. And last is "Protect". Since data is in fact the new oil, the most valuable resource on earth, you really need to be able to protect it. Whether that's from a cyber security threat or it's just meeting compliance and regulations that are put in place by governments. Certainly GDPR has got a lot of people thinking very differently about their data management strategies. So we're helping a number of companies in that space as well. So that's how we kind of categorize what we're doing. >> So Mick, I wonder if you could address how that's all affected the ecosystem. I mean, one of the misconceptions early on was that Hadoop, Big Data, is going to kill the enterprise data warehouse. NoSQL is going to knock out Oracle. And Mike has always said, "No, we are incremental". And people are like, "Yeah, right". But that's really what's happened here. >> Yes. 
>> EDW was a fundamental component of your big data strategies. As Amr used to say, you know, SQL is the killer app for big data. (chuckling) So all those data sources that have been integrated. So you kind of fast forward to today, you talked about IOT and The Edge. You guys have announced, you know, your own data warehouse and platform as a service. So you see this embracing in this hybrid world emerging. How has that affected the evolution of your ecosystem? >> Yeah, it's definitely evolved considerably. So, I think I'd give you a couple of specific areas. So, clearly we've been quite successful in large enterprises, so the big SI type of vendors want a piece of that action these days. And they're much more engaged than they were early days, when they weren't so sure all of this was real. >> I always say, they like to eat at the trough and then the trough is full, so they dive right in. (all laughing) They're definitely very engaged, and they built big data practices and distinctive analytics practices as well. Beyond that, sort of the developer community has also begun to shift. And it's shifted from simply people that could spell, you know, Hive or could spell Kafka and all of the various projects that are involved. And it is elevated, in particular into a data science community. So one of the additional communities that we sort of brought on board with what we're doing, not just with the engine and Spark, but also with tools for data scientists like Cloudera Data Science Workbench, has added that element to the community that really wasn't a part of it, historically. So that's been a nice add-on. And then last, but certainly not least, are the cloud providers. And like everybody, those are complicated relationships because on the one hand, they're incredibly valuable partners to us, certainly both Microsoft and Amazon are critical partners for Cloudera, at the same time, they've got competitive offerings. 
So, like most successful software companies there's a lot of coopetition to contend with that also wasn't there just a few years ago when we didn't have cloud offerings, and they didn't have, you know, data warehouse in the cloud offerings. But those are things that have sort of impacted the ecosystem. >> So, I've got to ask you a marketing question, since you're the CMO. By the way, great messaging. I like the "grow, connect, protect." I think that's really easy to understand. >> Thank you. >> And the other one was modern. The phrase, say the phrase again. >> Yeah. It's the "Cloudera builds the modern platform for machine learning analytics optimized for the cloud." >> Very tight mission statement. Question on the name. Cloudera. >> Mmhmm. >> It's spelled, it's actually cloud with ERA in the letters, so "the cloud era." People use that term all the time. We're living in the cloud era. >> Yes. >> Cloud-native is the hottest market right now in the Linux foundation. The CNCF has over two hundred and forty members and growing. Cloud-native clearly has indicated that the new, modern developers here in the renaissance of software development, in general, enterprises want more developers. (laughs) Not that you want to be against developers, because, clearly, they're going to hire developers. >> Absolutely. >> And you're going to enable that. And then you've got the, obviously, cloud-native on-premise dynamic. Hybrid cloud and multi-cloud. So are there plans to think about that cloud era, is it a cloud positioning? You see cloud certainly important in what you guys do, because the cloud creates more compute, more capabilities to move data around. >> Sure. >> And (laughs) process it. And make machine learning go faster, which gives more data, more AI capabilities, >> It's the flywheel you and I were discussing. >> It's the flywheel of, what's the innovation sandwich, Dave? You know? 
(laughs) >> A little bit of data, a little bit of machine intelligence, in the cloud. >> So, the innovation's in play. >> Yeah, absolutely. >> Positioning around cloud. How are you looking at that? >> Yeah. So, it's a fascinating story. You were with us in the earliest days, so you know that the original architecture of everything that we built was intended to be run in the public cloud. It turns out, in 2008, there were exactly zero customers that wanted all of their data in a public cloud environment. So the company actually pivoted and re-architected the original design of the offerings to work on-prem. And no sooner did we do that than it was time to re-architect it yet again. And we are right in the midst of doing that. So, we really have offerings that span the whole gamut. If you want to just pick up your whole current Cloudera environment in an infrastructure as a service model, we offer something called Altus Director that allows you to do that. Just pick up the entire environment, stand it up onto AWS, or Microsoft Azure, and off you go. If you want the convenience and the elasticity and the ease of use of a true platform as a service, just this past week we announced Altus Data Warehouse, which is a platform as a service kind of a model. For data warehousing, we have the data engineering module for Altus as well. Last, but not least, is everybody's not going to sign up for just one cloud vendor. So we're big believers in multi-cloud. And that's why we support the major cloud vendors that are out there. And, in addition to that, it's going to be a hybrid world for as far out as we can see it. People are going to have certain workloads that, either for economics or for security reasons, they're going to continue to want to run in-house. And they're going to have other workloads, certainly more transient workloads, and I think ML and data science will fall into this camp, that the public cloud's going to make a great deal of sense. 
And, allowing companies to bridge that gap while maintaining one security, compliance, and management model, something we call a Shared Data Experience, is really our core differentiator as a business. That's at the very core of what we do. >> Classic cloud workload experience that you're bringing, whether it's on-prem or whatever cloud. >> That's right. >> Cloud is an operating environment for you guys. You look at it just as >> The delivery mechanism. In effect. Awesome. All right, future for Cloudera. What can you share with us? I know you're a public company. Can't say any forward-looking statements. Got to do all those disclaimers. But for customers, what's the North Star for Cloudera? You mentioned going after a much more hardcore enterprise. >> Yes. >> That's clear. What's the North Star for you guys when you talk to customers? What's the big pitch? >> Yeah. I think there's a couple of really interesting things that we learned about our business over the course of the past six, nine months or so here. One was that the greatest need for our offerings is in very, very large and complex enterprises. They have the most data, not surprisingly. And they have the most business gain to be had from leveraging that data. So we narrowed our focus. We have now identified approximately five thousand global customers, so think of it as kind of Fortune or Forbes 5000. That is our sole focus. So, we are entirely focused on that end of the market. Within that market, there are certain industries that we play particularly well in. We're incredibly well-positioned in financial services. Very well-positioned in healthcare and telecommunications. Any regulated industry that really cares about how they govern and maintain their data is really the great target audience for us. And so, that continues to be the focus for the business. And we're really excited about that narrowing of focus and what opportunities that's going to build for us. 
To not just land new customers, but more to expand our existing ones into a broader and broader set of use cases. >> And data is coming down faster. There's more data growth than ever seen before. It's never stopping. It's only going to get worse. >> We love it. >> Bring it on. >> Any way you look at it, it's getting worse or better. Mick, thanks for spending the time. I know you're super busy with the event going on. Congratulations on the success, and the focus, and the positioning. Appreciate it. Thanks for coming on The Cube. >> Absolutely. Thank you, gentlemen. It was a pleasure. >> We are Cube NYC. This is our ninth year doing all action. Everything that's going on in the data world now is horizontally scaling across all aspects of the company, the society, as we know. It's super important, and this is what we're talking about here in New York. This is The Cube, and John Furrier. Dave Vellante. Be back with more after this short break. Stay with us for more coverage from New York City. (upbeat music)

Published Date : Sep 13 2018


ENTITIES

Entity	Category	Confidence
Komatsu	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Mick Hollison	PERSON	0.99+
Mike	PERSON	0.99+
Australia	LOCATION	0.99+
Amazon	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
2008	DATE	0.99+
Palo Alto	LOCATION	0.99+
Tom	PERSON	0.99+
New York	LOCATION	0.99+
Mick	PERSON	0.99+
John Furrier	PERSON	0.99+
New York City	LOCATION	0.99+
Tesla	ORGANIZATION	0.99+
CNCF	ORGANIZATION	0.99+
Hilary Mason	PERSON	0.99+
Cloudera	ORGANIZATION	0.99+
second piece	QUANTITY	0.99+
three letter	QUANTITY	0.99+
North Star	ORGANIZATION	0.99+
Amr Awadallah	PERSON	0.99+
zero customers	QUANTITY	0.99+
five	QUANTITY	0.99+
18 year	QUANTITY	0.99+
ninth year	QUANTITY	0.99+
One	QUANTITY	0.99+
Dave	PERSON	0.99+
this week	DATE	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
both	QUANTITY	0.99+
ten years	QUANTITY	0.98+
four months	QUANTITY	0.98+
over two hundred and forty members	QUANTITY	0.98+
Oracle	ORGANIZATION	0.98+
NYC	LOCATION	0.98+
first	QUANTITY	0.98+
NoSQL	TITLE	0.98+
The Cube	ORGANIZATION	0.98+
over 30 open-source projects	QUANTITY	0.98+
Amr	PERSON	0.98+
today	DATE	0.98+
SQL	TITLE	0.98+
each	QUANTITY	0.98+
GDPR	TITLE	0.98+
tomorrow	DATE	0.98+
Cube	ORGANIZATION	0.97+
approximately five thousand global customers	QUANTITY	0.97+
Strata	ORGANIZATION	0.96+
about two years	QUANTITY	0.96+
Altus	ORGANIZATION	0.96+
earth	LOCATION	0.96+
EDW	TITLE	0.95+
18-year old	QUANTITY	0.95+
Strata Conference	EVENT	0.94+
few years ago	DATE	0.94+
one	QUANTITY	0.94+
AWUS	TITLE	0.93+
Altus Data Warehouse	ORGANIZATION	0.93+
first order	QUANTITY	0.93+
single cloud	QUANTITY	0.93+
Cloudera Enterprise Six	TITLE	0.92+
about three	QUANTITY	0.92+
Cloudera	TITLE	0.84+
three broad categories	QUANTITY	0.84+
past six	DATE	0.82+

Caryn Woodruff, IBM & Ritesh Arora, HCL Technologies | IBM CDO Summit Spring 2018

>> Announcer: Live from downtown San Francisco, it's the Cube, covering IBM Chief Data Officer Strategy Summit 2018. Brought to you by IBM. >> Welcome back to San Francisco everybody. We're at the Parc 55 in Union Square and this is the Cube, the leader in live tech coverage and we're covering exclusive coverage of the IBM CDO strategy summit. IBM has these things, they bookend on both coasts, one in San Francisco, one in Boston, spring and fall. Great event, intimate event. 130, 150 chief data officers, learning, transferring knowledge, sharing ideas. Caryn Woodruff is here as the principal data scientist at IBM and she's joined by Ritesh Arora, who is the director of digital analytics at HCL Technologies. Folks, welcome to the Cube, thanks for coming on. >> Thank you. >> Thanks for having us. >> You're welcome. So we're going to talk about data management, data engineering, we're going to talk about digital, as I said, Ritesh, because digital is in your title. It's a hot topic today. But Caryn, let's start off with you. Principal Data Scientist, so you're the one that is in short supply. So a lot of demand, you're getting pulled in a lot of different directions. But talk about your role and how you manage all those demands on your time. >> Well, you know, a lot of our work is driven by business needs, so it's really understanding what is critical to the business, what's going to support our business's strategy and, you know, picking the projects that we work on based on those items. 
So you really do have to cultivate the things that you spend your time on and make sure you're spending your time on the things that matter and, as Ritesh and I were talking about earlier, you know, a lot of that means building good relationships with the people who manage the systems and the people who manage the data so that you can get access to what you need to get the critical insights that the business needs. >> So Ritesh, data management, I mean, this means a lot of things to a lot of people. It's evolved over the years. Help us frame what data management is in this day and age. >> Sure, so there are two aspects of data in my opinion. One is the data management, another the data engineering, right? And over the period the data has grown significantly. Whether it's unstructured data, whether it's structured data, or the transactional data. We need to have some kind of governance in the policies to secure data, to make data an asset for a company so the business can rely on your data. What you are delivering to them. Now, the other part that comes in is the data engineering. Data engineering is more about an IT function, which is data acquisition, data preparation and delivering the data to the end-user, right? It can be business, it can be third-party, but it all comes under the governance, under the policies, which are designed to secure the data, how the data should be accessed by different parts of the company or the external parties. >> And how do those two worlds come together? The business piece and the IT piece, is that where you come in? >> That is where data science definitely comes into the picture. So if you go online, you can find Venn diagrams that describe data science as a combination of computer science, math and statistics, and business acumen. And so where it comes in the middle is data science. So it's really being able to put those things together. 
But, you know, what's so critical is, you know, Inderpal, actually, shared at the beginning here and I think a few years ago here, talked about the five pillars to building a data strategy. And, you know, one of those things is use cases, like getting out, picking a need, solving it and then going from there, and along the way you realize what systems are critical, what data you need, who the business users are. You know, what would it take to scale that? So these, like, proof-point projects that, you know, eventually turn into these bigger things, and for them to turn into bigger things you've got to have that partnership. You've got to know where your trusted data is, you've got to know how it got there, who can touch it, how frequently it is updated. Just being able to really understand that and work with partners that manage the infrastructure so that you can leverage it and make it available to other people and transparent. >> I remember when I first interviewed Hilary Mason way back when and I was asking her about that Venn diagram and she threw in another one, which was data hacking. >> Caryn: Uh-huh, yeah. >> Well, talk about that. You've got to be curious about data. You need to, you know, take a bath in data. >> (laughs) Yes, yes. I mean, yeah, you really... Sometimes you have to be a detective and you have to really want to know more. And, I mean, understanding the data is like the majority of the battle. >> So Ritesh, we were talking off-camera about it's not how titles change, things evolve, data, digital. They're kind of interchangeable these days. I mean we always say the difference between a business and a digital business is how they have used data. And so digital being part of your role, everybody's trying to get digital transformation, right? As an SI, you guys are at the heart of it. Certainly, IBM as well. What kinds of questions are our clients asking you about digital? 
>> So I ultimately see data, whatever we derive from data, it is used by the business side. So we are trying to always solve a business problem, which is to optimize the issues the company is facing, or try to generate more revenues, right? Now, the digital as well as the data has been married together, right? Earlier there are, you can say we are trying to analyze the data to get more insights, what is happening in that company. And then we came up with a predictive modeling that based on the data that will statically collect, how can we predict different scenarios, right? Now digital, we, over the period of the last 10, 20 years, as the data has grown, there are different sources of data that have come in picture, we are talking about social media and so on, right? And nobody is looking for just reports out of the Excel, right? It is more about how you are presenting the data to the senior management, to the entire world and how easily they can understand it. That's where the digital from the data digitization, as well as the application digitization comes in picture. So the tools are developed over the period to have a better visualization, better understanding. How can we integrate annotation within the data? So these are all different aspects of digitization on the data and we try to integrate the digital concepts within our data and analytics, right? So I used to be more, I mean, I grew up as a data engineer, analytics engineer, but now I'm looking more beyond just the data or the data preparation. It's more about presenting the data to the end-user and the business. How it is easy for them to understand it. >> Okay, I got to ask you, so you guys are data wonks. I am too, kind of, but I'm not as skilled as you are, but, and I say that with all due respect. I mean you love data. >> Caryn: Yes. >> As data science becomes a more critical skill within organizations, we always talk about the amount of data, data growth, the stats are mind-boggling. 
But as a data scientist, do you feel like you have access to the right data and how much of a challenge is that with clients? >> So we do have access to the data but the challenge is, the company has so many systems, right? It's not just one or two applications. There are companies we have 50 or 60 or even hundreds of applications built over the last 20 years. And there are some applications, which are basically duplicate, which replicate the data. Now, the challenge is to integrate the data from different systems because they maintain different metadata. The quality of data is a concern. And sometimes with the international companies, the rules, for example, might be in US or India or China, the data regulations are different, right? And you are, as you become more global, you try to integrate the data beyond boundaries, which becomes a more compliance issue sometimes, also, beyond the technical issues of data integration. >> Any thoughts on that? >> Yeah, I think, you know one of the other issues too, you have, as you've heard of shadow IT, where people have, like, servers squirreled away under their desks. There's your shadow data, where people have spreadsheets and databases that, you know, they're storing on, like a small server or that they share within their department. And so you know, you were discussing, we were talking earlier about the different systems. And you might have a name in one system that's one way and a name in another system that's slightly different, and then a third system, where it's different and there's extra granularity to it or some extra twist. And so you really have to work with all of the people that own these processes and figure out what's the trusted source? What can we all agree on? So there's a lot of... It's funny, a lot of the data problems are people problems. 
So it's getting people to talk and getting people to agree on, well this is why I need it this way, and this is why I need it this way, and figuring out how you come to a common solution so you can even create those single trusted sources that then everybody can go to and everybody knows that they're working with the right thing and the same thing that they all agree on. >> The politics of it and, I mean, politics is kind of a pejorative word but let's say dissonance, where you have maybe a back-end system, financial system and the CFO, he or she is looking at the data saying oh, this is what the data says and then... I remember I was talking to a, recently, a chef in a restaurant said that the CFO saw this but I know that's not the case, I don't have the data to prove it. So I'm going to go get the data. And so, and then as they collect that data they bring together. So I guess in some ways you guys are mediators. >> [Caryn And Ritesh] Yes, yes. Absolutely. >> 'Cause the data doesn't lie, you just got to understand it. >> You have to ask the right question. Yes. And yeah. >> And sometimes when you see the data, you start, that you don't even know what questions you want to ask until you see the data. Is that a challenge for your clients? >> Caryn: Yes, all the time. Yeah >> So okay, what else do we want to talk about? The state of collaboration, let's say, between the data scientists, the data engineer, the quality engineer, maybe even the application developers. Somebody, John Furrier often says, my co-host and business partner, data is the new development kit. Give me the data and I'll, you know, write some code and create an application. So how about collaboration amongst those roles, is that something... I know IBM's gone on about some products there but your point Caryn, it's a lot of times it's the people. >> It is. >> And the culture. What are you seeing in terms of evolution and maturity of that challenge? 
>> You know I have a very good friend who likes to say that data science is a team sport and so, you know, these should not be, like, solo projects where just one person is wading up to their elbows in data. This should be something where you've got engineers and scientists and business people coming together to really work through it as a team because everybody brings really different strengths to the table and it takes a lot of smart brains to figure out some of these really complicated things. >> I completely agree. Because we see the challenges, we always are trying to solve a business problem. It's important to marry IT as well as the business side. We have the technical expert but we don't have domain experts, subject matter experts who know the business in IT, right? So it's very very important to collaborate closely with the business, right? And the data scientist is an intermediate layer between the IT as well as business I will say, right? Because a data scientist as they, over the years, as they try to analyze the information, they understand business better, right? And they need to collaborate with IT to either improve the quality, right? That kind of challenges they are facing and I need you to, the data engineer has to work very hard to make sure the data delivered to the data scientist or the business is accurate as much as possible because wrong data will lead to wrong predictions, right? And ultimately we need to make sure that we integrate the data in the right way. >> What's a different cultural dynamic that was, say ten years ago, where you'd go to a statistician, she'd fire up the SPSS... >> Caryn: We still use that. >> I'm sure you still do but run some chi-squares, give me some, you know, probabilities and you know maybe run some Monte Carlo simulation. But one person kind of doing all that, to your point, Caryn. >> Well you know, it's interesting. 
There are some students I mentor at a local university and you know we've been talking about the projects that they get and that you know, more often than not they get a nice clean dataset to go practice learning their modeling on, you know? And they don't have to get in there and clean it all up and normalize the fields and look for some crazy skew or null values or, you know, where you've just got so much noise that needs to be reduced into something more manageable. And so it's, you know, you made the point earlier about understanding the data. It's just, it really is important to be very curious and ask those tough questions and understand what you're dealing with before you really start jumping in and building a bunch of models. >> Let me add another point. That the way we have changed over the last ten years, especially from the technical point of view. Ten years back nobody talks about the real-time data analysis. There was no streaming application as such. Now nobody talks about the batch analysis, right? Everybody wants data on real-time basis. But if not real-time, it might be near real-time basis. That has become a challenge. And it's not just that prediction, which are happening in their ERP environment or on the cloud, they want the real-time integration with the social media for the marketing and the sales and how they can immediately do the campaign, right? So, for example, if I go to Google and I search for any product, right, for example, a pressure cooker, right? And I go to Facebook, immediately I see the ad within two minutes. >> Yeah, they're retargeting. >> So that real-time analytics is happening under different applications, including the third-party data, which is coming from social media. So that has become a good source of data but it has become a challenge for the data analyst and the data scientist. How quickly we can turn around is called data analysis. 
>> Because it used to be you would get ads for a pressure cooker for months, even after you bought the pressure cooker and now it's only a few days, right? >> Ritesh: It's a minute. You close this application, you log into Facebook... >> Oh, no doubt. >> Ritesh: An ad is there. >> Caryn: There it is. >> Ritesh: Because everything is linked, either your phone number or email ID, you're done. >> It's interesting. We talked about disruption a lot. I wonder if that whole model is going to get disrupted in a new way because everybody started using the same ad. >> So that's a big change of our last 10 years. >> Do you think... oh, go ahead. >> Oh no, I was just going to say, you know, another thing is just there's so much that is available to everybody now, you know. There's not this small little set of tools that's restricted to people that are in these very specific jobs. But with open source and with so many software-as-a-service products that are out there, anybody can go out and get an account and just start, you know, practicing or playing or joining a Kaggle competition or, you know, start getting their hands on... There's data sets that are out there that you can just download to practice and learn on and use. So, you know, it's much more open, I think, than it used to be. >> Yeah, community editions of software, open data. The number of open data sources just keeps growing. Do you think that machine intelligence can, or how can machine intelligence help with this data quality challenge? >> I think that it's always going to require people, you know? There's always going to be a need for people to train the machines on how to interpret the data. How to classify it, how to tag it. There's actually a really good article in Popular Science this month about a woman who was training a machine on fake news and, you know, it did a really nice job of finding some of the same claims that she did. But she found a few more. 
So, you know, I think it's, on one hand we have machines that we can augment with data and they can help us make better decisions or sift through large volumes of data but then when we're teaching the machines to classify the data or to help us with metadata classification, for example, or, you know, to help us clean it. I think that it's going to be a while before we get to the point where that's the inverse. >> Right, so in that example you gave, the human actually did a better job than the machine. Now, it's amazing to me how... What machines couldn't do that humans could, you know, last year and all of a sudden, you know, they can. It wasn't long ago that robots couldn't climb stairs. >> And now they can. >> And now they can. >> It's really creepy. >> I think the difference now is, earlier you know, you knew that there is an issue in the data. But you don't know how much data is corrupt or wrong, right? Now, there are tools available and they're very sophisticated tools. They can pinpoint and provide you the percentage of accuracy, right? On different categories of data that you come across, right? Even forget about the structured data. Even when you talk about unstructured data, the data which comes from social media or the comments and the remarks that you log or are logged by the customer service representative, there are very sophisticated text analytics tools available, which can talk very accurately about the data as well as the personality of the person who's giving that information. >> Tough problems but it seems like we're making progress. All you got to do is look at fraud detection as an example. Folks, thanks very much... >> Thank you. >> Thank you very much. >> ...for sharing your insight. You're very welcome. Alright, keep it right there everybody. We're live from the IBM CDO conference in San Francisco. Be right back, you're watching the Cube. (electronic music)

Published Date : May 2 2018


ENTITIES

Entity	Category	Confidence
Ritesh Ororo	PERSON	0.99+
Caryn	PERSON	0.99+
John Fourier	PERSON	0.99+
Ritesh	PERSON	0.99+
IBM	ORGANIZATION	0.99+
US	LOCATION	0.99+
50	QUANTITY	0.99+
Cayn Woodruff	PERSON	0.99+
Boston	LOCATION	0.99+
San Francisco	LOCATION	0.99+
China	LOCATION	0.99+
India	LOCATION	0.99+
last year	DATE	0.99+
Excel	TITLE	0.99+
one	QUANTITY	0.99+
Caryn Woodruff	PERSON	0.99+
Ritesh Arora	PERSON	0.99+
Hilary Mason	PERSON	0.99+
60	QUANTITY	0.99+
130	QUANTITY	0.99+
One	QUANTITY	0.99+
Monte Carlo	TITLE	0.99+
HCL Technologies	ORGANIZATION	0.99+
Facebook	ORGANIZATION	0.99+
third system	QUANTITY	0.98+
today	DATE	0.98+
Interpol	ORGANIZATION	0.98+
ten years ago	DATE	0.98+
two applications	QUANTITY	0.98+
first	QUANTITY	0.98+
Parc 55	LOCATION	0.98+
five pillars	QUANTITY	0.98+
one system	QUANTITY	0.98+
Google	ORGANIZATION	0.97+
two aspects	QUANTITY	0.97+
both coasts	QUANTITY	0.97+
one person	QUANTITY	0.96+
Ten years back	DATE	0.96+
two minutes	QUANTITY	0.95+
this month	DATE	0.95+
Union Square	LOCATION	0.95+
two worlds	QUANTITY	0.94+
Spring 2018	DATE	0.94+
Popular Science	TITLE	0.9+
CTO	EVENT	0.88+
days	QUANTITY	0.88+
one way	QUANTITY	0.87+
SPSS	TITLE	0.86+
single trusted sources	QUANTITY	0.85+
Venn	ORGANIZATION	0.84+
few years ago	DATE	0.84+
150 chief data officers	QUANTITY	0.83+
last 10 20 years	DATE	0.83+
Officer Strategy Summit 2018	EVENT	0.82+
hundreds of application	QUANTITY	0.8+
last 10 years	DATE	0.8+
Cube	COMMERCIAL_ITEM	0.79+
IBM Chief	EVENT	0.79+
IBM CDO strategy summit	EVENT	0.72+
last ten years	DATE	0.7+
IBM CDO Summit	EVENT	0.7+
fall	DATE	0.68+
Cube	TITLE	0.66+
spring	DATE	0.65+
last 20 years	DATE	0.63+
minute	QUANTITY	0.49+

Dr. Tendu Yogurtcu, Syncsort | Big Data SV 2018

>> Announcer: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley, brought to you by Silicon Angle Media and its ecosystem partners. >> Welcome back to theCUBE. We are live in San Jose at our event, Big Data SV. I'm Lisa Martin, my co-host is George Gilbert and we are down the street from the Strata Data Conference. We are at a really cool venue: Forager Eatery Tasting Room. Come down and join us, hang out with us, we've got a cocktail par-tay tonight. We also have an interesting briefing from our analysts on big data trends tomorrow morning. I want to welcome back to theCUBE now one of our CUBE VIPs and alumna Tendu Yogurtcu, the CTO at Syncsort, welcome back. >> Thank you. Hello Lisa, hi George, pleasure to be here. >> Yeah, it's our pleasure to have you back. So, what's going on at Syncsort, what are some of the big trends as CTO that you're seeing? >> In terms of the big trends that we are seeing, and Syncsort has grown a lot in the last 12 months, we actually doubled our revenue, it has been really a successful and organic growth path, and we have more than 7,000 customers now, so it's a great pool of customers that we are able to talk to and see the trends and how they are trying to adapt to the digital disruption and make data part of their core strategy. So data is no longer an enabler, and in all of the enterprise we are seeing data becoming the core strategy. This reflects in the four mega trends, they are all connected to enable business as well as operational analytics. Cloud is one, definitely. We are seeing more and more cloud adoption, even our financial services, healthcare and banking customers are now, they have a couple of clusters running in the cloud, in public cloud, multiple workloads, hybrid seems to be the new standard, and it comes with also challenges. IT governance as well as data governance is a major challenge, and also scoping and planning for the workloads in the cloud continues to be a challenge, as well. 
Our general strategy for all of the product portfolio is to have our products following our design once and deploy anywhere strategy. So whether it's a standalone environment on Linux or running on Hadoop or Spark, or running on Premise or in the Cloud, regardless of the Cloud provider, we are enabling the same application with no changes to run in all of these environments, including hybrid. Then we are seeing the streaming trend, with the connected devices with the digital disruption and so much data being generated, being able to stream and process data on the edge, with the Internet of things, and in order to address the use cases that Syncsort is focused on, we are really providing more on the Change Data Capture and near real-time and real-time data replication to the next generation analytics environments and big data environments. We launched last year our Change Data Capture, CDC, product offering with data integration, and we continue to strengthen that with the Vision merger we had, data replication, real-time data replication capabilities, and we are now seeing even the Kafka data bus becoming a consumer of this data. Not just keeping the data lake fresh, but really publishing the changes from multiple, diverse set of sources and publishing into a Kafka data bus and making it available for applications and analytics in the data pipeline. So the third trend we are seeing is around data science, and if you noticed this morning's keynote was all about machine learning, artificial intelligence, deep learning, how do we make use of data science. And it was very interesting for me because we see everyone talking about the challenge of how do you prepare the data and how do you deliver the trusted data for machine learning and artificial intelligence use and deep learning. Because if you are using bad data, and creating your models based on bad data, then the insights you get are also impacted. 
We definitely offer our products, both on the data integration and data quality side, to prepare the data, cleanse, match, and deliver the trusted data set for data scientists and make their life easier. Another area of focus for 2018 is can we also add supervised learning to this, because with the Trillium Quality domain experts that we have now in Syncsort, we have a lot of domain experts in the field, we can infuse the machine learning algorithms and connect data profiling capabilities we have with the data quality capabilities, recommending business rules for data scientists and helping them automate the mundane tasks with recommendations. And the last but not least trend is data governance, and data governance is almost an umbrella focus for everything we are doing at Syncsort because everything about the Cloud trend, the streaming, and the data science, and developing that next generation analytics environment for our customers depends on the data governance. It is, in fact, a business imperative, and the regulatory compliance use cases drive more importance today for governance. For example, General Data Protection Regulation in Europe, GDPR. >> Lisa: Just a few months away. >> Just a few months, May 2018, it is in the mind of every C-level executive. It's not just for European companies, but every enterprise has European data sourced in their environments. So compliance is a big driver of governance, and we look at governance in multiple aspects. Security and ensuring data is available in a secure way is one aspect, and delivering the high quality data, cleansing, matching, the example Hilary Mason this morning gave in the keynote about how the context matters in terms of searches of her name was very interesting because you really want to deliver that high quality data in the enterprise, trusted data set, preparing that. 
Our Trillium Quality for big data, we launched Q4, that product is generally available now, and actually we are in production with very large deployment. So that's one area of focus. And the third area is how do you create visibility, the farm-to-table view of your data? >> Lisa: Yeah, that's the name of your talk! I love that. >> Yes, yes, thank you. So tomorrow I have a talk at 2:40, March 8th also, I'm so happy it's on the Women's Day that I'm talking-- >> Lisa: That's right, that's right! Get a farm-to-table view of your data is the name of your talk, track data lineage from source to analytics. Tell us a little bit more about that. >> It's all about creating more visibility, because for audit reasons, for understanding how many copies of my data are created, where my data has been, and who accessed it, creating that visibility is very important. And the last couple of years, we saw everyone was focused on how do I create a data lake and make my data accessible, break the data silos, and liberate my data from multiple platforms, legacy platforms that the enterprise might have. Once that happened, everybody started worrying about how do I create a consumable data set and how do I manage this data because data has been on the legacy platforms like Mainframe, IBM i series, has been on relational data stores, it is in the Cloud, gravity of data originating in the Cloud is increasing, it's originating from mobile. Hadoop vendors like Hortonworks and Cloudera, they are creating visibility to what happens within the Hadoop framework. So we are deepening our integration with the Cloudera Navigator, that was our announcement last week. We already have integration both with Hortonworks and Cloudera Navigator, this is one step further where we actually publish what happened to every single granular level of data at the field level with all of the transformations that data has been through outside of the cluster. 
So that visibility is now published to Navigator itself, we also publish it through the RESTful API, so governance is a very strong and critical initiative for all of the businesses. And we are playing into the security aspect as well as the data lineage and tracking aspect and the quality aspect. >> So this sounds like an extremely capable infrastructure service, so that it's trusted data. But can you sell that to an economic buyer alone, or do you go in in conjunction with another solution like anti-money laundering for banks or, you know, what are the key things that they place enough value on that they would spend, you know, budget on it? >> Yes, absolutely. Usually the use cases might originate like anti-money laundering, which is very common, fraud detection, and it ties to getting a single view of an entity. Because in anti-money laundering, you want to understand the single view of your customer ultimately. So there is usually another solution that might be in the picture. We are providing the visibility of the data, as well as that single view of the entity, whether it's the customer view in this case or the product view in some of the use cases by delivering the matching capabilities and the cleansing capabilities, the de-duplication capabilities in addition to accessing and integrating the data. >> When you go into a customer and, you know, recognizing that we still have tons of silos and we're realizing it's a lot harder to put everything in one repository, how do customers tell you they want to prioritize what they're bringing into the repository or even what do they want to work on that's continuously flowing in? >> So it depends on the business use case. And usually at the time that we are working with the customer, they selected that top priority use case. The risk here, and the anti-money laundering, or for insurance companies, we are seeing a trend, for example, building the data marketplace, as that centralized data marketplace concept. 
So depending on the business case, many of our insurance customers in US, for example, they are creating the data marketplace and they are working with near real-time and microbatches. In Europe, Europe seems to be a bit ahead of the game in some cases, like Hadoop production was slow but certainly they went right into the streaming use cases. We are seeing more directly streaming and keeping it fresh and more utilization of the Kafka and messaging frameworks and data bus. >> And in that case, where they're sort of skipping the batch-oriented approach, how do they keep track of history? >> It's still, in most of the cases, microbatches, and the metadata is still associated with the data. So there is an analysis of the historical what happened to that data. The tools, like ours and the vendors coming into the picture, keep track of that, basically. >> So, in other words, by knowing what happened operationally to the data, that paints a picture of a history. >> Exactly, exactly. >> Interesting. >> And for the governance we usually also partner, for example, we partner with Collibra data platform, we partnered with ASG for creating those business rules and technical metadata and providing to the business users, not just to the IT data infrastructure, and on the Hadoop side we partner with Cloudera and Hortonworks very closely to complete that picture for the customer, because nobody is just interested in what happened to the data in Hadoop or in Mainframe or in my relational data warehouse, they are really trying to see what's happening on Premise, in the Cloud, multiple clusters, traditional environments, legacy systems, and trying to get that big picture view. 
>> So on that, enabling a business to have that, we'll say in marketing, 360 degree view of data, knowing that there's so much potential for data to be analyzed to drive business decisions that might open up new business models, new revenue streams, increase profit, what are you seeing as a CTO of Syncsort when you go in to meet with a customer, data silos, when you're talking to a Chief Data Officer, what's the cultural, I guess, not shift but really journey that they have to go on to start opening up other organizations of the business, to have access to data so they really have that broader, 360 degree view? What's that cultural challenge that they have to, journey that they have to go on? >> Yes, Chief Data Officers are actually very good partners for us, because usually Chief Data Officers are trying to break the silos of data and make sure that the data is liberated for the business use cases. Still most of the time the infrastructure and the cluster, whether it's the deployment in the Cloud versus on Premise, it's owned by the IT infrastructure. And the lines of business are really the consumers and the clients of that. The CDO, in that sense, almost mediates and connects those lines of business with the IT infrastructure with the same goals for the business, right? They have to worry about the compliance, they have to worry about creating multiple copies of data, they have to worry about the security of the data and availability of the data, so CDOs actually help. So we are actually very good partners with the CDOs in that sense, and we also usually have the IT infrastructure owner in the room when we are talking with our customers because they have a big stake. They are like the gatekeepers of the data to make sure that it is accessed by the right... by the right folks in the business. >> Sounds like maybe they're in the role of like, good cop bad cop or maybe mediator. Well Tendu, I wish we had more time. 
Thanks so much for coming back to theCUBE and, like you said, you're speaking tomorrow at Strata Conference on International Women's Day: Get a farm-to-table view of your data. Love the title. >> Thank you. >> Good luck tomorrow, and we look forward to seeing you back on theCUBE. >> Thank you, I look forward to coming back and letting you know about more exciting both organic innovations and acquisitions. >> Alright, we look forward to that. We want to thank you for watching theCUBE, I'm Lisa Martin with my co-host George Gilbert. We are live at our event Big Data SV in San Jose. Come down and visit us, stick around, and we will be right back with our next guest after a short break. >> Tendu: Thank you. (upbeat music)

Published Date : Mar 7 2018


ENTITIES

Entity	Category	Confidence
Lisa Martin	PERSON	0.99+
George	PERSON	0.99+
May 2018	DATE	0.99+
George Gilbert	PERSON	0.99+
Syncsort	ORGANIZATION	0.99+
Lisa	PERSON	0.99+
Europe	LOCATION	0.99+
Hortonworks	ORGANIZATION	0.99+
US	LOCATION	0.99+
Hilary Mason	PERSON	0.99+
San Jose	LOCATION	0.99+
ASG	ORGANIZATION	0.99+
2018	DATE	0.99+
Tendu	PERSON	0.99+
Silicon Angle Media	ORGANIZATION	0.99+
Cloudera	ORGANIZATION	0.99+
360 degree	QUANTITY	0.99+
tomorrow	DATE	0.99+
Collibra	ORGANIZATION	0.99+
more than 7,000 customers	QUANTITY	0.99+
last week	DATE	0.99+
last year	DATE	0.99+
tomorrow morning	DATE	0.99+
one aspect	QUANTITY	0.99+
third area	QUANTITY	0.99+
Linux	TITLE	0.99+
Cloud Navigator	TITLE	0.99+
2:40	DATE	0.98+
Women's Day	EVENT	0.98+
Tendu Yogurtcu	PERSON	0.98+
GDPR	TITLE	0.98+
Spark	TITLE	0.97+
tonight	DATE	0.97+
Big Data SV	EVENT	0.97+
Kafka	TITLE	0.97+
International Women's Day	EVENT	0.97+
both	QUANTITY	0.97+
CDC	ORGANIZATION	0.96+
Navigator	TITLE	0.96+
Strata Data Conference	EVENT	0.96+
single view	QUANTITY	0.96+
Hadoop	TITLE	0.95+
third trend	QUANTITY	0.95+
one step	QUANTITY	0.95+
single view	QUANTITY	0.95+
Dr.	PERSON	0.94+
theCUBE	ORGANIZATION	0.94+
CUBE	ORGANIZATION	0.94+
this morning	DATE	0.94+
Cloud	TITLE	0.92+
last 12 months	DATE	0.91+
Change Data Capture	ORGANIZATION	0.9+
today	DATE	0.9+
European	OTHER	0.88+
last couple of years	DATE	0.88+
General Data Protection Regulation in Europe	TITLE	0.86+
Strata Conference	EVENT	0.84+
one	QUANTITY	0.83+
one repository	QUANTITY	0.83+
tons of silos	QUANTITY	0.82+
one area	QUANTITY	0.82+
Q4	DATE	0.82+
Big Data SV 2018	EVENT	0.81+
four mega trends	QUANTITY	0.76+
March 8th	DATE	0.76+

Wrap Up | IBM Fast Track Your Data 2017

>> Narrator: Live from Munich, Germany, it's theCUBE, covering IBM, Fast Track Your Data. Brought to you by IBM. >> We're back. This is Dave Vellante with Jim Kobielus, and this is theCUBE, the leader in live tech coverage. We go out to the events. We extract the signal from the noise. We are here covering a special presentation of IBM's Fast Track Your Data, and we're in Munich, Germany. It's been a day-long session. We started this morning with a panel discussion with five senior level data scientists that Jim and I hosted. Then we did CUBE interviews in the morning. We cut away to the main tent. Kate Silverton did a very choreographed, scripted, but very well done, main keynote set of presentations. IBM made a couple of announcements today, and then we finished up theCUBE interviews. Jim and I are here to wrap. We're actually running on IBMgo.com. We're running live. Hilary Mason talking about what she's doing in data science, and also we got a session on GDPR. You got to log in to see those sessions. So go ahead to IBMgo.com, and you'll find those. Hit the schedule and go to the Hilary Mason and GDPR channels, and check that out, but we're going to wrap now. Jim, two main announcements today. I hesitate to call them big announcements. I mean they were, you know, just kind of ... I think the word you used last night was perfunctory. You know I mean they're okay, but they're not game changing. So what did you mean? >> Well first of all, when you look at ... Though IBM is not calling this a signature event, it's essentially a signature event. They do these every June or so. You know in the past several years, the signature events have had like a one track theme, whether it be IBM announcing they're investing deeply in Spark, or IBM announcing that they're focusing on investing in R as the core language for data science development. 
This year at this event in Munich, it's really a three track event, in terms of the broad themes, and I mean they're all important tracks, but none of them is like game-changing. Perhaps IBM doesn't intend them to be, it seems like. One of which is obviously Europe. We're holding this in Munich. And a couple of things of importance to European customers, first and foremost GDPR. The deadline next year, in terms of compliance, is approaching. So sound the alarm as it were. And IBM has rolled out compliance or governance tools. Download and go from the information catalog, governance catalog and so forth. Now announcing the consortium with Hortonworks to build governance on top of Apache Atlas, but also IBM announcing that they've opened up a DSX center in England and a machine-learning hub here in Germany, to help their European clients, in those countries especially, to get deeper down into data science and machine learning, in terms of developing those applications. That's important for the audience, the regional audience here. The second track, which is also important, and I alluded to it. It's governance. In all of its manifestations you need a master catalog of all the assets for building and maintaining and controlling your data applications and your data science applications. The catalog, the consortium, the various offerings that IBM has announced and discussed in great detail. They've brought in customers and partners like Northern Trust, to talk about the importance of governance, not just as a compliance mandate, but also the potential strategy for monetizing your data. That's important. Number three is what I call cloud native data applications and how the state of the art in developing data applications is moving towards containerized and orchestrated environments that involve things like Docker and Kubernetes. The IBM DB2 developer community edition. Been in the market for a few years. The latest version they announced today includes Kubernetes support. 
Includes support for JSON. So it's geared towards a new generation of cloud and data apps. What I'm getting at ... Those three core themes are Europe, governance and cloud native data application development. Each of them is individually important, but none of them is a game changer. And one last thing. Data science and machine learning is one of the overarching envelope themes of this event. They've had Hilary Mason. A lot of discussion there. My sense is, I was a little bit disappointed because there weren't any significant new announcements related to IBM evolving their machine learning portfolio into deep learning or artificial intelligence in an environment where their direct competitors like Microsoft and Google and Amazon are making a huge push in AI, in terms of their investments. There's a bit of a discussion, and Rob Thomas got to it this morning, about DSX. Working with PowerAI, the IBM platform, I would like to hear more going forward about IBM investments in these areas. So I thought it was an interesting bunch of announcements. I'll backtrack on perfunctory. I'll just say it was good that they had this for a lot of reasons, but like I said, none of these individual announcements is really changing the game. In fact like I said, I think I'm waiting for the fall, to see where IBM goes in terms of doing something that's actually differentiating and innovative. >> Well I think that the event itself is great. You've got a bunch of partners here, a bunch of customers. I mean it's active. IBM knows how to throw a party. They always have. >> And the sessions are really individually awesome. I mean in terms of what you learn. >> The content is very good. I would agree. The two announcements that were sort of you know DB2, sort of what I call community edition. Simpler, easier to download. Even Dave can download DB2. I really don't want to download DB2, but I could, and play with it I guess. 
You know I'm not a database guy, but those of you out there that are, go check it out. And the other one was the sort of unified data governance. They tried to tie it in. I think they actually did a really good job of tying it into GDPR. We're going to hear over the next, you know 11 months, just a ton of GDPR readiness fear, uncertainty and doubt, from the vendor community, kind of like we heard with Y2K. We'll see what kind of impact GDPR has. I mean it looks like it's the real deal Jim. I mean it looks like you know this 4% of turnover penalty. The penalties are much more onerous than any other sort of you know, regulation that we've seen in the past, where you could just sort of fluff it off. Say yeah just pay the fine. I think you're going to see a lot of, well pay the lawyers to delay this thing and battle it. >> And one of our people in theCUBE that we interviewed, said it exactly right. It's like the GDPR is like the inverse of Y2K. In Y2K everybody was freaking out. It was actually nothing when it came down to it. Where nobody on the street is really buzzing. I mean the average person is not buzzing about GDPR, but it's hugely important. And like you said, I mean some serious penalties may be in the works for companies that are not complying, companies not just in Europe, but all around the world who do business with European customers. >> Right okay so now bring it back to sort of machine learning, deep learning. You basically said to Rob Thomas, I see machine learning here. I don't see a lot of the deep learning stuff quite yet. He said stay tuned. You know you were talking about TensorFlow and things like that. >> Yeah they supported that ... >> Explain. >> So Rob indicated that IBM very much, like with PowerAI and DSX, provides an open framework or toolkit for plugging in your, you the developers, preferred machine learning or deep learning toolkit of an open source nature. 
And there's a growing range of open source deep learning toolkits beyond you know TensorFlow, including Theano and MXNet and so forth, that IBM is supporting within the overall DSX framework, but also within the PowerAI framework. In other words they've got those capabilities. They're sort of burying that message under a bushel basket, at least in terms of this event. Also one of the things that ... I said this to Mena Scoyal. Watson data platform, which they launched last fall, very important product. Very important platform for collaboration among data science professionals, in terms of the machine learning development pipeline. I wish there was more about the Watson data platform here, about where they're taking it, what the customers are doing with it. Like I said a couple of times, I see Watson data platform as very much a DevOps tool for the new generation of developers that are building machine learning models directly into their applications. I'd like to see IBM, going forward, turn Watson data platform into a true DevOps platform, in terms of continuous integration of machine learning and deep learning and other statistical models. Continuous training, continuous deployment, iteration. I believe that's where they're going, or probably should be going. I'd like to see more. I'm expecting more along those lines going forward. What I just described about DevOps for data science is a big theme that we're focusing on at Wikibon, in terms of where the industry is going. >> Yeah, yeah. And I want to come back to that again, and get an update on what you're doing within your team, and talk about the research. Before we do that, I mean one of the things we talked about on theCUBE, in the early days of Hadoop, is that the guys who are going to make the money in this big data business are the practitioners. They're not going to see, you know these multi-hundred billion dollar valuations come out of the Hadoop world. And so far that prediction has held up well. 
It's the Airbnbs and the Ubers and the Spotifys and the Facebooks and the Googles, the practitioners who are applying big data, that are crushing it and making all the money. You see Amazon now buying Whole Foods. That in our view is a data play, but who's winning here, in either the vendor or the practitioner community? >> Who's winning are the startups with a hot new idea that's changing, that's disrupting some industry, or set of industries with machine learning, deep learning, big data, etc. For example everybody's, with bated breath, waiting for you know self-driving vehicles. And as the ecosystem develops, somebody's going to clean up. And one or more companies, companies we probably never heard of, leveraging everything we're describing here today, data science and containerized distributed applications that involve you know deep learning for you know image analysis and sensor analysis and so forth. Putting it all together in some new fabric that changes the way we live on this planet, but as you said the platforms themselves, whether they be Hadoop or Spark or TensorFlow, whatever, they're open source. You know and the fact is, by their very nature, open source based solutions, in terms of profit margins on selling those, inexorably migrate to zero. So you're not going to make any money as a tool vendor, or a platform vendor. You got to make money ... If you're going to make money, you make money, for example from providing an ecosystem, within which innovation can happen. >> Okay we have a few minutes left. Let's talk about the research that you're working on. What's exciting you these days? >> Right, right. So I think a lot of people know I've been around the analyst space for a long long time. I've joined the SiliconANGLE Wikibon team just recently. I used to work for a very large solution provider, and what I do here for Wikibon is I focus on data science as the core of next generation application development. 
When I say next-generation application development, it's the development of AI, deep learning, machine learning, and the deployment of those data-driven statistical assets into all manner of application. And you look at the hot stuff, like chatbots for example. Transforming the experience in e-commerce on mobile devices. Siri and Alexa and so forth. Hugely important. So what we're doing is we're focusing on AI and everything. We're focusing on containerization and building of AI micro-services and the ecosystem of the pipelines and the tools that allow you to do that. DevOps for data science, distributed training, federated training of statistical models, so forth. We are also very much focusing on the whole distributed containerized ecosystem, Docker, Kubernetes and so forth. Where that's going, in terms of changing the state of the art, in terms of application development. Focusing on the API economy. All of those things that you need to wrap around the payload of AI to deliver it into every ... >> So you're focused on that intersection between AI and the related topics and the developer. Who is winning in that developer community? Obviously Amazon's winning. You got Microsoft doing a good job there. Google, Apple, who else? I mean how's IBM doing for example? Maybe name some names. Who impresses you in the developer community? But specifically let's start with IBM. How is IBM doing in that space? >> IBM's doing really well. IBM has, for quite a while, been very good about engaging with a new generation of developers, using Spark and R and Hadoop and so forth to build applications rapidly and deploy them rapidly into all manner of applications. So IBM has very much reached out to, in the last several years, the Millennials for whom all of this, these new tools, have been their core repertoire from the very start. And I think in many ways, like today, the DB2 developer community edition is very much geared to that market. 
Saying you know to the cloud native application developer, take a second look at DB2. There's a lot in DB2 that you might bring into your next application development initiative, alongside your Spark toolkit and so forth. So IBM has startup envy. They're a big old company. Been around more than a hundred years. And they're trying to very much bootstrap and restart their brand in this new context, in the 21st century. I think they're making a good effort at doing it. In terms of community engagement, they have a really good community engagement program, all around the world, in terms of hackathons and developer days, you know meetups here and there. And they get lots of turnout and very loyal customers and IBM's got the broadest portfolio. >> So you still bleed a little bit of blue. So I got to squeeze it out of you now here. So let me push a little bit on what you're saying. So DB2 is the emphasis here, trying to position DB2 as appealing for developers, but why not some of the other you know acquisitions that they've made? I mean you don't hear that much about Cloudant, dashDB, and things of that nature. You would think that that would be more appealing to some of the developer communities than DB2. Or am I mistaken? Is it IBM sort of going after the core, trying to evolve that core you know constituency? >> No they've done a lot of strategic acquisitions like Cloudant, and like they've acquired graph databases and brought them into their platform. IBM has every type of database or file system that you might need for web or social or Internet of Things. And so with all of the development challenges, IBM has got a really high-quality, fit-to-purpose, best-of-breed platform, underlying data platform for it. They've got huge amounts of developers energized all around the world working on this platform. DB2, in the last several years they've taken all of their platforms, their legacy ... That's the wrong word. 
All their existing mature platforms, like DB2, and brought them into the IBM cloud. >> I think legacy is the right word. >> Yeah, yeah. >> These things have been around for 30 years. >> And they're not going away because they're field-proven and ... >> They are evolving. >> And customers have implemented them everywhere. And they're evolving. If you look at how IBM has evolved DB2 in the last several years into ... For example they responded to the challenge from SAP HANA. We brought BLU Acceleration technology, in-memory technology, into DB2 to make it screamingly fast and so forth. IBM has done a really good job of turning around these product groups and the product architectures, making them cloud first. And then reaching out to a new generation of cloud application developers. Like I said today, things like DB2 developer community edition, it's just the next chapter in this ongoing saga of IBM turning itself around. Like I said, each of the individual announcements today is like okay that's interesting. I'm glad to see IBM showing progress. None of them is individually disruptive. I think the last week though, I think Hortonworks was disruptive in the sense that IBM recognized that BigInsights didn't really have a lot of traction in the Hadoop space, not as much as they would have wished. Hortonworks very much does, and IBM has cast its lot to work with HDP, but HDP and Hortonworks recognizes they haven't achieved any traction with data scientists, therefore DSX makes sense, as part of the Hortonworks portfolio. Likewise Big SQL makes perfect sense as the SQL front end to the HDP. I think the teaming of IBM and Hortonworks is propitious of further things that they'll be doing in the future, not just governance, but really putting together a broader cloud portfolio for the next generation of data scientists doing work in the cloud. >> Do you think Hortonworks is a legitimate acquisition target for IBM? >> Of course they are. >> Why would IBM ... 
You know educate us. Why would IBM want to acquire Hortonworks? What does that give IBM? Open source mojo, obviously. >> Yeah mojo. >> What else? >> Strong loyalty with the Hadoop market with developers. >> The developer angle would supercharge the developer angle, and maybe make it more relevant outside of some of those legacy systems. Is that it? >> Yeah, but also remember that Hortonworks came from Yahoo, the team that developed much of what became Hadoop. They've got an excellent team. Strategic team. So in many ways, you can look at Hortonworks as one part acqui-hire if they ever do that and one part really substantial and growing solution portfolio that in many ways is complementary to IBM. Hortonworks is really deep on the governance of Hadoop. IBM has gone there, but I think Hortonworks is even deeper, in terms of their laser focus. >> Ecosystem expansion, and it actually really wouldn't be that expensive of an acquisition. I mean it's you know north of ... Maybe a billion dollars might get it done. >> Yeah. >> You know so would you pay a billion dollars for Hortonworks? >> Not out of my own pocket. >> No, I mean if you're IBM. You think that would deliver that kind of value? I mean you know how IBM thinks about acquisitions. They're good at acquisitions. They look at the IRR. They have their formula. They blue-wash the companies and they generally do very well with acquisitions. Do you think Hortonworks would fit that profile, that monetization profile? >> I wouldn't say that Hortonworks, in terms of monetization potential, would match say what IBM has achieved by acquiring Netezza. >> Cognos. >> Or SPSS. I mean SPSS has been an extraordinarily successful ... >> Well the day IBM acquired SPSS they tripled the license fees. As a customer I know, ouch, it worked. It was incredibly successful. >> Well, yeah. Cognos was. Netezza was. And SPSS. 
Those three acquisitions in the last ten years have been extraordinarily pivotal and successful for IBM to build what they now have, which is really the most comprehensive portfolio of fit-to-purpose data platform. So in other words all those acquisitions prepared IBM to duke it out now with their primary competitors in this new field, which are Microsoft, who's newly resurgent, and Amazon Web Services. In other words, the two Seattle vendors, Seattle has come on strong, in a way that almost Seattle now in big data in the cloud is eclipsing Silicon Valley, in terms of where you know ... It's like the locus of innovation and really of customer adoption in the cloud space. >> Quite amazing. Well Google still hanging in there. >> Oh yeah. >> Alright, Jim. Really a pleasure working with you today. Thanks so much. Really appreciate it. >> Thanks for bringing me on your team. >> And Munich crew, you guys did a great job. Really well done. Chuck, Alex, Patrick wherever he is, and our great makeup lady. Thanks a lot. Everybody back home. We're out. This is Fast Track Your Data. Go to IBMgo.com for all the replays. Youtube.com/SiliconANGLE for all the shows. TheCUBE.net is where we tell you where theCUBE's going to be. Go to wikibon.com for all the research. Thanks for watching everybody. This is Dave Vellante with Jim Kobielus. We're out.

Published Date : Jun 25 2017

SUMMARY :

Brought to you by IBM. I mean they were you know just kind of ... I think the word you used last night was perfunctory. And a couple of things of importance to European customers, first and foremost GDPR. IBM knows how to throw a party. I mean terms of what you learn. seen in the past, where you could just sort of fluff it off. I mean the average person is not buzzing about GDPR, but it's hugely important. I don't see a lot of the deep learning stuff quite yet. And there's a growing range of open source deep learning toolkits beyond you know TensorFlow, of Hadoop is that the guys are going to make the money in this big data business of the And the ecosystem as it develops somebody's going to clean up. Let's talk about the research that you're working on. the pipelines and the tools that allow you to do that. Who do you who impresses you in the developer community? all around the world, in terms of hackathons and developer days, you know meetups here Is it IBM sort of going after the core, trying to evolve that core you know constituency? They've got huge amounts of developers energized all around the world working on this platform. Likewise a big sequel makes perfect sense as the sequel front end to the HDP. You know educate us. The developer angle would supercharge the developer angle, and maybe make it more relevant Hortonworks is really deep on the governance of Hadoop. I mean it's you know north of ... They blue-wash the companies and they generally do very well with acquisitions. I wouldn't say that Hortonworks, in terms of monetization potential, would match say I mean SPSS has been an extraordinarily successful ... Well the day IBM acquired SPSS they tripled the license fees. now in big data in the cloud is eclipsing Silicon Valley, in terms of where you know Well Google still hanging in there. Really a pleasure working with you today. And Munich crew, you guys did a great job.

ENTITIES

Entity	Category	Confidence
Kate Silverton	PERSON	0.99+
Jim Kobielus	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Jim	PERSON	0.99+
Hilary Mason	PERSON	0.99+
Google	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
IBM	ORGANIZATION	0.99+
Apple	ORGANIZATION	0.99+
Europe	LOCATION	0.99+
Patrick	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Germany	LOCATION	0.99+
Hortonworks	ORGANIZATION	0.99+
Y2K	ORGANIZATION	0.99+
Dave	PERSON	0.99+
Chuck	PERSON	0.99+
Amazon Web Services	ORGANIZATION	0.99+
Munich	LOCATION	0.99+
England	LOCATION	0.99+
Rob Thomas	PERSON	0.99+
second track	QUANTITY	0.99+
Siri	TITLE	0.99+
two	QUANTITY	0.99+
21st century	DATE	0.99+
three track	QUANTITY	0.99+
Rob	PERSON	0.99+
next year	DATE	0.99+
4%	QUANTITY	0.99+
Mena Scoyal	PERSON	0.99+
Alex	PERSON	0.99+
Whole Foods	ORGANIZATION	0.99+
Each	QUANTITY	0.99+
Cloudant	ORGANIZATION	0.99+

Seth Dobrin, IBM Analytics - IBM Fast Track Your Data 2017

>> Announcer: Live from Munich, Germany; it's The Cube. Covering IBM; fast-track your data. Brought to you by IBM. (upbeat techno music) >> For you here at the show, generally; and specifically, what are you doing here today? >> There's really three things going on at the show, three high level things. One is we're talking about our new... How we're repositioning our hybrid data management portfolio, specifically some announcements around DB2 in a hybrid environment, and some highly transactional offerings around DB2. We're talking about our unified governance portfolio; so actually delivering a platform for unified governance that allows our clients to interact with governance and data management kind of products in a more streamlined way, and help them actually solve a problem instead of just offering products. The third is really around data science and machine learning. Specifically we're talking about our machine learning hub that we're launching here in Germany. Prior to this we had a machine learning hub in San Francisco, Toronto, one in Asia, and now we're launching one here in Europe. >> Seth, can you describe what this hub is all about? This is a data center where you're hosting machine learning services, or is it something else? >> Yeah, so this is where clients can come and learn how to do data science. They can bring their problems, bring their data to our facilities, learn how to solve a data science problem in a more team oriented way; interacting with data scientists, machine learning engineers, basically, data engineers, developers, to solve a problem for their business around data science. These previous hubs have been completely booked, so we wanted to launch them in other areas to try and expand the capacity of them. >> You're hosting a round table today, right, on the main tent? >> Yep. >> And you got a customer on, you guys going to be talking about sort of applying practices and financial and other areas. Maybe describe that a little bit. 
>> We have a customer on from ING, Heinrich, who's the chief architect for ING. ING, IBM, and Hortonworks have a consortium, if you would, or a framework that we're doing around Apache Atlas and Ranger, as the kind of open-source operating system for our unified governance platform. So much as IBM has positioned Spark as a unified, kind of open-source operating system for analytics, for a unified governance platform... For a governance platform to be truly unified, you need to be able to integrate metadata. The biggest challenge about connecting your data environments, if you're an enterprise that was not internet born, or cloud born, is that you have proprietary metadata platforms that all want to be the master. When everyone wants to be the master, you can't really get anything done. So what we're doing around Apache Atlas is we are setting up Apache Atlas as kind of a virtual translator, if you would, or a dictionary between all the different proprietary metadata platforms so that you can get a single unified view of your data environment across hybrid clouds, on premise, in the cloud, and across different proprietary vendor platforms. Because it's open-sourced, there are these connectors that can go in and out of the proprietary platforms. >> So Seth, you seem like you're pretty tuned in to the portfolio within the analytics group. How are you spending your time as the Chief Data Officer? How do you balance it between customer visits, maybe talking about some of the products, and then your sort of day job? >> I actually have three day jobs. My job's actually split into kind of three pieces. The first, my primary mission, is really around transforming IBM's internal business unit, internal business workings, to use data and analytics to run our business. So kind of internal business unit transformation. Part of that business unit transformation is also making sure that we're compliant with regulations like GDPR and other regulations. 
Another third is really around kind of rethinking our offerings from a CDO perspective. As a CDO, and as you know, Dave, I've only been with IBM for seven months. As a former client recently, and as a CDO, what is it that I want to see from IBM's offerings? We kind of hit on it a little bit with the unified governance platform, where I think IBM makes fantastic products. But as a client, if a salesperson shows up to me, I don't want them selling me a product, 'cause if I want an MDM solution, I'll call you up and say, "Hey, I need an MDM solution. Give me a quote." What I want is them showing up saying, "I have a solution that's going to solve your governance problem across your portfolio." Or, "I'm going to solve your data science problem." Or, "I'm going to help you master your data, and manage your data across all these different environments." So really working with the offering management and the Dev teams to define what are these three or four, kind of business platforms that we want to settle on? We know three of them at least, right? We know that we have a hybrid data management. We have unified governance. We have data science and machine learning, and you could think of the Z franchise as a fourth platform. >> Seth, can you net out how governance relates to data science? 'Cause there is governance of the statistical models, machine learning, and so forth, version control. I mean, in an end to end machine learning pipeline, there's various versions of various artifacts they have to be managed in a structured way. Is your unified governance bundle, or portfolio, does it address those requirements? Or just the data governance? >> Yeah, so the unified governance platform really kind of focuses today on data governance and how good data governance can be an enabler of rapid data science. 
So if you have your data all pre-governed, it makes it much quicker to get access to data and understand what you can and can't do with data; especially being here in Europe, in the context of the EU GDPR. You need to make sure that your data scientists are doing things that are approved by the user, because basically your data, you have to give explicit consent to allow things to be done with it. But long term vision is that... essentially the output of models is data, right? And how you use and deploy those models also need to be governed. So the long term vision is that we will have a governance platform for all those things, as well. I think it makes more sense for those things to be governed in the data science platform, if you would. And we... >> We often hear separate from GDPR and all that, is something called algorithmic accountability; that more is being discussed in policy circles, in government circles around the world, as strongly related to everything you're describing. Being able to trace the lineage of any algorithmic decision back to the data, the metadata, and so forth, and the machine learning models that might have driven it. Is that where IBM's going with this portfolio? >> I think that's the natural extension of it. We're thinking really in the context of them as two different pieces, but if you solve them both and you connect them together, then you have that problem. But I think you're absolutely right. As we're leveraging machine learning and artificial intelligence, in general, we need to be able to understand how we got to a decision, and that includes the model, the data, how the data was gathered, how the data was used and processed. So it is that entire pipeline, 'cause it is a pipeline. You're not doing machine learning or AI in a vacuum. You're doing it in the context of the data, and you're doing it in the context about the individuals or the organizations that you're trying to influence with the output of those models. 
>> I call it DevOps for data science. >> Seth, in the early Hadoop days, the real headwind was complexity. It still is, by the way. We know that. Companies like IBM are trying to reduce that complexity. Spark helps a little bit. So the technology will evolve, we get that. It seems like one of the other big headwinds right now is that most companies don't have a great understanding of how they can take data and monetize it, turn it into value. Most companies, many anyway, make the mistake of, "Well, I don't really want to sell my data," or, "I'm not really a data supplier." And they're kind of thinking about it, maybe not in the right way. But we seem to be entering a next wave here, where people are beginning to understand I can cut costs, I can do predictive maintenance, I can maybe not sell the data, but I can enhance what I'm doing and increase my revenue, maybe my customer retention. They seem to be tuning, more so; largely, I think 'cause of the chief data officer roles, helping them think that through. I wonder if you would give us your point of view on that narrative. >> I think what you're describing is kind of the digital transformation journey. I think the end game, as enterprises go through a digital transformation, the end game is how do I sell services, outcomes, those types of things. How do I sell an outcome to my end user? That's really the end game of a digital transformation in my mind. But before you can get to that, before you transform your business's objectives, there's a couple of intermediary steps that are required for that. The first is what you're describing, is those kind of data transformations. Enterprises need to really get a handle on their data and become data driven, and start then transforming their current business model; so how do I accelerate my current business leveraging data and analytics? I kind of frame that, that's like the data science kind of transformation aspect of the digital journey. 
Then the next aspect of it is how do I transform my business and change my business objectives? Part of that first step is in fact, how do I optimize my supply chain? How do I optimize my workforce? How do I optimize my goals? How do I get to my current, you know, the things that Wall Street cares about for business; how do I accelerate those, make those faster, make those better, and really put my company out in front? 'Cause really in the grand scheme of things, there's two types of companies today; there's the company that's going to be the disruptor, and there's companies that's going to get disrupted. Most companies want to be the disruptors, and it's a process to do that. >> So the accounting industry doesn't have standards around valuing data as an asset, and many of us feel as though waiting for that is a mistake. You can't wait for that. You've got to figure out on your own. But again, it seems to be somewhat of a headwind because it puts data and data value in this fuzzy category. But there are clearly the data haves and the data have-nots. What are you seeing in that regard? >> I think the first... When I was in my former role, my former company went through an exercise of valuing our data and our decisions. I'm actually doing that same exercise at IBM right now. We're going through IBM, at least in the analytics business unit, the part I'm responsible for, and going to all the leaders and saying, "What decisions are you making?" "Help me understand the decisions that you're making." "Help me understand the data you need "to make those decisions." And that does two things. Number one, it does get to the point of, how can we value the decisions? 'Cause each one of those decisions has a specific value to the company. You can assign a dollar amount to it. But it also helps you change how people in the enterprise think. 
Because the first time you go through and ask these questions, they talk about the dashboards they want to help them make their preconceived decisions, validated by data. They have a preconceived notion of the decision they want to make. They want the data to back it up. So they want a dashboard to help them do that. So when you come in and start having this conversation, you kind of stop them and say, "Okay, what you're describing is a dashboard. That's not a decision. Let's talk about the decision that you want to make, and let's understand the real value of that decision." So you're doing two things, you're building a portfolio of decisions that then becomes, to your point, Jim, about DevOps for data science. It's your backlog for your data scientists, in the long run. You then connect those decisions to data that's required to make those, and you can extrapolate the data for each decision to the component that each piece of data makes up to it. So you can group your data logically within an enterprise; customer, product, talent, location, things like that, and you can assign a value to those based on decisions they support. >> Jim: So... >> Dave: Go ahead, please. >> As a CDO, following on that, are you also, as part of that exercise, trying to assess the value of not just the data, but of data science as a capability? Or particular data science assets, like machine learning models? In the overall scheme of things, that kind of valuation can then drive IBM's decision to ramp up their internal data science initiatives, or redeploy it, or, give me a... >> That's exactly what happened. As you build this portfolio of decisions, each decision has a value. So I am now assigning a value to the data science models that my team will build. As CDOs, CDOs are a relatively new role in many organizations. When money gets tight, they say, "What's this guy doing?" (Dave laughing) Having a portfolio of decisions that's saying, "Here's real value I'm adding..." 
So, number one, "Here's the value I can add in the future," and as you check off those boxes, you can kind of go and say, "Here's value I've added. Here's where I've changed how the company's operating. Here's where I've generated X billions of dollars of new revenue, or cost savings, or cost avoidance, for the enterprise." >> When you went through these exercises at your previous company, and now at IBM, are you using standardized valuation methodologies? Did you kind of develop your own, or come up with a scoring system? How'd you do that? >> I think there's some things around, like net promoter score, where there's pretty good standards on how to assign value to increases in net promoter score, or decreases in net promoter score for certain aspects of your business. In other ways, you need to kind of decide as an enterprise, how do we value our assets? Do we use a three year, five year, ten year NPV? Do we use some other metric? You need to kind of frame it in the reference that your CFO is used to talking about so that it's in the context that the company is used to talking about. Most companies, it's net present value. >> Okay, and you're measuring that on an ongoing basis. >> Seth: Yep. >> And fine tuning as you go along. Seth, we're out of time. Thanks so much for coming back on The Cube. It was great to see you. >> Seth: Yeah, thanks for having me. >> You're welcome, good luck this afternoon. >> Seth: Alright. >> Keep it right there, buddy. We'll be back. Actually, let me run down the day here for you, just take a second to do that. We're going to end our Cube interviews for the morning, and then we're going to cut over to the main tent. So in about an hour, Rob Thomas is going to kick off the main tent here with a keynote, talking about where data goes next. Hilary Mason's going to be on. There's a session with Dez Blanchfield on data science as a team sport. Then the big session on changing regulations, GDPR. 
Seth, you've got some customers that you're going to bring on and talk about these issues. And then, sort of balancing act, the balancing act of hybrid data. Then we're going to come back to The Cube and finish up our Cube interviews for the afternoon. There's also going to be two breakout sessions; one with Hilary Mason, and one on GDPR. You got to go to IBMgo.com and log in and register. It's all free to see those breakout sessions. Everything else is open. You don't even have to register or log in to see that. So keep it right here, everybody. Check out the main tent. Check out siliconangle.com, and of course IBMgo.com for all the action here. Fast track your data. We're live from Munich, Germany; and we'll see you a little later. (upbeat techno music)
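
Editor's sketch: the valuation exercise Seth describes in this interview (assign each decision a value, express it as net present value over a horizon the CFO is used to, then roll the value up to the data groups each decision depends on) can be illustrated roughly as below. This is not IBM's method. The `Decision` class, the flat discount rate, the example figures, and the equal split of a decision's value across its data groups are all assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Decision:
    """A hypothetical entry in a CDO's decision portfolio."""
    name: str
    annual_value: float            # assumed yearly benefit in dollars
    data_domains: Tuple[str, ...]  # data groups the decision relies on


def npv(cash_flows: List[float], rate: float) -> float:
    """Net present value of end-of-year cash flows at a flat discount rate."""
    return sum(cf / (1 + rate) ** year
               for year, cf in enumerate(cash_flows, start=1))


def value_portfolio(decisions: List[Decision], rate: float = 0.10,
                    years: int = 3) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (value per decision, value per data domain).

    Assumption: a decision's value is split equally across the data
    domains it uses; a real exercise would weight each contribution.
    """
    decision_values = {d.name: npv([d.annual_value] * years, rate)
                       for d in decisions}
    domain_values: Dict[str, float] = {}
    for d in decisions:
        share = decision_values[d.name] / len(d.data_domains)
        for domain in d.data_domains:
            domain_values[domain] = domain_values.get(domain, 0.0) + share
    return decision_values, domain_values


if __name__ == "__main__":
    portfolio = [
        Decision("churn offer", 2_000_000, ("customer", "product")),
        Decision("staffing plan", 500_000, ("talent", "location")),
    ]
    by_decision, by_domain = value_portfolio(portfolio, rate=0.08, years=5)
    for name, value in by_decision.items():
        print(f"{name}: {value:,.0f}")
    for domain, value in by_domain.items():
        print(f"{domain}: {value:,.0f}")
```

Changing `years` between 3, 5, and 10 shows how much the choice of horizon Seth mentions moves the resulting numbers, which is why he suggests settling on whatever the CFO already uses.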

Published Date : Jun 24 2017


ENTITIES

Entity	Category	Confidence
IBM	ORGANIZATION	0.99+
Dave	PERSON	0.99+
ING	ORGANIZATION	0.99+
Seth	PERSON	0.99+
Europe	LOCATION	0.99+
Seth Dobrin	PERSON	0.99+
Germany	LOCATION	0.99+
Jim	PERSON	0.99+
Hilary Mason	PERSON	0.99+
Rob Thomas	PERSON	0.99+
ten year	QUANTITY	0.99+
five year	QUANTITY	0.99+
seven months	QUANTITY	0.99+
Asia	LOCATION	0.99+
three year	QUANTITY	0.99+
three	QUANTITY	0.99+
four	QUANTITY	0.99+
Heinrich	PERSON	0.99+
Horton Works	ORGANIZATION	0.99+
Dez Blanchfield	PERSON	0.99+
two types	QUANTITY	0.99+
siliconangle.com	OTHER	0.99+
three days	QUANTITY	0.99+
two things	QUANTITY	0.99+
each piece	QUANTITY	0.99+
today	DATE	0.99+
Dav	PERSON	0.99+
each	QUANTITY	0.99+
first	QUANTITY	0.99+
Munich, Germany	LOCATION	0.99+
third	QUANTITY	0.99+
both	QUANTITY	0.99+
billions of dollars	QUANTITY	0.99+
one	QUANTITY	0.99+
One	QUANTITY	0.98+
two different pieces	QUANTITY	0.98+
three things	QUANTITY	0.98+
DB2	TITLE	0.98+
first step	QUANTITY	0.98+
GDPR	TITLE	0.97+
Apache Atlas	ORGANIZATION	0.97+
fourth platform	QUANTITY	0.97+
2017	DATE	0.97+
three pieces	QUANTITY	0.97+
IBM Analytics	ORGANIZATION	0.96+
first time	QUANTITY	0.96+
single	QUANTITY	0.96+
Spark	TITLE	0.95+
Ranger	ORGANIZATION	0.91+
two breakout sessions	QUANTITY	0.88+
about an hour	QUANTITY	0.86+
each decision	QUANTITY	0.85+
Cube	COMMERCIAL_ITEM	0.84+
each one	QUANTITY	0.83+
this afternoon	DATE	0.82+
Cube	ORGANIZATION	0.8+
San Francisco, Toronto	LOCATION	0.79+
GDPRs	TITLE	0.76+
GDBR	TITLE	0.75+

Rob Thomas, IBM Analytics | IBM Fast Track Your Data 2017

>> Announcer: Live from Munich, Germany, it's theCUBE. Covering IBM: Fast Track Your Data. Brought to you by IBM. >> Welcome, everybody, to Munich, Germany. This is Fast Track Your Data brought to you by IBM, and this is theCUBE, the leader in live tech coverage. We go out to the events, we extract the signal from the noise. My name is Dave Vellante, and I'm here with my co-host Jim Kobielus. Rob Thomas is here, he's the General Manager of IBM Analytics, and longtime CUBE guest, good to see you again, Rob. >> Hey, great to see you. Thanks for being here. >> Dave: You're welcome, thanks for having us. So we're talking about, we missed each other last week at the Hortonworks DataWorks Summit, but you came on theCUBE, you guys had the big announcement there. You're sort of getting out, doing a Hadoop distribution, right? TheCUBE gave up our Hadoop distributions several years ago so. It's good that you joined us. But, um, that's tongue-in-cheek. Talk about what's going on with Hortonworks. You guys are now going to be partnering with them essentially to replace BigInsights, you're going to continue to service those customers. But there's more than that. What's that announcement all about? >> We're really excited about that announcement, that relationship, just to kind of recap for those that didn't see it last week. We are making a huge partnership with Hortonworks, where we're bringing data science and machine learning to the Hadoop community. So IBM will be adopting HDP as our distribution, and that's what we will drive into the market from a Hadoop perspective. Hortonworks is adopting IBM Data Science Experience and IBM machine learning to be a core part of their Hadoop platform. And I'd say this is a recognition. One is, companies should do what they do best. We think we're great at data science and machine learning. Hortonworks is the best at Hadoop. Combine those two things, it'll be great for clients. 
And, we also talked about extending that to things like Big SQL, where they're partnering with us on Big SQL, around modernizing data environments. And then third, which relates a little bit to what we're here in Munich talking about, is governance, where we're partnering closely with them around unified governance, Apache Atlas, advancing Atlas in the enterprise. And so, it's a lot of dimensions to the relationship, but I can tell you since I was on theCUBE a week ago with Rob Bearden, client response has been amazing. Rob and I have done a number of client visits together, and clients see the value of unlocking insights in their Hadoop data, and they love this, which is great. >> Now, I mean, the Hadoop distro, I mean early on you got into that business, just, you had to do it. You had to be relevant, you want to be part of the community, and a number of folks did that. But it's really sort of best left to a few guys who want to do that, and Apache open source is really, I think, the way to go there. Let's talk about Munich. You guys chose this venue. There's a lot of talk about GDPR, you've got some announcements around unified governance, but why Munich? >> So, there's something interesting that I see happening in the market. So first of all, you look at the last five years. There's only 10 companies in the world that have outperformed the S&P 500, in each of those five years. And we started digging into who those companies are and what they do. They are all applying data science and machine learning at scale to drive their business. And so, something's happening in the market. That's what leaders are doing. And I look at what's happening in Europe, and I say, I don't see the European market being that aggressive yet around data science, machine learning, how you apply data for competitive advantage, so we wanted to come do this in Munich. And it's a bit of a wake-up call, almost, to say hey, this is what's happening. 
We want to encourage clients across Europe to think about how do they start to do something now. >> Yeah, of course, GDPR is also a hook. The European Union and you guys have made some talk about that, you've got some keynotes today, and some breakout sessions that are discussing that, but talk about the two announcements that you guys made. There's one on DB2, there's another one around unified governance, what do those mean for clients? >> Yeah, sure, so first of all on GDPR, it's interesting to me, it's kind of the inverse of Y2K, which is there's very little hype, but there's huge ramifications. And Y2K was kind of the opposite. So look, it's coming, May 2018, clients have to be GDPR-compliant. And there's a misconception in the market that that only impacts companies in Europe. It actually impacts any company that does any type of business in Europe. So, it impacts everybody. So we are announcing a platform for unified governance that makes sure clients are GDPR-compliant. We've integrated software technology across analytics, IBM security, some of the assets from the Promontory acquisition that IBM did last year, and we are delivering the only platform for unified governance. And that's what clients need to be GDPR-compliant. The second piece is data has to become a lot simpler. As you think about my comment, who's leading the market today? Data's hard, and so we're trying to make data dramatically simpler. And so for example, with DB2, what we're announcing is you can download and get started using DB2 in 15 minutes or less, and anybody can do it. Even you can do it, Dave, which is amazing. >> Dave: (laughs) >> For the first time ever, you can-- >> We'll test that, Rob. >> Let's go test that. I would love to see you do it, because I guarantee you can. Even my son can do it. I had my son do it this weekend before I came here, because I wanted to see how simple it was. 
So that announcement is really about bringing, or introducing a new era of simplicity to data and analytics. We call it Download And Go. We started with SPSS, we did that back in March. Now we're bringing Download And Go to DB2, and to our governance catalog. So the idea is make data really simple for enterprises. >> You had a community edition previous to this, correct? There was-- >> Rob: We did, but it wasn't this easy. >> Wasn't this simple, okay. >> Not anybody could do it, and I want to make it so anybody can do it. >> Is simplicity, the rate of simplicity, the only differentiator of the latest edition, or I believe you have Kubernetes support now with this new edition, can you describe what that involves? >> Yeah, sure, so there's two main things that are new functionality-wise, Jim, to your point. So one is, look, we're big supporters of Kubernetes. And as we are helping clients build out private clouds, the best answer for that in our mind is Kubernetes, and so when we released Data Science Experience for Private Cloud earlier this quarter, that was on Kubernetes, extending that now to other parts of the portfolio. The other thing we're doing with DB2 is we're extending JSON support for DB2. So think of it as, you're working in a relational environment, now just through SQL you can integrate with non-relational environments, JSON, documents, any type of NoSQL environment. So we're finally bringing to fruition this idea of a data fabric, which is I can access all my data from a single interface, and that's pretty powerful for clients. >> Yeah, more cloud data development. Rob, I wonder if you can, we can go back to the machine learning, one of the core focuses of this particular event and the announcements you're making. Back in the fall, IBM made an announcement of Watson machine learning, for IBM Cloud, and World of Watson. In February, you made an announcement of IBM machine learning for the z platform. 
What are the machine learning announcements at this particular event, and can you sort of connect the dots in terms of where you're going, in terms of what sort of innovations are you driving into your machine learning portfolio going forward? >> I have a fundamental belief that machine learning is best when it's brought to the data. So, we started with, like you said, Watson machine learning on IBM Cloud, and then we said well, what's the next big corpus of data in the world? That's an easy answer, it's the mainframe, that's where all the world's transactional data sits, so we did that. Last week with the Hortonworks announcement, we said we're bringing machine learning to Hadoop, so we've kind of covered all the landscape of where data is. Now, the next step is about how do we bring a community into this? And the way that you do that is we don't dictate a language, we don't dictate a framework. So if you want to work with IBM on machine learning, or in Data Science Experience, you choose your language. Python, great. Scala or Java, you pick whatever language you want. You pick whatever machine learning framework you want, we're not trying to dictate that because there's different preferences in the market, so what we're really talking about here this week in Munich is this idea of an open platform for data science and machine learning. And we think that is going to bring a lot of people to the table. >> And with open, one thing, with open platform in mind, one thing to me that is conspicuously missing from the announcement today, correct me if I'm wrong, is any indication that you're bringing support for the deep learning frameworks like TensorFlow into this overall machine learning environment. Am I wrong? I know you have Power AI. Is there a piece of Power AI in these announcements today? >> So, stay tuned on that. We are, it takes some time to do that right, and we are doing that. 
But we want to optimize so that you can do machine learning with GPU acceleration on Power AI, so stay tuned on that one. But we are supporting multiple frameworks, so if you want to use TensorFlow, that's great. If you want to use Caffe, that's great. If you want to use Theano, that's great. That is our approach here. We're going to allow you to decide what's the best framework for you. >> So as you look forward, maybe it's a question for you, Jim, but Rob I'd love you to chime in. What does that mean for businesses? I mean, is it just more automation, more capabilities as you evolve that timeline, without divulging any sort of secrets? What do you think, Jim? Or do you want me to ask-- >> What do I think, what do I think you're doing? >> No, you ask about deep learning, like, okay, that's, I don't see that, Rob says okay, stay tuned. What does it mean for a business, that, if like-- >> Yeah. >> If I'm planning my roadmap, what does that mean for me in terms of how I should think about the capabilities going forward? >> Yeah, well what it means for a business, first of all, is what they're using deep learning for is doing things like video analytics, and speech analytics and more of the challenges involving convolutional neural networks to do pattern recognition on complex data objects for things like connected cars, and so forth. Those are the kind of things that can be done with deep learning. >> Okay. And so, Rob, you're talking about here in Europe how the uptick in some of the data orientation has been a little bit slower, so I presume from your standpoint you don't want to over-rotate, to some of these things. But what do you think, I mean, it sounds like there is a difference between certainly Europe and those top 10 companies in the S&P, outperforming the S&P 500. What's the barrier, is it just an understanding of how to take advantage of data, is it cultural, what's your sense of this? 
>> So, to some extent, data science is easy, data culture is really hard. And so I do think that culture's a big piece of it. And the reason we're kind of starting with a focus on machine learning, simplistic view, machine learning is a general-purpose framework. And so it invites a lot of experimentation, a lot of engagement, we're trying to make it easier for people to on-board. As you get to things like deep learning as Jim's describing, that's where the market's going, there's no question. Those tend to be very domain-specific, vertical-type use cases and to some extent, what I see clients struggle with, they say well, I don't know what my use case is. So we're saying, look, okay, start with the basics. A general purpose framework, do some tests, do some iteration, do some experiments, and once you find out what's hunting and what's working, then you can go to a deep learning type of approach. And so I think you'll see an evolution towards that over time, it's not either-or. It's more of a question of sequencing. >> One of the things we've talked to you about on theCUBE in the past, you and others, is that IBM obviously is a big services business. This big data is complicated, but great for services, but one of the challenges that IBM and other companies have had is how do you take that service expertise, codify it to software and scale it at large volumes and make it adoptable? I thought the Watson data platform announcement last fall, I think at the time you called it Data Works, and then so the name evolved, was really a strong attempt to do that, to package a lot of expertise that you guys had developed over the years, maybe even some different software modules, but bring them together in a scalable software package. So is that the right interpretation, how's that going, what's the uptake been like? >> So, it's going incredibly well. 
What's interesting to me is what everybody remembers from that announcement is the Watson Data Platform, which is a decomposable framework for doing these types of use cases on the IBM cloud. But there was another piece of that announcement that is just as critical, which is we introduced something called the Data First method. And that is the recipe book to say to a client, so given where you are, how do you get to this future on the cloud? And that's the part that people, clients, struggle with, is how do I get from step to step? So with Data First, we said, well look. There's different approaches to this. You can start with governance, you can start with data science, you can start with data management, you can start with visualization, there's different entry points. You figure out the right one for you, and then we help clients through that. And we've made Data First method available to all of our business partners so they can go do that. We work closely with our own consulting business on that, GBS. But that to me is actually the thing from that event that has had, I'd say, the biggest impact on the market, is just helping clients map out an approach, a methodology, to getting on this journey. >> So that was a catalyst, so this is not a sequential process, you can start, you can enter, like you said, wherever you want, and then pick up the other pieces from a maturity model standpoint? >> Exactly, because everybody is at a different place in their own life cycle, and so we want to make that flexible. >> I have a question about the clients, the customers' use of Watson Data Platform in a DevOps context. So, are more of your customers looking to use Watson Data Platform to automate more of the stages of the machine learning development and the training and deployment pipeline, and do you see, IBM, do you see yourself taking the platform and evolving it into a more full-fledged automated data science release pipelining tool? Or am I misunderstanding that? 
>> Rob: No, I think that-- >> Your strategy. >> Rob: You got it right, I would just, I would expand a little bit. So, one is it's a very flexible way to manage data. When you look at the Watson Data Platform, we've got relational stores, we've got column stores, we've got in-memory stores, we've got the whole suite of open-source databases under the Compose.io umbrella, we've got Cloudant. So we've delivered a very flexible data layer. Now, in terms of how you apply data science, we say, again, choose your model, choose your language, choose your framework, that's up to you, and we allow clients, many clients start by building models on their private cloud, then we say you can deploy those into the Watson Data Platform, so therefore then they're running on the data that you have as part of that data fabric. So, we're continuing to deliver a very fluid data layer which then you can apply data science, apply machine learning there, and there's a lot of data moving into the Watson Data Platform because clients see that flexibility. >> All right, Rob, we're out of time, but I want to kind of set up the day. We're doing CUBE interviews all morning here, and then we cut over to the main tent. You can get all of this on IBMgo.com, you'll see the schedule. Rob, you've got, you're kicking off a session. We've got Hilary Mason, we've got a breakout session on GDPR, maybe set up the main tent for us. >> Yeah, main tent's going to be exciting. We're going to debunk a lot of misconceptions about data and about what's happening. Marc Altshuller has got a great segment on what he calls the death of correlations, so we've got some pretty engaging stuff. Hilary's got a great piece that she was talking to me about this morning. It's going to be interesting. We think it's going to provoke some thought and ultimately provoke action, and that's the intent of this week. >> Excellent, well Rob, thanks again for coming to theCUBE. It's always a pleasure to see you. 
>> Rob: Thanks, guys, great to see you. >> You're welcome; all right, keep it right there, buddy, We'll be back with our next guest. This is theCUBE, we're live from Munich, Fast Track Your Data, right back. (upbeat electronic music)

Published Date : Jun 22 2017


ENTITIES

Entity	Category	Confidence
Jim Kobielus	PERSON	0.99+
Dave Vellante	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Jim	PERSON	0.99+
Europe	LOCATION	0.99+
Rob	PERSON	0.99+
Marc Altshuller	PERSON	0.99+
Hilary	PERSON	0.99+
Hilary Mason	PERSON	0.99+
Rob Bearden	PERSON	0.99+
February	DATE	0.99+
Dave	PERSON	0.99+
Hortonworks	ORGANIZATION	0.99+
Rob Thomas	PERSON	0.99+
May 2018	DATE	0.99+
March	DATE	0.99+
Munich	LOCATION	0.99+
Scala	TITLE	0.99+
Apache	ORGANIZATION	0.99+
second piece	QUANTITY	0.99+
Last week	DATE	0.99+
Java	TITLE	0.99+
last year	DATE	0.99+
two announcements	QUANTITY	0.99+
10 companies	QUANTITY	0.99+
GDPR	TITLE	0.99+
Python	TITLE	0.99+
DB2	TITLE	0.99+
15 minutes	QUANTITY	0.99+
last week	DATE	0.99+
IBM Analytics	ORGANIZATION	0.99+
European Union	ORGANIZATION	0.99+
five years	QUANTITY	0.99+
JSON	TITLE	0.99+
Watson Data Platform	TITLE	0.99+
third	QUANTITY	0.99+
One	QUANTITY	0.99+
this week	DATE	0.98+
today	DATE	0.98+
a week ago	DATE	0.98+
two things	QUANTITY	0.98+
SQL	TITLE	0.98+
last fall	DATE	0.98+
2017	DATE	0.98+
Munich, Germany	LOCATION	0.98+
each	QUANTITY	0.98+
Y2K	ORGANIZATION	0.98+

Search Results for Hilary Mason: