Alex Sadovsky, Oracle - Data Platforms 2017 - #DataPlatforms2017
>> Announcer: Live from the Wigwam in Phoenix, Arizona, it's the CUBE, covering Data Platforms 2017. Brought to you by Qubole. >> Hey, welcome back everybody, Jeff Frick here with the CUBE along with George Gilbert. We're at Data Platforms 2017 at the historic 99 years young Wigwam resort outside of Phoenix and we're excited to be joined by our next guest, Alex Sadovsky, he's the director of data science for Oracle Data Cloud. Welcome. >> Thanks, thanks for having me. >> Absolutely, I know we've got a short time window, you're racing off to your next session. So, for the people that aren't here, what are you going to be talking about in your session here? >> So, the Oracle Data Cloud, what we do is online advertising and essentially we have lots and lots of data, customers come to us and they have some sort of question in mind. They want to say, I want to figure out who's going to buy a mini-van in California next month, or who's going to get a hotel in Las Vegas, who's going to buy Kraft macaroni and cheese? All sorts of different questions. We have all of that data, we have to turn it into actionable insights, into audiences for them so they can advertise on Facebook, Twitter, all over the web. And so, what this talk is really focusing on is how do we take all of this data and use it efficiently? And it's going to talk about the technologies that we've used, specifically Hive, and then moving that technology over to Spark, just so that we can use more data, get quicker processing, and essentially make our clients have a better experience and give 'em a better product. >> And do the clients execute the results of this process inside their other Oracle apps, or is it something that they can use with any number of apps? >> So, a lot of the ways that we work, we actually are interfaced with companies like Facebook and Twitter directly. 
And so, essentially what we're doing is we're partnering with them so that the client, all they really need to do is kind of come to us either onboard some data through maybe other Oracle applications or onboard data directly through us and then push it out, we help push it all the way through the process, all the way into Facebook, etc. >> Yes, 'cause we were covering Oracle modern marketing, which is now Oracle modern customer experience, I'm sure you guys must be tightly integrated with all that. >> Yeah, and so for Oracle Data Cloud it's kind of interesting we're a collaboration of five recently acquired start-ups. And so it's everything from two to three years ago all of this coming together. So, for us, we're really excited because we're just at the tip of the iceberg of getting into the whole Oracle ecosystem and having that help build up our product even better. >> So, when you say partner with Facebook or Twitter, that would be for brand or direct response advertising that one of your B2B clients has signed up for? Or I should say, B2B, your B the client is B, and the end customer's C, so it's a B2B2C. And now okay, so you help them in a consultative way. You have the data, you have a consultative sales approach, are you building models for them? Or are you telling them, sort of running a model? >> Alex: Yeah. >> Sort of which is it? >> So, we will, we run models based upon data. So, a customer could come to us with, here are a thousand people that that customer knows bought their product last month, and they say, we want to expand our business, we want to advertise to 20 million people who might be similar to those thousand. And so that's where all of our data comes in. We can look at those thousand people and we can say, hey did you guys know that most of your customers are millennials? Did you know that most of them tend to live on the west coast or east coast populated cities? 
And we're not really consulting that in the sense of like there's people looking at the data, it's all machine learning. And so computers are looking at all of our data to help get insights from what the customers are bringing to us. >> So, would it be fair to say then that the, let's say the thousand example that the customer brings in is the training data. >> Yes. >> And then you use your data in your databases, your consumer databases, to say, to generate essentially scores that they're going to send out to these. >> That's 100% right. They come in with a thousand of their customers, we see how those customers rank up against every single household in the entire United States. >> I was going to say, we're going to be at Spark Summit in a couple weeks or a week, whenever it is. I can't keep track of all these shows. So, we can't do the whole thing on why Hive to Spark, but in three minutes or less, why Hive to Spark? >> So, number one reason for us, and number one reason I think a lot of people are moving to Spark, is just speed. Without getting into a lot of technical details, there's just a lot better engine, a lot better, more flexible engine underneath Spark than kind of traditional Hive. >> And then machine learning models are, most of the libraries are built in, which Hive doesn't have. >> Yeah, machine learning is really built into Spark. There's, you know, whole projects within Spark built around that. And so, for us, we really, Spark considers machine learning kind of a first class citizen. And since that's essentially what our business is, we go 100% into Spark as well. >> So, let me ask you, what is the scope now and potentially in the future for these data-based predictive models where a customer comes to you with essentially some labeled data, I guess that's the training data, and then right now you have data in what categories? And then what categories would you like to have? 
>> So, we have data everything from what people are doing on the web, so what they're searching for, what websites they're going to. We have grocery store data. So, what people are buying in the grocery store. We have retail data. So, what people are buying in the malls. Because a lot of what happens is, even though consumers are spending a lot more time on the web, 80%-90% of purchases are still made in the store. So, we have all of this actual real world purchase data through partnerships with different retail partners, including automotive data, too. So that's really like the core of our data. So, really what we try to do is have data sets strategically placed all around and that's why the Oracle Data Cloud is made up of so many different start-ups, we're really getting expertise from different areas for different data sets to bring that together. >> Do you need to buy those sources of data? Or can you license? >> Data is everything from licensed to purchased outright to shared, revenue sharing with other companies. It's really, there's a huge data market right now. It's kind of the data gold rush and we're trying to get in anywhere we can, figure out what's going to help us and what's going to help our customers make better models. >> What would you like to see in terms of a, if you look out a couple years, where would you like to see your data assets sort of augment all your Oracle applications? >> Yeah, so I think... So, augmenting Oracle, really we have so many different data assets, everything from like live streaming data, of what people are searching for on the web, to historically what someone has bought in the last three years, and so, as we partner more and more with Oracle, Oracle has different things in healthcare, in retail, in all sorts of B2B applications. And our data really can fit almost everywhere. It's really like a data driven sort of product. 
And so, we've been partnering with Oracle left and right, with many different groups, just trying to figure out where this data can help augment your services. >> Alright, Alex, well, we got to leave it there. That was a good summary. I know you got to race off to your thing. I'll let you take a breath and get a glass of water. So thanks for squeezing us in your busy day. >> Alex: Thanks so much. >> Alright, he's Alex, he's George, I'm Jeff, you're watching the CUBE from Data Platforms 2017. We'll be right back after this short break. Thanks for watching.
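The lookalike-audience workflow Sadovsky describes, a thousand seed customers as training data scored against every household, can be sketched in miniature. At Oracle Data Cloud scale this would run on Spark (MLlib), not plain Python, and the feature names and similarity rule below are our own illustrative assumptions, not Oracle's actual model:

```python
import math

# Hypothetical audience features: [is_millennial, coastal_city, bought_before]
seed_customers = [          # the ~thousand known buyers the client brings in
    [1, 1, 1],
    [1, 0, 1],
    [0, 1, 1],
]

households = {              # the universe to score (every US household, in reality)
    "hh-001": [1, 1, 0],
    "hh-002": [0, 0, 0],
    "hh-003": [1, 1, 1],
}

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Build the seed "profile": the mean feature vector of the known buyers.
profile = [sum(col) / len(seed_customers) for col in zip(*seed_customers)]

# Score every household against the profile and rank them; the top of the
# ranking becomes the audience pushed out to Facebook, Twitter, etc.
scores = {hh: cosine(vec, profile) for hh, vec in households.items()}
audience = sorted(scores, key=scores.get, reverse=True)
```

A real pipeline would learn weights (for example, logistic regression in Spark MLlib) rather than averaging features, but the shape is the same: fit on the seed list, score the universe, rank, and export the top of the list as the audience.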
Kellyn Pot'Vin Gorman, Delphix - Data Platforms 2017 - #DataPlatforms2017
>> Announcer: Live from the Wigwam in Phoenix, Arizona. It's theCUBE covering Data Platforms 2017. Brought to you by Qubole. >> Hey welcome back everybody. Jeff Frick here with theCUBE. We're at the historic Wigwam Resort. 99 years young just outside of Phoenix. At Data Platforms 2017. I'm Jeff Frick here with George Gilbert from Wikibon who's co-hosting with me all day. Getting to the end of the day. And we're excited to have our next guest. She is Kellyn Gorman. The technical intelligence manager and also the office of the CTO at Delphix, welcome. >> Yes, thank you, thank you so much. >> Absolutely, so what is Delphix for people that aren't familiar with Delphix? >> Most of us realize that the database, and data in general, is the bottleneck and Delphix completely revolutionizes that. We remove it from being the bottleneck by virtualizing data. >> So you must love this show. >> Oh I do, I do. I'm hearing all about all kinds of new terms that we can take advantage of. >> Right, Cloud-Native and separate, you know, and I think just the whole concept of atomic computing. Breaking down, removing storage from server. Breaking it down into smaller parts. Sounds like it fits right into kind of you guys' wheelhouse. >> Yeah, I kind of want to containerize it all and be able to move it everywhere. But I love it. Yeah. >> So what do you think of this whole concept of Data Ops? We've been talking about Dev Ops for, I don't know how long... How long have we been talking about Dev Ops George? Five years? Six years? A while? >> Yeah a while (small chuckle) >> But now... >> Actually maybe eight years. >> Jeff: you're dating yourself George. (all laugh) Now we're talking about Data Ops, right? And there's a lot of talk of Data Ops. So this is the first time I've really heard it coined in such a way where it really becomes the primary driver in the way that you basically deliver value inside your organization. >> Oh absolutely. You know I come from the database realm. 
I was a DBA for over two decades and Dev Ops was a hard sell to a lot of DBAs. They didn't want to hear about it. I tried to introduce it over and over. The idea of automating and taking us kind of out of this manual intervention that introduced, many times, human error. So Dev Ops was a huge step forward getting that out of there. But the database, and data in general, was still this bottleneck. So Data Ops is the idea that you automate all of this, and if you virtualize that data, we found with Delphix that removed that last hurdle. And that was my, I guess my session was on virtualizing big data. The idea that I could take any kind of structured or unstructured file and virtualize that as well, and instead of deploying it to multiple environments, I was able to deploy it once and actually do IO on demand. >> So let's peel the onion on that a little bit. What does it mean to virtualize data? And how does that break databases' bottleneck on the application? >> Well right now, when you talk about relational data or any kind of legacy data store, people are duplicating that through archaic processes. So if we talk about Oracle, they're using things like Data Pump. They're using transportable tablespaces. These are very cumbersome, they take a very long time. Especially with the introduction of the cloud, there's a lot of room for failure. It's not made for that, especially as the network is our last bottleneck, which is what we're also feeling for many of these folks. When we introduce big data, many of these environments, many of these, I guess you'd say projects, came out of open source. They were done as a need, as a necessity to fulfill. And they've got a lot of moving pieces. And to be able to containerize that and then deploy it once and then virtualize it, so instead of, let's say, you have 16 gigs that you need to duplicate here and over and over again. Especially if you're going on-prem or to the cloud. 
That I'm able to do it once and then do that IO on demand and go back to a gold copy, a central location. And it makes it look like it's there. I was able to deploy a 16 gig file to multiple environments in less than a minute. And then each of those developers each have their own environment. Each tester has their own and they actually have a read-write, full, robust copy. That's amazing to folks. All of a sudden, they're not held back by it. >> So our infrastructure analysts and our Wikibon research CTO David Floyer, if I'm understanding this correctly, talks about this where it's almost like a snapshot. >> Absolutely >> And it's a read-write snapshot although you're probably not going to merge it back into the original. And this way dev, test, and whoever else wants to operate on live data can do that. >> Absolutely, it's full read-write, what we call data version control. We've always had version control at the code level. You may have had it at the actual server level. But you've rarely ever had it at the data level for the database or with flat files. What I used was the cms.gov data. It's available to everyone, it's public data. And we realized that these files were quite large and cumbersome. And I was able to reproduce it and enhance what they were doing at TIME magazine. And create a use case that made sense to a lot of people. Things that they're seeing in their real world environments. >> So, tell us more, elaborate how dev ops expands on this, I'm sorry, not dev ops, data ops. How, take that as an example and generalize it some more so that we see how, if DBAs were a bottleneck, how they now can become an enabler? 
We're also seeing that data was seen as the centralized point. People were trying to come up with these pain points of solution to them. We're able to take that out completely. And people are able to embrace agility. They have agile environments now. Dev Ops means that they're able to automate that very easily instead of having that stopping point of constantly hitting a data and saying "I've got to take time to refresh this." "How am I going to refresh it?" "Can I do just certain..." We hear about this all the time with testing. When I go to testing summits, they are trying to create synchronized virtualized data. They're creating test data sets that they have to manage. It may not be the same as production where I can actually create a container of the entire developmental production environment. And refresh that back. And people are working on their full product. There's no room for error that you're seeing. Where you would have that if you were just taking a piece of it. Or if you were able to just grab just one tier of that environment because the data was too large before. >> So would the automation part be a generation of snapshot one or more snapshots. And then the sort of orchestration distribution to get it to the intended audiences? >> Yes, and we would use >> Okay. things like Jenkins through Chev normal dev ops tools work along with this. Along with command line utilities that are part of our product. To allow people to just create what they would create normally. But many times it's been siloed and like I said, work around that data. We've included the data as part of that. That they can deploy it just as fast. >> So a lot of the conversation here this morning was really about put the data all in this through your or pick your favorite public cloud to enable access to all the applications to the UPIs, through all different types of things. How does that impact kind of what you guys do in terms of conceptually? 
>> If you're able to containerize that, it makes you capable of deploying to multiple clouds. Which is what we're finding. About 60% of our customers are in more than one cloud, two to five exactly. As we're dealing with that and recognizing that, it's kind of like looking at your cloud environments like your phone providers. People see something shiny and new, a better price point, a lesser dollar. We're able to provide that, one, by saving all that storage space. It's virtualized, it's not taking a lot of disc space. Second of all, we're seeing them say "You know, I'm going to go over to Google." Oh guess what? This project says they need the data and they need to actually take the data source over to Amazon now. We're able to do that very easily. And we do it from multi-tier. Flat files, the data, legacy data sources, as well as our application tier. >> Now, when you're doing these snapshots, my understanding, if I'm getting it right, is it's like a, it's not a full Xerox. It's more like the delta. Like if someone's doing test dev they have some portion of the source of truth, and as they make changes to it, it grows to include the edits until they're done, in which case then the whole thing is blown away. >> It depends on the technology you're looking at. Ours is able to trap that. So when we're talking about a virtual database, we're using the native recovery mechanisms. Kind of think of it as a perpetual recovery state inside our Delphix engine. So those changes are going on and then you have your VDBs that are a snapshot in time that they're working on. >> Oh so like you take a snapshot and then it's like a journal >> the transactional data from the logs is continually applied. Of course it's different depending on each technology. So we do it differently for Sybase versus Oracle versus SQL Server and so on and so forth. Virtual files, when we talk about flat files, are different as well. For the parent, you take an exact snapshot of it. 
But it's really just projecting that NFS mount to another place. So that mount, if you replace those files, or update them of course, then you would be able to refresh and create a new snapshot of those files. So somebody said "We refresh these files every single night." You would be able to then refresh and project them out to the new place. >> Oh so you're, it's almost like you're sub-classing them... >> Yes. >> Okay, interesting... When you go into a company that's got a big data initiative, where do you fit in the discussion, in the sequence? How do you position the value-add relative to the data platform that's sort of the center of the priority of getting a platform in place? >> Well, that's what's so interesting about this is that we haven't really talked to a lot of big data companies. We've been very relational over a period of time. But our product is very much a Swiss Army knife. It will work on flat files. We've been doing it for multi-tier environments forever. It's that our customers are now going "I have 96 petabytes in Oracle. I'm about to move over to big data." So I was able to go out and say, how would I do this in a big data environment? And I found this use case being used by TIME magazine and then created my environment. And did it off of Amazon. But it was just a use case. It was just a proof of concept that I built to show and demonstrate that. Yeah, my guys back at the office are going "Kellyn, when you're done with it, you can just deliver it back to us." (laughing) >> Jeff: Alright Kellyn. Well thank you for taking a few minutes to stop by, and pretty interesting story. Everything's getting virtualized: machines, databases... >> Soon us! >> And our data. >> Soon George! >> Right, not me George... (George laughs) Alright, thanks again Kellyn >> Thank you so much. >> for stopping by. Alright I'm with George Gilbert. I'm Jeff Frick, you're watching theCUBE from Data Platforms 2017 in Phoenix, Arizona. Thanks for watching. 
(upbeat electronic music)
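The "gold copy plus IO on demand" model Gorman describes in this interview is essentially copy-on-write: every virtual copy reads through to one shared gold copy and records only its own changes. A toy sketch of the idea (the class and method names are ours for illustration, not Delphix's API):

```python
class GoldCopy:
    """The single physical copy of the data, stored once."""
    def __init__(self, blocks):
        self.blocks = dict(blocks)

class VirtualCopy:
    """A thin clone: reads fall through to the shared gold copy,
    writes land in a small private delta, so one large source can
    back many read-write copies almost for free."""
    def __init__(self, gold):
        self.gold = gold
        self.delta = {}          # only this clone's own changes

    def read(self, key):
        return self.delta.get(key, self.gold.blocks.get(key))

    def write(self, key, value):
        self.delta[key] = value  # the gold copy is never touched

gold = GoldCopy({"row1": "alpha", "row2": "beta"})
dev, qa = VirtualCopy(gold), VirtualCopy(gold)
dev.write("row1", "changed-in-dev")   # visible only in the dev clone
```

That is why a 16 gig source can back a dev copy, a QA copy, and a tester's copy in under a minute: each clone holds only its delta, and the gold copy is never modified.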
Colin Riddell, Epic Games - Data Platforms 2017 - #DataPlatforms2017
>> Narrator: Live from The Wigwam in Phoenix, Arizona, it's the CUBE. Covering Data Platforms 2017. Brought to you by Qubole. (techno music) >> Hey, welcome back everybody. Jeff Frick here with the CUBE. We are in The Wigwam Resort, historic Wigwam Resort, just outside of Phoenix, Arizona at Data Platforms 2017. It's a new Big Data event. You might say, god, there's already a lot of Big Data events, but Qubole's taken a different approach to Big Data. Cloud-first, cloud-native, you're integrated with all the big public clouds and they all come from Big Data backgrounds, practitioner backgrounds. So it's a really cool thing and we're really excited to have our next guest, Alex Ridell, he's a Big Data architect from Epic Games, was up on a panel earlier today. Colin, Welcome. >> Thank you, thank you for having me. >> Absolutely, so, enjoyed your panel, a lot of topics that you guys covered. One of the ones we hear over and over again is get early wins. How do you drive adoption, change people's behaviors, it's not really a technology story. It's a human factors and behaviors story. So I wonder if you can share some of your experience, some best practices, some stories. >> So I don't know if there's really a rule book on best practices for that. Every environment is different, every company is different. But one thing that seems to be constant is resistance to change in a lot of the places, so... >> Jeff: That is consistent. >> We had some challenges when I came in. We were running a system that was on its last legs basically, and we had to replace it. There was really no choice. There was no fixing it. And so, I did actually encounter a fair bit of resistance with regards to that when I started at Epic. >> Now it's interesting, you said a fair amount of resistance. Another one of your lessons was start slow, find some early wins, but you said, that you were thrown into a big project right off the bat. >> Colin: So, we were, yeah. 
>> I'm curious, how did the big project go, but when you do start slow, how small does it need to be where you can start to get these wins to break down the resistance? >> I think the way we approached it was we looked at what was the most crucial process, or the most crucial set of processes. And that's where we started. So that was what we tried to convert first and then make that data available to people via an alternative method, which was Hive. And once people started using it and learned how to interact with it properly, the barriers started to fall. >> What were some of the difficult change management issues? Where did you come from in terms of the technology platform and what resistance did you hit? >> So it was really a user interface that was the main factor of resistance. So we were running a Hadoop cluster. It was a fixed size, it wasn't on-prem, but it was in a private cloud. It was basically, simply being overloaded. We had to do constant maintenance on it. We had to prop it up. And the performance was degrading and degrading and degrading. The idea behind the replacement was really to give us something that was scalable, that would grow in the future, that wouldn't run into these performance blockers that we were having. But again, like I said, the hardest factor was the user interface differences. People were used to the tool set that they were working with, they liked the way it worked. >> What was the tool set? >> I would rather not actually say that on camera, >> Jeff: That's fine. >> Does it source itself in Redmond or something? >> No, no it doesn't, they're not from Redmond. I just don't want to cast aspersions. >> No, you don't need to cast aspersions. The conflict was really just around familiarity with the tool, it wasn't really about a wholesale change in behavior and becoming more data-centric. >> No, because the tool that we replaced was an effort to become more data-centric to begin with. 
There definitely was a corporate culture of "we want to be more data-informed." So that was not one of the factors that we had to overcome. It was really tool-based. >> But the games market is so competitive, right? You guys have to be on your game all the time and you got to keep an eye on what everybody else is doing in their games, and make course corrections when, as I understand it, something becomes hot or new, so you guys have to be super nimble on your feet. How does taking this approach help you be more nimble in the way that you guys get new code out, new functionality? >> It's really, really very easy for us now to inject new events into the game, we basically can break those events out and report on them or analyze what's going on in the game for free with the architecture that we have now. >> Does that mean it's the equivalent of, in IT operations, we instrument everything from the applications, to the middleware, down to the hardware. Are you essentially doing the same to the game so you can follow the pathway of a gamer, or the hotspots of all the gamers, that sort of thing? >> I'm not sure I fully understand your question. >> When you're running analytics on a massively multi-player game, what questions are you seeking to answer? >> Really what we are seeking to answer at the moment is what brings people back? What behaviors can we foster in-- >> Engagement. >> in our players. Yeah, engagement, exactly. >> And that's how you measure engagement, it's just as simple as, do they come back, or time on game? >> That's the most simple measure that we use for it, yeah. >> So Colin, we're short on time, want to give you the last word. When you come to a conference like this, there's a lot of peer interaction, there's some great questions coming out of the panel, around specifically, how do you measure success? It wasn't technical at all. It's, what are the things that you're using to measure whether stuff is working. 
I wonder if you can talk to the power of being in an ecosystem of peers here. Any surprises or great insights that you've got. I know we've only been here for a couple days. >> I would say that one of the biggest values, obviously the sessions and the breakouts are great, but I think one of the greatest values here is simply the networking aspect of it. Being able to speak to people who are facing similar challenges, or doing similar things. Even though they're in a completely different domain, the problems are constant. Or common, at least. How do you do machine learning to categorize player behaviors, in our case, and in other cases it's categorization of feedback that people get from websites, stuff like that. I really think the networking aspect is the most valuable thing at conferences like this. >> Alright, awesome. Well, Colin Riddell, Epic Games, thanks for taking a few minutes to stop by the CUBE. >> You're welcome, more than welcome, thank you very much. >> Absolutely, alright, George Gilbert, I'm Jeff Frick, you're watching the CUBE from Data Platforms 2017 at the historic Wigwam Resort. Thanks for watching. (upbeat techno music)
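Colin's engagement measure, whether players come back, is usually tracked as day-N retention. A minimal sketch of that calculation, using hypothetical session data rather than Epic's actual event pipeline:

```python
from datetime import date, timedelta

# Hypothetical session log: player id -> set of dates on which the player was seen.
sessions = {
    "p1": {date(2017, 5, 1), date(2017, 5, 2)},
    "p2": {date(2017, 5, 1)},
    "p3": {date(2017, 5, 1), date(2017, 5, 2), date(2017, 5, 3)},
}

def day_n_retention(sessions, cohort_day, n):
    """Fraction of players seen on cohort_day who come back exactly n days later."""
    cohort = [p for p, days in sessions.items() if cohort_day in days]
    if not cohort:
        return 0.0
    target = cohort_day + timedelta(days=n)
    returned = sum(1 for p in cohort if target in sessions[p])
    return returned / len(cohort)

# Of the three players seen on May 1, two came back on May 2.
print(day_n_retention(sessions, date(2017, 5, 1), 1))
```

Time on game, the other measure Jeff mentions, would come from summing session durations per player instead of counting return days.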
Show Wrap - Data Platforms 2017 - #DataPlatforms2017
>> Announcer: Live from the Wigwam in Phoenix, Arizona. It's theCUBE. Covering Data Platforms 2017. Brought to you by Qubole. >> Hey welcome back everybody. Jeff Frick here with theCUBE along with George Gilbert from Wikibon. We've had a tremendous day here at DataPlatforms 2017 at the historic Wigwam Resort, just outside of Phoenix, Arizona. George, you've been to a lot of big data shows. What's your impression? >> I thought we're sort of at the edge of what could be a real bridge to something new, which is, we've built big data systems as traditional software for deployment on traditional infrastructure. Even if you were going to put it in a virtual machine, it's still not a cloud. You're still dealing with server abstractions. But what's happening with Qubole is, they're saying, once you go to the cloud, whether it's Amazon, Azure, Google or Oracle, you're going to be dealing with services. Services are very different. It greatly simplifies the administrative experience, the developer experience, and more than that, they're focused on turning Qubole the product into Qubole the service, so that they can automate the management of it. And we know that big data has been choking itself on complexity. Both admin and developer complexity. And they're doing something unique, both on sort of the big data platform management, but also data science operations. And their point, their contention, which we still have to do a little more homework on, is that the vendors who started with software on-prem can't really make that change very easily without breaking what they've done on-prem. 'Cause they have traditional perpetual-license physical software, as opposed to services, which is what is in the cloud. >> The question is, are people going to wait for them to figure it out.
I talked to somebody in the hallway earlier this morning and we were talking about their move to put all their data into, it was S3, on their data lake. And he said, it's part of a much bigger transformational process that we're doing inside the company. And so, this move, from "is public cloud viable?" to "tell me, give me a reason why it shouldn't go to the cloud," has really kicked in big time. And you hear over and over that speed and agility, not just in deploying applications, but in operating as a company, is the key to success. And we hear over and over how short the tenure is on the Fortune 500 now, compared to what it used to be. So if you're not speedy and agile, which you pretty much have to use cloud for, and software-driven automated decision-making >> Yeah. >> that's powered by machine learning to eat >> Those two things. >> a huge percentage of your transactions and decision-making, you're going to get smoked by the person that is. >> Let's sort of peel that back. I was talking to Monte Zweben who is the co-founder of Splice Machine, one of the most advanced databases that have sort of come out of nowhere over the last couple of years. And it's now, I think, in closed beta on Amazon. He showed me, like, a couple of screens for spinning it up and configuring it on Amazon. And he said, if I were doing that on-prem, he goes, I'd need a Hadoop cluster with HBase. It would take me like four plus months. And that's an example of software versus services. >> Jeff: Right. >> And when you said, when you pointed out that automated decision-making, powered by machine learning, that's the other part. Which is, these big data systems ultimately are in the service of creating machine learning models that will inform ever better decisions with ever greater speed, and the key then is to plug those models into existing systems of record. >> Jeff: Right. Right.
>> Because we're not going to, >> We're not going to rip those out and rebuild them from scratch. >> Right. But as you just heard, you can pull the data out that you need, run it through a new-age application. >> George: Yeah. >> And then feed it back into the old system. >> George: Yes. >> The other thing that came up, it was Oskar, I have to look him up, Oskar Austegard from Gannett was on one of the panels. We always talk about the flexibility to add capacity very easily in a cloud-based solution. But he talked about, with the separation of storage and compute, that they actually have times where they turn off all their compute. It's off. Off. >> And that was, if you had to boil down the fundamental compatibility break between on-prem and the cloud, the Qubole folks, both the CEO and CMO, said, look, you cannot reconcile what's essentially Server SAN, where the storage is attached to the compute node, the server, with cloud, where you have storage separate from compute, allowing you to spin it down completely. He said those are just fundamentally incompatible. >> Yeah, yeah. And also, Andretti, one of the founders, in his talk, he talked about the big three trends, which we just kind of talked about, and he summarized them. Right, in serverless, this continual push towards smaller and smaller units >> George: Yeah. >> of storage and compute, and the increasing speed of networks is one, from virtual servers to just no servers, to just compute. The second one is automation, you've got to move to automation. >> George: Right. >> If you're not, you're going to get passed by your competitor that is. Or the competitor that you don't even know exists that's going to come out from over your shoulder. And the third one was the intelligence, right. There is a lot of intelligence that can be applied. And I think the other cusp that we're on is this continuing crazy increase in compute horsepower. Which just keeps going.
That the speed and the intelligence of these machines is growing at an exponential curve, not a linear curve. It's going to be bananas in the not too distant future. >> We're soaking up more and more of that intelligence with machine learning. The training part of machine learning, where the datasets to train a model are immense. Not only are the datasets large, but the amount of time to sort of chug through them to come up with just the right mix of variables and values for those variables. Or maybe even multiple models. So that, we're going to see in the cloud. And that's going to chew up more and more cycles. Even as we have >> Jeff: Right. Right. >> specialized processors. >> Jeff: Right. But in the data ops world, in theory yes, but I don't have to wait to get it right. Right? I can get it 70% right. >> George: Yeah. >> Which is better than not right. >> George: Yeah. >> And I can continue to iterate over time. That, I think, was the genius of dev-ops. To stop writing PRDs and MRDs. >> George: Yeah. >> And deliver something. And then listen and adjust. >> George: Yeah. >> And within the data ops world, it's the same thing. Don't try to figure it all out. Take the data you know, have some hypothesis. Build some models and iterate. That's really tough to compete with. >> George: Yeah. >> Fast, fast, fast iteration. >> We're doing actually a fair amount of research on that. On the Wikibon side. Which is, if you build an enterprise application that is reinforced or informed by models in many different parts, in other words, you're modeling more and more digital entities within the business. >> Jeff: Right. >> Each of those has feedback loops. >> Jeff: Right. Right. >> And when you get the whole thing orchestrated and moving, or learning, in concert, then you have essentially what Michael Porter many years ago called competitive advantage.
Which is when each business process reinforces all the other business processes in service of delivering a value proposition. And those models represent business processes, and when they're learning and orchestrated all together, you have, what Trump called, a fine-tuned machine. >> I won't go there. >> Leaving out that it was "bigly" and it was a finely-tuned machine. >> Yeah, yeah. But at the end of the day, if you're using resources and effort to improve a different resource and effort, you're getting a multiplier effect. >> Yes. >> And that's really the key part. Final thought as we go out of here. Are you excited about this? Do you see, they showed the picture of the NASA headquarters with the big giant Snowball truck loading up? Do you see more and more of this big enterprise data going into S3, going into Google Cloud, going into Microsoft Azure? >> You're asking-- >> Is this the solution for the data lake swamp issue that we've been talking about? >> You're asking the 64 dollar question. Which is, companies, we sensed a year ago at the Hortonworks DataWorks Summit, it was in June, down in San Jose last year. That was where we first got the sense that people were sort of throwing in the towel on trying to build large-scale big data platforms on-prem. And what changes now is, are they now evaluating Hortonworks versus Cloudera versus MapR in the cloud, or are they widening their consideration as Qubole suggests. Because now they want to look not only at Cloud Native Hadoop, but they actually might want to look at Cloud Native services that aren't necessarily related to Hadoop. >> Right. Right. And we know as-a-service wins. It continues. Platform as a service, software as a service. Time and time again, as-a-service either eats a lot of share from the incumbent or knocks the incumbent out. So, Hadoop as a service, regardless of your distro, via one of these types of companies on Amazon, it seems like it's got to win, right. It's going to win.
>> Yeah, but the difference is, so far, the Clouderas and the MapRs and the Hortonworks of the world are more software than service when they're in the cloud. They don't hide all the knobs. You still need a highly trained admin to get them up-- >> But not if you buy it as a service, in theory, right. It's going to be packaged up by somebody else and they'll have your knobs all set. >> They're not designed yet that way. >> HD Insight? >> Then they better be careful, 'cause it might be a new as-a-service distro of the Hadoop system. >> My point, which is what this is. >> Okay, very good, we'll leave it at that. So George, thanks for spending the day with me. Good show as always. >> And I'll be in a better mood next time when you don't steal my candy bars. >> All right. He's George Gilbert. I'm Jeff Frick. You're watching theCUBE. We're at the historic 99 years young Wigwam Resort, just outside of Phoenix, Arizona. DataPlatforms 2017. Thanks for watching. It's been a busy season. It'll continue to be a busy season. So keep it tuned. SiliconAngle.TV or YouTube.com/SiliconAngle. Thanks for watching.
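The storage/compute separation that comes up throughout this wrap, with data persisting in S3 while compute can be shut off entirely, is ultimately a cost-model change. A toy sketch of that arithmetic, with illustrative numbers rather than any vendor's actual pricing:

```python
def fixed_cluster_cost(nodes, hourly_rate, hours_in_month=730):
    """An always-on cluster bills every hour, busy or idle."""
    return nodes * hourly_rate * hours_in_month

def elastic_cost(nodes, hourly_rate, active_hours, storage_gb, gb_month_rate):
    """Separated storage/compute: pay for compute only while it runs,
    plus cheap object storage that persists when the compute is off."""
    return nodes * hourly_rate * active_hours + storage_gb * gb_month_rate

# Illustrative: 10 nodes at $0.50/hr, busy about 4 hours a day (120 hrs/month),
# with 5 TB kept in object storage at $0.023 per GB-month.
always_on = fixed_cluster_cost(10, 0.50)
elastic = elastic_cost(10, 0.50, 120, 5000, 0.023)
print(always_on, elastic)
```

The gap widens as utilization drops; an always-on cluster bills the same whether it is busy one hour a day or twenty-four.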
Mick Bass, 47Lining - Data Platforms 2017 - #DataPlatforms2017
>> Live, from The Wigwam, in Phoenix, Arizona, it's theCube, covering Data Platforms 2017. Brought to you by Qubole. >> Hey, welcome back everybody. Jeff Frick here with theCube. Welcome back to Data Platforms 2017, at the historic Wigwam Resort, just outside of Phoenix, Arizona. I'm here all day with George Gilbert from Wikibon, and we're excited to be joined by our next guest. He's Mick Bass, the CEO of 47Lining. Mick, welcome. >> Welcome, thanks for having me, yes. >> Absolutely. So, what is 47Lining, for people that aren't familiar? >> Well, you know every cloud has a silver lining, and if you look at the periodic table, 47 is the atomic number for silver. So, we are a consulting services company that helps customers build out data platforms and ongoing data processes and data machines in Amazon Web Services. And, one of the primary use cases that we help customers with is to establish data lakes in Amazon Web Services to help them answer some of their most valuable business questions. >> So, there's always this question about own vs buy, right, with Cloud and Amazon, specifically. >> Mm-hmm, mm-hmm. >> And, with a data lake, the perception, right... That's huge, this giant cost. Clearly there are benefits that come with putting your data lake in AWS vs having it on-prem. What are some of the things you take customers through, in kind of the scenario planning and the value planning? >> Well, just a couple of the really important aspects. One is this notion of elastic and on-demand pricing. In a Cloud-based data lake, you can start out with actually a very small infrastructure footprint that's focused on maybe just one or two business use cases. You can pay only for the data that you need to get your data lake bootstrapped, and demonstrate the business benefit from one of those use cases. But, then it's very easy to scale that up, in a pay-as-you-go kind of a way.
The second, you know, really important benefit that customers experience in a platform that's built on AWS is the breadth of the tools and capabilities that they can bring to bear for their predictive analytics and descriptive analytics, and streaming kinds of data problems. So, you need Spark, you can have it. You need Hive, you can have it. You need a high-performance, close-to-the-metal data warehouse on a clustered database, you can have it. So, analysts are really empowered through this approach, because they can choose the right tool for the right job, and reduce the time to business benefit, based on what their business owners are asking them for. >> You touched on something really interesting, which was... So, when a customer is on-prem, and let's say is evaluating Cloudera, MapR, Hortonworks, there's a finite set of services or software components within that distro. Once they're on the Cloud, there's a thousand times more... As you were saying, you could have one of 27 different data warehouse products, you could have many different SQL products, some of which are really delivered as services. >> Mm-hmm >> How does the consideration of the customer's choice change when they go to the Cloud? >> Well, I think that what they find is that it's much more tenable to take an agile, iterative process, where they're trying to align the outgoing cost of the data lake build to keep that in alignment with the business benefits that come from it. And, so if you recognize the need for a particular kind of analytics approach, but you're not going to need that until down the road, two or three quarters from now, it's easy to get started with simple use cases, and then add those incremental services as the need manifests. One of the things that I mention in my talk, that I always encourage our customers to keep in mind, is that a data lake is more than just a technology construct. It's not just an analysis set of machinery, it's really a business construct.
Your data lake has a profit and loss statement, and the way that you interact with your business owners to identify the specific value sources that you're going to make pop for your company can be made to align with the cost footprint, as you build your data lake out. >> So I'm curious, when you're taking customers through the journey to start kind of thinking of the data lake and AWS, are there any specific kind of application spaces, or vertical spaces, where you have pretty high confidence that you can secure an early, and relatively easy, win to help them kind of move down the road? >> Absolutely. So, you know, for many of our customers, a very common business need is to enhance the set of information that they have available for a 360 degree view of the customer. In many cases, this information and data is available in different parts of the enterprise, but it might be siloed. And, a data lake approach in AWS really helps you to pull it together in an agile fashion, based on particular, quarter-by-quarter objectives or capabilities that you're trying to respond to. Another very common example is predictive analytics for things like fraud detection, or mechanical failure. So, in eCommerce kinds of situations, being able to pull together semi-structured information that might be coming from web servers or logs, or like what cookies are associated with this particular user. It's very easy to pull together a fraud-oriented predictive analytic. And, then the third area that is very common is internet of things use cases. Many enterprises are augmenting their existing data warehouse with sensor-oriented time-series data, and there's really no place in the enterprise for that data currently to land.
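The Customer 360 and fraud use cases Mick describes both reduce to joining siloed, per-channel records on a shared user key so a model sees a wider signal. A minimal sketch of that join, with hypothetical field names and data rather than a real 47Lining schema:

```python
# Hypothetical per-channel silos, each keyed by a shared user id.
web_events = {"u1": {"pages_viewed": 42}, "u2": {"pages_viewed": 3}}
mobile_events = {"u1": {"app_sessions": 7}}
call_center = {"u2": {"support_calls": 5}}

def customer_360(user_ids, *silos):
    """Merge each user's records across silos into one wide feature row."""
    rows = {}
    for uid in user_ids:
        row = {}
        for silo in silos:
            row.update(silo.get(uid, {}))
        rows[uid] = row
    return rows

merged = customer_360(["u1", "u2"], web_events, mobile_events, call_center)
print(merged["u1"])  # features from web and mobile combined
```

In practice each silo would be a table in the data lake and the merge a join in Spark or Hive, but the shape of the operation is the same.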
>> So, when you say they are augmenting the data warehouse, are they putting it in the data warehouse, or are they putting it in a sort of adjunct time-series database, from which they can sort of curate aggregates, and things like that, to put in the data warehouse? >> It's very much the latter, right. And, the time-series data itself may come from multiple different vendors, and the input formats in which that information lands can be pretty diverse. And so, it's not really a good fit for a typical kind of data warehouse ingest or intake process. >> So, if you were to look at, sort of, maturity models for the different use cases, where would we be, you know, like IoT, Customer 360, fraud, things like that? >> I think, you know, so many customers have pretty rich fraud analytics capabilities, but some of the pain points that we hear are that it's difficult for them to access the most recent technologies. In some cases the order management systems that those analytics are running on are quite old. We just finished some work with a customer where literally the order management system's running on a mainframe, even today. Those systems have the ability to accept a steer from, like, a sidecar decision-support predictive analytics system. And, one of the things that's really cool about the Cloud is you could build a custom API just for that fraud analytics use case, so that you can inject exactly the right information, which makes it super cheap and easy for the ops team that's running that mainframe to consume the fraud improvement decision signal that you're offering. >> Interesting. And so, this may be diving into the weeds a little bit, but if you've got an order management system that's decades old and you're going to plug in something that has to meet some stringent performance requirements, how do you, sort of, test...
It's not just the end-to-end performance once, but you know, for the 99th percentile, that someone doesn't get locked out for five minutes while he's trying to finish his shopping cart. >> Exactly. And I mean, I think this is what is important about the concept of building data machines in the Cloud. This is not like a once-and-done kind of process. You're not building an analytic that produces a printout that an executive is going to look at (laughing) and make a decision. (laughing) You're really creating a process that runs at consumer scale, and you're going to apply all of the same kinds of metrics of percentile performance that you would apply to any kind of large-scale consumer delivery system. >> Do you custom-build a fraud prevention application for each customer? Or, is there a template and then some additional capabilities that you'll learn by running through their training data? >> Well, I think largely, there are business-by-business distinctions in the approach that these customers take to fraud detection. There's also business-by-business distinction in their current state. But, what we find is that there are commonalities in the kinds of patterns and approaches that you tend to apply. So, you know... We may have extra data about you based on your behavior on the web, and your behavior on a mobile app. The particulars of that data might be different for Enterprise A vs Enterprise B, but this pattern of joining up mobile data plus web data plus, maybe, phone-in call center data... Putting those all together, to increase the signal that can be made available to a fraud prevention algorithm, that's very common across all enterprises. And so, one of the roles that we play is to set up the platform so that it's really easy to mobilize each of these data sources. So in many cases, it's the customer's data scientist that's saying, I think I know how to do a better job for my business.
I just need to be unleashed to be able to access this data, and if I'm blocked, I need a platform where the answer that I get back isn't, oh, you could have that, like, second quarter of 2019. Instead, you want to say, oh, we can onboard that data in an agile fashion, paying an incremental bit of money, because you've identified a specific benefit that could be made available by having that data. >> Alright Mick, well thanks for stopping by. I'm going to send Andy Jassy a note that we found the silver lining to the Cloud. (laughing) So, I'm excited for that, if nothing else, so that made the trip well worthwhile, so thanks for taking a few minutes. >> You bet, thanks so much, guys. >> Alright, Mick Bass, George Gilbert, Jeff Frick, you're watching theCube, from Data Platforms 2017. We'll be right back after this short break. Thanks for watching. (computer techno beat)
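Mick's point that a data machine is judged on percentile performance, not a single end-to-end run, usually cashes out as tracking tail latency such as the 99th percentile George mentions. A small sketch of that calculation, on synthetic response times rather than a real fraud API:

```python
def percentile(samples, p):
    """Nearest-rank percentile: the smallest value covering p percent of samples."""
    ordered = sorted(samples)
    rank = max(1, -(-len(ordered) * p // 100))  # ceiling of n * p / 100
    return ordered[rank - 1]

# Synthetic latencies in milliseconds: 98 fast calls and 2 slow outliers.
latencies = [10] * 98 + [500] * 2
print(percentile(latencies, 50), percentile(latencies, 99))
```

The median looks fine here; only the p99 exposes the outliers that would lock a shopper out of their cart.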
Tripp Smith, Clarity - Data Platforms 2017 - #DataPlatforms2017
>> Narrator: Live from the Wigwam in Phoenix, Arizona, it's theCUBE, covering Data Platforms 2017, brought to you by Qubole. >> Hey welcome back everybody, Jeff Frick here with theCUBE. I'm joined by George Gilbert from Wikibon and we're at DataPlatforms 2017. Small conference down at the historic Wigwam Resort, just outside of Phoenix, talking about kind of a new approach to big data really. A Cloud-native approach to big data, and really kind of flipping the old model on its head. We're really excited to be joined by Tripp Smith, he's the CTO of Clarity Insights, up on a panel earlier today. So first off, welcome Tripp. >> Thank you. >> For the folks that aren't familiar with Clarity Insights, give us a little background. >> So Clarity is a pure-play data analytics professional services company. That's all we do. We say we advise, build and enable for our clients. So what that means is data strategy, data engineering and data science, and making sure that we can action the insights that our customers get out of their data analytics platforms. >> Jeff: So not a real busy area these days. >> It's growing pretty well. >> Good for you. So a lot of interesting stuff came up on the panel. But one of the things that you reacted to, I reacted to as well, from the keynote, was this concept of, you know, before, you had kind of the data scientist with the data platform behind them being service providers to the basic business units. Really turning that model on its head, giving access to the data to all the business units and people that want to consume that, making the data team really enablers of kind of a platform play. Seemed to really resonate with you as well. >> Yeah absolutely, so if you think about it, a lot of the focus on legacy platforms was driven by scarcity around the resources to deal with data. So you created this almost pyramid structure with IT and architecture at the top.
They were the gatekeepers and kind of the single door where insights got out to the business. >> Jeff: Right. >> So in the big data world and with Cloud, with elastic scale, we've been able to turn that around and actually create much more collaborative friction in parallel with the business. Putting the data engineers, data scientists and business-focused analysts together, and making them more partners than just customers of IT. >> Jeff: Right, very interesting to think of it as a partner. It's a very different mindset. The other piece that came up over and over in the Q&A at the end was how do people get started? How are they successful? So you deal with a lot of customers, right? That's your business. What are some stories, or one that you can share, of best practices, when people come and they say, we obviously hired you, we wrote a check, but how do we get started, where do we go first? How do you help people out? >> We focus on self-funding analytic programs. Getting those early wins tends to pay for more investment in analytics. So if you look at the ability to scale out as a starting point, then aligning that business value and the roadmap in a way that's going to both demonstrate the value along the way and contribute to that capability is important. I think we also recommend to our clients that they solve the hard problems around security and data governance and compliance first. Because that allows them to deal with more valuable data and put that to work for their business. >> So is there any kind of low-hanging fruit that you see time and time and time again? That just is like, ah, we can do this. We know it's got huge ROI. It's either neglected 'cause they don't think it's valuable, or it's neglected because it's in the backroom. Or are there any easy steps that you find, some patterns? >> Yeah, absolutely. So we go to market by industry vertical. So within each vertical, we've defined the value maps and ROI levers within that business.
Then we align a lot of our analytic solutions to those ROI levers. In doing that, we focus on being able to build a small, multifunctional team that can work directly with the business, then deliver that in real time in an interactive way. >> Right, another thing you just talked about, security and governance: are we past the security concerns about public Cloud? Does that even come up as an issue anymore? >> You know, I think there was a great comment today that if you had money, you wouldn't put it in your safe at home. You'd put it in a bank. >> Jeff: I missed that one, that's a good one. >> The Cloud providers are really focused on security in a way that they can invest in it, that an individual enterprise really can't. So in a lot of cases, moving to the Cloud means letting the experts take on the area that they're really good at and letting you focus on your business. >> Jeff: Right, interesting, they had, Amazon is here, Google's here, Oracle's here and Azure is here. At AWS re:Invent, one of my favorite things is Tuesday night with James Hamilton, which I don't know if you've ever been, it's a can't-miss presentation. But he talks about the infrastructure investments that Amazon, AWS can make, which again, compared to any individual enterprise, are tremendous in not only security, but networking and all these other things that they do. So it really seems that the scale that these huge Cloud providers have now reached gives them such an advantage over any individual enterprise, whether it's for security, or networking or anything else. So it's a very different kind of model. >> Yeah, absolutely, or even the application platform, like Google now having Spanner, which has the scale advantage of Cassandra or HBase, and the transactional capabilities of a traditional RDBMS. I guess my question is, once a customer is considering Qubole as a Cloud-first data platform, how do you help the customer evaluate it?
Relative to the distros that started out on-prem, and then the other Cloud-native ones that are from Azure and Google and Amazon. >> You know, I think that's a great question. It kind of focuses back on letting the experts do what they're really good at. My business may not be differentiated by my ability to operate and support Hadoop, but it's really putting Hadoop to work in order to solve the business problems that makes me money. So when I look at something like Qubole, it's actually going to that expert and saying, "Hey, own this for me and deliver this in a reliable way," rather than me having to solve those problems over and over again myself. >> Do you think that those problems are not solved to the same degree by the Cloud-native services? >> So I think there's definitely an ability to leverage Cloud data services, but there's also this aspect of administration and management, and understanding how those integrate within an ecosystem, that I don't think necessarily every company is going to be able to approach in the same way that a company like Qubole can. So again, being able to shift that off and having that kind of support gives you the ability to focus back on what really makes a difference for you. >> So Tripp, we're running out of time, we've got a really tight schedule here. I'm just curious, it's a busy conference season, big data's all over the place. How did you end up here? What is it about this conference and this technology that got you to come down to the, I think it's only 106 today, weather to take it in? What do you see that's a special opportunity here? >> Yeah, you know, this is Data Platforms 2017. It's been a really great conference, just in the focus on being able to look at Cloud and look at this differentiation, outside of the realm of inventing new shiny objects, and really putting it to work for new business cases and that sort of thing. >> Jeff: Well Tripp Smith, thanks for stopping by theCUBE.
>> Excellent, thank you guys for having me. >> All right, he's George Gilbert, I'm Jeff Frick. You're watching Data Platforms 2017 from the historic Wigwam Resort in Phoenix, Arizona. Thanks for watching. (techno music)
Karthik Ramasamy, Streamlio - Data Platforms 2017 - #DataPlatforms2017
>> Narrator: Hi from the Wigwam in Phoenix, Arizona, it is theCUBE, covering Data Platforms 2017. Brought to you by Qubole. >> Hey welcome back everybody. Jeff Frick with theCUBE. We are down at the historic Wigwam, 99 years young, just outside of Phoenix, Arizona, at Data Platforms 2017. It is really talking about a new approach to big data in the cloud, put on by Qubole, about 200 people, a very interesting conversation this morning, and we're really interested to have Karthik Ramasamy. He is the co-founder of Streamlio, which is still in stealth mode according to his LinkedIn profile, so we won't talk about that, but he's a longtime Twitter guy and really shared some great lessons this morning about things that you guys learned while growing Twitter. So welcome. >> Thank you, thanks for having me. >> Absolutely. One of the key parts of your whole talk was this concept of real time. I always joke with people, real time is in time to do something about it. You went through a bunch of examples of how real time is really a variable depending on what the right application is, but at Twitter real time was super, super important. >> Yes, it is indeed important because of the nature of the streaming data; the nature of the Twitter data is streaming data because the tweets are coming at a high velocity. And Twitter positioned itself as more of a real time delivery company, because that way, whatever the information that we get within Twitter, we need to have a strong time budget before we can deliver it to people, so that when people consume the information, the information is live or real time. >> But real time is becoming, obviously for Twitter but for a lot of big enterprises, more and more important, and the great analogy I referred to before is, you used to use sample data, the sampled historic data, to make decisions. Now you want to keep all the data in real time to make decisions, so it's a very different way you drive your decision-making process.
>> Very different way of thinking. Especially considering the fact that, as you said, enterprises are getting into understanding what real time means for them, but if you look at some of the traditional enterprises like financial, they understand the value of real time. Similarly the upcoming new use cases like IoT, they understand the value of real time, like autonomous vehicles where they have to make quick decisions. Healthcare, you have to make quick decisions because preventive and predictive maintenance is very important in those kinds of segments. So because of those segments it's getting really popular, and traditional enterprises like retail and all, they're also valuing real time because it allows them to blend into the user behavior, so that they can recommend products and other things in real time, so that people can react to that, so it's becoming more and more important. That's what I would say. >> So Hadoop started out as mostly batch infrastructure, and Twitter was a pioneer in the design pattern to accommodate both batch and real time. How has that big data infrastructure evolved, so that one, you don't have to split batch and real time, and what should we expect going forward to make that platform stronger in terms of your real time analytics, and potentially so that it can inform decisions in systems of record? >> I think, as of today there are two different infrastructures. One in general is the Hadoop infrastructure. The other one is more of a real time infrastructure at this point. And Hadoop is kind of considered as this monolithic, not monolithic, it's kind of a mega store, where all the data, similar to all the rivers reaching the sea, kind of becomes a storage sea where all the data comes and is stored there.
But before the data comes and is stored there, a lot of analytics and a lot of visibility about the data, from the point of its creation before it ends up there, is getting done on those, whatever you call them, data rivers, so you could get a lot of analytics done during the time before it ends up, so that it's more live than the other analytics. Hadoop had its own kind of limitations in terms of how much data it can handle, how real time the data can be. For example, you can kind of dump the data in real time into Hadoop, but until you close the file you cannot see the data at all. There is a time budget that gets into play there. And you could do smaller files, like writing small, small files, but the namenode will blow up, because if within a day you write a million files, the namenode is not going to sustain that. So those are the trade-offs. That's one of the reasons we had to end up doing new real time infrastructure like the distributed log, which allows you to, the moment the data comes in, make the data immediately visible within a three to five millisecond timeframe. >> The distributed log you're talking about would be Kafka. The output of that would be to train the model or just score a model, and then would that model essentially be carved off from this big data platform and be integrated within a system of record where it would inform decisions? >> There are multiple things you could do. First of all, with the distributed log, essentially you can think about the data as being in a data staging environment, where the data kind of lands, and once it lands there, when there's a lot of sharing of that same data going on in real time, when several jobs are using some popular data source, it provides a high fan-out, in the sense that 100 jobs can consume the same data and they can be at different parts of the data itself. So that provides a nice sharing environment.
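The two properties just described, data becoming readable the moment it arrives (unlike an HDFS file, which hides everything until close) and many jobs consuming the same shared data at their own positions, can be sketched with a toy append-only log. This is an illustrative simplification for the transcript, not Kafka's or Apache DistributedLog's actual API:

```python
class ToyLog:
    """Toy append-only log: a record is visible to readers immediately on append."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def read(self, offset):
        # Returns None once a consumer has caught up to the head of the log.
        return self._records[offset] if offset < len(self._records) else None


log = ToyLog()
off = log.append({"tweet": "hello"})
# No "file close" step needed -- the record is readable right away.
assert log.read(off) == {"tweet": "hello"}

log.append({"tweet": "world"})
# Fan-out: each consumer tracks its own offset into the same shared records,
# so 100 jobs could consume the same data at different positions.
offsets = {"analytics": 0, "ad_serving": 0}
for name in offsets:
    while log.read(offsets[name]) is not None:
        offsets[name] += 1
assert offsets == {"analytics": 2, "ad_serving": 2}
```

In a real deployment the log would be partitioned and replicated, but the visibility and fan-out semantics sketched here are the contract downstream jobs rely on.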
Now once the data is around there, the data is being used for different kinds of analytics, and one of them could be model enhancement, because typically in the batch segment you build the model, because you're looking at a lot of data and other things. Then once the model is built, that model is pre-loaded into the real time compute environment like Heron, then you look up this model and serve data based on whatever that model tells you. For example, when you do ad serving, you look up that model for what is a relevant ad for you to click. Then the next aspect is model enhancement. Because user behavior is going to change over a period of time, can you capture and incrementally update the model, so that those things are also partly done on the real time side rather than recomputing the batch again and again and again? >> Okay, so it's sort of like, what's the delta? >> Karthik: Yes. >> Let's train on the delta and let's score on the delta. >> Yes, and once the delta gets updated, when the new user behavior comes in, they can look at that new model that's being continuously enhanced, and once that enhancement is captured, you know that user behavior is changing, and ads are served accordingly. >> Okay, so now that our customers are getting serious about moving to the cloud with their big data platforms and the applications on them, have you seen a change in the patterns of the apps they're looking to build, or a change in the makeup of the platform that they want to use? >> So that depends on, typically, one disclosure is I've worked with Amazon and all, the AWS, but within the companies that I've worked for, everything is on-prem. But having said that, cloud is nice because it gives you machines on the fly whenever you need to, and it gives a bunch of tools around it where you can bootstrap it and all the various stuff.
This works ideally for smaller and medium companies, but for the big companies, one of the things that we calculate is, cost-wise, how much is the cost that we have to pay versus doing it in-house, and there's still a huge gap, unless the cloud provider is going to provide a huge discount or whatever for the big companies to move in. So that is always a challenge that we get into, because think about it, I have 10 or 20,000 nodes of Hadoop; can I move all of them into Amazon AWS, and how much am I going to pay? Versus the cost of maintaining my own data centers and everything. I would say, I don't know the latest pricing and other things, but approximately it comes to three x in terms of cost. >> If you're using... >> Our own on-prem and the data center and all of the staffing and everything. There's a difference of, I would say, three x. >> For on-prem being higher. >> On-prem being lower. >> Lower? >> Yes. >> But that assumes then that you've got flat utilization. >> Flat utilization, but, I mean, cloud of course gives you the ability to expand out at scale and all the various things, it gives an illusion of unlimited resources. But in our case, if you're provisioning so many machines, at least 50 or 60% of the machines are used for production, and the rest of them are used for staging, development, and all the various other environments, which means that even though the machines are only, say, 50% utilized, you still end up saving so much, like operating at one-third of the cost that it might be in the cloud. >> Alright Karthik, that opens up a whole can of interesting conversations. Again, we just don't have time to jump into it. So I'll give you the last word. When can we expect you to come out of stealth, or is that stealthy too? >> It is kind of, that is stealthy too. >> Okay, fair enough, I don't want to put you on the spot, but thanks for stopping by and sharing your story. >> Karthik: Thanks, thanks for everything.
>> Alright, he is Karthik, he is George, I'm Jeff. You're watching theCUBE. We are at the Wigwam Resort just outside of Phoenix at Data Platforms 2017. We will be back after this short break. Thanks for watching.
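The delta-based model enhancement Karthik describes above, incrementally updating a served model as new user behavior streams in instead of recomputing the full batch model again and again, can be sketched with an exponentially weighted click-rate estimate. This is a hypothetical simplification for illustration, not Heron's actual API:

```python
def update_ctr(ctr, clicked, alpha=0.1):
    """Blend one new observation into a click-through-rate estimate.

    alpha controls how quickly the served model tracks changing behavior:
    higher alpha gives recent events more weight.
    """
    return (1 - alpha) * ctr + alpha * (1.0 if clicked else 0.0)


ctr = 0.05  # estimate pre-loaded from the last batch-built model
for clicked in [True, False, True, True]:  # live events from the stream
    ctr = update_ctr(ctr, clicked)

# The serving layer now reflects the recent behavior shift without a batch rebuild.
assert ctr > 0.05
```

In production the "model" would be feature-weighted rather than a single scalar, but the pattern is the same: load the batch model, then fold in the delta as events arrive.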
Saket Saurabh, Nexla - Data Platforms 2017 - #DataPlatforms2017
(upbeat music) [Announcer] Live from the Wigwam in Phoenix, Arizona, it's theCUBE, covering Data Platforms 2017. Brought to you by Qubole. >> Hey welcome back everybody, Jeff Frick here with theCUBE. We are coming down to the end of a great day here at the historic Wigwam at Data Platforms 2017, a lot of great big data practitioners talking about the new way to do things, really coining the term data ops, or maybe not coining it but really leveraging it, as a new way to think about data and using data in your business, to be a data-driven, software-defined, automated solution and company. So we're excited to have Saket Saurabh, he is the, and I'm sorry I butchered that, Saurabh. >> Saurabh, yeah. >> Saurabh, thank you, sorry. He is the co-founder and CEO of Nexla, and welcome. >> Thank you. >> So what is Nexla? Tell us about Nexla for those that aren't familiar with the company. >> Thank you so much. Yeah, so Nexla is a data operations platform. And the way we look at data is that data is increasingly moving between companies, and one of the things that is driving that is the growth in machine learning. So imagine you are an e-commerce company, or a healthcare provider. You need to get data from your different partners, you know, suppliers and point-of-sale systems, and brands and all that. And the companies, when they are getting this data from all these different places, it's so hard to manage. So we think of, you know, just like cloud computing made it easy to manage thousands of servers, we think of data ops as something that makes it easy to manage those thousands of data sources coming from so many partners. >> So you've jumped straight past the it's-a-cool-buzz-term way to think about things, into the actual platform. So how does that platform fit within the cloud and on-prem? Is it part of the infrastructure, does it sit next to the infrastructure, is it a conduit? How does that work?
(upbeat music) [Announcer] Live from the Wigwam in Pheonix, Arizona, it's the Cube. Covering Data Platforms 2017. Brought to you by Cue Ball. >> Hey welcome back everybody, Jeff Frick here with the Cube. We are coming down to the end of a great day here at the historic Wigwam at the Data Platforms 2017, lot of great big data practitioners talking about the new way to do things, really coining the term data ops, or maybe not coining it but really leveraging it, as a new way to think about data and using data in your business, to be data-driven, software-defined, automated solution and company. So we're excited to have Saket Saurabh, he is the, and I'm sorry I butchered that, Saurabh. >> Saurabh, yeah. >> Saurabh, thank you, sorry. He is the co-founder and CEO of Nexla, and welcome. >> Thank you. >> So what is Nexla, tell us about Nexla for those that aren't familiar with the company. Thank you so much. Yeah so Nexla is a data operations platform. And the way we look at data is that data is increasingly moving between companies and one of the things that is driving that is the growth in machine learning. So imagine you are an e-commerce company, or a healthcare provider. You need to get data from your different partners. You know, suppliers and point-of-sale systems, and brands and all that. And the companies, when they are getting this data, from all these different places, it's so hard to manage. So we think of, you know just like cloud computing, made it easy to manage thousands of servers, we think of data ops as something that makes it easy to manage those thousands of data sources coming from so many partners. So you've jumped straight past the it's a cool buzz term in way to think about things, into the actual platform. So how does that platform fit within the cloud, and on Prim, is it part of the infrastructure, sits next to the infrastructure, is it a conduit? How does that work? 
>> Yeah, we think of it as, if you think of maybe machine learning or advanced analytics as the application, then data operations is sort of an underlying infrastructure for it. It's not really the hardware, the storage, but it's a layer on top. The job of data operations is to get the data from where it is to where you need it to be, and in the right form and shape. So now you can act on it. >> And do you find yourself replacing legacy stuff, or is this a brand new demand because of all the variant and so many types of datasets that are coming in that people want to leverage. >> Yeah, I mean to be honest, some of this has always been there in the sense that the day you connected a database to a network data started to move around. But if you think of the scale that has happened in the last six or seven years, none of those existing systems were ever designed for that. So when we talk about data growing at at a Moore's Law rate, when we talk about everybody getting into machine learning, when we talk about thousands of data sets across so many different partners that you work with, and when we think that reports that you get from your partners is no more sufficient, you need that underlying data, you can not basically feed that report into an algo. So when you look at all of these things we feel like it is a new thing in some ways. >> Right. Well, I want to unpack that a little bit because you made an interesting comment, before you turned on the cameras you just repeated, that you can't run an algorithm on a report. And in a world where we've got all the shared data sets, and it's funny too right, because you used to run a sample, now you want, you said, the raw. Not only all, but the raw data, so that you can do with it what you wish. Very different paradigm. >> Yeah. >> It sounds like there's a lot more, and you're not just parsing what's in the report, but you have to give it structure that can be combined with other data sources. 
And that sounds like a rather challenging task. Because the structure, all the metadata, the context that gives the data meaning that is relevant to other data sets, where does that come from? >> Yeah, so what happens, and this has been how technology companies have started to evolve. You want to focus on your core business. And therefore you will use a provider that processes your payments, you will use a provider that gives you search. You will use a provider that provides you the data for example for your e-commerce system. So there are different types of vendors you're working with. Which means that there's different types of data being involved. So when I look at for example a brand today, you could be say, a Nike, and your products are being sold on so many websites. If you want to really analyze your business well, you want data from every single one of those places, where your data team can now access it. So yes, it is that raw data, it is that metadata, and it is the data coming from all the systems that you can look at together and say when I ran this ad this is how people reacted to it, this was the marketing lift from that, this is the purchase that happened across these different channels, this is how my top line or bottom line was affected. And to analyze everything together you need all the data in a place. >> I'm curious on what do you find on the change in the business relationship. Because I'm sure there were agreements structured in another time which weren't quite as detailed, where the expectations in terms of what was exchanged wasn't quite this deep. Are you seeing people have to change their relationships to get this data? Is it out there that they're getting it, or is this really changing the way that people partner in data exchange, on like the example that you just used between say Nike and Foot Locker, to pick a name. 
Yeah, so I think companies that have worked together have always had reports come in, so you would get a daily report of how much you sold. Now just a high-level report of how much you sold is not sufficient anymore. You want to understand where it was bought, in which city, under what weather conditions, by what kind of user, and all that stuff. So I think what companies are looking at, again, they have built their data systems, they have the data teams; unless they give the data, their teams cannot be effective, and you cannot really take a daily sales report and feed that into your algorithm, right? So you need very fine-grained data for that. So I think companies are doing this where, hey, you were giving me a report before, I also need some underlying data. A report is for a business executive to look at and see how business is doing, and the underlying data is really for that algorithm to understand and maybe identify things that a report might not. >> Wouldn't there have been already, at least in the example of sell-through, structured data that's been exchanged between partners already, like vendor-managed inventory, or, you know, where a downstream retailer might make their sell-through data accessible to suppliers who actually take ownership of the inventory and are responsible for stocking it at optimal levels? >> Yeah, I think Walmart was the innovator in that, with the POS link system, back in the day, for retail. But the point is that this need for data to go from one company to their partners and back and forth is across every sector. So you need that in e-commerce, you need that in fintech; we see companies who have to manage your portfolio needing to connect with the different banks and brokerages you work with to get the data. We see that in healthcare, across different providers and pharmaceutical companies, you need that. We see that in automotive. If every car generates data, an insurance company needs to be able to understand that and look at it.
This, it's a huge problem you're addressing, because this is the friction between inter-company applications. And we went through this with the B2B marketplaces, 15-plus years ago. But the reason we did these marketplace hubs was so that we could standardize the information exchange. If it's just Walgreens talking to Pfizer, and then doing another one-off deal with, I don't know, Lilly, I don't know if they both still exist, it won't work for connecting all of pharmacy with all of pharma. How do you ensure standards between downstream and upstream? >> Yeah. So you're right, this has happened. When we do a wire transfer from one person to another, some data goes from a bank to another bank; it still takes hours to get that, and it's a very tiny amount of data. That has all exploded, we are talking about zettabytes of data now every year. So the challenge is significantly bigger. Now coming to standards, what we have found is that two companies sitting together and defining a standard almost never works. It never works because applications change, systems change; change is the only constant. So the way we've approached it at our company is, we monitor the data, we sit on top of the data and just learn the structure as we observe data flowing through. So we have tons of data flowing through and we're constantly learning the structure, and identifying how the structure will map to the destination. So again, applying machine learning to see how the structure is changing, how the data volume is changing. So you are getting data from somewhere, say, every hour, and then it doesn't show up for two hours. Traditionally systems will go down, and you may not even find out for five days that the data wasn't there. So we look at the data structure, the amount of data, the time when it comes, and everything, to instantly learn and be able to inform the downstream systems of what they should be expecting, and if there is a change that somebody needs to be alerted about.
So a lot of innovation is going into doing this at scale, without necessarily having to predefine something in a tight box that cannot be changed. Because it's extremely hard to control. >> All right, Saket, that's a great explanation. We're going to have to leave it there, we're out of time. And thank you for taking a few minutes out of your day to stop by. >> Thank you. >> All right. Jeff Frick with George Gilbert, we are at Data Platforms 2017, Phoenix, Arizona, thanks for watching. (electronic music)
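Saket's approach above, learning structure by observing the data rather than pre-defining a standard, and alerting when a feed's shape or timing changes, can be sketched roughly like this. The field names and thresholds are made up for illustration; this is not Nexla's actual implementation:

```python
def infer_schema(records):
    """Learn field -> observed types from the data itself, no predefined contract."""
    schema = {}
    for rec in records:
        for key, val in rec.items():
            schema.setdefault(key, set()).add(type(val).__name__)
    return {k: sorted(v) for k, v in schema.items()}


observed = [
    {"sku": "A1", "units": 3, "city": "Phoenix"},
    {"sku": "B2", "units": 7, "city": "Austin", "weather": "sunny"},  # partner added a field
]
schema = infer_schema(observed)
assert schema["units"] == ["int"]
assert "weather" in schema  # the structural change is noticed, not silently dropped


def feed_is_late(minutes_since_last_arrival, expected_interval_min=60, slack=1.5):
    # An hourly feed that hasn't shown up for well over an hour -> alert downstream.
    return minutes_since_last_arrival > expected_interval_min * slack

assert feed_is_late(120) and not feed_is_late(45)
```

The same observed statistics (field types, record volume, arrival times) drive both sides: mapping the structure to the destination and alerting downstream consumers when it drifts.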
Guido Appenzeller, Intel | HPE Discover 2021
(soft music) >> Welcome back to HPE Discover 2021, the virtual version. My name is Dave Vellante, you're watching theCUBE, and we're here with Guido Appenzeller, who is the CTO of the Data Platforms Group at Intel. Guido, welcome to theCUBE, come on in. >> Aww, thanks Dave, I appreciate it. It's great to be here today. >> So I'm interested in your role at the company, let's talk about that, you're brand new, tell us a little bit about your background. What attracted you to Intel and what's your role here? >> Yeah, so I'm, I grew up with the startup ecosystem of Silicon Valley, I came for my PhD and never left. And I built software companies, worked at software companies, worked at VMware for a little bit. And I think my initial reaction when the Intel recruiter called me was like, hey, you've got the wrong phone number, I'm a software guy, that's probably not who you're looking for. But we had a good conversation, and I think at Intel there's a realization that you need to look at what Intel builds more from an overall systems perspective. The software stack and the hardware components are all getting more and more intricately linked, and you need the software to basically bridge across the different hardware components that Intel is building. So again, I'm the CTO for the Data Platforms Group, so that's the group that builds the data center products here at Intel. And it's a really exciting job. And these are exciting times at Intel; with Pat, I've got a fantastic CEO at the helm. I've worked with him before at VMware. So a lot of things to do, but I think a very exciting future. >> Well, I mean, the data center's the wheelhouse of Intel. Of course your ascendancy was a function of the PCs and the great volume and how you changed that industry, but really the data center is where, I remember the days people said, Intel will never be at the data center, it's just a toy. And of course, you're the dominant player there now.
So your initial focus here is really defining the vision, and I'd be interested in your thoughts on the future: what the data center looks like in the future, where you see Intel playing a role, what you're seeing as the big trends there. Pat Gelsinger talks about the waves; he says, if you don't ride the waves you're going to end up being driftwood. So what are the waves you're driving? What's different about the data center of the future? >> Yeah, that's right. You want to surf the waves, that's the way to do it. So look, I like to look at this in terms of major macro trends, and I think the biggest thing that's happening in the market right now is the cloud revolution. And I think we're well, halfway through or something like that. And this transition from the classic client-server type model, with enterprises running their own data centers, to more of a cloud model, where something is run by hyperscale operators or maybe run by an enterprise themselves, (indistinct) there's a variety of different models, but the provisioning models have changed. It's much more of a turnkey type service. And when we started out on this journey, I think we built data centers the same way that we built them before, although the way we deliver IT has really changed; it's going through more of a service model. And we're really now starting to see the hardware diverge, the actual silicon that we need to build to address these use cases, diverge. And so I think one of the things that is probably most interesting for me is really to think through: how does Intel in the future build silicon that's built for clouds, like on-prem clouds, edge clouds, hyperscale clouds, basically built for these new use cases that have emerged. 
>> So just a quick aside: to me, the definition of cloud is changing, it's evolving. It used to be a set of remote services in a hyperscale data center; now that experience is coming on-prem, it's connecting across clouds, it's moving out to the edge, it's supporting all kinds of different workloads. How do you see that sort of evolving cloud? >> Yeah, I think the biggest difference to me is that a cloud starts with this idea that the infrastructure operator and the tenant are separate. And that actually has major architectural implications. It's maybe not a perfect analogy, but if I build a single-family home, where everything is owned by one party, I want to be able to walk from the kitchen to the living room pretty quickly, if that makes sense. So in my house here, it's actually an open kitchen, it's the same room, essentially. If you're building a hotel, where your primary goal is to have guests, you pick a completely different architecture. The kitchen for your restaurants, where the cooks are busy preparing the food, and the dining room, where the guests are sitting, are separate. The hotel staff has a dedicated place to work and the guests have dedicated places to mingle, but they don't overlap, typically. I think it's the same thing with architecture in the clouds. Initially the assumption was it's all one thing, and now suddenly we're starting to see a much cleaner separation of these different areas. I think a second major influence is that the type of workloads we're seeing is evolving incredibly quickly. 10 years ago, things were mostly monolithic; today most new workloads are microservice-based, and that has a huge impact on where CPU cycles are spent, where we need to put accelerators, and how we build silicon. To give you an idea, there's some really good research out of Google and Facebook where they ran the numbers. 
And for example, if you just take a standard system and you run an application in a microservice-based architecture, you can spend anywhere from, I want to say, 25% in some cases to over 80% of your CPU cycles just on overhead: on marshaling and demarshaling the protocols, the encryption and decryption of the packets, and the service mesh that sits in between all of these things. That creates a huge amount of overhead. So if up to 80% goes into these overhead functions, really the focus needs to be on: how do we enable that kind of infrastructure? >> Yeah, so let's talk a little bit more about workloads if we can. On the overhead, there's also sort of, as the data center becomes software-defined, thanks to your good work at VMware, it is a lot of cores that are supporting that software-defined data center. And then- >> It's at VMware, yeah. >> And as well, you mentioned microservices, container-based applications, but as well, AI is coming into play. And AI is just kind of amorphous, but it's really data-oriented workloads versus kind of general-purpose ERP and finance and HCM. So those workloads are exploding, and then we can maybe talk about the edge. How are you seeing the workload mix shift, and how is Intel playing there? >> I think the trend you're talking about is definitely right: we're getting more and more data-centric, and shifting the data around becomes a larger and larger part of the overall workload in the data center. And AI is getting a ton of attention. Look, if I talk to most operators, AI is still an emerging category. We're seeing, I'd say, five, maybe 10 percent of workloads being AI. It's growing, and they're very high-value workloads, and very challenging workloads, but it's still a smaller part of the overall mix. 
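The throughput implication of that overhead figure can be sketched with a quick back-of-envelope calculation. This is simply Amdahl's law applied to offloading overhead cycles to an accelerator; the overhead fractions come from the numbers quoted above, while the 10x acceleration factor is an illustrative assumption, not a measured result:

```python
def speedup_from_offload(overhead_frac: float, accel_factor: float) -> float:
    """Amdahl-style overall speedup when the overhead fraction of CPU
    cycles (marshaling, encryption, service mesh) is moved to an
    accelerator that handles it accel_factor times more efficiently."""
    remaining = (1.0 - overhead_frac) + overhead_frac / accel_factor
    return 1.0 / remaining

# 80% overhead with a hypothetical 10x accelerator:
print(round(speedup_from_offload(0.80, 10), 2))  # 3.57

# 25% overhead with the same accelerator:
print(round(speedup_from_offload(0.25, 10), 2))  # 1.29
```

The spread between the two results illustrates the point in the conversation: the more overhead-dominated the microservice workload, the bigger the lever that infrastructure-function acceleration provides.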
Now edge is big, and edge is two things: it's big and it's complicated, because the way I think about edge is that it's not just one homogeneous market, it's really a collection of separate sub-markets. It's very heterogeneous; it runs on a variety of different hardware. Edge can be everything from a little fanless server that's strapped to a telephone pole with an antenna on top of it, to a microcell, or it can be something that's running inside a car; modern cars have a small little data center inside. It can be something that runs on an industrial factory floor, or for the network operators. There's a pretty broad range of verticals that all look slightly different in their requirements. And I think it's really interesting; it's one of those areas that really creates opportunities for vendors like HPE to really shine and address this heterogeneity with a broad range of solutions, and we're very excited to work together with them in that space. >> Yeah, so I'm glad you brought HPE into the discussion, 'cause we're here at HPE Discover, I want to connect that. So when I think about HPE's strategy, I see a couple of opportunities for them. Obviously Intel is going to play in every part of the edge, the data center, the near edge and the far edge, and I gauge HPE does as well with Aruba. Aruba is going to go to the far edge. I'm not sure at this point, anyway it's not yet clear to me, how far HPE's traditional server business goes; to the inside of automobiles, we'll see, but it certainly will be at the, let's call it the near edge, as a consolidation point- >> Yeah. >> Et cetera. And look, the edge can be a race track, it could be a retail store, it could be defined in so many ways. Where does it make sense to process the data? So my question is, what's the role of the data center in this world of edge? How do you see it? 
>> Yeah, look, I think in a sense what the cloud revolution is doing is that it leads to a polarization of the classic data center into edge and cloud, if that makes sense; it's splitting. Before, this was all mingled together a little bit: if my data center is my basement anyways, what's the edge, what's the data center? It's the same thing. The moment I'm moving some workloads to the cloud, I don't even know where they're running anymore, while some other workloads that have to have a certain sense of locality, I need to keep close. And there are some workloads you just can't move into the cloud. If I'm generating lots of video data that I have to process, it's financially completely unattractive to shift all of that to a central location; I want to do this locally. And will I ever connect my smoke detector to my sprinkler system via the cloud? No, I won't; if things go bad, that may not work anymore. So I need something that does this locally. So I think there are many reasons why you want to keep something on premises. And I think it's a growing market, it's very exciting, and we're doing some very good stuff with friends like HPE. They have the ProLiant DL110 Gen10 Plus server with our latest 3rd Generation Xeons on it for Open RAN, which is the radio access network in the telco space, and the HPE Edgeline servers, also with 3rd Generation Xeons. There are some really nice products there that I think can really help enterprises, carriers and a number of different organizations with these edge use cases. >> Can you explain, you mentioned Open RAN, vRAN, should we essentially think of that as kind of the software-defined telco? >> Yeah, exactly. It's software-defined cellular. I actually learned a lot about that over the recent months. 
When I was taking these classes at Stanford, these things were still done in analog, meaning a radio signal would be processed in an analog way; today, typically, the radio signal is immediately digitized and all the processing of the radio signal happens digitally. And it happens on servers, some of them HPE servers. It's a really interesting use case where we're basically now able to do something in a much, much more efficient way by moving it to a digital, more modern platform. And it turns out you can actually virtualize these servers and run a number of different cells inside the same server. And it's really complicated, because you have to deliver fantastic real-time guarantees with a sophisticated software stack. But it's a really fascinating use case. >> A lot of times we have these debates, and it's maybe somewhat academic, but I'd love to get your thoughts on it. The debate is about how much of the data that is processed and inferred at the edge is actually going to come back to the cloud; most of the data is going to stay at the edge, and a lot of it's not even going to be persisted. So that's sort of the negative for the data center. But then the counter to that is there's going to be so much data that even a small percentage of all the data we're going to create means so much more data back in the cloud, back in the data center. What's your take on that? >> Look, I think there are different applications that are easier to do in certain places. Going to a large cloud has a couple of advantages. You have a very complete software ecosystem around you, lots of different services. And you have access to very specialized hardware: if I want to run a big deep learning task where somebody needs a thousand machines, and this runs for a couple of days, and then I don't need to do that for another month or two, for that the cloud is really great. 
There's on-demand infrastructure, having all this capability up there. At the same time, it costs money to send the data up there, and if I just look at the hardware cost, it's much, much cheaper to build it myself, in my own data center or at the edge. So I think we'll see customers picking and choosing what they want to do where, and there's a role for both, absolutely. At the end of the day, why do I absolutely need to have something at the edge? There's a couple of, I think, good use cases. Let me actually phrase it as three primary reasons. One is simply bandwidth: my video data, like if I have 100 4K video cameras with 60-frames-per-second feeds, there's no way I'm going to move that into the cloud. It's just cost prohibitive- >> Right. >> I have a hard time even getting (indistinct). Then there might be latency: if I want to reliably react in a very short period of time, I can't do that in the cloud, I need to do this locally. I can't even do this in my data center; this has to be very closely coupled. And then there's this idea of fate sharing. If I want to make sure that if things go wrong the system is still intact, anything that's sort of an emergency backup, an emergency-type procedure if things go wrong, I can't rely on the big good internet connection, I need to handle things locally; that's the smoke detector and the sprinkler system. So for all of these, there are good reasons why we need to move things close to the edge. I think there'll be a creative tension between the two, but both are huge markets. And I think there are great opportunities for HPE ahead to work on all these use cases. >> Yeah, for sure, they're a top brand in that compute business. So before we wrap up today, thinking about your role: part of your role is a trend spotter. 
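The bandwidth reason above is easy to sanity-check with rough numbers. The camera count and frame rate are from the conversation; the per-camera bitrates below are illustrative assumptions (uncompressed 24-bit RGB, and a ~25 Mbit/s ballpark for a compressed 4K60 stream; real encoder output varies widely):

```python
CAMERAS = 100

# Uncompressed 4K60: 3840x2160 pixels, 24 bits per pixel, 60 frames/s.
raw_bps_per_camera = 3840 * 2160 * 24 * 60
raw_gbps = CAMERAS * raw_bps_per_camera / 1e9

# Assumed compressed stream: ~25 Mbit/s per camera.
compressed_gbps = CAMERAS * 25e6 / 1e9

print(f"raw:        {raw_gbps:,.0f} Gbit/s")        # raw:        1,194 Gbit/s
print(f"compressed: {compressed_gbps:.1f} Gbit/s")  # compressed: 2.5 Gbit/s
```

Even after heavy compression, backhauling a sustained 2.5 Gbit/s to a central cloud is a significant recurring cost, which is one reason this class of video workload tends to stay at the edge.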
You're kind of driving innovation, right, surfing the waves as you said, skating to the puck, all the- >> I've got my perfect crystal ball right here, yeah. >> Yeah, all the cliches. (Dave chuckles) That puts a little pressure on you, but what are some of the things that you're overseeing, that you're looking towards, in terms of innovation projects, particularly obviously in the data center space? What's really exciting you? >> Look, there's a lot of them, and pretty much all the interesting ideas I get from talking to customers. You talk to the sophisticated customers, you try to understand the problems that they're trying to solve and can't solve right now, and that gives you ideas. Just to pick a couple: one area I'm probably thinking about a lot is, how can we build, in a sense, better accelerators for the infrastructure functions? No matter if I run an edge cloud or a big public cloud, I want to find ways to reduce the amount of CPU cycles I spend on microservice marshaling and demarshaling, service mesh, storage acceleration and things like that. Clearly, if this is a large chunk of the overall cycle budget, we need to find ways to shrink that, to make this more efficient. So this basic infrastructure function acceleration sounds probably as unsexy as any topic could sound, but I think this is actually a really, really interesting area and one of the big levers we have right now in the data center. >> Yeah, I would agree, Guido, I think that's actually really exciting, because you can pick up a lot of the wasted cycles now and that drops right to the bottom line, but please- >> Yeah, exactly. And it's kind of funny, we're still measuring so much with SPEC rates and raw CPU performance; it's like, well, we may actually be measuring the wrong thing. 
If 80% of the cycles of my app are spent in overhead, then the speed of the CPU doesn't matter as much; it's other functions that (indistinct). >> Right. >> So that's one. The second big one is that memory is becoming a bigger and bigger issue, and it's memory cost: memory prices used to decline at the same rate that core counts and clock speeds increased, and that's no longer the case. We've run into some scaling limits, some physical scaling limits, where memory prices are becoming stagnant. And this has become a major pain point for everybody who's building servers. So I think we need to find ways to leverage memory more efficiently, share memory more efficiently. We have some really cool ideas in that space that we're working on. >> Well, yeah. And Pat, sorry to interrupt, but Pat hinted at that in your big announcement. He talked about system-on-package, and I think what you're describing is what I call disaggregated memory and better sharing of that memory resource. And that seems to be a clear value-creation benefit for the industry. >> Exactly. If for our customers this becomes a larger part of the overall cost, we want to help them address that issue. And the third one is, we're seeing more and more data center operators that are effectively power limited. So we need to reduce the overall power of systems, or maybe to some degree just figure out better ways of cooling these systems. But I think there's a lot of innovation that can be done there, both to make these data centers more economical and also to make them a little more green. Today, data centers have gotten big enough that if you look at the total amount of energy that we're spending as mankind, a chunk of that is going just to data centers. 
And so if we're spending energy at that scale, I think we have to start thinking about how we can build data centers that are more energy efficient, that do the same thing with less energy in the future. >> Well, thank you for laying those out. You guys have been long-term partners with HP, and now of course HPE. I'm sure Gelsinger is really happy to have you on board, Guido, I would be, and thanks so much for coming to theCUBE. >> It's great to be here, and great to be at the HPE show. >> And thanks for being with us for HPE Discover 2021, the virtual version. You're watching theCUBE, the leader in digital tech coverage, we'll be right back. (soft music)
Guido Appenzeller | HPE Discover 2021
(soft music) >> Welcome back to HPE Discover 2021, the virtual version, my name is Dave Vellante and you're watching theCUBE and we're here with Guido Appenzeller, who is the CTO of the Data Platforms Group at Intel. Guido, welcome to theCUBE, come on in. >> Aww, thanks Dave, I appreciate it. It's great to be here today. >> So I'm interested in your role at the company, let's talk about that, you're brand new, tell us a little bit about your background. What attracted you to Intel and what's your role here? >> Yeah, so I'm, I grew up with the startup ecosystem of Silicon Valley, I came from my PhD and never left. And, built software companies, worked at software companies worked at VMware for a little bit. And I think my initial reaction when the Intel recruiter called me, was like, Hey you got the wrong phone number, I'm a software guy, that's probably not who you're looking for. And, but we had a good conversation but I think at Intel, there's a realization that you need to look at what Intel builds more as this overall system from an overall systems perspective. That the software stack and then the hardware components are all getting more and more intricately linked and, you need the software to basically bridge across the different hardware components that Intel is building. So again, I was the CTO for the Data Platforms Group, so that builds the data center products here at Intel. And it's a really exciting job. And these are exciting times at Intel, with Pat, I've got a fantastic CEO at the helm. I've worked with him before at VMware. So a lot of things to do but I think a very exciting future. >> Well, I mean the, the data centers the wheelhouse of Intel, of course your ascendancy was a function of the PCs and the great volume and how you change that industry but really data centers is where, I remember the days people said, Intel will never be at the data center, it's just the toy. And of course, you're dominant player there now. 
So your initial focus here is really defining the vision and I'd be interested in your thoughts on the future what the data center looks like in the future where you see Intel playing a role, what are you seeing as the big trends there? Pat Gelsinger talks about the waves, he says, if you don't ride the waves you're going to end up being driftwood. So what are the waves you're driving? What's different about the data center of the future? >> Yeah, that's right. You want to surf the waves, that's the way to do it. So look, I like to look at this and sort of in terms of major macro trends, And I think that the biggest thing that's happening in the market right now is the cloud revolution. And I think we're well halfway through or something like that. And this transition from the classic, client server type model, that way with enterprises running all data centers to more of a cloud model where something is run by hyperscale operators or maybe run by an enterprise themselves of (indistinct) there's a variety of different models. but the provisioning models have changed. It's much more of a turnkey type service. And when we started out on this journey I think the, we built data centers the same way that we built them before. Although, the way to deliver IT have really changed, it's going through more of a service model and we really know starting to see the hardware diverge, the actual silicon that we need to build and how to address these use cases, diverge. And so I think one of the things that is probably most interesting for me is really to think through, how does Intel in the future build silicon that's built for clouds, like on-prem clouds, edge clouds, hyperscale clouds, but basically built for these new use cases that have emerged. 
>> So just a quick, kind of a quick aside, to me the definition of cloud is changing, it's evolving and it used to be this set of remote services in a hyperscale data center, it's now that experience is coming on-prem it's connecting across clouds, it's moving out to the edge it's supporting, all kinds of different workloads. How do you see that sort of evolving cloud? >> Yeah, I think, there's the biggest difference to me is that sort of a cloud starts with this idea that the infrastructure operator and the tenant are separate. And that is actually has major architectural implications, it just, this is a perfect analogy, but if I build a single family home, where everything is owned by one party, I want to be able to walk from the kitchen to the living room pretty quickly, if that makes sense. So, in my house here is actually the open kitchen, it's the same room, essentially. If you're building a hotel where your primary goal is to have guests, you pick a completely different architecture. The kitchen from your restaurants where the cooks are busy preparing the food and the dining room, where the guests are sitting, they are separate. The hotel staff has a dedicated place to work and the guests have a dedicated places to mingle but they don't overlap, typically. I think it's the same thing with architecture in the clouds. So, initially the assumption was it's all one thing and now suddenly we're starting to see like a much cleaner separation of these different areas. I think a second major influence is that the type of workloads we're seeing it's just evolving incredibly quickly, 10 years ago, things were mostly monolithic, today most new workloads are microservice based, and that has a huge impact in where CPU cycles are spent, where we need to put an accelerators, how we build silicon for that to give you an idea, there's some really good research out of Google and Facebook where they run numbers. 
And for example, if you just take a standard system and you run a microservice based an application but in the microservice-based architecture you can spend anywhere from I want to say 25 in some cases, over 80% of your CPU cycles just on overhead, and just on, marshaling demarshaling the protocols and the encryption and decryption of the packets and your service mesh that sits in between all of these things, that created a huge amount of overhead. So for us might have 80% go into these overhead functions really all focus on this needs to be on how do we enable that kind of infrastructure? >> Yeah, so let's talk a little bit more about workloads if we can, the overhead there's also sort of, as the software as the data center becomes software defined thanks to your good work at VMware, it is a lot of cores that are supporting that software-defined data center. And then- >> It's at VMware, yeah. >> And as well, you mentioned microservices container-based applications, but as well, AI is coming into play. And what is, AI is just kind of amorphous but it's really data-oriented workloads versus kind of general purpose ERP and finance and HCM. So those workloads are exploding, and then we can maybe talk about the edge. How are you seeing the workload mix shift and how is Intel playing there? >> I think the trends you're talking about is definitely right, and we're getting more and more data centric, shifting the data around becomes a larger and larger part of the overall workload in the data center. And AI is getting a ton of attention. Look if I talk to the most operators AI is still an emerging category. We're seeing, I'd say five, maybe 10% percent of workloads being AI is growing, they're very high value workloads. So (indistinct) any workloads, but it's still a smaller part of the overall mix. 
Now edge is big and edge is two things, it's big and it's complicated because of the way I think about edge is it's not just one homogeneous market, it's really a collection of separate sub markets It's, very heterogeneous, it runs on a variety of different hardware. Edge can be everything from a little server, that's (indistinct), it's strapped to a phone, a telephone pole with an antenna on top of it, to (indistinct) microcell, or it can be something that's running inside a car, modern cars has a small little data center inside. It can be something that runs on an industrial factory floor, the network operators, there's pretty broad range of verticals that all looks slightly different in their requirements. And, it's, I think it's really interesting, it's one of those areas that really creates opportunities for vendors like HPE, to really shine and address this heterogeneity with a broad range of solutions, very excited to work together with them in that space. >> Yeah, so I'm glad you brought HPE into the discussion, 'cause we're here at HPE Discover, I want to connect that. But so when I think about HPE strategy, I see a couple of opportunities for them. Obviously Intel is going to play in every part of the edge, the data center, the near edge and the far edge, and I gage HPE does as well with Aruba. Aruba is going to go to the far edge. I'm not sure at this point, anyway it's not yet clear to me how far, HPE's traditional server business goes to the, inside of automobiles, we'll see, but it certainly will be at the, let's call it the near edge as a consolidation point- >> Yeah. >> Et cetera and look the edge can be a race track, it could be a retail store, it could be defined in so many ways. Where does it make sense to process the data? But, so my question is what's the role of the data center in this world of edge? How do you see it? 
>> Yeah, look, I think in a sense what the cloud revolution is doing is that it's showing us, it leads to polarization of a classic data into edge and cloud, if that makes sense, it's splitting, before this was all mingled a little bit together, if my data centers my basement anyways, what's the edge, what's data center? It's the same thing. The moment I'm moving some workloads to the clouds I don't even know where they're running anymore then some other workloads that have to have a certain sense of locality, I need to keep closely. And there are some workloads you just can't move into the cloud. There's, if I'm generating lots of all the video data that I have to process, it's financially a completely unattractive to shift all of that, to a central location, I want to do this locally. And will I ever connect my smoke detector with my sprinkler system be at the cloud? No I won't (Guido chuckles) this stuff, if things go bad, that may not work anymore. So I need something that's that does this locally. So I think there's many reasons, why you want to keep something on premises. And I think it's a growing market, it's very exciting, we're doing some very good stuff with friends like HPE, they have the ProLiant DL, one 10 Gen10 Plus server with our latest a 3rd Generation Xeons on them the Open RAN, which is the radio access network in the telco space. HP Edgeline servers, also a 3rd Generation Xeons there're some really nice products there that I think can really help addressing enterprises, carriers and a number of different organizations, these edge use cases. >> Can you explain, you mentioned Open RAN, vRAN, should we essentially think of that as kind of the software-defined telco? >> Yeah, exactly. It's software-defined cellular. I actually, I learned a lot about that over the recent months. 
When I was taking these classes at Stanford, these things were still done in analog, that doesn't mean a radio signal will be processed in an analog way and digest it and today typically the radio signal is immediately digitized and all the processing of the radio signal happens digitally. And, it happens on servers, some of them HPE servers. And, it's a really interesting use case where we're basically now able to do something in a much, much more efficient way by moving it to a digital, more modern platform. And it turns out you can actually virtualize these servers and, run a number of different cells, inside the same server. And it's really complicated because you have to have fantastic real-time guarantees versus sophisticated software stack. But it's a really fascinating use case. >> A lot of times we have these debates and it's maybe somewhat academic, but I'd love to get your thoughts on it. And debate is about, how much data that is processed and inferred at the edge is actually going to come back to the cloud, most of the data is going to stay at the edge, a lot of it's not even going to be persisted. And the counter to that is, so that's sort of the negative is at the data center, but then the counter that is there going to be so much data, even a small percentage of all the data that we're going to create is going to create so much more data, back in the cloud, back in the data center. What's your take on that? >> Look, I think there's different applications that are easier to do in certain places. Look, going to a large cloud has a couple of advantages. You have a very complete software ecosystem around you, lots of different services. You'll have first, if you need very specialized hardware, if I wanted to run the bigger learning task where somebody needed a 1000 machines, and then this runs for a couple of days, and then I don't need to do that for another month or two, for that is really great. 
There's on-demand infrastructure, having all this capability up there. At the same time, it costs money to send the data up there, and if I just look at the hardware cost, it's much, much cheaper to build it myself in my own data center or at the edge. So I think we'll see customers picking and choosing what they want to do where, and there's a role for both, absolutely. At the end of the day, why do I absolutely need to have something at the edge? Let me rephrase that a little bit: I think there are three primary reasons. One is simply bandwidth. My video data, say I have 100 4K video cameras with 60-frames-per-second feeds, there's no way I'm going to move that into the cloud. It's just cost prohibitive- >> Right. >> I have a hard time even getting (indistinct). The second is latency: if I want to reliably react in a very short period of time, I can't do that in the cloud, I need to do this locally. I can't even do this in my data center; this has to be very closely coupled. And then there's this idea of fate sharing. If I want to make sure that when things go wrong the system is still intact, anything that's sort of an emergency backup, an emergency-type procedure, I can't rely on the big good internet connection; I need to handle things locally. That's the smoke detector and the sprinkler system. So for all of these there are good reasons why we need to move things close to the edge. I think there'll be a creative tension between the two, but both are huge markets, and I think there are great opportunities ahead for HPE to work on all these use cases. >> Yeah, for sure, a top brand in that compute business. So before we wrap up today, thinking about your role, part of your role is a trend spotter.
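Guido's bandwidth reason is easy to sanity-check with back-of-the-envelope arithmetic. A rough sketch in Python; the 4K resolution, bit depth, and the ~25 Mbit/s compressed bitrate per camera are my own illustrative assumptions, not figures from the interview:

```python
# Back-of-the-envelope check of the "100 4K cameras at 60 fps" claim.
# Assumed: 4K = 3840x2160 pixels, 24 bits/pixel uncompressed, and
# ~25 Mbit/s per camera for a surveillance-grade compressed stream.

def uncompressed_gbps(cameras, width=3840, height=2160, fps=60, bits_per_pixel=24):
    """Raw (uncompressed) aggregate bandwidth in Gbit/s."""
    return cameras * width * height * fps * bits_per_pixel / 1e9

def compressed_gbps(cameras, mbps_per_camera=25):
    """Aggregate bandwidth in Gbit/s assuming compressed streams."""
    return cameras * mbps_per_camera / 1e3

raw = uncompressed_gbps(100)   # ~1194 Gbit/s of raw pixels
enc = compressed_gbps(100)     # 2.5 Gbit/s even after compression
print(f"uncompressed: {raw:.0f} Gbit/s, compressed: {enc:.1f} Gbit/s")
```

Even with aggressive compression, 100 such cameras sustain roughly 2.5 Gbit/s of continuous upload, and over a terabit per second uncompressed, which is why this workload stays local.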
You're kind of driving innovation, right? Surfing the waves, as you said, skating to where the puck is going, all the- >> I've got my perfect crystal ball right here, yeah. >> Yeah, all the cliches. (Dave chuckles) That puts a little pressure on you, but what are some of the things that you're seeing, that you're looking toward in terms of innovation projects, particularly, obviously, in the data center space? What's really exciting you? >> Look, there are a lot of them, and pretty much all the interesting ideas I get come from talking to customers. You talk to the sophisticated customers, you try to understand the problems that they're trying to solve and can't solve right now, and that gives you ideas. Just to pick a couple: one area I'm probably thinking about a lot is how we can build, in a sense, better accelerators for infrastructure functions. So no matter if I run an edge cloud or a big public cloud, I want to find ways to reduce the amount of CPU cycles I spend on microservice marshaling and demarshaling, the service mesh, storage acceleration, things like that. Clearly, if this is a large chunk of the overall cycle budget, we need to find ways to shrink it, to make this more efficient. So this basic infrastructure function acceleration probably sounds as unsexy as any topic could sound, but I think it's actually a really, really interesting area and one of the big levers we have right now in the data center. >> Yeah, I would agree, Guido, I think that's actually really exciting because you can pick up a lot of the wasted cycles now, and that drops right to the bottom line, but please- >> Yeah, exactly. And it's kind of funny, we're still measuring so much with SPEC rates of CPU performance; it's like, well, we may actually be measuring the wrong thing.
If 80% of the cycles of my app are spent in overhead, then the speed of the CPU doesn't matter as much; it's the other functions that (indistinct). >> Right. >> So that's one. The second big one is that memory is becoming a bigger and bigger issue, and it's memory cost. Memory prices used to decline at roughly the same rate that our core counts and clock speeds increased; that's no longer the case. We've run into some physical scaling limits, and memory prices are becoming stagnant. This has become a major pain point for everybody who's building servers. So I think we need to find ways to leverage memory more efficiently, to share memory more efficiently. We have some really cool ideas in that space that we're working on. >> Well, yeah. And Pat, sorry to interrupt, but Pat hinted at that in your big announcement. He talked about system on package, and I think that's what you used to talk about what I call disaggregated memory and better sharing of that memory resource. And that seems to be a clear benefit of value creation for the industry. >> Exactly. If, for our customers, this becomes a larger part of the overall cost, we want to help them address that issue. And the third one is, we're seeing more and more data center operators that are effectively power limited. So we need to reduce the overall power of systems, or maybe to some degree just figure out better ways of cooling these systems. I think there's a lot of innovation that can be done there, both to make these data centers more economical and to make them a little more green. Today, data centers have gotten big enough that if you look at the total amount of energy that we as mankind are spending, a chunk of that is going just to data centers.
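The arithmetic behind the 80%-overhead point above is just Amdahl's law applied to infrastructure cycles. A small illustrative sketch; the numbers are hypothetical, not from the interview:

```python
# Amdahl's law with two cycle pools: application work and infrastructure
# overhead (marshaling, service mesh, storage stack). Each pool can be
# sped up independently.
def overall_speedup(overhead, cpu_speedup=1.0, offload_speedup=1.0):
    """Overall speedup when app cycles and overhead cycles are each
    accelerated by their own factor."""
    app = 1.0 - overhead
    return 1.0 / (app / cpu_speedup + overhead / offload_speedup)

# With 80% of cycles in overhead, a 2x faster CPU gives only ~1.11x overall,
# while halving the overhead via an accelerator gives ~1.67x.
print(round(overall_speedup(0.80, cpu_speedup=2.0), 2))      # 1.11
print(round(overall_speedup(0.80, offload_speedup=2.0), 2))  # 1.67
```

This is why, as Guido says, raw CPU speed may be the wrong thing to measure: shrinking the overhead pool moves the needle far more than speeding up the shrinking slice of real application work.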
And so if we're spending energy at that scale, I think we have to start thinking about how we can build data centers that are more energy efficient, that do the same work with less energy, in the future. >> Well, thank you for laying those out. You guys have been long-term partners with HP and now, of course, HPE. I'm sure Gelsinger is really happy to have you on board, Guido, I would be, and thanks so much for coming on theCUBE. >> It's great to be here and great to be at the HPE show. >> And thanks for being with us for HPE Discover 2021, the virtual version. You're watching theCUBE, the leader in digital tech coverage. Be right back. (soft music)
Evaristus Mainsah, IBM & Kit Ho Chee, Intel | IBM Think 2020
>> Announcer: From theCUBE studios in Palo Alto and Boston, it's theCUBE, covering IBM Think, brought to you by IBM. >> Hi there, this is Dave Vellante. We're back at the IBM Think 2020 Digital Event Experience; we're socially responsible and distant. I'm here in the studio in Marlborough, our team in Palo Alto. We've been going wall to wall with coverage of IBM Think. Kit Chee is here, the Vice President and General Manager of Cloud and Enterprise Sales at Intel. Kit, thanks for coming on. Good to see you. >> Thank you, Dave. Thank you for having me on. >> You're welcome, and Evaristus Mainsah is here. He is the General Manager of the IBM Cloud Pak Ecosystem for the IBM Cloud. Evaristus, it's good to see you again. Thank you very much, I appreciate your time. >> Thank you, Dave. Thank you very much. Thanks for having me. >> You're welcome. So Kit, let me start with you. How are you guys doing? You know, there's this pandemic, never seen it before. How're things where you are? >> Yeah, so we were quite fortunate. Intel has had a pandemic leadership team for about 15 years now, a team consisting of medical, safety, and operational professionals, and this same team, which has navigated us across several other health issues like bad flu, Ebola, and Zika, is navigating us at this point through this pandemic. Obviously, our top priority, as it would be for IBM, is protecting the health and well-being of employees while keeping the business running for our customers. The company has taken the following measures to take care of its direct and indirect workforce, Dave, and to ensure business continuity throughout the developing situation: areas like work-from-home policies, keeping hourly workers home and reimbursing for daycare and elderly care, and helping with WiFi policies.
So that's what we've been up to. Intel's manufacturing and supply chain operations around the world are working hard to meet demand, and we are collaborating with the supply chains of our customers and partners globally as well. More recently, we have committed about $16 million to support communities, from frontline health care workers to technology initiatives like online education, telemedicine, and compute for research. So that's where we are to date. Pretty much, you know, busy. >> You know, Evaristus, let me come to you. I have to say, my entire career has been in the technology business, and you know, sometimes you hear negativity toward big tech, but I've got to say, just as Kit was saying, big tech has really stepped up in this crisis. IBM has been no different, you know, tech for good, and I'm actually really proud. How are you doing in New York City? >> Evaristus: No, thank you, Dave, for that. You know, we're doing great, and our focus has been absolutely the same. Obviously, because we provide services to clients, at a time like this your clients need you even more, but we need to focus on our employees to make sure that their health and their safety and their well-being are protected. And so we've taken this really seriously, and we have two ways of doing this. One of them is just our purpose as a company toward our clients, but the other is trying to activate the ecosystem, because problems of this magnitude require you to work across a broad ecosystem to bring forth solutions that are long-lasting. For example, we have Call for Code, where we go out and ask developers to use their skills and open source technologies to help solve some technical problems. This year, the focus was on COVID-related initiatives: computing resources, how you track the coronavirus, and other services that are provided free of charge to our clients.
Let me give you a bit more color. IBM recently formed the COVID-19 High Performance Computing Consortium, made up of federal government, industry, and academic leaders, focused on providing high performance computing to solve the COVID-19 problem. Currently we have 33 members and 27 active projects, deploying something like 400 petaflops of compute to solve the problem. >> Well, it certainly is challenging times, but at the same time, you're both in the sweet spot, which is cloud. I've talked to a number of CIOs who have said, you know, we had a cloud strategy before, but we're really accelerating our cloud strategy now, and we see this as sort of a permanent effect. I mean, Kit, you guys are big on ecosystem; you want, frankly, a level playing field. The more optionality that you can give to customers, you know, the better, and cloud has really been exploding, and you guys are powering, you know, all the world's clouds. >> We are, Dave, and honestly, that's a huge responsibility that we undertake. Before the pandemic, we saw the market through the lens of four key megatrends, and the experiences we are all having currently deepen our belief in the importance of addressing them. Specifically, we see marketplace needs around a few key areas. Cloudification of everything is the first; the amount of online activity that has spiked just in the last 60 days is a testimony to that. Pervasive AI is the second big area, and we are now resolute on investments in that area, along with 5G network transformation and the edge build-out.
Applications run the business, and we know enterprise IT faces challenges when deploying applications that require data movement between clouds. Cloud-native technologies like containers and Kubernetes will be key enablers in delivering end-to-end data analytics, AI, machine learning, and other critical workloads in cloud environments and at the edge. Pairing Intel's data-centric portfolio, including Intel Optane SSDs, with Red Hat OpenShift and IBM Cloud Paks, enterprises can now break through storage bottlenecks and have unconstrained data availability in hybrid and multicloud environments. So we're pretty happy with the progress we're making there together with IBM. >> Yeah, Evaristus, I mean, you guys are making some big bets. I've, you know, written and discussed in my Breaking Analysis that I think a lot of people misunderstand IBM Cloud. Ginni Rometty and Arvind both said, hey, you know, only 20% of the workloads are in the cloud; we're going after the really difficult-to-move workloads and the hybrid workloads. That's really the fourth foundation that Arvind, you know, talks about, that IBM has built: you've got your mainframes, you have middleware services, and hybrid cloud is really that fourth sort of platform that you're building out. But you're making some bets in AI, you've got other services in the cloud like blockchain, you know, quantum; we've been having really interesting discussions around quantum. So I wonder if you can talk a little bit about where you're allocating resources, some of the big bets that you're making for the next decade. >> Well, thank you very much, Dave, for that.
I think what we're seeing with clients is that there's increasing focus on, and really an acceptance, that the best way to take advantage of the cloud is through a hybrid cloud strategy infused with data. So it's not just the cloud itself, but what you need to do with data to make sure that you can really, truly transform yourself digitally, to improve your operations and use your data to improve the way that you work and the way that you serve your clients. And you see studies out there that say that if you adopt a hybrid cloud strategy, it's 2.5 times more effective than a public-cloud-only strategy. Why is that? Well, you get things such as, you know, a greater extent to which you can move your applications to the cloud. You get things such as reduction in risk, and you get a more flexible architecture, especially if you focus on open standards, and a reduction in some of the tools that you use. So we see clients looking at that. The other thing that's really important, especially in this moment, is business agility and resilience. Business agility says: my customers used to come in, and now they can't come in anymore because we need them to stay at home, so we still need to figure out a way to serve them. Can we rewrite our applications quickly enough in order to serve this client in a new way?
Well, if your applications haven't been modernized, even if you've moved to the cloud, you don't have the opportunity to do that. Many clients that have made that transformation figure out they're much more agile; they can move more easily in this environment. And we're seeing clients saying: yes, I do need to move to the cloud, but I need somebody to help improve my business agility, so that I can transform, I can change with the needs of my clients and with the demands of competition. And this leads you then to, you know, what sort of platform you need to enable you to do this. It's something that's open, so that you can write an application once and run it anywhere, which is why I think the IBM position with our ecosystem, and Red Hat with this open container Kubernetes environment that allows you to write an application once and deploy it anywhere, is really important for clients, especially in this environment. And the Cloud Paks, which, you know, I'm General Manager of the Cloud Pak Ecosystem, the logic of the Cloud Paks is exactly that: clients want to modernize once, write applications that are cloud native so that they can react more quickly to market conditions and to what their clients need, but in doing so they're not locked into a specific infrastructure that keeps them away from some of the technologies that may be available in other clouds. So we have talked about blockchain; we've got, you know, Watson AI technologies, which are available on our cloud; we've got the Weather Company assets. Those are key assets for many, many clients, because weather influences more than we realize. But if you were locked in a cloud that didn't give you access to any of those, because you hadn't written on the same platform, you know, that's not something that you want to support.
So Red Hat's platform, which is our platform, which is open, allows you to write your application once and deploy it anywhere, which matters particularly for our customers in this environment, together with the data pieces that come on top of that, so that you can scale. Because, you know, you've got six people but you need 600 of them; how do you scale them, or how do they use data and AI to do it? >> Okay, this must be music to your ears, this whole notion of, you know, multicloud, because, you know, Intel's pervasive, and so the more clouds that are out there, the better for you, better for your customers; as I said before, the more optionality. Can you talk a little bit about the relationship today between IBM and Intel? Because it's obviously evolved over the years: the PC, servers, you know, other collaboration, and now the cloud is, you know, the latest and probably the most relevant part of your collaboration. Talk more about what you guys are doing together that's interesting and relevant. >> You know, IBM and Intel have had a very rich history of collaboration, starting with the invention of the PC. So for those of us who may take a PC for granted, that was an invention over 40 years ago between the two companies, all the way to optimizing leadership IBM software like Db2 to run best on Intel's data center products today, right? But what's more germane today is the Red Hat piece of the story and how that plays into the partnership with IBM going forward. Intel was one of Red Hat's earliest investors, back in 1998, again something that most people may not realize: we were an early investor in Red Hat. And we've been a longtime pioneer of open source.
In fact, Navin Shenoy, Intel's Executive Vice President of the Data Platforms Group, was part of Paul Cormier's keynote at Red Hat Summit just last week; you should definitely go listen to that session. But in summary, together Intel and Red Hat have made commercial open source viable in the enterprise and in cloud computing globally. It's now used by nearly every vertical and horizontal industry. We are bringing our customers choice, scalability, and speed of innovation for key technologies today, such as security, telco NFV, and containers, and most recently Red Hat OpenShift. We're very excited to see IBM Cloud Paks, for example, standardized on top of OpenShift, as that builds the foundation for IBM's chapter two and allows Intel's value to scale to the Cloud Paks and ultimately to IBM customers. Intel began partnering with IBM on what are now called Paks over two years ago, and we are committed to that success and to scaling it through the ecosystem: hardware partners, ISVs, and our channel. >> Yeah, so theCUBE, by the way, covered Red Hat Summit last week; Stu Miniman and I did a detailed analysis. It was awesome, if we do say so ourselves, but awesome in the sense that it allowed us to really unpack what's going on at Red Hat and what's happening at IBM. Evaristus, I want to come back to you on the Cloud Paks. It's the kind of brand that you guys have; you've got Cloud Paks all over the place: Cloud Paks for applications, data, integration, automation, multicloud management. What do we need to know about the Cloud Paks? What are the relevant components there?
>> Evaristus: I think the key is to think of this as, you know, software that is cloud native and designed for specific core use cases, and it's built on Red Hat Enterprise Linux with the Red Hat OpenShift container Kubernetes environment. On top of that you get a set of common services that look the same right across all of them, and then on top of that you've got specific software, both open source and IBM, that deals with specific client situations. If you're dealing with applications, for example, the open source and IBM software would be the runtimes that you need to write and deploy applications. If you're dealing with data, then you've got the Cloud Pak for Data. The foundation is still Red Hat Enterprise Linux, with the Red Hat OpenShift container Kubernetes environment sitting on top of that, providing you with a set of common services, and then you get a combination of IBM's own software as well as open source and third-party software that sits on top of that, as well as all of our AI infrastructure and machine learning, to enable you to do everything that you need to do with data to get insights. You've got automation to speed up and enable you to do work more efficiently and more effectively, to make your smart workers better, to make management easier, to help management manage work and processes. And then you've got multicloud management that allows you to see, from a single pane, all of your applications that you've deployed in the different clouds, because the idea here, of course, is that it's not all sitting in the same cloud: some of it is on prem, some of it is in other clouds, and you want to be able to see and deploy applications across all of those. And then you've got the Cloud Pak for Security, which has a combination of third-party offerings as well as ISV offerings as well as AI offerings.
Again, the structure is the same: RHEL, Red Hat OpenShift, and then you've got the software that enables you to manage all aspects of security and to deal with incidents when they arise. So that gives you data and applications, and then there's integration, because every time you start writing an application, you need to integrate: you need to access data from someplace securely, you need to bring two pipes together for them to communicate, and we use the Cloud Pak for Integration to do that. You can open up APIs and expose those APIs so others writing applications can gain access to them. And again, this idea of resilience, this idea of agility, so you can make changes and adapt things as you go. So that's what the Cloud Paks provide for you, and Intel has been an absolutely fantastic partner for us. One of the things that we do with Intel, of course, is work on the reference architectures that support our certification program for our hardware OEMs, so that we can scale that process and get many more OEMs to adopt and be ready for the Cloud Paks, and then we work with them on some of the ISV partners right up front. >> Got it. Let's talk about the edge. Kit, you mentioned 5G. I mean, it's a really exciting time. (laughs) You've got windmills, you've got autonomous vehicles, you've got factories, you've got ships, you know, shipping containers. I mean, everything's getting instrumented, data everywhere. So I'm interested in, let's start with Intel's point of view on the edge, how that's going to evolve, you know, what it means for cloud. >> You know, Dave, it's definitely the future, and we're excited to partner with IBM here. In addition to the enterprise edge, the communication service providers, think of the telcos, can take advantage of running standardized open software at the telco edge, enabling a range of new workloads via scalable services, something that, you know, didn't happen in the past, right?
Earlier this year, Intel announced new second-generation Xeon Scalable and Atom-based processors targeting the 5G radio access network, so this is a new area for us in terms of investments going into the 5G RAN. By deploying these new technologies with cloud-native platforms like Red Hat OpenShift and IBM Cloud Paks, comm service providers can now make full use of their network investments and bring new services such as artificial intelligence, augmented reality, virtual reality, and gaming to the market. We've only touched the surface when it comes to 5G and telco, but with IBM, Red Hat, and Intel coming together, I would say, you know, this space is super, super interesting; it's still developing, and we're just getting started. >> Evaristus, what do you think this means for cloud, and how will that evolve? Is this sort of a new cloud that will form at the edge? Obviously, a lot of data is going to stay at the edge; probably new architectures are going to emerge. And again, to me, it's all about data: you can create more data, push more data back to the cloud so you can model it. Some of the data is going to have to be processed in real time at the edge, but it just really extends the network to new horizons. >> Evaristus: It does exactly that, Dave, and we think of it the same way, which is why the platform is the same, right? You won't be surprised to see that the platform is based on open containers and the Kubernetes container environment provided by Red Hat, and so whether your data ends up living at the edge, or in a private data center, or in some public cloud, and however it flows between all of them, we want to make it easy for our clients to handle it. This is very exciting for us. We just announced IBM Edge Application Manager, which allows you to deploy and manage applications at endpoints of all kinds of devices. And we're not talking about 20 or 30 of them; we're talking about thousands or hundreds of thousands.
And in fact, we're working with Intel's device onboarding, which will enable us to onboard devices very, very easily at scale. If you combine that with IBM Edge Application Manager, it helps you onboard and manage those devices. So we think this is really important. We see lots of work moving to edge devices; many of these devices and endpoints now have sufficient compute to be able to run it. But right now, with IoT devices, the data is transferred hundreds of miles away to some data center to be processed at enormous cost, and then only 1% of it is actually useful, right? 99% of it gets thrown away. Some of that data has residency requirements, so you may not be able to move it for processing, so why wouldn't you just process the data where it is created, run your analytics where the data sits? Or you have situations that are disconnected as well, so you can't actually do that; you don't want the tills in the supermarket to stop because you lost connectivity with your data center. And so there's the importance of being able to work offline, and IBM Edge Application Manager allows you that. It's autonomous, so you can do all of this without using lots of people, because the process is all sort of automated, and it works whether you're connected or disconnected, and then you get replication when you reconnect, which gets really, really powerful. >> All right, I think the developer model is going to be really interesting here. There are so many new use cases and applications. Of course, Intel's always had a very strong developer ecosystem, and, you know, IBM understands the importance of developers. Guys, we've got to wrap up, but I wonder if you could each, maybe start with Kit.
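The 1%-useful, 99%-discarded pattern Evaristus describes is typically implemented as a local filter that processes readings at the edge and forwards only the anomalies upstream. A toy sketch; the window size, the 3-sigma threshold, and all names are illustrative assumptions, not taken from IBM Edge Application Manager:

```python
# Toy edge-filtering pattern: keep a small rolling baseline locally and
# forward only readings that deviate sharply from it; everything else is
# consumed (and discarded) at the edge instead of crossing the network.
from statistics import mean, stdev

def filter_at_edge(readings, window=20, sigmas=3.0):
    """Yield only readings that deviate from the recent local baseline."""
    history = []
    for r in readings:
        if len(history) >= window:
            m, s = mean(history), stdev(history)
            if abs(r - m) > sigmas * s:
                yield r                # the ~1% worth sending to the cloud
        history.append(r)
        history = history[-window:]    # keep a bounded local window

stream = [10.0] * 50 + [99.0] + [10.0] * 49   # one anomaly in 100 readings
forwarded = list(filter_at_edge(stream))
print(f"{len(forwarded)} of {len(stream)} readings forwarded")
```

Here 1 of 100 readings crosses the network; the other 99 never leave the device, which is the economics behind processing data where it is created.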
Give us your sense as to where you want to see this partnership go. What can we expect over the next, you know, two to five years and beyond? >> I think it's the area of, you know, 5G and how that plays out in terms of the edge build-out that we just touched on. I think that's a really interesting space, and what Evaristus said is spot on: the processing and the analytics at the edge are still fairly nascent today, and that's growing. So that's one area. Building out the cloud for the different enterprise applications is the other one, and obviously it's going to be a hybrid world; it's not just a public cloud world or an on-prem world. So for the whole hybrid build-out, the work that both of us, IBM and Intel, need to do will be critical to ensure that, you know, enterprise IT has solutions across the hybrid spectrum. >> Great. Evaristus, give us the last word, bring us home. >> Evaristus: And I would agree with that as well, Kit. I will say the work that you do around Intel's Market Ready Solutions, where we can bring our ecosystem together to do even more on the edge for some of these use cases, and the work that we're doing around blockchain, which I think, you know, is again another important piece of work. And I think what we really need to do is to focus on helping clients, because many of them are working through those early cases right now, to identify use cases that work, and a commitment to open standards, using exactly the same standards across the board, like what you've got with the Open Retail Initiative, I think is going to be really important to help that scale. But I wanted to just add one more thing, Dave, if you permit me. >> Yeah.
>> Evaristus: In this COVID era, one of the things that we've been able to do for customers, which has been really helpful, is providing free technology for 90 days to enable them to work in an offline situation, to work away from the office. One example is just the ability to transfer files when bandwidth is an issue, because the parents and the kids are all working from home. We have a protocol, IBM Aspera, which we'll make available to customers for 90 days at no cost. You don't need to give us your credit card; just log on and use it to improve the way that you work, so your bandwidth feels as if you were in the office. We have Watson Assistant, which is now helping clients in more than 18 countries do the same thing, basically providing COVID information. So those are all available. There's a slew of offerings that we have. We just want listeners to know that they can go on the IBM website, and they can get those offerings and deploy and use them now. >> That's huge. I knew about the 90 day program; I didn't realize Aspera was part of that, and that's really important, because you're like, okay, how am I going to get this file there? And so thank you for sharing that, and guys, great conversation. You know, hopefully next year we could be face to face, even if we still have to be socially distant, but it was really a pleasure having you on. Thanks so much. Stay safe, and good stuff. I appreciate it. >> Evaristus: Thank you very much, Dave. Thank you, Kit. Thank you. >> Thank you, thank you. >> All right, and thank you for watching, everybody. This is Dave Vellante for theCUBE, our wall to wall coverage of the IBM Think 2020 Digital Event Experience. We'll be right back right after this short break. (upbeat music)
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Evaristus | PERSON | 0.99+ |
Steve Minima | PERSON | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
Mainsah | PERSON | 0.99+ |
Levin Shenoy | PERSON | 0.99+ |
99% | QUANTITY | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
600 | QUANTITY | 0.99+ |
Telcos | ORGANIZATION | 0.99+ |
1998 | DATE | 0.99+ |
Dave Volante | PERSON | 0.99+ |
Evaristus Mainsah | PERSON | 0.99+ |
Marlborough | LOCATION | 0.99+ |
33 members | QUANTITY | 0.99+ |
Boston | LOCATION | 0.99+ |
90 days | QUANTITY | 0.99+ |
New York City | LOCATION | 0.99+ |
2.5 times | QUANTITY | 0.99+ |
Telco | ORGANIZATION | 0.99+ |
27 active products | QUANTITY | 0.99+ |
two | QUANTITY | 0.99+ |
two companies | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
Intel | ORGANIZATION | 0.99+ |
400 teraflops | QUANTITY | 0.99+ |
1% | QUANTITY | 0.99+ |
next year | DATE | 0.99+ |
COVID-19 | OTHER | 0.99+ |
hundreds of miles | QUANTITY | 0.99+ |
about $16 Million | QUANTITY | 0.99+ |
last week | DATE | 0.99+ |
both | QUANTITY | 0.99+ |
six people | QUANTITY | 0.99+ |
Red Hat | TITLE | 0.99+ |
Cloud Paks | TITLE | 0.99+ |
Red Hat Enterprise Linux | TITLE | 0.99+ |
five years | QUANTITY | 0.99+ |
hundreds of thousands | QUANTITY | 0.98+ |
Kit | PERSON | 0.98+ |
One example | QUANTITY | 0.98+ |
second generation | QUANTITY | 0.98+ |
more than 18 countries | QUANTITY | 0.98+ |
AVADA | ORGANIZATION | 0.98+ |
This year | DATE | 0.98+ |
Data Platforms Group | ORGANIZATION | 0.98+ |
Yaron Haviv, iguazio | AWS re:Invent 2017
Live from Las Vegas. It's the Cube covering AWS re:Invent 2017, presented by AWS, Intel, and our ecosystem of partners. >> Hello, welcome back. This is live coverage of the Cube's AWS re:Invent 2017. Two sets, a lot of action, day one of three days of wall to wall coverage. I'm John Furrier with my co-host Keith Townsend. Our next guest, a Cube alumni, is Yaron Haviv, who's the founder and CTO of Iguazio, a hot new start up. And big news coming next. We got a big announcement. We've been following their work. Yaron, good to see you again. Thanks for coming back on. >> Hi, thanks! >> Hey, you got a new shirt. Share that logo there. >> That's nuclio. That's our new serverless framework, which is open source. Really kicks ass; it's about 100 times faster than Amazon. >> Word is it's 200 times faster. >> Yeah, we don't want to shame. >> You set the bar. >> We're doing 400,000 events per second on a single process. They do about 2,000. Most of the open source projects are around the same ballpark. >> Yaron, I've got to get this off the bat, and then we can have a nice discussion afterwards. A pleasant discussion. Serverless. Let's first define what that means. Because there's a bunch of... I can take nuclio, install it in my data center, run it; am I serverless? >> You know, so, I mean, I'm in the serverless working group. >> For CNCF. >> For CNCF. And we had a hot debate between the open source start ups doing what is called function as a service, and Amazon and others trying to push the notion of serverless, which stands for server-less, meaning you don't manage servers. And the way we position nuclio, it's actually both. Because on one end you can consume it as an open source project: very easy to download, a single Docker instruction and it's up and running, unlike some other solutions. And on the other hand, you can consume it as something within the Iguazio data platform. There is a slide from Amazon which I really like, which is about serverless.
They show serverless attached to Kinesis, DynamoDB, S3 and Athena: four data services that attach to Lambda. Iguazio has API compatibility with Kinesis, with DynamoDB, with S3 and with Presto, which is Athena as well. So exactly the same four data services that they position as part of the serverless ecosystem are supported on our platform. So we provide one platform, all the data services that Amazon has, or at least the interesting ones, serverless functions which are a hundred times faster, a few more tricks that they don't have-- >> So what is the definition, then? In a pithy way, for someone out there who's learning about serverless. What is it? What's the definition? >> So the notion is, as a developer, you're sort of avoiding IT. You go open a nice portal, you write the function, or you write your function in a git repository somewhere. You click on a button and it gets deployed somewhere. Right now you know where it's going to get deployed. In the future, you may not know. >> Instead of an EC2 instance, get that prepared. >> It's not really an EC2. >> The old way. The old way was, right? >> The old way, there were infrastructure guys building your EC2 instance, security layers, middleware, etc. You go develop on your laptop, and then you need to go and conform, and all the continuous integration play was very complicated. Serverless comes inherently with scale out and with scale in, with continuous integration. You have versioning for the code. You can downgrade the version, you can upgrade the version. So essentially it's a packaged version of a cloud native solution. That's the general idea. >> So I can do that if I'm doing it and managing it myself. It functions as a server. And if a cloud provider is doing it and providing it as a service, it's serverless. None of my operations team is dealing with servers. It's just writing code, and just go. >> Yeah, you're writing a function. Push, commit. You should play with nuclio, not just the other things.
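The write-a-function, push-deploy model described here comes down to supplying a single handler that the platform invokes once per event. A minimal sketch, using a hypothetical event object rather than nuclio's or Lambda's exact signatures:

```python
import json

class Event:
    """Stand-in for the event object a serverless platform passes to a handler."""
    def __init__(self, body):
        self.body = body

def handler(context, event):
    # The developer writes only this function; scale-out, versioning,
    # and deployment are handled by the platform around it.
    record = json.loads(event.body)
    return {"received": record["id"], "status": "ok"}

# Local invocation for illustration; in production the platform calls handler().
print(handler(None, Event('{"id": 42}')))
```

The same handler could be fed by an HTTP trigger, a Kinesis-style stream, or an object-store notification; from the developer's side, only the function body changes.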
But you'll see, you're writing a function. It even has a built-in editor. You write, you push deploy, and it's already deployed somewhere. >> So give us some perspective before you move on, on what the impact is to a developer. Apples to oranges: the old way you described it versus the new way; it sounds easier! What's the impact? Is it time? Money? Can you quantify it? >> The biggest challenge for businesses is to transform. I saw an interesting sentence: it's not about digital transformation, it's about businesses that need to work in a digital world. Okay? Because again, most of the communication of customers to businesses is becoming digital. Okay? Whether it's today through mobile apps, tomorrow through Alexa. >> As Luke Cerney says, it's all software. Your business is the software. >> It's all about interaction, really. Okay. As a business, I always position it that there are two things you need to take care of. One is increasing the revenue, and that's by engaging more customers and increasing the revenue per customer. How do you engage more customers? Through digital services, whether it's Twitter or providing a new service through your web portal. And the next thing, how you generate more revenue from a customer, is by showing recommendations. >> Finding more value. >> And the other aspect is operational efficiency. How do you automate your operations to reduce the cost? You know Amazon uses robots to do the shipping and packing, so their margins can be lower. So the driver is both of those things. Reducing cost is becoming more and more dependent on automation, which is digital, and increasing revenue becomes more about customer engagement, which is digital. Okay, so now you're a traditional enterprise, and you have your Exchange to worry about, and all the legal stuff and the mainframes. But if you're not going to work on the transformation piece, you're going to die.
Because some other start up is going to build an insurance company which is sort of agile and all that. >> So you made an interesting comment earlier when you were talking about nuclio, and integrating the functions that really matter, the services that matter. Amazon releases 800 new services a year. >> Actually 1300. >> I'm sorry, 1300. >> This time less, no? >> Right now they're at 1130, and they expect 1500 to 1700 by the end of the year. Two years ago it was like 750, and the year before that it was 600. >> So is that an indicator as to Amazon leading this race between the big, I don't know, three, four cloud providers? Rack and stack them for us. How do we assess the capability? >> It's a matter of mentality. Okay. Bezos thinks like a supermarket, just like an Amazon market. I could say I need a cover for my iPad; I'm gonna get 100 covers for my iPad. No one really... I need to now choose. So their strategy is, we'll put dozens of services that do similar things; one is better at this, one is better at that; we control the market, we'll sell more. We have a different approach. We do fewer services, but each one sort of kicks ass. Each one is much better, much faster, much better engineered. Okay? This is also why our data platform provides 10 different data APIs and not 10 different individual data platforms. >> Alright, so let's talk about the scoreboard. Even though they might be thinking about the supermarket, you've got Amazon, Microsoft Azure, and Google. I've looked at some of the data. I mean, Microsoft's been international for a while from their MSN business. They now have Skype. They have data centers; they know a little bit about cloud. Amazon's got a lot more services. They support multiple versions of things. Google is kind of non-existent on the scale of comprehensiveness. >> Have you looked at their serverless functions, by the way? >> There's new stuff. TensorFlow, serverless. >> But for serverless they only support Node.js.
They have very few triggers, and it's still defined as beta. >> That's the point, so people are touting my Forbes article. They're touting, like, a feature. There's a lot more that needs to get done. So the question I have for you is this. There's a level of comprehensiveness that you need now, and I know you guys spent a lot of time building your solution. We've talked about this at our last Cube interview. So the question is the whole MVP concept, minimum viable product: it's great when you're building a consumer app for an iPhone, but when you start talking about a platform, and now cloud, the question to you is, is there a level of completeness bar to be hurdled over for a legit cloud or cloud player? >> I don't think you need 1000 services to build a good cloud, but you do need a bunch of services. Okay? Now, the way we see the world, like Satya, okay, is there is a core cloud, but there is sort of a belt around it, which is what we call the intelligence cloud. We would define ourselves as the intelligence cloud. So if someone is building a machine learning model and it needs five years' worth of data, and it just needs to do crunching on top of it, it's not really an interesting problem. It's commoditized: lots of CPU power, object storage. But the bigger challenge is doing the inferencing close to the edge. This is what needs to happen in real time. You need fewer services, but you need to be real time. >> Smarter integration to do that. Right? I mean. >> You have density problems. You don't have a lot of room to put 100 servers. It needs to be a lot more integrated. You know, look at Azure Stack. Their slogan is consistency. Look at a slide that shows which Azure services are part of Azure Stack: less than 20%. Because it's a lot more complicated to take technology designed for hyper scale and put it on a few servers. >> How do customers figure it out? What does a customer do? It's all mind boggling.
>> I love that concept of core services and then value around those core services. What are those core services that a cloud must have before I start to invest in that cloud provider's strategy? >> So the point, again, is there's a lot of legacy that you need to drag with you, especially someone like Amazon. So they have to have VMs and migration services from Oracle, etc. But let's assume I'm a start up building new cloud native applications. Do I need any of that? No. I can probably do with containers; I don't really need VMs. I can use something like Kubernetes, I can use SQL databases, maybe NoSQL. So I can redesign my application differently, with a lot fewer services. The problem for someone like Amazon is that in order to grow and be a supermarket, you have to have ten of everything. If I'm someone that focuses on new applications, I don't need so many services and so much legacy. >> Well, I'll say one thing. You can call them a supermarket, use that retail analogy; I buy that analogy only to the extent that you used it. But if that's the case, then everyone's hungry for food, and they're the only supermarket in town. >> But Whole Foods has maybe less stuff on the shelf. >> Everyone else is like a little hot dog stand compared to the supermarket. Amazon is crushing it. Your thoughts? I say that. Are they kicking ass? >> Obviously Amazon is kicking ass. But I think Azure is ramping up faster. Amazon is generating more alienation among people that they are starting to compete with, you know. >> Azure is copying Amazon, right? >> Yeah. But they have a different angle. They know how to sell to enterprises. They already have a foot in the door with Office 365. I've talked to a customer: we're going Azure. I say, why? >> Together: They've got 365. >> We already certified the security with 365, so for us to use Azure it's a- >> Right up until that next breach. >> So for the guys owning IT, it's easier for them to go to Azure. The developers want Amazon.
Because Amazon is sexier. >> We've got to break. We debated this on the intro segment with the analysts. Question: IT buyers have been driven by a top down, CIO driven, CXO driven waterfall, whatever you want to call it, the old way. With developers now in the driver's seat, with all of this serverless function, serverless coming around the corner very fast, are developers driving the buying decisions or not? Or is it IT? The budget's still there. They want to eliminate labor. They want more efficiencies. Are you seeing it? Will it happen? >> Yeah, because we are just in the middle. On one end we're infrastructure, an infrastructure consumed by developers. So we keep on having those challenges within the accounts themselves. IT doesn't get what we're doing: serverless, and the database is serverless. Because they like to build stuff. They want to take the Nutanix and put a hundred services on top of it, and it will take them two years to integrate it. By that time the business has already moved somewhere else. >> So IT could be a dinosaur like the mainframe? >> Right. I think the smart ITs understand they need to adopt cloud instead of fight it, and move the line further up the stack. And that's sort of the thing we are trying to provide to them. When you are building stuff, you are buying EMC storage; you are not just taking disks. So why would you focus on this low level block storage when you're buying infrastructure? Why not buy database as a service, and then you don't need all the hassle. Streaming as a service, serverless as a service, and then you don't need all that stack. >> Yaron, you should be our guest analyst, but you're too busy building a company. We're going to see you next week in Austin for KubeCon. Congratulations. I know you guys have worked hard. The founder and CTO of Iguazio. You're going to hear a lot about these guys. Smart team. They're either going to go big or go home. I think they're going to go big. Congratulations.
More coverage here at AWS Re:Invent after this short break. I'm John Furrier with Keith Townsend.